Many businesses generate a significant amount of data every day. Each transaction and each contact with a customer, supplier, or employee can yield data that may prove useful to the enterprise. In simple terms, a data warehouse is a repository of raw data collected and organized for use by analytical applications and user queries. Although it normally resides on one or more high-end servers, the data can be stored on many different media, including magnetic disk, magnetic tape, DVD, CD, and microfiche.
Data warehousing is composed of two phases. First, enterprise systems extract data from online transaction processing (OLTP) applications (as well as other sources). Second, the data is filtered, categorized, and stored. Data warehouse applications address each phase.
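The two phases can be illustrated with a minimal sketch. Everything here is hypothetical: the sample records, the field names, and the rule that negative amounts are invalid are assumptions made for illustration, and a plain dictionary stands in for the warehouse store.

```python
from collections import defaultdict

# Hypothetical rows standing in for an OLTP application's records.
OLTP_ROWS = [
    {"id": 1, "type": "sale", "amount": 40.0},
    {"id": 2, "type": "inquiry", "amount": None},
    {"id": 3, "type": "sale", "amount": -5.0},  # assumed invalid: negative amount
    {"id": 4, "type": "sale", "amount": 12.5},
]


def extract(source):
    """Phase 1: copy rows out of the source system without modifying it."""
    return [dict(row) for row in source]


def filter_categorize_store(rows):
    """Phase 2: drop invalid rows, group the rest by type, and store them."""
    warehouse = defaultdict(list)
    for row in rows:
        amount = row["amount"]
        # Filter: skip rows with a negative amount (an assumed validity rule).
        if amount is not None and amount < 0:
            continue
        # Categorize and store: file each row under its record type.
        warehouse[row["type"]].append(row)
    return dict(warehouse)


warehouse = filter_categorize_store(extract(OLTP_ROWS))
```

Keeping the phases in separate functions mirrors the description above: extraction only reads from the source, while filtering, categorizing, and storing operate on the extracted copy.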
Ideally, data from every customer interaction should be collected, including telephone calls, U.S. Postal Service mail inquiries, website inquiries, product or service sales statistics, parts inventories, supply chain statistics, and so forth.
Another important issue is the timeliness of data. If information is not kept up to date, a data warehouse can easily become a data tomb.
A data warehouse is like a library organized by the Dewey Decimal Classification. The library houses books, and the classification organizes them. The classification scheme enables a library patron to find a book, but the book’s content is not assimilated until the patron reads it. Likewise, raw data from a data warehouse is not knowledge.
Although the data warehouse appears as a unified set of stored information, in reality it may consist of several databases, either centralized or distributed across the enterprise. The data warehousing application provides a unified view into the information set.
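As a rough sketch of a unified view over several databases, the example below uses Python's standard sqlite3 module. The database aliases, table names, columns, and sample rows are all hypothetical, and two in-memory SQLite databases stand in for separate enterprise systems. A temporary view is used because SQLite allows temporary views to reference tables in attached databases.

```python
import sqlite3


def build_unified_view():
    """Attach two separate databases and expose them through one view."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH of ':memory:' creates a distinct in-memory database.
    conn.execute("ATTACH DATABASE ':memory:' AS sales_db")
    conn.execute("ATTACH DATABASE ':memory:' AS support_db")

    # Hypothetical tables, one per underlying database.
    conn.execute("CREATE TABLE sales_db.orders (customer TEXT, amount REAL)")
    conn.execute("CREATE TABLE support_db.tickets (customer TEXT, subject TEXT)")
    conn.executemany(
        "INSERT INTO sales_db.orders VALUES (?, ?)",
        [("acme", 120.0), ("globex", 75.5)],
    )
    conn.executemany(
        "INSERT INTO support_db.tickets VALUES (?, ?)",
        [("acme", "late delivery")],
    )

    # The view presents both databases as a single set of rows.
    conn.execute(
        """
        CREATE TEMP VIEW customer_activity AS
        SELECT customer, 'order' AS kind, CAST(amount AS TEXT) AS detail
        FROM sales_db.orders
        UNION ALL
        SELECT customer, 'ticket' AS kind, subject AS detail
        FROM support_db.tickets
        """
    )
    return conn


conn = build_unified_view()
rows = conn.execute(
    "SELECT customer, kind, detail FROM customer_activity "
    "WHERE customer = ? ORDER BY kind",
    ("acme",),
).fetchall()
```

A query against `customer_activity` does not need to know which underlying database holds each row, which is the sense in which the warehouse application presents a unified view.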
A data warehouse is the foundation of an overall business intelligence application.