Data Warehouse System Instance
A Data Warehouse System Instance is a large subject-oriented, integrated, time-varying, non-volatile analytical database system that supports data warehouse tasks.
- Context:
- It can (typically) be based on a Data Warehouse Platform (such as the Snowflake data warehouse platform).
- It can (typically) contain Data Warehouse Tables, such as fact table, dimension table, ...
- It can (typically) be used in Organizational Decision-Making.
- It can (often) be composed of Data Mart Instances.
- It can be queried by a Data Warehouse Querying System, often a business intelligence querying system.
- It can be updated by a Data Warehouse Engineer.
- It can be populated by a (DW) ETL Process.
- It can range from being an Operational Data Warehouse to being an Enterprise Data Warehouse.
- It can range from being an Offline Data Warehouse to being an On-Time Data Warehouse.
- …
- Example(s):
- PlayStation's corporate data warehouse instance, 2021 (based on Snowflake DW platform).
- PlayStation's corporate data warehouse instance, 2017 (based on Netezza DW platform).
- a Financial Data Warehouse.
- …
- Counter-Example(s):
- See: Multidimensional Database, Star Schema.
References
2020
- (Wikipedia, 2020) ⇒ https://en.wikipedia.org/wiki/data_warehouse Retrieved:2020-2-28.
- In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis, and is considered a core component of business intelligence. DWs are central repositories of integrated data from one or more disparate sources. They store current and historical data in one single place that are used for creating analytical reports for workers throughout the enterprise.
The data stored in the warehouse is uploaded from the operational systems (such as marketing or sales). The data may pass through an operational data store and may require data cleansing for additional operations to ensure data quality before it is used in the DW for reporting.
Extract, transform, load (ETL) and Extract, load, transform (E-LT) are the two main approaches used to build a data warehouse system.
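The difference between the two approaches named above (ETL and E-LT) can be sketched as follows. This is a minimal illustration, not a description of any particular platform: the sample rows, the table names (`sales_etl`, `raw_sales`, `sales_elt`), and the cleansing rules are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical rows from an operational system (illustrative data only).
SOURCE_ROWS = [("  Alice ", "19.99"), ("BOB", "5.00"), ("alice", "0.01")]


def clean(name, amount):
    """Transform step: normalize the name and convert the amount to integer cents."""
    return name.strip().lower(), round(float(amount) * 100)


def run_etl(conn):
    """ETL: transform in application code, then load only the cleaned rows."""
    conn.execute("CREATE TABLE sales_etl (customer TEXT, amount_cents INTEGER)")
    conn.executemany(
        "INSERT INTO sales_etl VALUES (?, ?)",
        [clean(name, amount) for name, amount in SOURCE_ROWS],
    )


def run_elt(conn):
    """E-LT: load the raw rows first, then transform inside the database with SQL."""
    conn.execute("CREATE TABLE raw_sales (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", SOURCE_ROWS)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT lower(trim(customer)) AS customer, "
        "       CAST(round(CAST(amount AS REAL) * 100) AS INTEGER) AS amount_cents "
        "FROM raw_sales"
    )


def totals(conn, table):
    """Sum cents per customer; `table` must be a trusted name, not user input."""
    query = (
        f"SELECT customer, SUM(amount_cents) FROM {table} "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
run_etl(conn)
run_elt(conn)
print(totals(conn, "sales_etl"))
print(totals(conn, "sales_elt"))
```

Both paths should produce the same cleaned totals. The practical difference in this sketch is where the transformation runs and whether the raw data is retained in the warehouse (`raw_sales` exists only in the E-LT path).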
2020
- (Wikipedia, 2020) ⇒ https://en.wikipedia.org/wiki/data_warehouse#Hybrid_design Retrieved:2020-2-28.
- Data warehouses (DW) often resemble the hub and spokes architecture. Legacy systems feeding the warehouse often include customer relationship management and enterprise resource planning, generating large amounts of data. …
2020
- (Wikipedia, 2020) ⇒ https://en.wikipedia.org/wiki/data_warehouse#Evolution_in_organization_use Retrieved:2020-2-28.
- Offline operational data warehouse: Data warehouses in this stage of evolution are updated on a regular time cycle (usually daily, weekly or monthly) from the operational systems and the data is stored in an integrated reporting-oriented database.
- Offline data warehouse: Data warehouses at this stage are updated from data in the operational systems on a regular basis and the data warehouse data are stored in a data structure designed to facilitate reporting.
- On time data warehouse: Online Integrated Data Warehousing represents the real-time data warehouse stage, in which data in the warehouse is updated for every transaction performed on the source data.
- Integrated data warehouse: These data warehouses assemble data from different areas of business, so users can look up the information they need across other systems.
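The contrast between an offline warehouse (refreshed on a regular cycle) and an on time warehouse (updated for every source transaction) can be sketched in a few lines. The class names and the listener mechanism below are assumptions made for illustration; real systems would typically rely on scheduled jobs and change-data-capture or streaming tools instead.

```python
class SourceSystem:
    """Hypothetical operational system that records transactions."""

    def __init__(self):
        self.transactions = []
        self.listeners = []  # callbacks invoked once per recorded transaction

    def record(self, txn):
        self.transactions.append(txn)
        for notify in self.listeners:
            notify(txn)


class OfflineWarehouse:
    """Refreshed on a cycle: copies whatever is new each time refresh() runs."""

    def __init__(self, source):
        self.source = source
        self.rows = []
        self._next = 0  # index of the first source transaction not yet loaded

    def refresh(self):
        new_rows = self.source.transactions[self._next:]
        self.rows.extend(new_rows)
        self._next += len(new_rows)


class OnTimeWarehouse:
    """Updated for every transaction performed on the source."""

    def __init__(self, source):
        self.rows = []
        source.listeners.append(self.rows.append)


source = SourceSystem()
offline = OfflineWarehouse(source)
on_time = OnTimeWarehouse(source)

source.record({"order_id": 1, "amount_cents": 1999})
print(len(offline.rows), len(on_time.rows))  # offline lags until its next refresh

offline.refresh()  # stands in for the daily, weekly, or monthly load
print(len(offline.rows), len(on_time.rows))
```

In this sketch the offline warehouse is stale between refreshes, while the on time warehouse pays the update cost on every source transaction.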
2009
- (Mazón & Trujillo, 2009) ⇒ Jose-Norberto Mazón, and Juan Trujillo, (2009). “A Hybrid Model Driven Development Framework for the Multidimensional Modeling of Data Warehouses.” In: SIGMOD Record, 38(2).
- Data warehouse (DW) systems provide a multidimensional (MD) view of huge amounts of historical data from operational sources, thus supplying useful information for decision makers to improve a business process in an organization. The MD paradigm structures information into facts and dimensions. A fact contains the interesting measures (fact attributes) of a business process (sales, deliveries, etc.), whereas a dimension represents the context for analyzing a fact (product, customer, time, etc.) by means of hierarchically organized dimension attributes. MD modeling requires specialized design techniques that resemble the traditional database design methods [16]. First, a conceptual design phase is performed whose output is an implementation-independent and expressive MD model for the DW. A logical design phase then aims to obtain a technology-dependent model from the previously defined conceptual MD model. This logical model is the basis for the implementation of the DW. Therefore, there are two cornerstones in MD modeling: the development of a conceptual MD model and the derivation of its corresponding logical representation.
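The fact and dimension structure described in the quote above can be illustrated with a small logical model. The following is a minimal sketch using an in-memory SQLite database; the table names (`fact_sales`, `dim_product`, `dim_time`), the columns, and the sample rows are hypothetical and are not taken from the cited paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: the context for analysis, with hierarchically organized attributes
-- (name -> category, day -> month -> year).
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER);

-- Fact: the measures of a business process (here, sales), keyed by its dimensions.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    time_id INTEGER REFERENCES dim_time(time_id),
    quantity INTEGER,
    revenue_cents INTEGER
);

INSERT INTO dim_product VALUES
    (1, 'widget', 'hardware'), (2, 'gadget', 'hardware'), (3, 'manual', 'books');
INSERT INTO dim_time VALUES
    (1, '2021-01-15', '2021-01', 2021), (2, '2021-02-03', '2021-02', 2021);
INSERT INTO fact_sales VALUES
    (1, 1, 2, 2000), (2, 1, 1, 1500), (3, 2, 4, 4000), (1, 2, 1, 1000);
""")


def revenue_by_category_and_month(conn):
    """Aggregate the revenue measure along one level of each dimension hierarchy."""
    return conn.execute(
        "SELECT p.category, t.month, SUM(f.revenue_cents) "
        "FROM fact_sales AS f "
        "JOIN dim_product AS p ON p.product_id = f.product_id "
        "JOIN dim_time AS t ON t.time_id = f.time_id "
        "GROUP BY p.category, t.month "
        "ORDER BY p.category, t.month"
    ).fetchall()


for row in revenue_by_category_and_month(conn):
    print(row)
```

Grouping by `t.year` instead of `t.month` would roll the same measure up one level in the time hierarchy, which is the kind of analysis the multidimensional view is meant to support.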
2008
- (Wang, 2008) ⇒ John Wang. (2008). “Encyclopedia of Data Warehousing and Mining, 2nd edition." Information Science Reference. ISBN 1605660108
1999
- (Zaiane, 1999) ⇒ Osmar Zaiane. (1999). “Glossary of Data Mining Terms." University of Alberta, Computing Science CMPUT-690: Principles of Knowledge Discovery in Databases.
- QUOTE: Data mart: A small, single-subject warehouse used by individual departments or groups of users.
- QUOTE: Data Warehouse: A system for storing and delivering massive quantities of data.
1997
- (Chaudhuri & Dayal, 1997) ⇒ Surajit Chaudhuri, and Umeshwar Dayal. (1997). “An Overview of Data Warehousing and OLAP Technology.” In: ACM SIGMOD Record, 26(1). doi:10.1145/248603.248616
- QUOTE: A data warehouse is a “subject-oriented, integrated, time-varying, non-volatile collection of data that is used primarily in organizational decision making.” Typically, the data warehouse is maintained separately from the organization’s operational databases. There are many reasons for doing this. The data warehouse supports on-line analytical processing (OLAP), the functional and performance requirements of which are quite different from those of the on-line transaction processing (OLTP) applications traditionally supported by the operational databases.