3 - Data Warehouses and OLAP Flashcards
(44 cards)
Business Intelligence and Data Management
Business Intelligence
- consilidate, analyse and present data for decision support
- corresponds to descriptive analytics according to the definitions here
- relies on extensive operational data sources, usually consolidated in a data warehouse
Business Intelligence and Data Management
Steps
- collect and store data in a database system
- extract, transform and load into a central data warehouse
- analyse and present data for business intelligence
Database Systems
- Store large data sets
- Examples: all orders accepted in the last five years, all of an online bookseller’s customers’ addresses
- this data is stored in relational tables - Analyse these data sets
- SQL (Structured Query Language) can access and analyze relational data
- Examples: list all customers in Berlin who ordered at least three times a year; what products produced the highest revenue during the last five years
Databases in IT Systems
Presentation:
- user interface in an app(lication) or in a web browser
Logic:
- computation as implemented in a programming language
Storage:
- Data Bases system
Components of a Database system
data base
+ data base management system
= data base system
Tasks of the Database Management System
- interpret queries
- optimise queries
- ensure integrity
- control access
- manage simultaneous access
- manage back-ups
- manage files
Tasks of the Database Management System
Interpret Queries
Translate SQL from the logic layer into “low-level”-routines
Tasks of the Database Management System
Optimise Queries
Search for the “best” way to access data given an SQL-query, access statics, and index data
Tasks of the Database Management System
Ensure Integrity
Enforce that data can only be manipulated according to defined integrity conditions
Tasks of the Database Management System
Control Access
Test whether the user is authorized to access the data
Tasks of the Database Management System
Manage simultaneous access
Synchronise simultaneous access to data and restrict data manipulation that would lead to conflicts
Tasks of the Database Management System
Manage back-ups
Ensure that system break-downs or errors have limited consequences
Tasks of the Database Management System
Manage files
Control the allocation of memory and access to the hard drive
The relational data model
Describing a relation (=table)
- define attributes, data types, primary keys (unique identifiers) and table name: e.g. Students (Student-ID, First Name, Last Name)
- there can be multiple relations with the same type of data, e.g. when splitting up data from multiple years across data bases
Data Warehouses
What’s a data warehouse?
- A data warehouse is a data database with reporting and query functions that stores operative and historical data
Data Warehouses
Data in a data warehouse
- data is extracted from diverse operational systems and processed to support management reports and analyses (-> OLAP, Data mining)
- data warehouses are multi-dimensional and offer long-term storage for historical, cleaned, validated, synthetic, operative data from internal and external sources
The ETL Process
What does ETL stand for?
EXTRACT relevant data from diverse sources
TRANSFORM data into the format of the central database system
LOAD data into the central database system
The ETL process and data warehouses
The Data Warehouse is an organization-wide concept aiming to provide
- a central, complete, consistent data basis
- that is independent of operational databases and
- can support diverse analytics projects
Working with a data Warehouse
Steps
- select well-suited attributes from operative data
- add selected data from external sources
- transform and load data
- store data in dimensions
- manage database (as in operational data bases)
- search and analyse via reports or OLAP
Components of a Data-Warehouse-System
Data Marts
Central data warehouse
Enterprise data warehouse
Components of a Data-Warehouse-System
Data Marts
- databases targeted to the analytics requirements of a specific user group
- managed by a decentral team
- simple data model and development
- mostly self-contained
Components of a Data-Warehouse-System
Central data warehouse
- analytics database that feeds data into local data sets
- can still provide information for just a part of the organization
Components of a Data-Warehouse-System
Enterprise Data Warehouse
- provides data to support analytics across the organization
Hierarchical DWH-Architecture
- central data warehouse (CDWH) feeds and coordinates local data marts
- CDWH extracts, integrates, and allocates data
Data Marts:
- serve for queries and analyses
- are tailored to one functional part of an organization