introduction Flashcards
(21 cards)
What is a data warehouse?
A centralized storage system that integrates data from multiple sources for reporting and analysis.
What are the key benefits of a data warehouse?
Improved decision-making, historical data analysis, better reporting efficiency, and centralized data management.
What is OLAP?
Online Analytical Processing, used for analyzing data from multiple perspectives for business intelligence.
How does OLAP differ from OLTP?
OLAP is for complex analytical queries, while OLTP handles real-time transactional operations.
What does ETL stand for?
Extract, Transform, Load – the process of collecting, cleaning, and storing data in a data warehouse.
What are the three types of data models?
Conceptual (high-level business concepts), Logical (detailed entities and relationships), Physical (database implementation details).
What are the five main OLAP operations?
Drill Down, Roll Up, Slice, Dice, and Pivot.
What is the difference between MOLAP and ROLAP?
MOLAP stores data in multidimensional cubes (faster but uses more storage), while ROLAP uses relational databases (scalable but slower for complex queries).
What is the purpose of dimensional modeling?
It organizes data into dimensions (categories) and facts (measurable values) to simplify reporting and analysis.
What is a Star Schema?
A database schema where a central fact table is linked to multiple dimension tables.
What is a Snowflake Schema?
A variation of the Star Schema where dimension tables are normalized into multiple related tables.
What is Drill Down in OLAP?
An operation that provides more detailed views of data, such as viewing sales data at a monthly instead of yearly level.
What is the Roll Up operation in OLAP?
The opposite of Drill Down, summarizing data to a higher level, such as viewing yearly instead of monthly sales.
What is the Slice operation in OLAP?
Selecting a specific subset of data based on a single dimension (e.g., sales for January only).
What is the Dice operation in OLAP?
Selecting data based on multiple criteria (e.g., sales in January for a specific region).
What is the Pivot operation in OLAP?
Rotating data to change the way dimensions are viewed (e.g., switching from sales by product to sales by region).
Why is data integration a challenge in data warehousing?
Because data comes from multiple sources with different formats and structures.
What are some best practices for data warehousing?
Use incremental extraction, ensure data consistency, and optimize query performance.
What is the role of metadata in a data warehouse?
It provides context and meaning to the stored data, helping users understand data relationships.
What are stored and calculated measures in OLAP?
Stored measures are precomputed and saved, while calculated measures are computed dynamically during queries.