DBMS Flashcards

1
Q

What is OLTP and OLAP?

A

OLTP stands for Online Transaction Processing. It is an online transactional system that tracks day to day transactions of an organisation, typically banks, sales and trading. It manages database modification.

OLAP stands for Online Analytical Processing. It is an online retrieving and data analysis system.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Compare the focus of OLTP and OLAP

A

OLTP focuses on INSERT, UPDATE and DELETE information
OLAP focuses on extracting data and analysing them for decision making

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Compare data sources for OLTP and OLAP

A

The data source for OLTP is OLTP itself and its transactions
The data source for OLAP is different OLTP databases

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Compare transaction length between OLTP and OLAP

A

OLTP Has short transactions
OLAP has long transactions

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Compare queries in OLTP and OLAP

A

Queries in OLTP are simpler
Queries in OLAP are more complex

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How is data in OLTP and OLAP normalised?

A

data in OLTP is normalised to 3NF
data in OLAP is not normalised

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is the difference regarding integrity constraints between OLTP and OLAP?

A

data in OLTP must frequently maintain data integrity constraint
data in OLAP does not get frequently modified. Hence, data integrity is not affected

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

How does OLTP and OLAP store data?

A

OLTP: traditional DBMS
OLAP: Data warehouse

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Who uses OLTP and OLAP?

A

OLTP: Clerks and IT Professionals

OLAP: Knowledge workers

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Compare the number of users between OLTP and OLAP?

A

OLTP: Thousands of users

OLAP: Hundreds of users

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is a data warehouse?

A

A data warehouse is a centralised repository designed to store, organised and manage large amounts of structured data from various sources. It is built to support complex querying and analysis, which are essential for business intelligence activities, decision-making and reporting. The warehouse is structured in a way that data from different domains, such as sales, marketing, HR or finance can be correlated and analysed together.

Key features:
Integration of Data: It integrates data from various sources and formats, ensuring consistency across different data types and systems.
Orthogonal Data Dimensions: It provides a multi-dimensional view of data, which includes dimensions like time (historical and current data) and subject areas (like sales and HR).
Objective: The primary objective is to facilitate reporting and data analysis, providing support for OLAP operations and helping organizations to make data-driven decisions.
Business Intelligence: It is a key component of business intelligence frameworks, offering a comprehensive view of an organization’s data.
Architecture: The architecture of a data warehouse often includes an operational layer of source systems, an integration layer where ETL (Extract, Transform, Load) or ELT (Extract, Load, Transform) processes and data cleansing occur, and the data warehouse itself where the processed data is stored.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Compare DBMS and Data Warehouse

A

DBMS is tuned for OLTP
Data Warehouse is tuned for OLAP

DBMS is used to run a business
Data Warehouse is used to optimise a business

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are two forms of data loading, which is a key topic of data warehouse

A

ELT and ETL

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is ETL?

A

ETL Stands for extract, transform and load. It is a process to move raw data from one or more sources into a data warehouse, where it can be stored, queried and analysed.

Extract: The first step is to pull the data from different sources. The data extracted can be structured or unstructured and may come in different formats.

Transform: The next step is where data is cleansed, enriched and transformed into a suitable format for analysis.

Data cleansing to correct or remove corrupt or inaccurate records.
Data mapping to ensure that data from one source fits into the destination.
Data conversion for standardizing formats, such as dates and numerical values.
Joining or splitting data fields, such as combining first and last names into a full name field or vice versa.
Aggregation or summarization of data, such as calculating total sales.

Load: The last step is to load the transformed data into the DW.

This ETL process is critical to make sure we have a unified system where data is in a suitable format for comprehensive analysis and Business Intelligence.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What are the 3 layers involved in data warehousing?

A
  1. Staging Layer: This is where we store raw data from the different sources. It is a temporary storage area used for data processing during the ETL process.
  2. Integration Layer: This layer is responsible for combining data from the staging layer and organising it into a structure suitable for analysis and query in DW. Data is organised into hierarchical groups or dimensions, and facts (quantitative data) are separated from these dimensions. The schema used in data warehouse typically includes facts and dimensions, which are used to build star or snowflake schemas for organising data into a multidimensional DB structure
  3. Access layer: This layer is the front end that users interact with to retrieve data from data warehouse. it includes tools for reporting, querying and data analysis.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What are challenges faced during data warehousing?

A

Data volume (time, spike), dirty data, synchronisation.

17
Q

What are the 4 main categories of NoSQL?

A
  1. Key Value Database: Redis
  2. Document Database: MongoDB
  3. Columnar Database: Cassandra
  4. Graph Database: Neo4j
18
Q

What is key-value database?

A

Key-Value Database:

It represents the simplest form of NoSQL databases.
The core structure is a key-value store, which functions like a simple hash table.
Access to the data within the database is exclusively through keys.
Basic operations include:
get: Retrieve the value associated with a key.
put: Set the value for a key.
delete: Remove a key from the database.
It does not support complex query filters, and operations like joins are typically handled outside the database, such as in application code written in languages like Python.
Key-value stores are praised for their efficiency due to their simplicity.
They are designed to be highly distributable, making them suitable for deployment across various systems and locations.
Key-value databases are particularly well-suited for scenarios where quick read and write access to data is required without the overhead of complex data modeling. They are commonly used for session storage, caching, and situations where the dataset can be easily partitioned.

19
Q

What are unique features of Redis?

A
  1. Operations in Redis are performed in memory, which is significantly faster than disk-based storage. Redis also offers data persistency features, which means it can save the updated database to disk at specified intervals to prevent data loss.
  2. Redis supports rich data structures beyond simple key-value pairs like lists, hashes, sets, bitmaps
  3. There are many bindings available to many programming languages making Redis available from a wide range of systems. Examples are Python, Java, C, C#
20
Q

How is data storage done in documents?

A

They store data in formats like JSON, Binary JSON(BSON), or XML. They are flexible and human readable formats that are widely used for representing complex and nested data structures on the web.

20
Q

What is a Document in Document Database?

A

Documents bases do not have a fixed schema like relational databases. Documents are like schema-less. The structure of documents vary from one to the next.

21
Q

What are features of document?

A

Nesting: Documents can be nested with complex structures and can also be indexed, which improves search performance.

Faster: They provide potentially faster operations due to less translation overhead compared to converting data into tabular forms as in SQL databases.

Complexity: They offer more straightforward data assembling and disassembling, meaning data can be accessed and manipulated without need for complex joins or transactions

Flexibility: They offer flexibility with changeable data structures, accommodating evolving data models without requiring a predefined schema

In summary, documents in a document database offer a more flexible and dynamic approach to data storage, accommodating varied and evolving data structures without the constraints of a fixed schema. This makes them suitable for applications where data requirements are non-uniform and change frequently.

22
Q

What are the benefits of Star Schema

A
  1. Denormalised: this would mean that we do not need to join many times, as data will come from a few tables, NOT REQUIRING MANY TABLES.
  2. Simpler queries: simple join logic
  3. Read intensive since denormalised data
23
Q

What are the disadvantages of Star Schema

A
  1. Data integrity is not enforced: Often, data loading is highly controlled for protection. Normalisation can help us ensure data integrity, but here it is denormalised
  2. Not flexible
  3. Not suitable for complex analytics
24
Q

Snowflake schema

A

Interesting part is its dimensions can be linked to other smaller dimensions, so a lot of layers, hierarchical. It is like normalisation in data warehouse. Each dimension is decomposed into small tables.

25
Q

What are the benefits of snowflake schema?

A

Storage saving, data integrity

26
Q
A