Lecture 7: Data Warehouses, Business Intelligence and Big Data Analytics Flashcards Preview

Flashcards in Lecture 7: Data Warehouses, Business Intelligence and Big Data Analytics Deck (12):

What is a transaction processing system?

A system that records data on the fundamental operations occurring within the company


What is batch processing?

Data is collected in temporary storage and processed together as a single unit at a specific time


What is online transaction processing?

Data is processed immediately, in real time, so the current state of the system is always reflected
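
The contrast between batch processing and online transaction processing can be sketched with a toy account ledger. This is a minimal illustration only: the `Account` class and its method names are hypothetical and not taken from the lecture.

```python
class Account:
    """Toy ledger; the class and method names are hypothetical."""

    def __init__(self, balance=0):
        self.balance = balance
        self.pending = []  # temporary storage, used only by batch processing

    def post_online(self, amount):
        # Online transaction processing: apply the transaction immediately,
        # so the balance always reflects the current state.
        self.balance += amount

    def queue_for_batch(self, amount):
        # Batch processing: only record the transaction for later.
        self.pending.append(amount)

    def run_batch(self):
        # Process all queued transactions as a single unit at a chosen time.
        self.balance += sum(self.pending)
        self.pending.clear()


online = Account(100)
online.post_online(-30)          # balance is updated right away

batch = Account(100)
batch.queue_for_batch(-30)       # balance is not updated yet
balance_before_batch = batch.balance
batch.run_batch()                # balance catches up when the batch runs
```

Until `run_batch()` is called, the batch account still shows its old balance, whereas the online account is up to date after every transaction.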


What are ERP, CRM and SCM systems?

Enterprise Resource Planning System: Integrates the core functions of the company into a single, unified system

Customer Relationship Management System: Integrates customer data so that it can be used by various departments

Supply Chain Management System: Provides a holistic overview of the value chain, including the flow of raw materials


What are Operational Systems and Business Intelligence tools?

Operational systems: Represent the input side of databases, data warehouses and data marts

Business intelligence tools: More sophisticated analytics systems, represent the output side


What is online analytical processing?

- Transaction-level data stored in relational databases is aggregated and summarized

- Results of the analysis are stored in data cubes

- Data cubes structure results across multiple dimensions (e.g. location, product, time)

- Running queries on data cubes enables substantially quicker response times than running them on the original database
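
The idea of a data cube can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the sample rows, the three dimensions (region, product, quarter) and the helper names `build_cube` and `slice_total` are all hypothetical, and a real OLAP system would be far more elaborate.

```python
from collections import defaultdict

# Hypothetical transaction-level rows: (region, product, quarter, revenue)
transactions = [
    ("North", "Laptop", "Q1", 1200),
    ("North", "Laptop", "Q1", 800),
    ("North", "Phone", "Q2", 500),
    ("South", "Laptop", "Q1", 1000),
    ("South", "Phone", "Q2", 700),
]


def build_cube(rows):
    """Pre-aggregate revenue for every (region, product, quarter) cell."""
    cube = defaultdict(int)
    for region, product, quarter, revenue in rows:
        cube[(region, product, quarter)] += revenue
    return dict(cube)


def slice_total(cube, region=None, product=None, quarter=None):
    """Sum the cells matching the given dimension values (None means 'all')."""
    wanted = (region, product, quarter)
    return sum(
        value
        for key, value in cube.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    )


cube = build_cube(transactions)          # aggregation happens once, up front
north_laptop_q1 = cube[("North", "Laptop", "Q1")]
laptop_total = slice_total(cube, product="Laptop")
q2_total = slice_total(cube, quarter="Q2")
```

Queries read the pre-aggregated cells rather than rescanning every transaction, which is where the quicker response times come from.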


What is data mining?

- Data mining refers to the use of algorithms to identify hidden patterns in large data sets

- Some basic types of patterns include: Associations, clusters and sequential relationships


What are association rules?

- Associations are certain attribute values that frequently occur together within a data set

- Association rule mining seeks to identify the most frequent affinities amongst items

- Support: the fraction of all transactions that contain a certain set of items X

- Confidence: the fraction of the transactions containing X that also contain Y
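
Support and confidence can be computed directly from these definitions. The sketch below uses a made-up set of four shopping baskets; the data and the function names `support` and `confidence` are assumptions for illustration only.

```python
# Hypothetical transactions, each one a set of purchased items
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of all transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    x, y = set(x), set(y)
    containing_x = [t for t in transactions if x <= t]
    if not containing_x:
        return 0.0  # rule is undefined when X never occurs; report 0 here
    return sum(1 for t in containing_x if y <= t) / len(containing_x)


support_bread = support({"bread"}, baskets)                    # 3 of 4 baskets
support_bread_butter = support({"bread", "butter"}, baskets)   # 2 of 4 baskets
conf_bread_to_butter = confidence({"bread"}, {"butter"}, baskets)  # 2 of 3
```

For the rule "bread implies butter", three baskets contain bread and two of those also contain butter, so the confidence is 2/3.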


What are the four Vs of Big Data?

1. Volume
2. Velocity
3. Variety
4. Veracity


What are neural networks?

They replicate the basic functionality of the human brain to support decision making by predicting future outcomes


What is Hadoop?

An open-source software framework used for the (distributed) storage and analysis of big data sets


What are the four primary advantages of Hadoop?

1. Flexibility - can handle any type of data from any source
2. Scalability - works on a single low-end PC and can be scaled up to combine hundreds of computers
3. Cost effectiveness (open source)
4. Fault tolerance (designed to avoid a single point of failure, such as one computer crashing)