Exam revision Flashcards
(71 cards)
What is Data Analytics?
The process of developing actionable decisions or recommendations based on insights generated from historical data (involves tech, science, and stats).
What is the difference between descriptive, predictive, and prescriptive analytics, and what does each comprise?
Look at the slides for week 1
Define data
Data comprises particular features and can be regarded as the description of a specific phenomenon.
What are the 2 main types of data?
Quantitative and Qualitative
What are the 3 V’s of data?
Volume, velocity and variety
How does BI fit with data?
BI turns data into “actionable” information.
Businesses analyse the information provided by this data and act on it.
Once the information is there, BI people can use tools such as data discovery, data visualisation, OLAP analytics, dashboards, reports and more to discover the backbone of the information.
What are the 5 C’s of data?
Clean, comprehensive, current, conformed and consistent
Define data integration
Combining data from different sources
Define data warehousing
Data warehousing (DW) is the process of collecting and managing data from varied sources to provide meaningful business insights.
A Data warehouse is typically used to connect and analyze business data from heterogeneous sources.
The data warehouse is the core of the BI system which is built for data analysis and reporting.
Define a data-driven model and some benefits
. A model is a deliberate simplification of reality.
. A good model retains the most important features of reality and ignores less important details.
. The benefit of a model is that you simplify the complex real world.
. They can be used for predicting the future (forecasting), making decisions based on probability, quantifying uncertainty, and showing relationships between variables.
State some differences between a database and a data warehouse
. A database is a collection of related data that represents some elements of the real world, whereas a data warehouse is an information system that stores historical and cumulative data from single or multiple sources.
. A database is designed to record data, whereas a data warehouse is designed to analyze data.
. A database is an application-oriented collection of data, whereas a data warehouse is a subject-oriented collection of data.
. A database uses Online Transactional Processing (OLTP), whereas a data warehouse uses Online Analytical Processing (OLAP).
. ER modeling techniques are used for designing a database, whereas data modeling techniques are used for designing a data warehouse.
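The record-versus-analyze contrast above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely for convenience; the sales table, its columns, and the figures are hypothetical, and a real warehouse would be a separate system rather than the same table.

```python
import sqlite3

# Hypothetical in-memory table standing in for business data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "sale_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)",
    [
        ("North", 2022, 100.0),
        ("North", 2023, 150.0),
        ("South", 2022, 80.0),
        ("South", 2023, 120.0),
    ],
)

# OLTP-style work: record or correct one individual transaction.
conn.execute("UPDATE sales SET amount = 155.0 WHERE sale_id = 2")

# OLAP-style work: summarise many rows to support a decision.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

The first statement touches a single row, as an operational application would; the second reads across all rows to produce a summary, which is the kind of query a warehouse is built for.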
State some benefits of a data warehouse
. Delivers Enhanced Business Intelligence
. Ensures Data Quality and Consistency
. Saves Time and Money
. Tracks Historically Intelligent Data
. Generates high ROI
State some limitations of a data warehouse
. Extra Report Work
. Inflexibility and homogenization of data
. Ownership Concerns
. Demands for large amounts of resources
. Hidden issues consume time
State some characteristics of a data warehouse
. Subject oriented
. Integrated
. Time-variant (time series)
. Nonvolatile
. Summarized/Not normalized
. Metadata
. Web based, relational/multidimensional
. Client/server, real-time/right-time/active.
Define subject-oriented and application-oriented
Subject-oriented: it offers information regarding a theme instead of a company’s ongoing operations. These subjects can be sales, marketing, distribution, etc.
Application-oriented: it offers information on ongoing operations, whether for a company or anyone else.
A data warehouse never focuses on the ongoing operations. Instead, it puts emphasis on modeling and analysis of data for decision making.
What does integration mean in terms of data warehousing?
The establishment of a common unit of measure for all similar data from dissimilar databases. The data also needs to be stored in the data warehouse in a common and universally acceptable manner.
Is ETL a part of integration (yes or no)?
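As a minimal sketch of the "common unit of measure" idea: assume two hypothetical source systems, one recording product weight in kilograms and the other in pounds. The record layout and the to_kilograms helper are illustrative assumptions, not part of the course material.

```python
# Exact conversion factor from pounds to kilograms.
LB_TO_KG = 0.45359237


def to_kilograms(value, unit):
    """Convert a weight to kilograms (hypothetical helper)."""
    if unit == "kg":
        return value
    if unit == "lb":
        return value * LB_TO_KG
    raise ValueError(f"unknown unit: {unit}")


# Hypothetical records from two dissimilar source databases.
source_a = [{"product": "A1", "weight": 2.0, "unit": "kg"}]
source_b = [{"product": "B7", "weight": 10.0, "unit": "lb"}]

# Integrated view: every record uses the same unit and the same field name.
integrated = [
    {
        "product": row["product"],
        "weight_kg": round(to_kilograms(row["weight"], row["unit"]), 3),
    }
    for row in source_a + source_b
]
print(integrated)
```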
Yes
What does ETL stand for?
Extract, Transform, Load
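The three steps can be sketched in a few lines using only the Python standard library. The CSV text, the orders table, and the cleaning rules (skip rows with no amount, upper-case the currency code) are all assumptions made up for illustration; real ETL tools do much more.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system.
RAW_CSV = "order_id,amount,currency\n1,10.50,usd\n2,,usd\n3,7.25,USD\n"


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, fix types, standardise currency."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue
        cleaned.append(
            (int(row["order_id"]), float(row["amount"]), row["currency"].upper())
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(row_count, total)
```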
When the word time-variant comes up, what should come to mind?
It contains an element of time, explicitly or implicitly.
Another aspect of time variance is that once data is inserted in the warehouse, it can’t be updated or changed.
What does non-volatile mean in terms of data warehousing?
It means that previous data is not erased when new data is entered.
What data operations are performed in data warehousing?
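One way to picture non-volatile, time-variant storage is an append-only store where every row carries a date. The AppendOnlyStore class and its methods below are hypothetical names for a sketch, not a real warehouse API.

```python
from datetime import date


class AppendOnlyStore:
    """Hypothetical store: rows can be loaded and read, never updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, customer_id, city, as_of):
        # New data is appended; earlier rows are left untouched.
        self._rows.append({"customer_id": customer_id, "city": city, "as_of": as_of})

    def history(self, customer_id):
        # Every version of the customer's record, oldest first.
        matches = [r for r in self._rows if r["customer_id"] == customer_id]
        return sorted(matches, key=lambda r: r["as_of"])

    def latest(self, customer_id):
        versions = self.history(customer_id)
        return versions[-1] if versions else None


store = AppendOnlyStore()
store.load(1, "Leeds", date(2023, 1, 1))
store.load(1, "York", date(2024, 1, 1))  # the customer moved; the old row stays
print(store.history(1))
```

Because the older row is kept, the store can answer both "where is the customer now?" and "where was the customer last year?".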
Data loading
Data access
Activities like delete, update, and insert, which are performed in an operational application environment, are omitted in a data warehouse environment.
Name the 3 types of data warehouses
. Enterprise Data Warehouse
. Operational Data Store
. Data Mart
Give me some info on an Enterprise Data Warehouse
. It’s a centralized warehouse.
. Provides decision support service across the enterprise.
. Offers a unified approach for organizing and representing data.
. It provides the ability to classify data according to the subject and give access according to those divisions.
Give me some info on an Operational Data Store
. A data store required when neither the data warehouse nor OLTP systems support an organization’s reporting needs.
. In an ODS, the data warehouse is refreshed in real time.
. It is widely preferred for routine activities like storing records of the Employees.