Terms Flashcards

1
Q

ACID Test

A

A test applied to database transactions for atomicity, consistency, isolation and durability, the four properties that keep data reliable even when errors or failures occur.
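As an illustration of the atomicity part of the test, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names and the transfer helper are hypothetical and exist only for this example. The point is that a failed transfer leaves no partial update behind.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            # Credit first, so a later failure has something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit was rolled back.
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it is undone
```

After both calls the balances reflect only the successful transfer, which is what atomicity requires.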

2
Q

Aggregation

A

A process of searching, gathering and presenting data.

3
Q

Algorithm

A

A mathematical formula or statistical process used to perform analysis of data.

4
Q

Anomaly Detection

A

The process of identifying rare or unexpected items or events in a dataset that do not conform to other items in the dataset and do not match a projected pattern or expected behavior. Anomalies are also called outliers, exceptions, surprises or contaminants and they often provide critical and actionable information.
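One simple way to flag such items is a z-score rule: mark any value that lies more than a chosen number of standard deviations from the mean. The sketch below assumes that rule and a threshold of 2.0; both are illustrative choices, and the readings are made-up sample data.

```python
from statistics import mean, stdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds `threshold` in absolute value."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical sensor readings with one obvious outlier.
readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 95]
anomalies = find_anomalies(readings)
```

A z-score rule is sensitive to the outliers it is trying to find, since they inflate the standard deviation, so it suits small illustrations better than production use.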

5
Q

Big Data

A

Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze.

6
Q

Business Intelligence

A

The general term used for the identification, extraction and analysis of data.

7
Q

Classification Analysis

A

A systematic process for obtaining important and relevant information about data (metadata) and assigning data to a particular group or class.

8
Q

Clustering Analysis

A

The process of identifying objects that are similar to each other and clustering them in order to understand the differences as well as the similarities within the data.

9
Q

Columnar Database or Column-oriented Database

A

A database that stores data by column rather than by row. In a row-based database, a row might contain a name, address and phone number. In a column-oriented database, all names are in one column, addresses in another and so on. A key advantage of a columnar database is faster disk access for analytical queries, because a query reads only the columns it needs.
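The difference between the two layouts can be sketched with plain Python structures. This is only an in-memory analogy, not how any particular database stores data on disk, and the sample records and the to_columns helper are invented for the example.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"name": "Ana", "address": "1 Elm St", "phone": "555-0100"},
    {"name": "Ben", "address": "2 Oak Ave", "phone": "555-0101"},
    {"name": "Cy", "address": "3 Pine Rd", "phone": "555-0102"},
]

def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns

# Column-oriented layout: each field's values are stored together.
columns = to_columns(rows)

# Reading every name touches a single list instead of every record.
names = columns["name"]
```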

10
Q

Comparative Analysis

A

Data analysis that compares two or more data sets or processes to detect patterns within very large data sets.

11
Q

Correlation Analysis

A

A means of determining a statistical relationship between variables, often to identify predictive factors among them. It is also a technique for quantifying the strength of the linear relationship between two variables.
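The usual measure of linear relationship strength is the Pearson correlation coefficient, which ranges from -1 (perfect negative) to 1 (perfect positive). The card does not name a specific coefficient, so treating it as Pearson's r is an assumption. A minimal sketch, with a hypothetical pearson_r helper:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])  # y grows in step with x
falling = pearson_r([1, 2, 3], [3, 2, 1])       # y shrinks as x grows
```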

12
Q

Dashboard

A

A graphical representation of analyses performed by algorithms.

13
Q

Data

A

A quantitative or qualitative value. Common types of data include sales figures, marketing research results, readings from monitoring equipment, user actions on a website, market growth projections, demographic information and customer lists.

14
Q

Data Aggregation

A

The process of collecting data from multiple sources for the purpose of reporting or analysis.
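A small sketch of the idea, assuming two hypothetical sources (store and online sales) that each supply (region, amount) pairs. The aggregate_sales helper is illustrative only; it combines the sources into one per-region total ready for reporting.

```python
from collections import defaultdict

def aggregate_sales(*sources):
    """Sum amounts per region across any number of (region, amount) sources."""
    totals = defaultdict(float)
    for source in sources:
        for region, amount in source:
            totals[region] += amount
    return dict(totals)

# Made-up figures from two separate systems.
store_sales = [("north", 120.0), ("south", 80.0)]
online_sales = [("north", 30.0), ("west", 45.5)]

report = aggregate_sales(store_sales, online_sales)
```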

15
Q

Data Analyst

A

A person responsible for the tasks of modelling, preparing and cleaning data for the purpose of deriving actionable information from it.

16
Q

Data Analytics

A

Behavioral Analytics: Using data about people’s behavior to understand intent and predict future actions.
Descriptive Analytics: Condensing big numbers into smaller pieces of information. This is similar to summarizing the data story. Rather than listing every single number and detail, there is a general thrust and narrative.
Diagnostic Analytics: Reviewing past performance to determine what happened and why. Businesses use this type of analytics to complete root cause analysis.
Predictive Analytics: Using statistical functions on one or more data sets to predict trends or future events. In big data predictive analytics, data scientists may use advanced techniques like data mining, machine learning and advanced statistical processes to study recent and historical data to make predictions about the future. It can be used to forecast weather, predict what people are likely to buy, visit, do or how they may behave in the near future.
Prescriptive Analytics: Building on predictive analytics by recommending actions and supporting data-driven decisions based on the likely impact of each action.

17
Q

Data Architecture and Design

A

How enterprise data is structured. The actual structure or design varies depending on the eventual end result required. Data architecture has three stages or processes: (1) conceptual representation of business entities, (2) the logical representation of the relationships among those entities and (3) the physical construction of the system to support the functionality.

18
Q

Data Cleansing

A

The process of reviewing and revising data to delete duplicate entries, correct misspellings and other errors, add missing data and provide consistency.
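The steps in this definition can be sketched on a small list of records. The field names, the default value and the choice of email as the duplicate key are all assumptions made for the example, not a general rule.

```python
def cleanse(records, default_country="unknown"):
    """Normalize formatting, fill missing countries and drop duplicate emails."""
    seen_emails = set()
    cleaned = []
    for record in records:
        # Provide consistency: collapse whitespace and standardize case.
        name = " ".join((record.get("name") or "").split()).title()
        email = (record.get("email") or "").strip().lower()
        # Add missing data: fall back to a default when the field is empty.
        country = (record.get("country") or default_country).strip()
        # Delete duplicate entries: keep only the first record per email.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email, "country": country})
    return cleaned

raw = [
    {"name": "  ada   lovelace ", "email": "ADA@example.com ", "country": "UK"},
    {"name": "Ada Lovelace", "email": "ada@example.com", "country": "UK"},
    {"name": "alan turing", "email": "alan@example.com", "country": None},
]
cleaned = cleanse(raw)
```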

19
Q

Data Integration

A

The process of combining data from different sources and presenting it in a single view.

20
Q

Data Modelling

A

A data model defines the structure of the data for the purpose of communicating between functional and technical people to show data needed for business processes, or for communicating a plan to develop how data is stored and accessed among application development team members.

21
Q

ETL (Extract, Transform and Load)

A

The process of extracting raw data, transforming it by cleaning and enriching it to fit operational needs, and loading it into the appropriate repository for the system's use. Although ETL originated with data warehouses, ETL processes are also used when ingesting data from external sources in big data systems.
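The three stages can be sketched with Python's standard csv and sqlite3 modules. The CSV content, the payments table and the rule of skipping rows without an amount are hypothetical choices for this example; a real pipeline would read from actual sources and apply its own rules.

```python
import csv
import io
import sqlite3

# Made-up raw input; the second row is missing its amount.
RAW_CSV = (
    "id,amount,currency\n"
    "1, 19.99 ,usd\n"
    "2,,usd\n"
    "3,5.00,eur\n"
)

def extract(text):
    """Extract: parse raw CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop incomplete rows, convert to cents, uppercase currency."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip rows with no amount
        cents = round(float(amount) * 100)
        cleaned.append((int(row["id"]), cents, row["currency"].strip().upper()))
    return cleaned

def load(conn, rows):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(id INTEGER PRIMARY KEY, amount_cents INTEGER, currency TEXT)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT id, amount_cents, currency FROM payments ORDER BY id"
).fetchall()
```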

22
Q

Exploratory Analysis

A

An approach to data analysis focused on identifying general patterns in data, including outliers and features of the data that are not anticipated by the experimenter's current knowledge or preconceptions. Exploratory data analysis (EDA) aims to uncover underlying structure, test assumptions, detect mistakes and understand relationships between variables.

23
Q

Latency

A

Any delay in a response or delivery of data from one point to another.

24
Q

Metadata

A

Data about data; it describes what the data is about, for example where the data points were collected.

25
Q

Optimization Analysis

A

The process of finding optimal problem parameters subject to constraints. Optimization algorithms heuristically test a large number of parameter configurations in order to find an optimal result, determined by a characteristic function (also called a fitness function).
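A minimal sketch of one such heuristic, random search: sample many candidate parameter values inside the allowed range and keep the one with the best fitness. The fitness function, the bounds and the number of trials are assumptions chosen for illustration, and many other optimization algorithms exist.

```python
import random

def fitness(x):
    """Hypothetical fitness function; its maximum is at x = 3."""
    return -(x - 3.0) ** 2

def random_search(fitness_fn, lower, upper, trials=1000, seed=0):
    """Test `trials` random candidates in [lower, upper] and return the best."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best_x = rng.uniform(lower, upper)
    best_score = fitness_fn(best_x)
    for _ in range(trials - 1):
        candidate = rng.uniform(lower, upper)  # respects the bound constraints
        score = fitness_fn(candidate)
        if score > best_score:
            best_x, best_score = candidate, score
    return best_x, best_score

best_x, best_score = random_search(fitness, lower=0.0, upper=10.0)
```

Random search gives no guarantee of finding the exact optimum; more trials only make a near-optimal result more likely.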

26
Q

Veracity

A

Ensuring that data used in analytics is correct and precise.

27
Q

Visualization

A

A visual abstraction of data designed for the purpose of deriving meaning or communicating information more effectively. The visuals created are often complex, but they must remain understandable in order to convey the message of the data.

28
Q

Pain Points

A

Pain points are specific problems faced by current or prospective customers in the marketplace. Pain points include any problems the customer may experience along their journey.