Intro to Data and Data Science Flashcards

1
Q

What is Analysis?

1.2

A

‘how’ and ‘why’ something happened

performed on past data

2
Q

What are Analytics?

1.2

A

Analytics applies logical reasoning to the information obtained from analysis

Explores the future and looks for patterns

2 types:
Qualitative and
Quantitative

3
Q

What are Qualitative Analytics?

1.2

A

The use of:
intuition
experience and
analysis

to plan the next business move

4
Q

What are Quantitative analytics?

1.2

A

The application of formulas and algorithms to numbers gathered from analysis

5
Q

What is Business Intelligence?

1.4

A

Process of analysing and reporting historical business data

Preliminary step to predictive analytics

6
Q

What is Machine Learning?

1.4

A

Ability of machines to predict outcomes without being explicitly programmed to do so

The machines use data to:

  • Make predictions
  • Analyse patterns
  • Give recommendations
7
Q

What are advanced analytics?

1.4

A

all types of analytic processes

8
Q

Symbolic reasoning is an exception among AI approaches: it does not use ML or deep learning.
It is based on high-level, human-readable representations of problems and logic.

True or False:
Symbolic reasoning is commonly used in practice

1.4

A

False:

Very rarely used in practice.

9
Q

What are the 5 primary columns on the 365 infographic?

1.5

A
Traditional data
Big data
Business intelligence
Applying traditional data science techniques
Using ML techniques
10
Q

What is “data”?

2.0

A

information stored in a digital format

used for:

a) analysis
b) decision making

2 Types:

a) Traditional
b) Big Data

11
Q

What is traditional data?

2.0

A

Data in the form of tables containing numeric or text values;

Data that is structured and stored in databases

12
Q

What is big data?

2.0

A

Extremely large data;

It can be in various formats:

  • structured
  • semi-structured
  • unstructured

Often characterized by the ‘Vs’ (volume, variety, velocity, etc.)

13
Q

What is Data Science?

2.0

A

an interdisciplinary field that combines:

statistical,

mathematical,

programming,

problem-solving, and

data-management tools.

14
Q

What are Traditional Methods?

2.0

A

Methods derived from statistics and adapted for business

15
Q

What is Raw Data?

4.1

A

AKA Primary Data

  • Cannot be analysed immediately
  • Accumulated but unorganized data; gathering it is called data collection
16
Q

What is Class labelling?

4.1

A

Labelling each data point with the correct data type (e.g. numerical or categorical)

17
Q

What is data cleansing?

4.1

A

AKA Data Scrubbing

  • Deals with inconsistent data
  • e.g. data containing typos or missing info
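
A minimal Python sketch of what cleansing can look like. The records, the `TYPO_FIXES` lookup, and the choice to drop rows with missing info are all illustrative assumptions, not a prescribed method.

```python
# Hypothetical raw records with a typo, stray whitespace, and missing info.
raw = [
    {"city": "london", "age": "34"},
    {"city": "Lodnon", "age": "41"},
    {"city": "Paris ", "age": ""},
]

# Assumed lookup of known typos and their corrections.
TYPO_FIXES = {"lodnon": "london"}

def cleanse(records):
    cleaned = []
    for rec in records:
        # Normalize casing and whitespace, then fix known typos.
        city = rec["city"].strip().lower()
        city = TYPO_FIXES.get(city, city)
        if not rec["age"].strip():
            continue  # drop rows with missing info
        cleaned.append({"city": city.title(), "age": int(rec["age"])})
    return cleaned

clean = cleanse(raw)
```

Dropping incomplete rows is only one option; filling in missing values is another common choice.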
18
Q

What is data balancing?

4.1

A

Ensuring the sample gives equal priority to each class
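
One way to balance a sample is undersampling: cut every class down to the size of the smallest one. The sketch below assumes that approach; the `balance` helper and the toy data are hypothetical.

```python
import random
from collections import defaultdict

def balance(samples, label_of, seed=0):
    """Undersample so every class has as many samples as the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for s in samples:
        by_class[label_of(s)].append(s)
    # Size of the smallest class sets the quota for every class.
    n = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n))
    return balanced

# Toy data: 8 'cat' samples and only 2 'dog' samples.
data = [("a", "cat")] * 8 + [("b", "dog")] * 2
balanced = balance(data, label_of=lambda s: s[1])
```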

19
Q

What is Data Shuffling?

4.1

A

Randomly reorders the data so it is free from unwanted patterns introduced during collection
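
A short Python sketch of shuffling, assuming hypothetical rows that were stored in collection order (here, by day). The fixed seed is only there to make the example reproducible.

```python
import random

# Hypothetical observations stored in the order they were collected.
rows = [("mon", 1), ("mon", 2), ("tue", 3), ("tue", 4), ("wed", 5)]

rng = random.Random(42)   # fixed seed so the shuffle is reproducible
shuffled = rows[:]        # copy, so the original order is kept
rng.shuffle(shuffled)     # reorder in place, at random
```

The shuffled list holds the same rows as before; only their order changes.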

20
Q

What is a numerical variable?

4.2

A

Numbers that can be manipulated (e.g. summed or averaged) and provide useful information

21
Q

What is a categorical variable?

4.2

A

Values, including numbers used as labels, that carry no numerical meaning.

Dates are also considered categorical

22
Q

What is text data mining?

4.3

A

The process of deriving valuable, unstructured data from text.

23
Q

What is data masking?

4.3

A

Data masking conceals the original data with random or false data.

It allows you to conduct analysis while keeping confidential information secure.
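
A minimal sketch of one masking approach, assuming each distinct value is swapped for a random token so that analysis on the masked column (e.g. counting repeat customers) still works. The `mask_column` helper and the `user_` prefix are hypothetical.

```python
import secrets

def mask_column(values):
    """Replace each distinct value with a random token, consistently."""
    mapping = {}
    masked = []
    for v in values:
        if v not in mapping:
            # Random token; the same original value always gets the same token.
            mapping[v] = "user_" + secrets.token_hex(4)
        masked.append(mapping[v])
    return masked, mapping

names = ["Alice", "Bob", "Alice"]
masked, mapping = mask_column(names)
```

The `mapping` is what would need to be kept in a secure place; the masked column alone does not reveal the names.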

24
Q

What is a metric?

4.5

A

a value derived from obtained measures

aims at gauging business performance/progress (has business meaning)

25
What is a measure? 4.5
simple stats of past performance (no business meaning)
26
What is a KPI? 4.5
Key Performance Indicator: a metric tied to a business objective (metric + business objective)
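
The difference between a measure, a metric, and a KPI can be sketched in a few lines of Python. All names and numbers are made up for illustration, including the 2% target.

```python
# Measures: simple stats of past performance (hypothetical monthly figures).
visitors = 2000
purchases = 50

# Metric: derived from measures, so it carries business meaning.
conversion_rate = purchases / visitors

# KPI: a metric tied to a business objective (an assumed 2% target).
TARGET_CONVERSION = 0.02
kpi_met = conversion_rate >= TARGET_CONVERSION
```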
27
What is clustering? 4.7
grouping the data in neighbourhoods to analyse meaningful patterns
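
As one illustration, the sketch below uses k-means, a common clustering technique chosen here as an assumption (the card does not name a method), on hypothetical one-dimensional points.

```python
def kmeans_1d(points, centers, iterations=10):
    """Tiny 1-D k-means: assign points to the nearest center, then re-center."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical data with two obvious neighbourhoods.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, groups = kmeans_1d(points, centers=[0.0, 5.0])
```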
28
What is a time series? 4.7
Used in economics and finance; shows the development of certain values over time (e.g. stock prices, sales volume)
29
What is a model in machine learning? 4.9
an algorithm to recognize certain patterns
30
What is an objective function? 4.9
The specification of a machine learning problem; a function to be maximized or minimized depending on the task
31
What is an optimization algorithm? 4.9
Algorithm that iteratively compares previous solutions until reaching the optimal solution
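
To make the objective function and optimization algorithm concrete, here is a small sketch using gradient descent, one common optimization algorithm picked as an assumption. The objective, learning rate, and step count are illustrative.

```python
def objective(w):
    # Objective function to minimize: squared distance from 3.
    return (w - 3.0) ** 2

def gradient(w):
    # Derivative of the objective with respect to w.
    return 2.0 * (w - 3.0)

def gradient_descent(w=0.0, lr=0.1, steps=100):
    for _ in range(steps):
        w -= lr * gradient(w)   # step downhill from the previous solution
    return w

w_opt = gradient_descent()
```

Each step improves on the previous solution, so `w_opt` ends up very close to 3, where the objective is smallest.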
32
What are the three main types of machine learning?
Supervised, unsupervised, and reinforcement learning
33
What is supervised learning? 4.10
The algorithm receives feedback on whether it did ‘good’ or needs to improve. Uses labelled data.
34
What is unsupervised learning? 4.10
The algorithm trains itself, without feedback on correct answers. Uses unlabelled data.
35
What is reinforcement learning? 4.10
A reward system is introduced. The goal is to maximize a reward (not minimize an error).
36
What is deep learning? 4.10
Modern state-of-the-art approach to machine learning that leverages the power of neural networks. Can be both supervised and unsupervised.
37
Python and R have their limitations. They are not able to address problems specific to some domains. One example is ‘relational database management systems’. In these instances, ______ works best. 5
SQL
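
A brief sketch of SQL at work, run from Python through the standard-library `sqlite3` module. The `sales` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)

# The aggregation is expressed declaratively in SQL, not as a Python loop.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()
```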
38
Data architect 6
Designs the way data will be retrieved, processed, and consumed
39
Data engineer 6
processes the data for analysis
40
database administrator 6
Handles the control of data; works with traditional data
41
BI analyst 6
Performs analyses and reporting of historical data
42
BI consultant 6
An ‘external BI analyst’
43
BI developer 6
performs analyses specifically designed for the company
44
Data scientist 6
employs traditional statistical methods or unconventional machine learning techniques for making predictions
45
Data analyst 6
prepares advanced analyses
46
Machine learning engineer 6
applies state-of-the-art ML techniques
47
200,000 lines of data constitute big data -- TRUE or FALSE?
FALSE. It is not just volume that defines a data set as ‘big’: variety, variability, velocity, veracity, and other characteristics play an important role as well.
48
Qualitative analyses such as SWOT are not used for quantitative analysis. Hence, they are not part of business intelligence -- TRUE or FALSE?
False