Week 1 & 2 Flashcards

(30 cards)

1
Q

Why do we need to know data analysis?

A

There is a problem that needs to be solved and we need data and analytics to properly act on it

2
Q

What is a population?

A

All entities of interest in a study

3
Q

What is a sample?

A

A subset or portion of the population, ideally chosen at random

4
Q

What is a dataset?

A

A table of data with variables in the columns and observations in the rows
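As a rough illustration of this layout, the sketch below stores a tiny dataset as a list of rows, one per observation, with one key per variable. The variable names and values are made up for the example.

```python
# Hypothetical dataset: each dict is one observation (row),
# and each key is one variable (column).
dataset = [
    {"height_cm": 170, "gender": "F", "income": 52000},
    {"height_cm": 182, "gender": "M", "income": 48000},
    {"height_cm": 165, "gender": "F", "income": 61000},
]

variables = list(dataset[0].keys())   # the column names
n_observations = len(dataset)         # the number of rows

# Pull out one variable (column) across all observations.
incomes = [row["income"] for row in dataset]

print(variables)
print(n_observations)
print(incomes)
```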

5
Q

What are some examples of variables?

A

height, gender, income

6
Q

What are some data types?

A

Numeric vs categorical; Ordinal vs nominal

7
Q

What is numeric?

A

Data on which meaningful arithmetic can be performed

8
Q

What is categorical?

A

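Anything that is not numeric: the values are category labels, so arithmetic on them is not meaningful (even if the labels are coded as numbers)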

9
Q

What is ordinal?

A

There is a natural ordering of categories

10
Q

What is nominal?

A

No natural ordering

11
Q

What is a binary decision?

A

A 0/1 variable. A categorical variable with n different categories can be coded using n-1 such 0/1 (dummy) variables
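A minimal sketch of 0/1 coding, assuming one category is chosen as the baseline and left out, so that n categories need only n-1 dummy variables. The function name `dummy_code` and the region data are hypothetical.

```python
def dummy_code(values, baseline):
    """Code a categorical variable as 0/1 dummies, omitting the baseline category."""
    categories = sorted(set(values))
    kept = [c for c in categories if c != baseline]   # n-1 categories remain
    return [{f"is_{c}": int(v == c) for c in kept} for v in values]

# Hypothetical variable with n = 3 categories: north, south, west.
regions = ["north", "south", "west", "south"]
dummies = dummy_code(regions, baseline="north")

for region, row in zip(regions, dummies):
    print(region, row)
```

An observation in the baseline category has 0 in every dummy column, which is why the last column is not needed.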

12
Q

What is binning or discretizing?

A

Grouping the values of a numeric variable into a small number of discrete categories (bins), giving up the specific values
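A small sketch of binning, assuming age as the numeric variable. The cut points (18 and 65) and the function name `bin_age` are illustrative choices, not part of the course material.

```python
def bin_age(age):
    """Map a numeric age to a discrete age group (cut points are illustrative)."""
    if age < 18:
        return "under 18"
    if age < 65:
        return "18-64"
    return "65+"

ages = [12, 18, 40, 64, 65, 80]          # hypothetical numeric values
groups = [bin_age(a) for a in ages]      # the binned (categorical) version

print(groups)
```

The resulting categories have a natural order, so the binned variable is ordinal.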

13
Q

What are some more data types?

A

Discrete vs continuous; Cross sectional vs time series

14
Q

What is discrete?

A

Count data (e.g. # of children)

15
Q

What is continuous?

A

A measurement that can take any value within a range, like weight

16
Q

What is cross sectional?

A

Data on a cross section of a population at a FIXED point in time

17
Q

What is time series?

A

Data that are collected over time

18
Q

What is an outlier?

A

An observation that lies outside of the norm (doesn’t mean it’s wrong)

19
Q

What are missing values?

A

The value of a variable is missing for an observation

20
Q

What to do with missing values?

A

Ignore the observation, substitute the average value, or estimate the missing value
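The first two options can be sketched as follows, assuming missing values are marked with `None`. The income figures are made up. The third option (estimating the value, for example from other variables) is not shown.

```python
from statistics import mean

# Hypothetical incomes; None marks a missing value.
incomes = [50000, None, 60000, 40000, None]

# Option 1: ignore the observations with a missing value.
observed = [x for x in incomes if x is not None]

# Option 2: replace each missing value with the average of the observed ones.
avg = mean(observed)
imputed = [x if x is not None else avg for x in incomes]

print(observed)
print(avg)
print(imputed)
```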

21
Q

What to do with an outlier?

A

Run analysis and report with and without the outlier
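A minimal sketch of reporting both ways, assuming the outlier has already been identified by inspection. The sales figures and the choice of 95 as the outlier are hypothetical.

```python
from statistics import mean, median

# Hypothetical daily sales; 95 is the suspected outlier.
sales = [10, 12, 11, 13, 12, 95]
outlier = 95
sales_without = [x for x in sales if x != outlier]

# Run the same summary on both versions and report them side by side.
report = {
    "with outlier": {"mean": mean(sales), "median": median(sales)},
    "without outlier": {"mean": mean(sales_without), "median": median(sales_without)},
}

for label, stats in report.items():
    print(label, stats)
```

In this made-up data the mean moves a lot when the outlier is dropped while the median does not change, which is the kind of difference the two reports make visible.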

22
Q

What is the most useful numeric summary measure?

23
Q

What is the most useful graph?

24
Q

What is the tool to compare numerical variables across two or more subpopulations?

A

Side-by-Side Boxplots
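A boxplot draws the five-number summary (minimum, quartiles, maximum) of a variable, so a side-by-side comparison can be sketched numerically by computing that summary for each subpopulation. The groups and salary values below are hypothetical, and quartile conventions differ between tools; this sketch uses the default method of Python's `statistics.quantiles`.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return the five numbers a boxplot is drawn from."""
    q1, q2, q3 = quantiles(values, n=4)   # default 'exclusive' quartile method
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}

# Hypothetical salaries (in thousands) for two subpopulations.
salaries = {
    "department A": [48, 52, 55, 60, 61, 67, 70],
    "department B": [45, 50, 58, 62, 66, 75, 90],
}

summaries = {group: five_number_summary(v) for group, v in salaries.items()}

for group, summary in summaries.items():
    print(group, summary)
```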

25
Q

Tools to study relationships among numeric variables?

A

Scatterplot, correlation, and covariance

26
Q

What is a scatterplot?

A

A 2D graph that plots pairs of values from 2 numerical variables, often used to examine relationships (e.g. temperature and sales)

27
Q

What are correlation and covariance?

A

Measures of the strength and direction of a LINEAR relationship between 2 numerical variables, X and Y. Note: X and Y should be paired variables, where Xi and Yi are the values for observation i and n is the number of observations

28
Q

What is a perfect positive correlation?

A

A scatterplot in which the points fall exactly on an UPWARD-sloping straight line (Value = 1)

29
Q

What is a perfect negative correlation?

A

A scatterplot in which the points fall exactly on a DOWNWARD-sloping straight line (Value = -1)

30
Q

What is NO CORRELATION, and what is its value?

A

A scatterplot in which the points are spread out with no linear trend. Value = 0
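The three reference values can be checked on small made-up datasets. This sketch assumes the standard correlation formula, and the last dataset is an added illustration that a value of 0 only rules out a LINEAR relationship.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation of paired values, computed from deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
upward = [2 * v + 1 for v in x]       # points exactly on an upward line
downward = [10 - 3 * v for v in x]    # points exactly on a downward line
scattered = [1, 3, 2, 3, 1]           # no linear trend
curved = [(v - 3) ** 2 for v in x]    # clear U-shape, but not linear

for name, y in [("upward", upward), ("downward", downward),
                ("scattered", scattered), ("curved", curved)]:
    print(name, round(correlation(x, y), 6))
```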