Descriptive stats Flashcards

(50 cards)

1
Q

What’s the difference between descriptive and inferential statistics?

A

Descriptive –> describe the sample data by means of sample statistics
Inferential –> use sample statistics to learn about population parameters

2
Q

What is micro data?

A

Data collected on individuals

3
Q

What is macro data?

A

Data collected on a group of units

4
Q

What is a population?

A

The set of all statistical units that are the object of interest

5
Q

What is the sample?

A

A subset drawn from the population

6
Q

What are non-probability and probability sampling?

A

Non-probability –> units are selected from the population according to the judgement of the researcher
Probability –> units are drawn at random from the population, and every unit has a known, non-zero probability of being drawn (the same for all units in simple random sampling)

7
Q

What is the inferential process?

A

It consists of drawing conclusions about the entire population from the information provided by a sample

8
Q

What are the two broad variable categories?

A

Numerical and categorical

9
Q

What are the subsets of numerical variables?

A

Discrete and continuous

10
Q

What are the subsets of categorical variables?

A

Ordinal and nominal

11
Q

What are the columns of a frequency distribution table?

A

Classes/groups; absolute frequencies and relative frequencies
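As an illustration only, the three columns can be built from raw observations like this. It is a minimal sketch using made-up data; the names `observations` and `table` are hypothetical.

```python
from collections import Counter

# Made-up observations of a categorical variable.
observations = ["A", "B", "A", "C", "A", "B"]

counts = Counter(observations)   # absolute frequency of each class
n = len(observations)

# One row per class: (class, absolute frequency, relative frequency)
table = [(level, counts[level], counts[level] / n) for level in sorted(counts)]

for level, absolute, relative in table:
    print(f"{level}\t{absolute}\t{relative:.2f}")
```

The relative frequencies always add up to 1, which is a quick check on the table.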

12
Q

How is a histogram composed?

A

Horizontal axis –> intervals
Bars –> each bar has an area equal to the relative frequency of its interval
Vertical axis –> interval density = relative frequency/interval width

13
Q

How can we calculate an interval density?

A

Relative frequency/interval width
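A small sketch of the calculation, assuming made-up intervals of unequal width (the names `intervals` and `abs_freq` are hypothetical). It also checks that bar area equals relative frequency.

```python
# Made-up class intervals (lower, upper) and their absolute frequencies.
intervals = [(0, 10), (10, 20), (20, 50)]
abs_freq = [5, 10, 5]

n = sum(abs_freq)
rel_freq = [f / n for f in abs_freq]

# density = relative frequency / interval width
density = [rf / (hi - lo) for rf, (lo, hi) in zip(rel_freq, intervals)]

# bar area = density * width, which gives back the relative frequency
areas = [d * (hi - lo) for d, (lo, hi) in zip(density, intervals)]

for (lo, hi), d in zip(intervals, density):
    print(f"[{lo}, {hi}): density = {d:.4f}")
```

The wide last interval gets a low bar even though it holds as many observations as the first one, which is why densities rather than frequencies go on the vertical axis.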

14
Q

How does the number of intervals relate to the accuracy?

A

The higher the # of intervals, the higher the detail of the description.

15
Q

What are the three measures of central tendency?

A

Mode, median and mean

16
Q

What is the mode?

A

The level/value of a variable that is observed with the highest frequency

17
Q

What is the only measure of central tendency available for nominal variables?

A

Mode

18
Q

What is the median?

A

It is the central value of the ordered observations. If n is odd –> the value in position (n+1)/2; if n is even –> the mean of the two central values (positions n/2 and n/2 + 1)
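The odd and even cases can be sketched as follows. This is an illustrative helper (the function name `median` is my own choice), not a reference implementation.

```python
def median(values):
    """Median of a non-empty list of numbers."""
    s = sorted(values)           # the median is defined on ordered data
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]            # 1-based position (n + 1) / 2
    return (s[mid - 1] + s[mid]) / 2   # mean of the two central values


print(median([7, 1, 5]))      # odd n: the middle value, 5
print(median([7, 1, 5, 3]))   # even n: (3 + 5) / 2 = 4.0
```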

19
Q

What is the mean?

A

The arithmetic average of the values: (x1 + x2 + ... + xn)/n

20
Q

What is the deviation?

A

It’s the difference between an observed value and the mean (xi - mean)

21
Q

What are the properties of deviation?

A
  • It’s positive when the value is higher than the mean, negative when it is lower, and zero when it equals the mean
  • The sum of all deviations is equal to 0
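Both properties can be checked on a tiny made-up data set (the values below are illustrative only):

```python
values = [2, 4, 9]                        # made-up observations
mean = sum(values) / len(values)          # 5.0

# deviation = observed value - mean
deviations = [x - mean for x in values]   # [-3.0, -1.0, 4.0]

# Negative below the mean, positive above it, and they cancel out.
total = sum(deviations)
print(deviations, total)
```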
22
Q

Do unusual values (outliers) have an impact on the median?

A

No, because it depends on the order of the observations, not on the size of the extreme values

23
Q

Do unusual values (outliers) have an impact on the mean?

A

Yes, because it’s computed using all values
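A quick comparison on made-up data shows the difference between the two measures. The sketch uses Python's standard `statistics` module; the data values are hypothetical.

```python
import statistics

data = [10, 12, 13, 15, 20]           # made-up observations
with_outlier = data[:-1] + [200]      # largest value replaced by an extreme one

mean_before = statistics.mean(data)               # 14
median_before = statistics.median(data)           # 13
mean_after = statistics.mean(with_outlier)        # 50
median_after = statistics.median(with_outlier)    # still 13

print(mean_before, median_before)
print(mean_after, median_after)
```

The single extreme value pulls the mean from 14 to 50, while the median stays at 13.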

24
Q

What are the measures of location? And for which types of data can they be computed?

A
  • Quartiles and percentiles
  • They can be computed for ordinal categorical and numerical data

25
What are quartiles?
They divide the ordered observations into four equal parts
Q1 --> 25% of the observations are smaller than it
Q2 --> it's the median
Q3 --> 75% of the observations are smaller than it
26
What is the pth percentile?
The value below which p% of the observations fall
27
What is the five-number summary? And how can it be represented?
Minimum, Q1, Q2 (the median), Q3 and maximum | By means of a boxplot
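A minimal sketch of the five numbers, using `statistics.quantiles` from the Python standard library on made-up data. Quartile conventions differ between textbooks and software; the `"inclusive"` method chosen here is an assumption, so results on other data may differ slightly from hand calculations.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]    # made-up observations

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

summary = (min(data), q1, q2, q3, max(data))
print(summary)
```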
28
How is the boxplot composed?
Height --> it's the IQR (Q3 - Q1)
Upper edge --> Q3
Lower edge --> Q1
Whiskers --> extend from the box to the most extreme observations within 1.5 x IQR of its edges; observations beyond that are outliers
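The 1.5 x IQR rule can be sketched as below. This assumes the common convention in which whiskers stop at the most extreme observations inside the fences; the data and the quartile method (`"inclusive"`) are illustrative choices.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 30]   # made-up observations, one extreme value

q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                          # height of the box

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme observations still inside the fences.
inside = [x for x in data if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

# Anything outside the fences is drawn as an individual outlier point.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(iqr, (whisker_low, whisker_high), outliers)
```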
29
What are the properties of a symmetrical/bell-shaped distribution?
Symmetry --> Q1 - min = max - Q3 and median - Q1 = Q3 - median
Bell shape --> median - Q1 < Q1 - min and Q3 - median < max - Q3
30
What are the properties of a symmetrical but not bell-shaped distribution?
Symmetry --> Q1 - min = max - Q3 and median - Q1 = Q3 - median
Not bell-shaped --> median - Q1 > Q1 - min and Q3 - median > max - Q3
31
What are the properties of a right-skewed distribution?
Frequencies are high on low values and low on high values (long right tail)
Q3 - median > median - Q1
Mean > median
32
What are the properties of a left-skewed distribution?
Frequencies are high on high values and low on low values (long left tail)
Median - Q1 > Q3 - median
The mean is pulled down by the few low values, so mean < median
33
What are the 4 measures of variability?
Range, IQR, variance and standard deviation
34
What does the IQR measure?
The spread of the central 50% of the observations
35
What is the variance?
It's the average of the squared deviations from the mean. It measures the dispersion of a variable around its mean. It's never negative, and it's zero only when all values are equal
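A worked sketch on made-up data, following the card's definition literally (dividing by n). Note that the sample variance used in inference divides by n - 1 instead; which one a course uses is not stated here.

```python
from math import sqrt

values = [2, 4, 9]                     # made-up observations
n = len(values)
mean = sum(values) / n                 # 5.0

squared_deviations = [(x - mean) ** 2 for x in values]   # [9.0, 1.0, 16.0]

variance = sum(squared_deviations) / n   # average squared deviation
std_dev = sqrt(variance)                 # back in the variable's own units

print(variance, std_dev)
```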
36
What is the coefficient of variation?
CV = s/mean. It expresses the standard deviation as a fraction (or percentage) of the mean and allows the variability of two variables to be compared when they have different means
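As a sketch, the comparison might look like this for two made-up variables measured in different units. The helper name and the data are hypothetical, and `statistics.pstdev` (which divides by n) is an assumption about which standard deviation `s` denotes.

```python
import statistics


def coefficient_of_variation(values):
    """CV = standard deviation / mean (meaningful for a positive mean)."""
    return statistics.pstdev(values) / statistics.mean(values)


heights_cm = [160, 170, 180]   # made-up data
weights_kg = [50, 70, 90]      # made-up data

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)

print(f"height CV = {cv_height:.1%}, weight CV = {cv_weight:.1%}")
```

Because the CV is unit-free, the two values can be compared directly: here the weights vary more relative to their mean than the heights do.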
37
What does it mean to analyze the concentration of a variable?
It means assessing how far the actual distribution is from the two extremes: equal distribution (minimum concentration) and maximum concentration
38
What does it mean when a variable is very concentrated?
It's very close to maximum concentration, that is, very far from being equally distributed
39
What does it mean when a variable has a low concentration?
It's very close to being equally distributed, that is, very far from maximum concentration
40
Can a concentration analysis be carried out for variables with negative values?
No
41
What property does a variable need so that we can carry out a concentration analysis?
It needs to be transferable
42
How is qi distributed in a case of maximum concentration?
qi = 0 for i = 0, ..., n-1 and qn = 1, that is, the last unit holds the whole total
43
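What are the coordinates of the corner point of the maximum concentration curve?
((n-1)/n, 0): the curve lies on the horizontal axis up to that point and then rises to (1,1). The area between it and the diagonal is (n-1)/(2n)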
44
How is qi distributed in a case of minimum concentration?
qi = fi for every i, that is, every unit holds the same share of the total
45
What are the properties of a concentration curve?
- Continuous
- Convex
- Passes through (0,0) and (1,1)
46
What is the Gini index?
R = concentration area / maximum concentration area, where the maximum concentration area is (n-1)/(2n)
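The ratio can be sketched for individual (ungrouped) data as follows. This is an illustrative implementation under stated assumptions: values are non-negative with a positive total, n is at least 2, the concentration curve is the polygon joining the points (i/n, qi) with qi read as cumulative relative amounts, and the function name `gini_index` is my own.

```python
def gini_index(values):
    """R = concentration area / maximum concentration area, (n-1)/(2n)."""
    x = sorted(values)                 # units ordered from poorest to richest
    n = len(x)
    total = sum(x)

    # Cumulative relative amounts q_0, ..., q_n (q_0 = 0, q_n = 1).
    q = [0.0]
    running = 0
    for v in x:
        running += v
        q.append(running / total)

    # Area under the concentration curve: n trapezoids of width 1/n.
    area_under = sum((q[i - 1] + q[i]) / (2 * n) for i in range(1, n + 1))

    concentration_area = 0.5 - area_under   # area between diagonal and curve
    max_area = (n - 1) / (2 * n)
    return concentration_area / max_area


print(gini_index([5, 5, 5, 5]))     # equal distribution
print(gini_index([0, 0, 0, 20]))    # one unit holds everything
```

The two calls reproduce the extreme cases: R is 0 under equal distribution and 1 under maximum concentration.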
47
When are the Pietra and Gini indices equal to zero?
When fi = qi for all i, that is, under minimum concentration (equal distribution)
48
If high values of one variable tend to occur with high values of the other, what kind of linear association is there?
Positive
49
If high values of one variable tend to occur with low values of the other, what kind of linear association is there?
Negative
50
What is the formula for Pearson's correlation index?
r = cov(X,Y) / (sx · sy)
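A minimal sketch of the formula on made-up data, which also shows the positive and negative cases from the two previous cards. The function name `pearson_r` is hypothetical; covariance and standard deviations both divide by n here, an assumption that does not affect r as long as the same divisor is used throughout.

```python
from math import sqrt


def pearson_r(xs, ys):
    """r = cov(X, Y) / (sx * sy) for two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Covariance: average product of the paired deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

    sx = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sy = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sx * sy)


print(pearson_r([1, 2, 3], [2, 4, 6]))   # perfect positive association
print(pearson_r([1, 2, 3], [6, 4, 2]))   # perfect negative association
```

The result always lies between -1 and 1, and the function is undefined when either variable is constant (a standard deviation of zero).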