Lectures 1-4: Datasets, Variables, Distributions, Estimation, Quantitative Methods Flashcards

1
Q

What is a variable?

A

A variable records a characteristic of each case within a dataset and can take more than one value across cases – e.g. a variable might record incomes or unemployment rates.

2
Q

What is a nominal (a.k.a. categorical) variable?

A

A variable with distinct categories that does not tell you anything about the relationship between them, and cannot be ranked in terms of value or order – e.g. birthplace, religion.

3
Q

Are nominal variables and categorical variables the same thing?

A

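Yes, in this course the two terms are used interchangeably. Some textbooks use "categorical" more broadly, to cover ordinal variables as well.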

4
Q

What is an ordinal variable?

A

A variable with categories that can be ordered or ranked according to some criterion (but where you cannot specify the precise size of the interval between any two categories) – e.g. a ranking of low-, semi- or high-skilled workers.

5
Q

What is an interval (a.k.a. ratio) variable?

A

A variable on a scale, with an exact distance between any pair of values. It may be either continuous (e.g. height, income) OR discontinuous/discrete (e.g. indivisible units such as numbers of factories or people).

6
Q

Are interval variables and ratio variables different?

A

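This course treats them as the same thing. Strictly speaking, a ratio variable is an interval variable with a true zero point, so ratios between values are meaningful (e.g. income), whereas an interval variable has an arbitrary zero (e.g. temperature in °C).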

7
Q

What is a dummy variable?

A

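A variable that records a qualitative attribute, which cannot be measured on a scale, by assigning numeric codes to its categories – typically 0 = no and 1 = yes. An attribute with more than two categories is usually represented by a set of 0/1 dummies, one per category.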
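
As a minimal sketch of the 0/1 coding described above, the following turns text answers into a dummy variable. The survey question, the answers and the variable names are made up for illustration.

```python
# Hypothetical survey answers to "Did you vote?" (illustrative values only).
answers = ["yes", "no", "yes", "yes", "no"]

# Dummy coding: 1 if the attribute is present, 0 otherwise.
voted = [1 if answer == "yes" else 0 for answer in answers]

# The mean of a 0/1 dummy equals the share of cases coded 1.
share_voted = sum(voted) / len(voted)

print(voted)        # [1, 0, 1, 1, 0]
print(share_voted)  # 0.6
```

One convenient property of this coding is that the mean of the dummy can be read directly as a proportion.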

8
Q

What is an independent (a.k.a. explanatory) variable?

A

A variable that explains your dependent variable.

9
Q

What is a dependent (a.k.a. response) variable?

A

It represents a phenomenon that you want to understand through comparison to other variables (i.e. your independent variables).

10
Q

What are univariate (a.k.a. descriptive) statistics?

A

They capture the distribution of an individual variable; univariate analysis is the simplest form of statistical analysis.

11
Q

What types of methods would you use to investigate univariate statistics?

A

For qualitative variables: frequency, mode and (for ordinal variables only) median. For quantitative variables: mean, median, mode, standard deviation, etc.

12
Q

What are bivariate statistics?

A

They capture the relationship between 2 variables – e.g. racism and income.

13
Q

What types of methods would you use to investigate bivariate statistics?

A

For qualitative variables: cross-tabulation, Cramér's V, logistic/multinomial regression. For quantitative variables: correlation (if the independent variable is quantitative), simple regression.
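
Two of the methods listed above can be sketched in a few lines: a cross-tabulation for two qualitative variables and a Pearson correlation for two quantitative ones. All data and names below (`gender`, `voted`, `years_of_schooling`, `income`, `pearson_r`) are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical cases (illustrative values only).
gender = ["f", "m", "f", "m", "f", "m"]
voted = ["yes", "yes", "no", "no", "yes", "no"]

# Cross-tabulation: count the cases in each combination of categories.
crosstab = Counter(zip(gender, voted))


def pearson_r(x, y):
    """Pearson correlation coefficient between two quantitative variables."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


years_of_schooling = [8, 10, 12, 14, 16]
income = [15, 22, 24, 33, 41]  # hypothetical, in thousands

print(crosstab[("f", "yes")], crosstab[("m", "yes")])
print(round(pearson_r(years_of_schooling, income), 3))
```

The coefficient lies between -1 and 1; values near 1 indicate that the two variables rise together almost linearly.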

14
Q

What are multivariate statistics?

A

They capture or model the relationships among 3 or more variables.

15
Q

What types of methods would you use to investigate multivariate statistics?

A

For qualitative variables: logistic/multinomial regression. For quantitative variables: multiple regression.

16
Q

What is statistical inference?

A

The process of analysing sample data to infer the properties of an underlying distribution.

17
Q

What is a dataset?

A

A series of units (individuals, households, etc.) with one or more characteristics (variables). For each variable there is a sequence of observations, each with its own value.

18
Q

What do cross-sectional datasets capture?

A

They capture the characteristics of comparable units at a single point in time. The observations vary by characteristics, not by unit type. Example: one round of the ESS.

19
Q

What do time-series datasets capture?

A

Time series capture repeated observations of the same unit at different time periods. Example: inflation in one country over several decades.

20
Q

What do cross-sectional time-series (CSTS) datasets capture?

A

They capture fixed and non-sampled units at different time periods. For example: inflation in several countries over several decades

21
Q

What is panel data?

A

A dataset that captures the same sampled units at different time intervals. Example: electoral panels (the same individuals over several elections).

22
Q

What is a rolling cross-section dataset?

A

A dataset that captures sampled units (called cohorts) at different time intervals, drawing a new sample each time. Example: several rounds of the ESS (different individuals, but with the same characteristics of age, nationality, etc.).

23
Q

What is the mean?

A

It is the arithmetic average: the sum of all values divided by the number of values.

24
Q

What is the median?

A

It is the value in the middle once the observations are sorted. There is the same number of observations above and below it.

25
Q

What is the mode?

A

The value that occurs most often.
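
The three measures of central tendency on the cards above can be compared with Python's standard `statistics` module. This is a small sketch on made-up household sizes; the single large value is included to show how it pulls the mean away from the median and mode.

```python
import statistics

# Hypothetical household sizes (illustrative values only).
values = [1, 2, 2, 3, 4, 4, 4, 5, 20]

mean_value = statistics.mean(values)      # sum of values / number of values
median_value = statistics.median(values)  # middle value of the sorted data
mode_value = statistics.mode(values)      # most frequent value

print(mean_value, median_value, mode_value)  # 5 4 4
```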

26
Q

What is a quartile?

A

One of the three values that divide the sorted observations into four equal parts. Ex. The first quartile has ¼ of the values below it, the second quartile is the median, and the third quartile has ¾ of the values below it.

27
Q

What is an interquartile range?

A

The distance between the 1st and 3rd quartiles, i.e. the range covered by the middle half of the values.
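
As an illustration of quartiles and the interquartile range, the sketch below uses `statistics.quantiles` on made-up data. Note that several conventions exist for interpolating quartiles; the function's default ("exclusive") method is only one of them, so other software may report slightly different cut points.

```python
import statistics

# Hypothetical sorted observations (illustrative values only).
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 13, 15]

# n=4 splits the data into four parts and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Interquartile range: distance between the third and first quartiles.
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

The second cut point coincides with the median, which is why the median is also called the second quartile.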

28
Q

What is the mean deviation?

A

It measures how much the values typically differ from the mean. It is the average of the absolute differences between each value and the mean (the signed differences always sum to zero).

29
Q

What is the variance?

A

It is the mean of the squared deviations from the mean. Squaring removes the negative signs, so deviations above and below the mean do not cancel out, and it gives more weight to values far from the mean. There is a disadvantage: it is expressed in "square units".

30
Q

What is the standard deviation?

A

The square root of the variance. It gives the typical deviation from the mean, expressed in the same units as the variable.

31
Q

What is the coefficient of variation?

A

The standard deviation divided by the mean. Because it is unit-free, it allows you to compare the dispersion of series measured in different units.
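
The dispersion measures on the cards above can be computed step by step. This sketch assumes the "mean of the squared deviations" definition, i.e. the population variance (dividing by n); many packages default to the sample variance (dividing by n - 1). The data are made up for illustration.

```python
import statistics

# Hypothetical observations (illustrative values only).
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = statistics.mean(values)
deviations = [x - mean_value for x in values]

# Signed deviations cancel out, so their sum is zero.
sum_of_deviations = sum(deviations)

# Mean (absolute) deviation: average distance from the mean.
mean_abs_deviation = sum(abs(d) for d in deviations) / len(values)

# Population variance: mean of the squared deviations (in squared units).
variance = statistics.pvariance(values)

# Standard deviation: square root of the variance (back in original units).
std_dev = statistics.pstdev(values)

# Coefficient of variation: unit-free ratio of standard deviation to mean.
coeff_of_variation = std_dev / mean_value

print(mean_abs_deviation, variance, std_dev, coeff_of_variation)
```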

32
Q

What is a normal distribution or normal curve?

A

It takes the form of a bell-shaped curve. The curve is symmetrical and unimodal, so the mean, the median and the mode are identical.

33
Q

What is the skewness?

A

It measures the extent to which a distribution is asymmetrical. It is given by the relationship between the mean and the median: the greater the skewness, the greater the difference between the mean and the median

34
Q

When is a distribution skewed to the right (positive skewness)?

A

When the tail of the curve stretches to the right containing a small number of very large values.

35
Q

When is a distribution skewed to the left (negative skewness)?

A

When the tail of the curve stretches to the left with a small number of very low values.

36
Q

What is kurtosis?

A

It measures how heavy the tails of the distribution are. It is close to 3 for approximately normal distributions.
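
Skewness and kurtosis can both be written as standardised central moments. The sketch below uses the simple moment-based (population) formulas, which is an assumption: statistical packages often apply small-sample corrections or report "excess" kurtosis (kurtosis minus 3). The helper names and data are hypothetical.

```python
from statistics import NormalDist


def central_moment(values, k):
    """Mean of the k-th powers of the deviations from the mean."""
    mean_value = sum(values) / len(values)
    return sum((x - mean_value) ** k for x in values) / len(values)


def skewness(values):
    """Third standardised moment: 0 if symmetric, > 0 if right-skewed."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Fourth standardised moment: about 3 for a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


right_skewed = [1, 1, 2, 2, 3, 20]        # a few very large values
left_skewed = [-x for x in right_skewed]  # mirror image: a few very low values
normal_sample = NormalDist(mu=0, sigma=1).samples(50_000, seed=1)

print(skewness(right_skewed) > 0, skewness(left_skewed) < 0)  # True True
print(round(kurtosis(normal_sample), 2))
```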

37
Q

What is the empirical law of the normal distribution?

A

A constant proportion of all the cases lies within a given distance from the mean, measured in standard deviations: about 68% within 1 standard deviation, about 95% within 2, and about 99.7% within 3.
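
These proportions can be checked against the theoretical normal curve with `statistics.NormalDist`. The helper name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of cases within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
```

Because the proportions depend only on the distance in standard deviations, the same shares hold for any normal distribution, whatever its mean and spread.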