Lecture 1 Flashcards

1
Q

What is the mode?

A

For categorical variables, the mode is the most frequent level. (It is sometimes used for numerical variables as well, when there are only a few distinct values.)

2
Q

What is the variation ratio?

A

Only for categorical variables: the fraction of cases that differ from the mode.
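
As a rough illustration of the two definitions above, the sketch below computes the mode and the variation ratio for a small list of categorical values. The helper name `mode_and_variation_ratio` and the colour data are made up for this example.

```python
from collections import Counter

def mode_and_variation_ratio(values):
    """Return (mode, variation ratio) for a list of categorical values."""
    counts = Counter(values)
    # most_common(1) gives the most frequent level together with its count.
    mode, mode_count = counts.most_common(1)[0]
    # Variation ratio: share of cases that are NOT equal to the mode.
    return mode, 1 - mode_count / len(values)

colours = ["red", "blue", "red", "green", "red", "blue"]  # hypothetical data
mode, vr = mode_and_variation_ratio(colours)
print(mode, vr)
```

Here "red" occurs in 3 of 6 cases, so the variation ratio is 1 - 3/6 = 0.5.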

3
Q

What is entropy?

A

For categorical variables: entropy (H) measures the disorder or uncertainty in the data. If the observations are spread evenly across the levels, uncertainty is high (high H); if they are heavily concentrated on one level, uncertainty is low (low H).
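
The card does not give a formula, so the sketch below assumes H is the Shannon entropy with a base-2 logarithm, H = sum of p_i * log2(1 / p_i) over the levels. The function name `entropy` and the sample data are illustrative only.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    # p = c / n is the share of each level; log2(n / c) equals log2(1 / p).
    return sum((c / n) * math.log2(n / c) for c in Counter(values).values())

evenly_spread = ["a", "b", "c", "d"]   # every level equally frequent
biased = ["a"] * 9 + ["b"]             # almost everything is "a"

print(entropy(evenly_spread))  # high uncertainty
print(entropy(biased))         # low uncertainty
```

Four equally frequent levels give H = 2 bits, while a variable with only one observed level gives H = 0.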

4
Q

What does a contingency table show?

A

The levels of one or two categorical variables together with their frequencies, shown either as counts or as proportions (relative shares). With two categorical variables the table becomes two-dimensional (matrix style).
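
A minimal sketch of a two-variable contingency table, built with the standard library only. The variables (smoker, sex) and all observations are invented for the example.

```python
from collections import Counter

# Hypothetical observations: (smoker, sex) pairs.
observations = [
    ("yes", "m"), ("no", "m"), ("no", "f"), ("yes", "f"),
    ("no", "f"), ("no", "m"), ("no", "f"), ("yes", "m"),
]

counts = Counter(observations)   # cell counts
total = len(observations)
# Proportions relative to the grand total: the whole table sums to 1.
proportions = {cell: n / total for cell, n in counts.items()}

rows = sorted({smoker for smoker, _ in observations})
cols = sorted({sex for _, sex in observations})
for r in rows:
    print(r, [counts[(r, c)] for c in cols], [proportions[(r, c)] for c in cols])
```

Each printed line is one row of the table, first as counts and then as shares of all observations.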

5
Q

What is the mean?

A

Only for numerical variables: the mean is the sum of all values divided by the number of data points (the arithmetic average).

6
Q

What is the median?

A

Only for numerical variables: the value that separates the top 50% of the data from the bottom 50%.
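
The mean and median definitions can be checked with Python's `statistics` module. The income figures below are made up; with an even number of values, `statistics.median` averages the two middle values.

```python
import statistics

incomes = [20, 22, 25, 27, 30, 500]  # hypothetical data

mean = statistics.mean(incomes)      # sum of values / number of values
median = statistics.median(incomes)  # middle of the sorted values

print(mean, median)
```

The sum is 624 over 6 values, so the mean is 104, while the median is (25 + 27) / 2 = 26.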

7
Q

What is meant by the dispersion of data?

A

Dispersion refers to the spread of the data. Common measures are the range, variance, standard deviation, and quantiles.

8
Q

What is the range of data?

A

The (absolute) difference between the minimum and maximum of numerical data.

9
Q

What is the downside of using the range to describe data?

A

Because it reflects only the difference between the minimum and maximum, it does not account for how the data are spread between them, and it is heavily influenced by outliers.
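
A small sketch of how a single outlier changes the range. The helper `data_range` and both data sets are hypothetical.

```python
def data_range(values):
    """Difference between the largest and smallest value."""
    return max(values) - min(values)

typical = [4, 5, 5, 6, 6, 7]    # made-up data
with_outlier = typical + [60]   # same data plus one extreme value

print(data_range(typical))       # 7 - 4 = 3
print(data_range(with_outlier))  # 60 - 4 = 56
```

One extra observation moves the range from 3 to 56, even though the other six values are unchanged.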

10
Q

What is variance?

A

Only for numerical variables: the expected squared deviation from the mean.

11
Q

What is the standard deviation?

A

Only for numerical variables: the square root of the variance. It is easier to interpret than the variance because it is in the same units as the variable.
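
A short sketch of variance and standard deviation, assuming the population versions (dividing by n), which match the "expected squared deviation" wording. The sample versions, `statistics.variance` and `statistics.stdev`, divide by n - 1 instead. The heights are made-up data.

```python
import statistics

heights_cm = [160, 170, 180]  # hypothetical data, mean = 170

# Deviations from the mean are -10, 0, 10, so the variance is
# (100 + 0 + 100) / 3, measured in cm squared.
variance = statistics.pvariance(heights_cm)
# The standard deviation is the square root of the variance, back in cm.
std = statistics.pstdev(heights_cm)

print(variance, std)
```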

12
Q

What are quantiles?

A

Only for numerical variables: cut points that divide the distribution into intervals of equal probability (for example, 16% of the observations lie below the 16th percentile).

13
Q

What is the interquartile range (IQR)?

A

For numerical variables: quartiles are the quantiles that split the data into four equal segments. The IQR is the difference between the 3rd and 1st quartiles, so it spans the inner 50% of the data points.
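
Quartiles and the IQR can be sketched with `statistics.quantiles`. Several conventions exist for interpolating quantiles, and the choice of `method="inclusive"` here is an assumption, not something the cards specify. The data are made up.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical data

# n=4 asks for the three cut points that split the data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # width of the inner 50% of the data

print(q1, q2, q3, iqr)
```

With this method the quartiles are 3, 5 and 7, so the IQR is 4; other methods may give slightly different cut points on small samples.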

14
Q

What is the influence of bandwidth in density plots?

A

Bandwidth controls the smoothing of the density plot: a higher bandwidth yields a smoother curve, and vice versa.
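
To make the bandwidth effect concrete, here is a hand-rolled kernel density estimate, assuming Gaussian kernels (the cards do not say which kernel the lecture uses). The functions `gaussian_kde` and `count_peaks`, the data, and the two bandwidth values are all illustrative.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function x -> estimated density, using Gaussian kernels."""
    norm = len(data) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        # One Gaussian bump per observation; bandwidth is the bump's width.
        return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data) / norm

    return density

def count_peaks(density, grid):
    """Count strict local maxima of the density on an evenly spaced grid."""
    ys = [density(x) for x in grid]
    return sum(1 for a, b, c in zip(ys, ys[1:], ys[2:]) if b > a and b > c)

data = [0.8, 1.0, 1.2, 4.9, 5.0, 5.1]       # two made-up clusters
grid = [-2 + 0.1 * i for i in range(101)]   # roughly -2 to 8 in steps of 0.1

narrow_peaks = count_peaks(gaussian_kde(data, 0.3), grid)
wide_peaks = count_peaks(gaussian_kde(data, 3.0), grid)
print(narrow_peaks, wide_peaks)
```

The narrow bandwidth keeps the two clusters as separate peaks, while the wide bandwidth smooths them into a single bump.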

15
Q

What do the conditional (marginal) proportions show in a contingency table?

A

Instead of relating the shares/proportions of two categorical variables to the grand total, we make them relative to the totals of one of the two variables. Hence each row (or column) of the table now sums to 100%, instead of the whole table summing to 100%.
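
A minimal sketch of row-wise conditional proportions, where each cell is divided by its row total rather than by the grand total. The table of counts (smoker by sex) and the helper `row_proportions` are invented for this example.

```python
# Hypothetical counts: rows = smoker (yes/no), columns = sex (m/f).
counts = {
    "yes": {"m": 2, "f": 1},
    "no": {"m": 2, "f": 3},
}

def row_proportions(table):
    """Divide every cell by its row total, so each row sums to 1."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: n / row_total for col, n in cells.items()}
    return result

conditional = row_proportions(counts)
for row, cells in conditional.items():
    print(row, cells)
```

For example, among the non-smokers 3 of 5 are "f", so that cell becomes 0.6. Conditioning on the columns instead would divide by column totals in the same way.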

16
Q

What is the advantage of using conditional marginal proportions in contingency tables?

A

It may reveal relations between categorical variables, or within a single categorical variable, that were not (obviously) apparent before.
