QM Flashcards

Question 1

Q

Sampling distribution

Answer

A

A sampling distribution is the probability distribution of a given statistic (such as the
sample mean or sample proportion) across repeated random samples of the same size from a population.

Question 2

Q

Central Limit Theorem

Answer

A

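The Central Limit Theorem (CLT) states that, for a large enough sample size, the
distribution of the sample mean will be approximately normal, regardless of the shape
of the population distribution (provided the population variance is finite). The
t-distribution is commonly used instead for the standardised sample mean when the
population standard deviation is unknown and has to be estimated from the sample.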
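
A small simulation can make the CLT concrete. The sketch below is only an illustration: the exponential population, the sample sizes (2 and 50), the number of repetitions and the helper names `sample_means` and `skewness` are all assumptions chosen for the example, not part of the card.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(sample_size, n_samples=5000):
    """Draw repeated samples from a skewed (exponential, mean 1) population
    and return the mean of each sample."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

means_small = sample_means(2)    # sampling distribution of the mean, n = 2
means_large = sample_means(50)   # sampling distribution of the mean, n = 50

# The population is strongly right-skewed, but the skewness of the sample
# means shrinks toward 0 (the value for a normal distribution) as n grows.
print("skewness, n = 2 :", skewness(means_small))
print("skewness, n = 50:", skewness(means_large))
```

Each list of means is an empirical sampling distribution, so the same run also illustrates the previous card.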

Question 3

Q

Power of a Test

Answer

A

The power of a hypothesis test is the probability of correctly rejecting a false null hypothesis.
In other words, it is the probability of avoiding a Type II error: power = 1 - β, where β is the probability of a Type II error.
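
As a rough illustration, power can be computed directly for a simple case. The sketch below assumes a one-sided (upper-tail) z-test with known population standard deviation; the function name `z_test_power` and the numbers in the usage lines are hypothetical.

```python
from statistics import NormalDist

def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tail z-test of H0: mu = mu0 when the true mean is mu1.

    Assumes sigma (the population standard deviation) is known.
    """
    std_error = sigma / n ** 0.5
    # H0 is rejected when the sample mean exceeds this critical value.
    critical = mu0 + NormalDist().inv_cdf(1 - alpha) * std_error
    # Power = P(sample mean > critical value) when the true mean is mu1.
    return 1 - NormalDist(mu1, std_error).cdf(critical)

# Hypothetical example: H0 mean 100, true mean 105, sigma 15.
power_n25 = z_test_power(100, 105, 15, n=25)
power_n100 = z_test_power(100, 105, 15, n=100)
print(power_n25, power_n100)  # power rises with the sample size
```

With everything else held fixed, a larger sample size, a larger true effect or a higher significance level all raise the power.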

Question 4

Q

If correlation coefficient is close to -1

Answer

A

Strong negative linear relationship (the random variables tend to move in opposite directions away from their means).

Question 5

Q

If correlation coefficient is close to +1

Answer

A

Strong positive linear relationship (the random variables tend to move in the same direction away from their means).

Question 6

Q

If correlation coefficient is close to 0

Answer

A

No linear relationship (a non-linear relationship may still exist).

Question 7

Q

If random variables are independent X and Y the covariances and correlations…

Question 8

Q

Why correlation coefficient might be preferred over covariance

Answer

A

Covariance is dependent on the units of the variables.
The correlation coefficient is standardised and ranges between -1 and 1.
It provides unit-free measurement.

Correlation allows for a more meaningful interpretation and comparison of the strength and direction of the relationship.
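
The unit dependence can be checked numerically. This is a minimal sketch with made-up height and weight data; the helper functions use the sample (n - 1) formulas, which is an assumption about the convention in use.

```python
def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Correlation coefficient: covariance divided by both standard deviations."""
    return covariance(xs, ys) / (covariance(xs, xs) * covariance(ys, ys)) ** 0.5

# Hypothetical data: the same heights expressed in metres and in centimetres.
heights_m = [1.60, 1.70, 1.75, 1.80, 1.90]
weights_kg = [55.0, 65.0, 68.0, 80.0, 85.0]
heights_cm = [h * 100 for h in heights_m]

# Covariance is 100 times larger in centimetres; correlation is unchanged.
print(covariance(heights_m, weights_kg), covariance(heights_cm, weights_kg))
print(correlation(heights_m, weights_kg), correlation(heights_cm, weights_kg))
```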

Question 9

Q

Advantages and disadvantages of using the mean as a measure of central location

Answer

A

The mean is a useful measure of central tendency because it takes into account all observations in a dataset.
However, the mean is sensitive to extreme values or outliers, meaning a few very low or very high values (e.g. prices) can significantly affect it.

Question 10

Q

Advantages and disadvantages of using the median as a measure of central location

Answer

A

The median is the middle value in an ordered dataset.
The median has the advantage of being less affected by outliers and skewed data.
However, the median ignores most of the data points, as it only considers the middle value in the distribution.
It therefore disregards how the remaining data are distributed around that value.
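
The contrast between the mean and the median shows up in a quick check. The prices below are invented for illustration only.

```python
import statistics

prices = [200, 210, 220, 230, 240]       # hypothetical prices
prices_with_outlier = prices + [2000]    # same data plus one extreme value

# Without the outlier the two measures agree.
print(statistics.mean(prices), statistics.median(prices))  # 220 220

# One extreme value drags the mean up sharply; the median barely moves.
print(statistics.mean(prices_with_outlier))    # about 516.67
print(statistics.median(prices_with_outlier))  # 225.0
```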

Question 11

Q

IQR

Answer

A

The interquartile range (IQR) measures the spread of the “middle 50%” of the data. It is the difference between the third quartile and the first quartile (Q3 - Q1). It captures dispersion while mitigating the impact of outliers, thus providing a more robust measure of variability.
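
A short sketch of the calculation follows. Quartile conventions differ between textbooks and software, so the `method="inclusive"` choice here is an assumption, and the data and the helper name `iqr` are made up.

```python
import statistics

def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
data_with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]  # largest value replaced

# The range (max - min) jumps from 8 to 899, but the IQR is unchanged
# because it only depends on the middle half of the ordered data.
print(max(data) - min(data), iqr(data))
print(max(data_with_outlier) - min(data_with_outlier), iqr(data_with_outlier))
```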

Question 12

Q

Variance

Answer

A

Measures how far the data deviate from the mean.
Variance is the average of the squared differences between each observation and the mean, so it quantifies the overall spread of the data.
Because it uses squared differences, it is in squared units of the original data, making it less interpretable in terms of the original units.

Question 13

Q

Coefficient of Variation

Answer

A

The ratio of the standard deviation to the mean.
It standardises the measure of dispersion relative to the mean, allowing comparisons across different variables when the unit of measurement is different. CV is also sensitive to outliers.

Question 14

Q

Standard deviation

Answer

A

Square root of the variance.
Indicates how far the data typically deviate from the mean. It provides a measure of variability in the same units as the original data. The lower the standard deviation, the lower the spread of the data.
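
The variance, standard deviation and coefficient of variation can be computed together. This sketch uses the population formulas (dividing by n, matching "the average of the squared differences"); `statistics.variance` and `statistics.stdev` would divide by n - 1 instead. The data are invented.

```python
import statistics

observations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data

mean = statistics.fmean(observations)           # 5.0
variance = statistics.pvariance(observations)   # 4.0, in squared units
std_dev = statistics.pstdev(observations)       # 2.0, in the original units
cv = std_dev / mean                             # 0.4, unit-free

# Changing the unit of measurement (here multiplying by 100) scales the
# standard deviation by 100 and the variance by 100**2, but not the CV.
observations_scaled = [x * 100 for x in observations]
cv_scaled = statistics.pstdev(observations_scaled) / statistics.fmean(observations_scaled)

print(mean, variance, std_dev, cv, cv_scaled)
```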

Question 15

Q

Coefficient of determination

Answer

A

Measures the goodness of fit of a regression (denoted R^2).

R^2 is a proportion, i.e. 0 <= R^2 <= 1

R^2 = 0: the regression explains none of the variation in Yi
R^2 = 1: the regression explains all of the variation in Yi

Thus a higher R^2 indicates a better fit. When comparing factors, the one whose regression has the higher R^2 explains more of the variation in prices.
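
R^2 can be computed by hand for a simple regression. The sketch below assumes an ordinary least squares fit of Y on a single X with an intercept (the case in which 0 <= R^2 <= 1 holds); the function name `r_squared` and the data are hypothetical.

```python
def r_squared(xs, ys):
    """R^2 for a least squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Unexplained variation: squared residuals around the fitted line.
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation: squared deviations of y around its mean.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total

# Points lying exactly on a line: the regression explains all the variation.
print(r_squared([1, 2, 3, 4], [3, 5, 7, 9]))

# Noisier hypothetical data: only part of the variation is explained.
print(r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
```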

(15 cards)