Exam Flashcards

(21 cards)

1
Q

Explain the Central Limit Theorem, and why it is useful in statistical inference

A

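The Central Limit Theorem (CLT) states that, as the sample size increases, the sampling distribution of the sample mean approaches a normal distribution, regardless of the shape of the underlying population distribution (provided the observations are independent and the population has a finite variance). That is, if we repeatedly draw samples of size n from a population and compute the mean of each, those sample means will be approximately normally distributed for large n, even if the population itself is not normally distributed.

The CLT is useful in statistical inference because it lets us use the normal distribution to build confidence intervals and carry out hypothesis tests about a population mean from a sample mean, without knowing the shape of the population distribution.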
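
A small simulation can make the CLT concrete. The sketch below assumes an exponential population (a strongly skewed distribution chosen only for illustration) and uses Python's standard library. The function name `sample_means`, the seed and the sample sizes are illustrative choices, not part of the card.

```python
import random
import statistics

def sample_means(sample_size, num_samples, rng):
    """Draw num_samples samples from a skewed (exponential) population
    and return the mean of each sample."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(num_samples)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
means = sample_means(sample_size=50, num_samples=2000, rng=rng)

# The assumed population has mean 1 and standard deviation 1, so the CLT
# predicts sample means centred near 1 with spread close to 1 / sqrt(50).
print(round(statistics.fmean(means), 2))
print(round(statistics.stdev(means), 2))
```

Plotting a histogram of `means` should show a roughly bell-shaped curve even though the population itself is heavily skewed.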

2
Q

Explain what is meant by Type I error

A

A Type I error, also known as a false positive, is the incorrect rejection of a null hypothesis when it is actually true. In other words, it occurs when we conclude that there is a significant effect or difference when there is not.

The probability of making this error is denoted by the Greek letter alpha (α), the significance level of the test. It is commonly set at 0.05 or 0.01, which means that we are willing to accept a 5% or 1% chance of rejecting a null hypothesis that is actually true.
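
The link between α and the Type I error rate can be checked by simulation. The sketch below assumes a two-tailed z-test with a known standard deviation and draws every sample with the null hypothesis true, so each rejection is a false positive. The helper `rejects_null`, the seed and the sample size are hypothetical choices for illustration.

```python
import random
from statistics import NormalDist, fmean

def rejects_null(sample, mu0, sigma, alpha):
    """Two-tailed z-test with known sigma: True if H0 (mean == mu0) is rejected."""
    z = (fmean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha

rng = random.Random(1)  # fixed seed so the run is repeatable
alpha = 0.05
trials = 4000

# Every sample comes from a population where H0 is true (mean 0, sd 1),
# so each rejection counted here is a Type I error.
false_positives = sum(
    rejects_null([rng.gauss(0, 1) for _ in range(30)], mu0=0, sigma=1, alpha=alpha)
    for _ in range(trials)
)
type_one_rate = false_positives / trials
print(type_one_rate)  # should land close to alpha
```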

3
Q

Explain what is meant by the significance level

A

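The significance level, also known as the alpha level (α), is the probability of making a Type I error, that is, of rejecting the null hypothesis when it is true. It is chosen before the test is carried out, commonly as 0.05 or 0.01.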

4
Q

Define the term parameter

A

In statistics, a parameter is a numerical value that describes a characteristic of a population, such as the mean or standard deviation. Parameters are typically unknown and must be estimated from sample data. For example, the population mean (µ) of the heights of all students in a school is a parameter.

5
Q

Define the term statistic

A

A statistic is a numerical value that describes a characteristic of a sample, such as the sample mean or sample standard deviation. Statistics are calculated from sample data and are used to estimate population parameters. For example, the sample mean (x̄) of the heights of a random sample of 100 students from a school is a statistic.

6
Q

Explain the p-value in hypothesis testing.

A

A p-value (short for probability value) is the probability of observing sample data at least as extreme as the data actually observed, assuming that the null hypothesis is true.

If the p-value is smaller than the significance level, the null hypothesis is rejected. If it is larger, we fail to reject the null hypothesis; this does not prove that the null hypothesis is true.
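
As a worked illustration of the decision rule, the sketch below computes a two-tailed p-value for a z-test. It assumes a known population standard deviation, and the numbers (sample mean 52, hypothesised mean 50, σ = 10, n = 100) are made up for the example; `two_tailed_p_value` is a hypothetical helper name.

```python
from statistics import NormalDist

def two_tailed_p_value(sample_mean, mu0, sigma, n):
    """p-value of a two-tailed z-test of H0: mean == mu0, with sigma known."""
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Made-up data: the test statistic is z = (52 - 50) / (10 / 10) = 2.
p = two_tailed_p_value(sample_mean=52, mu0=50, sigma=10, n=100)
print(round(p, 4))

# Compare the same p-value against two common significance levels.
for alpha in (0.05, 0.01):
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(alpha, decision)
```

With these numbers the same data lead to rejection at the 5% level but not at the 1% level, which shows why the significance level should be fixed before the test is run.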

7
Q

Annotate “confidence level”

A

The confidence level is the proportion (usually stated as a percentage, such as 95%) of confidence intervals, constructed in the same way from repeated samples, that would contain the true population parameter. A confidence interval is a range of values, calculated from sample data, that is likely to contain the population parameter.
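
The repeated-sampling interpretation can be sketched with a simulation. The code below assumes a normal population with a known standard deviation, builds a 95% z-interval from each of many samples, and counts how often the interval contains the true mean. The helper `z_interval` and all numbers are illustrative assumptions.

```python
import random
from statistics import NormalDist, fmean

def z_interval(sample, sigma, confidence):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sigma / len(sample) ** 0.5
    centre = fmean(sample)
    return centre - margin, centre + margin

rng = random.Random(2)  # fixed seed so the run is repeatable
true_mean, sigma, confidence = 10.0, 2.0, 0.95
trials = 4000

hits = 0
for _ in range(trials):
    sample = [rng.gauss(true_mean, sigma) for _ in range(25)]
    low, high = z_interval(sample, sigma, confidence)
    hits += low <= true_mean <= high  # True counts as 1

coverage = hits / trials
print(coverage)  # should land close to the confidence level
```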

8
Q

Explain briefly the concept of index numbers and their purpose.

A

Index numbers are a way to measure changes in related things, like prices or production levels, over time and/or between groups. They make such comparisons easy by combining multiple variables into a single measure, expressed relative to a base value (usually 100). Index numbers show us the direction and amount of change in a variable or group of variables.

9
Q

Explain what is meant by Type II error

A

A Type II error, also known as a false negative, is the failure to reject a null hypothesis when it is actually false. In other words, we conclude that there is no significant effect or difference when there actually is one. The probability of a Type II error is denoted by the Greek letter beta (β).

10
Q

What is meant by the “goodness of fit” of a regression?

A

The term “goodness of fit” refers to how well a regression model fits the observed data. It measures how closely the values predicted by the regression model match the actual values of the dependent variable. A common summary is R², the proportion of the variation in the dependent variable that the model explains.
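
One widely used goodness-of-fit measure is the coefficient of determination, R². The sketch below computes it from actual and predicted values; the data are made up, and `r_squared` is a hypothetical helper rather than a library function.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

actual = [2.0, 4.0, 6.0, 8.0]
good_fit = [2.1, 3.9, 6.2, 7.8]  # predictions close to the data
poor_fit = [5.0, 5.0, 5.0, 5.0]  # predicts the mean every time

print(round(r_squared(actual, good_fit), 3))
print(round(r_squared(actual, poor_fit), 3))
```

Predictions that track the data closely give a value near 1, while a model that only ever predicts the mean explains none of the variation and gives 0.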

11
Q

Very briefly describe the key features of the Laspeyres price index and discuss its advantages and disadvantages

A

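The Laspeyres index weights prices by base-year quantities. Its advantage is that the quantity weights only need to be collected once, so the data are easy to obtain. Its disadvantage is that the basket can become out of date as consumption patterns change, so it tends to overstate price increases.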
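
The Laspeyres formula is the cost of the base-year basket at current prices divided by its cost at base-year prices, times 100. A minimal sketch with made-up prices and quantities for two goods (the function name `laspeyres_index` is illustrative):

```python
def laspeyres_index(base_prices, current_prices, base_quantities):
    """100 * sum(p_t * q_0) / sum(p_0 * q_0): base-year quantities as weights."""
    numerator = sum(p * q for p, q in zip(current_prices, base_quantities))
    denominator = sum(p * q for p, q in zip(base_prices, base_quantities))
    return 100 * numerator / denominator

# Made-up data for two goods.
base_prices = [2.0, 5.0]
current_prices = [3.0, 6.0]
base_quantities = [10, 4]

print(laspeyres_index(base_prices, current_prices, base_quantities))  # 135.0
```

Only the base-year quantities appear in the calculation, which is why no new quantity data are needed in later periods.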

12
Q

Very briefly describe the key features of the Paasche price index and discuss its advantages and disadvantages

A

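The Paasche index weights prices by current-year quantities, so the basket is always up to date. As a result, new quantity data must be collected every period, and because the current basket already reflects substitution towards relatively cheaper goods, the index tends to understate price increases.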
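
The Paasche formula is the cost of the current-year basket at current prices divided by its cost at base-year prices, times 100. A minimal sketch with made-up prices and quantities for two goods (the function name `paasche_index` is illustrative):

```python
def paasche_index(base_prices, current_prices, current_quantities):
    """100 * sum(p_t * q_t) / sum(p_0 * q_t): current-year quantities as weights."""
    numerator = sum(p * q for p, q in zip(current_prices, current_quantities))
    denominator = sum(p * q for p, q in zip(base_prices, current_quantities))
    return 100 * numerator / denominator

# Made-up data for two goods.
base_prices = [2.0, 5.0]
current_prices = [3.0, 6.0]
current_quantities = [8, 5]

print(round(paasche_index(base_prices, current_prices, current_quantities), 2))
```

The quantities must be re-collected for every period being compared, which is the extra data burden the card refers to.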

13
Q

Very briefly describe the key features of the Fisher price index and discuss its advantages and disadvantages

A

The Fisher price index, also known as the Fisher ideal price index, is the geometric mean of the Laspeyres and Paasche indices. It therefore uses both base-year and current-year quantities as weights.

The Fisher index has the advantage of lying between the Laspeyres index, which tends to overstate price changes, and the Paasche index, which tends to understate them. It is therefore less biased by changes in demand and substitution between goods.

However, the Fisher index is more difficult and expensive to calculate, as it requires price and quantity data for both periods. It is also more difficult to interpret, because it does not correspond to the cost of any single fixed basket.
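
The Fisher ideal index is conventionally defined as the geometric mean of the Laspeyres and Paasche indices. A minimal sketch under that definition, using made-up prices and quantities for two goods (all function names are illustrative):

```python
def weighted_price_index(base_prices, current_prices, quantities):
    """100 * sum(p_t * q) / sum(p_0 * q) for a given set of quantity weights."""
    numerator = sum(p * q for p, q in zip(current_prices, quantities))
    denominator = sum(p * q for p, q in zip(base_prices, quantities))
    return 100 * numerator / denominator

def fisher_index(base_prices, current_prices, base_quantities, current_quantities):
    """Geometric mean of the Laspeyres and Paasche indices."""
    laspeyres = weighted_price_index(base_prices, current_prices, base_quantities)
    paasche = weighted_price_index(base_prices, current_prices, current_quantities)
    return (laspeyres * paasche) ** 0.5

# Made-up data for two goods.
base_prices, current_prices = [2.0, 5.0], [3.0, 6.0]
base_quantities, current_quantities = [10, 4], [8, 5]

laspeyres = weighted_price_index(base_prices, current_prices, base_quantities)
paasche = weighted_price_index(base_prices, current_prices, current_quantities)
fisher = fisher_index(base_prices, current_prices, base_quantities, current_quantities)
print(round(paasche, 2), round(fisher, 2), round(laspeyres, 2))
```

Because it is a geometric mean, the Fisher value always falls between the other two, and it needs the quantity data for both periods.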

14
Q

Briefly explain how an interval estimate differs from a point estimate.

A

A point estimate is a single value that is in some sense the best estimate of the parameter of interest.
An interval estimate gives a range of values within which the parameter is likely to lie, with a stated level of confidence, and so gives an idea of the likely accuracy of the estimate.

15
Q

What factors determine the width of a confidence interval?

A

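The width of a confidence interval is determined by the sample size (the larger the sample, the narrower the interval), the confidence level (the higher the confidence level, the wider the interval) and the variability of the data (the larger the standard deviation, the wider the interval).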
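
These effects can be seen directly in the width formula for a z-interval, 2 · z · σ / √n. The sketch below assumes a known standard deviation σ; the helper `interval_width` and the numbers are illustrative.

```python
from statistics import NormalDist

def interval_width(sigma, n, confidence):
    """Width of a z-interval for the mean: 2 * z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * sigma / n ** 0.5

# Larger sample, same confidence level: narrower interval.
print(round(interval_width(sigma=10, n=25, confidence=0.95), 2))
print(round(interval_width(sigma=10, n=100, confidence=0.95), 2))

# Higher confidence level, same sample size: wider interval.
print(round(interval_width(sigma=10, n=25, confidence=0.99), 2))
```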

16
Q

What impact, if any, would an increase in the sample size have on the interval estimate and why?

A

If the sample size n increases, the interval will be narrower, for two reasons. First, the standard error s/√n decreases. Second, the degrees of freedom increase, so the t-value decreases. As the sample size increases, the t-distribution becomes closer to the standard Normal distribution, as indicated in the last row of the t-distribution table (where the degrees of freedom are ∞).
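
The effect of the sample size on a t-interval can be sketched numerically. The code below hard-codes a few two-sided 95% critical values from a standard t-table and assumes, for illustration only, that the sample standard deviation stays at 10 as n grows.

```python
# Two-sided 95% critical values copied from a standard t-table,
# keyed by degrees of freedom (n - 1).
T_CRITICAL_95 = {4: 2.776, 9: 2.262, 29: 2.045, 99: 1.984}
Z_CRITICAL_95 = 1.960  # last row of the table (df = infinity)

def t_interval_width(sample_sd, n):
    """Width of a 95% t-interval for the mean: 2 * t * s / sqrt(n)."""
    t = T_CRITICAL_95[n - 1]
    return 2 * t * sample_sd / n ** 0.5

sample_sd = 10.0  # assumed to stay the same as n grows
widths = {n: t_interval_width(sample_sd, n) for n in (5, 10, 30, 100)}
for n, width in widths.items():
    print(n, T_CRITICAL_95[n - 1], round(width, 2))
```

Both the t-value and the 1/√n factor shrink as n grows, and the t-values approach the normal value 1.960 from above.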

17
Q

Explain what is meant by “one-tailed test” and “two-tailed test” in hypothesis testing and explain when to use each test

A

In hypothesis testing, a one-tailed test is a statistical test in which the null hypothesis is rejected only if the test statistic falls in the rejection region in one specified tail of the probability distribution. The alternative hypothesis is directional. One-tailed tests are used when there is a specific directional prediction, such as whether a certain treatment will increase (or decrease) the value of a variable. For example, a one-tailed test could be used to determine if a new fertilizer increases crop yield: the null hypothesis is that the fertilizer does not increase yield, and the alternative hypothesis is that it does.

On the other hand, a two-tailed test is a statistical test in which the null hypothesis is rejected if the test statistic falls in the rejection region in either tail of the probability distribution. The alternative hypothesis is non-directional. Two-tailed tests are used when there is no specific directional prediction and we want to determine whether there is a significant difference, in either direction, between two groups or variables. For example, a two-tailed test could be used to determine if there is a significant difference in test scores between two schools.
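
The practical difference shows up in the p-value. The sketch below assumes a z-test statistic of 1.8 (a made-up value) and compares the upper-tailed and two-tailed p-values at the 5% level; `tail_p_values` is a hypothetical helper.

```python
from statistics import NormalDist

def tail_p_values(z):
    """p-values for the same z statistic under different alternative hypotheses."""
    upper = 1 - NormalDist().cdf(z)  # H1: parameter is greater than the null value
    lower = NormalDist().cdf(z)      # H1: parameter is less than the null value
    two_tailed = 2 * min(upper, lower)  # H1: parameter differs in either direction
    return {"upper": upper, "lower": lower, "two_tailed": two_tailed}

p = tail_p_values(1.8)  # made-up test statistic
alpha = 0.05
for name in ("upper", "two_tailed"):
    decision = "reject H0" if p[name] < alpha else "fail to reject H0"
    print(name, round(p[name], 4), decision)
```

Here the one-tailed test rejects the null hypothesis while the two-tailed test does not, so the choice of test should be made from the research question before looking at the data.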

18
Q

Define covariance

A

Covariance is a statistical measure of the degree to which two random variables vary together. It indicates the direction of the linear relationship between them: a positive covariance indicates that the two variables tend to increase or decrease together, while a negative covariance indicates that one variable tends to increase as the other decreases. Covariance is calculated as the average of the products of the deviations of each variable from its mean (dividing by n − 1 rather than n for a sample). Its magnitude depends on the units of measurement, so it does not by itself show the strength of the relationship; the correlation coefficient, which standardises the covariance, is used for that. A covariance close to zero indicates little or no linear relationship.
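
A short calculation shows why the size of a covariance is hard to interpret on its own. The sketch below uses made-up height and weight data and computes the sample covariance (dividing by n − 1) with heights in metres and then in centimetres; the helper names are illustrative.

```python
def covariance(xs, ys):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance standardised by the two standard deviations."""
    return covariance(xs, ys) / (covariance(xs, xs) * covariance(ys, ys)) ** 0.5

# Made-up data.
heights_m = [1.5, 1.6, 1.7, 1.8]
weights_kg = [50, 58, 66, 80]
heights_cm = [h * 100 for h in heights_m]

cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)
print(round(cov_m, 3), round(cov_cm, 1))

# Correlation is the same whichever unit is used for height.
print(round(correlation(heights_m, weights_kg), 3))
print(round(correlation(heights_cm, weights_kg), 3))
```

Changing the unit multiplies the covariance by 100 even though the relationship is unchanged, while the correlation stays the same.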

19
Q

How do you calculate a simple index?

A

I_t = (x_t / x_0) × 100, where x_t is the value in period t and x_0 is the value in the base period.
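
A minimal sketch of the formula applied to a made-up price series, taking the first period as the base (the function name `simple_index` is illustrative):

```python
def simple_index(values, base_period=0):
    """Express each value as a percentage of the base-period value."""
    base = values[base_period]
    return [100 * value / base for value in values]

prices = [40.0, 44.0, 50.0]  # made-up prices for three periods
print(simple_index(prices))  # [100.0, 110.0, 125.0]
```

The base period always has an index of 100, so 125 means the price is 25% higher than in the base period.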

20
Q

How do you calculate the coefficient of variation?

A

CV = standard deviation / mean (often multiplied by 100 and expressed as a percentage)
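
A minimal sketch using Python's standard library, with made-up heights recorded in two different units (the function name `coefficient_of_variation` is illustrative, and the sample standard deviation is assumed):

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(data) / statistics.fmean(data)

heights_cm = [150, 160, 170, 180]  # made-up data
heights_m = [h / 100 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
print(round(cv_cm, 4), round(cv_m, 4))
```

The two results agree because the unit cancels in the ratio, which is why the coefficient of variation can compare the spread of data sets measured in different units.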

21
Q

Define the coefficient of variation

A

A measure of relative dispersion: the standard deviation expressed relative to the mean, which makes it independent of the units of measurement.