Topic 13 Statistics Flashcards

Revision (45 cards)

1
Q

What is the definition of statistics?

A

Statistics is the branch of mathematics dealing with data collection, analysis, interpretation, presentation, and organization.

2
Q

True or False: A population includes all members of a specified group.

A

True

3
Q

What is a sample in statistics?

A

A sample is a subset of a population used to represent the whole population.

4
Q

What is the difference between descriptive and inferential statistics?

A

Descriptive statistics summarize and describe the characteristics of a dataset, while inferential statistics use a sample to make predictions or inferences about a population.

5
Q

Fill in the blank: The _____ is the average of a set of numbers.

A

mean

6
Q

What is the median?

A

The median is the middle value in a list of numbers ordered from least to greatest. When the list has an even number of values, the median is the average of the two middle values.

7
Q

What is the mode?

A

The mode is the value that appears most frequently in a data set.

8
Q

True or False: The range is the difference between the highest and lowest values in a data set.

A

True
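
The four summary measures on the preceding cards (mean, median, mode, range) can be checked on a small example. This is a minimal sketch using Python's standard `statistics` module; the data set is made up for illustration.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)        # sum of values divided by how many there are
median = statistics.median(data)    # even count: average of the two middle values
mode = statistics.mode(data)        # most frequently occurring value
data_range = max(data) - min(data)  # highest value minus lowest value

print(mean, median, mode, data_range)
```

Because the list has six values, the median falls between the third and fourth ordered values (3 and 5).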

9
Q

What is a frequency distribution?

A

A frequency distribution is a summary of how often each different value occurs in a dataset.
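
A frequency distribution can be tallied with `collections.Counter` from the Python standard library. The sketch below assumes a short list of made-up scores.

```python
from collections import Counter

# Hypothetical scores, for illustration only.
scores = [3, 1, 2, 3, 3, 2, 5]

# Counter maps each distinct value to the number of times it occurs.
freq = Counter(scores)

# Print the distribution in ascending order of value.
for value in sorted(freq):
    print(value, freq[value])
```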

10
Q

Define ‘outlier’ in statistics.

A

An outlier is a data point that differs significantly from other observations in a dataset.
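
The card does not say how far from the rest a point must be to count as an outlier. One common convention (an assumption here, not part of the card) flags values more than 1.5 times the interquartile range beyond the quartiles. A minimal sketch, with a hypothetical helper name `iqr_outliers` and made-up data:

```python
import statistics

def iqr_outliers(data, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3.

    The 1.5 * IQR rule is one common convention, assumed for this sketch.
    """
    q1, _, q3 = statistics.quantiles(data, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]

# Hypothetical data with one value far from the rest.
print(iqr_outliers([10, 12, 12, 13, 14, 15, 16, 40]))
```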

11
Q

What is a histogram?

A

A histogram is a graphical representation of the distribution of numerical data, using bars to show the frequency of data within certain intervals.

12
Q

What does a box plot display?

A

A box plot displays the median, quartiles, and potential outliers of a dataset.

13
Q

What is variance?

A

Variance is a measure of how much values in a dataset differ from the mean, calculated as the average of the squared differences between each value and the mean.

14
Q

What is standard deviation?

A

Standard deviation is the square root of the variance and measures the dispersion of a dataset.

15
Q

True or False: A lower standard deviation indicates that data points tend to be close to the mean.

A

True
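
Variance and standard deviation can be worked through by hand in a few lines. The sketch below uses made-up data and shows both the population form (divide by n) and the sample form (divide by n - 1); which one applies depends on the syllabus and the context, so treat the choice as an assumption.

```python
import math

# Hypothetical data set, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n

# Squared difference between each value and the mean.
squared_devs = [(x - mean) ** 2 for x in data]

population_variance = sum(squared_devs) / n    # average squared deviation
sample_variance = sum(squared_devs) / (n - 1)  # divides by n - 1 instead
population_sd = math.sqrt(population_variance) # square root of the variance

print(population_variance, sample_variance, population_sd)
```

Here the squared deviations sum to 32, so the population variance is 32 / 8 and the standard deviation is its square root.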

16
Q

What is a probability?

A

Probability is a measure of the likelihood that an event will occur, expressed as a number between 0 and 1.

17
Q

Fill in the blank: The _____ of an event is the number of favorable outcomes divided by the total number of possible outcomes.

A

probability
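
The "favorable over total" rule assumes all outcomes are equally likely. A minimal sketch for a hypothetical fair six-sided die, using exact fractions:

```python
from fractions import Fraction

# All equally likely outcomes of one roll of a fair six-sided die (assumed).
outcomes = [1, 2, 3, 4, 5, 6]

# Event of interest: rolling an even number.
favorable = [o for o in outcomes if o % 2 == 0]

# Probability = number of favorable outcomes / total number of outcomes.
p_even = Fraction(len(favorable), len(outcomes))

print(p_even)
```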

18
Q

What is a random variable?

A

A random variable is a variable whose possible values are numerical outcomes of a random phenomenon.

19
Q

Define ‘normal distribution’.

A

Normal distribution is a probability distribution that is symmetric about the mean, showing that data near the mean are more frequent in occurrence.

20
Q

What is the Central Limit Theorem?

A

The Central Limit Theorem states that the distribution of sample means approaches a normal distribution as the sample size increases, regardless of the shape of the population’s distribution, provided the population has a finite variance.
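
The theorem can be illustrated with a small simulation. The sketch below assumes a strongly skewed population (exponential with mean 1 and standard deviation 1) and a hypothetical helper `sample_means`; the sample size, repetition count, and seed are arbitrary choices.

```python
import math
import random

def sample_means(sample_size, repetitions, seed=0):
    """Means of repeated samples drawn from an exponential population."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = []
    for _ in range(repetitions):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return means

means = sample_means(sample_size=50, repetitions=2000)

centre = sum(means) / len(means)
spread = math.sqrt(sum((m - centre) ** 2 for m in means) / (len(means) - 1))

print(centre)  # expected to be near the population mean, 1.0
print(spread)  # expected to be near 1 / sqrt(50)
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the population itself is skewed.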

21
Q

What is a null hypothesis?

A

A null hypothesis is a statement that there is no effect or no difference. It is the default assumption that a hypothesis test seeks evidence against.

22
Q

What is a p-value?

A

A p-value is the probability of obtaining results at least as extreme as the observed results, assuming that the null hypothesis is true.
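
As a worked example of this definition, consider a hypothetical experiment: a coin is flipped 10 times and lands heads 9 times, and the null hypothesis is that the coin is fair. The one-sided p-value is the probability of 9 or more heads under that assumption. The function name below is illustrative.

```python
from math import comb

def binomial_upper_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): results at least as extreme as k."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical observation: 9 heads in 10 flips, null hypothesis p = 0.5.
p_value = binomial_upper_tail(n=10, k=9)

print(p_value)  # (10 + 1) / 1024
```

Only two outcomes are at least as extreme as the one observed (9 heads or 10 heads), which is why the p-value is small.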

23
Q

True or False: A smaller p-value indicates stronger evidence against the null hypothesis.

A

True

24
Q

What is a confidence interval?

A

A confidence interval is a range of values, calculated from sample data, that is likely to contain the value of an unknown population parameter at a stated confidence level.

25
What does '95% confidence interval' mean?
A 95% confidence interval means that if we were to take many samples and build a confidence interval from each sample, approximately 95% of those intervals would contain the true population parameter.
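The repeated-sampling interpretation can be checked by simulation. This sketch assumes a normal population with known standard deviation and uses the z-interval (mean ± 1.96·σ/√n); the function name, sample size, and seed are illustrative choices.

```python
import math
import random

def coverage(sample_size=30, repetitions=2000, mu=0.0, sigma=1.0, seed=1):
    """Fraction of simulated 95% intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = 1.96  # approximate 97.5th percentile of the standard normal
    half_width = z * sigma / math.sqrt(sample_size)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        mean = sum(sample) / sample_size
        # Does this sample's interval capture the true parameter?
        if mean - half_width <= mu <= mean + half_width:
            hits += 1
    return hits / repetitions

print(coverage())  # expected to be close to 0.95
```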
26
Fill in the blank: The _____ is the value that divides a dataset into two equal halves.
median
27
What is skewness?
Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable.
28
What does a positive skew indicate?
A positive skew indicates that the tail on the right side of the distribution is longer or fatter than the left side.
29
What does a negative skew indicate?
A negative skew indicates that the tail on the left side of the distribution is longer or fatter than the right side.
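The sign of the skew can be computed directly. The sketch below uses the moment-based coefficient (third central moment divided by the 1.5 power of the second), which is one of several definitions in use; the data sets are made up.

```python
def skewness(data):
    """Moment-based skewness: m3 / m2**1.5 (one common definition)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5

right_tailed = [1, 2, 2, 3, 10]               # long tail on the right
left_tailed = [-x for x in right_tailed]      # mirror image: long tail on the left

print(skewness(right_tailed))  # positive
print(skewness(left_tailed))   # negative
print(skewness([1, 2, 3]))     # symmetric data gives zero
```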
30
Define 'bimodal distribution'.
A bimodal distribution is a distribution with two different modes, or peaks.
31
What is a scatter plot?
A scatter plot is a graph that shows the relationship between two quantitative variables by displaying points for each observation.
32
What is correlation?
Correlation is a statistical measure that describes the strength and direction of a relationship between two variables.
33
True or False: A correlation coefficient of 1 indicates a perfect positive correlation.
True
34
What is the range of the correlation coefficient?
The range of the correlation coefficient is from -1 to 1.
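The cards do not name a specific coefficient; the sketch below assumes the Pearson correlation coefficient and computes it from its definition on made-up data. The helper name `pearson_r` is illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y scaled into [-1, 1]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

xs = [1, 2, 3, 4, 5]
print(pearson_r(xs, [2, 4, 6, 8, 10]))  # points on a rising line: perfect positive
print(pearson_r(xs, [10, 8, 6, 4, 2]))  # points on a falling line: perfect negative
```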
35
What is a regression line?
A regression line is a line that best fits the data points in a scatter plot, used to predict the value of a dependent variable based on the value of an independent variable.
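"Best fits" is usually taken to mean least squares, which is the assumption in this sketch: the line y = a + b·x that minimises the sum of squared vertical distances to the points. The function name and data are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return intercept, slope

# Hypothetical points lying exactly on y = 1 + 2x.
intercept, slope = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])

# Predict the dependent variable for a new value of the independent variable.
prediction = intercept + slope * 5
print(intercept, slope, prediction)
```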
36
What is the purpose of a chi-squared test?
The chi-squared test is used to determine whether there is a significant association between two categorical variables.
37
What does it mean if the chi-squared test results in a significant p-value?
It means that there is enough evidence to reject the null hypothesis of no association between the variables.
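The test statistic behind these two cards can be computed by hand for a small contingency table. The sketch below uses a made-up 2×2 table and compares the statistic with 3.841, the standard critical value for 1 degree of freedom at the 5% level; the function name is illustrative, and no continuity correction is applied.

```python
def chi_squared_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows are groups, columns are response categories.
observed = [[20, 30], [30, 20]]
chi2 = chi_squared_statistic(observed)

critical_value = 3.841  # df = (2 - 1) * (2 - 1) = 1, significance level 0.05
print(chi2, chi2 > critical_value)
```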
38
Fill in the blank: The _____ is the average of the squared differences between each data point and the mean.
variance
39
What is a simulation in statistics?
A simulation is a method for modeling the behavior of a system or process using random sampling to generate data.
40
What is the purpose of a hypothesis test?
The purpose of a hypothesis test is to determine whether there is enough statistical evidence to support a specific hypothesis about a population parameter.
41
What is the difference between Type I and Type II errors?
A Type I error occurs when a true null hypothesis is rejected (a false positive), while a Type II error occurs when a false null hypothesis is not rejected (a false negative).
42
What is the significance level in hypothesis testing?
The significance level is the probability of making a Type I error that the researcher is willing to accept. It is chosen before the test is carried out and is commonly denoted by alpha (α).
43
True or False: In a two-tailed test, the critical region is split between both tails of the distribution.
True
44
What is a non-parametric test?
A non-parametric test is a statistical test that does not assume a specific distribution for the data.
45
What is the purpose of ANOVA?
ANOVA (Analysis of Variance) is used to compare means among three or more groups to determine if at least one group mean is significantly different from the others.
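One-way ANOVA compares the variation between group means with the variation within groups. The sketch below computes the F statistic by hand for three made-up groups; the function name is illustrative, and turning F into a p-value would additionally need the F distribution, which is not shown.

```python
def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical measurements for three groups.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat)
```

A large F suggests that at least one group mean differs from the others by more than the within-group scatter would explain.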