Statistics Flashcards

1
Q

What is a population

A

The whole set of items that are of interest

2
Q

What is a census

A

Measures every member of a population

3
Q

What is a sample

A

A selection of observations taken from a subset of the population to find out information about the population as a whole

4
Q

What are the advantages of a census

A

Represents total population

Provides all relevant data

5
Q

What are the advantages of sampling

A

Quicker
Easier
Cheaper

6
Q

What are the disadvantages of a census

A

Time consuming
Difficult
Expensive
May be impossible to get everyone

7
Q

What are the disadvantages of sampling

A

May be incomplete or may not be representative

8
Q

How does convenience/opportunity sampling work?

A

Taking a sample of people who are available at the time and fit the criteria

9
Q

What is a random sample without replacement called

A

Simple random sampling

10
Q

What is a random sample with replacement called?

A

Unrestricted random sampling
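
The difference between sampling without and with replacement can be sketched with Python's standard `random` module. The population of ten numbered units and the fixed seed are assumptions made purely for illustration.

```python
import random

# Hypothetical population: ten numbered sampling units.
population = list(range(1, 11))
rng = random.Random(0)  # fixed seed so the run is repeatable

# Without replacement (simple random sampling): no unit can appear twice.
without_replacement = rng.sample(population, k=5)

# With replacement (unrestricted random sampling): a unit may be picked again.
with_replacement = rng.choices(population, k=5)

print(without_replacement, with_replacement)
```

`random.sample` never repeats a unit, whereas `random.choices` may return the same unit more than once.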

11
Q

How does stratified random sampling work (basically)?

A

The population is divided into strata, and a random sample is taken from each stratum in proportion to that stratum's size
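
As a rough sketch of proportional allocation, the number taken from each stratum is (stratum size ÷ population size) × sample size. The function name `stratified_sample`, the year-group strata and the seed below are hypothetical, chosen only to illustrate the idea.

```python
import random

def stratified_sample(strata, sample_size, seed=0):
    """Draw a proportional random sample from each stratum.

    strata: dict mapping a stratum name to the list of its members.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Number from this stratum = stratum size / population size * sample size.
        # Rounding means the pieces may not always add up to sample_size exactly.
        k = round(len(members) * sample_size / total)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical school: 60 pupils in Year 12 and 40 in Year 13.
strata = {
    "Year 12": [f"Y12-{i}" for i in range(60)],
    "Year 13": [f"Y13-{i}" for i in range(40)],
}
sample = stratified_sample(strata, sample_size=10)
print({name: len(chosen) for name, chosen in sample.items()})
```

With these numbers, a sample of 10 takes 6 pupils from Year 12 and 4 from Year 13.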

12
Q

What is quota sampling?

A

Similar to stratified but sample is not random

Population is divided into groups with a given characteristic and the size of the groups determines the proportion of the sample that should have that characteristic. The most convenient people with that characteristic are chosen until the quota is filled

13
Q

What must a sampling method be for it to be random?

A

Each unit must have an equal chance of being chosen

14
Q

Is systematic sampling random

Why

A

No

It is impossible for consecutive names in the sampling frame to both be in the same sample

15
Q

How do you take a systematic sample

A

Work out the ‘skip size’ by dividing the total population by the desired sample size, rounding to the nearest integer
Use an RNG to select a starting point, which will be the first sampling unit
Add the ‘skip size’ to this number repeatedly, taking the members of the population who correspond to the numbers generated
Continue until the required sample size has been obtained
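
The steps above can be sketched in Python. This is only an illustration: the function name `systematic_sample`, the frame of 100 numbered members and the choice to pick the random start from the first ‘skip size’ units are all assumptions, not part of the card.

```python
import random

def systematic_sample(frame, sample_size, seed=0):
    """Take every k-th unit from a sampling frame after a random start."""
    rng = random.Random(seed)
    skip = round(len(frame) / sample_size)  # the 'skip size'
    start = rng.randrange(skip)             # assumed: start within the first `skip` units
    # Step through the frame in jumps of `skip`, stopping at the sample size.
    return frame[start::skip][:sample_size]

# Hypothetical sampling frame of 100 numbered members.
frame = list(range(1, 101))
sample = systematic_sample(frame, sample_size=10)
print(sample)
```

Every pair of neighbouring values in the result differs by the skip size (10 here), which is why two consecutive members of the frame can never both be chosen.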

16
Q

Give the strengths and weaknesses of random sampling

A

Strengths:
Free of bias
Cheap/easy for small samples
Each sampling unit has equal chance of being chosen

Weaknesses:
Not suitable for larger populations
Sampling frame needed

17
Q

Strengths and weaknesses of stratified sampling

A

Strengths:
Accurately reflects structure of population
Guarantees proportional representation of groups within a population
Weaknesses:
Population must be clearly classified into distinct strata
Random selection within strata suffers same disadvantages as random sampling

18
Q

Strengths and weaknesses of quota sampling

A

Advantages:
Allows small sample to be representative
No sampling frame needed
Quick/easy/cheap
Allows comparison between different groups

Disadvantages:
Can be biased
Division of population can be costly & inaccurate
Increasing scope of study increases number of groups

19
Q

Disadvantages and advantages of systematic sampling

A

Advantages:
Simple/ quick
Suitable for large populations

20
Q

Advantages and disadvantages of opportunity sampling

A

Advantages:
Inexpensive
Easy
Quick

Disadvantages:
Unlikely to be representative
Highly dependent on individual researcher

21
Q

What is qualitative data

A

Non-numerical, e.g. colour

22
Q

What are the different kinds of quantitative data

A

Discrete: only takes specific values, e.g. shoe size, number of people (NB there can still be infinitely many possible values)

Continuous: can take any value within a range

23
Q

3 measures of centre

A

Mean, median, mode

24
Q

What is the mean

A

The sum of the data divided by the number of values

25
Define median
The middle value when data is ordered from smallest to largest. If there is an even number of values, the median is halfway between the two central values.
26
Define mode
Most common value. There can be one mode, two modes (bimodal) or no mode.
27
Advantages and disadvantages of mean
Advantages:
Includes all data
Disadvantages:
Susceptible to outliers
When data is grouped, it is only an estimate of the mean
28
Advantages and disadvantages of median
Advantages:
Less sensitive to outliers
Disadvantages:
Positional only
Grouped data requires interpolation
29
Strengths and weaknesses of mode
Strengths:
Can be used for qualitative data
Weaknesses:
Only relevant if there are high frequencies
Can be misleading
Doesn’t consider the numerical value of the data
30
Name the types of measures of spread
Standard deviation
Interquartile range
Range
31
Formulas for standard deviation
$\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$ or $\sigma = \sqrt{\dfrac{\sum x^2}{n} - \bar{x}^2}$
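
A quick check that the two forms of the standard deviation formula agree, using a small made-up data set (the values are an assumption for illustration) and Python's standard library:

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set
n = len(data)
mean = sum(data) / n

# Formula 1: root of the mean squared deviation from the mean.
sd_1 = math.sqrt(sum((x - mean) ** 2 for x in data) / n)

# Formula 2: root of (mean of the squares minus the square of the mean).
sd_2 = math.sqrt(sum(x ** 2 for x in data) / n - mean ** 2)

# statistics.pstdev divides by n, matching the formulas on this card.
print(sd_1, sd_2, statistics.pstdev(data))
```

All three values agree (2 for this data set). Note that `statistics.stdev` divides by n − 1 instead, so it would give a different answer.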
32
IQR method
Upper quartile - lower quartile
33
Positives and negatives of standard deviation
Advantages:
Includes all data
Disadvantages:
Susceptible to outliers
34
Advantages and disadvantages of IQR
Advantages:
Less sensitive to outliers
Disadvantages:
Positional only and 50% is arbitrary
Grouped data requires interpolation
35
Disadvantages of range
Highly susceptible to outliers
36
What is variance
Standard deviation squared
37
How does adding/ subtracting affect the mean
Increases/ decreases by that amount
38
How does multiplying/dividing affect mean
Multiplied/ divided by that factor
39
How does adding/subtracting affect the standard deviation
No effect
40
How does multiplying/dividing affect the standard deviation
Multiplied/ divided by that factor
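
The four coding rules in the cards above can be checked numerically. The data set, the shift of 10 and the factor of 3 are arbitrary choices for illustration.

```python
import statistics

data = [3, 5, 8, 12]              # hypothetical data set
shifted = [x + 10 for x in data]  # add 10 to every value
scaled = [x * 3 for x in data]    # multiply every value by 3

# Adding shifts the mean by the same amount but leaves the spread unchanged.
print(statistics.mean(data), statistics.pstdev(data))
print(statistics.mean(shifted), statistics.pstdev(shifted))

# Multiplying scales both the mean and the standard deviation by the factor.
print(statistics.mean(scaled), statistics.pstdev(scaled))
```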
41
What are the first 13 square numbers
1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169
42
What is r for PMCC
Sample PMCC
43
What is rho
Population PMCC
44
When comparing median/ means what should you reference
Compare size with reference to actual values and context
Larger value suggests larger sample
% difference should be calculated if >2 marks
45
If mean/ median and IQR/ standard deviation are close what does this suggest
Samples from the same population
46
Define population
All the data of a given group
47
Define Sample
A selection of some parts of the population
48
If a sampling method is random what does this mean
Each member has an equal and fair chance of being selected
49
What does independence mean
One outcome is unaffected by another outcome
50
What is discrete uniform distribution
A random variable with an equal chance for each outcome: $P(X = x) = k$, where $k = \dfrac{1}{\text{number of possible outcomes}}$
51
What are the requirements for a binomial distribution
Fixed number of trials
Two possible outcomes per trial
Constant probability
Independence
52
What is the area under a normal distribution curve
1 or 100%
53
3 standard deviations stat
99.7% lies within 3 sd of the mean
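The 99.7% figure can be reproduced with `statistics.NormalDist` from Python's standard library. The sketch uses the standard normal distribution, which is enough because the proportion within 3 standard deviations is the same for every normal distribution.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area under the curve between 3 sd below and 3 sd above the mean.
within_3_sd = z.cdf(3) - z.cdf(-3)
print(round(within_3_sd, 4))  # 0.9973
```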
54
What does it mean if a sample is truncated? What can we do with this?
Zero lies less than 2 standard deviations below the mean. Reject as not normally distributed.
55
What must you remember for normal approximations
p is close to 0.5
n is large
CONTINUITY CORRECTIONS
56
Why do we do continuity corrections
To change discrete (binomial) data into continuous (for normal)
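A minimal sketch of a normal approximation with a continuity correction, compared against the exact binomial probability. The values n = 100 and p = 0.5 and the probability P(X ≤ 45) are assumptions chosen for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5  # hypothetical: n is large and p is close to 0.5

# Exact binomial probability P(X <= 45).
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(46))

# Normal approximation Y ~ N(np, np(1 - p)) with a continuity correction:
# P(X <= 45) is approximated by P(Y < 45.5).
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p))).cdf(45.5)

print(round(exact, 4), round(approx, 4))
```

The two printed values should be close. Using 45 instead of 45.5 in the normal calculation would give a noticeably worse approximation.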
57
Define H0
Population parameter you are comparing sample to
58
H1
The claim of how the sample might differ from population parameter
59
Population parameter
The value that defines a distribution (for binomial it is p; for normal it is μ and the variance σ²)
60
Define critical value
The first value in the critical region for which sample results would have a chance below significance level of occurring
61
Define critical region
The range of values for which H0 is rejected
62
Define p value
The probability, assuming H0 is true, of getting a result at least as extreme as the one from your sample
63
Define significance level
The probability threshold for the test: a result whose probability under H0 is below the significance level is considered unlikely enough to reject H0
64
Define test statistic
The value you get from your sample to compare with the critical value
65
When does the critical region start exactly for binomial and normal
Binomial: the critical value is the first value within the critical region
Normal: the critical region always represents the exact significance level
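For a binomial test the critical region is built from whole numbers, so its probability is generally below the significance level rather than equal to it. The sketch below finds the critical value for a hypothetical one-tailed test (H0: p = 0.5, H1: p < 0.5, n = 20, 5% level); the helper name `binomial_cdf` and all the numbers are assumptions for illustration.

```python
from math import comb

def binomial_cdf(c, n, p):
    """P(X <= c) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

# Hypothetical one-tailed test: H0: p = 0.5, H1: p < 0.5, n = 20, 5% level.
n, p0, alpha = 20, 0.5, 0.05

# Lower-tail critical value: the largest c with P(X <= c) below the significance level.
critical_value = max(c for c in range(n + 1) if binomial_cdf(c, n, p0) < alpha)

# The actual significance level is the probability of landing in the critical region.
actual_level = binomial_cdf(critical_value, n, p0)
print(critical_value, round(actual_level, 4))  # 5 0.0207
```

Here the critical region is X ≤ 5, with an actual significance level of about 2.07%, because including X = 6 would push the probability above 5%.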
66
When is PMCC not a good estimator
Outside original sample
PMCC is weak
Used to make a prediction about a different population
67
P(A’)
Probability of A not occurring (complement)