Year 1 Stats Flashcards

(54 cards)

1
Q

Discrete vs continuous data

A

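Discrete data is counted and can take only certain values, e.g. shoe size (modelled by, for example, the binomial distribution)

Continuous data is measured and can take any value in a range, e.g. shoe length (modelled by, for example, the normal distribution; displayed on histograms)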

2
Q

Target population

A

All the members of the population that would ideally take part in your study

3
Q

Sample

A

A subset of a target population

4
Q

Sampling frame

A

A list or database of the target population

5
Q

Census

A

Measures or observes every member of the population

6
Q

Advantages and Disadvantages of census

A
Advantage:
  1. Completely accurate - collects data from everyone
Disadvantages:
  1. Expensive/time consuming
  2. Cannot be used when testing destroys the item
  3. Hard to process large quantities of data
7
Q

Steps to simple random

A
  1. Have a sampling frame and assign a unique number to every member of it
  2. Use random number generator to pick members
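
A minimal sketch of these two steps, assuming a made-up sampling frame of 50 numbered members and Python's standard `random` module standing in for the random number generator:

```python
import random

# Hypothetical sampling frame: members numbered 1 to 50
sampling_frame = list(range(1, 51))

def simple_random_sample(frame, n, seed=None):
    """Pick n distinct members so that every possible sample of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

sample = simple_random_sample(sampling_frame, 5, seed=1)
print(sample)
```

The `seed` argument is only there to make the illustration repeatable; a real study would not fix it.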
8
Q

Advantages of using a sample (2)

A
  1. Less time consuming/cheaper than census

2. Less data to process

9
Q

Disadvantages of using a sample (2)

A
  1. May be less accurate than a census
  2. May not give any information about small subgroups of the population

10
Q

Advantages of simple random sampling (2)

A
  1. Minimises bias
  2. Tends to be representative of the whole population

11
Q

Disadvantages of simple random sampling (2)

A
  1. Needs a sampling frame
  2. Time consuming/expensive for large populations

12
Q

What is simple random sampling

A

When every possible sample has the same probability of being selected.

13
Q

What is stratified sampling

A

When the population is divided into mutually exclusive strata and a simple random sample is taken from each stratum, with each stratum's sample size proportional to its size in the population

14
Q

Advantages of systematic (2)

A
  1. Quick and easy to use

2. Ensures that the population is sampled evenly

15
Q

Disadvantages of systematic (2)

A

Need sampling frame

There may be missing values in the population

16
Q

What is systematic sampling

A

When you choose a starting point at random, then systematically select members at regular intervals (every kth member) from an ordered list
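
A minimal sketch of systematic sampling, assuming a made-up ordered list of 100 members and an interval k found by dividing the population size by the sample size:

```python
import random

def systematic_sample(frame, n, seed=None):
    """Take every k-th member of an ordered frame, starting from a random point."""
    k = len(frame) // n           # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)      # random start among the first k members
    return frame[start::k][:n]

frame = list(range(1, 101))       # hypothetical ordered list of 100 members
sample = systematic_sample(frame, 10, seed=0)
print(sample)
```

Whatever the random start, consecutive chosen members are always k = 10 places apart.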

17
Q

Advantages of stratified (2)

A
  1. Minimises selection bias by making sure no strata are over/under represented
  2. Frequencies for each group in the sample proportional to each group in the population
18
Q

Disadvantages of stratified (2)

A

Need sampling frame

Strata must be clearly defined

19
Q

What is quota sampling

A

When the population is split into groups (strata), then an interviewer chooses members from each group until a set quota is filled. It is non-random and can be biased

20
Q

Advantages of quota (2)

A
  1. Don’t need sampling frame

2. Frequencies for each group in the sample can be proportional to each group in the population

21
Q

What is opportunity sampling

A

Taking a sample from the members of the population who are available at the time the study is carried out. It is non-random and likely to be biased

22
Q

Advantage of opportunity sampling

A

Easy to select sample

23
Q

Formula for stratified

A

Number sampled from a stratum = (number in stratum / number in whole population) * overall sample size
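
A worked illustration of proportional allocation, where each stratum contributes its share of the population multiplied by the overall sample size. The school and its numbers are made up for the example:

```python
def stratum_sample_size(stratum_size, population_size, sample_size):
    """Number to sample from one stratum, in proportion to its share of the population."""
    return stratum_size * sample_size / population_size

# Hypothetical school: 120 pupils in Year 12, 80 in Year 13, overall sample of 25
strata = {"Year 12": 120, "Year 13": 80}
population = sum(strata.values())
allocation = {name: round(stratum_sample_size(size, population, 25))
              for name, size in strata.items()}
print(allocation)  # {'Year 12': 15, 'Year 13': 10}
```

When the result is not a whole number it has to be rounded, so the rounded stratum sizes may need a small adjustment to add up to the overall sample size.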

24
Q

Measuring outliers

A

Lower boundary: LQ - 1.5(IQR)

Upper boundary: UQ + 1.5(IQR)

Any value outside these boundaries is an outlier
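
A short sketch of the check, using a made-up data set whose quartiles are LQ = 20 and UQ = 30:

```python
def outlier_boundaries(lq, uq):
    """Return the (lower, upper) boundaries; values outside them count as outliers."""
    iqr = uq - lq
    return lq - 1.5 * iqr, uq + 1.5 * iqr

# Hypothetical data with LQ = 20 and UQ = 30
data = [2, 19, 21, 24, 27, 29, 31, 50]
lower, upper = outlier_boundaries(20, 30)
outliers = [x for x in data if x < lower or x > upper]
print(lower, upper, outliers)  # 5.0 45.0 [2, 50]
```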

25
In a box plot diagram what does it mean if group A median is larger than group B median
On average group A gets higher results
26
In a box plot diagram what does it mean if group A IQR is larger than group B IQR
Group A is less consistent in the results as data more spread out
27
Frequency density
Frequency density = frequency / class width. On a histogram, area is proportional to frequency: frequency = k * area (F = kA)
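
A minimal sketch of the calculation for a made-up grouped table with unequal class widths:

```python
def frequency_density(frequency, lower, upper):
    """Frequency density = frequency / class width."""
    return frequency / (upper - lower)

# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 5), (10, 15, 10), (15, 30, 6)]
densities = [frequency_density(f, lo, hi) for lo, hi, f in classes]
print(densities)  # [0.5, 2.0, 0.4]
```

Multiplying each density by its class width gives back the frequency, which is the k = 1 case of F = kA.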
28
Independent vs dependent
The independent variable does not rely on the other variable, whilst the dependent variable does. The independent variable goes on the x axis
29
What an upwards very straight line says about correlation
It’s a strong positive correlation. When one variable increases so does the other.
30
What is correlation
Describes a linear relationship between two variables
31
What is bivariate data
Data which has pairs of values for two variables
32
PMCC
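The product moment correlation coefficient, r, measures the strength and direction of the linear correlation between two variables. It takes values from -1 to 1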
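
The card does not give a formula. As a sketch, the standard definition r = Sxy / sqrt(Sxx * Syy) can be computed as below; the two small data sets are made up:

```python
from math import sqrt

def pmcc(xs, ys):
    """Product moment correlation coefficient: r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r_up = pmcc([1, 2, 3, 4], [2, 4, 6, 8])     # points lie exactly on a rising line
r_down = pmcc([1, 2, 3, 4], [8, 6, 4, 2])   # points lie exactly on a falling line
print(r_up, r_down)
```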
33
What does ‘b’ tell you in the formula y = a + bx
The change in y for each unit change in x
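
To see b as a gradient, here is a sketch that fits a least-squares line to made-up points. It assumes the standard results b = Sxy / Sxx and a = mean of y - b * mean of x, which are not stated on the card:

```python
def regression_line(xs, ys):
    """Least-squares line y = a + bx, with b = Sxy / Sxx and a = mean(y) - b * mean(x)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical points that lie exactly on y = 1 + 2x
a, b = regression_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # 1.0 2.0
```

Here b = 2 means y goes up by 2 for every increase of 1 in x.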
34
Why extrapolation unreliable
It estimates values outside the range of the data, where the observed relationship may no longer hold
35
mean calculation
Sum x / n, or Sum fx / Sum f for a frequency table
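
A quick sketch of the frequency-table version, using a made-up table:

```python
def mean_from_frequencies(values, frequencies):
    """Mean = sum of (f * x) divided by the total frequency."""
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    return total_fx / sum(frequencies)

# Hypothetical table: the value 1 occurs twice, 2 occurs five times, 3 occurs three times
mean = mean_from_frequencies([1, 2, 3], [2, 5, 3])
print(mean)  # 2.1
```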
36
How to find the median point and quartiles from 8 values of discrete data
(8 + 1) / 2 = 4.5, so the median is halfway between the 4th and 5th values. To find the lower quartile, find the median of the lowest half (4 values) of the data; for the upper quartile, use the highest half
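
The same method as a short sketch, applied to eight made-up values:

```python
def median(sorted_values):
    """Middle value, or halfway between the two middle values for an even count."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

data = sorted([3, 5, 7, 8, 10, 12, 15, 20])   # hypothetical data, 8 values
q2 = median(data)        # halfway between the 4th and 5th values
q1 = median(data[:4])    # median of the lowest half
q3 = median(data[4:])    # median of the highest half
print(q1, q2, q3)  # 6.0 9.0 13.5
```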
37
If grouped continuous data, how would you find mean
Find the midpoint of each class, enter the midpoints and frequencies into the calculator, then select 1-Var
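
What the calculator is doing can be sketched as follows, with a made-up grouped table:

```python
def grouped_mean(classes):
    """Estimate the mean using each class midpoint as the value for that class."""
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / total_f

# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]
estimate = grouped_mean(classes)
print(estimate)  # 16.0
```

The result is only an estimate, because the midpoints stand in for the unknown individual values.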
38
If grouped continuous data, how would you find median
Divide the total frequency by 2 to find the position of the median, then use linear interpolation within the class that contains it
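
A sketch of the interpolation step, assuming the values are spread evenly within each class. The table is made up:

```python
def grouped_median(classes):
    """Estimate the median by linear interpolation in the class containing position n/2."""
    n = sum(f for _, _, f in classes)
    target = n / 2
    cumulative = 0
    for lower, upper, f in classes:
        if cumulative + f >= target:
            # move (target - cumulative) / f of the way through this class
            return lower + (target - cumulative) * (upper - lower) / f
        cumulative += f

# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 4), (10, 20, 8), (20, 30, 8)]
estimate = grouped_median(classes)
print(estimate)  # 17.5
```

Here n/2 = 10, and the 10th value is 6 of the way into the 8 values of the 10 to 20 class, giving 10 + (6/8) * 10 = 17.5.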
39
Rule of thumb for choosing which set to use for linear interpolation
Always go set up unless right on the boundary, then use set down
40
What is standard deviation
A measure of how spread out the data is about the mean
41
What does it mean if group A's standard deviation is higher than group B's
Group A's data points are on average further from the mean, so group A is more spread out and less consistent
42
Meaning of Sxx and Sx
Sxx is the sum of the squared deviations from the mean: Sxx = Sum of (x - x(bar))^2. Sx is the standard deviation of the sample: Sx = Square root of (Sxx / (n - 1))
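
Both quantities computed directly from a made-up data set, as a minimal sketch:

```python
from math import sqrt

def s_xx(xs):
    """Sxx: sum of squared deviations from the mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)

def sample_sd(xs):
    """Sx: sample standard deviation, dividing Sxx by n - 1."""
    return sqrt(s_xx(xs) / (len(xs) - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data with mean 5
print(s_xx(data))  # 32.0
print(sample_sd(data))
```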
43
Standard deviation from summary statistics
Square root of (Sum of x^2 / n - x(bar)^2)
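
A sketch of the summary-statistics formula, using made-up totals n = 8, Sum x = 40 and Sum x^2 = 232:

```python
from math import sqrt

def sd_from_summary(n, sum_x, sum_x2):
    """Standard deviation = sqrt(sum of x^2 / n - mean^2)."""
    mean = sum_x / n
    return sqrt(sum_x2 / n - mean ** 2)

# Hypothetical summary statistics
sd = sd_from_summary(8, 40, 232)
print(sd)  # 2.0
```

Here the mean is 5, so the calculation is sqrt(29 - 25) = 2.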
44
Boundaries for outliers using standard deviation
Any value below X(bar) - 2sd or above X(bar) + 2sd is an outlier
45
What to do if you have a constant k in a discrete random variable distribution P(X=x) = 3k(4-x)(x^2+1). x = 0,1,2
Substitute x = 0, 1, 2 into the function. The resulting probabilities must add to 1. Solve that equation for k
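
The card's own example worked through as a short sketch, using exact fractions so that k comes out as a fraction rather than a decimal:

```python
from fractions import Fraction

def multiple_of_k(x):
    """The part of P(X = x) that multiplies k: 3(4 - x)(x^2 + 1)."""
    return 3 * (4 - x) * (x ** 2 + 1)

# Substituting x = 0, 1, 2 gives 12k, 18k and 30k
multiples = {x: multiple_of_k(x) for x in (0, 1, 2)}
k = Fraction(1, sum(multiples.values()))   # 60k = 1
probabilities = {x: m * k for x, m in multiples.items()}
print(k)  # 1/60
```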
46
What does a uniform distribution mean
All outcomes have the same probability
47
What is a probability distribution
Describes the probability of any outcome in a sample space
48
Random variable
A variable whose value is determined by a random experiment
49
How to do binomial distribution on calculator for multiple values
Select Bpd (binomial probability distribution) and enter the Numtrial and probability values. Then select List 1 rather than Variable to get the probability for each value in the list
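
Away from the calculator, the same list of probabilities can be sketched in a few lines. The distribution B(5, 0.5) is a made-up example, and `binomial_pd` is a name chosen for illustration:

```python
from math import comb

def binomial_pd(n, p, x):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Hypothetical example: X ~ B(5, 0.5), probabilities for x = 0 to 5
probabilities = [binomial_pd(5, 0.5, x) for x in range(6)]
print(probabilities)  # [0.03125, 0.15625, 0.3125, 0.3125, 0.15625, 0.03125]
```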
50
What is hypothesis testing
Building evidence for a case against the null hypothesis
51
What does reducing the significance level on a hypothesis test mean
More evidence is needed to reject the null hypothesis, because the critical region is smaller
52
What is the significance level
The probability of incorrectly rejecting the null hypothesis
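
A sketch of how the significance level decides the outcome of a one-tailed binomial test. The test itself (H0: p = 0.5, H1: p > 0.5, 9 successes in 10 trials) is made up for illustration:

```python
from math import comb

def upper_tail(n, p, observed):
    """P(X >= observed) for X ~ B(n, p)."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x)
               for x in range(observed, n + 1))

# Hypothetical test: H0: p = 0.5, H1: p > 0.5, with 9 successes in 10 trials
p_value = upper_tail(10, 0.5, 9)   # 11/1024, about 0.0107
print(p_value < 0.05)   # True: reject H0 at the 5% level
print(p_value < 0.01)   # False: not enough evidence at the 1% level
```

The same result is significant at 5% but not at 1%, which shows that a lower significance level demands stronger evidence.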
53
What does it mean if PMCC gets closer to 1 or -1
It’s getting closer to perfect positive correlation (1) or perfect negative correlation (-1)
54
The conditions under which it is appropriate to assume a random variable has a binomial distribution
There are n independent trials