Stats Key Info Flashcards

1
Q

Population

A

Entire set of items (sampling units) in the group being studied.

2
Q

Census

A

Measuring every member of a population

3
Q

Evaluation - Census

A

+ Accurate
- Expensive
- Some testing destroys the item

4
Q

Sampling frame

A

List of sampling units

(It is not always possible to create one, which is a disadvantage of any technique that requires it.)

5
Q

Simple Random Sampling

A

Every sampling unit has an equal chance of being selected. Done using a random number generator or lottery sampling, alongside a sampling frame.

Type of RANDOM Sampling
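
A minimal Python sketch of the random number generator approach, assuming the sampling frame is simply a list of numbered units. The function name `simple_random_sample` and the 50-unit frame are made up for illustration.

```python
import random

def simple_random_sample(frame, sample_size, seed=None):
    """Pick sample_size distinct units, each equally likely to be chosen."""
    rng = random.Random(seed)            # seed is only here to make runs repeatable
    return rng.sample(frame, sample_size)

# Hypothetical sampling frame: 50 numbered units.
frame = list(range(1, 51))
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```

`random.sample` selects without replacement, so no unit can appear twice.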

6
Q

Evaluation - Simple Random Sampling

A

+ Bias-free
- Sampling frame required

7
Q

Systematic Sampling

A

Take every k-th unit, where k = population size / sample size. Pick a random number between 1 and k as the starting point.

Type of RANDOM Sampling
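
The procedure can be sketched in Python as follows. This is an illustration only: the name `systematic_sample` and the 100-unit frame are assumptions, and k is rounded down when the population does not divide exactly by the sample size.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Take every k-th unit from the frame, starting at a random point."""
    rng = random.Random(seed)             # seed only makes the example repeatable
    k = len(frame) // sample_size         # interval k = population / sample (rounded down)
    start = rng.randint(1, k)             # random start between 1 and k inclusive
    # Convert the 1-based start to a 0-based index, then step through in jumps of k.
    return frame[start - 1::k][:sample_size]

# Hypothetical sampling frame: 100 numbered units, sample of 10, so k = 10.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=0)
print(sample)
```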

8
Q

Evaluation - Systematic Sampling

A

+ Quick to use
- Sampling frame required

9
Q

Stratified Sampling

A

Sample is proportionally representative of the strata (groups) of the population.
Formula: number sampled from a stratum = (sample size / population size) × stratum size (for each stratum)
(use either simple random or systematic sampling to fill each stratum)

Type of RANDOM Sampling
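
A short Python sketch of the allocation formula. The stratum names and sizes are invented for illustration. In practice the results usually need rounding to whole numbers that still add up to the sample size.

```python
def stratified_allocation(strata_sizes, sample_size):
    """Number to sample from each stratum: (sample / population) x stratum size."""
    population = sum(strata_sizes.values())
    return {
        name: sample_size * size / population
        for name, size in strata_sizes.items()
    }

# Hypothetical population of 200 split into two strata; sample of 50.
strata = {"Year 12": 120, "Year 13": 80}
allocation = stratified_allocation(strata, 50)
print(allocation)   # {'Year 12': 30.0, 'Year 13': 20.0}
```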

10
Q

Evaluation - Stratified Sampling

A

+ Reflects Population
- Need clearly classified strata (groups) for population

11
Q

Opportunity Sampling

A

Sample based on who/what is available at the time.

Type of NON-RANDOM Sampling

12
Q

Evaluation - Opportunity Sampling

A

+ Easy, cheap
- Unlikely to be representative

13
Q

Quota Sampling

A

Similar to stratified sampling, but strata are filled by the researcher using opportunity sampling, thus are not necessarily representative of the population.

Type of NON-RANDOM Sampling

14
Q

Evaluation - Quota Sampling

A

+ No sampling frame needed
- Not random, potential bias

15
Q

Data Types

A

Qualitative: Non-numerical
Quantitative: Numerical

Types of Quantitative:
Discrete: Can only take certain values (often integers) => e.g. shoe size
Continuous: Can take any value in a range, usually grouped into classes => e.g. foot length

16
Q

Median (Location)

A

LQ: n/4 th term
Median: n/2 th term
UQ: 3n/4 th term

xth percentile = (x/100)n th term
Decile = 10% chunk, so decile number = percentile / 10 (e.g. 3rd decile = 30th percentile)

17
Q

Mean (Location)

A

x̄ = ∑fx / ∑f (or ∑x / n)

18
Q

Variance (Spread) σ^2

A

σ^2 = (∑fx^2 / ∑f) - x̄^2
MSMSM
Mean of the Squares Minus Square of the Mean
(Also = Sxx / n)
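
As a worked check of "mean of the squares minus square of the mean", here is a small Python sketch for a frequency table. The values and frequencies are made up for illustration.

```python
def mean_and_variance(values, freqs):
    """Mean and variance from a frequency table, using MSMSM."""
    n = sum(freqs)                                              # sum of f
    mean = sum(f * x for x, f in zip(values, freqs)) / n        # sum of fx / sum of f
    mean_of_squares = sum(f * x * x for x, f in zip(values, freqs)) / n
    variance = mean_of_squares - mean ** 2                      # MSMSM
    return mean, variance

# Hypothetical table: x = 1, 2, 3, 4 with frequencies 2, 3, 3, 2.
mean, variance = mean_and_variance([1, 2, 3, 4], [2, 3, 3, 2])
print(mean, variance)   # mean 2.5, variance approximately 1.05
```

By hand: ∑f = 10, ∑fx = 25, ∑fx^2 = 73, so x̄ = 2.5 and σ^2 = 7.3 - 6.25 = 1.05.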

19
Q

Coding

A

If y = ax + b…
then mean of y
= a(mean of x) + b

AND
σ of y = |a| × (σ of x)
(adding b shifts every value equally, so it does not change the spread)
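
A quick numerical check of the coding rules in Python, using the standard library `statistics` module. The data and the choice a = -2, b = 3 are invented; a negative a is used to show that the standard deviation scales by the size of a and stays positive.

```python
import statistics

def code_data(xs, a, b):
    """Apply the coding y = ax + b to every data value."""
    return [a * x + b for x in xs]

xs = [2, 4, 6, 8]          # hypothetical data
a, b = -2, 3               # hypothetical coding
ys = code_data(xs, a, b)

mean_x, sd_x = statistics.fmean(xs), statistics.pstdev(xs)
mean_y, sd_y = statistics.fmean(ys), statistics.pstdev(ys)

print(mean_y, a * mean_x + b)    # both -7.0
print(sd_y, abs(a) * sd_x)       # equal up to floating-point rounding
```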

20
Q

Linear Interpolation (Location)

A

Assumes the data values are evenly spread throughout each class, then uses proportion to find how far through the class the required value lies.

Remember to add the result to the lower class boundary.

e.g.
Class boundaries:    12.5 |---- Q1 ----| 15.5
Cumulative freq.:       5 |---- 10 ----| 13

Q1 = 12.5 + ((10 - 5) / (13 - 5)) × 3 = 14.375
(3 = class width = 15.5 - 12.5)
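
The same calculation as a small Python function, reusing the numbers from the example above. The function and parameter names are illustrative only.

```python
def interpolate(lower_bound, class_width, cf_before, cf_end, position):
    """Estimate the value at a given cumulative-frequency position within a class.

    cf_before: cumulative frequency at the lower class boundary
    cf_end:    cumulative frequency at the upper class boundary
    """
    fraction_through = (position - cf_before) / (cf_end - cf_before)
    return lower_bound + fraction_through * class_width

estimate = interpolate(lower_bound=12.5, class_width=3,
                       cf_before=5, cf_end=13, position=10)
print(estimate)   # 14.375
```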

21
Q

Sampling Units

A

Individuals of a population

22
Q

Finding Quartiles (Location)

A

Work out n/4, n/2 or 3n/4.
If decimal: ALWAYS round UP to the next term.
If whole number: take the midpoint of that term and the next one.
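
A Python sketch of this rule for a sorted list of discrete data. The helper name `term_at` and the ten data values are made up for illustration.

```python
import math

def term_at(sorted_data, position):
    """Value at a 1-based position such as n/4, n/2 or 3n/4."""
    if position != int(position):
        # Decimal position: round up and take that term.
        return sorted_data[math.ceil(position) - 1]
    # Whole-number position: midpoint of that term and the next one.
    i = int(position)
    return (sorted_data[i - 1] + sorted_data[i]) / 2

data = sorted([3, 5, 7, 8, 9, 11, 13, 15, 18, 20])   # hypothetical data, n = 10
n = len(data)
q1 = term_at(data, n / 4)        # 2.5 -> 3rd term = 7
q2 = term_at(data, n / 2)        # 5   -> midpoint of 5th and 6th terms = 10.0
q3 = term_at(data, 3 * n / 4)    # 7.5 -> 8th term = 15
print(q1, q2, q3)
```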

23
Q

Outlier Boundaries (Representation)

A

Outlier if below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR)
(USUALLY)
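
A short Python sketch of the usual 1.5 × IQR rule. The data set and quartile values are invented for illustration, and the multiplier is left as a parameter in case a question specifies a different rule.

```python
def outlier_fences(q1, q3, multiplier=1.5):
    """Lower and upper outlier boundaries from the quartiles."""
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr

def find_outliers(data, q1, q3):
    """Values lying outside the boundaries."""
    low, high = outlier_fences(q1, q3)
    return [x for x in data if x < low or x > high]

data = [3, 5, 7, 8, 9, 11, 13, 15, 18, 40]    # hypothetical data with Q1 = 7, Q3 = 15
fences = outlier_fences(7, 15)                # (-5.0, 27.0)
outliers = find_outliers(data, q1=7, q3=15)   # [40]
print(fences, outliers)
```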

24
Q

Interquartile Range (Spread)

A

Q3 - Q1
+ Ignores extremes

25
Interpercentile Range (Spread)
e.g. 10th to 90th interpercentile range = P90 - P10
26
Histograms
- Continuous data
- No gaps between bars
27
Frequency Density (Histograms)
Frequency / Class Width
28
Area (Histograms)
Frequency x k
29
Comparing Diagrams
3 features. Compare:
1) 1 measure of location
2) 1 measure of spread
Use:
3) Context from the question
30
PMCC (Correlation)
Measures the strength and direction (+/-) of linear correlation: -1 ≤ r ≤ 1
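
The card does not give a formula for r. A minimal Python sketch, assuming the standard definition r = Sxy / √(Sxx × Syy), with made-up data sets chosen to hit the extreme values:

```python
import math

def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

r_pos = pmcc([1, 2, 3, 4], [3, 5, 7, 9])   # perfect positive correlation, r = 1
r_neg = pmcc([1, 2, 3, 4], [9, 7, 5, 3])   # perfect negative correlation, r = -1
r_mid = pmcc([1, 2, 3, 4], [2, 1, 4, 3])   # weaker positive correlation, r = 0.6
print(r_pos, r_neg, r_mid)
```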
31
Regression Line
i.e. line of best fit, e.g. y = a + bx
a = value of y when x = 0 (intercept)
b = how much y changes for each unit increase in x (gradient)
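
The card does not say how a and b are found. A minimal Python sketch, assuming the usual least-squares formulas b = Sxy / Sxx and a = ȳ - b x̄, with invented data that lie exactly on y = 1 + 2x:

```python
def regression_line(xs, ys):
    """Least-squares line y = a + bx; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx               # gradient: change in y per unit increase in x
    a = mean_y - b * mean_x     # intercept: value of y when x = 0
    return a, b

a, b = regression_line([1, 2, 3, 4], [3, 5, 7, 9])
prediction = a + b * 2.5        # estimate inside the data range (interpolation)
print(a, b, prediction)         # 1.0 2.0 6.0
```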
32
Interpolation (Regression)
Estimating inside the data range (usually using regression line) + Usually more reliable
33
Extrapolation (Regression)
Estimating outside the data range (usually using regression line) - Usually less reliable
34
Exponential/ Non-Linear Models
y = ab^x  =>  ln y = ln a + x ln b
y = ax^n  =>  ln y = ln a + n ln x
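
Once the logged data give a straight line, its intercept and gradient convert back to the model constants. A small Python sketch with illustrative function names, using a = 3, b = 2 as a made-up example:

```python
import math

def exponential_from_line(intercept, gradient):
    """For ln y = intercept + gradient * x, recover (a, b) in y = a * b**x."""
    return math.exp(intercept), math.exp(gradient)

def power_from_line(intercept, gradient):
    """For ln y = intercept + gradient * ln x, recover (a, n) in y = a * x**n."""
    return math.exp(intercept), gradient

# Hypothetical line with intercept ln 3 and gradient ln 2, i.e. y = 3 * 2**x.
a, b = exponential_from_line(math.log(3), math.log(2))
print(a, b)   # approximately 3 and 2

# Check the linear form at x = 4: ln y should equal ln a + x ln b.
lhs = math.log(3 * 2 ** 4)
rhs = math.log(3) + 4 * math.log(2)
```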
35
Normal Distribution
X~N(μ, σ^2)
μ = population mean, σ = standard deviation
When approximating a binomial X~B(n, p): μ = np and σ^2 = np(1-p), so σ = √(np(1-p))
[For reference: BD: X~B(n, p)]
36
Binomial Distribution
X~B(n, p)
Rules:
- Fixed number of trials, n
- Trials are independent
- Fixed probability of success, p
- 2 possible outcomes (usually success/failure)
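
The card lists the conditions but not the probability formula. A minimal Python sketch, assuming the standard result P(X = r) = nCr × p^r × (1-p)^(n-r), with 4 tosses of a fair coin as a made-up example:

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

# Hypothetical example: exactly 2 heads in 4 fair coin tosses.
p_two_heads = binomial_pmf(2, 4, 0.5)                     # 6 x 0.25 x 0.25 = 0.375
total = sum(binomial_pmf(r, 4, 0.5) for r in range(5))    # all outcomes sum to 1
print(p_two_heads, total)
```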
37
Standard Normal Distribution
Z ~ N(0, 1^2)
Z = (X - μ) / σ
(Z = number of SDs above the mean)
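
A short Python check that standardising does not change the probability, using `statistics.NormalDist` from the standard library. The values μ = 100, σ = 15 and x = 130 are made up for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations x lies above the mean."""
    return (x - mu) / sigma

mu, sigma = 100, 15                          # hypothetical distribution X ~ N(100, 15^2)
z = z_score(130, mu, sigma)                  # (130 - 100) / 15 = 2.0

p_direct = NormalDist(mu, sigma).cdf(130)    # P(X < 130) using X directly
p_via_z = NormalDist().cdf(z)                # P(Z < 2) using the standard normal
print(z, p_direct, p_via_z)                  # the two probabilities agree
```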
38
Z Distribution
Within 1 SD of the mean (-1 < Z < 1): approx. 68%
Within 2 SDs of the mean (-2 < Z < 2): approx. 95%
Within 3 SDs of the mean (-3 < Z < 3): approx. 99.7%
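
These percentages can be checked with a few lines of Python using `statistics.NormalDist`; the helper name `within` is illustrative. The exact figure for 2 SDs is nearer 95.4%, which is why the values on the card are approximate.

```python
from statistics import NormalDist

Z = NormalDist()    # standard normal: mean 0, standard deviation 1

def within(k):
    """P(-k < Z < k): proportion of data within k SDs of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

coverage = {k: round(within(k), 3) for k in (1, 2, 3)}
print(coverage)     # {1: 0.683, 2: 0.954, 3: 0.997}
```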
39