Stratified sampling Flashcards

(26 cards)

1
Q

What is stratified random sampling?

A

a) a form of random sampling in which the population is split into distinct, non-overlapping sub-groups called strata
b) units within each stratum are relatively homogeneous

2
Q

How do you perform stratified sampling?

A

a) identify potentially relevant strata
b) take independent random samples from each stratum
c) combine the results to estimate the population total (T), proportion (p), or mean (X)
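
The three steps above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `stratified_estimates`, the dictionary layout, and the fixed seed are all assumptions made for the example.

```python
import random

def stratified_estimates(strata, sizes, seed=0):
    """Estimate the population mean and total from a stratified sample.

    strata: dict mapping stratum name -> list of all population values (hypothetical layout)
    sizes:  dict mapping stratum name -> sample size n_i for that stratum
    """
    rng = random.Random(seed)
    N = sum(len(values) for values in strata.values())  # population size
    mean_st = 0.0
    for name, values in strata.items():
        # Step b: independent simple random sample within each stratum
        sample = rng.sample(values, sizes[name])
        weight = len(values) / N  # stratum weight W_i = N_i / N
        # Step c: combine the stratum sample means as a weighted average
        mean_st += weight * sum(sample) / len(sample)
    return mean_st, N * mean_st  # (estimated mean, estimated total)

# Usage with two made-up strata
strata = {"a": [10] * 60, "b": [20] * 40}
mean_st, total_st = stratified_estimates(strata, {"a": 6, "b": 4})
```

With constant values inside each stratum, as in this toy usage, the estimate does not depend on which units are drawn.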

3
Q

What are the advantages of stratification?

A

a) potential to reduce variance of estimates
b) more representative as each stratum is represented
c) costs may be lower
d) estimates for separate strata can be compared

4
Q

How do you calculate 𝑠𝑖² (the sample variance of the 𝑖-th stratum)?

A
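
The answer to this card did not survive on the page. The usual textbook definition of the within-stratum sample variance is given here as a likely candidate, assuming x_{ij} denotes the j-th sampled value in stratum i and \bar{x}_i the sample mean of that stratum (notation not taken from the card):

```latex
s_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} \left( x_{ij} - \bar{x}_i \right)^2,
\qquad
\bar{x}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} x_{ij}
```
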
5
Q

What do the stratum weights 𝑊𝑖 sum to in stratified sampling?

A

1

6
Q

Why do we use a weighted average in stratified sampling estimates?

A

a) sample means from larger strata carry more weight (importance) in the overall estimate, and those from smaller strata carry less
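
As a worked illustration of the weighting, the standard stratified estimator of the mean is shown below, followed by a made-up two-stratum example. The numbers are assumptions chosen only to show how far the weighted and unweighted averages can differ.

```latex
\bar{x}_{ST} = \sum_{i=1}^{k} W_i \, \bar{x}_i,
\qquad W_i = \frac{N_i}{N},
\qquad \sum_{i=1}^{k} W_i = 1

% Hypothetical example: N_1 = 900, N_2 = 100, \bar{x}_1 = 10, \bar{x}_2 = 50
\bar{x}_{ST} = 0.9 \times 10 + 0.1 \times 50 = 14
\qquad \text{(the unweighted average would be } 30\text{)}
```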

7
Q

How do you calculate the variance of the total in stratified sampling?

A
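
The answer to this card did not survive on the page. One standard textbook form of the estimated variance of the stratified total, with the finite population correction, is sketched below. It assumes the total is estimated as N times the stratified mean, which the card itself does not confirm.

```latex
\widehat{\mathrm{Var}}\!\left(\hat{T}_{ST}\right)
= N^2 \sum_{i=1}^{k} W_i^2 \left( 1 - \frac{n_i}{N_i} \right) \frac{s_i^2}{n_i}
= \sum_{i=1}^{k} N_i^2 \left( 1 - \frac{n_i}{N_i} \right) \frac{s_i^2}{n_i}
```
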
8
Q

What do we assume when estimating in stratified sampling?

A

that the estimators are normally distributed

9
Q

How would you calculate confidence intervals with stratified sampling?

A

a) use z- or t-statistics to estimate margins of error/confidence intervals for a mean or total.
b) if the data are non-normal, a z-statistic can only be used if each 𝑛𝑖 > 30.
c) the t-statistic has (𝑛−𝑘) degrees of freedom, i.e. (overall sample size − number of strata).
d) use a z-statistic to estimate the margin of error/confidence interval for a proportion.
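
A minimal Python sketch of a z-based confidence interval for the stratified mean follows. It assumes the usual variance estimate with a finite population correction and uses only the z-statistic; for small samples a t-quantile with 𝑛−𝑘 degrees of freedom would replace `z`. The function name and argument layout are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def stratified_mean_ci(N_i, n_i, xbar_i, s2_i, level=0.95):
    """z-based confidence interval for the stratified mean.

    N_i: stratum sizes, n_i: stratum sample sizes,
    xbar_i: stratum sample means, s2_i: stratum sample variances.
    """
    N = sum(N_i)
    # Weighted average of stratum means, with W_i = N_i / N
    mean_st = sum(Nh / N * xb for Nh, xb in zip(N_i, xbar_i))
    # Estimated variance of the stratified mean (with finite population correction)
    var_st = sum((Nh / N) ** 2 * (1 - nh / Nh) * s2 / nh
                 for Nh, nh, s2 in zip(N_i, n_i, s2_i))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    margin = z * sqrt(var_st)
    return mean_st - margin, mean_st + margin

# Usage with made-up summary statistics for two strata
low, high = stratified_mean_ci([600, 400], [60, 40], [10, 20], [4, 9])
```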

10
Q

What is sample allocation in stratified sampling?

A

splitting the overall sample size between the different strata to decide the sample size 𝑛𝑖 for each stratum.

11
Q

What are the methods of sample allocation?

A

a) proportional
b) Neyman
c) optimal

12
Q

What is proportional allocation?

A

a) simplest method
b) involves sampling each stratum in proportion to its size or weight respective to the population
c) sample more from larger strata, less from smaller
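
Proportional allocation can be sketched in a few lines of Python. The function name is hypothetical, and rounding to the nearest whole number is one reasonable choice; the rounded sizes may not always add up exactly to the overall 𝑛.

```python
def proportional_allocation(N_i, n):
    """Allocate an overall sample size n across strata in proportion to N_i."""
    N = sum(N_i)
    # n_i = n * W_i, where W_i = N_i / N, rounded to a whole number
    return [round(n * Nh / N) for Nh in N_i]

# Usage with made-up stratum sizes
sizes = proportional_allocation([500, 300, 200], n=100)
print(sizes)  # [50, 30, 20]
```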

13
Q

What are the requirements of proportional allocation?

A

a) means that sampling fractions are equal for each stratum (up to rounding errors)
b) ensures a representative sample;
c) straightforward and commonly used;
d) requires the stratum sizes 𝑁𝑖 or at least the stratum weights 𝑊𝑖 to be known.

14
Q

When allocating sample size, how do you round the estimate?

A

the rounding of the overall sample size and sample sizes within strata should be to the nearest whole number

15
Q

What is Neyman allocation?

A

a) chooses the 𝑛𝑖 to directly minimise the variance of the estimator 𝑥̅𝑆𝑇
b) sample more from strata with higher variability and/or larger strata.

16
Q

What are the requirements of Neyman allocation?

A

a) Requires more information than proportional allocation
b) As well as the 𝑁𝑖 (or the 𝑊𝑖), it requires 𝑆𝑖 to be known or estimated
c) For equal 𝑆𝑖, the allocation reduces to proportional allocation, so the Neyman method is more general;
d) Where the 𝑆𝑖 differ, this method is more precise than proportional allocation.
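
A minimal Python sketch of Neyman allocation, assuming the standard rule that 𝑛𝑖 is proportional to 𝑁𝑖𝑆𝑖. The function name and the example figures are hypothetical.

```python
def neyman_allocation(N_i, S_i, n):
    """Allocate n across strata in proportion to N_i * S_i."""
    products = [Nh * Sh for Nh, Sh in zip(N_i, S_i)]
    total = sum(products)
    # Larger and/or more variable strata receive a larger share of the sample
    return [round(n * p / total) for p in products]

# Usage with made-up stratum sizes and standard deviations
print(neyman_allocation([500, 300, 200], [10, 20, 40], n=100))  # [26, 32, 42]
# With equal S_i the result matches proportional allocation
print(neyman_allocation([500, 300, 200], [5, 5, 5], n=100))     # [50, 30, 20]
```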

17
Q

What is relative efficiency?

A

a) measurement of the improvement in the performance of one unbiased estimator over another
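b) the ratio of the variances of the two estimators: for estimator a relative to estimator b, RE(a/b) = Var(b) / Var(a), so values above 1 favour estimator a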

18
Q

How do you interpret relative efficiency?

A

a) if 𝑅𝐸(𝑥̅𝑆𝑇/𝑥̅𝑆𝑅𝑆) = 2
b) then stratified sampling was twice as efficient as SRS.

19
Q

What is the design effect?

A

a) the reciprocal of the relative efficiency
b) another way to assess which estimator, and sampling scheme, is more efficient to use.
c) variance of estimator 1 over variance of estimator 2

20
Q

How do you interpret design effect?

A

a) the design effect of stratified random sampling relative to SRS, when estimating a population mean, would be:
𝑑𝑒𝑓𝑓(𝑥̅𝑆𝑇/𝑥̅𝑆𝑅𝑆) = 𝑉𝑎𝑟(𝑥̅𝑆𝑇) / 𝑉𝑎𝑟(𝑥̅𝑆𝑅𝑆)
b) a design effect less than 1 indicates that the sampling scheme in the numerator is the more efficient design
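
The relationship between the design effect and the relative efficiency can be illustrated with a short Python sketch. The function names and the two variance values are assumptions made for the example.

```python
def design_effect(var_numerator, var_denominator):
    """deff = Var(numerator estimator) / Var(denominator estimator)."""
    return var_numerator / var_denominator

def relative_efficiency(var_numerator, var_denominator):
    """Relative efficiency, taken here as the reciprocal of the design effect."""
    return 1.0 / design_effect(var_numerator, var_denominator)

# Hypothetical variances of the stratified and SRS estimators of the mean
var_st, var_srs = 0.05, 0.10
deff = design_effect(var_st, var_srs)      # below 1: stratified design is more efficient
re = relative_efficiency(var_st, var_srs)  # above 1: stratified design is more efficient
```

In this made-up case the design effect is about 0.5 and the relative efficiency about 2, which matches the reading that stratified sampling is twice as efficient as SRS.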

21
Q

What is optimal allocation?

A

a) chooses the sample sizes 𝑛𝑖 to minimise the variance
b) takes into account situations where there is a budget for conducting a survey
c) sampling costs may differ between strata
d) sample more from strata which are larger and/or have greater variability 𝑆𝑖 and/or have lower sampling costs.

22
Q

How do you calculate total cost of sampling C?

A

where ci is the cost of sampling a single population element from the i-th stratum
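
The formula this answer refers to did not survive on the page. A common linear cost model consistent with the definition of 𝑐𝑖 is sketched below; whether the original also included a fixed overhead term (often written c_0) is not known, so it is omitted here.

```latex
C = \sum_{i=1}^{k} c_i \, n_i
```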

23
Q

What is the optimal allocation sample size formula?

A

a) where 𝛾𝑖 is the fraction of the overall sample size 𝑛 to be taken from the 𝑖-th stratum, and ∑𝛾𝑖 = 1.
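
The formula itself is missing from this card. The standard textbook form of the optimal allocation fractions, which agrees with the special cases listed on the other cards (equal costs give Neyman allocation, equal costs and equal 𝑆𝑖 give proportional allocation), is offered here as a likely candidate rather than a confirmed reconstruction:

```latex
n_i = \gamma_i \, n,
\qquad
\gamma_i = \frac{N_i S_i / \sqrt{c_i}}{\sum_{j=1}^{k} N_j S_j / \sqrt{c_j}}
```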

24
Q

What are the optimal allocation results under certain circumstances?

A

a) For equal costs 𝑐𝑖, this method gives the same as Neyman allocation.
b) For equal costs and equal 𝑆𝑖, this method gives the same as proportional allocation
c) so optimal allocation is the most general of the 3 methods of allocation
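
These special cases can be checked numerically. The sketch below assumes the standard form of the optimal fractions, with 𝛾𝑖 proportional to 𝑁𝑖𝑆𝑖/√𝑐𝑖; the function name and all figures are hypothetical.

```python
from math import sqrt

def optimal_fractions(N_i, S_i, c_i):
    """Fractions gamma_i of the sample per stratum, proportional to N_i * S_i / sqrt(c_i)."""
    terms = [Nh * Sh / sqrt(ch) for Nh, Sh, ch in zip(N_i, S_i, c_i)]
    total = sum(terms)
    return [t / total for t in terms]  # fractions sum to 1

N_i = [500, 300, 200]

# Equal costs: same fractions as Neyman allocation (proportional to N_i * S_i)
neyman_like = optimal_fractions(N_i, [10, 20, 40], [1, 1, 1])

# Equal costs and equal S_i: same fractions as proportional allocation (N_i / N)
proportional_like = optimal_fractions(N_i, [5, 5, 5], [4, 4, 4])

# Unequal costs: the expensive, variable strata are sampled less than under Neyman
costly = optimal_fractions(N_i, [10, 20, 40], [1, 4, 16])
```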

25
Q

What's the difference between the different types of allocation?

A

a) Proportional: use for stratified random sampling when the 𝑆𝑖 are equal and the 𝑐𝑖 are equal across the strata.
b) Neyman: improves on proportional allocation for stratified random sampling when the 𝑆𝑖 differ (with equal sampling costs across strata).
c) Optimal: should be used when known sampling costs differ between strata (regardless of whether the 𝑆𝑖 are equal or different).

26