Stat 354 Flashcards

(88 cards)

1
Q

sampling theory vs. classical statistical theory

A
  • concerned w/ finite populations
  • different goals and restrictions
  • no density function, limited use of models
2
Q

If N = n

A

complete enumeration

census

3
Q

why survey? (survey vs census)

A
time
cost
speed
scope
accuracy
4
Q

Principal steps for surveying

A
Objectives
Resources
Population
Units of observation
Data to collect
Method of measurement
organization of field work
summary and analysis
5
Q

steps for surveying, Objectives

A

precise statement of objectives

6
Q

steps for surveying, resources

A

quantity of information “purchased”, cost of information for whole survey

7
Q

resources (quantity) depend on

A

number of observations made (items sampled)

design of survey

8
Q

Determining/setting resources

A

determine sample design to obtain:

  • most information (lowest SE) for a given budget
  • lowest cost for a given level of precision (SE)
9
Q

If resources cannot meet the objective

A

do not survey

10
Q

Target population

A

population of interest

collection of elements about which we wish to make inference

11
Q

Element

A

object from which we take a measurement

12
Q

Target population example

A

collection of voters in a community

13
Q

Element example

A

a registered voter

14
Q

Sample population

A

population sampled from

15
Q

Discussing the target population

A

be aware of assumptions made to make the leap from sample population to target population

16
Q

Example of sample population

A

collection of registered* voters in a community

17
Q

observational unit

A

element

18
Q

sampling unit

A

unit selected for a sample

  • may contain 1+ observational units
  • non-overlapping collection of elements from the population
19
Q

sampling unit example

A

a classroom

20
Q

observational unit example

A

a student in a classroom

21
Q

sampling frame

A

list of all sampling units in the population

22
Q

sampling frame example

A

list of all students in the school

list of all registered voters in the community

23
Q

reduced data quality

A

if you ask too many questions

-focus questions, be concise

24
Q

measurement methods

A

self-administered questionnaires

telephone, email, door-to-door, internet

25
very important step in methods
test questionnaire on a small scale (pilot study, pre-test) | improve and re-assess
26
steps for surveying, organization of field work
- train people in goals and methods - early quality checking - plan for non-response
27
steps for surveying, summary and analysis
- edit questionnaire, record errors - methods for handling non-response - different estimation methods - estimation of precision
28
Non-response
some elements of sample fail to provide responses to survey
29
Non-response bias
if non-responders have differing opinions/measurements from responders, bias occurs
30
non-response bias especially important when
non-response rate is high
31
selection bias
some units more likely to be included in sample than others | cannot be overcome by increased n
32
sample
collection of sampling units drawn from sampling frame (single or multiple frames)
33
Literary Digest poll, 1936
predicted 57% for Landon | highest response in history, 2.4 million | Roosevelt won with 62%
34
why did the Literary Digest poll fail
SRS from phone book and club membership -- selection bias (only rich 1/4 of pop. had phones)
35
what to learn from the Literary Digest poll
when selection procedure is biased, no size of n will help
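A minimal simulation sketch of this lesson, under loudly stated assumptions: the population below is hypothetical (not the real 1936 electorate), and the 25% phone ownership and the two support rates are made-up values chosen only to show that a biased frame stays biased as n grows.

```python
import random

rng = random.Random(1936)  # arbitrary seed so the run is repeatable

# Hypothetical population: 25% own a phone, and phone owners
# support the candidate at a different (assumed) rate than non-owners.
N = 40_000
population = []
for _ in range(N):
    has_phone = rng.random() < 0.25
    p_support = 0.45 if has_phone else 0.68  # assumed support rates
    population.append((has_phone, rng.random() < p_support))

true_share = sum(vote for _, vote in population) / N

# Biased frame: only phone owners can be selected.
phone_frame = [vote for has_phone, vote in population if has_phone]

estimates = {}
for n in (100, 1_000, 5_000):
    sample = rng.sample(phone_frame, n)
    estimates[n] = sum(sample) / n
    print(f"n = {n:>5}: estimate = {estimates[n]:.3f}, truth = {true_share:.3f}")
```

The estimate settles near the phone owners' support rate, not the population's, however large n becomes.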
36
personal vs mailed surveys
personal ca. 65% | mailed ca. 25%
37
how to find out if a sample is any good
ask how it was taken
38
Gallup poll, 1936
George Gallup | n = 50,000 people | predicted Roosevelt victory (56% vs. truth 62%) | predicted Digest results (44% vs. truth 43%)
39
Quota sampling
- interviewer assigned fixed number (quota) of subjects to interview - #s w/i categories are fixed
40
example of quota categories
residence | age | sex | economic status
41
goal of quota sampling
aims to be representative based on census data | ex. design sampling based on % men vs women in population
42
problems with quota sampling
- while sample controls for certain variables, it does not control for the one of interest (ex. can't control for Republican vs. Democratic) - interviewers are free to choose whom they want within quota
43
sources of error in surveys
Errors of non-observation | Errors of observation
44
Error of non-observation
sampling error | coverage error | non-response
45
sampling error
deviation between sample estimate and true population value
46
coverage error
sampling frame does not match perfectly w/ target population
47
errors of observation
interviewers | respondents
48
Interviewer error
affects the respondent's response in some way
49
example of interviewer error
body language
50
how to reduce sampling error
- sampling design - sample size - investigator
51
coverage error example
people who are unlisted in telephone book
52
Respondent error
differ in their ability and motivation to answer correctly | -response error
53
Response errors
recall bias | prestige bias | intentional deception | incorrect measurement
54
Recall bias
different responders recall differently
55
prestige bias
exaggerate to appear more prestigious
56
example of prestige bias
exaggerate income
57
Intentional deception example
don't want to admit to breaking the law
58
incorrect measurement
respondent doesn't understand measurement units | ex. reports in cm vs. m; cups of coffee vs. travel mugs
59
how to reduce non-response in data collection
reward for responding | inform ahead of time | shortened, concise, focused questionnaire | callback, persistence | marketing - train interviewers to 'sell it' | data cleaning - check for errors
60
sampling distribution of ȳ
distribution of values of ȳ over repeated samples of same size
61
characteristics of ȳ sampling distribution
- mean = µ - standard deviation σ/√n - approximately bell-shaped - assumes population is infinite
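A small simulation sketch of these characteristics, assuming a hypothetical skewed population and sampling with replacement (so the infinite-population result σ/√n applies without an fpc). The population, seed, n, and number of repetitions are all illustrative choices.

```python
import random
import statistics

random.seed(354)  # arbitrary seed so the run is repeatable

# Hypothetical population: 10,000 values from a skewed (exponential) distribution.
population = [random.expovariate(1.0) for _ in range(10_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 25        # size of each sample
reps = 5_000  # number of repeated samples

# Draw repeated samples (with replacement) and record each sample mean.
sample_means = [
    statistics.fmean(random.choices(population, k=n)) for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
theoretical_se = sigma / n ** 0.5

print(f"mu = {mu:.3f}, mean of sample means = {mean_of_means:.3f}")
print(f"sigma/sqrt(n) = {theoretical_se:.3f}, sd of sample means = {sd_of_means:.3f}")
```

The mean of the sample means should sit close to µ, and their standard deviation close to σ/√n rather than σ/n.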
62
sampling distribution if n is too big
shorter tails than normal | truncated | non-normal
63
covariance
large |Cov(y1, y2)| = greater dependence btw y1, y2 | depends on scale of measurement (units) | standardize by correlation
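A short sketch of the scale dependence, using made-up paired data (heights and weights are hypothetical values). The helper functions are written out by hand with the usual n - 1 divisor; they are illustrations, not a specific library's API.

```python
from math import sqrt

def covariance(x, y):
    """Sample covariance with divisor n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Covariance standardized by both standard deviations."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))

# Hypothetical paired measurements.
height_m = [1.60, 1.65, 1.70, 1.75, 1.80]
weight_kg = [55.0, 62.0, 66.0, 72.0, 80.0]
height_cm = [100 * h for h in height_m]  # same heights, different units

cov_m = covariance(height_m, weight_kg)
cov_cm = covariance(height_cm, weight_kg)
corr_m = correlation(height_m, weight_kg)
corr_cm = correlation(height_cm, weight_kg)

print(f"covariance:  metres {cov_m:.3f}, centimetres {cov_cm:.3f}")
print(f"correlation: metres {corr_m:.4f}, centimetres {corr_cm:.4f}")
```

Changing units multiplies the covariance by 100 but leaves the correlation unchanged.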
64
SRSWR
n independent samples of size 1 | may include duplicates
65
SRSWOR
every possible subset of n from N equally likely to be chosen
66
what is the probability of selecting an individual sample in SRSWOR
1/ (N choose n)
67
N choose n
N! / (n! (N-n)!)
68
n!
product of all positive integers less than or equal to n | ex. 5 ! = 5 × 4 × 3 × 2 × 1 = 120
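The last three cards fit together in a few lines. This is a sketch with a hypothetical tiny population (N = 5, n = 2) chosen so the counts are easy to check by hand.

```python
from math import comb, factorial

N, n = 5, 2  # hypothetical small population and sample size

# N choose n from the factorial definition: N! / (n! (N - n)!)
by_factorials = factorial(N) // (factorial(n) * factorial(N - n))

# The standard library computes the same count directly.
num_samples = comb(N, n)

# Under SRSWOR every possible sample is equally likely.
prob_one_sample = 1 / num_samples

print(by_factorials, num_samples, prob_one_sample)  # 10 10 0.1
```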
69
what is the probability that the ith unit is in the sample (πi)?
n/N | P(ith unit in sample) = n/N = πi
70
πi =
number of samples that contain unit i / total number of possible samples
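A brute-force check of πi = n/N by counting, assuming a hypothetical population of N = 6 labelled units and samples of size n = 3. Exact fractions are used so no rounding is involved.

```python
from fractions import Fraction
from itertools import combinations

N, n = 6, 3  # hypothetical small population and sample size
units = range(1, N + 1)

# Every possible SRSWOR sample of size n.
all_samples = list(combinations(units, n))

def inclusion_probability(i):
    """Share of all possible samples that contain unit i."""
    containing_i = sum(1 for sample in all_samples if i in sample)
    return Fraction(containing_i, len(all_samples))

pi = {i: inclusion_probability(i) for i in units}

for i, p in pi.items():
    print(f"unit {i}: pi = {p}")  # each equals n/N = 1/2
```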
71
ways to draw a SRS
- haphazard sampling - list all (N choose n) subsets, choose at random - random number generator - blind sampling - draw elements at random, include if not duplicates
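A sketch of the random-number-generator approaches using Python's standard `random` module. The frame of 20 labelled units, the seed, and the helper name `draw_with_rejection` are hypothetical choices for illustration.

```python
import random

rng = random.Random(2024)  # arbitrary seed so the run is repeatable

# Hypothetical sampling frame of N = 20 labelled units.
frame = [f"unit_{k}" for k in range(1, 21)]
n = 5

# SRSWOR in one call: every subset of size n is equally likely.
srswor = rng.sample(frame, n)

def draw_with_rejection(frame, n, rng):
    """Draw units one at a time, keeping a unit only if it is not a duplicate."""
    chosen = []
    while len(chosen) < n:
        candidate = rng.choice(frame)
        if candidate not in chosen:
            chosen.append(candidate)
    return chosen

rejection_sample = draw_with_rejection(frame, n, rng)

# SRSWR for comparison: n independent draws, duplicates allowed.
srswr = rng.choices(frame, k=n)

print("SRSWOR:          ", srswor)
print("draw-and-reject: ", rejection_sample)
print("SRSWR:           ", srswr)
```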
72
haphazard sampling
using own judgement to draw a sample | ≠ random sample
73
fpc
finite population correction | 1 - (n/N)
74
when N is large, fpc is
ca. 1 | 1 - (n/N) = 1 - (ca. 0)
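A small sketch of how the fpc behaves and where it is typically used. The standard-error formula sqrt((1 - n/N) s²/n) is the usual SRSWOR estimator and is brought in here as an assumption (it is not stated on the cards); the function names and the numbers are illustrative.

```python
from math import sqrt

def fpc(n, N):
    """Finite population correction, 1 - n/N."""
    return 1 - n / N

def se_mean_srswor(s, n, N):
    """Estimated standard error of the sample mean under SRSWOR."""
    return sqrt(fpc(n, N) * s**2 / n)

n, s = 100, 10.0  # hypothetical sample size and sample standard deviation

for N in (200, 1_000, 1_000_000):
    print(f"N = {N:>9}: fpc = {fpc(n, N):.4f}, SE = {se_mean_srswor(s, n, N):.4f}")
```

As N grows with n fixed, the fpc approaches 1 and the standard error approaches s/√n.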
75
CLT for SRSWOR
n, N → ∞ | n/N → C < 1 | n, N, N-n must be 'sufficiently large' | n ≥ 50 usually ok
76
in experimental design, what is used to reduce variability
blocking (analogous to stratification)
77
strata
division of population into a number of non-overlapping groups
78
stratified random sample
SRS drawn from each stratum
79
advantages of stratification
- may be more precise if sub-populations have different means - administrative advantages - can obtain separate estimates of each parameter for each stratum
80
ai
proportion of the sample allocated to each stratum
81
how do we decide ai
small variance | lowest cost
82
Best allocation is affected by
Ni (# of elements in each stratum) | Si^2 (variability in each stratum) | cost of obtaining an observation in each stratum
83
How do factors that affect allocation impact sample size
larger sample sizes to strata w/ larger pop.'s | larger sample sizes to strata w/ larger variability | smaller sample sizes if costs are high
84
Types of allocation models
Optimal allocation | Neyman allocation | Proportional allocation
85
Optimal allocation
most information for least cost | choose ni to minimize V(ȳst) for a fixed C, or minimize C for a fixed V(ȳst) | C = c0 + Σ ci ni
86
Neyman allocation
special case of optimal allocation | used when costs are equal in all strata
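A sketch of optimal and Neyman allocation, assuming the standard textbook result that the allocation fractions are proportional to Ni·Si/√ci (the formula is not given on the cards, and reading ai as ni/n is also an assumption). The stratum sizes, standard deviations, and costs below are hypothetical.

```python
from math import sqrt

def allocation_fractions(N, S, c=None):
    """Fractions a_i = n_i / n, proportional to N_i * S_i / sqrt(c_i).

    With c omitted, costs are treated as equal, which gives Neyman allocation.
    """
    if c is None:
        c = [1.0] * len(N)
    weights = [Ni * Si / sqrt(ci) for Ni, Si, ci in zip(N, S, c)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical strata.
N = [500, 300, 200]     # stratum sizes
S = [10.0, 20.0, 30.0]  # stratum standard deviations
c = [4.0, 9.0, 16.0]    # cost per observation in each stratum

optimal = allocation_fractions(N, S, c)
neyman = allocation_fractions(N, S)

print("optimal:", [round(a, 3) for a in optimal])
print("Neyman: ", [round(a, 3) for a in neyman])
```

Larger and more variable strata get a bigger share of the sample; expensive strata get a smaller one.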
87
Proportional allocation
split sample into strata w/ same proportion as population | ni/n = Ni/N | the stratified estimator (ȳst) is the average of all observations
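A worked sketch of the last claim, assuming the usual stratified estimator ȳst = Σ (Ni/N) ȳi (this formula is not written on the cards). The two strata and their observations are made up so that the allocation is exactly proportional.

```python
# Hypothetical strata: sizes N_i and the observations sampled from each.
N_sizes = [600, 400]
samples = [
    [10, 12, 11, 13, 9, 11],  # stratum 1, n_1 = 6
    [20, 22, 19, 23],         # stratum 2, n_2 = 4
]

N = sum(N_sizes)
n = sum(len(s) for s in samples)

# Check the allocation is proportional: n_i / n == N_i / N.
for Ni, s in zip(N_sizes, samples):
    assert len(s) * N == Ni * n

# Stratified estimator: stratum means weighted by N_i / N.
stratum_means = [sum(s) / len(s) for s in samples]
y_st = sum(Ni / N * m for Ni, m in zip(N_sizes, stratum_means))

# Plain average of every observation, ignoring strata.
overall_mean = sum(sum(s) for s in samples) / n

print(f"y_st = {y_st:.4f}, overall sample mean = {overall_mean:.4f}")
```

With a non-proportional allocation the two numbers would generally differ.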
88
rounding rules
always round up for n, except for optimal allocation (don't cross budget)
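A minimal sketch of the two rounding directions. The required n, the budget, and the costs are hypothetical, and a single per-unit cost stands in for the stratum costs to keep the example short.

```python
from math import ceil, floor

# Precision-driven sample size: round up so the target precision is still met.
n_required = 96.04  # hypothetical value from a sample-size formula
n_precision = ceil(n_required)

# Budget-driven sample size: round down so the budget is not exceeded.
budget = 1000.0      # hypothetical total budget C
c0 = 100.0           # hypothetical fixed (overhead) cost
cost_per_unit = 9.5  # hypothetical cost per observation
n_budget = floor((budget - c0) / cost_per_unit)

total_cost = c0 + cost_per_unit * n_budget

print(f"precision-driven n = {n_precision}")
print(f"budget-driven n = {n_budget}, total cost = {total_cost}")
```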