Sampling Techniques, Probability, and Distributions Flashcards

(62 cards)

1
Q

target population

A

the entire group of individuals or entities to which researchers intend to generalize the findings of their study

2
Q

study population

A

a more specific group within the target population that is accessible and practical for the researchers to study

3
Q

sample

A

a subset of the study population that is actually observed or analyzed

4
Q

representative sample

A

a sample whose characteristics correspond to, or reflect, those of the original population or reference population

5
Q

probability sampling

A

method in which every member of the population has a known, non-zero chance of being selected

6
Q

simple random sampling

A

every member of the population has an equal chance of being selected

7
Q

what are advantages of simple random sampling?

A

minimal knowledge of population needed

external validity high, internal validity high

statistical estimation of error

easy to analyze data

8
Q

what are disadvantages of simple random sampling?

A

high cost, low frequency of use

requires sampling frame

does not use researcher’s expertise

larger risk of random error than stratified

9
Q

systematic sampling and formula

A

every f-th member of the population is selected after a random starting point

f = N/sn, where f is the sampling interval, N is the population size, and sn is the sample size
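
The selection rule above can be sketched in Python. This is only an illustration, assuming the population is held in a list and the interval (population size divided by sample size) is rounded down to a whole number; the function name systematic_sample and the population of 100 IDs are hypothetical.

```python
import random

def systematic_sample(population, n, rng=None):
    """Pick a random start, then take every f-th member, where f = N // n."""
    rng = rng or random.Random()
    N = len(population)
    f = N // n                    # sampling interval, rounded down (assumption)
    if f < 1:
        raise ValueError("sample size cannot exceed population size")
    start = rng.randrange(f)      # random starting point within the first interval
    return population[start::f][:n]

population = list(range(1, 101))  # illustrative population of 100 IDs
sample = systematic_sample(population, 10, random.Random(42))
print(sample)
```

With N = 100 and a sample of 10, the interval is 10, so the selected IDs are always exactly 10 apart.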

10
Q

stratified sampling

A

the population is divided into subgroups (strata) based on certain characteristics, and random samples are taken from each stratum
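
A minimal Python sketch of this idea, assuming an equal number of units is drawn from each stratum (proportional allocation is another common choice). The function name stratified_sample and the urban/rural data are made up for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, per_stratum, rng=None):
    """Group units into strata, then draw a simple random sample from each."""
    rng = rng or random.Random()
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = min(per_stratum, len(members))   # never ask for more than a stratum holds
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 10 rural and 20 urban residents
people = [("p%d" % i, "urban" if i % 3 else "rural") for i in range(30)]
chosen = stratified_sample(people, lambda person: person[1], 4, random.Random(0))
print(chosen)
```

Every stratum is guaranteed to appear in the sample, which a simple random sample of the same size cannot promise.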

11
Q

cluster sampling

A

the population is divided into clusters (usually based on geographical areas) and a random sample of clusters is selected. All members of the selected clusters are then surveyed

12
Q

non-probability sampling

A

not all members of the population have a known or equal chance of being included in the sample

13
Q

what are common types of non-probability sampling?

A

convenience sampling

judgmental or purposive sampling

quota sampling

snowball sampling

14
Q

convenience sampling

A

participants are selected based on ease of access or availability

15
Q

judgmental or purposive sampling

A

researchers select participants based on their judgement about who will provide the best information

16
Q

quota sampling

A

the population is segmented into mutually exclusive subgroups, and a non-random sample is taken from each subgroup, ensuring that specific characteristics are represented in the sample

17
Q

snowball sampling

A

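existing participants recruit further participants from among their acquaintances, so the sample grows through referrals (you ask your friend and they ask their friend)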

18
Q

sampling errors

A

deviations from the true population parameters due to the nature of selecting a sample rather than a complete census

19
Q

random sampling error

A

occurs due to the natural variability that arises when a sample is drawn from a population

20
Q

systematic sampling error

A

occurs when there is consistent bias in the sampling process that leads to a sample that is not representative of the population

21
Q

coverage error

A

when some members of the population are not included in the sampling frame

22
Q

non-response error

A

when individuals selected for the sample do not respond or participate in the study

23
Q

deterministic process and example

A

total certainty, all data is known beforehand

if you know the initial deposit and interest rate of a bank account, you can determine amount of money in it after one year

24
Q

probabilistic process and example

A

random, stochastic

rolling a die until it comes up 5 (you know the probability on each roll is 1/6, but you don’t know when that will happen)

25
What is the probability as relative frequency formula
P(A) = F(A)/F(E), where P(A) = probability of outcome A occurring, F(A) = absolute frequency of outcome A, and F(E) = absolute frequency of all outcomes for event E
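
As a worked illustration of the relative-frequency formula, the sketch below counts outcomes in a small list of die rolls. The rolls are made-up data and the helper name relative_frequency is hypothetical.

```python
from collections import Counter
from fractions import Fraction

rolls = [1, 3, 5, 5, 2, 6, 5, 4, 1, 5]   # made-up observations of a die
freq = Counter(rolls)                    # F(A) for each outcome A

def relative_frequency(outcome):
    """P(A) = F(A) / F(E): count of the outcome over the count of all outcomes."""
    return Fraction(freq[outcome], len(rolls))

print(relative_frequency(5))   # 4 fives out of 10 rolls
```

Using Fraction keeps the result exact, which makes it easy to check that the relative frequencies of all outcomes sum to 1.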
26
multiplication rule of probability formula
P(A and B) = P(A)*P(B), for independent events A and B
27
addition rule of probability
P(A or B) = P(A) + P(B), for mutually exclusive events A and B; in general, P(A or B) = P(A) + P(B) - P(A and B)
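
The multiplication and addition rules can be checked by brute-force enumeration. The sketch below assumes two fair six-sided dice; the helper prob and the event names are hypothetical. It also shows that the simple sum only works for events that cannot happen together.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))   # 36 equally likely rolls of two dice

def prob(event):
    """Probability of an event as favourable outcomes over all outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

first_is_six = lambda o: o[0] == 6
second_is_six = lambda o: o[1] == 6
total_is_2 = lambda o: sum(o) == 2
total_is_3 = lambda o: sum(o) == 3

# Multiplication rule: the two dice are independent
p_both = prob(lambda o: first_is_six(o) and second_is_six(o))

# Addition rule: a total of 2 and a total of 3 are mutually exclusive
p_either = prob(lambda o: total_is_2(o) or total_is_3(o))

# Not mutually exclusive: the overlap (double six) must be subtracted once
p_any_six = prob(lambda o: first_is_six(o) or second_is_six(o))

print(p_both, p_either, p_any_six)
```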
28
statistical independence
the probability of one event is not influenced by whether another event has occurred
29
The Law of Large Numbers
as the number of repetitions of a probability experiment increases, the proportion with which a certain outcome is observed gets closer to the probability of the outcome
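
A small simulation can make the Law of Large Numbers visible. This sketch assumes a fair die and tracks the proportion of fives; the function name proportion_of_fives and the seed are arbitrary choices for illustration.

```python
import random

def proportion_of_fives(n_rolls, rng):
    """Roll a fair die n_rolls times and return the share of rolls showing 5."""
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 5)
    return hits / n_rolls

rng = random.Random(2024)   # fixed seed so the run is repeatable
for n in (10, 100, 10_000, 100_000):
    print(n, round(proportion_of_fives(n, rng), 4))
```

For small n the proportion can be far from 1/6 (about 0.1667); as n grows it tends to settle close to that value.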
30
probability distribution
a statistical function that describes all possible values and likelihoods that a random variable can take within a given range
31
Probability Mass Function (PMF)
a function that gives the probability of each possible outcome for a discrete random variable.
32
purposes of PMF
- quantifying specific outcomes - constructing probability models - understanding distribution
33
how does PMF quantify specific outcomes?
the PMF allows us to calculate the probability of specific outcomes; e.g. for a die roll, the PMF gives the probability of rolling a 1, 2, 3, etc.
34
how does PMF construct probability models
by providing a structured way to assign a probability to each possible outcome of a discrete random variable, the PMF enables the construction of a complete probability model for a discrete scenario
35
how does the PMF help with understanding distribution
by knowing the PMF, we understand how the probabilities are distributed across different outcomes, which is crucial for further analysis and decision-making
36
cumulative mass function (CMF)
a function that tells you the probability that a random variable is less than or equal to a certain value
37
What is the difference between PMF and CMF
The PMF gives you the probability of an exact outcome, while the CMF gives you the probability that the outcome is up to a certain point
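
The contrast between the two functions can be sketched for a fair six-sided die. The function names pmf and cmf are illustrative, and the fair die is an assumption.

```python
from fractions import Fraction

faces = range(1, 7)   # outcomes of a fair six-sided die

def pmf(x):
    """Probability of rolling exactly x."""
    return Fraction(1, 6) if x in faces else Fraction(0)

def cmf(x):
    """Probability of rolling x or less: the running total of the PMF."""
    return sum((pmf(k) for k in faces if k <= x), Fraction(0))

print(pmf(3), cmf(3))   # exactly 3 versus at most 3
```

The CMF at the largest outcome is always 1, because it has accumulated the probability of every outcome.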
38
probability density function (PDF)
a function that describes the likelihood of different outcomes for a continuous random variable
39
why does PDF give ranges, not discrete values?
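For continuous random variables (like height or weight), the Probability Density Function (PDF) doesn’t give you the probability of an exact value: there are infinitely many possible values, so the probability of any single exact value is zero, and probabilities are instead obtained as the area under the PDF over a range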
40
probability density
refers to a value that describes how concentrated the probability is at a point in the continuous case, but actual probabilities for continuous variables are found over intervals
41
cumulative distribution function (CDF)
used here for continuous random variables (the counterpart of the CMF); represents the probability that the random variable X takes on a value less than or equal to x, i.e. F(x) = P(X ≤ x)
42
what are the purposes of CDF?
- aggregating probabilities - analyzing distribution - comparing distributions - facilitating calculations
43
uniform distribution
describes the situation where the probability of all outcomes is the same, e.g. flipping a fair coin
44
binomial distribution
gives the probability of each possible number of successes in a fixed number of repeated independent trials, where each trial has only two possible outcomes, e.g. heads or tails, left or right
45
Bernoulli trials
random experiments that have exactly two possible outcomes: success or failure
46
what are the four steps of the Bernoulli trials
1. N independent trials of an experiment (i.e. an event like a coin toss) 2. every trial must have the same set of possible outcomes (e.g. heads and tails) 3. the probability of each outcome must be the same for all trials 4. the resulting random variable is determined by the number of successes in the trials (a success is one of the two outcomes)
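
The four conditions above lead to the binomial distribution. The card does not state a formula, so the sketch below uses the standard binomial expression C(n, k) * p^k * (1 - p)^(n - k); the function name binomial_pmf and the coin-toss numbers are illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with the same success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Example: exactly 5 heads in 10 tosses of a fair coin
print(binomial_pmf(5, 10, 0.5))
```

Summing binomial_pmf over k = 0..n gives 1, which is a quick sanity check that the probabilities form a complete distribution.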
47
Poisson distribution
can be used to analyze how frequently an outcome occurs during a certain time period or across a particular area
48
poisson distribution formula
P(k events) = (λ^k * e^(−λ)) / k!, where λ is the average number of events in the interval (mean), e is Euler’s number (approximately 2.718), and k! is the factorial of k
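
The formula translates directly into a few lines of Python. The function name poisson_pmf and the example rate of 3 events per interval are assumptions made for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(k events) = lam^k * e^(-lam) / k!, with lam the mean number of events."""
    return lam**k * exp(-lam) / factorial(k)

# Example: an average of 3 events per interval; probability of seeing exactly 2
print(round(poisson_pmf(2, 3), 4))
```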
49
for a Poisson distribution, the probability that an event will occur within a given unit must be the ____ for all units
same
50
how do we observe the frequency of rare events
1. find its mean occurrence 2. construct a Poisson distribution and compare our observed values to those from the distribution to see the degree to which our observed phenomenon is obeying the Law of Small Numbers
51
law of small numbers
describes how the Poisson distribution can model the probability of rare events happening over a given time or space
52
large number of trials law
assumes that the number of trials or opportunities for the event to occur is large, which makes it possible for the rare event to happen multiple times, despite the low probability per trial
53
how does the law of small numbers relate to the Poisson distribution
it's used as a foundation for the Poisson distribution, because it models the probability of a given number of rare events happening over a fixed interval of time or space
54
why do you keep np constant in a Poisson distribution
to ensure that the expected number of successes (λ = np) remains meaningful and fixed, regardless of changes in n and p
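
One way to see why np is held constant: as n grows and p shrinks with np = λ unchanged, the binomial probability of k successes approaches the Poisson probability. The sketch below uses λ = 2 and k = 3, both chosen arbitrarily, and the standard binomial formula.

```python
from math import comb, exp, factorial

lam = 2.0   # expected number of successes, held constant (np = lam)
k = 3       # number of successes whose probability we compare

poisson = lam**k * exp(-lam) / factorial(k)

gaps = []
for n in (10, 100, 1_000, 10_000):
    p = lam / n                                   # p shrinks as n grows
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    gaps.append(abs(binom - poisson))
    print(n, p, round(binom, 6), round(poisson, 6))
```

The gap between the two probabilities shrinks at each step, which is the sense in which the Poisson distribution models many trials with a small probability per trial.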
55
in a normal distribution, all 3 measures of central tendency are _____
equal
56
what are probability estimates based on in a normal distribution?
area under the curve
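
As a hedged illustration of "area under the curve", the sketch below uses Python's standard-library statistics.NormalDist to compute the area between two points as a difference of cumulative probabilities. The helper name area_between and the choice of a standard normal curve (mean 0, SD 1) are assumptions.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_between(low, high, dist=standard_normal):
    """Probability that a value falls between low and high:
    the area under the curve between the two points."""
    return dist.cdf(high) - dist.cdf(low)

# Area within one standard deviation of the mean
print(round(area_between(-1, 1), 4))
```

Because the curve is symmetric about the mean, the area to the left of the mean is exactly half of the total.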
57
normal distribution is defined by its _____ and _____
mean and SD
58
what does standardizing mean?
refers to the process of transforming data to have a mean of 0 and a standard deviation of 1. This transformation makes different datasets comparable by placing them on the same scale
59
because normal tables are standardized, how do we transform data to a standard normal distribution
z-score
60
z-score and formula
measures how many standard deviations a data point is from the mean of a dataset. It helps you understand the relative position of a specific value within the distribution of the data. z= (data point - sample mean)/sample SD
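
The z-score formula can be sketched with the standard library. The scores are made-up data and the function name z_score is hypothetical; stdev here is the sample standard deviation, matching the formula on the card.

```python
from statistics import mean, stdev

scores = [62, 70, 74, 78, 86]   # made-up sample data

def z_score(x, data):
    """(data point - sample mean) / sample SD."""
    return (x - mean(data)) / stdev(data)

standardized = [z_score(x, scores) for x in scores]
print([round(z, 3) for z in standardized])
```

After standardizing, the values have a mean of 0 and a standard deviation of 1, so they can be compared with values from other datasets on the same scale.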
61
t-distribution
probability distribution used in statistics to estimate population parameters when the sample size is small and the population standard deviation is unknown; similar to the normal distribution but with heavier tails
62
chi-square distribution
used primarily in hypothesis testing and inferential statistics, especially in tests related to variance and categorical data