Midterm Flashcards

Question

Null Hypothesis (H₀)

Answer 1

The hypothesis that there is no effect or no difference. It is assumed to be true until evidence suggests otherwise.

Answer 2

The hypothesis that there is an effect or a difference.

Answer 3

The probability, assuming H₀ is true, of observing data at least as favorable to Hₐ as the data actually observed (the observed data being the "weakest evidence" counted). A low p-value (typically < 0.05) suggests that the null hypothesis should be rejected.

Answer 4

Rejecting the null hypothesis in favor of Hₐ when H₀ is actually true (a false positive). The probability of this error is set by the significance level (α). Importance: Controlling Type I errors is crucial to avoid false conclusions.
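
The link between α and the Type I error rate can be checked with a small simulation. This is only a sketch under stated assumptions: the function name `false_positive_rate`, the two-sided z-test with a known standard deviation of 1, and all parameter values are chosen for illustration and are not from the course material.

```python
import math
import random
from statistics import NormalDist

def false_positive_rate(alpha=0.05, n=25, n_experiments=4_000, seed=0):
    """Fraction of experiments that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    # Two-sided critical value of the standard normal for this alpha.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_experiments):
        # H0 is true here: the population mean really is 0 (sigma = 1, assumed known).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) / (1.0 / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # a Type I error
    return rejections / n_experiments

rate_05 = false_positive_rate(alpha=0.05)
rate_01 = false_positive_rate(alpha=0.01)
print(f"alpha = 0.05 -> Type I error rate = {rate_05:.3f}")
print(f"alpha = 0.01 -> Type I error rate = {rate_01:.3f}")
```

The simulated rate should land close to whichever α is chosen, which is why lowering α lowers the chance of a Type I error.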

Answer 5

Failing to reject the null hypothesis when it is actually false. This is influenced by the sample size and the effect size. Importance: Minimizing Type II errors is important to ensure that real effects are not missed.
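
A rough simulation can show how sample size affects the Type II error rate. This is a sketch, not a definitive method: the name `type_ii_error_rate`, the effect size of 0.5, the known standard deviation of 1, and the two-sided z-test are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist

def type_ii_error_rate(effect, n, alpha=0.05, n_experiments=2_000, seed=0):
    """Fraction of experiments that fail to reject H0 although a real effect exists."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    misses = 0
    for _ in range(n_experiments):
        # H0 (mean = 0) is false here: the true mean equals `effect`, sigma = 1.
        sample_mean = sum(rng.gauss(effect, 1.0) for _ in range(n)) / n
        z = sample_mean * math.sqrt(n)
        if abs(z) <= z_crit:
            misses += 1  # a Type II error
    return misses / n_experiments

small_n = type_ii_error_rate(effect=0.5, n=10)
large_n = type_ii_error_rate(effect=0.5, n=50)
print(f"n = 10: Type II error rate = {small_n:.3f}")
print(f"n = 50: Type II error rate = {large_n:.3f}")
```

With the same effect size, the larger sample misses the real effect far less often.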

Answer 6

Analytical methods are used when the data meets certain conditions, such as being normally distributed or having a sufficiently large sample size. Importance: Analytical methods provide precise calculations of probabilities and are essential when the conditions are met. However, if the conditions are not met, simulation methods may be more appropriate.

Answer 7

1) Assume H₀ initially and simulate what chance alone would produce: shuffle the predictor variable to mimic natural variation (hold one column and shuffle the other).
2) Simulate many trials, computing the test statistic (a difference in proportions, or mean1 - mean2) for each shuffle.
3) Plot the number of trials vs. the simulated test statistic (a histogram).
4) Find the p-value: the proportion of simulated trials with a value >= the observed test statistic, i.e. the probability, assuming H₀ is true, of data at least as favorable to Hₐ as what was observed.
5) If the p-value < the preset threshold alpha (known as the significance level), reject H₀ in favor of Hₐ.
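
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function name `permutation_test`, the one-sided comparison, and the treatment/control numbers are all invented for the example.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_trials=10_000, seed=0):
    """One-sided shuffle test for the statistic mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_trials):
        # Shuffling breaks any link between group label and outcome (assumes H0).
        rng.shuffle(pooled)
        simulated = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if simulated >= observed:
            at_least_as_extreme += 1
    # p-value: share of shuffles with a statistic >= the observed one.
    return observed, at_least_as_extreme / n_trials

# Hypothetical data, invented for illustration only.
treatment = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
control = [11.0, 12.2, 11.8, 12.9, 11.5, 12.4]

observed, p_value = permutation_test(treatment, control)
alpha = 0.05
print(f"observed difference = {observed:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Sampling here is without replacement: each shuffle reuses every observation exactly once and only changes which group it lands in.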

Answer 8

Represents how rare our test statistic (the difference in proportions or means) must be under H₀ before we reject H₀ in favor of Hₐ. It is the threshold, typically 0.05, that the p-value is compared against.

Answer 9

States that the distribution of sample means will approximate a normal (Gaussian) distribution as the sample size increases, regardless of the original population's distribution.
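
A quick simulation makes this concrete. The sketch below assumes a strongly skewed exponential population (mean 1, standard deviation 1); the function name `sample_means` and the sample sizes are illustrative choices, not part of the original card.

```python
import random
from statistics import mean, stdev

def sample_means(n, n_samples=2_000, seed=1):
    """Draw n_samples samples of size n and return the mean of each sample."""
    rng = random.Random(seed)
    # Exponential population with rate 1: skewed, not Gaussian.
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(n_samples)]

for n in (2, 30, 100):
    means = sample_means(n)
    print(f"n = {n:>3}: center = {mean(means):.3f}, "
          f"spread = {stdev(means):.3f}, "
          f"sigma / sqrt(n) = {1 / n ** 0.5:.3f}")
```

As n grows, the sample means stay centered on the population mean while their spread shrinks toward sigma / sqrt(n), and a histogram of `means` looks increasingly bell-shaped.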

Answer 10

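A normal distribution (bell curve): symmetric about the mean, with "common" standard deviation zones (about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean).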

Answer 11

A discrete (not continuous) probability distribution that models the number of events (independent of one another) occurring in a fixed interval, given a constant rate of occurrence.
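
For reference, the Poisson probability of seeing exactly k events is rate^k * e^(-rate) / k!. A minimal sketch follows; the function name `poisson_pmf` and the rate of 3 events per interval are assumptions for illustration.

```python
import math

def poisson_pmf(k, rate):
    """P(X = k) when events occur independently at `rate` per interval on average."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

rate = 3.0  # hypothetical average number of events per interval
for k in range(7):
    print(f"P(X = {k}) = {poisson_pmf(k, rate):.4f}")
```

Only whole-number counts k = 0, 1, 2, ... receive probability, which is what makes the distribution discrete.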

Answer 12

The Poisson distribution counts the number of events and is discrete, whereas the normal distribution is continuous and describes a range of values (such as sample means).

Answer 13

* The standard deviation is known, so the test is accurate.
* Used for the difference in two proportions.
A z-test compares a sample mean (or proportion) to a population parameter under the assumption that the population standard deviation (σ) is known (or the sample is large enough to estimate σ very accurately). Narrow, tall curve.

Answer 14

* The standard deviation of the population is unknown, so an estimate is used.
* Used for the difference in two means.
A t-test compares sample means when the population standard deviation (σ) is unknown, and it uses the sample's standard deviation (s) as an estimate. Flatter curve, which gets pointier with an increased sample size.

Answer 15

z-test if: the population standard deviation is known, or the sample is large (sample size n ≥ 30), so the sampling distribution is approximately Gaussian and you trust the normal approximation.
t-test if: the population standard deviation is unknown, or the sample is small but you can assume (approximately) normal data.
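
The two test statistics differ only in which standard deviation goes in the denominator. The sketch below uses the one-sample case for simplicity; the function names, the measurements, the null value `mu0`, and the "known" σ of 0.3 are all hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def z_statistic(sample, mu0, sigma):
    """z statistic: the population standard deviation sigma is known."""
    return (mean(sample) - mu0) / (sigma / math.sqrt(len(sample)))

def t_statistic(sample, mu0):
    """t statistic: sigma is unknown, so the sample's s stands in for it."""
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(len(sample)))

# Hypothetical measurements and null value, invented for illustration.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.5]
mu0 = 5.0

z = z_statistic(sample, mu0, sigma=0.3)  # sigma = 0.3 is an assumed known value
t = t_statistic(sample, mu0)

# Two-sided p-value for z, read off the standard normal distribution.
p_z = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.3f} (p = {p_z:.4f}), t = {t:.3f} with {len(sample) - 1} degrees of freedom")
```

The p-value for t would come from a t-distribution with n - 1 degrees of freedom, which the Python standard library does not provide, so it is left out here.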

Answer 16

95% confidence interval and standard error

Answer 17

It is a resampling technique that randomly draws samples with replacement from the original data set (so the same observation can be sampled multiple times) to estimate the sampling distribution of a statistic. Confidence intervals are then calculated from the statistics (e.g. proportions) simulated from the resampled data, without relying on assumptions about the base population.
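
A minimal sketch of a percentile bootstrap interval is shown below. The function name `bootstrap_ci`, the choice of the mean as the statistic, the percentile method itself, and the data values are assumptions made for illustration.

```python
import random
from statistics import mean

def bootstrap_ci(data, stat=mean, n_resamples=5_000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for `stat` of `data`."""
    rng = random.Random(seed)
    n = len(data)
    # rng.choices samples WITH replacement, so observations can repeat.
    stats = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    lo_idx = round((1 - level) / 2 * n_resamples)
    hi_idx = round((1 + level) / 2 * n_resamples) - 1
    return stats[lo_idx], stats[hi_idx]

# Hypothetical measurements, invented for illustration only.
data = [2.3, 3.1, 2.8, 4.0, 3.6, 2.9, 3.3, 5.1, 2.7, 3.8, 3.0, 4.4]

lo95, hi95 = bootstrap_ci(data, level=0.95)
lo90, hi90 = bootstrap_ci(data, level=0.90)
print(f"point estimate = {mean(data):.2f}")
print(f"95% CI = ({lo95:.2f}, {hi95:.2f}), 90% CI = ({lo90:.2f}, {hi90:.2f})")
```

Passing a different function as `stat` (for example `statistics.median`) gives an interval for that summary instead.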

Answer 18

It acts as the basis around which we use bootstrapped data to produce a range/interval.

Answer 19

1 standard deviation: ~68% chance of capturing the true population parameter. Narrower range. Represented by ±. More commonly used.

Answer 20

95% chance of capturing the true population parameter. Wider range, and thus typically shown as a range. Less commonly used.
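
To compare the two ways of reporting uncertainty side by side, here is a sketch that computes a standard error and a normal-approximation confidence interval for a sample mean. The function names and the data are hypothetical, and the normal approximation is an assumption of the example.

```python
import math
from statistics import NormalDist, mean, stdev

def standard_error(sample):
    """Standard error of the mean: s / sqrt(n)."""
    return stdev(sample) / math.sqrt(len(sample))

def normal_ci(sample, level=0.95):
    """Confidence interval for the mean using the normal approximation."""
    z = NormalDist().inv_cdf((1 + level) / 2)  # about 1.96 when level = 0.95
    m, se = mean(sample), standard_error(sample)
    return m - z * se, m + z * se

# Hypothetical measurements, invented for illustration only.
sample = [9.8, 10.4, 10.1, 9.5, 10.9, 10.2, 9.9, 10.6, 10.0, 10.3]

se = standard_error(sample)
lo95, hi95 = normal_ci(sample, 0.95)
lo90, hi90 = normal_ci(sample, 0.90)
print(f"mean ± SE: {mean(sample):.2f} ± {se:.2f}")
print(f"90% CI: ({lo90:.2f}, {hi90:.2f})")
print(f"95% CI: ({lo95:.2f}, {hi95:.2f})")
```

The ± SE band is the narrowest of the three, and the interval widens as the confidence level rises.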

Answer 21

Bootstrapping is used to estimate the standard error (uncertainty) for each measurement by resampling the data. These uncertainties are then propagated through the function or model that calculates the desired 'output,' resulting in a range of possible values for the output that reflects the combined uncertainty of all inputs.
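
One way to sketch this idea is to resample every input, recompute the output each time, and read the uncertainty from the spread of outputs. The function name `propagate`, the speed = distance / time model, and all measurement values below are hypothetical and only illustrate the approach.

```python
import random
from statistics import mean, stdev

def propagate(xs, ys, func, n_resamples=2_000, seed=0):
    """Bootstrap both inputs and push each resampled pair of means through func."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_resamples):
        x_mean = mean(rng.choices(xs, k=len(xs)))  # resample with replacement
        y_mean = mean(rng.choices(ys, k=len(ys)))
        outputs.append(func(x_mean, y_mean))
    # Center and spread of the outputs reflect the combined input uncertainty.
    return mean(outputs), stdev(outputs)

# Hypothetical repeated measurements of a distance (m) and a time (s).
distances = [100.2, 99.8, 100.5, 99.9, 100.1, 100.4]
times = [9.9, 10.2, 10.0, 10.1, 9.8, 10.3]

speed, speed_unc = propagate(distances, times, lambda d, t: d / t)
print(f"speed = {speed:.3f} ± {speed_unc:.3f} m/s")
```

Swapping in a different `func` propagates the same input uncertainties through a different model.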

Answer 22

Matching (primary definition): Pairing subjects based on similar characteristics to control for differences. It can also act as a form of control. *DONE IN OBSERVATIONAL STUDIES*

Answer 23

Replication: Using a larger sample size to increase the accuracy of the results.

Answer 24

Blocking: Grouping subjects into blocks based on certain characteristics and then randomizing within each block. *DONE IN EXPERIMENTAL STUDIES*

Answer 25

False. We can also compare medians, because hypothesis testing is flexible. Explanation: Hypothesis testing is a versatile tool in statistics and is not limited to comparing means or proportions. It can also be used to: compare variances (e.g., F-test); test for independence in contingency tables (e.g., Chi-square test); assess goodness-of-fit (e.g., Chi-square goodness-of-fit test); evaluate correlation (e.g., testing if a correlation coefficient is significantly different from zero); test regression coefficients in linear models; and much more.

Answer 26

False (though for this class's examples it's true). Explanation: Hypothesis testing is not limited to comparing two groups. It can be used to compare: more than two groups (for example, ANOVA (Analysis of Variance) is used to compare means across three or more groups); single groups (for example, a one-sample t-test compares the mean of a single group to a known value); and relationships between variables (for example, regression analysis tests the relationship between a dependent variable and one or more independent variables).

Answer 27

False (though true for this class). Explanation: While hypothesis testing often results in a binary decision (reject or fail to reject the null hypothesis), it is not limited to binary outcomes. Hypothesis testing also provides: p-values (a measure of the strength of evidence against the null hypothesis); confidence intervals (a range of plausible values for the parameter being tested); and effect sizes (a measure of the magnitude of the observed effect). These additional outputs provide more nuanced insights beyond a simple "yes or no" decision. For example, a p-value of 0.06 might not lead to rejecting the null hypothesis at the 0.05 significance level, but it still suggests some evidence against the null hypothesis.

Answer 28

The significance level (alpha)

Answer 29

lower alpha or increase the sample size

Answer 30

increase the sample size and increase alpha

Answer 31

the curve becomes more "pointy" as it becomes more Gaussian

Answer 32

The samples must be drawn from a population whose values are distributed along a Gaussian curve. If that condition is not satisfied, then the sample must be sufficiently large for the sampling distribution to fit the Gaussian profile.

Answer 33

(Number of trials in bin/Number of total trials in sample)/bin width
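
The formula can be applied bin by bin as in the sketch below. The function name `density_histogram`, the half-open bin convention [left, right), and the example values are assumptions for illustration.

```python
def density_histogram(values, bin_edges):
    """Density per bin: (count in bin / total count) / bin width."""
    total = len(values)
    densities = []
    for left, right in zip(bin_edges, bin_edges[1:]):
        # Each bin includes its left edge and excludes its right edge.
        count = sum(1 for v in values if left <= v < right)
        densities.append(count / total / (right - left))
    return densities

# Hypothetical simulated test statistics and bin edges.
values = [0.1, 0.4, 0.5, 1.2, 1.7, 1.9, 2.5, 2.6]
edges = [0, 1, 2, 3]

densities = density_histogram(values, edges)
widths = [right - left for left, right in zip(edges, edges[1:])]
area = sum(d * w for d, w in zip(densities, widths))
print(densities, area)
```

Dividing by the bin width is what makes the total area of the bars equal 1 when every value falls inside some bin, so histograms with different bin widths stay comparable.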

Answer 34

Simulation methods can use Monte Carlo simulations, which are insensitive to the shape of the population distribution (it doesn't need to fit a Gaussian) and are more flexible: they can perform hypothesis testing on other kinds of numerical summaries of sample data, such as the median.

Answer 35

No, it does not; replacement is for bootstrapping only.

Answer 36

The predictor variable

Answer 37

when participants have poor ability to recall past events accurately as needed for a study

Answer 38

participants selected by ease of access instead of random sampling

Answer 39

Rare event

Answer 40

95% CI > 90% CI > SE


(65 cards)