2 Inferential statistics Flashcards
(26 cards)
What is the main goal of Inferential Statistics?
To use sample data to make generalizations, estimates, predictions, or decisions about a larger population. Key methods include confidence intervals and hypothesis testing.
What is a Sampling Distribution of a Statistic?
The probability distribution of a statistic (like the sample mean or sample proportion) across all possible samples of a given size drawn from a specific population. It’s crucial for understanding how much sample statistics vary from sample to sample.
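A small simulation can make this concrete. The sketch below assumes a Uniform(0, 1) population and illustrative values for the sample size and number of repetitions (none of these come from the cards); it approximates the sampling distribution of the sample mean and compares its spread with the theoretical standard error σ/√n.

```python
import math
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 30                    # size of each sample (illustrative choice)
num_samples = 5000        # number of repeated samples (illustrative choice)

# Assumed population: Uniform(0, 1), mean 0.5, standard deviation sqrt(1/12).
sample_means = [
    statistics.fmean(rng.random() for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
observed_se = statistics.stdev(sample_means)          # spread of the sample means
theoretical_se = math.sqrt(1 / 12) / math.sqrt(n)     # sigma / sqrt(n)

print(f"mean of sample means:   {mean_of_means:.4f}")
print(f"spread of sample means: {observed_se:.4f} (theory: {theoretical_se:.4f})")
```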
What is a Confidence Interval?
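A range of values, calculated from sample data, that is constructed to contain the true value of an unknown population parameter (e.g., the population mean) with a stated confidence level (e.g., 95%). The confidence level describes how often the procedure captures the parameter across repeated samples, not the probability that one particular computed interval contains it.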
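As a rough illustration, the interval for a mean can be sketched as below. This assumes the population standard deviation is known (so a z-interval applies); the function name, the data, and sigma = 2.0 are all hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def z_confidence_interval(data, sigma, confidence=0.95):
    """Confidence interval for the mean, assuming sigma is known."""
    n = len(data)
    mean = statistics.fmean(data)
    # Critical value: about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return mean - margin, mean + margin

# Hypothetical measurements; sigma = 2.0 is an assumed known value.
data = [9.8, 10.4, 11.1, 9.5, 10.9, 10.2, 9.9, 10.6, 10.0]
low, high = z_confidence_interval(data, sigma=2.0)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

When sigma is unknown, the same structure applies with the sample standard deviation and a t critical value in place of z.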
What is Statistical Hypothesis Testing?
A formal procedure for comparing observed data with a claim (hypothesis) whose truth we want to assess. It involves deciding between two competing hypotheses based on sample evidence.
Define Null Hypothesis (H₀) and Alternative Hypothesis (H₁ or Hₐ).
Null Hypothesis (H₀): A statement of ‘no effect’ or ‘no difference,’ often representing the status quo or a baseline assumption. We test against this hypothesis.
Alternative Hypothesis (H₁ or Hₐ): A statement that contradicts the null hypothesis, representing what we might believe to be true or hope to find evidence for (e.g., there is an effect or a difference).
What are the two types of errors in Hypothesis Testing?
Type I Error: Rejecting the null hypothesis (H₀) when it is actually true (False Positive). The probability is denoted by α (alpha), the significance level.
Type II Error: Failing to reject the null hypothesis (H₀) when it is actually false (False Negative). The probability is denoted by β (beta).
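One way to see what α means is to simulate many tests in a world where H₀ really is true and count how often it gets rejected. The sketch below assumes a two-sided one-sample z-test with known σ; all numeric settings are illustrative.

```python
import math
import random
import statistics
from statistics import NormalDist

rng = random.Random(0)
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
mu0, sigma, n = 0.0, 1.0, 25    # H0 is true: data really come from N(mu0, sigma)
trials = 10000

rejections = 0
for _ in range(trials):
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    z = (statistics.fmean(sample) - mu0) / (sigma / math.sqrt(n))
    if abs(z) > z_crit:
        rejections += 1   # every rejection here is a Type I error

type_i_rate = rejections / trials
print(f"estimated Type I error rate: {type_i_rate:.3f} (alpha = {alpha})")
```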
What is the Power of a Test?
The probability of correctly rejecting a false null hypothesis (H₀). It is equal to 1 - β (1 minus the probability of a Type II error). Higher power is desirable.
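For a simple case, power can be computed directly. The sketch below assumes an upper-tailed one-sample z-test with known σ; `z_test_power` and its parameters are hypothetical names chosen for illustration. It shows power rising as the sample size grows.

```python
import math
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of an upper-tailed one-sample z-test when the true mean
    exceeds the null value by `effect` (sigma assumed known)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    shift = effect * math.sqrt(n) / sigma      # how far the true mean sits from H0, in standard errors
    beta = std_normal.cdf(z_crit - shift)      # probability of a Type II error
    return 1 - beta

for n in (10, 25, 50, 100):
    print(n, round(z_test_power(effect=0.5, sigma=1.0, n=n), 3))
```

With `effect=0` the function returns α, since rejecting a true H₀ is exactly a Type I error.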
When is the Student’s t-distribution typically used?
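It’s used in hypothesis testing and constructing confidence intervals for the population mean when the population standard deviation (σ) is unknown and must be estimated from the sample, which matters most when the sample size is small. It resembles the normal distribution but has heavier tails, and it approaches the normal distribution as the degrees of freedom increase.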
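The "heavier tails" claim can be checked numerically. The sketch below implements the t density from its standard formula (the helper name `t_pdf` is hypothetical) and compares it with the standard normal density far from the center.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    # Work on the log scale so large df values do not overflow the gamma function.
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

normal = NormalDist()
for df in (2, 5, 30, 1000):
    print(f"df={df:>4}: t density at 3 = {t_pdf(3.0, df):.5f}")
print(f"normal density at 3    = {normal.pdf(3.0):.5f}")
```

For small degrees of freedom the t density at 3 is several times the normal density, and the gap shrinks as the degrees of freedom grow.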
What is the Chi-square (χ²) distribution often used for?
Used in hypothesis tests concerning population variance, goodness-of-fit tests (checking if sample data fits a specific distribution), and tests of independence (checking if two categorical variables are related).
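A goodness-of-fit test might be sketched as follows. The counts are hypothetical, and the example deliberately uses three categories so that the degrees of freedom equal 2, the one case where the χ² upper-tail probability has the simple closed form exp(-x/2). Other degrees of freedom need a statistics library or a table.

```python
import math

def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for three categories assumed equally likely under H0.
observed = [30, 50, 40]
expected = [sum(observed) / len(observed)] * len(observed)

stat = chi_square_statistic(observed, expected)
df = len(observed) - 1
# Valid for df = 2 only: the chi-square upper tail equals exp(-x / 2).
p_value = math.exp(-stat / 2)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

Here the statistic is 5.0 and the p-value is about 0.082, so H₀ would not be rejected at α = 0.05.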
What statistical tests are mentioned for comparing means between populations/groups?
t-test: Used for comparing the means of two groups.
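ANOVA (Analysis of Variance): Used for comparing the means of two or more groups (most often three or more; with exactly two groups it gives the same conclusion as a pooled-variance two-sample t-test).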
What is ANOVA (Analysis of Variance)?
A statistical method used to test for significant differences between the means of two or more groups by comparing the variance between the groups to the variance within the groups.
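The between-versus-within comparison can be sketched as an F statistic computed by hand. The function name and the data are hypothetical; turning F into a p-value requires the F distribution, which is not included here.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]   # hypothetical data, three groups
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the within-group scatter would suggest.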
What is Tukey’s Multiple Comparison Test used for?
A post-hoc test performed after an ANOVA has found a significant difference among group means. It identifies which specific pairs of group means are significantly different from each other while controlling the overall error rate.
Confidence interval
A computed interval [lower bound, upper bound] based on sample data. Its purpose is to estimate where an unknown population parameter (e.g., the mean μ) plausibly lies. The confidence level (e.g., 95%) is the probability, before the data are collected, that the method used will “capture” the true parameter in the interval.
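The "method captures the parameter" reading of the confidence level can be checked by simulation. The sketch below assumes a normal population with known σ and illustrative parameter values; it builds a 95% interval from each of many samples and counts how often the true mean falls inside.

```python
import math
import random
import statistics
from statistics import NormalDist

rng = random.Random(1)
true_mu, sigma, n = 50.0, 10.0, 40     # assumed population and sample size
z = NormalDist().inv_cdf(0.975)        # critical value for a 95% interval
margin = z * sigma / math.sqrt(n)
trials = 5000

covered = 0
for _ in range(trials):
    sample_mean = statistics.fmean(rng.gauss(true_mu, sigma) for _ in range(n))
    if sample_mean - margin <= true_mu <= sample_mean + margin:
        covered += 1   # this interval captured the true mean

coverage = covered / trials
print(f"share of intervals containing the true mean: {coverage:.3f}")
```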
Hypothesis test
A formal procedure for assessing whether sample data provide sufficient evidence to reject a specific claim (the null hypothesis) about a population. The data are compared with what would be expected if the null hypothesis were true.
Null hypothesis (H₀)
The starting point or “default assumption” in a hypothesis test. It is often a claim of “no effect,” “no difference,” or a specific value for a population parameter (e.g., μ = 10). It is the hypothesis one tries to find evidence against.
Standard score / Test statistic (e.g., z, t, χ², F)
A standardized measure of how “extreme” or unusual a sample statistic is, assuming the null hypothesis (H₀) is true. For z and t, it measures the distance between the observed statistic and its expected value under H₀ in standardized units (e.g., standard deviations or standard errors).
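As an example of a test statistic, the one-sample t statistic divides the gap between the sample mean and the hypothesized mean by the estimated standard error. The function name, the data, and the null value 10.0 below are hypothetical.

```python
import math
import statistics

def one_sample_t_statistic(data, mu0):
    """t = (sample mean - mu0) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(data)
    standard_error = statistics.stdev(data) / math.sqrt(n)
    return (statistics.fmean(data) - mu0) / standard_error

# Hypothetical sample; H0 states that the population mean is 10.0.
data = [12.1, 9.8, 11.4, 10.9, 12.6, 10.2, 11.7, 11.3]
t = one_sample_t_statistic(data, mu0=10.0)
print(f"t = {t:.3f} with {len(data) - 1} degrees of freedom")
```

The sample mean here sits roughly 3.8 standard errors above the hypothesized value.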
p-value
The probability of observing a test statistic (standard score) at least as extreme as the one computed from the sample, if the null hypothesis (H₀) is true. A small p-value (p ≤ α) indicates that the observed data are unlikely under H₀, which favors rejecting H₀.
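What counts as "at least as extreme" depends on the alternative hypothesis. The sketch below assumes a z statistic that is standard normal under H₀ and returns the upper-tailed, lower-tailed, and two-sided p-values; the function name is hypothetical.

```python
from statistics import NormalDist

def z_p_values(z):
    """p-values for a z statistic under the three common alternatives."""
    std_normal = NormalDist()
    upper = 1 - std_normal.cdf(z)        # H1: parameter is larger than the null value
    lower = std_normal.cdf(z)            # H1: parameter is smaller than the null value
    two_sided = 2 * min(upper, lower)    # H1: parameter differs from the null value
    return {"upper": upper, "lower": lower, "two_sided": two_sided}

p = z_p_values(1.96)
print({name: round(value, 4) for name, value in p.items()})
```

For z = 1.96 the two-sided p-value is very close to 0.05, which is why 1.96 is the familiar cutoff at the 5% level.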
Significance level (α)
A predetermined threshold, typically 0.05 (5%). If the p-value is less than or equal to α (p ≤ α), the null hypothesis is rejected. α represents the acceptable risk of committing a Type I error.
Type I Error
The error of rejecting the null hypothesis (H₀) when it is actually true. One wrongly concludes that there is an effect or difference. The probability of committing this error is equal to the significance level (α).
Also called a “false positive.”
Type II Error (β)
The error of failing to reject the null hypothesis (H₀) when it is actually false. One wrongly overlooks a real effect or difference. The probability of this error is denoted β.
Also called a “false negative.”
Power of a Test
The probability that a test correctly rejects the null hypothesis (H₀) when it is false (i.e., when the alternative hypothesis is true). Power is equal to 1 - β. A test with high power is good at detecting real effects or differences.
The hypothesis testing process
The overall idea: state a null hypothesis (H₀), collect data, compute a test statistic (standard score) to measure “extremeness,” find the p-value (the probability of data at least as extreme as those observed, given H₀), and compare the p-value with a significance level (α) to reach a decision (reject / do not reject H₀).
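The whole sequence can be sketched end to end. The example below assumes a two-sided one-sample z-test with known σ; the function name, the data, the null value μ = 10, and σ = 1.0 are all hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def one_sample_z_test(data, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test, assuming sigma is known."""
    n = len(data)
    # Step 1: test statistic, the distance from mu0 in standard errors.
    z = (statistics.fmean(data) - mu0) / (sigma / math.sqrt(n))
    # Step 2: p-value, the probability of a |z| at least this large under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 3: decision, comparing the p-value with alpha.
    decision = "reject H0" if p_value <= alpha else "fail to reject H0"
    return z, p_value, decision

# Hypothetical data; H0: mu = 10, with sigma = 1.0 assumed known.
data = [10.8, 11.5, 9.9, 11.2, 10.7, 11.9, 10.4, 11.1, 10.9]
z, p_value, decision = one_sample_z_test(data, mu0=10.0, sigma=1.0)
print(f"z = {z:.2f}, p = {p_value:.4f}, decision: {decision}")
```

Here z is 2.8 and the p-value is well below 0.05, so H₀ is rejected at the 5% level.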
The meaning of the p-value
Understanding that the p-value is not the probability that H₀ is true, but rather the probability of the data (or more extreme data) if H₀ is true. It is a measure of how well the data agree with H₀.
Significance level (α) and Type I Error
Understanding that α is a predetermined tolerance for the risk of committing a Type I error (rejecting a true H₀). It is the threshold used to make the decision.