Statistics Flashcards
The Scientific Method
A logical, systematic approach to the solution of a scientific problem
- Develop theory; observations, literature review, prior research
- Construct a hypothesis
- Design a study
- Analyse data
- Draw conclusions
What is a parameter?
A numerical summary of a population, such as the population mean, median, range…
What are the types of data?
- Quantitative; numeric, e.g. age, height, weight
- Qualitative; descriptive, e.g. favourite colour, suburb, type of car
What does discrete data include?
Only a limited set of values
- Nominal: unordered categories where any order is arbitrary (e.g. gender, ethnicity, etc); a nominal variable with only two categories is also known as a binary, dichotomous or indicator variable (qualitative)
- Ordinal: ordered categories where ranking matters but the gaps between ranks are not consistent (e.g. NYHA class, or level of education: high school, undergraduate degree, postgraduate degree) (qualitative OR quantitative)
What does continuous data include?
Unlimited values
- Interval: numeric scale with consistent differences between points but no absolute zero (e.g. temperature in degrees Celsius, standardised IQ) (quantitative)
- Ratio: numeric scale with consistent differences between points and a meaningful, absolute zero (e.g. height, weight in kilos, time, length) (quantitative)
When does experimental manipulation occur?
- Between subjects: independent groups
- Within subjects/repeated measures: related groups
What is measurement error?
An error that occurs when there is a difference between the information desired by the researcher and the information provided by the measurement process
What are extraneous and confounding variables?
Extraneous: a variable other than the IV or DV
Confounding: an extraneous variable that can potentially explain the relationship between the IV and DV
- Example: age, reading ability and year of school in children
- IV: age, DV: reading ability, Confound: year of school
What is the measurement type of the variables?
- Categorical data: discrete categories or groups; summarise with frequency tables and a bar chart or pie chart
- Numeric data: a score on a scale; summarise with numeric summary statistics (mean/median/mode, standard deviation) and a histogram
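As a rough illustration of a frequency table for categorical data, here is a minimal Python sketch. The list `colours` is made-up data, and the text "bar chart" is only a stand-in for a real plot.

```python
from collections import Counter

# Made-up responses to "favourite colour" (a categorical variable)
colours = ["blue", "red", "blue", "green", "red", "blue"]

# Frequency table: category -> count
freq = Counter(colours)

# Print the counts with a crude text bar chart, most common category first
for category, count in freq.most_common():
    print(f"{category:<6} {count}  {'#' * count}")
```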
Consider the following aspects when summarising data:
- Typicality (mean, median and mode)
- Variability (range, IQR, std dev, variance)
- Shape (skew, kurtosis)
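The typicality and variability summaries can be sketched with Python's standard `statistics` module. The list `scores` is made-up data, the IQR uses the module's default quartile method (other software may give slightly different quartiles), and shape measures are left out because the standard library has no skew or kurtosis function.

```python
import statistics

scores = [2, 4, 4, 5, 7, 9, 11]  # made-up illustrative data

# Typicality
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Variability
data_range = max(scores) - min(scores)
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartiles, default method
iqr = q3 - q1
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)      # square root of the sample variance

print(mean, median, mode, data_range, iqr, variance, std_dev)
```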
What features does a normal distribution have?
- Variability
- Unimodality
- Central tendency
- Symmetry
- Mesokurtic
What is a z-score?
Z-scores are standardised scores: the difference between a score and the mean, expressed in std dev units
z = (score - mean) / std dev
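A minimal sketch of the z-score calculation in Python. The function name `z_score` and the example numbers (a score of 130 on a scale with mean 100 and std dev 15) are assumptions for illustration only.

```python
def z_score(score, mean, std_dev):
    """Distance of a score from the mean, in standard deviation units."""
    return (score - mean) / std_dev

# A score of 130 lies two standard deviations above a mean of 100
print(z_score(130, 100, 15))  # 2.0
```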
What is the central limit theorem?
- Distribution of sample means will be approximately normal when the sample size is large enough, whatever the shape of the population distribution
- Mean of the sample means will be the same as the population mean
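The central limit theorem can be checked with a small simulation. This is only a sketch: the skewed population (an exponential distribution with mean 1 and std dev 1), the sample size of 30 and the 2000 repetitions are all arbitrary choices made for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 30    # N in each sample
NUM_SAMPLES = 2000  # how many samples to draw

def draw_sample_mean():
    """Draw one sample from a skewed population and return its mean."""
    sample = [random.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
    return statistics.mean(sample)

sample_means = [draw_sample_mean() for _ in range(NUM_SAMPLES)]

# The mean of the sample means should sit close to the population mean (1.0)
mean_of_means = statistics.mean(sample_means)
# Their std dev should sit close to 1 / sqrt(30), roughly 0.18
sd_of_means = statistics.stdev(sample_means)

print(round(mean_of_means, 3), round(sd_of_means, 3))
```

Plotting `sample_means` as a histogram should give a roughly bell-shaped curve, even though the population itself is strongly skewed.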
What is standard error?
- Standard deviation of sample means = Standard Error
- Standard Error = std dev / square root of N
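The standard error formula as a small Python sketch. The function name `standard_error` and the example values (std dev of 15, N of 25) are assumptions for illustration.

```python
import math

def standard_error(std_dev, n):
    """Standard deviation of the sample means for samples of size n."""
    return std_dev / math.sqrt(n)

# With a std dev of 15 and samples of 25 people, sample means vary with SE = 3
print(standard_error(15, 25))  # 3.0
```

Because N sits under a square root, quadrupling the sample size only halves the standard error.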
What is null hypothesis significance testing?
- Analysing data from a sample, to see whether it can make a contribution to a field of knowledge
- Conservative approach: begin by assuming the null hypothesis is true, then test whether we have evidence against that
- Summarise data and compute a test statistic
What is the hypothesis testing procedure?
- Decide on alpha
- Calculate the test statistic
- Compare the obtained statistic with the critical statistic:
obtained >= critical -> reject H0
obtained < critical -> don’t reject H0
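The decision rule can be sketched as a small Python function. Two assumptions are made here that the card does not state: the test is two-tailed, so the absolute value of the obtained statistic is compared, and the example critical value of 1.96 is the two-tailed cut-off for a z-statistic at alpha = .05. In practice the critical value is looked up for the chosen alpha and test.

```python
def decide(obtained, critical):
    """Compare an obtained test statistic with the critical statistic.

    Assumes a two-tailed test, so the sign of the obtained statistic
    is ignored.
    """
    if abs(obtained) >= critical:
        return "reject H0"
    return "don't reject H0"

CRITICAL_Z = 1.96  # two-tailed critical z at alpha = .05

print(decide(2.30, CRITICAL_Z))   # reject H0
print(decide(-1.20, CRITICAL_Z))  # don't reject H0
```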
What are T-tests?
Tests that inspect mean scores on a numeric variable
T-test = signal-to-noise ratio
What are one-sample t-tests?
Is the average score on a variable, in the population from which the sample is drawn, significantly different from a known number?
- Is the population’s mean score different from that of another population?
t = (sample mean - test value) / SE, where SE = std dev / square root of N
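A minimal sketch of the one-sample t-statistic in Python, dividing the difference between the sample mean and the test value by the standard error. The function name `one_sample_t` and the data are made up for illustration, and the result would still need to be compared with a critical t for N - 1 degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, test_value):
    """t = (sample mean - test value) / SE, with SE = std dev / sqrt(N)."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # sample std dev (n - 1)
    return (statistics.mean(sample) - test_value) / se

sample = [5, 7, 9, 11, 13]  # made-up scores, mean = 9
t = one_sample_t(sample, test_value=7)
print(round(t, 3))
```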
What are assumptions?
Conditions that need to be met for the test to be valid
What are assumptions of a one-sample t-test?
- Variable is on a numeric scale (interval or ratio)
- Variable is normally distributed in the population
- Observations are independent
What are t-statistics?
The ratio of signal (difference between means) to noise (variability around the mean)
- The bigger the t-statistic, the stronger the signal relative to the noise
- t = 1: signal is equivalent to noise
- Null hypothesis significance testing: how likely is it that we would obtain a t-statistic at least this large if the null hypothesis is true?
What are independent-samples t-tests?
- Comparing 2 group means
- Is there a difference between means of 2 (independent) groups?
t = (sample mean of group 1 - sample mean of group 2) / square root of ((std dev 1)^2 / n1 + (std dev 2)^2 / n2)
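A minimal sketch of the independent-samples t-statistic in Python. It takes the noise term to be the square root of each group's variance (std dev squared) divided by its group size, added together. The function name `independent_t` and both groups of scores are made up for illustration.

```python
import math
import statistics

def independent_t(group1, group2):
    """Difference between two group means divided by its standard error."""
    mean_diff = statistics.mean(group1) - statistics.mean(group2)
    # Variance (std dev squared) of each group divided by its size
    se = math.sqrt(
        statistics.variance(group1) / len(group1)
        + statistics.variance(group2) / len(group2)
    )
    return mean_diff / se

group1 = [5, 7, 9, 11, 13]  # made-up scores, mean = 9
group2 = [2, 4, 6, 8, 10]   # made-up scores, mean = 6
t = independent_t(group1, group2)
print(t)  # 1.5
```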
What are the assumptions of independent-samples t-test?
- DV is on a numeric scale
- DV is normally distributed in the population within groups
- Variance of DV is equal between groups
- Observations are independent (within and between groups)
What are paired t-tests?
Is there a difference in average scores between 2 related groups?
- Same person over 2 time points or 2 conditions
- Related people
- The 2 observations (scores) are non-independent (related)
- One numeric DV, one categorical IV
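One common way to compute a paired t-statistic is to take the difference between each pair of scores and then test whether the mean difference departs from zero. The sketch below follows that approach. The function name `paired_t` and the before/after scores are made up for illustration.

```python
import math
import statistics

def paired_t(first, second):
    """t-statistic for the mean of the pairwise differences (second - first)."""
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    se = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / se

# Made-up scores for the same five people at two time points
before = [10, 12, 9, 14, 11]
after = [12, 15, 10, 18, 13]
t = paired_t(before, after)
print(round(t, 2))
```

Working on the differences is what handles the non-independence: each person acts as their own comparison.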