intro to stats Flashcards
categorical variables
Values differ by type (category) rather than by amount
- Levels are usually string-based (character-based)
- Can be numerical if the numbers are used as names (no numerical value associated with the number)
integer variables
- Numerical variable consisting of whole numbers
- Numbers have real numerical meaning
continuous variables
- Numerical variables which can theoretically have infinite decimal places
Dichotomous variables
- only 2 levels
- 0/1, Ctrl/Treatment, TRUE/FALSE
- Can be categorical or integer
Variables defined by data type
nominal variables
categorical variables
ordinal variables
ranked data
ex. 1st, 2nd, 3rd
interval and ratio scale variables
can be integers or decimals
ratio: has a true zero, so ratios can be meaningfully calculated (ex. 0 K is the absence of heat)
interval: does not have a true zero (ex. 0 °C is not the absence of heat)
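A minimal sketch of why ratios only make sense on a ratio scale. The temperatures (10 and 20 degrees Celsius) and the helper name `celsius_to_kelvin` are illustrative assumptions, not part of the cards.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius (interval scale) to Kelvin (ratio scale)."""
    return celsius + 273.15

# Hypothetical readings chosen only for illustration.
c_low, c_high = 10.0, 20.0
k_low, k_high = celsius_to_kelvin(c_low), celsius_to_kelvin(c_high)

# On the interval scale the ratio is an artifact of where zero was placed.
celsius_ratio = c_high / c_low

# On the ratio scale the ratio is meaningful: only slightly more than 1.
kelvin_ratio = k_high / k_low

print(f"Celsius ratio: {celsius_ratio:.3f}")
print(f"Kelvin ratio:  {kelvin_ratio:.3f}")
```

20 °C is not "twice as hot" as 10 °C; the Kelvin ratio shows the actual proportional difference.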
variables defined by causal relationship
In an experimental setting, we manipulate the independent variable, and measure scores for the dependent variable
nuisance variables
- confounding variables can potentially change the value of the outcome variable, and vary systematically with the predictor variable
- obscuring variables can also potentially change the value of the outcome variable, but do not vary systematically with the predictor variable
experimental designs 1
Different individuals are in different experimental conditions
* Between-subjects designs
* Independent groups designs
experimental designs 2
The same individuals are in different experimental conditions
* Within-subjects designs
* Repeated-measures designs
mixed designs
some predictor variables are between-subjects and some are within-subjects
inference
based on various methods such as hypothesis testing, confidence interval estimation, and parameter estimation
inferential statistics
uses sample statistics to estimate the value of a population parameter
parameter
a constant numerical characteristic of a population
- can include shapes (normal distribution), as shapes can be defined numerically
statistic
corresponding value calculated for a sample
population parameter and sample statistics symbols
standard deviation
- sigma: population parameter
- s: sample statistic
mean
- mu: population parameter
- M: sample statistic (or x bar)
i
index or individual
- refers to each score
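A short sketch of the sample statistics M and s, computed by looping over the index i. The scores are made-up values, and dividing by n - 1 for s is an assumption (the common convention for a sample standard deviation); the cards do not state the formula.

```python
import math
import statistics

# Hypothetical sample of scores; i indexes each individual score.
scores = [4, 8, 6, 5, 7]
n = len(scores)

# M: sample mean, the sum of every score divided by n.
M = sum(scores[i] for i in range(n)) / n

# s: sample standard deviation, assuming the n - 1 denominator.
s = math.sqrt(sum((scores[i] - M) ** 2 for i in range(n)) / (n - 1))

print(f"M = {M}, s = {s:.3f}")

# Cross-check against the standard library, which also uses n - 1.
print(f"statistics.stdev = {statistics.stdev(scores):.3f}")
```

M and s are statistics: they describe this sample and serve as estimates of the population parameters mu and sigma.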
statistics are invented tools
- statistical tools are invented to estimate probabilities that guesses are correct
characteristics of popular statistical methods
- common sense
- ease of use
- inertia (being good enough)
χ^2 (chi-square)
a test statistic is a single number that represents how well the observed data fit the null hypothesis
- needs to produce a single number that incorporates two properties of the data (number and proportion)
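A sketch of one common way to get that single number, assuming the statistic meant here is Pearson's chi-square: the sum over categories of (observed - expected)^2 / expected. The function name, the counts, and the equal-proportions null hypothesis are all illustrative assumptions.

```python
def chi_square(observed, expected_proportions):
    """Pearson chi-square: sum of (observed - expected)^2 / expected."""
    total = sum(observed)  # "number": the total count
    expected = [total * p for p in expected_proportions]  # "proportion" under H0
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts in 3 categories; H0 assumes equal proportions.
observed = [30, 50, 20]
h0_proportions = [1 / 3, 1 / 3, 1 / 3]

stat = chi_square(observed, h0_proportions)
print(f"chi-square = {stat:.2f}")
```

A value of 0 means the counts match H0 exactly; larger values mean a worse fit.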
probability distribution
divide counts (y) by the total number of simulations
- used to obtain probabilities associated with specific outcomes
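A sketch of building a probability distribution by simulation: count how often each outcome occurs, then divide by the total number of simulations. The scenario (number of heads in 10 fair coin flips), the function name, and the seed are assumptions for illustration.

```python
import random
from collections import Counter

def simulate_distribution(n_simulations=10_000, n_flips=10, seed=0):
    """Estimate P(number of heads) by repeated simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter()
    for _ in range(n_simulations):
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        counts[heads] += 1
    # Divide each count by the total number of simulations.
    return {outcome: count / n_simulations
            for outcome, count in sorted(counts.items())}

distribution = simulate_distribution()
for outcome, probability in distribution.items():
    print(f"{outcome:2d} heads: {probability:.3f}")
```

The probabilities sum to 1, and each one estimates how likely that specific outcome is.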
p value
probability of obtaining results at least as extreme as our observed results if H0 were true
- p value low = low probability of obtaining results like ours, if H0 is true
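A sketch of estimating a p value by simulation, counting simulated results at least as far from the H0 expectation as the observed one (the usual two-sided convention). The scenario (60 heads in 100 flips, H0: fair coin), the function name, and the seed are assumptions for illustration.

```python
import random

def simulated_p_value(observed_heads=60, n_flips=100,
                      n_simulations=10_000, seed=0):
    """Share of simulations under H0 at least as extreme as the observed result."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    expected = n_flips / 2  # expected heads if H0 (fair coin) is true
    observed_gap = abs(observed_heads - expected)
    extreme = 0
    for _ in range(n_simulations):
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        if abs(heads - expected) >= observed_gap:
            extreme += 1
    return extreme / n_simulations

p_value = simulated_p_value()
print(f"simulated p value = {p_value:.3f}")
```

A low value means results like the observed one would rarely occur if H0 were true.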
classical statistics
calculates the theoretical probability distribution that would be obtained if the null hypothesis is correct
ex. df = k - 1