exam 2 Flashcards
(44 cards)
descriptive, differences, correlation/regression, and association projects (differences between the project types)
descriptive - no hypothesis test is carried out; the goal is to describe the situation (various statistical measures may be important); histograms, density plots, and boxplots
differences - compares two or more sets of data (the hypothesis relates to differences you believe may exist); bar charts, side-by-side boxplots
correlation/regression - attempts to link variables (looking for the strength and direction of links between variables); scatterplots, line plots
association - emphasis on links between categorical variables; bar charts, pie charts
null and alternative hypothesis
we test the null hypothesis, data is gathered to test null
we do not prove the alternative hypothesis, the most we can do is find support for it
possible outcomes from hypothesis testing
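reject and fail to reject null - reject = the data are inconsistent with the null; fail to reject = there is not enough evidence against the null (this does not show that the null is true)
p-value - the probability of getting data at least as extreme as the data gathered, assuming the null hypothesis is true (it is not the probability that the null is correct)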
histogram
descriptive
boxplots
descriptive, difference (side by side)
bar charts
differences, association
scatterplots
correlation and regression
line plots
correlation and regression
pie charts
association
histogram in r
hist(object)
boxplot from object in r
boxplot(object)
set scale of axis in r
ylim=c(min, max), xlim=c(min, max)
add axis label and graph title in r
xlab="Title"
ylab="Title"
main="Title"
change colors of bars or boxes in r
col="color" (for example col="blue")
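The plotting cards above can be combined into one call. This is a minimal sketch with made-up data; the variable names (heights, group_a, group_b), labels, and colors are assumptions for illustration only.

```r
# Made-up example data (not from the course)
heights <- c(12, 15, 14, 18, 21, 17, 16, 19, 22, 13)
group_a <- c(4.1, 5.0, 4.8, 5.3, 4.6, 5.1)
group_b <- c(5.9, 6.4, 5.7, 6.8, 6.1, 6.3)

# Histogram with axis limits, axis labels, a title, and a bar color
h <- hist(heights,
          xlim = c(10, 25), ylim = c(0, 5),
          xlab = "Height (cm)", ylab = "Count",
          main = "Histogram of heights",
          col = "lightblue")

# Side-by-side boxplots, one box per group, each with its own color
b <- boxplot(group_a, group_b,
             names = c("A", "B"),
             ylim = c(3, 8),
             xlab = "Group", ylab = "Value",
             main = "Group A vs group B",
             col = c("orange", "lightgreen"))
```

Both hist() and boxplot() also return the numbers behind the plot (bin counts, box statistics), which is why the results are stored in h and b.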
popperian philosophy
we learn by being wrong, no amount of evidence can prove something is true (empirical falsification)
testing a null/ reshuffling
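to determine what "no change" would look like, generate data that would be reasonable for the system if the null were true (after plenty of research about what is realistic), for example by reshuffling the observed values between groups many times, then compare the real result to those reshuffled results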
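One common way to do the reshuffling is a permutation test. The sketch below is an illustration under assumptions, not necessarily the exact procedure used in the course: the data are made up, and the number of reshuffles (5000) is an arbitrary choice.

```r
set.seed(1)  # makes the reshuffling reproducible

# Made-up example data
control <- c(4.1, 5.0, 4.8, 5.3, 4.6, 5.1)
treated <- c(5.9, 6.4, 5.7, 6.8, 6.1, 6.3)

# The difference in means that was actually observed
observed <- mean(treated) - mean(control)

# Under the null, group labels do not matter, so pool the values
pooled <- c(control, treated)
n_treated <- length(treated)

# Reshuffle the pooled values many times and recompute the difference
null_diffs <- replicate(5000, {
  shuffled <- sample(pooled)
  mean(shuffled[1:n_treated]) - mean(shuffled[-(1:n_treated)])
})

# Share of reshuffles at least as extreme as the observed difference
p_value <- mean(abs(null_diffs) >= abs(observed))
p_value
```

The reshuffled differences show what "no change" looks like; a small p_value means the observed difference is rare under that picture.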
level of probability that scientists use as a threshold for deciding how to interpret a hypothesis test
p-value of 0.05
basic study set up for a t-test
create hypotheses, collect data; the data must be normally distributed and each data point must be independent
what happens to t when variables are changed
t increases when the mean difference increases, t decreases when the standard deviation increases, t increases when n increases
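These three relationships follow from the one-sample t formula, t = (mean - mu0) / (sd / sqrt(n)). A minimal sketch with made-up data; the names x, mu0, t_manual, and t_builtin are assumptions for illustration.

```r
# Made-up sample and a hypothesized mean to test against
x <- c(5.2, 4.8, 6.1, 5.5, 5.9, 6.3, 5.0, 5.7)
mu0 <- 5

# t by hand: mean difference divided by the standard error
# larger mean difference -> larger t
# larger standard deviation -> smaller t
# larger n -> smaller standard error -> larger t
t_manual <- (mean(x) - mu0) / (sd(x) / sqrt(length(x)))

# Same statistic from R's built-in test
t_builtin <- unname(t.test(x, mu = mu0)$statistic)

c(manual = t_manual, builtin = t_builtin)
```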
what test to do to determine if data are appropriate for t-test, how to interpret
check whether the data are normal (boxplot or Shapiro-Wilk test)
Shapiro-Wilk p-value greater than 0.05 = no evidence that the data depart from normality, so a t-test can be done
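In R the Shapiro-Wilk test is shapiro.test(). A minimal sketch with made-up data; the 0.05 cutoff is the threshold from the card above.

```r
# Made-up sample
x <- c(5.2, 4.8, 6.1, 5.5, 5.9, 6.3, 5.0, 5.7, 5.4, 6.0)

# Null hypothesis of this test: the data come from a normal distribution
sw <- shapiro.test(x)
sw$p.value

if (sw$p.value > 0.05) {
  message("No evidence against normality: a t-test is reasonable")
} else {
  message("Data look non-normal: consider a non-parametric test")
}
```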
t-test in r
t.test(object)
one tailed vs two tailed t-test
one tailed - tests for an effect in one specified direction (greater than or less than); more power to detect an effect in that direction
two tailed - tests whether the means differ in either direction (greater than or less than)
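In t.test() the number of tails is set with the alternative argument ("two.sided" is the default; "greater" and "less" are one tailed). A minimal sketch with made-up data:

```r
# Made-up example data
control <- c(4.1, 5.0, 4.8, 5.3, 4.6, 5.1)
treated <- c(5.9, 6.4, 5.7, 6.8, 6.1, 6.3)

# Two tailed: is the treated mean different from the control mean?
two <- t.test(treated, control)

# One tailed: is the treated mean greater than the control mean?
one <- t.test(treated, control, alternative = "greater")

# When the effect is in the tested direction, the one-tailed
# p-value is half the two-tailed p-value (hence the extra power)
c(two_tailed = two$p.value, one_tailed = one$p.value)
```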
paired t-test
repeated observations collected for a single variable with 2 levels (differences between sample point 1 and sample point 2 are compared for the same sample unit)
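In R a paired t-test is t.test() with paired = TRUE. The sketch below uses made-up before/after measurements on the same eight units and shows that it matches a one-sample t-test on the differences.

```r
# Made-up measurements on the same 8 sample units at two time points
before <- c(72, 80, 65, 90, 77, 84, 69, 75)
after  <- c(70, 76, 66, 85, 74, 80, 68, 71)

# Paired t-test: compares the two time points unit by unit
paired <- t.test(before, after, paired = TRUE)

# Equivalent: one-sample t-test on the per-unit differences
by_hand <- t.test(before - after)

c(paired = unname(paired$statistic), by_hand = unname(by_hand$statistic))
```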
non-parametric test
replace the data with their ranks (ranked from smallest to largest) and compare the ranks instead of the raw values
Mann-Whitney (two sample) and Wilcoxon signed-rank (paired) tests
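In R both tests are run with wilcox.test(); paired = TRUE gives the paired version. A minimal sketch with made-up data, chosen without ties so that exact p-values can be computed:

```r
# Made-up two-sample data
control <- c(4.1, 5.0, 4.8, 5.3, 4.6, 5.1)
treated <- c(5.9, 6.4, 5.7, 6.8, 6.1, 6.3)

# Ranks of the pooled data, smallest = 1
ranks <- rank(c(control, treated))
ranks

# Mann-Whitney test (two independent samples)
mw <- wilcox.test(treated, control)

# Made-up paired data: same 8 units measured twice
before <- c(72.1, 80.3, 65.0, 90.4, 77.5, 84.8, 69.9, 75.9)
after  <- c(70.0, 76.0, 66.2, 85.0, 74.0, 80.0, 69.0, 72.0)

# Wilcoxon signed-rank test (paired samples)
sr <- wilcox.test(before, after, paired = TRUE)

c(mann_whitney = mw$p.value, wilcoxon_paired = sr$p.value)
```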