research stats midterm Flashcards

1
Q

what is biostatistics?

A

the statistics of medicine, health sciences and public health

2
Q

define target population

A

larger population to which results will need to be generalized

3
Q

define accessible population

A

actual population of subjects available

4
Q

define sample

A

subgroup of accessible population which allows results to be generalized

5
Q

define parameter

A

statistical characteristic of population

6
Q

define statistic

A

statistical characteristic of sample

7
Q

define descriptive statistic

A

describes sample shape, central tendency, variability

8
Q

define inferential statistic

A

used to make inferences about a population

9
Q

define central tendency

A

the central value
best representative value of target population
single value

10
Q

define variability

A

spread of the data

11
Q

define frequency distribution

A

the pattern of frequencies of a variable

12
Q

3 measures of central tendency

A

mean - average
median - middle value; splits the data into two equal halves
mode - most frequent score
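
A minimal sketch of the three measures using Python's standard `statistics` module. The scores are made up for illustration; the single high score pulls the mean above the median, which is the pattern expected with a positive skew.

```python
import statistics

# Hypothetical exam scores (invented for illustration), with one high outlier.
scores = [70, 72, 75, 75, 78, 80, 98]

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value of the sorted scores
mode_score = statistics.mode(scores)      # most frequent score

print(mean_score, median_score, mode_score)
# The outlier (98) raises the mean but leaves the median and mode at 75.
```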

13
Q

describe skewed to the right

A

tail faces right
positive skew
mean > median/mode

14
Q

describe skewed to the left

A

tail faces left
negative skew
mean < median/mode

15
Q

when is mean best to use?

A

numeric, symmetric data

not good for skewed

16
Q

when is median best to use?

A

skewed data
not affected by extremes

17
Q

when is mode best to use?

A

nominal or ordinal
common in surveys

18
Q

advantages to mean

A

easy to calculate and interpret
don't need to arrange values
all values represented
all algebraic formulas possible

19
Q

disadvantages to mean

A

can't be used with categorical data
can't calculate if data missing
affected by extremes

20
Q

advantages to median

A

easy to calculate
not affected by extremes
can be used with ranked data

21
Q

disadvantages to median

A

tedious in large data set
problematic with even number of observations
doesn't account for all values

22
Q

advantages of mode

A

easy to understand and find
not affected by extremes
easy to ID in data set and in frequency distribution
mode is useful for categorical data

23
Q

disadvantages of mode

A

not defined if no repeats
not based on all values
unstable when data has small number of values
sometimes could have 2+ or no modes

24
Q

when would you choose median over mean?

A

distribution is skewed
researcher is using ordinal data

25
define range, percentiles, quartiles
range - max minus min
percentiles - divide the distribution into 100 parts
quartiles - divide the distribution into four parts
26
define interquartile range
difference between 25th and 75th percentile
used with median
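A short sketch of range, quartiles, and IQR on made-up data. Quartile conventions differ between textbooks and software; the `method="inclusive"` choice below is an assumption, not something the cards specify.

```python
import statistics

# Hypothetical, already-sorted data set (invented for illustration).
data = [2, 4, 4, 5, 7, 9, 10, 12, 15]

data_range = max(data) - min(data)  # range = max - min

# Cut points that split the data into four parts (Q1, median, Q3).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1  # interquartile range: 75th percentile minus 25th percentile

print(data_range, q1, q2, q3, iqr)
```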
27
describe box plot
shows five values: min, 1st quartile, median, 3rd quartile, max
28
define standard deviation
typical distance of scores from the mean (square root of variance)
reported in same units as raw scores
usually reported as mean +/- SD
29
define variance
square of SD
30
coefficient of variation
used for interval and ratio data only
expressed as percentage
unitless, so good for comparing scales
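A minimal sketch tying SD, variance, and coefficient of variation together. The data are invented, and the sample (n - 1) formulas are assumed, which is what `statistics.stdev` and `statistics.variance` compute.

```python
import statistics

# Hypothetical measurements (invented for illustration).
values = [4, 6, 8, 10, 12]

mean_value = statistics.mean(values)    # 8
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
sd = statistics.stdev(values)           # sample SD = square root of variance

# Coefficient of variation: SD as a percentage of the mean (unitless).
cv_percent = sd / mean_value * 100

print(mean_value, variance, sd, cv_percent)
```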
31
constant and predictable characteristics of the normal curve
about 68% of scores within +/- 1 SD
about 95% within +/- 2 SD
about 99.7% within +/- 3 SD
32
define a z-score
standardized score based on normal distribution
z is expressed in SD units
z = (score - mean) / SD
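A small sketch of the z-score formula, plus a check of the proportion of a normal distribution lying within 1 SD of the mean. The score, mean, and SD are invented; `NormalDist` comes from the standard library.

```python
from statistics import NormalDist

def z_score(score, mean, sd):
    """Return how many SD units a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd

# Hypothetical test: mean 70, SD 10, individual score 85.
z = z_score(85, 70, 10)  # 1.5 SD above the mean

# Area under the standard normal curve between z = -1 and z = +1.
standard_normal = NormalDist()
within_1sd = standard_normal.cdf(1) - standard_normal.cdf(-1)  # about 0.68

print(z, within_1sd)
```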
33
define sampling error
the sample mean will rarely equal the population mean exactly; the difference is called sampling error
asks: how well does the sample represent the population?
34
z scores for CI calculations
90% CI: z = 1.65
95% CI: z = 1.96
99% CI: z = 2.58
35
central limit theorem
as N increases, the sampling distribution of the mean approaches a normal distribution centered on the population mean
36
define point estimate
single value that is the best estimate of the population parameter
37
define confidence interval
range of values that we are confident contains the population parameter
38
how would you increase precision (narrow) in CI?
larger sample size
less variance (lower SD)
lower the selected level of confidence (e.g., to 90%)
39
CI equation
CI = mean +/- (z) SEM
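A minimal sketch of the CI equation, assuming SEM = SD / sqrt(n) and the z values listed on the earlier card. The summary numbers (mean 100, SD 15, n = 36) are invented for illustration.

```python
import math

def confidence_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) bounds of mean +/- z * SEM."""
    sem = sd / math.sqrt(n)  # standard error of the mean
    margin = z * sem
    return mean - margin, mean + margin

# Hypothetical sample summary: mean 100, SD 15, n = 36, so SEM = 2.5.
low95, high95 = confidence_interval(100, 15, 36)          # roughly 95.1 to 104.9
low90, high90 = confidence_interval(100, 15, 36, z=1.65)  # narrower interval

print(low95, high95)
print(low90, high90)
```

Dropping the confidence level from 95% to 90% narrows the interval, as do a larger n or a smaller SD.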
40
define null hypothesis
states there is no difference or relationship
we will either reject or fail to reject it
41
define alternative hypothesis
states there is a difference or relationship
42
error: liar or blind
type 1: liar, p value
type 2: blind
43
if p value is less than or equal to alpha,
reject the null
44
if p value is greater than alpha,
fail to reject the null
45
what happens if we fail to reject the null?
attribute any observed difference to sampling error only
46
what p value and CI are analogous to each other?
95% CI and p value of .05
47
significance of type 1 error
mistakenly finding a difference (rejecting a true null)
p value gives the probability of seeing a difference this large by chance if the null is true
48
significance of type 2 error
mistakenly finding no difference (failing to reject a false null)
statistical power = 1 - B
power is the probability of correctly rejecting a false null
49
critical values for two tailed test
critical region split: 2.5% in each tail, on either side of the noncritical region (at alpha = .05)
nondirectional hypothesis
50
critical values of one tailed test
all 5% of critical region in the one tail the hypothesis predicts (at alpha = .05)
directional hypothesis
51
which (one or two tailed) is more powerful
one tailed
52
define statistical power
probability of finding a statistically significant difference if such difference exists in the real world
53
what are the four powers of power?
alpha
effect size
variance
sample size
54
best way to increase power
increase sample size
55
determinants of statistical power
p = power
a = alpha level
n = sample size
e = effect size
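A rough sketch of how these determinants interact. It uses a normal approximation for a two-tailed comparison of two equal-sized independent groups, with effect size as Cohen's d. The approximation and the function name `approx_power` are assumptions for illustration, not a formula from the cards. An exact calculation would use the noncentral t distribution.

```python
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power for a two-tailed, two-group comparison (normal approximation)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)       # critical z for a two-tailed test
    expected_z = effect_size * (n_per_group / 2) ** 0.5  # expected test statistic
    return NormalDist().cdf(expected_z - z_crit)

small_n = approx_power(0.5, 20)                  # medium effect, 20 per group
large_n = approx_power(0.5, 64)                  # same effect, 64 per group (about 0.8)
strict_alpha = approx_power(0.5, 64, alpha=0.01)  # stricter alpha lowers power

print(small_n, large_n, strict_alpha)
```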
56
what is A priori
before data collection
57
what is Post hoc
after data collection
only an issue if you fail to reject the null
58
CI analysis
if upper boundary of CI excludes an important benefit of treatment, the trial is definitively negative
if CI includes an important benefit, the treatment might still be worthwhile
59
define parametric statistics
assumes that the sample data come from a population that follows a probability distribution defined by a fixed set of parameters
60
what are the 4 assumptions of parametric tests?
scale data - ratio or interval
random sampling
equal variance - roughly equivalent before starting
normality - normal distribution
61
what does a t-test do?
determines whether the difference between sample means represents a real difference in the population or is just sampling error
62
what are examples of two levels of one independent variable?
two different groups
one single group with two interventions
one single group with pre and posttest measurements
63
conceptual basis of comparing means
sample means will be different
variance comes from two sources: the IV and everything else
64
conceptual basis with independent groups
t= difference between means / variability within groups
65
conceptual basis with repeated measures
t = mean of differences between pairs / standard error of the difference scores
66
what if t > 1?
you have a greater difference between groups
67
what if t< 1?
you have more variability within groups
68
what is the most simple t test equation?
t = (treatment effect + error) / error
69
what are degrees of freedom?
the number of independent pieces of information that went into calculating the estimate
the number of values that are free to vary
70
independent (unpaired) t-test
numerator is difference between group means
denominator represents the variance within groups
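A minimal sketch of the unpaired t ratio on invented data. The denominator here uses the pooled-variance standard error, which assumes equal variances; the card only describes the ratio conceptually, so the exact formula is an assumption.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Return (t, df) for an equal-variance (pooled) independent-samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)  # numerator
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))  # within-group variability
    return mean_diff / std_error, df

# Hypothetical scores for two separate groups (invented for illustration).
treatment = [5, 7, 9, 11, 13]
control = [2, 4, 6, 8, 10]

t_stat, df = independent_t(treatment, control)
print(t_stat, df)  # t is about 1.5 with 8 degrees of freedom
```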
71
assumptions for unpaired t-tests
data from interval or ratio scale
samples are randomly drawn from populations
homogeneity of variance - equal variances
population is normally distributed
72
are unequal variances an issue?
not a major issue when sample sizes are equal
73
effect size for t-test
use Cohen's d
small d = 0.20
medium d = 0.50
large d = 0.80
extra large d = 1.0 or 1.1
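A short sketch of Cohen's d for two independent groups, taken here as the mean difference divided by the pooled SD. The pooled-SD version is one common convention and is assumed here; the data are invented.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Mean difference expressed in pooled-SD units."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)

# Hypothetical scores (invented for illustration).
treatment = [5, 7, 9, 11, 13]
control = [2, 4, 6, 8, 10]

d = cohens_d(treatment, control)
print(d)  # about 0.95, which falls in the "large" range above
```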
74
paired t-test
numerator is mean of paired difference scores
denominator is standard error of difference scores
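A minimal sketch of the paired t ratio using invented pre/post scores from the same subjects. The standard error is taken as the SD of the difference scores divided by sqrt(n).

```python
import math
import statistics

def paired_t(first, second):
    """Return (t, df) for a paired t-test on matched observations."""
    diffs = [b - a for a, b in zip(first, second)]  # one difference score per pair
    n = len(diffs)
    mean_diff = statistics.mean(diffs)                  # numerator
    std_error = statistics.stdev(diffs) / math.sqrt(n)  # denominator
    return mean_diff / std_error, n - 1

# Hypothetical pretest and posttest scores for five subjects.
pre = [10, 12, 9, 14, 11]
post = [12, 15, 10, 17, 13]

t_paired, df_paired = paired_t(pre, post)
print(t_paired, df_paired)  # t is about 5.88 with 4 degrees of freedom
```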
75
3 assumptions for paired t-test
data from ratio or interval scale
samples are randomly drawn from populations
population is normally distributed
76
what is an inappropriate use of multiple t-tests?
to compare more than 2 means within the same sample
"family wise error": increases the chance of type I error
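A small sketch of why repeated t-tests inflate type I error. It uses the standard formula 1 - (1 - alpha)^c, which assumes the c tests are independent; that assumption is only approximately true for pairwise comparisons on the same sample.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one type I error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests

# Three groups need three pairwise t-tests (A-B, A-C, B-C).
fwe_three_tests = familywise_error_rate(0.05, 3)

print(fwe_three_tests)  # about 0.14, well above the nominal 0.05
```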
77
which t test is used for independent groups with one IV
independent
78
which t test is used for repeated measures with one IV
paired
79
levene's test
tests for equal variances for independent groups
tests the null: no sig difference in variance between groups
80
what statistic does the ANOVA use?
the F statistic
81
(ANOVA) if variance between samples is small,
F will be small
82
(ANOVA) if variance within samples is small,
F will be large
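A minimal sketch of the F ratio for a one-way ANOVA with independent groups, computed as between-groups mean square over within-groups mean square. The three groups are invented for illustration.

```python
import statistics

def one_way_f(groups):
    """Return F = MS between / MS within for a list of independent groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n_total = len(groups), len(all_values)

    # Between-groups: distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups: spread of scores around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical scores for three groups (invented for illustration).
f_stat = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat)  # about 13: large between-group variance, small within-group variance
```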
83
what is an ANOVA for?
compare 3+ groups
84
one way ANOVA
one IV with 3+ levels
85
one way repeated measures ANOVA
one IV with 3+ levels, measured on the same subjects
86
comparison of group means in ANOVA
looks at distance of each group from the grand mean
87
what is the F test called?
omnibus test
will tell that a difference exists, but not where
88
what tells where a difference exists?
multiple comparison tests
89
ANOVA effect size small
eta squared: .01
Cohen's f: .10
90
ANOVA effect size medium
eta squared: .06
Cohen's f: .25
91
ANOVA effect size large
eta squared: .14
Cohen's f: .40
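The eta squared and Cohen's f benchmarks on these three cards are two expressions of the same effect sizes. A quick sketch using the standard conversion f = sqrt(eta squared / (1 - eta squared)) reproduces the pairs (the function name is illustrative):

```python
import math

def cohens_f_from_eta_squared(eta_squared):
    """Convert eta squared (proportion of variance explained) to Cohen's f."""
    return math.sqrt(eta_squared / (1 - eta_squared))

benchmarks = {"small": 0.01, "medium": 0.06, "large": 0.14}
f_values = {label: round(cohens_f_from_eta_squared(eta), 2)
            for label, eta in benchmarks.items()}

print(f_values)  # small 0.1, medium 0.25, large 0.4
```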
92
increased power in RM ANOVA
less variance
93
define sphericity
homogeneity of variance of differences
test with Mauchly's test
94
what is another name for multiple comparison tests?
pairwise comparisons
95
describe post hoc MCT
performed after ANOVA
most common
tests every difference
96
describe planned comparisons MCT
used instead of ANOVA
focused on specific comparisons
97
what is the goal of MCT
decrease family wise error rate
98
what is a solution to family wise error?
Bonferroni correction
divide alpha by the number of statistical tests
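A minimal sketch of the Bonferroni correction. The four-group scenario is invented; the number of pairwise comparisons among k groups is k(k - 1) / 2, computed here with `math.comb`.

```python
import math

def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha after dividing the overall alpha by the number of tests."""
    return alpha / n_tests

n_groups = 4                              # hypothetical study with four groups
n_comparisons = math.comb(n_groups, 2)    # 6 pairwise comparisons
adjusted_alpha = bonferroni_alpha(0.05, n_comparisons)

print(n_comparisons, adjusted_alpha)  # each test must reach p <= about .0083
```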
99
describe fisher's least significant difference
essentially unadjusted t-tests (LSD)
least conservative
most power
100
describe tukey's honestly significant difference
IG only
middle of the road in terms of risk
most common
best balance of type I and II error
101
describe bonferroni t-test
divides alpha by # of comparisons
most conservative
high type II error
102
describe sidak
RM
adjusted alpha
good balance of type I and II error
most common