Midterm Review Flashcards

(82 cards)

1
Q

How are variables classified?

A

By the kind of value they take: numerical or categorical.

2
Q

Continuous variables: give an example

A

Can take infinitely many values within a range, usually including fractions or decimals; uncountable. Ex: cow weight, core body temperature in dogs

3
Q

Discrete variables

A

Finite, usually integers, countable. Ex: # of eggs in a nest, # of stars around a planet

4
Q

What are categorical variables?

A

Not numeric; the data fit into categories

5
Q

How can quantitative variables be broken down?

A

As either continuous or discrete

6
Q

Nominal variables, give an example

A

Have values that are named categories, ex: coat colors, biological sex

7
Q

How are categorical variables broken down?

A

Nominal or ordinal

8
Q

Ordinal variables, give an example

A

Ordered named categories. Ex: stages of disease (cancer), levels of pain, BMI category

9
Q

Independent variable

A
  1. Effect, predictor, or explanatory variable
  2. Exerts an influence on the outcome you wish to measure
  3. Can be actively manipulated
10
Q

Dependent variable

A
  1. Outcome or response variable
  2. What you measure or record
11
Q

Frequency

A

how often a data point shows up

12
Q

What does a histogram show you?

A

Center, spread, shape

13
Q

Taxonomy of frequency histogram shapes (6)

A

a. symmetric, bell-shaped
b. symmetric, not bell-shaped
c. skewed to the right (positively skewed)
d. skewed to the left (negatively skewed)
e. negative exponential
f. bimodal

14
Q

Why look at frequency distributions?

A
  1. insight into sample
  2. detect outliers
  3. check assumptions of statistical tests
15
Q

What does a bivariate scatterplot show?

A

The relationship between 2 quantitative variables; shows its strength and direction

16
Q

What are the three measures of central tendency?

A

Mean, median, mode
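
As a quick illustration, the three measures can be computed with Python's standard `statistics` module. The litter sizes are made-up values chosen for this sketch, not course data.

```python
import statistics

# Hypothetical litter sizes (made-up data for illustration only)
litter_sizes = [4, 5, 5, 6, 7, 9, 13]

mean_value = statistics.mean(litter_sizes)      # sum of values / number of values
median_value = statistics.median(litter_sizes)  # middle value of the sorted data
mode_value = statistics.mode(litter_sizes)      # most frequent value

print(mean_value, median_value, mode_value)
```

In this made-up sample the mean (7) sits above the median (6) because the single large value, 13, pulls the mean upward.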

17
Q

4 Measures of Dispersion

A
  1. Range
  2. Mean deviation
  3. Standard deviation
  4. Variance
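
A minimal sketch of the four measures, assuming a small made-up sample and the sample (n - 1) versions of variance and SD. The variable names are illustrative only.

```python
import statistics

# Hypothetical body weights in kg (made-up data for illustration only)
weights = [2.0, 4.0, 4.0, 5.0, 10.0]

mean_w = statistics.mean(weights)

# Range: largest value minus smallest value
data_range = max(weights) - min(weights)

# Mean deviation: average absolute distance of each value from the mean
mean_deviation = sum(abs(x - mean_w) for x in weights) / len(weights)

# Sample variance and SD divide by n - 1
variance = statistics.variance(weights)
std_dev = statistics.stdev(weights)

print(data_range, mean_deviation, variance, std_dev)
```

Here the SD is the square root of the variance, so it is in the same units as the data while the variance is in squared units.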
18
Q

Define mean

A

Arithmetic average of the data set: the sum of the observations divided by the number of observations

19
Q

Median

A

Middle measurement in an ordered set of observations

20
Q

Draw and label a box plot

A
21
Q

What are the advantages of a box plot (make 4 points)

A
  1. visual representation
  2. comparison
  3. identify central tendency and spread
  4. identify outliers
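
The numbers behind a box plot can be sketched as follows. The data are made up, and the 1.5 x IQR rule for flagging outliers is a common convention assumed here, not something stated on these cards. Quartile values also depend on the method used, so other software may give slightly different results.

```python
import statistics

# Hypothetical measurements (made-up data); 40 is a deliberately extreme value
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 40]

# Quartiles: the box spans q1 to q3, with a line at the median
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the box are flagged as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(q1, median, q3, iqr, outliers)
```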
22
Q

What is the standard deviation (s)

A

A measure of data spread: how far from the mean the observations typically are. Larger s = observations farther from the mean.

23
Q

Variance = s^2

A

Used to calculate the SD (the SD is the square root of the variance)

24
Q

Statistical population

A

Aggregate of all units under study; it has the true mean, SD, and other population parameters

25
Sample population
The specific group you will collect data from
26
Define blocking in experiments, examples
Grouping experimental units into similar subsets, ex: location, family, genotype
27
Describe two-step blocking procedure
1. Divide experimental units into homogeneous subsets (blocks) 2. Randomly assign treatments within each block
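
The two steps can be sketched in a few lines. The block names, unit labels, and treatments are hypothetical, and the fixed seed is only there to make the sketch repeatable.

```python
import random

# Step 1: experimental units already grouped into homogeneous blocks
# (hypothetical litters, three animals each)
blocks = {
    "litter_A": ["A1", "A2", "A3"],
    "litter_B": ["B1", "B2", "B3"],
}
treatments = ["control", "low_dose", "high_dose"]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Step 2: shuffle the treatments independently within each block
assignment = {}
for block_name, units in blocks.items():
    shuffled = treatments[:]
    rng.shuffle(shuffled)
    for unit, treatment in zip(units, shuffled):
        assignment[unit] = treatment

print(assignment)
```

Every treatment appears once in every block, so differences between blocks cannot be mistaken for treatment effects.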
28
What are poor sampling designs?
1. Haphazard sampling 2. Convenience or opportunity sampling 3. Pseudoreplication
29
Discuss pseudoreplication
When observations are not statistically independent but are treated as if they are. Results in an artificially inflated sample size (n). Ex: treating multiple cells from the same animal as independent
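
A small sketch of the cell example, using made-up measurements. Averaging within each animal is one common remedy and is an assumption of this sketch, not a method stated on the card.

```python
import statistics

# Made-up cell measurements; several cells come from the same animal
cells_by_animal = {
    "animal_1": [10.0, 11.0, 12.0],
    "animal_2": [14.0, 16.0],
    "animal_3": [20.0, 20.0, 20.0, 20.0],
}

# Pseudoreplicated view: every cell counted as an independent observation
n_pseudo = sum(len(cells) for cells in cells_by_animal.values())

# One value per animal, so n matches the number of independent units
animal_means = {a: statistics.mean(c) for a, c in cells_by_animal.items()}
n_independent = len(animal_means)

print(n_pseudo, n_independent, animal_means)
```

Counting cells gives n = 9, while counting animals gives n = 3, which is the sample size the analysis can actually support.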
30
2 benefits of random sampling
1. unbiased 2. high precision
31
Discuss high bias
Repeated samples give estimates that systematically diverge from the population parameter in the same fashion, aiming in the wrong place
32
Frequency distribution
How often each specific value shows up in a data set
33
What is a probability distribution?
All possible values a random variable can take within a given range, together with their probabilities
34
Normal distribution (make 4 points)
1. Most common 2. Symmetric around the mean 3. Bell shaped 4. 68-95-99.7 rule
35
IQR
interquartile range, range of middle 50% of sample
36
What does variance measure?
Variability of the data around the mean (the average squared deviation from the mean)
37
Standard deviation
Measure of how dispersed the data is about the mean
38
Coefficient of variation
Measure of dispersion of data points around the mean, expressed as a percentage (SD / mean x 100)
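
A short sketch showing why a percentage is useful: two made-up samples in different units end up with the same relative spread. The helper name `coefficient_of_variation` is illustrative.

```python
import statistics

# Made-up data in different units to show why a unitless measure helps
mouse_weights_g = [18.0, 20.0, 22.0]
cow_weights_kg = [540.0, 600.0, 660.0]

def coefficient_of_variation(data):
    """Return the sample SD as a percentage of the mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100

cv_mouse = coefficient_of_variation(mouse_weights_g)
cv_cow = coefficient_of_variation(cow_weights_kg)

print(cv_mouse, cv_cow)
```

The SDs are 2 g and 60 kg, which cannot be compared directly, but both samples have a CV of about 10%.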
39
Confounding variable
Unmeasured third variable affecting both the independent and dependent variable
40
Spurious Association
When two variables are correlated but don't have a causal relationship
41
Extraneous variables
Not measured; affects the dependent variable
42
Estimation
using sample data to make inferences about the population
43
Point estimate
A single value used as the estimate of a parameter (ex: the sample mean)
44
Interval estimate
A range of values for a parameter; gives an interval as an estimate for a parameter
45
Confidence Interval
An interval estimate with a stated confidence level (ex: 95%) that it contains the true population parameter being estimated
46
Central Limit Theorem
The distribution of the sample means approaches a normal distribution as the sample size gets larger, regardless of the population's distribution
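
A simulation sketch of this idea, assuming an exponential population (strongly right-skewed, mean 1) and arbitrary choices of sample sizes and repetitions. The gap between the mean and the median of the sample means is used here only as a crude indicator of skew.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def sample_mean(n):
    """Mean of n draws from an exponential population with mean 1."""
    return statistics.mean([rng.expovariate(1.0) for _ in range(n)])

# Many repeated samples at a small and a larger sample size
means_n2 = [sample_mean(2) for _ in range(5000)]
means_n50 = [sample_mean(50) for _ in range(5000)]

# Spread of the sample means shrinks roughly like 1 / sqrt(n)
sd_n2 = statistics.stdev(means_n2)
sd_n50 = statistics.stdev(means_n50)

# Crude skew indicator: mean minus median is near 0 for a symmetric distribution
gap_n2 = statistics.mean(means_n2) - statistics.median(means_n2)
gap_n50 = statistics.mean(means_n50) - statistics.median(means_n50)

print(sd_n2, sd_n50, gap_n2, gap_n50)
```

With n = 50 the sample means cluster tightly and nearly symmetrically around 1, even though the population itself is far from normal.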
47
68-95-99.7 rule
68% within 1 SD, 95% within 2 SD, 99.7% within 3 SD of the mean
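
The three percentages can be checked against the standard normal distribution with `statistics.NormalDist` from the Python standard library. The helper name `share_within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```

The rule rounds these values, so 68.3% and 95.4% become 68% and 95%.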
48
Steps for Hypothesis Testing (for a t-test with pooled variances)
  1. State the formal statistical hypotheses
     a) Biological question
     b) Null hypothesis
     c) Alternative hypothesis
  2. Choose an appropriate statistical test and justify your choice
     a) 1-sample t-test. H0: u = specified mean value; HA: u ≠ specified mean value
     b) Independent samples t-test. H0: u1 = u2; HA: u1 ≠ u2
     c) Paired dependent t-test. H0: udiff = 0; HA: udiff ≠ 0
  3. Check the assumptions
     a) Normality (Shapiro-Wilk, Kolmogorov-Smirnov). H0: sample data come from a normal population distribution (p > 0.05); HA: sample data do not come from a normal population distribution (p < 0.05)
     b) Homogeneity of variances (2-sample independent t-test): box plot, variance ratio, Levene's test. H0: variances of the 2 groups are equal; HA: variances of the 2 groups are not equal
  4. Run the analysis, compare to the reference set
  5. Evaluate the evidence against the null: p > 0.05, fail to reject H0; p < 0.05, reject H0
  6. Write a summary statement
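
Step 4 for a pooled-variance t-test can be sketched by hand. The two groups are made-up data, and the critical value 2.306 (two-tailed, alpha = 0.05, df = 8) is taken from a standard t table as an assumption of this sketch; statistical software would report an exact p-value instead.

```python
import math
import statistics

# Made-up measurements for two independent groups
group_1 = [5.0, 6.0, 7.0, 8.0, 9.0]
group_2 = [2.0, 3.0, 4.0, 5.0, 6.0]

n1, n2 = len(group_1), len(group_2)
mean_1, mean_2 = statistics.mean(group_1), statistics.mean(group_2)
var_1, var_2 = statistics.variance(group_1), statistics.variance(group_2)

# Pooled variance: the two sample variances weighted by their degrees of freedom
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var_1 + (n2 - 1) * var_2) / df

# Standard error of the difference between the two means
standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_statistic = (mean_1 - mean_2) / standard_error

# Assumption: two-tailed critical value for alpha = 0.05 and df = 8, from a t table
T_CRITICAL = 2.306
reject_h0 = abs(t_statistic) > T_CRITICAL

print(df, pooled_var, t_statistic, reject_h0)
```

For these made-up groups t is 3.0 on 8 degrees of freedom, which exceeds the critical value, so H0: u1 = u2 would be rejected at the 0.05 level.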
49
Type 1 error
Incorrectly rejecting a true null hypothesis (false positive)
50
Type 2 error
Failing to reject H0 when it is false (false negative)
51
How to limit type 1 errors
Only reject H0 if p < alpha (ex: alpha = 0.05)
52
How to limit type 2 errors
Maximize statistical power
53
When do you use a one-sample t-test?
when you want to compare the mean of a sample to a known or hypothesized population mean, and you only have data from a single sample
54
What does the Independent samples t-test compare?
Compare means between two unrelated samples
55
What does the paired sample dependent t-tests compare?
The means of two related measurements taken on a single group (ex: before and after)
56
What do 2-tailed tests allow you to detect?
Allow you to detect differences in either direction
57
Discuss 1-tailed tests
not common, must be specified before data is collected, detect difference in only one direction
58
Name two tests of normality and discuss what they tell you
Shapiro-Wilk and Kolmogorov-Smirnov. Both test how well the sample data fit a normal distribution
59
How to check the homogeneity of variances?
1. Side-by-side box plots 2. Calculate the variance ratio (largest/smallest variance in SPSS) 3. Levene's test
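
The variance ratio in point 2 is simple to compute directly. The samples are made up, and no cutoff is applied because the threshold for "too unequal" is a course or textbook convention that these cards do not state.

```python
import statistics

# Made-up samples for two groups
group_a = [10.0, 12.0, 14.0, 16.0, 18.0]
group_b = [11.0, 13.0, 14.0, 15.0, 17.0]

var_a = statistics.variance(group_a)
var_b = statistics.variance(group_b)

# Largest variance divided by smallest variance, so the ratio is always >= 1
variance_ratio = max(var_a, var_b) / min(var_a, var_b)

print(var_a, var_b, variance_ratio)
```

A ratio near 1 suggests similar spread in the two groups; here group_a is twice as variable as group_b.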
60
What does Levene's test compare? What are the H0 and HA for Levene's test? When do you reject or fail to reject the null hypothesis?
Checks whether the samples to be compared come from populations with the same variance. H0: the variance (the spread) of the two groups is the same. HA: the variance (the spread) of the two groups is not the same. p < 0.05: reject the null (the two samples do not have equal variances). p > 0.05: fail to reject the null (the variances can be treated as equal).
61
How are degrees of freedom calculated?
sample size (n) minus the number of parameters estimated
62
Why not do lots of t-tests?
Running many t-tests inflates the overall type 1 error rate
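
The inflation can be made concrete with a short calculation. This sketch assumes the pairwise tests are independent, which is a simplification, and the function name `familywise_error_rate` is illustrative.

```python
from math import comb

ALPHA = 0.05  # per-test significance level

def familywise_error_rate(n_groups, alpha=ALPHA):
    """Chance of at least one false positive across all pairwise tests,
    assuming the tests are independent (a simplifying assumption)."""
    n_tests = comb(n_groups, 2)  # number of distinct pairs of groups
    return 1 - (1 - alpha) ** n_tests

for n_groups in (3, 5):
    print(n_groups, comb(n_groups, 2), round(familywise_error_rate(n_groups), 3))
```

Under this assumption, comparing 3 groups pairwise (3 tests) gives roughly a 14% chance of at least one false positive, and 5 groups (10 tests) gives roughly 40%.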
63
What is the statistical hypothesis for ANOVA?
H0: all means are equal; HA: not all means are equal
64
2 steps of ANOVA
1) Global F-Test 2) Post-hoc tests
65
What is the ANOVA test statistic ratio
Between-group variation : within-group variation (F = between / within)
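
A hand calculation of the ratio for a one-way ANOVA, using three made-up groups. Variation is expressed as mean squares (sum of squares divided by degrees of freedom), which is the usual form of the F statistic.

```python
import statistics

# Made-up measurements for three groups
groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [6.0, 7.0, 8.0],
]

k = len(groups)
n_total = sum(len(g) for g in groups)
grand_mean = statistics.mean([x for g in groups for x in g])

# Between-group variation: how far each group mean is from the grand mean
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
ms_between = ss_between / (k - 1)

# Within-group variation: how far each value is from its own group mean
ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
ms_within = ss_within / (n_total - k)

f_ratio = ms_between / ms_within

print(ms_between, ms_within, f_ratio)
```

For these groups the between-group mean square is 21 and the within-group mean square is 1, so F = 21 with 2 and 6 degrees of freedom.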
66
ANOVA summary statement
name of test, degrees of freedom, f statistic, p value
67
The Nonparametric independent 2-sample t-test twin
Wilcoxon Mann-Whitney Rank Test
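
A minimal sketch of how the rank-based statistic is built, assuming made-up samples with no tied values (ties would need average ranks, which this sketch does not handle). It computes only the U statistic, not a p-value.

```python
# Made-up samples with no tied values
sample_x = [1.1, 2.3, 3.0]
sample_y = [2.9, 4.2, 5.5, 6.1]

# Rank all observations together, smallest = rank 1
combined = sorted(sample_x + sample_y)
rank_of = {value: position for position, value in enumerate(combined, start=1)}

rank_sum_x = sum(rank_of[v] for v in sample_x)
n_x, n_y = len(sample_x), len(sample_y)

# U for each sample; the two always add up to n_x * n_y
u_x = rank_sum_x - n_x * (n_x + 1) / 2
u_y = n_x * n_y - u_x
u_statistic = min(u_x, u_y)

print(rank_sum_x, u_x, u_y, u_statistic)
```

Because only the ranks enter the calculation, replacing 6.1 with 600 would leave U unchanged, which is why the test is not sensitive to outliers.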
68
What does Tukey's HSD compare?
Compares all possible pairs of means; tells which specific group means are different from each other
69
A transformation only changes what?
Distribution of the values
70
What tests to do if your data passes assumptions
1) t-test 2) ANOVA
71
4 Qualities of Non-Parametric Test
1) no mean 2) no or fewer assumptions 3) not sensitive to outliers 4) based on ranks of data values
72
Which has more statistical power: parametric or nonparametric
Parametric
73
What type of error is a nonparametric test more likely to have?
Type II error: failing to reject a false H0, so less likely to detect a true effect
74
The nonparametric equivalent of the Dependent t-test
Wilcoxon Signed-Rank Test
75
The non-parametric equivalent of the ANOVA
Kruskal-Wallis, with Dunn's test as the post-hoc
76
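For ANOVA, what does it mean when F = 0, F = 1, or F is large?
F = 0: group means are identical. F = 1: between-group variation is about the same as within-group variation, so little evidence of differences among group means. F is large: big differences among group means.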
77
Advantages of Non-parametric tests (make 3 points)
1) more widely applicable, 2) not sensitive to outliers 3) generally any sample distribution OK
78
Disadvantages of non-parametric tests
1) lower statistical power 2) if the assumptions of the parametric test are met, the parametric test is more powerful
79
State H0 and HA for a 1-sample t-test
H0: population mean = specified value HA: population mean ≠ specified value
80
State hypothesis for independent samples t-test
H0: u1 = u2; HA: u1 ≠ u2
81
State hypothesis for paired dependent t-tests
H0: udiff = 0; HA: udiff ≠ 0
82
H0 and HA for ANOVA
H0: there is no difference between the means of the populations being studied; HA: at least one of the population means differs from the others