Stats Flashcards

(84 cards)

1
Q

z-test

A

variance is known
one sample: (ybar - mu)/(sigma/sqrt(n))
two samples: (ybar1 - ybar2)/sqrt(sigma1^2/n1 + sigma2^2/n2)
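
A minimal Python sketch of the two z statistics, assuming the population variances are known. The function names and the numbers are illustrative and do not come from the cards.

```python
import math


def z_one_sample(ybar, mu0, sigma, n):
    """z statistic for one sample mean when sigma is known."""
    return (ybar - mu0) / (sigma / math.sqrt(n))


def z_two_sample(ybar1, ybar2, var1, var2, n1, n2):
    """z statistic for the difference of two means with known variances."""
    return (ybar1 - ybar2) / math.sqrt(var1 / n1 + var2 / n2)


# Illustrative numbers only (not from the source).
print(z_one_sample(ybar=10.5, mu0=10.0, sigma=2.0, n=16))
print(z_two_sample(12.0, 10.0, var1=4.0, var2=4.0, n1=8, n2=8))
```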

2
Q

t-test

A

variance is not known

one sample: (ybar - mu)/(s/sqrt(n))
two samples (pooled): (ybar1 - ybar2)/(sp * sqrt(1/n1 + 1/n2))
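
A hedged Python sketch of the t statistic when the variance has to be estimated from the data. It shows a one-sample version and a pooled two-sample version; the helper names and data are hypothetical.

```python
import math
import statistics


def t_one_sample(sample, mu0):
    """Return (t, df) for a one-sample t-test with estimated std dev."""
    n = len(sample)
    ybar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample std dev, n - 1 in the denominator
    return (ybar - mu0) / (s / math.sqrt(n)), n - 1


def t_two_sample_pooled(sample1, sample2):
    """Return (t, df) for a two-sample t-test assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq = statistics.variance(sample1)
    s2_sq = statistics.variance(sample2)
    sp = math.sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    return diff / (sp * math.sqrt(1 / n1 + 1 / n2)), n1 + n2 - 2


# Illustrative data only.
print(t_one_sample([1, 2, 3, 4, 5], mu0=2))
print(t_two_sample_pooled([1, 2, 3], [2, 3, 4]))
```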

3
Q

CLT

A

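zn = (x - n*mu)/sqrt(n*sigma^2), where x is the sum of n independent observations; zn is approximately N(0, 1) for large n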

4
Q

95% confidence interval

A

ybar +/- z * (SE Mean), with z = 1.96 for 95%

SE Mean = s/sqrt(n)
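
A small Python sketch of the interval on this card: the sample mean plus or minus z times the standard error of the mean. The default z = 1.96 assumes a 95% two-sided normal interval; the function name and data are illustrative.

```python
import math
import statistics


def mean_interval(sample, z=1.96):
    """Return (lower, upper) as ybar +/- z * s / sqrt(n)."""
    n = len(sample)
    ybar = statistics.mean(sample)
    se_mean = statistics.stdev(sample) / math.sqrt(n)
    return ybar - z * se_mean, ybar + z * se_mean


# Illustrative data only.
print(mean_interval([2, 4, 4, 4, 5, 5, 7, 9]))
```

For small samples a t quantile would normally replace z; that swap is left out to keep the sketch close to the card.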

5
Q

ANOVA

A

SS(treat), df = a-1, MS(treat) = SS(treat)/(a-1), F = MS(treat)/MS(E)
SS(E), df = N-a, MS(E) = SS(E)/(N-a)
SS(T), df = N-1
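
The ANOVA table rows can be sketched in Python for a one-way layout. This is a minimal illustration with made-up data; `one_way_anova` is a hypothetical helper, not a library function.

```python
def one_way_anova(groups):
    """groups: one list of observations per treatment."""
    a = len(groups)                       # number of treatments
    N = sum(len(g) for g in groups)       # total number of observations
    grand_total = sum(sum(g) for g in groups)
    correction = grand_total ** 2 / N     # y..^2 / N

    ss_total = sum(y ** 2 for g in groups for y in g) - correction
    ss_treat = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_error = ss_total - ss_treat

    ms_treat = ss_treat / (a - 1)
    ms_error = ss_error / (N - a)
    return {
        "treat": {"SS": ss_treat, "df": a - 1, "MS": ms_treat,
                  "F": ms_treat / ms_error},
        "error": {"SS": ss_error, "df": N - a, "MS": ms_error},
        "total": {"SS": ss_total, "df": N - 1},
    }


# Illustrative data: 3 treatments, 3 observations each.
for source, row in one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]).items():
    print(source, row)
```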

6
Q

a in ANOVA

A

number of treatments

7
Q

n in ANOVA

A

number of blocks

8
Q

i in ANOVA

A

index of the treatment (i = 1, ..., a)

9
Q

j in ANOVA

A

index of the block (j = 1, ..., n)

10
Q

residual

A

yij - average(yi)

11
Q

3 model adequacy checking graphs

A

(1) normal prob plot
(2) predicted values plot
(3) time series plot

12
Q

normal prob plot

A

checks normality of residuals; catches outliers and signals a need to transform
x = residual
y = normal % probability

13
Q

predicted values plot

A

tests homogeneity of variance; address by control, randomization, or a transform
x = predicted yi
y = residual

14
Q

time series plot

A

tests independence
x = run order time
y = residual

15
Q

tests for equality of variance

A

(1) Bartlett's test

(2) modified Levene's test

16
Q

Box-Cox

A

selects the power transform (lambda) for the response

17
Q

Contrasts

A

(1) orthogonal

(2) Scheffé - contrasts don’t need to be specified in advance

18
Q

Comparing Means

A

(1) Fisher LSD - does not control the overall error rate
(2) Tukey’s test - controls the overall error rate
(3) Dunnett’s test - compares each treatment with a control

19
Q

Determining sample size

A

(1) operating characteristic (OC) curves

(2) specifying std dev

20
Q

Random Effects Model

A

factor levels are randomly selected from a larger population of levels

21
Q

Randomized Complete Block Design (RCBD)

A
- blocks represent a restriction on randomization
- control of nuisance

22
Q

SS(treat)

A

(1/b) * sum_i(yi.^2) - y..^2/N, where yi. is the total for treatment i and y.. is the grand total

23
Q

SS(block)

A

(1/a) * sum_j(y.j^2) - y..^2/N, where y.j is the total for block j and y.. is the grand total

24
Q

SS(E)

A

SS(T) - SS(treat) - SS(block)

25
SS(T)
sum_i sum_j(yij^2) - y..^2/N
26
df for RCBD
```
SS(Treat) = a - 1
SS(blocks) = b - 1
SS(E) = (a - 1)(b - 1)
SS(T) = N - 1
```
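
The RCBD sums of squares and degrees of freedom from cards 22 to 26 can be sketched in Python as below. The sketch assumes one observation per treatment/block cell; `rcbd_anova` and the data are hypothetical.

```python
def rcbd_anova(y):
    """y[i][j]: response for treatment i in block j (one obs per cell)."""
    a, b = len(y), len(y[0])          # treatments, blocks
    N = a * b
    grand_total = sum(sum(row) for row in y)
    correction = grand_total ** 2 / N  # y..^2 / N

    ss_total = sum(v ** 2 for row in y for v in row) - correction
    ss_treat = sum(sum(row) ** 2 for row in y) / b - correction
    block_totals = [sum(y[i][j] for i in range(a)) for j in range(b)]
    ss_block = sum(t ** 2 for t in block_totals) / a - correction
    ss_error = ss_total - ss_treat - ss_block

    df = {"treat": a - 1, "block": b - 1,
          "error": (a - 1) * (b - 1), "total": N - 1}
    ss = {"treat": ss_treat, "block": ss_block,
          "error": ss_error, "total": ss_total}
    f_treat = (ss_treat / df["treat"]) / (ss_error / df["error"])
    return ss, df, f_treat


# Illustrative data: 3 treatments (rows) by 3 blocks (columns).
ss, df, f_treat = rcbd_anova([[1, 2, 3], [2, 3, 4], [5, 6, 10]])
print(ss, df, f_treat)
```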
27
Latin Square
- blocking in 2 directions
- 2 restrictions on randomization
- disadvantage: small DF, control by replicating operators
28
Latin Square setup
```
SS(Treatments), df = p - 1
SS(Rows), df = p - 1
SS(columns), df = p - 1
SS(E), df = (p - 1)(p - 2)
SS(T), df = p^2 - 1
```
29
Crossover
- eliminate issue of time
- may still have residual effect (mixing of results)
30
Graeco Latin Square
blocks in 3 directions
31
Main effect
sum(A+)/2 - sum(A-)/2
32
Interaction
diff(A's at B+)/2 - diff(A's at B-)/2
33
SS(A)
(1/(bn)) * sum_i(yi..^2) - y...^2/(abn)
34
SS(int)
(1/n) * sum_i sum_j(yij.^2) - y...^2/(abn) - SS(A) - SS(B)
35
df for factorial design
```
A = a - 1
B = b - 1
AB = (a - 1)(b - 1)
error = ab(n - 1)
T = abn - 1
```
36
SS(blocks)
(1/(ab)) * sum_k(y..k^2) - y...^2/(abn)
37
SS(A) for factorial
[a + ab - b - (1)]^2/(4n)
n is number of replicates
4 represents 2^2, would be 8 for 2^3
38
SS(T) df for factorial
4n - 1
39
Main effect for factorial
A = [a + ab - b - (1)]/(2n)
2 represents 2^(k-1) with k = 2, would be 4 for 2^3
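
A short Python sketch of the 2^2 contrast arithmetic: each effect is its contrast divided by 2n, and each sum of squares is the squared contrast divided by 4n. The treatment totals below are illustrative, and `two_by_two` is a hypothetical helper.

```python
def two_by_two(one, a, b, ab, n):
    """one, a, b, ab: totals over n replicates at (1), a, b, ab."""
    contrasts = {
        "A": a + ab - b - one,
        "B": b + ab - a - one,
        "AB": ab + one - a - b,
    }
    effects = {name: c / (2 * n) for name, c in contrasts.items()}
    sums_of_squares = {name: c ** 2 / (4 * n) for name, c in contrasts.items()}
    return effects, sums_of_squares


# Illustrative totals with n = 3 replicates per treatment combination.
effects, ss = two_by_two(one=80, a=100, b=60, ab=90, n=3)
print(effects)
print(ss)
```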
40
Coefficient for regression
effect/2
41
R^2
SS(model)/SS(Total)
42
Orthogonality
(1) equal number of + and - signs in each column (except I)
(2) sum of the products of signs in any 2 columns = 0
(3) I * col -> unchanged
(4) product of any 2 columns yields a column already in the table
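
These sign-table properties can be checked numerically. The Python sketch below builds the table for a 2^3 design (the choice of three factors is an assumption for illustration) and tests each property.

```python
from itertools import combinations, product
from math import prod

factors = "ABC"
runs = list(product([-1, 1], repeat=len(factors)))

# Build column I plus every main-effect and interaction column.
columns = {"I": [1] * len(runs)}
for size in range(1, len(factors) + 1):
    for idx in combinations(range(len(factors)), size):
        name = "".join(factors[i] for i in idx)
        columns[name] = [prod(run[i] for i in idx) for run in runs]

effect_cols = {k: v for k, v in columns.items() if k != "I"}

# (1) equal number of + and - in each column except I
balanced = all(sum(col) == 0 for col in effect_cols.values())
# (2) products of signs in any two columns sum to zero
orthogonal = all(
    sum(x * y for x, y in zip(c1, c2)) == 0
    for c1, c2 in combinations(effect_cols.values(), 2)
)
# (3) I times any column leaves it unchanged
identity_ok = all(
    [i * x for i, x in zip(columns["I"], col)] == col
    for col in columns.values()
)
# (4) the product of any two columns is already in the table
known = {tuple(col) for col in columns.values()}
closed = all(
    tuple(x * y for x, y in zip(c1, c2)) in known
    for c1, c2 in combinations(columns.values(), 2)
)

print(balanced, orthogonal, identity_ok, closed)
```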
43
VIF
1/(1 - Rj^2), where Rj^2 is from regressing xj on the other regressors
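
A minimal Python sketch of the VIF formula for the special case of two regressors, where the R^2 from regressing one regressor on the other equals their squared correlation. The function name and data are illustrative.

```python
def vif_two_regressors(x1, x2):
    """VIF = 1 / (1 - R^2), with R^2 = squared correlation of x1 and x2."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    r_squared = s12 ** 2 / (s11 * s22)
    return 1 / (1 - r_squared)


# Orthogonal coded columns give the minimum value.
print(vif_two_regressors([-1, 1, -1, 1], [-1, -1, 1, 1]))
# Nearly collinear regressors inflate the variance.
print(vif_two_regressors([1, 2, 3, 4], [1, 2, 3, 5]))
```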
44
Types of error
- standard error (for regression coefficient)
- pure (from replication)
- lack of fit (from pooling)
- residual (PE + LOF)
45
Dispersion effect
look at ranges
46
Half normal
plot of coefficients
47
Defining relation
I = ...
48
Design generator
A = BC (aliasing)
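
The aliasing A = BC can be seen directly in a half fraction. The Python sketch below assumes the defining relation I = ABC and keeps only the runs of a 2^3 design where the product ABC is +1.

```python
from itertools import product

full = list(product([-1, 1], repeat=3))

# Half fraction with I = ABC (assumed defining relation for this sketch).
half = [(a, b, c) for a, b, c in full if a * b * c == 1]

col_A = [a for a, b, c in half]
col_BC = [b * c for a, b, c in half]

# The two columns are identical, so A and BC cannot be separated.
print(half)
print(col_A == col_BC)
```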
49
Resolution
length of the shortest word in the defining relation
50
Family
I = +/- ABC
51
Confirmation Experiment
Set factors at the chosen levels, run, and compare the result with the regression model prediction
52
Choosing a design
highest resolution
53
Number of treatment combinations
2^(5-2) = 8
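
A hedged Python sketch of how the 8 runs arise: a full 2^3 design in three basic factors, with the two remaining factors set by generators. The generators D = AB and E = AC are an assumption for illustration; the card does not name them.

```python
from itertools import product

# Basic design: full factorial in A, B, C.
base = list(product([-1, 1], repeat=3))

# Assumed generators (not from the source): D = AB, E = AC.
design = [(a, b, c, a * b, a * c) for a, b, c in base]

for run in design:
    print(run)
print(len(design))
```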
54
Folding
full fold-over: change signs for all factors; words with an odd number of letters become negative
55
Combined defining relation
multiply - words, copy + words
56
Aliases
1/2([i] + [i]')
57
Plackett Burman
- different class of resolution III design
- number of runs needs to be a multiple of 4
- non-regular
- non-geometric
- not flexible
- cannot be represented by cubes
58
Super saturated
- take a P-B design, sort on the last column, delete all - or + rows
- k > N-1
59
k
number of factors
60
Treatment design
- know how design is confounded
- prevent nuisance variables
- signal what we know and don't know
61
Experimental design
- Randomize to prevent bias
- Figure out execution
62
Estimate correct alias
- prior knowledge of system
- interaction plot
- p-values for each individually
- run other half
63
Empirical vs Mechanistic
model derived from data vs. model based on theoretical law
64
Regression
no statement of effect, not causal
65
Missing data point
Slightly different regression
66
Standard dev versus Confidence Interval
Variability in raw data versus variability in means
67
Prediction interval
interval for a single future observation, such as a confirmation run (wider than a CI on the mean)
68
Lack of fit
how well points fit regression
69
2 error terms for regression
pure, lack of fit
70
Response Surface Methodology
sequential process, method/path of steepest ascent
71
Procedure for method of steepest ascent/descent
(1) fit 1st order model
(2) check error, interactions, quadratic effects (curvature)
(3) delta x1 = 1; delta x2 = (b2/b1) * delta x1
(4) convert the coded steps to natural units
(5) test with new factor levels and keep stepping
(6) perform new factorial with region of exploration centered around optimal points
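
The step calculation in this procedure can be sketched in Python. The sketch assumes a fitted first-order model in coded units; the coefficients, design centers, and half-ranges are hypothetical numbers, and the helper names are not from any library.

```python
def steepest_ascent_steps(coefs, base_index, base_step=1.0):
    """Coded step per factor, proportional to the fitted coefficients.

    The base factor moves by base_step (in the direction of its
    coefficient); negate coefs to follow the path of steepest descent.
    """
    scale = base_step / abs(coefs[base_index])
    return [b * scale for b in coefs]


def path_points(deltas, n_steps):
    """Coded points at 0, 1, ..., n_steps steps from the design center."""
    return [[k * d for d in deltas] for k in range(n_steps + 1)]


def to_natural(coded, centers, half_ranges):
    """Convert a coded point to natural units."""
    return [c + x * h for x, c, h in zip(coded, centers, half_ranges)]


# Hypothetical first-order fit: yhat = b0 + 0.775*x1 + 0.325*x2.
deltas = steepest_ascent_steps([0.775, 0.325], base_index=0)
for point in path_points(deltas, 3):
    print(point, to_natural(point, centers=[35.0, 155.0], half_ranges=[5.0, 5.0]))
```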
72
Why use center point?
- estimate pure error when you don't want to replicate the whole design
- check for curvature
- add df for error
73
Central composite design
- n(f) factorial runs, n(c) centerpoint runs, 2k axial
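
A minimal Python sketch that lists the three groups of runs in coded units. It assumes a full 2^k factorial core and, when no axial distance is given, uses the rotatable choice alpha = (number of factorial runs)^(1/4); the function name is hypothetical.

```python
from itertools import product


def central_composite(k, alpha=None, n_center=1):
    """Return coded runs: 2^k factorial, 2k axial, n_center center points."""
    if alpha is None:
        alpha = (2 ** k) ** 0.25  # rotatable axial distance (assumed default)

    factorial = [tuple(float(v) for v in run)
                 for run in product([-1, 1], repeat=k)]

    axial = []
    for i in range(k):
        for sign in (-1, 1):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(tuple(point))

    center = [tuple([0.0] * k)] * n_center
    return factorial + axial + center


design = central_composite(k=2, n_center=3)
for run in design:
    print(run)
```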
74
Sequential central composite design
(1) 1st order -> lack of fit
(2) introduce axial points to allow quadratic terms
75
Rotatable CCD
- indicates good model
- prediction variance is the same at all points the same distance from the center, so it is unchanged when the design is rotated
76
Box-Behnken
- one factor is always at the center
- all points equidistant from center point, leads to equal variance
- spherical, no points at vertices
77
If you need a "-" value for time
- don't collect, missing value
- change other factor -> shift design
- constrained region
- D-optimal
- inscribed CCD (inside of box)
- face-centered -> replace corner with face points
78
Evolutionary operation
- constant monitoring and improving
- slight changes
- more data to find smaller differences
- longer period of time, lurking variables
79
Mixture
- factor levels not independent
- lattice simplex
- centroid simplex
80
Lattice Simplex
{p, m}
p = number of components in the mixture (sugar, cream)
m = number of equal steps per component; each proportion takes the values 0, 1/m, ..., 1 (m = 3: sugar = 0, 1/3, 2/3, 1)
p = 3 means 2D, m = 2 means 3 points on each edge
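
A short Python sketch that enumerates the {p, m} lattice points: every blend whose p proportions are multiples of 1/m and sum to 1. Exact fractions are used to avoid rounding; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def simplex_lattice(p, m):
    """All p-component blends with proportions in {0, 1/m, ..., 1} summing to 1."""
    return [
        tuple(Fraction(i, m) for i in combo)
        for combo in product(range(m + 1), repeat=p)
        if sum(combo) == m
    ]


# {3, 2}: three components, proportions 0, 1/2, 1.
for blend in simplex_lattice(3, 2):
    print(tuple(str(x) for x in blend))
```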
81
Centroid simplex
2^p - 1 runs
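
The 2^p - 1 count comes from taking every non-empty subset of the p components and blending its members in equal proportions. A minimal Python sketch, with a hypothetical function name:

```python
from fractions import Fraction
from itertools import combinations


def simplex_centroid(p):
    """One equal-proportion blend for each non-empty subset of p components."""
    points = []
    for size in range(1, p + 1):
        for subset in combinations(range(p), size):
            points.append(tuple(
                Fraction(1, size) if i in subset else Fraction(0)
                for i in range(p)
            ))
    return points


for blend in simplex_centroid(3):
    print(tuple(str(x) for x in blend))
```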
82
Lattice vs. centroid
lattice is more flexible than centroid
83
Axial blends
axial points in the interior
84
Model Adequacy
checked 2nd time around