Experimental Design Flashcards

Question

Basic idea of local control of error

Answer 1

Reduce the random error among the experimental units. Control or account for anything which might affect the response other than the factors.

Answer 2

RegrSS/CTSS = 1 - (RSS/CTSS)

Answer 3

sig hat^2 = RSS/(n-p)

Answer 4

sqrt(MSE/Sxx)
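The regression quantities on the three preceding cards (the R^2 identity, RSS/(n-p), and sqrt(MSE/Sxx)) can be checked numerically. The sketch below is a minimal pure-Python illustration for simple linear regression, assuming p = 2 parameters (intercept and slope). The data values and variable names are made up for the example.

```python
# Hypothetical data, chosen only to illustrate the formulas.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n, p = len(x), 2                      # p counts the intercept and the slope
xbar = sum(x) / n
ybar = sum(y) / n

Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b1 = Sxy / Sxx                        # least-squares slope
b0 = ybar - b1 * xbar                 # least-squares intercept
fitted = [b0 + b1 * xi for xi in x]

CTSS = sum((yi - ybar) ** 2 for yi in y)                 # corrected total SS
RSS = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual SS
RegrSS = sum((fi - ybar) ** 2 for fi in fitted)          # regression SS

r_squared = RegrSS / CTSS             # should equal 1 - RSS/CTSS
mse = RSS / (n - p)                   # estimate of sig^2
se_slope = (mse / Sxx) ** 0.5         # standard error of the slope

print(r_squared, 1 - RSS / CTSS, mse, se_slope)
```

The first two printed values agree up to rounding because CTSS = RegrSS + RSS for a least-squares fit with an intercept.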

Answer 5

It is proportional to the reciprocal of the volume of the confidence ellipsoid for the estimated coefficients

Answer 6

Occam's razor: "entities should not be multiplied beyond necessity". So choose fewer variables with sufficient explanatory power. This is a desirable modeling strategy.

Answer 7

A single-factor experiment with k levels (treatments)

Answer 8

When there are k types of observations but the number of regression parameters is greater than k. When fitting the model, (X'X)^-1 will not exist because X is not of full rank, so X'X is singular. Constraints will be needed to make X'X a nonsingular matrix.

Answer 9

1) Constraining the treatment effects to sum to zero (zero-sum constraint) 2) Setting one of the treatment effects to zero (dropping its column from the model matrix, X; called a baseline constraint)

Answer 10

After a global F-test of the treatments rejects the null hypothesis, a multiple comparisons test identifies which pairs of treatments differ significantly. 1) Bonferroni method: the alpha level is divided by the number of pairs, k' = (k choose 2), so a pair is declared different if the absolute value of its t statistic exceeds the upper alpha/(2k') quantile of the t distribution with N-k degrees of freedom 2) Tukey method: a pair is declared different if the absolute value of its t statistic exceeds 1/sqrt(2) times the upper alpha quantile of the studentized range distribution with k and N-k degrees of freedom

Answer 11

Basically the one-way fixed effects model, but the operators considered now come from a pool (population) of operators, so the operator effects are random. This gives rise to two variance components: the variance between operators and the variance within operators.

Answer 12

No because of the two error terms. You should find the expected value of these two terms

Answer 13

1) Have all important effects been captured? 2) Are the errors independent and normally distributed? 3) Do the errors have constant variance?

Answer 14

1) E(r) = 0 2) r and yhat are independent 3) r ~ MultiNorm(0, sig^2 (I - H)) 4) Var(r_i) = sig^2 (1 - h_ii)

Answer 15

1) Plot r_i vs yhat_i 2) Plot r_i vs x_i 3) Plot r_i vs time sequence, i 4) Plot r_i vs replicates grouped by treatment

Answer 16

Use a Box-whisker plot. It enables the location, dispersion, skewness, and extreme values of the replicated observations to be displayed in a single plot

Answer 17

IQR = Q₃ - Q₁. Whiskers = [Q₁ - 1.5*IQR, Q₃ + 1.5*IQR]. Anything outside the whisker bounds is considered an outlier. If Q₁ and Q₃ are not symmetric about the median (Q₂), this implies skewness.
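As a rough illustration of the whisker rule, the sketch below computes the quartiles, the IQR, the whisker bounds, and the flagged outliers for a small made-up sample. It assumes the "inclusive" quartile convention of Python's `statistics.quantiles`; other software may use slightly different quartile definitions and so give slightly different bounds.

```python
import statistics

# Hypothetical replicated observations for one treatment.
data = [2.1, 2.4, 2.5, 2.7, 2.9, 3.0, 3.1, 3.3, 3.6, 5.9]

# With n=4, statistics.quantiles returns the three quartile cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1
lower = q1 - 1.5 * iqr            # lower whisker bound
upper = q3 + 1.5 * iqr            # upper whisker bound
outliers = [v for v in data if v < lower or v > upper]

# A longer upper half of the box (Q3 - Q2 > Q2 - Q1) hints at right skew.
right_skew_hint = (q3 - q2) > (q2 - q1)

print(q1, q2, q3, iqr, (lower, upper), outliers, right_skew_hint)
```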

Answer 18

Purpose: to check whether the residuals follow a normal distribution. Process: Order the residuals, r_(1) <= ... <= r_(N), and assign each the cumulative probability p_i = (i - .5)/N. A plot of p_i vs r_(i) should be roughly S-shaped if the residuals are approximately normally distributed. Typically, however, the probabilities are transformed to normal quantiles so that the desired shape is a straight line (think qq-plot).
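The coordinates of a normal probability plot can be sketched as follows. This is only an illustration with made-up residuals; it uses the standard normal inverse CDF from Python's `statistics.NormalDist` as the transformation that turns the S shape into a straight line.

```python
from statistics import NormalDist

# Hypothetical residuals from some fitted model.
residuals = [0.3, -1.2, 0.8, -0.1, 1.9, -0.6, 0.2, -0.9]

N = len(residuals)
ordered = sorted(residuals)                           # r_(1) <= ... <= r_(N)
probs = [(i - 0.5) / N for i in range(1, N + 1)]      # p_i = (i - .5)/N
z_scores = [NormalDist().inv_cdf(p) for p in probs]   # normal quantiles of p_i

# Plotting these pairs should give roughly a straight line
# when the residuals are approximately normal.
points = list(zip(z_scores, ordered))
for z, r in points:
    print(f"{z:7.3f} {r:6.2f}")
```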

Answer 19

Paired comparison design, randomized block design, two-way and multi-way layout, Latin and Graeco-Latin square design, balanced incomplete block design (BIBD), split-plot design, ANCOVA

Answer 20

Paired comparison design: can be looked at as an RBD with block size 2. Each block consists of two homogeneous units, and the two treatments are randomly assigned within each block. Unpaired design: There are still two treatments, but the units are not paired into homogeneous blocks, so the experiment has more error degrees of freedom. Because its error term includes the between-unit variation, this design has lower power than the paired comparison design when the pairing is effective.

Answer 21

The k treatments are randomly assigned to the k units within each block, with b blocks and N = bk total sample size. For an effective design, the units within each block should be more homogeneous than units between blocks.

Answer 22

It involves two treatment factors with fixed levels. There is an interest in assessing the interaction effect between the two treatments

Answer 23

Like the two-way layout but expanded to three or more factors (treatments), each with two or more treatment levels

Answer 24

Each of the k Latin letters (ie treatments) appears once in each row and once in each column (these are the two blocking factors of the experiment)
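The row and column property can be checked mechanically. The sketch below builds a cyclic k x k Latin square (one simple construction, chosen here only for illustration) and verifies that every letter appears once per row and once per column. The function names are hypothetical, and in a real experiment the rows, columns, and letters would also be randomized.

```python
def cyclic_latin_square(k):
    """Build a k x k Latin square by cyclically shifting the letters."""
    letters = [chr(ord("A") + i) for i in range(k)]
    return [[letters[(row + col) % k] for col in range(k)] for row in range(k)]

def is_latin_square(square):
    """Check that each symbol appears exactly once in every row and column."""
    k = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == k and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(k)} == symbols for c in range(k))
    return len(symbols) == k and rows_ok and cols_ok

square = cyclic_latin_square(4)
for row in square:
    print(" ".join(row))
print(is_latin_square(square))
```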

Answer 25

Basically a superposition of two orthogonal Latin squares. Useful for studying four factors (3 blocking and 1 treatment, or 2 blocking and 2 treatment)

Answer 26

The number of treatments, t, is greater than the block size, k, so each block is incomplete. The design is balanced because each pair of treatments appears together in the same number of blocks (denoted by lambda)

Answer 27

1) bk = rt 2) r(k-1) = lambda(t-1)
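Both identities can be verified on a small example. The sketch below assumes a simple BIBD formed by taking every subset of size k = 3 from t = 4 treatments (an illustrative choice), then counts the replications r of each treatment and the number of blocks lambda shared by each pair.

```python
from itertools import combinations

t, k = 4, 3                                   # treatments and block size
treatments = range(1, t + 1)
blocks = list(combinations(treatments, k))    # all 3-subsets form a BIBD
b = len(blocks)

# r: number of blocks in which each treatment appears.
reps = {trt: sum(trt in blk for blk in blocks) for trt in treatments}

# lambda: number of blocks in which each pair appears together.
pair_counts = {
    pair: sum(set(pair) <= set(blk) for blk in blocks)
    for pair in combinations(treatments, 2)
}

r = reps[1]
lam = pair_counts[(1, 2)]
print(b * k == r * t, r * (k - 1) == lam * (t - 1))
```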

Answer 28

A split plot should be used in situations where certain factors are hard to change. These hard-to-change factors are treated as whole plot factors, and the subplot factors are varied within each whole plot. Advantages include cost/time effectiveness. Disadvantages include loss of precision in the whole plot treatment comparisons

Answer 29

ANCOVA should be used when auxiliary covariates are available. In an experiment, it may be impractical to create blocks (think continuous variables), so ANCOVA can be used if the correlation between the covariate and the response is high. Essentially we know the covariate term is important, but it is an uncontrollable source of error. In application it can be viewed as a fusion of one-way treatment comparisons and simple linear regression. Advantages include reducing bias and improving sensitivity/reducing error relative to the original models. Disadvantages include fitting two models where the covariate term was accounted for (this might not always be the case).

Answer 30

Used to see how the inclusion or absence of each of several 2-level factors, and their collective effect, influences a response. Would be used for exploratory analysis where linear trends are expected. Advantages include reproducibility and a wider inductive basis because of the symmetry of the experiments. 2-level full factorial experiments are great for preliminary studies and are cost effective. They highlight interactions as well as isolated (main) effects. Disadvantages include the run size when there are many factors (10 factors means 2^10 = 1024 runs) and the inability to observe polynomial terms because there are only two levels.

Answer 31

Balance: each factor level appears in the same number of runs. Orthogonality: for any pair of factors, all level combinations appear the same number of times. Replication: identical treatments are applied to similar experimental units.
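Balance and orthogonality are easy to confirm by counting. The sketch below generates a 2^3 full factorial (three factors named A, B, C for illustration) and tallies how often each level, and each pair of levels, appears.

```python
from collections import Counter
from itertools import combinations, product

factors = ["A", "B", "C"]
design = list(product([-1, 1], repeat=len(factors)))   # 8 runs of the 2^3 design

# Balance: each level of each factor should appear in 4 of the 8 runs.
balance = {
    name: Counter(run[j] for run in design)
    for j, name in enumerate(factors)
}

# Orthogonality: each of the 4 level combinations of any two factors
# should appear in 2 of the 8 runs.
orthogonality = {
    (factors[i], factors[j]): Counter((run[i], run[j]) for run in design)
    for i, j in combinations(range(len(factors)), 2)
}

print(balance)
print(orthogonality)
```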

Answer 32

The difference between the average of all observations at the high level of a factor and the average of all observations at its low level

Answer 33

The change in average response, when changing the level of one factor, depends on the level setting of another factor. There are synergistic and antagonistic interactions.

Answer 34

ME(B|A+) = zbar(B+|A+) - zbar(B-|A+).
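A small numerical sketch may help connect the main effect, the conditional main effect, and the interaction. It assumes a single-replicate 2^2 design with made-up responses, and it uses the common definition INT(A,B) = (1/2)[ME(B|A+) - ME(B|A-)], which is not stated on the cards. The helper names are hypothetical.

```python
# (A, B, response) for a 2^2 design; the response values are hypothetical.
runs = [(-1, -1, 10.0), (1, -1, 14.0), (-1, 1, 12.0), (1, 1, 22.0)]
COLUMN = {"A": 0, "B": 1}

def avg_response(**levels):
    """Average response over the runs matching the given factor levels."""
    selected = [
        run[2] for run in runs
        if all(run[COLUMN[name]] == level for name, level in levels.items())
    ]
    return sum(selected) / len(selected)

# Main effect of B: average at B+ minus average at B-.
me_B = avg_response(B=1) - avg_response(B=-1)

# Conditional main effects of B at each level of A.
me_B_given_A_plus = avg_response(B=1, A=1) - avg_response(B=-1, A=1)
me_B_given_A_minus = avg_response(B=1, A=-1) - avg_response(B=-1, A=-1)

# Interaction: half the difference of the conditional main effects.
int_AB = 0.5 * (me_B_given_A_plus - me_B_given_A_minus)

print(me_B, me_B_given_A_plus, me_B_given_A_minus, int_AB)
```

Here the effect of B is larger when A is at its high level, which is the pattern the interaction term measures.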

Answer 35

(i) It gives the most parsimonious model, that is, with the fewest terms, particularly the omission of higher-order terms like cubic effects and interactions. (ii) There are no unusual patterns in the residual plots. (iii) The transformation has good interpretability.

Answer 36

1) Effect hierarchy principle: lower-order effects are more likely to be important than higher-order effects, and effects of the same order are equally likely to be important 2) Effect sparsity principle: the number of relatively important effects in a factorial experiment is small 3) Effect heredity principle: in order for an interaction to be significant, at least one of its parent factors should be significant

Answer 37

1) Order the factorial effect estimates 2) Plot the ordered factorial effect estimates against the corresponding inverse normal coordinates for (i-.5)/N, i=1,...,N 3) Under H0 all factorial effects = 0, so the normal plot should be a straight line 4) Any point which falls off the line is considered significant

Answer 38

Used because the log transformation transforms multiplicative relationships to additive ones, making them easier to model statistically. It's also easy to transform the sample variance back to its original value by exponentiating it.

Answer 39

1) For 2^q blocks, the block size should divide the run size of the experiment 2) Usually one of the higher-order factorial effects is used to define the assignment to blocks, because of the effect hierarchy principle 3) The block effect estimate will be the main effect of the blocks 4) For more blocks, create more blocking equations 5) One major assumption is that the block-by-treatment interactions are negligible. This assumption states that the mean response for a given treatment does not depend on the block. Without it, factorial effects would not be estimable under the blocking relations.

Answer 40

Confounding: Setting up a relation which connects one design factor with another (in our case a block, eg: B=123 is a confounding relation). Literally means "confused". Aberration: For any blocking scheme b, let g_i(b) be the number of i-factor interactions that are confounded with block effects. For any two blocking schemes b₁ and b₂, let r be the smallest i such that g_r(b₁) does not equal g_r(b₂). Then if g_r(b₁) < g_r(b₂), blocking scheme b₁ has less aberration than b₂. Estimability: Estimability of order e is determined by finding the lowest order of interactions confounded with block effects, named e+1. Therefore estimability of order e ensures that all factorial effects of order e or lower are estimable in the blocking scheme. The best blocking schemes are ones that ensure estimability of order at least 1 and minimum aberration among all blocking schemes.
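The confounding relation B = 123 can be made concrete with a short sketch. It splits a 2^3 design into two blocks by the sign of the 123 interaction, then checks that factor 1 stays balanced within each block (so its main effect is not confounded with blocks) while the 123 contrast is constant within each block (so it is confounded). The variable names are illustrative.

```python
from itertools import product

design = list(product([-1, 1], repeat=3))      # runs of the 2^3 design

# Blocking relation B = 123: the block is the sign of x1 * x2 * x3.
blocks = {-1: [], 1: []}
for run in design:
    blocks[run[0] * run[1] * run[2]].append(run)

for sign, runs_in_block in blocks.items():
    # Factor 1 is balanced within the block when its levels sum to zero.
    factor1_sum = sum(run[0] for run in runs_in_block)
    # The 123 interaction takes a single value within the block.
    int123_values = {run[0] * run[1] * run[2] for run in runs_in_block}
    print(sign, runs_in_block, factor1_sum, int123_values)
```

In the notation of the card, this scheme has g_1 = g_2 = 0 and g_3 = 1.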

Answer 41

2-level fractional factorial designs are subsets of full factorial designs. They have a smaller run size, at the cost of aliasing: some effects cannot be estimated separately and must be described by aliasing relations. We write this as 2^(k-p), where k represents the number of factors and the design is a 2^(-p) fraction of the full 2^k design. Advantages include efficiency both in cost and time. Like a full factorial design it is reproducible and uses symmetry as the basis of its design. Disadvantages include the complexity of aliasing and scheme selection, and the fact that the full space of the experiment is not explored.

Answer 42

Aliasing relation: Describes which factor combinations are confounded with one another. Denoted, for example, I=ABC=BCD. There are 2^(k-p) - 1 sets of aliased effects, matching the 2^(k-p) - 1 degrees of freedom. Word: Any factor combination confounded with the identity I. Resolution: The length of the shortest word in the defining contrast subgroup. It is desirable to have maximum resolution for fractional factorial designs. There are 2^(k-p) runs in a 2^(k-p) experiment.
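Words, the defining contrast subgroup, and resolution can be computed mechanically. The sketch below is a minimal illustration that assumes a 2^(6-2) design with generators E = ABC and F = BCD (an example chosen here, not taken from the cards). Multiplying two words cancels any repeated letter, which is a symmetric difference of letter sets. The function names are hypothetical.

```python
from itertools import combinations

def multiply(word_a, word_b):
    """Product of two words; a letter appearing in both cancels (A*A = I)."""
    return "".join(sorted(set(word_a) ^ set(word_b)))

def defining_contrast_subgroup(generators):
    """All products of one or more generator words (identity excluded)."""
    words = set()
    for size in range(1, len(generators) + 1):
        for combo in combinations(generators, size):
            word = ""
            for generator in combo:
                word = multiply(word, generator)
            words.add(word)
    return sorted(words, key=lambda w: (len(w), w))

# Assumed example: E = ABC and F = BCD give I = ABCE = BCDF.
generators = ["ABCE", "BCDF"]
subgroup = defining_contrast_subgroup(generators)

resolution = min(len(word) for word in subgroup)    # shortest word length
aliases_of_A = sorted(multiply("A", word) for word in subgroup)

print(subgroup, resolution, aliases_of_A)
```

For this example the subgroup has 2^p - 1 = 3 words, all of length four, so the design has resolution IV and the main effect A is aliased only with three-factor and higher-order interactions.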

Answer 43

Clear: A main effect or two-factor interaction is clear if none of its aliases are main effects or two-factor interactions. Strongly clear: a main effect or two-factor interaction is strongly clear if none of its aliases are main effects, two-factor, or three-factor interactions. 1) In any Res IV design, all main effects are clear 2) In any Res V design, all main effects are strongly clear and all two-factor interactions are clear 3) Among Res IV designs, those with the largest number of clear two-factor interactions are best

Answer 44

Aliased ambiguities occur when factorial effects are significant but cannot be distinguished from one another in the experimental data because they are confounded. Plans include: using domain knowledge to rule out effects that are unlikely to be significant, using the hierarchy principle to assume away higher-order effects, and running follow-up experiments using fold-over techniques or optimal design criteria.

Answer 45

The fold-over technique reverses the signs of columns in the design matrix to create a second set of runs with new aliasing relations (this doubles the run size). A new factor (+,-) represents the two halves of the combined design. Use the augmented design matrix to dealias the effects believed to be important, then analyze this design. Starting from a resolution III design, this method is effective for dealiasing all the main effects, or one main effect and all its interactions. The drawbacks are that the scope of dealiasing is limited and that the number of runs must be doubled. There are more efficient ways to accomplish this.
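A tiny example may make the fold-over idea concrete. The sketch below assumes a resolution III 2^(3-1) design with C = AB (chosen for illustration), reverses every sign to obtain the fold-over runs, and checks that A is aliased with BC in the original half but orthogonal to BC in the combined design.

```python
from itertools import product

# Original 2^(3-1) design with generator C = AB, so I = ABC.
original = [(a, b, a * b) for a, b in product([-1, 1], repeat=2)]

# Full fold-over: reverse the sign of every column.
fold_over = [tuple(-level for level in run) for run in original]

combined = original + fold_over

# In the original half, the A column equals the BC column (A is aliased with BC).
aliased_before = all(a == b * c for a, b, c in original)

# In the combined design, the A and BC columns are orthogonal.
inner_product_after = sum(a * (b * c) for a, b, c in combined)

print(aliased_before, inner_product_after, sorted(set(combined)))
```

With only three factors the combined runs happen to form the full 2^3 design; for larger designs the combined design is still a fraction.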

Answer 46

An optimal design approach is a technique for choosing follow-up experiments that dealias ambiguities, for the best model identified, using a particular optimal design criterion. The model in use for optimal design should contain 1) All effects and their aliases judged significant a priori 2) A block variable that accounts for differences in the average value of the response over different time periods between the original experiment and the follow-up experiment 3) An intercept. D-optimal criterion: max_d |X_d'X_d| where d=1,....,2*2^p and p is the number of regressors in the regression equation. D_s-optimal criterion: max_d |X₂'X₂ - X₂'X₁(X₁'X₁)^-1 X₁'X₂|. Can think of these in terms of regression: |X'X| is proportional to the reciprocal of the volume of the confidence ellipsoid for the estimated coefficients, so maximizing |X_d'X_d| is equivalent to minimizing the volume of this confidence ellipsoid (ie more precise estimation).

Answer 47

Minimum aberration criterion supplemented by the number of clear effects

Answer 48

Larger the better problems: 1) Find factor settings that maximize E(y) 2) Find other factor settings that minimize Var(y) Smaller the better problems: 1) Find factor settings that minimize E(y) 2) Find other factor settings that minimize Var(y)

Answer 49

1) Factors may affect the response in a non-monotone fashion. More levels allow the curvature effect to be understood. 2) If a qualitative factor has multiple levels that need to be understood (eg three separate settings on a machine) 3) If there is an initial setting in an optimization problem, then it would make sense to study the space around that setting. Therefore multiple levels would be needed.

Answer 50

The pair of effects has an angle between 0 and 90 degrees.

Answer 51

Orthogonal arrays have better run size economy (fewer runs) and flexibility of factor level combinations

Answer 52

Response surface methodology uses experimentation, modeling, data analysis, and optimization to understand the surface of the response

Answer 53

Central composite design: Corner points, axial points, center points
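The three kinds of points can be generated directly. The sketch below is a minimal illustration in coded units; the function name, the number of center points, and the rotatable choice of axial distance alpha = (2^k)^(1/4) are assumptions made for the example rather than details from the card.

```python
from itertools import product

def central_composite_design(k, alpha, n_center=1):
    """Return corner, axial, and center points for k factors in coded units."""
    # Corner points: the 2^k factorial points at levels -1 and +1.
    corner = [tuple(float(v) for v in run) for run in product([-1, 1], repeat=k)]

    # Axial points: one factor at -alpha or +alpha, all others at 0.
    axial = []
    for j in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[j] = sign * alpha
            axial.append(tuple(point))

    # Center points: all factors at 0, possibly replicated.
    center = [tuple([0.0] * k)] * n_center
    return corner + axial + center

k = 2
alpha = (2 ** k) ** 0.25        # assumed rotatable choice of axial distance
design = central_composite_design(k, alpha, n_center=3)
for point in design:
    print(point)
```

For k = 2 this gives 4 corner points, 4 axial points, and 3 center points, for 11 runs in total.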

Answer 54

Choose control factor settings to make response less sensitive (ie more robust) to noise variation, exploiting control-by-noise interactions

(112 cards)