Lab 8 Flashcards
(36 cards)
There are five components to an experiment:
hypothesis, experimental design, experimental execution, statistical analysis, and interpretation.
experimental design
By experimental design is meant only "the logical structure of the experiment."
A full description of the objectives of an experiment should specify the nature of the experimental units to be employed, the number and kinds of treatments (including “control” treatments) to be imposed, and the properties or responses (of the experimental units) that will be measured.
manner in which treatments are assigned
Once these have been decided upon, the design of an experiment specifies the manner in which treatments are assigned to the available experimental units, the number of experimental units (replicates) receiving each treatment, the physical arrangement of the experimental units, and often, the temporal sequence in which treatments are applied to and measurements made on the different experimental units.
successful execution
requires that the experimenter avoid introducing systematic error (bias) and minimize random error.
In experimental work,
the primary function of statistics is to increase the clarity, conciseness, and objectivity with which results are presented and interpreted.
Statistical analysis and interpretation are the least critical aspects of experimentation, in that if purely statistical or interpretative errors are made, the data can be reanalyzed. On the other hand, the only complete remedy for design or execution errors is repetition of the experiment.
Two classes of experiments may be distinguished:
mensurative and manipulative.
Mensurative experiments
involve only the making of measurements at one or more points in space or time; space or time is the only “experimental” variable or “treatment.”
Tests of significance may or may not be called for.
Usually do not involve the imposition by the experimenter of some external factor(s) on experimental units.
If they do involve such an imposition (e.g., comparison of the responses of high-elevation vs. low-elevation oak trees to experimental defoliation), all experimental units are "treated" identically.
Example 1. We wish to determine how quickly maple (Acer) leaves decompose when on a lake bottom in 1 m of water.
So we make eight small bags of nylon netting, fill each with maple leaves, and place them in a group at a spot on the 1-m isobath.
After 1 mo we retrieve the bags, determine the amount of organic matter lost (“decomposed”) from each, and calculate a mean decomposition rate.
This procedure is satisfactory as far as it goes. However, it yields no information on how the rate might vary from one point to another along the 1-m isobath; the mean rate we have calculated from our eight leaf bags is a tenuous basis for making generalizations about "the decomposition rate on the 1-m isobath of the lake."
Such a procedure is usually termed an experiment simply because the measurement procedure is somewhat elaborate, often involving intervention in or prodding of the system.
If we had taken eight temperature measurements or eight dredge samples for invertebrates, few persons would consider those procedures and their results to be “experimental” in any way.
Example 2. We wish, using the basic procedure of Example 1, to test whether the decomposition rate of maple leaves differs between the 1-m and the 10-m isobaths.
So we set eight leaf bags on the 1-m isobath and another eight bags on the 10-m isobath, wait a month, retrieve them, and obtain our data.
Then we apply a statistical test (e.g., t test or U test) to see whether there is a significant difference between decomposition rates at the two locations.
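The two-sample comparison in Example 2 can be sketched in Python. This is an illustrative computation only, not the essay's own analysis: the bag measurements below are made-up numbers, and Welch's t statistic is implemented directly from its textbook formula (a real analysis would also derive degrees of freedom and a p-value, e.g. with statistical software).

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    se = math.sqrt(va / na + vb / nb)                        # standard error of the difference
    return (statistics.mean(a) - statistics.mean(b)) / se

# Hypothetical fractions of organic matter lost per bag (not real data)
shallow = [0.42, 0.38, 0.45, 0.40, 0.44, 0.39, 0.41, 0.43]  # 1-m isobath
deep    = [0.31, 0.35, 0.30, 0.33, 0.29, 0.34, 0.32, 0.36]  # 10-m isobath

t = welch_t(shallow, deep)
print(round(t, 2))
```

A large |t| would suggest a real difference in means, but, as the following cards argue, what that difference is evidence *of* depends entirely on how the bags were dispersed.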
We can call this a comparative mensurative experiment. Though we use two isobaths (or “treatments”) and a significance test, we still have not performed a true or manipulative experiment.
We are simply measuring a property of the system at two points within it and asking whether there is a real difference (“treatment effect”) between them.
Using Examples 1 and 2 to design a proper mensurative experiment
To achieve our vaguely worded purpose in Example 1, perhaps any sort of distribution of the eight bags on the 1-m isobath was sufficient.
In Example 2, however, we have indicated our goal to be a comparison of the two isobaths with respect to decomposition rate of maple leaves.
Thus we cannot place our bags at a single location on each isobath.
That would not give us any information on variability in decomposition rate from one point to another along each isobath. We require such information before we can validly apply inferential statistics to test our null hypothesis that the rate will be the same on the two isobaths.
So on each isobath we must disperse our leaf bags in some suitable fashion.
There are many ways we could do this. Locations along each isobath ideally should be picked at random, but bags could be placed individually (eight locations), in groups of two each (four locations), or in groups of four each (two locations).
Furthermore, we might decide that it was sufficient to work only with the isobaths along one side of the lake, etc.
Assuring that the replicate samples or measurements are dispersed in space (or time) in a manner appropriate to the specific hypothesis being tested is the most critical aspect of the design of a mensurative experiment.
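The three dispersion schemes mentioned above (eight single bags, four pairs, or two groups of four) amount to a random draw of distinct locations along the isobath. A minimal sketch, with hypothetical shoreline coordinates standing in for a real lake map:

```python
import random

random.seed(1)  # fixed seed so this sketch is reproducible

# Hypothetical candidate spots (metres along the isobath); in practice
# these would come from a survey of the actual lake.
candidate_spots = list(range(0, 400, 10))

def place_bags(n_locations, bags_per_location):
    """Pick random, distinct locations and assign bags to each."""
    spots = random.sample(candidate_spots, n_locations)
    return {spot: bags_per_location for spot in sorted(spots)}

# The three dispersion schemes, each using eight bags in total:
eight_singles = place_bags(8, 1)  # eight locations, one bag each
four_pairs    = place_bags(4, 2)  # four locations, two bags each
two_quads     = place_bags(2, 4)  # two locations, four bags each
```

Any of the three layouts spreads the bags over the isobath; they differ only in how the eight bags are grouped across the randomly chosen locations.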
Example 3. Out of laziness, we place all eight bags at a single spot on each isobath.
It will still be legitimate to apply a significance test to the resultant data.
However, and the point is the central one of this essay, if a significant difference is detected, this constitutes evidence only for a difference between two (point) locations; one "happens to be" a spot on the 1-m isobath, and the second "happens to be" a spot on the 10-m isobath.
Such a significant difference cannot legitimately be interpreted as demonstrating a difference between the two isobaths, i.e., as evidence of a "treatment effect."
For all we know, such an observed significant difference is no greater than we would have found if the two sets of eight bags had been placed at two locations on the same isobath.
Pseudoreplication
If we insist on interpreting a significant difference in Example 3 as a “treatment effect” or real difference between isobaths, then we are committing what I term pseudoreplication.
Pseudoreplication may be defined, in analysis of variance terminology, as the testing for treatment effects with an error term inappropriate to the hypothesis being considered.
In Example 3 an error term based on eight bags at one location was inappropriate.
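Why that error term is inappropriate can be shown with a small simulation. The variance components below are made-up for illustration: decomposition is assumed to vary a lot from spot to spot along an isobath, and only a little among bags at one spot, so the bag-to-bag spread at a single location badly understates the variability relevant to "the isobath."

```python
import random
import statistics

random.seed(42)

# Assumed (illustrative) variance components
SPOT_SD = 5.0  # spot-to-spot standard deviation along an isobath
BAG_SD  = 1.0  # bag-to-bag standard deviation within one spot

def bags_at_one_spot(n=8):
    """Simulate n bag measurements sharing a single spot's effect."""
    spot_effect = random.gauss(0, SPOT_SD)
    return [spot_effect + random.gauss(0, BAG_SD) for _ in range(n)]

# Error term implied by Example 3: eight bags at one spot
within_spot_sd = statistics.stdev(bags_at_one_spot())

# Variability actually relevant to the hypothesis: spot-to-spot spread
spot_means = [statistics.mean(bags_at_one_spot()) for _ in range(200)]
between_spot_sd = statistics.stdev(spot_means)

print(within_spot_sd, between_spot_sd)
```

Under these assumptions the within-spot error term is several times too small, which is exactly how pseudoreplication manufactures spurious "significant" differences between locations.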
Pseudoreplication in mensurative experiments
In mensurative experiments generally, pseudoreplication often results when the actual physical space over which samples are taken or measurements are made is smaller or more restricted than the inference space implicit in the hypothesis being tested.
In manipulative experiments, pseudoreplication most commonly results from use of inferential statistics to test for treatment effects with data from experiments where either treatments are not replicated (though samples may be) or replicates are not statistically independent.
Pseudoreplication thus refers not to a problem in experimental design (or sampling) per se but rather to a particular combination of experimental design (or sampling) and statistical analysis which is inappropriate for testing the hypothesis of interest.
MANIPULATIVE EXPERIMENTS
Whereas a mensurative experiment may consist of a single treatment (Example 1), a manipulative experiment always involves two or more treatments, and has as its goal the making of one or more comparisons.
The defining feature of a manipulative experiment is that the different experimental units receive different treatments and that the assignment of treatments to experimental units is or can be randomized.
Note that in Example 2 the experimental units are not the bags of leaves, which are more accurately regarded only as measuring instruments, but rather the eight physical locations where the bags are placed.
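The randomized-assignment step that defines a manipulative experiment can be sketched as follows; the unit names and treatment labels here are hypothetical (borrowing the defoliation example from earlier), and the point is only that every unit could, in principle, have received either treatment.

```python
import random

random.seed(7)  # fixed seed so this sketch is reproducible

# Eight hypothetical experimental units (e.g., physical locations)
units = [f"site-{i}" for i in range(1, 9)]

# Four replicates of each of two treatments
treatments = ["defoliated"] * 4 + ["control"] * 4

# Randomized assignment of treatments to experimental units
random.shuffle(treatments)
assignment = dict(zip(units, treatments))

for unit, trt in assignment.items():
    print(unit, trt)
```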
Critical features of a controlled experiment
Manipulative experimentation is subject to several classes of potential problems.
These are listed in Table 1 as "sources of confusion"; an experiment is successful to the extent that these factors are prevented from rendering its results inconclusive or ambiguous.
It is the task of experimental design to reduce or eliminate the influence of those sources numbered 1 through 6.
For each potential source there are listed the one or more features of experimental design that will accomplish this reduction.
Most of these features are obligatory.
Refinements in the execution of an experiment may further reduce these sources of confusion.
However, such refinements cannot substitute for the critical features of experimental design: controls, replication, randomization, and interspersion.
One can always assume that certain sources of confusion are not operative and simplify experimental design and procedures accordingly.
This saves much work.
However, the essence of a controlled experiment is that the validity of its conclusions is not contingent on the concordance of such assumptions with reality.
Against the last source of confusion listed (Table 1), experimental design can offer no defense.
Table 1. Potential sources of confusion in an experiment and means for minimizing their effect

- Temporal change: control treatments
- Procedure effects: control treatments
- Experimenter bias: randomized assignment of experimental units to treatments; randomization in conduct of other procedures; "blind" procedures (usually employed only where measurement involves a large subjective element)
- Experimenter-generated variability (random error): replication of treatments
- Initial or inherent variability among experimental units: replication of treatments; interspersion of treatments; concomitant observations
- Nondemonic intrusion: replication of treatments; interspersion of treatments
- Demonic intrusion: eternal vigilance, exorcism, human sacrifices, etc.
Controls.
"Control" is another of those unfortunate terms having several meanings even within the context of experimental design.
In Table 1, I use control in the most conventional sense, i.e., any treatment against which one or more other treatments is to be compared.
It may be an “untreated” treatment (no imposition of an experimental variable), a “procedural” treatment (as when mice injected with saline solution are used as controls for mice injected with saline solution plus a drug), or simply a different treatment.
At least in experimentation with biological systems, controls are required primarily because biological systems exhibit temporal change.
If we could be absolutely certain that a given system would be constant in its properties, over time, in the absence of an experimentally imposed treatment, then a separate control treatment would be unnecessary.
Measurements on an experimental unit prior to treatment could serve as controls for measurements on the experimental unit following treatment.
In many kinds of experiments, control treatments have a second function:
to allow separation of the effects of different aspects of the experimental procedure.
Thus, in the mouse example above, the “saline solution only” treatment would seem to be an obligatory control.
Additional controls, such as “needle insertion only” and “no treatment” may be useful in some circumstances.
A broader and perhaps more useful (though less conventional) definition of “control” would
include all the obligatory design features listed beside "Sources of confusion" numbers 1-6 (Table 1). "Controls" (sensu stricto) control for temporal change and procedure effects.
Randomization controls
Randomization controls for (i.e., reduces or eliminates) potential experimenter bias in the assignment of experimental units to treatments and in the carrying out of other procedures.
Replication controls
Replication controls for the stochastic factor, i.e., among-replicates variability inherent in the experimental material or introduced by the experimenter or arising from nondemonic intrusion.
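How replication controls the stochastic factor can be shown with a quick simulation: the spread of a treatment mean across many hypothetical repeat experiments shrinks roughly as 1/sqrt(n) as replication increases. The error SD of 1.0 below is an arbitrary illustrative value.

```python
import random
import statistics

random.seed(0)

def mean_of_replicates(n, sd=1.0):
    """Mean response of n replicate units subject to random error of SD sd."""
    return statistics.mean(random.gauss(0, sd) for _ in range(n))

# Spread of the treatment mean across 2000 simulated repeat experiments,
# for two levels of replication:
spread_2 = statistics.stdev(mean_of_replicates(2) for _ in range(2000))
spread_8 = statistics.stdev(mean_of_replicates(8) for _ in range(2000))

print(round(spread_2, 2), round(spread_8, 2))
# Theory: the spread of the mean shrinks as sd / sqrt(n)
```

With four times the replicates, the among-replicates noise in the treatment mean is roughly halved.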
Interspersion controls
Interspersion controls for regular spatial variation in properties of the experimental units, whether this represents an initial condition or a consequence of nondemonic intrusion.
In this context it seems perfectly accurate to state that, for example, an experiment lacking replication is also an uncontrolled experiment; it is not controlled for the stochastic factor.
The custom of referring to replication and control as separate aspects of experimental design is so well established, however, that “control” will be used hereafter only in this narrower, conventional sense.
A third meaning of control in experimental contexts is
regulation of the conditions under which the experiment is conducted.
It may refer to the homogeneity of experimental units, to the precision of particular treatment procedures, or, most often, to the regulation of the physical environment in which the experiment is conducted.