Flashcards in Stats Deck (104):
How are the mean and standard deviation affected if a constant is subtracted from every score?
All four operations (add, subtract, multiply, divide by a constant) affect the mean, but only multiplication and division affect the standard deviation.
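A quick check of this rule, as a minimal sketch with made-up scores (the list `scores` and the constants 3 and 2 are illustrative assumptions, not from the deck):

```python
from statistics import mean, pstdev

scores = [4, 8, 6, 5, 7]            # hypothetical raw scores
shifted = [x - 3 for x in scores]   # subtract a constant from every score
scaled = [x * 2 for x in scores]    # multiply every score by a constant

# Subtracting moves the mean but leaves the spread alone.
print(mean(scores), pstdev(scores))
print(mean(shifted), pstdev(shifted))
# Multiplying changes both the mean and the spread.
print(mean(scaled), pstdev(scaled))
```

Subtracting 3 lowers the mean by 3 and leaves the std deviation unchanged; multiplying by 2 doubles both.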
What happens to statistical significance when there is a large sample size?
Large sample size makes it more likely to find statistical significance. As size grows, significance can be found for even very small differences.
Criterion contamination. Refers to:
1. Hi validity coef bc ratings contaminated by knowledge of predictor.
2. Underestimate of validity coef bc criterion rating contaminated by knowledge of predictor
3. Carryover effects
4. Due to low reliability
1. It occurs when the rating on criterion is affected by knowledge of score on predictor.
Ie may inflate grades of hi IQ kids causing hi correlation between iq and grades.
Using an ANOVA a pooled error term is justified when:
1. Sample size is unequal
2. Variance is equal
3. All cells have the same number of subjects
4. Homoscedasticity is violated.
2. Pooled error term is used when there is homogeneity of variance (equal). Homoscedasticity is also equal variance.
When things are equal they can be pooled. When unequal, treat them separately.
Match definitions below with std deviation, std error of measurement, std error of estimate, std error of the mean
1. Ave amt of error in predicting each persons score
2. Ave amt of error in calculating each score
3. Ave amt of error in grp means iq score
4. Ave amt of spread in grp scores.
1. Predicting is std error of estimate
2. Std error of measurement...ave amt of error in each persons score.
3. Std error of the mean...in relation to the population mean
4. Std deviation.
What is the most significant problem in using a series of t tests to analyze a data set?
1. Experimenter wise error Type I
2. Difficulty distinguishing intx and main effects
3. Low power (1 - beta)
4. Violating parametric assumptions
1. A series of t tests inflates the experimentwise type I error, which accumulates roughly additively across tests. ANOVA controls it.
The more tests you do, the more chance of error.
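How fast the experimentwise error grows can be sketched as below. This assumes the tests are independent, which a set of t tests on the same data usually is not, so treat the numbers as a rough guide. The function name `familywise_error` is made up for illustration.

```python
def familywise_error(alpha: float, n_tests: int) -> float:
    """Chance of at least one type I error across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests

# With alpha = .05 per test, the overall risk climbs with every extra test.
for k in (1, 3, 6, 10):
    print(k, round(familywise_error(0.05, k), 3))
```

For small alpha the result is close to alpha times the number of tests, which is why the error is described as additive.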
How are the mean and std deviation affected if a constant is subtracted from every score?
1. Both decrease
2. Both increase
3. Mean decreases, std dev same
4. Mean same, std dev decreases
Single subject research design, which is the most significant problem?
3. Regression to the mean
4. Practice effects
The association between two variables, when ea variable's association w another variable has been removed, is known as:
A.analysis of covariance
B. partial correlation
C. Semi-partial correlation
D. Coefficient of determination
B. this is correlation between 2 variables when the association between a third variable and ea of the two original variables has been partialed out.
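A minimal sketch of the first-order partial correlation formula, which takes the three pairwise correlations as input. The function name `partial_corr` and the example values are illustrative assumptions.

```python
from math import sqrt

def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with the third variable z partialed out of both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# If z is unrelated to x and y, partialing it out changes nothing.
print(partial_corr(0.5, 0.0, 0.0))
# If z correlates with both, the x-y association shrinks once z is removed.
print(partial_corr(0.5, 0.5, 0.5))
```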
Cluster sampling involves what kind of clusters?
Naturally occurring groups, with the clusters then randomly selected. Typically all the subjects within the selected clusters are sampled.
Standard errors of mean, measurement, and estimate express error in terms of:
1. Standard deviation
2. Sampling error
3. Systematic error
4. Testing situation
1. All std error express in terms of std deviation.
Sampling error applies to std error of mean only
Systematic...applies to none
Testing situation...source of error for std error of measurement only.
Std error of measurement is significantly influenced by reliability coefficient, the range of std error of measurement is:
A. -1 to 1
B. 0 to 1
C. 0 to sdx
D. 0 to sdy
Range of validity coefficient is -1 to 1
Reliability coefficient is 0 to 1.
D is range of std error of estimate.
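The ranges follow from the standard formulas, sketched below. The function names `sem` and `see` and the SD of 15 are illustrative assumptions; the formulas are SEM = SDx * sqrt(1 - reliability) and SEE = SDy * sqrt(1 - validity squared).

```python
from math import sqrt

def sem(sd_x: float, reliability: float) -> float:
    """Std error of measurement from the test's SD and reliability coefficient."""
    return sd_x * sqrt(1 - reliability)

def see(sd_y: float, validity: float) -> float:
    """Std error of estimate from the criterion's SD and validity coefficient."""
    return sd_y * sqrt(1 - validity ** 2)

# Perfect reliability gives 0 error; zero reliability gives the full SD of x.
print(sem(15, 1.0), sem(15, 0.0))
# A validity of +1 or -1 gives 0 error; a validity of 0 gives the full SD of y.
print(see(15, 1.0), see(15, -1.0), see(15, 0.0))
```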
Taylor Russell tables evaluate incremental validity: base rate, selection ratio, criterion related validity
A. Construct validity
B. criterion related validity
C. Test retest
D. Internal consistency reliability
Selection ratio...ratio of number of openings to number of apps
Low optimizes incremental validity
Higher criterion validity, better incremental.
2. Criterion related validity
For each a score, variability of b scores is equal to total variability of b scores. Conclude:
A. Error; unlikely
B. moderate positive correlation
C. Mod negative correlation
D. No correlation
D...for any a, end up w all possible b. scatter plot
Knowing a tells u nothing of b
Which will increase std error of mean?
Increase sd of population and decrease sample size
Decrease sd of population and increase sample size
1. Std error of mean has direct relationship w sd of population and inverse relationship w sample size
Std error of mean increases when sd of pop is increased and n is reduced.
Increasing test length:
A. Affects reliability only
B. reliability more than validity
C. Equal effect
B. effect on both but validity has a ceiling due to the reliability.
Two way ANOVA find differences. U conclude:
A. Main effects and may/not have intx effects
B. intx effects and may/not have main effects
C. Can be any combo of main effects and interactions
D. Neither main effects or interaction effects because may be due to chance.
C. 3 F ratios (two possible main effects and a possible intx effect), and any combo of them can be significant. Can have intx but no main effects etc. A one way ANOVA can't detect an intx effect.
Abab design the concern is:
A. History and maturation
B. regression and diffusion
C. Failure of IV to return to baseline
D. Failure of dv to return to baseline
Which circumstance would it be problematic to use chi square?
A. When looking for differences between groups
B. ordinal data
C. Repeated observations made
D. More than one iv
C. One of the main assumptions is independence of observations. Can't use when repeated observations are made, like a pre and post test
Chi square is non parametric test of differences used for nominal or categorical data. Can use with ordinal. Use multiple chi when more than one IV
Shape of a z score distribution is:
Can't be determined
Shape follows the raw score distribution which is not given.
Flat is for percentile ranks.
Single subjects design involve an approach:
Idiographic describes single subject approaches
Nomothetic group approach
Normative...data compared with in and between subjects
Ipsative...forced choice format. Only gives strengths and interests within a subject and can't be used for comparisons between subjects.
3 levels of an IV and a continuous dv should be analyzed using what stats?
One way ANOVA
1. One IV w 3 levels; 1 DV continuous or scored numerically
One way ANOVA used w 1IV and 1DV
Chi is nominal data or categorical
Manova had more than one dv
Two way ANOVA is 2 IV and one dv
Factorial ANOVA more than 1 IV and 1 dv
Relationship between education and income for clinical psychologists is?
Correlation between education and income in general?
1. Zero. Restricted range
2. Broader .3 to .5
Changes in the Variable causes changes in the
IV is input and causes changes in
Dv is output
IV is manipulated
Dv is measured
Correlational research variables are not manipulated. Input variable is IV . Called predictor variables.
Outcome variables are dv or criterion variables.
Regardless...what effect does (IV) have on (Dv).
What is a factorial experimental design?
Adv of factorial?
More than one IV where every level of one IV is combo w every other level of IV .
What are the adv ? Statistical in nature.
What is internal validity?
What are the threats to internal validity?
How do you control for threats to internal validity?
Allows the conclusion that there is a causal relationship between the IV and dv.
Or if can conclude that no effect.
Threats: (hims teds)
Factors other than IV are responsible for changes in dv:
History (external event)
Maturation (internal event..fatigue, bored, hunger)
Testing (experience w pretest)
Instrumentation (change nature of it)
Statistical regression (less extreme scores when retested)
Selection (preexisting subject characteristics)
Differential mortality (diff of drop outs and non drop outs)
Experimenter bias (expectation or other bias)
1. Random assignment (equivalent on extraneous factors)
2. Matching or grp similar subjects on extraneous and randomly assign
3. Blocking or study as if extraneous is another IV
4. Hold extraneous variable constant or use only homogeneous subjects
5. Ancova...like post hoc matching
What does a confound mean?
Experiment contaminated by an extraneous variable is confounded.
What is a threat to internal validity for a pretest/post test design? This is a one group before and after design.
Testing...when pre and post are similar may show improvement due to experience w the test. Test wise
Instrumentation...raters may have improved by post test
Pygmalion in the classroom:
1. Experimental expectancy
2. Impact of maturation
3. Confounding variable
4. Unequal selection of students
Another name for it?
1. Correct! Teachers' preconceived ideas of a student's abilities resulted in the grades and even iq scores moving in the expected direction even though the students hadn't changed.
3. It is a confounding variable! Yes
Also called rosenthal effect. Behavior of subjects changes due to expectancies.
Overcome w double blind study
What is the difference between random assignment and random selection?
Random selection or random sampling is selecting subjects into a study. All members of the population under study have equal chance to be selected to participate. (External validity)
Random assignment is after they have been selected. For subjects already selected the probability of being assigned to ea grp is the same. Great equalizer!
When is matching used?
Controls for effects of a specific extraneous variable. Identify subjects thru a pretest who are similar on an extraneous variable, group and randomly assign.
Good when sample size is small and random assignment cannot be counted on to ensure equivalency.
How is blocking done to control for confounding variables?
Involves studying the effects of an extraneous variable, usually a preexisting characteristic (gender, iq) to determine if and degree acct for scores on dv.
Make extraneous a IV
Ie. divide into blocks...hi and lo iq then randomly assign to IV. Now have iq and tx as 2 IV.
Different from matching bc
Matching ensures equivalence. Number of Ivs stay the same.
Blocking determines the effects of the extraneous variable. Also adding a IV.
Discuss holding the extraneous variable constant as a way to manage internal validity and Ancova.
Holding the extraneous variable constant eliminates the effects of an extraneous variable.
Include only homogeneous subjects
Ie only the high iq peeps
Ancova is a post hoc matching after data are obtained. Dv scores adjusted so subjects are equalized
Disadv..like matching..can't control for what has not been identified or measured.
What is external validity?
What are some threats to external validity?
Some may overlap w internal validity. Understand concept not need to know which classification they go to.
Generalizability of the results to other settings, times, people...
Interaction (some variables have one effect under one set of conditions but a different effect under another set).
Intx between selection and tx:
Given tx not generalize to other members of population (ie use college kids may not go to rest of population).
Intx between hx and tx:
Effects of tx don't generalize beyond setting or time pd expt done.
Intx between testing and tx:
Pretest sensitization..can't generalize to sit where pretests not used. Pretest may sensitize to purpose or increase susceptibility to respond to the tx.
Cues in research setting that allow to guess research hypothesis. Due to cues, subjects act different than real world (try disprove..)
Respond different just due to mere fact being studied. Study..workers increased output following any change in the environment.
Order effects or carry over or multiple treatment interference
Problem in repeated measures design or same subjects exposed to more than one tx. Last tx may have greater effect bc it followed previous interventions.
How do you increase external validity?
1. Random selection or random sampling. Often use experimentally accessible population and assumption is made that subjects similar in relevant ways to rest of target population.
Stratified random sampling.. sample from several subgroups of total population. To ensure proportionate rep of defined pop
Cluster sampling. Natural occurring group of individuals vs individual.
2. Naturalistic research
Controls for Hawthorne effect and demand characteristics but will lack internal validity (always a trade off).
3. Single and double blind
Reduce demand characteristics and Hawthorne effects
Controls for order effects.
Diff subjects or grps receive tx in diff order. Type is Latin square design...order administration of tx so ea appears once and only once in every position.
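One simple way to build such an ordering is to rotate the treatment list, sketched below. The function name `latin_square` and the treatment labels are illustrative assumptions. A rotated square puts each treatment once in every position, but it does not balance which treatment follows which.

```python
def latin_square(treatments):
    """Rotate the treatment list so each one appears once in every position."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]

# Each row is the order of administration for one group of subjects.
orders = latin_square(["A", "B", "C"])
for group, order in enumerate(orders, start=1):
    print(group, order)
```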
Experimental vs quasi experimental vs correlational research.
Exptal..random assignment; manipulate variables
Quasi ..NO random assignment
Pre existing grps. Naturally occurring. Manipulable variable studied (decide which grp gets which tx) so experimenter control. Use w preexisting intact grps like classroom, ward..
Correlational..grps measured only
Not manipulated. No internal validity. Only associations.
Used for prediction.
Variables like age, gender, ses, eye color...
Discuss the differences between the developmental designs. Longitudinal, cross sectional, and cross sequential. Review drawbacks of each.
Longitudinal..study one grp of subjects over a long pd of time.
Disadv Time, money, dropout rate.
Cross sectional..study two or more grps of ppl at one time
Disadv...cohort effects..experience effect vs their age
Cross sequential..combo of the above..look at ppl of diff ages at 2 plus times
Controls for cohort. Less expensive than longitudinal, decreases drop out.
What is a time series design?
Interrupted time series design?
Advantage of multiple measurements?
Multiple measurements over time to assess IV
Usually multiple pre and post test measures
Interrupted time series design is series of measurements on dv that is interrupted by administration of tx
One grp interrupted time series design. Threat is an event that occurs at same time as tx. Ie. price of cigs went up when administered. Control history with two grp time series design w a control grp that also look at over time.
Adv? Rule out threats to internal validity like maturation, regression, testing.
What is a single subject design?
Number of subjects is one or if two or more subjects treated as one group.
Dependent variable measured several times during both phases
Lots of variability poses threat to design.
Good for behavior modification study.
Usually baseline (no tx) and then tx phase.
Types: AB, reversal, multiple baseline
Describe the single subject designs.
AB single baseline, single tx phase
Disadv..history, other confound
Reversal or withdrawal design
Single subject design
Treatment is withdrawn and data collected to det if behavior goes back to original level upon withdrawal. If returns to original level during wdrawal then more sure due to tx vs extraneous factors
ABAB. Tx reapplied at end. Adv over ABA is additional confirmation tx causes changes. Also don't leave them wo tx.
Multiple baseline used when reversal can't be. So can't reverse or withdraw tx due to ethics. May not be possible to demo tx effect in a reversal. So instead of a withdrawal, this applies the tx sequentially or across different baselines. In other words...single subject design in which IV is sequentially administered across 2 or more subjects, behaviors, or settings (baselines)
What is qualitative research?
Talk about the types, especially surveys and case studies.
Theory is developed from the data vs being derived beforehand.
Often pilot to help define ho
Surveys..personal, phone, mail
Biased selection or sampling is problem
Case studies ...case is example of more general class. Can't draw conclusions about relationships between variables and may not be generalizable.
Protocol analysis ...verbatim reports
No traditional quantitative techniques but based on interpretation.
What are the differences between stratified random sampling and clustering?
Stratified random sampling is a population divided into sub populations and all members of ea sub population have an equal probability of being chosen.
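Cluster sampling selects naturally occurring grps (vs individuals); the clusters are chosen at random and typically all subjects within them are sampled. (Grouping subjects who are similar on an extraneous variable and then assigning them to ea grp is matching, not clustering.)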
Greatest threat to validity for a mail survey is?
Defining feature of a true experimental design is
Random selection of subjects from the population
Random assignment of subjects into experimental grps
Use of manipulated variables
Use of non manipulated variables
Not c because other designs use manipulated variables
All are true of multiple baseline except?
Tx sequentially applied
May serve as a substitute when the ABAB design is unethical
May involve studying the same tx for different behaviors in diff settings, or w different subjects
Involve the administration and then withdrawal of tx
D. No withdraw of tx
Major threat to internal validity of a one grp time series design is
Regression to the mean
C. Administer multiple pretests and post tests to one grp of subjects before and after tx. Design controls for many threats such as maturation, testing, stat regression .
History (an external event that occurs at the same time as the tx) is the threat.
Major advantage and then a major disadvantage of case studies:
Can identify variables for future research
Involve study of one individual
Do not permit conclusions about causal relationships to be made
Permit generalization of results
Name the scales of measurement.
Dx, color, sex. Can be labeled w numbers but not ordered.
Ranks. Attitude scales.
Don't know how much more or less
Tell how ordered but not amount between the categories
Interval....numbers arranged in order and intervals in between are equal
No absolute zero pt
Can't multiply or divide; can add and subtract (say 50 points higher but can't say twice as smart)
Iq, standardized test scores, temps
Ratio....numbers arranged in order and intervals between are equal.
Absolute zero pt
Can add, subtract, multiply, divide
Dollar amounts, time, distance, height, wt...
A normal distribution is:
Called the bell shaped curve
Greatest number of cases fall close to the mean
How most variables are distributed in the population
Define negative distribution:
Define positive distribution
Tail tells the tale...location of the tail determines labeling
Most scores at high end
Tail at lo end (few lo scores) so neg
Mode greater than median greater than mean.
Most scores low end (left side)
Tail at hi end bc very few hi scores... .so positive
Mean is greater than median greater than mode
Mean pulled to the tail.
Describe the measures of central tendency
Mean is average.
Add all and divide by N.
Sensitive to extreme values and misleading when highly skewed.
When ordered lowest to highest.
Less sensitive to extreme scores
Mode...most frequent value
Can be bimodal or multimodal.
Measures of variability.
Difference between the highest and lowest score
Affected by extreme scores
Tells nothing of distribution
Only a general descriptor.
Average of the squared differences from the mean of each score
Measure of variability of scores
Basically how the scores disperse around the mean.
Sample variance is sum (x-M) squared divided by N-1.
Probably don't need to know formula. Just that variance is the square of the std deviation.
Square root of variance
Thought of as expected deviation from the mean if a score chosen at random.
Higher the std deviation the more that scores in a distribution are likely to deviate from the mean.
Define z score and how it is calculated.
A z score is a measure of how many std deviations a given raw score is from the mean.
When all scores in the distribution are converted to z scores there will be a mean of zero and std deviation of 1. All scores below the mean will have negative z scores, all scores above the mean are positive, all scores at mean are 0.
When raw scores are transformed to z scores the shape of the distribution does not change. This is called a linear transformation.
Z score is x minus M divided by standard deviation.
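The formula and the mean-of-0, SD-of-1 property can be checked with a short sketch. The raw scores are made up, and the population SD (`pstdev`) is used here as an assumption about which SD the card means.

```python
from statistics import mean, pstdev

raw = [52, 60, 64, 70, 74]           # hypothetical raw scores
m, sd = mean(raw), pstdev(raw)       # mean and std deviation of the distribution

# z = (x - M) / SD for every score
z = [(x - m) / sd for x in raw]

print([round(v, 2) for v in z])
print(round(mean(z), 6), round(pstdev(z), 6))
```

Scores below the mean of 64 come out negative, the score at the mean comes out 0, and the z scores have a mean of 0 and a std deviation of 1.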
Define t scores
T score is a standard score w mean of 50 and std deviation of 10.
Stanines divide distribution into 9 intervals. One is the lowest interval, nine the highest...
Mean of 5, std dev of about 2.
Percent scoring below attained raw score.
Flat or rectangular distribution. So within a given range always same number of scores
Changes shape of distribution.
70 percentile is 70 percent scored below you
(Percentage is items answered correctly on a test; percentile is referenced to other scores in the distribution).
See graph on yellow sheets on how to convert and equivalents on the normal distribution.
True or false
A change in raw score in the middle of the normal distribution results in a much greater change in percentile rank than the same raw score change at the distribution's extreme.
In middle if your score is increased you jump over a lot more people. At hi end of distribution there are only a few scores and will jump over many fewer people.
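A sketch of the same point with numbers. The mean of 100, SD of 15, and the 3 point gain are assumptions chosen only for illustration.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)   # assumed normal distribution of scores

def percentile_rank(score: float) -> float:
    """Percent of the distribution scoring below the given raw score."""
    return 100 * dist.cdf(score)

# Same 3 point raw score gain, once at the mean and once two SDs above it.
gain_middle = percentile_rank(103) - percentile_rank(100)
gain_tail = percentile_rank(133) - percentile_rank(130)
print(round(gain_middle, 1), round(gain_tail, 1))
```

The gain at the mean is several percentile points, while the same gain two SDs out is under one point.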
Converting raw scores to z scores would be:
Linear transformation bc shape of distribution changes
Linear transformation bc shape doesn't change
Nonlinear bc shape changes
Nonlinear bc shape doesn't change
Percentile is nonlinear
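1000 ppl take job test with a mean of 60 and std dev of 5. They want top 150. If the distribution is normal, the cut off is
B. Top 150 of 1000 is 15 percent, close to the roughly 16 percent who score above a z score of 1, so the cut off is about 65 (60 + 5).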
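The cut off can also be computed directly rather than approximated, as in this sketch using the numbers from the card (the variable names are illustrative):

```python
from statistics import NormalDist

test_scores = NormalDist(mu=60, sigma=5)   # mean 60, std dev 5 from the card
top_fraction = 150 / 1000                  # want the top 150 of 1000

# Score that leaves 15 percent of the distribution above it.
cutoff = test_scores.inv_cdf(1 - top_fraction)
z_cutoff = (cutoff - 60) / 5
print(round(cutoff, 1), round(z_cutoff, 2))
```

The exact z is slightly above 1, so the cut off lands a little above 65, which agrees with the rule-of-thumb answer.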
Judy scores in 48 percentile. John at 93 percentile. Normal distribution but error. Each gets 3 pts Added to score but no one else's. what is true?
Both percentile ranks increase the same
Judy's will increase more
Johnny's will increase more
Neither will change.
B. more scores in middle of distribution and she will jump over more people.
Inferential stats means
What is an invariable result of using samples to study populations?
Standard error of the mean
Type I error
Inferential stats allow us to make inferences about what is happening in an entire population on the basis of what we observe in the sample.
Sample stats only provide estimates of corresponding populations (sample value is called statistic and population value is called parameter).
A. Sampling error is the discrepancy between a sample value (stat) and the population value (parameter). When use a stat some error is inevitable.
One type of sampling error is standard error of the mean.
Define standard error of the mean
Type of sampling error
Difference between a sample mean and a population mean or extent to which a sample mean can be expected to deviate from its corresponding population mean.
Also called error of the mean
Must know formula!!!
SE mean equals std dev divided by square root of N (sample size).
N is 25 and sd is 10. Error is 2.
So sample obtained can be expected to deviate 2 points either way higher or lower from the population mean.
Error is smaller as sample size larger bc get closer to size of the population.
INVERSE relationship...as sample size increase, std error of mean decreases!
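The formula in a short sketch, using the card's own numbers plus a 400-subject example (the function name `standard_error_of_mean` is illustrative):

```python
from math import sqrt

def standard_error_of_mean(sd: float, n: int) -> float:
    """Std error of the mean: std deviation divided by the square root of N."""
    return sd / sqrt(n)

print(standard_error_of_mean(10, 25))    # 2.0
print(standard_error_of_mean(20, 400))   # 1.0
# Same sd, larger sample: the error shrinks.
print(standard_error_of_mean(10, 100))   # 1.0
```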
Define null and alternative hypothesis
Define one and two tailed tests.
Null hypothesis..no difference in tx conditions. So in population the IV has no effect on dv.
Alternative hypothesis...states the opposite of the null.
Usually predicts a relationship between variables
One tailed test..stat test used when the alternative ho is directional (one mean is greater than another). Greater than or less than another mean.
Two tailed...means are different but we don't know the direction.
What is type I error? Type II error?
1. Erroneously accept the null hypothesis. No difference.
2. Probability of rejecting a false null hypothesis. Correctly detect tx effect.
3. Nondirectional alternative hypothesis
4. Erroneously reject the null hypothesis. Say difference exists but in reality no difference exists. Finding something when it is not there.
1. Type II
2. Power. Probability of not making a type II error.
3. Two tailed test
4. Type I error
Discuss the retention region and the rejection region.
Retention region is the white area of the graph. If the value falls there the null ho is retained. Meaning it is kept.
If the value falls in the rejection region (defined by the alpha level), the null hypothesis is rejected. This is because, if the null were true, a result this extreme would be expected only 5% of the time or less, so we reject it. When we reject it, we say the result reached significance at that level.
Say reject the null at the .05 level
The significance level (alpha) is the cutoff for rejection: the probability of rejecting the null when it is actually true.
What factors affect power? T or F?
A. As alpha increases, power increases.
B. as alpha decreases, power increases.
C. Two tailed tests are more powerful than one.
D. Larger the sample size, greater the power.
E. smaller the difference between the population means, more likely to detect these differences (more power).
A. True. Higher the alpha easier to reject null
C. False. One tailed tests more powerful.
E. false. Greater diff between pop means more power. Can make the difference between the IV bigger. Drink glass of wine vs bottle of whiskey..
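These effects on power can be sketched for the simplest case, a one-sample z test with a known sd. That choice of test, the function name `power_z_test`, and the numbers are assumptions for illustration; the same directions hold for t tests and ANOVA.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_z_test(mean_diff, sd, n, alpha=0.05, tails=1):
    """Power of a one-sample z test when the true mean difference is mean_diff > 0."""
    shift = mean_diff * sqrt(n) / sd                 # true difference in std error units
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / tails)   # critical value for the test
    power = 1 - STD_NORMAL.cdf(z_crit - shift)
    if tails == 2:
        power += STD_NORMAL.cdf(-z_crit - shift)     # rejection in the opposite tail
    return power

base = power_z_test(5, 15, 25)
print(round(base, 2))
print(round(power_z_test(5, 15, 25, alpha=0.10), 2))   # higher alpha
print(round(power_z_test(5, 15, 25, tails=2), 2))      # two tailed
print(round(power_z_test(5, 15, 100), 2))              # larger sample
print(round(power_z_test(10, 15, 25), 2))              # larger difference
```

Raising alpha, enlarging the sample, or enlarging the difference all raise power above the baseline; switching to a two tailed test lowers it.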
Increasing alpha, increases power. Also means:
A. Probability of making type I error increases
B. probability of making a type II error increases
C. They both increase
D. They both decrease
Probability of type I goes up and type II goes down.
When set alpha consider real world circumstances. If I is more serious than II error then set alpha low. Do if research counterintuitive and contradict previous research.
Difference between a parametric and nonparametric test.
Test interval or ratio data.
Normal distribution of dv
Homogeneity of variance
Independence of observations
Ie. t test, ANOVA
Most tests robust re the first 2
Last is most important
Nominal or ordinal data
Not based on the assumptions
Distribution free tests
Generally less powerful than parametric tests
Chi squared, Mann Whitney u
Both assume samples are representative of population under study. Random selection of subjects
Study w 400 subjects, std deviation on dependent variable is 20. Std error of the mean is
Std error of the mean
Directly proportional to std dev and inversely proportional to N
Directly to std dev, directly N
Inversely to std dev, directly N
Inversely std dev, inversely N
Which of the following assumptions is shared by both parametric and nonparametric tests?
Normal distribution of data
Homogeneity of variance
Random assignment to groups
Random selection from population
When stat test lacks power, this means that
Probability of type I hi
Probability of type II lo
Statistical significance will be low
Results published low
C. Lacks power this means the probability of type II error is hi
Or that a false null will be kept
Test won't detect true effect
Won't yield stat significance
Alpha can be defined as
Probability of rejecting null when it is true
Retain null when it is true
Reject null when null is false
Retain null when null is false
A. Alpha is the probability of making a type I error
Which has least meaning?
Keep null when power is low
Reject the null when power is low
Retain null when power is hi
Reject null when power is hi
A. Power is low unlikely to detect an effect of IV when one is present
Likely keep null
When keep null doesn't mean u did correctly. Just means test lacked power to correctly reject
How decide to reject or accept null ho?
Compare the obtained stat value to the critical value table. Critical value depends on alpha and degrees of freedom.
If obtained value exceeds critical value, null rejected.
Obtained value is lower than critical, keep null or retain null.
A. Nonparametric test
B. result is t ratio
C. Three types
D. Used if more than 2 means
False parametric test
Correct. T ratio
One sample t test...compares one sample mean w known population mean. Ie. sample 35 women lawyers and compare to national lawyer ave. df=N-1.
T test for independent samples
Compare two means from unrelated samples. Ie random assign subjects into drug and placebo grp and then compare means
Df = N-2
T test for correlated samples
Related in some way (matched, pre/post). Compare pre and post means.
Df = N-1. N equals number of pairs of scores.
D. No!! Only use t test w 2 means!
A. Stat test assesses difference between two means of two grps; one IV
B. 1 IV; more than 2 groups
C. 2 or more IV and all possible combo of IV administered
D. Use post hoc tests
A. True. One way ANOVA
one way ANOVA. Usually use t test bc easier. Yields F stat. Tells you there is a difference but not which direction.
C. Factorial ANOVA
Main and intx effects
D. ANOVA just tells difference in means. Use post hoc tests to identify exactly where the significant difference is
What are the degrees of freedom for:
1. One way ANOVA
2. T test
1. Between grps: k - 1 (k is number of groups)
Within grps: N - k
2. T test
One sample, correlated sample
What is the mean square used to derive the f ratio on the ANOVA?
Index of variability used to derive the F ratio. Equal to sum of squares divided by degrees of freedom.
F = MSB/MSW
MSB= sum of squares between/ k-1
MSW=sum of sq w in/N-k
Then compare to critical value. If the obtained F is higher, the result is statistically significant.
Doesn't tell which means differ significantly just that they do.
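The F ratio can be worked by hand from the formulas above, as in this sketch. The three groups of scores are made up for illustration.

```python
from statistics import mean

groups = [[2, 3, 4], [4, 5, 6], [7, 8, 9]]    # hypothetical scores for 3 groups
k = len(groups)                                # number of groups
n_total = sum(len(g) for g in groups)          # N, total number of subjects
grand_mean = mean(x for g in groups for x in g)

# Sum of squares between: group means around the grand mean, weighted by group size.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# Sum of squares within: each score around its own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

msb = ss_between / (k - 1)        # mean square between, df = k - 1
msw = ss_within / (n_total - k)   # mean square within, df = N - k
f_ratio = msb / msw
print(round(msb, 2), round(msw, 2), round(f_ratio, 2))
```

Here MSB is about 19 and MSW is about 1, so F is about 19, which would then be compared with the critical value for 2 and 6 degrees of freedom.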
Post hoc done if significant
Post hoc comparisons are done when significance is found for the F stat in the ANOVA.
Pair wise comparison
Usually more than one comparison is of interest. More comparisons, greater chance of type I error.
What post hoc test provides the greatest protection against inflation of type I error?
What provides enough type I protection when doing pair wise comparisons?
Between 2 means
Between combined means
Protection? Scheffe is most conservative. Best protection against type I error when multiple comparisons are made. However this may increase type II error (and miss it if there is an effect)
If doing pair wise comparisons use what? Tukey.
What are all the ANOVAs?
One way ANOVA
For repeated measures
Forms of one way ANOVA
For repeated measures
One way...1 IV and more than 2 independent grps
Factorial ANOVA 2 plus IV
For repeated measures all levels of all IV applied to single grp (or matched grps)
Mixed or split plot ANOVA...
2 plus IV; mixed...at least one between subject IV and at least one repeated measures or w in subj variable (Not variance!)
Manova. 2 or more Dv;1 or more IV
Adv (vs many one way ANOVAs) is decreasing Expter wise error rate or type I error
ANOVA repeated measures
All subjects get all levels of IV
Stat control over one or more dv to control for effects of extraneous variables.
Two way ANOVA is 2 IV factorial
Advantage of using factorial ANOVA over many one way ANOVAs is?
A. Less work
B. allows main effect and intx
C. Less statistics
D. Hell who knows
Main effect...effect of one IV by itself. Find these differences, if any, in the marginal means column (pg 91).
Intx effect. Effects of one IV at different levels of the other IVs. Look inside the boxes (cell means) to find it. If the numbers move in the same direction by about the same amount when reading across or down, then no intx. If they go in opposite directions, or change by different amounts, there is an intx. Also can draw a graph. If intx then lines are not parallel (they may cross).
When interaction effect must interpret main effects w caution. Can't interpret them wout looking at intx
What is the nonparametric test used for categorical data?
A. Chi squared
B. Mann Whitney
C. Wilcoxon matched pairs
D. Kruskal wallis
A. Used when given frequency or number of subjects w in ea category (not mean scores).
Compares observed frequencies of observations within nominal categories to frequencies expected under null.
B. compare two independent grps on dv measured w rank ordered data. Alternative to t test for independent samples.
C. Compare two correlated grps on dv w rank ordered data.
Alt to t test for correlated samples
D. Compare 2 plus independent grps on dv w rank ordered data.
Alt to one way ANOVA.
Describe chi squared.
What cautions are needed when using it?
Nominal data. Frequencies of observations w in a category. Test ho that observed frequencies equal those expected if null true.
Single sample...from one grp of ppl
Df = C - 1. (Categories)
Multiple sample...adding another variable in addition to one that gives rise to classification categories
Df = (C-1) (R-1). R is rows
1. All observations must be independent of ea other
2. Ea observation only in one category or cell
3. Percentages of observations w in categories can not be compared.
May be exam question.
Determine the expected frequencies under the null ho for chi squared.
Single sample: Fe = total number of subjects / number of cells.
Coke study had 100 subj and 2 cells, so Fe = 100/2 = 50 per cell.
Multiple sample: Fe = (column total)(row total)/N
See page 97.
Must do for ea cell. Need look example.
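A short sketch of both expected-frequency rules. The 100 subjects and 2 cells come from the Coke study card; the 2 x 2 observed table and the function names are made up for illustration.

```python
def expected_single_sample(n_subjects, n_cells):
    """Expected frequency per cell under the null for a single-sample chi square."""
    return n_subjects / n_cells

def expected_multiple_sample(observed):
    """Expected frequency for each cell: (row total)(column total) / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )

coke_fe = expected_single_sample(100, 2)   # 50 per cell

observed = [[30, 20], [10, 40]]            # hypothetical counts, not from the text
expected = expected_multiple_sample(observed)
chi2 = chi_square(observed, expected)
df = (len(observed[0]) - 1) * (len(observed) - 1)   # (C - 1)(R - 1)
print(coke_fe, expected, round(chi2, 3), df)
```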
Nonparametric alternative to a t test for independent samples
Wilcoxon matched pairs
Mann Whitney u
T test for correlated samples
See ? Pg 100
Use of which post hoc test results in greatest probability of making a type II error?
Scheffe is most conservative
Greatest protection vs type I but then increases chance of type II.
Tukey. Use for pair wise comparisons. Gives enough protection against type I
Correlation coefficient: t or f
A. Usually -1 to 1
B. Magnitude is number; higher number more related.
C. -1 is a perfect correlation
D. Zero is no relationship
E. positive coefficient..two variables move in same direction
F. Negative correlation..as one goes up, the other goes down or they vary inversely.
G. High correlation infers a causal relationship.
All true but g
Could mean causally correlated
If two variables are causally related, there will be a correlation between them.
Correlation is a necessary but not sufficient condition of causality. So correlation does not guarantee causality but if there is a causal relationship they also must be correlated.
Pearson r correlation:
A. Most used
B. interval or ratio scale data
C. Assumes linear relationship
E. smaller range of sampled behavior, more accurate estimate of correlation.
All except d, e
Homoscedasticity ...dispersion of scores is equal thruout scattergram and is an assumption of Pearson r.
(Heteroscedasticity is more dispersion at some parts of scattergram or not uniform). This lowers the coefficient.
Wider range of scores the more accurate the correlation. Increase a correlation by increasing the range of scores.
What is the coefficient of determination?
A. Percentage of variability in one measure accounted for by variability in other measure.
B. squared correlation coefficient
C. Includes all but reliability coefficient which is never squared.
D. Example. Coefficient is .70 so 49 % of variation of x explained by y.
A. Allows prediction of unknown value of one variable from known value of another
B. one is independent or predictor variable; one is dependent or criterion variable
C. Criterion is that which is being predicted
D. Unless correlation is perfect (1 or -1) there will always be some error. Considered estimates.
E. higher correlations are, more accurate the predictions made will be by regression equation.
F. Linear relationship noted by a straight line in scattergram. If it is as close to as many dots as possible, it is called the line of best fit or regression line.
Regression line is pic of overall relationship between 2 variables. Higher correlation, closer dots and more accurate at predicting y.
Error is diff between predicted and actual criterion scores. Error scores assumed normally distributed w a mean of zero. Assume homoscedastic.
Regression line is least amount of error in predicting y scores from x.
Regression can be used as sub for ANOVA.
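A minimal least squares sketch of the regression line for one predictor. The x and y scores are made up for illustration, and the function name is hypothetical. It shows the line of best fit and that the error scores average to zero.

```python
def fit_regression_line(xs, ys):
    """Return (slope, intercept) of the least squares line predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical predictor (x) and criterion (y) scores.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_regression_line(xs, ys)
predicted = [intercept + slope * x for x in xs]
# Error = actual criterion score minus predicted criterion score.
errors = [y - p for y, p in zip(ys, predicted)]
print(slope, intercept, sum(errors) / len(errors))
```

Unless the correlation is perfect the errors are not all zero, but for a least squares line they sum to zero (up to rounding).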
A. Another name for multiple correlation coefficient.
B. use of scores on more than one predictor to estimate scores on criterion
C. Multiple R is highest when predictor variables each have hi correlations w criterion but low w ea other.
D. Lo correlations between predictors are desired bc better when ea predictor provides new info about the criterion.
E. multiple correlation coefficient is never lower than highest simple correlation
F. Can't be negative
B. multiple regression
D. True. If predictors overlap/hi correlations w ea other combining them yields no significantly new info
Significant predictor overlap is multicollinearity. One of predictors past 3 or 4 bound to have hi correlation w one of others.
G. True. Coefficient of multiple determination. Gives proportion of variance in criterion variable accounted for by combo of predictor variables.
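A small sketch of why low correlations between predictors help. It uses the standard two-predictor formula for the squared multiple correlation; the correlation values are made up and the function name is hypothetical.

```python
def multiple_r_squared(r_y1, r_y2, r_12):
    """Squared multiple correlation for two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors (must not be 1 or -1).
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors are perfectly correlated")
    numerator = r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12
    return numerator / (1 - r_12 ** 2)

# Same predictor-criterion correlations, different overlap between predictors.
low_overlap = multiple_r_squared(0.5, 0.5, 0.0)
high_overlap = multiple_r_squared(0.5, 0.5, 0.8)
print(round(low_overlap, 3), round(high_overlap, 3))
```

With no overlap the two predictors together account for about 50% of criterion variance; with heavy overlap the second predictor adds little, though multiple R still is not lower than the highest simple correlation (.5).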
What is stepwise regression?
Discuss forward stepwise and backward stepwise
Large number of potential predictors but use smaller set.
Goal to get smallest set that maximizes predictive power. Those w hi multiple correlation w criterion.
Multicollinearity..adding gives no more power
Forward...start w one predictor and add predictors one at a time. W ea you do analysis to determine whether predictive power of multiple regression is substantially increased. First kept has highest correlation w criterion. Add til no more predictive increase
Backward. Start w all potential predictors and remove one at a time. When starts to significantly decrease R stop removing.
Match the description w the correlational technique:
A. Goal to classify individuals into grps on basis of scores on multiple predictors. Differential validity..ea predictor has different correlation w ea criterion variable. Continuous variables.
B. examines trend of change in dependent variable (vs if dv changes at all)
C. Assess relationship between 2 variables with the effects of another variable partialled out (stat removed). Removes a suppressor variable that may contribute to a misleadingly hi or lo correlation.
D. Calculate relationship between 2 plus predictors and two plus criterion variables
E. set minimum cut off on a series of predictors. If not met on even one, not selected
F. Set of techniques calculating the pair wise correlations between multiple variables. Causal modeling, path analysis, LISREL
G. Classify into grps; no assumptions met; can be nominal data ; value between 0 and 1.
A. Discriminant function analysis
B. trend analysis
C. Partial correlation
D. Canonical correlation
E. multiple cutoff
F. Structural equation modeling
G. Logistic regression
(Vs multiple correlation coefficient which is relationship between two or more predictors and one criterion)
What is the difference between logistic regression and discriminant analysis?
A. Logistic regression uses interval and ratio data.
B. discriminant analysis assumes a normal distribution and homogeneity of variance.
C. Logistic regression assumes only homogeneity of variance.
D. Both have to do w grps.
E. logistic is used mostly w dichotomous dv or where classify into one of two criterion grps; predicted value between 0 and 1, probability belongs to one of grps
A. so does discriminant analysis however...logistic also uses nominal or continuous
B. correct. Logistic doesn't have these assumptions
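A rough sketch of how a logistic model turns predictor scores into a value between 0 and 1. The intercept, weights, scores and .5 cutoff are all made up for illustration; in practice the weights are estimated from data.

```python
import math

def logistic_probability(intercept, weights, scores):
    """Predicted probability of belonging to the target grp (always between 0 and 1)."""
    z = intercept + sum(w * x for w, x in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-z))

def classify(probability, cutoff=0.5):
    """Assign to grp 1 if the predicted probability reaches the cutoff, else grp 0."""
    return 1 if probability >= cutoff else 0

# Hypothetical weights for two predictors and one person's scores.
p = logistic_probability(intercept=-1.0, weights=[0.8, 0.5], scores=[2.0, 1.0])
print(round(p, 3), classify(p))
```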
What is structural equation modeling?
Describe the types.
(Path and LISREL)
Different from trend analysis ...examine trend of change in quantitative(interval, ratio) dv (vs if change at all).
Types..linear, quadratic, cubic, quartic. Tells if trends (break points or change in direction) are stat significant.
General term for variety of techniques that are based on correlations between multiple variables.
Application...testing causal models
Specify model w different variables usually w path diagram
Stat analysis between all pairs
Interpret results..degree data consistent w model
Path analysis. Simpler models w one way causal flows.
Observed variables only
LISREL. One or two way
Latent and observed variables
The difference between multiple regression and canonical correlation is?
Canonical correlation coefficient is used with multiple criterion and multiple predictors.
Multiple correlation is 2 or more predictors and one criterion.
Test a has .6 correlation w test b and correlation of .3 with test c.
Test a accounts for ---- as much variability in test b as it does in c.
Square correlation coefficient
So .6 is .36 and .3 is .09. .36/.09 = 4, so four times as much.
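The arithmetic on this card can be checked in a few lines. The .6 and .3 are from the card; the function name is hypothetical.

```python
def coefficient_of_determination(r):
    """Proportion of variability in one measure accounted for by the other."""
    return r ** 2

shared_with_b = coefficient_of_determination(0.6)   # about .36
shared_with_c = coefficient_of_determination(0.3)   # about .09
ratio = shared_with_b / shared_with_c                # about 4
print(round(shared_with_b, 2), round(shared_with_c, 2), round(ratio, 2))
```

Doubling the correlation quadruples the variability accounted for, since the coefficient is squared.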
Which describes a correlation between x and y?
A. Variability of y at each x is lower than total variability of y.
B. Variability of y is different at different levels of x
C. Variability of y scores at each x is equal to the total variability of y scores.
D. Variability of y at each x is approximately the same.
C. Says...range of y at every individual x will be equal to the entire range of y. In other words, any one score on x doesn't provide any info about y. Means correlation is 0.
B is heteroscedasticity.
D is homoscedasticity
Define sampling distribution:
A. Whole set of cases researcher is interested in.
B. set of scores obtained from the sample.
C. Distribution of values of the statistic with each value computed from same sized samples drawn w replacement from the population.
D. Theoretical distribution of the means of various samples one could draw, all of equal size, from one population.
A. Population and can be represented in frequency distribution. Every single score in population. Population can be anything..height or cornstalk etc
B. set of scores obtained from a sample is sample distribution. Most cases it will have less score variability than pop distribution. Make a frequency distribution. Doesn't include the full range of scores.
C. Sampling distribution. Can be done with the mean, t, F.
D. Sampling distribution of the means. Ea sample drawn will have a mean close to but not exactly at population mean. Plotting a large series of the means of these samples will yield a distribution of sample means that will approach a normal shape, be very tightly clustered around population mean, and have a mean that is equal to population mean. Use sampling with replacement and all sample sizes are the same.
Central limit theorem:
1. As sample size increases, shape of the sampling distribution of means approaches a normal shape.
2. Above is true even if the population isn't a normal shape.
3. Mean of the sampling distribution of means is equal to the mean of the population.
4. Less variability than the population distribution.
5. Std deviation of sampling distribution of means is equal to population std deviation divided by square root of the size of the samples from which the means were taken. This is the formula for the standard error of the mean or standard deviation of the sampling distribution of means.
Basis of statistical inference!
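Points 3 to 5 can be checked exactly on a tiny case. The sketch below assumes a made-up population (the six faces of a fair die) and lists every possible sample of size 2 drawn with replacement, so no random simulation is needed. It does not show point 1 (the normal shape), which needs larger samples.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [1, 2, 3, 4, 5, 6]   # hypothetical population, not from the text
n = 2                             # size of every sample

# Every possible sample of size n drawn with replacement, reduced to its mean.
sample_means = [sum(sample) / n for sample in product(population, repeat=n)]

mean_of_means = mean(sample_means)      # should match the population mean
sd_of_means = pstdev(sample_means)      # std deviation of the sampling distribution
standard_error = pstdev(population) / sqrt(n)   # population sd / sqrt(sample size)

print(mean(population), mean_of_means)
print(round(sd_of_means, 4), round(standard_error, 4))
```

The spread of the sample means is smaller than the spread of the population and equals the standard error of the mean.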
A test is robust if the rate of false rejections of the null (type I error rate) is not substantially increased when the assumptions of normal distribution and homogeneity of variance are violated. Most parametric tests are robust as long as the sample size is adequate, since the sampling distribution is still approximately normal.
As N decreases, parametric tests are less robust, and it is more important that the assumption of normality is met.
Re: homogeneity of variance...ok if equal number of subjects.
Unequal then inflated type I error.
In the time series design the assumption for independence of observations is not met. This leads to?
C. Means across measurements will be related and there will be a different magnitude of relationship among the means, depending on lag or how close together in time they were obtained. Correlation between observations at given lags is autocorrelation.
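A minimal sketch of autocorrelation at different lags, using one common estimator (lagged cross products of deviations divided by the total sum of squared deviations). The series is made up and the function name is hypothetical.

```python
def autocorrelation(series, lag):
    """Lag-k autocorrelation of a series measured at equally spaced times."""
    n = len(series)
    m = sum(series) / n
    total_ss = sum((x - m) ** 2 for x in series)
    lagged = sum((series[t] - m) * (series[t + lag] - m) for t in range(n - lag))
    return lagged / total_ss

# Hypothetical repeated measurements with a steady upward drift.
series = [1, 2, 3, 4, 5, 6, 7, 8]
r_lag1 = autocorrelation(series, 1)
r_lag4 = autocorrelation(series, 4)
print(round(r_lag1, 3), round(r_lag4, 3))
```

Observations close together in time (lag 1) are more strongly related than those farther apart (lag 4), so they are not independent.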
Bayes theorem should be associated with:
A. Conditional probabilities
B. base rates
C. Common conceptual basis
D. Combines studies with a common conceptual basis as a larger part of a study.
A, b correct! Formula for conditional probabilities. Revise probabilities based on additional info.
C, d is meta analysis. Meta analysis uses the results of each study as if they were separate scores, sums, averages. Ea study used is one subject in meta analysis
Stat yielded is effect size..gives magnitude of independent variable's effect. Computed for ea dv. Then get average effect size.
Adv..allows consider size of effect
Disadv..subject to bias by analyzer, such as which studies to include
See pg 137 for ?, pg 140 ?
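For the Bayes part of this card, a small sketch of revising a probability from a base rate using additional info. The base rate, hit rate and false alarm rate are made up for illustration, and the function name is hypothetical.

```python
def posterior_probability(base_rate, hit_rate, false_alarm_rate):
    """P(condition | positive result) from Bayes theorem.

    base_rate: P(condition) before any additional info.
    hit_rate: P(positive | condition).
    false_alarm_rate: P(positive | no condition).
    """
    true_positives = hit_rate * base_rate
    false_positives = false_alarm_rate * (1 - base_rate)
    return true_positives / (true_positives + false_positives)

# Hypothetical values: rare condition, fairly accurate test.
revised = posterior_probability(base_rate=0.01, hit_rate=0.90, false_alarm_rate=0.05)
print(round(revised, 3))
```

With a low base rate, most positive results are false positives, so the revised probability stays well below the hit rate.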
Which is Criterion based score?
B. how person does in reference to external criterion. Percentage tells us how they did on a test or how much was mastered.
Others are norm referenced scores on how did compared to others. Don't say anything about how much of the criterion is mastered.
Defining feature of a true experiment is:
Random selection from population
Random assignment into grps
Use of manipulated variables
Use of non manipulated variables
Also c but that applies to others as well...like quasi experiments