Statistics Flashcards

(184 cards)

1
Q

Definition

One type of inferential statistic, used to determine whether there is a significant difference between the means of two groups. As with all inferential statistics, we assume the dependent variable fits a normal distribution

A

T test

2
Q

What is the internal and external validity rated for a quasi-experimental study?

A

Internal: medium

External: medium

3
Q

What is a two-way ANOVA?

A

An ANOVA with 2 factors (IVs)

4
Q

What are contrasts?

A

The between treatment variability is explained by the experimental manipulation – i.e., is due to participants being assigned to different groups

5
Q

What do you do if homogeneity is violated in an ANOVA?

A
  • If sample sizes are large and equal…
    • ANOVA can handle the homogeneity violation
  • If sample sizes are small or not equal…
    • Use the Brown-Forsythe or Welch F ratio (and their associated p and df; Welch is more powerful) instead of the regular F ratio
6
Q

Define

T test

A

One type of inferential statistic, used to determine whether there is a significant difference between the means of two groups. As with all inferential statistics, we assume the dependent variable fits a normal distribution

7
Q

What are some threats to external validity?

A
  • Generalising across participants or subjects
  • Generalising across features of a study
  • Generalising across features of the measures
8
Q

Definition

the validity of applying the conclusions of a scientific study outside the context of that study. It is the extent to which the results of a study can be generalized to and across other situations, people, stimuli, and times

A

External validity

9
Q

Definition

F = variance between sample means / variance expected by chance

A

F statistic

10
Q

t-Tests are “difference tests”. Used to compare mean differences between up to ____ groups or conditions

A

t-Tests are “difference tests”. Used to compare mean differences between up to two groups or conditions

11
Q

Define

Type I error

A

the rejection of a true null hypothesis (also known as a “false positive” finding or conclusion)

12
Q

Definition

a collection of statistical models and their associated estimation procedures (such as the “variation” among and between groups) used to analyze the differences among group means in a sample

A

ANOVA

13
Q

What do you use in a post hoc test if there are unequal variances?

A

Games-Howell

14
Q

Define

Confounding variable

A

factors other than the independent variable that may cause a result

15
Q

What does within treatment variance look like on this graph?

A
16
Q

What is the experiment-wise error rate of an analysis that uses 3 comparisons at an alpha level of .05?

A

αEW = 1 - (1 - αTW)^c

αEW = 1 - (1 - .05)^3

αEW = 1 - .95^3

αEW = .14

14% chance of committing at least 1 type I error
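This arithmetic is easy to sanity-check in code. A minimal Python sketch (not part of the original card; the function name is my own):

```python
# Experiment-wise error rate for c comparisons, each run at alpha_tw.
def experimentwise_alpha(alpha_tw, c):
    # alpha_EW = 1 - (1 - alpha_TW)^c
    return 1 - (1 - alpha_tw) ** c

# Three comparisons at alpha = .05, as in the card above
print(round(experimentwise_alpha(0.05, 3), 2))  # 0.14
```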

17
Q

What test is used to assess normality?

A

Shapiro-Wilk

18
Q

Definition

An experimental design that looks a bit like an experimental design but lacks the key ingredient – random assignment

A

Quasi-experiment design

19
Q

What is the internal and external validity rated for an experimental study?

A

Internal: high

External: low

20
Q

Definition

A type of research used to assess changes over an extended period of time

A

Developmental research

21
Q

How do you minimise participant attrition?

A
  • Increase sample size and measure/ compare participants who do/don’t withdraw
22
Q

What does a positive t-value tell you?

A

That the mean for condition/sample 1 is higher than the mean for condition/sample 2

23
Q

What is the non-parametric alternative to a one-way repeated-measures ANOVA?

A

Friedman’s test

24
Q

How do you minimise environmental variables?

A
  • Standard experimental procedures, setting, and experimenter
25
What are threats to both internal and external validity?
* Experimenter bias
* Demand characteristics and participant reactivity
26
How do you calculate the F ratio using sum of squares and degrees of freedom?
Mean squared deviation (MS) = Sum of Squares (SS) / *df*

*F* = MSbetween / MSwithin
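A short Python sketch of this calculation (illustrative only; the sums of squares and df below are made-up numbers):

```python
# F = MS_between / MS_within, where each MS = SS / df.
def f_ratio(ss_between, df_between, ss_within, df_within):
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within

# Hypothetical values: SS_between = 90 (df = 2), SS_within = 60 (df = 12)
print(f_ratio(90, 2, 60, 12))  # 9.0
```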
27
What is the biggest threat to internal validity?
Confounding variables
28
# Definition

factors other than the independent variable that may cause a result
Confounding variable
29
Why choose repeated-measures?
* A repeated-measures ANOVA uses a _single sample_, with the same set of individuals measured in all of the different treatment conditions
* Thus, one of the characteristics of a repeated-measures (aka within-subjects) design is that it _eliminates variance caused by individual differences_
* Individual differences are those participant characteristics that vary from one person to another and may influence the measurement that you obtain for each person (e.g., age, gender, etc.)
30
What is the formula used to determine the experiment-wise error rate?
αEW = 1 - (1 - αTW)^c

Where *c* = number of comparisons
31
During a post hoc test, if assumptions are met and sample sizes are equal what do you use?
Tukey's HSD
32
What are examples of non-experimental designs?
Observational, cross-sectional or longitudinal studies
33
In a one-way repeated-measure ANOVA, what is the *F* ratio made up of?
*F* = (treatment effect + other error) / other error

Numerator = between treatment variance
Denominator = within treatment variance

Individual differences are not considered, due to repeated measures
34
What is the internal and external validity rated for a non-experimental study?
Internal: low
External: high
35
# Define Experiment-wise error rate (αEW)
The probability of making at least one type I error amongst a series of comparisons
36
# Define ANOVA
a collection of statistical models and their associated estimation procedures (such as the "variation" among and between groups) used to analyze the differences among group means in a sample
37
What type of follow up test do you use if there is a specific hypothesis? What about when there is no hypothesis?
Specific hypothesis: Planned comparisons
No hypothesis: Post hoc tests
38
How do you minimise generalising across features of a study?
* Conduct naturalistic research
* Switch from a between-subjects to a within-subjects or matched-subjects design
* Replicate study in different setting/with different experimenter
39
What do contrast 1 and contrast 2 test?
40
What effect size value is used for ANOVA? How do you calculate it?
Eta-squared (η²)

η² = SSbetween / SStotal
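As a quick illustration of the formula (the SS values are hypothetical, not from any card):

```python
# eta^2 = SS_between / SS_total: the proportion of total variance
# explained by the treatment.
def eta_squared(ss_between, ss_total):
    return ss_between / ss_total

print(eta_squared(30.0, 200.0))  # 0.15 (a large effect by the cut-offs used in this deck)
```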
41
How do you minimise generalising across features of the measures?
* Use multiple response measures (e.g., self-report, observation, physiological)
* Systematically vary time of measurement as an IV and measure effect on DV and other IV
42
43
What does a negative t-value tell you?
That the mean for condition/sample 1 is _lower_ than the mean for condition/sample 2
44
What do you do if sphericity is violated?
If the assumption is violated, apply a correction factor (epsilon) to the degrees of freedom – this will in turn adjust the *p* value for the ANOVA

* If Greenhouse-Geisser epsilon is less than .75, use the Greenhouse-Geisser correction
* If Greenhouse-Geisser epsilon is greater than .75, use the Huynh-Feldt correction
45
What value of the MANOVA should you report in most situations?
Pillai's trace
46
For the *F* ratio to be reliable and valid, what assumptions must be met?
1. Independence of observations – the observations within each sample must be independent
2. Interval/ratio level of data
3. Normality – populations must be normally distributed, as determined by Shapiro-Wilk
4. Homogeneity of variance
47
How do you minimise time-related variables?
* Add control group for comparison purposes
* Switch from a within-subjects to a between- or matched-subjects design
* Control/limit time between testing
* Counterbalance order of presentation of conditions across participants
48
What are the key elements of an experiment?
* Manipulation of the independent variable (IV) to create two or more treatment conditions (levels)
* Measurement of a dependent variable (DV) to obtain a set of scores for each treatment condition (level)
* Comparison of the DV scores for each treatment condition (level)
* Control of all other (extraneous) variables to ensure that they do not confound the relationship between IV and DV
* Random assignment of participants to each condition so that the groups can be considered truly equivalent
49
# Definition

The alpha level used for each comparison
Test-wise error rate (αTW)
50
Distributions like this would violate which assumption?
Homogeneity
51
# Definition

the ratio of the between group variance to the within group variance
F-ratio
52
# Define Internal validity
the extent to which a piece of evidence supports a claim about cause and effect, within the context of a particular study
53
If there are 4 time points, how many orthogonal contrasts are there?
k = 4

Orthogonal contrasts = k - 1

Therefore, there are 3 orthogonal contrasts
54
What are two reasons for between treatment variance?
1. Treatment effects: the differences are caused by the treatment(s)
2. Chance: the differences are simply due to chance
55
What Shapiro-Wilk result suggests the normality assumption is met?
*p* > .05 – the shape is not significantly different from normal, thus the normality assumption is met
56
How do you report one-way independent measures ANOVA test results?
*F*(dfbetween, dfwithin) = value, *p* = value

e.g., *F*(2, 12) = 23.49, *p* < .001
57
With an *F*-ratio of around 1 what does that suggest? Why?
With an *F*-ratio of around 1 we would conclude that there is no treatment effect, because the variance between treatments is about the same as the variance expected by chance (within treatments)
58
# Definition

A type of ANOVA used to determine whether three or more group means are different where the participants are the same in each group
One-way repeated-measures ANOVA
59
True or False: Repeated-measures designs are powerful
True. Repeated-measures designs are powerful because they remove individual differences
60
η² = .059 is what size effect?
Medium
61
In terms of the F-ratio for a repeated measures design, the variance between treatments (the numerator) *does/does not* contain any individual differences
In terms of the F-ratio for a repeated measures design, the variance between treatments (the numerator) **does not** contain any individual differences
62
During post hoc tests, if sample sizes are slightly different then use _________ procedure because it has greatest power, but if sample sizes are very different use _________
During post hoc tests, if sample sizes are slightly different then use **Gabriel's** procedure because it has greatest power, but if sample sizes are very different use **Hochberg's GT2**
63
# Definition

the rejection of a true null hypothesis (also known as a "false positive" finding or conclusion)
Type I error
64
# Definition

A test used to determine whether there are any statistically significant differences between the means of two or more independent (unrelated) groups
One-way independent measures ANOVA
65
# Define External validity
the validity of applying the conclusions of a scientific study outside the context of that study. It is the extent to which the results of a study can be generalized to and across other situations, people, stimuli, and times
66
Why is an ANOVA preferred over *t*-tests for over 2 groups?
Each time you run a hypothesis test, you run the risk of committing a type I error; running multiple *t*-tests therefore inflates the experiment-wise error rate, whereas a single ANOVA keeps it at the chosen alpha level
67
# Define Developmental research
A type of research used to assess changes over an extended period of time
68
True or False: ANOVA tests only non-directional hypotheses
True
69
What do you use to calculate the effect size for a post hoc test?
Cohen's d

*d* = mean difference / SD
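A minimal sketch of the card's formula; which SD to use (pooled, control-group, or difference-score) depends on the design and is not specified on the card:

```python
# Cohen's d = mean difference / SD (per the card's formula).
def cohens_d(mean_difference, sd):
    return mean_difference / sd

# Hypothetical numbers: a 5-point mean difference with SD = 10
print(cohens_d(5.0, 10.0))  # 0.5
```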
70
True or False: You should not perform planned comparisons as well as post hoc tests for a one-way repeated-measures ANOVA
True
71
True or False: These are the results of an ANOVA
False; this table relates to MANOVA rather than ANOVA
72
# Definition

any variables that you are not intentionally studying in your experiment or test
Extraneous variables
73
What do you do if normality is violated in an ANOVA?
* If sample sizes are large and equal, ANOVA can handle the normality violation
* If sample sizes are small or not equal:
  * Transform your data, or
  * Run a Kruskal-Wallis test as the non-parametric alternative to a one-way independent-measures ANOVA
74
What extra assumption is required for repeated-measures? Why?
* Same participants in all conditions; therefore, scores across conditions will correlate – this violates the assumption of independence!
* Because of this, an additional assumption is required for repeated-measures ANOVA – namely, sphericity
* Put crudely, the assumption of sphericity means that the correlation between treatment levels should be the same
* Actually, it assumes that the variances of the differences between treatment levels are equal
75
# Definition

a type of experimental design, thought to be the most accurate type of experimental research, that supports or refutes a hypothesis using statistical analysis
True experimental research
76
How would the sample populations relate to each other if the null hypothesis was rejected in an ANOVA?
The sample populations would not all be equal; at least one would differ from the others (rejecting H0 means at least one treatment population really does have a different mean)
77
What is the 4 step process of hypothesis testing using an ANOVA?
1. State the hypotheses (H0 and H1)
2. Decide when to reject H0
3. Calculate the test statistic – in this case, the *F* ratio
4. Make a decision about H0 (reject/don't reject)
78
How do you deal with individual differences in a repeated-measures ANOVA?
* The individual differences are automatically removed from the numerator because the design uses the same subjects in all treatments, but we must also remove them from the denominator
* Remove individual differences from the denominator by measuring the variance within treatments and then subtracting the individual differences
* The result is a measure of unsystematic error variance that does not include any individual differences
79
What are the threats to internal validity?
* Environmental variables
* Individual differences
* Time-related variables
* Participant attrition
* Communication between groups
80
ANOVA simply tests the null hypothesis that all group means are equal, and therefore a significant result merely tells you that at least one group’s mean is different from another. How do you make more specific comparisons?
* Post-hoc tests: no specific hypotheses at outset; compare each group to each other but use a smaller α to limit the type I error rate
* Planned comparisons: specific hypotheses at outset; make specific comparisons by breaking down the between-treatment variance (total variance accounted for by the model) into its component parts
81
What does a large *F*-ratio indicate?
The differences between treatments are greater than chance
82
# Define Extraneous variables
any variables that you are not intentionally studying in your experiment or test
83
For the following research question would you use an ANOVA or a *t*-Test? Are sufferers of depression who receive any form of treatment (i.e., medication, exercise, or a combination of medication and exercise) less depressed than people who do not receive any treatment?
ANOVA (more than 2 groups)
84
What can between treatment variability be broken down to?
This variability can be further broken down to test specific hypotheses about which groups might differ from one another.

We break down the variance according to hypotheses made a priori (before the experiment). Providing that the hypotheses are independent of one another, the experiment-wise type I error will be controlled.
85
# Define Quasi-experiment design
An experimental design that looks a bit like an experimental design but lacks the key ingredient – random assignment
86
Which sections are ANOVA and which are planned contrasts?
87
What is the H0 and H1 hypotheses for ANOVA?
H0: There really are no differences between the populations (or treatments). The observed differences between samples are due to chance (sampling error).

H1: The differences between the sample means represent real differences between the populations (or treatments). That is, at least one of the treatments really does have a different mean, and the sample data accurately reflect these differences.
88
How do you minimise experimenter bias?
* Conduct a double-blind study (i.e., neither participant nor experimenter know which condition the participant is in)
89
In ANOVA, an independent variable (IV) is called a _____. Each (treatment) condition of a factor is called a _______
In ANOVA, an independent variable (IV) is called a **factor**. Each (treatment) condition of a factor is called a **level**
90
What does between treatment variance look like on this graph?
91
How do you minimise individual differences?
* Create equivalent groups using random assignment, holding constant, or matching * Switch from a between-subjects to a withinsubjects or matched-subjects design
92
What type of experimental design is this? Why? For example, researchers take data from two different schools that are expected to be similar. An intervention is tested in one school and not the other. The pretest-posttest change is then compared between schools
Quasi-experimental. This is quasi-experimental because participants (students) were not randomly assigned; there may indeed be some small differences between the groups
93
What is a partial eta squared?
An eta squared with the effects of individual differences removed from the denominator
94
What test is used to test sphericity? When is sphericity met?
Mauchly's test

The sphericity assumption is met if the variances of the differences between conditions are roughly equal; therefore the assumption is met when *p* > .05
95
In a one-way independent-measures ANOVA, what is the *F*-ratio made up of?
*F* = (treatment effect + individual differences + other error) / (individual differences + other error)

Numerator = variability between treatments
Denominator = variability within treatments
96
# Define True experimental research
a type of experimental design, thought to be the most accurate type of experimental research, that supports or refutes a hypothesis using statistical analysis
97
# Define F-ratio
the ratio of the between group variance to the within group variance
98
# Definition

The probability of making at least one type I error amongst a series of comparisons
Experiment-wise error rate (αEW)
99
What Levene result suggests the homogeneity assumption is met?
If *p* > .05, the variances are not significantly different from one another, thus the homogeneity of variance assumption is met
100
Why is a quasi-experiment not a true experiment?
* The independent variable was not experimentally manipulated (i.e., pre-existing levels are selected and compared); or
* The participants were not randomly assigned to conditions (e.g., groups were selected for analysis after the fact)
101
How do you minimise demand characteristics and participant reactivity?
* Switch from a within-subjects to a between- or matched-subjects design
* Conduct a blind study
* Use measures which do not explicitly refer to the construct being measured
102
True or False: Partial eta squared is interpreted the same as eta squared
True
103
# Definition

the extent to which a piece of evidence supports a claim about cause and effect, within the context of a particular study
Internal validity
104
How do you choose an alpha level that would control the experiment-wise error rate?
Bonferroni correction:

αTW = αEW (desired) / number of tests
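The Bonferroni adjustment is a one-liner; a quick Python sketch with hypothetical numbers:

```python
# Per-test alpha that keeps the experiment-wise rate at the desired level.
def bonferroni_alpha_tw(desired_alpha_ew, n_tests):
    return desired_alpha_ew / n_tests

# Desired alpha_EW = .05 spread across 3 tests
print(round(bonferroni_alpha_tw(0.05, 3), 4))  # 0.0167
```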
105
When do you use MANOVA over an ANOVA?
MANOVA results can be used in place of the regular ANOVA results if the sphericity or normality assumptions are violated. While these tests are more robust than ANOVA to the assumption violations, they are also less powerful
106
What rules apply to choosing contrasts?
* Use the control group as the reference point
* Only compare 2 chunks of variation
* Independence (orthogonal)
107
How do you minimise communication between groups?
* Conduct blind study (i.e., participants do not know which condition they are in)
* Switch to a within-subjects design
* Limit possibility of communication between groups (e.g., different locations)
108
How do you minimise generalising across participants or subjects?
* Use a probability sampling method such as proportionate stratified random sampling, or a non-probability method which tries to achieve the same result
* Increase sample size
109
# Define Test-wise error rate (αTW)
The alpha level used for each comparison
110
If the homogeneity of variance assumption is violated and you want to report the Brown-Forsythe or Welch F-ratio because they don’t assume homogeneity of variance, how would you do this?
You need to state that the homogeneity of variance assumption is violated, and that this is why you used the Brown-Forsythe or Welch F-ratio instead. You then simply report the results as usual, except that you use two decimal places for the second df value (since such F-ratio calculations are based on adjustments made to the df)
111
What are the three criteria that must be met for a true experiment?
There are three criteria that must be met:
1. Control group and experimental group
2. Researcher-manipulated variable
3. Random assignment
112
What test is used to assess homogeneity of variances?
Levene statistic
113
What are reasons for within treatment variance?
Within each treatment, participants are treated the same, so chance would cause differences
114
η² = .138 is what size effect?
Large
115
The t-test generates a ___-value, which is used to then determine a p-value
The t-test generates a **t**-value, which is used to then determine a p-value
116
What is the safest option during a post hoc test?
Bonferroni
117
# Define *F* statistic
*F* = variance between sample means / variance expected by chance
118
True or False: A Greenhouse-Geisser correction changes the *F* ratio
False. A Greenhouse-Geisser correction changes the **degrees of freedom**
119
# Define One-way independent measures ANOVA
A test used to determine whether there are any statistically significant differences between the means of two or more independent (unrelated) groups
120
# Define One-way repeated-measures ANOVA
A type of ANOVA used to determine whether three or more group means are different where the participants are the same in each group
121
An *F* ratio close to 1 indicates what?
It strongly suggests little or no treatment effect
122
Chi-square goodness-of-fit test
A test used to compare the observed distribution to an expected distribution, in a situation where we have two or more categories of discrete data. In other words, it compares multiple observed proportions to expected probabilities.
123
Chi-square test-for-contingencies
a procedure for testing if two categorical variables are related in some population
124
Kruskal-Wallis test
a rank-based nonparametric test that can be used to determine if there are statistically significant differences between two or more groups of an independent variable on a continuous or ordinal dependent variable
125
Friedman test
a non-parametric statistical test similar to the parametric repeated-measures ANOVA; it is used to detect differences in treatments across multiple test attempts
126
What does statistical test selection depend on?
1. How many dependent and independent variables there are
2. What scales of measurement are used for each variable
3. How many groups there are, and whether these are independent- or repeated-measures
4. Whether the assumptions have been met for a parametric statistical test
127
What type of data can be used for a non-parametric test?
Nominal/ordinal
128
What type of data can be used for a parametric test?
Interval/ratio
129
What is the non-parametric equivalent of a repeated-measures ANOVA?
Friedman test
130
What is the non-parametric equivalent for an independent-measures ANOVA?
Chi-square goodness-of-fit (nominal)
Kruskal-Wallis test (ordinal)
131
What is the non-parametric equivalent of a Pearson correlation?
Chi-square test-for-independence (nominal)
Spearman correlation (ordinal)
132
To examine the relationship between texting and driving skill, a researcher uses orange cones to set up a driving circuit. A group of probationary drivers is then tested on the circuit, once while receiving and sending text messages and once without texting. For each driver, the researcher records the number of cones hit while driving each circuit. (Based on Gravetter & Wallnau, 2013, p. 674) Which of the following is a suitable inferential statistics test for these data?

a) Independent-samples t-test
b) Paired-samples t-test
c) Repeated-measures ANOVA
d) Linear regression
**b) Paired-samples t-test**
133
“Hallam, Price, and Katsarou (2002) investigated the influence of background noise on classroom performance for children aged 10 to 12. In a similar study, students in one classroom worked on an arithmetic task with calming music in the background. Students in a second classroom heard aggressive, exciting music, and students in a third room had no music at all. The researchers measured the number of problems answered correctly for each student to determine whether the music conditions had any effect on performance.” (Gravetter & Wallnau, 2013, p. 674) Which of the following would be an appropriate statistical test for these data?

a) Chi-square
b) Spearman correlation
c) Independent-samples t-test
d) Independent-measures ANOVA
**d) Independent-measures ANOVA**
134
“Belsky, Weintraub, Owen, and Kelly (2001) reported the effects of preschool childcare on the development of young children. One result suggests that children who spend more time away from their mothers are more likely to show behavioral problems in kindergarten. Suppose that a kindergarten teacher is asked to rank order the degree of disruptive behavior for the n = 20 children in the class. Researchers then separate the students into two groups: children with a history of preschool and children with little or no experience in preschool. The researchers plan to compare the ranks for the two groups.” (Gravetter & Wallnau, 2013, p. 675) Which of the following is the appropriate statistical test for these data?

a) Mann-Whitney U-test
b) Wilcoxon signed ranks test
c) Chi-square test-for-independence
d) Independent-samples t-test
**a) Mann-Whitney U-test**
135
“A researcher would like to determine whether infants, age 2 to 3 months, show any evidence of color preference. The babies are positioned in front of a screen on which a set of four colored patches is presented. The four colors are red, green, blue, and yellow. The researcher measures the amount of time each infant looks at each of the four colors during a 30 second test period. The color with the greatest time is identified as the preferred color for the child.” (Gravetter & Wallnau, 2013, p. 674) Which of the following would be an appropriate statistical test for these data?

a) Single-sample t-test
b) Independent-measures ANOVA
c) Chi-square goodness-of-fit test
d) Chi-square test-for-independence
**c) Chi-square goodness-of-fit test**
136
Chi-square tests are intended for research questions concerning the ________ of the population in different categories
Chi-square tests are intended for research questions concerning the **proportion** of the population in different categories
137
What type of data can be used for a chi-square test?
Nominal data
138
What is the difference between actual and predicted values called?
Residual
139
What is the difference between an observed frequency and an expected frequency called?
Residual
140
What are the two chi-square tests? How many variables do they examine?
Chi-square Goodness-of-Fit Test (1 nominal variable)
Chi-square Test-for-Independence (2 nominal variables)
141
The chi-square goodness-of-fit test uses ________ from a sample to test hypotheses about the shape or proportions of a population
The chi-square goodness-of-fit test uses **frequency data** from a sample to test hypotheses about the shape or proportions of a population
142
The number of individuals in each category of a chi-square goodness-of-fit test is called what?
Observed frequencies
143
What is the null hypothesis for a chi-square goodness-of-fit test?
The null hypothesis specifies the proportion of the population that should be in each category. The null hypothesis for the chi-square test for goodness of fit typically falls into one of two types:

1. a no-preference hypothesis, which states that the population is distributed evenly across the categories; or
2. a no-difference hypothesis, which states that the population distribution is not different from an established distribution
144
The proportions from the null hypothesis of a chi-square goodness-of-fit are used to construct an ideal sample distribution, called ________, that describes how the sample would appear if it were in perfect agreement with the null hypothesis
The proportions from the null hypothesis of a chi-square goodness-of-fit are used to construct an ideal sample distribution, called **expected frequencies (fe)**, that describes how the sample would appear if it were in perfect agreement with the null hypothesis
145
What is the formula for the expected frequency for each category in a chi-square goodness of fit?
*fe* = *pn*

Where:

* *p* = the proportion stated in H0
* *n* = sample size
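The fe = pn step and the full goodness-of-fit statistic can be checked in a few lines of pure Python. The counts below are made up for illustration (they are not data from the deck); `scipy.stats.chisquare` computes the same statistic along with a p-value.

```python
# Chi-square goodness-of-fit by hand, using hypothetical preference
# counts for four categories under a no-preference H0 (p = 1/4 each).
observed = [15, 5, 6, 4]                  # fo: observed frequencies
n = sum(observed)                         # sample size (30)
p = [0.25, 0.25, 0.25, 0.25]              # proportions stated in H0
expected = [pi * n for pi in p]           # fe = pn -> [7.5, 7.5, 7.5, 7.5]
chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
df = len(observed) - 1                    # categories minus one
```

Note that the expected frequencies come out as 7.5, a decimal, which matches the later card: fe are hypothetical values and need not be whole numbers.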
146
True or False: Expected frequencies can be decimal numbers
True. Expected frequencies are hypothetical values, so they can be decimal numbers
147
True or False: χ² can never be negative
True. χ² can never be negative because the residuals (fo – fe) are squared
148
Larger discrepancies between fo and fe produce _____ χ² values
Larger discrepancies between fo and fe produce **larger** χ² values
149
What effect size value do you use for chi-square goodness-of-fit?
Cohen's *w*

*w* = √(χ² / *N*)

Where *N* = total sample size
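A one-line check of the formula, with a hypothetical chi-square value:

```python
import math

chi2 = 10.27              # hypothetical chi-square statistic
N = 30                    # total sample size
w = math.sqrt(chi2 / N)   # Cohen's w = sqrt(chi^2 / N)
```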
150
The chi-square test-for-independence is used to test whether or not there is a _______ between two categorical (nominal) variables
The chi-square test-for-independence is used to test whether or not there is a **relationship** between two categorical (nominal) variables
151
What is the null hypothesis for chi-square test-for-independence?
The null hypothesis for the chi-square test-for-independence can be phrased two ways:

1. there is no relationship between the two variables (they are independent); or
2. the distribution for one variable is the same (has the same proportions) for all the categories of the second variable
152
How do you calculate the degree of freedom for a chi-square test-for-independence?
*df* = (R − 1)(C − 1)

Where:

* R = number of rows
* C = number of columns
153
What are the steps for calculating the chi-square test-for-independence statistic?
1. The null hypothesis is used to construct an idealised sample distribution of expected frequencies (fe) that describes how the sample would look if the data were in perfect agreement with the null hypothesis
2. A chi-square statistic is then computed to measure the amount of discrepancy between the ideal sample (expected frequencies from H0) and the actual sample data (the observed frequencies, fo)
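Both steps can be sketched for a hypothetical 2 x 2 table; in each cell, fe = (row total × column total) / N. `scipy.stats.chi2_contingency` performs the same computation.

```python
# Chi-square test-for-independence by hand on a hypothetical 2x2 table.
table = [[10, 20],
         [30, 40]]
row_totals = [sum(row) for row in table]         # [30, 70]
col_totals = [sum(col) for col in zip(*table)]   # [40, 60]
N = sum(row_totals)                              # 100
# expected frequency for each cell: fe = (row total * column total) / N
expected = [[r * c / N for c in col_totals] for r in row_totals]
chi2 = sum((fo - fe) ** 2 / fe
           for obs_row, exp_row in zip(table, expected)
           for fo, fe in zip(obs_row, exp_row))
df = (len(table) - 1) * (len(table[0]) - 1)      # (R - 1)(C - 1)
```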
154
When estimating effect size for a chi-square test-for-independence, what coefficient should you use?
For a 2 x 2 table use the _phi coefficient_

For tables larger than 2 x 2 use _Cramer's V_
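Both coefficients are simple functions of χ², N, and the table dimensions; the values below are hypothetical. Cramer's V divides by min(R − 1, C − 1), so it reduces to phi for a 2 x 2 table.

```python
import math

chi2, N = 0.794, 100   # hypothetical test-for-independence results
R, C = 2, 2            # table dimensions
phi = math.sqrt(chi2 / N)               # phi, for 2 x 2 tables
k = min(R - 1, C - 1)
cramers_v = math.sqrt(chi2 / (N * k))   # Cramer's V; equals phi when k = 1
```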
155
What are the assumptions for chi-square tests?
1. Independence of observations 2. Expected frequencies should be at least 5
156
True or False: Observed frequencies can be less than 5 in a chi-square test
True
157
What test do you use to compare two independent sets of ordinal scores or interval/ratio scores if independent-samples t-test assumptions are violated?
Mann-Whitney U-test
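The U statistic can be computed by simple pair counting on hypothetical scores; `scipy.stats.mannwhitneyu` reports the same statistic along with a p-value.

```python
# Mann-Whitney U by pair counting: U1 = number of times a score in
# group1 beats a score in group2 (ties count as half a win).
group1 = [3, 5, 6, 8]
group2 = [1, 2, 4, 7]
U1 = sum((a > b) + 0.5 * (a == b) for a in group1 for b in group2)
U2 = len(group1) * len(group2) - U1
U = min(U1, U2)   # the smaller U, the larger the group difference
```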
158
What test do you use to compare two sets of related or repeated-measures scores measured on an ordinal scale, or interval/ratio scores if related-samples t-test assumptions are violated?
Wilcoxon signed-ranks test
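A pure-Python sketch of the signed-ranks statistic on hypothetical paired scores; `scipy.stats.wilcoxon` is the standard tool and also supplies a p-value.

```python
# Wilcoxon signed-ranks: rank the |differences| (averaging tied ranks),
# then sum the ranks of the positive and of the negative differences.
before = [10, 12, 9, 15, 11]
after = [12, 11, 13, 18, 10]
diffs = [a - b for a, b in zip(after, before) if a != b]  # drop zero diffs
absd = [abs(d) for d in diffs]

def rank(v):
    # average rank of value v among the |differences| (handles ties)
    return sum(x < v for x in absd) + (sum(x == v for x in absd) + 1) / 2

w_plus = sum(rank(abs(d)) for d in diffs if d > 0)
w_minus = sum(rank(abs(d)) for d in diffs if d < 0)
W = min(w_plus, w_minus)   # the smaller W, the larger the difference
```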
159
For Mann-Whitney and Wilcoxon tests, the _______ the test statistic, the larger the difference between groups or conditions
For Mann-Whitney and Wilcoxon tests, the **smaller** the test statistic, the larger the difference between groups or conditions
160
What is the null hypothesis for the Mann-Whitney U-test?
the ranks for one group are not systematically higher or lower than the ranks for another group
161
What is the null hypothesis for the Wilcoxon signed-ranks test?
difference scores are not systematically positive or negative
162
Which test is used to evaluate differences between three or more treatment conditions (or populations) using ordinal data from an independent-measures design?
Kruskal-Wallis test
163
What is the difference between a Kruskal-Wallis test and a one-way independent-measures ANOVA?
ANOVA requires interval or ratio scale scores that can be used to calculate means and variances. The Kruskal-Wallis test, on the other hand, simply requires that you are able to rank order the individuals for the variable being measured
164
A ___________ can be used as the nonparametric alternative to a one-way independent-measures ANOVA if the assumptions of the ANOVA are violated
A **Kruskal-Wallis test** can be used as the nonparametric alternative to a one-way independent-measures ANOVA if the assumptions of the ANOVA are violated
165
The Kruskal-Wallis test is similar to the Mann-Whitney test. However, the ___________ is limited to comparing only two treatments, whereas the __________ is used to compare three or more treatments
The Kruskal-Wallis test is similar to the Mann-Whitney test. However, the **Mann-Whitney test** is limited to comparing only two treatments, whereas the **Kruskal-Wallis test** is used to compare three or more treatments
166
What is the null hypothesis for the Kruskal-Wallis test?
There is no tendency for the ranks in any treatment population to be systematically higher or lower than the ranks in any other treatment population.
167
What is the alternative hypothesis for the Kruskal-Wallis test?
The ranks in at least one treatment population are systematically higher or lower than the ranks in another treatment population.
168
How do you calculate the Kruskal-Wallis *H* statistic?
1. Combine the individuals from all the separate samples and rank order the entire group
   * i.e., rank all scores without regard to treatment condition
2. Regroup the individuals into the original samples and compute the sum of ranks (T) for each sample
   * i.e., add up the ranks for each treatment condition
3. Compute the Kruskal-Wallis statistic, which is distributed as a chi-square statistic with degrees of freedom equal to the number of samples minus one:

*H* = [12 / *N*(*N* + 1)] Σ(*T*² / *n*) − 3(*N* + 1)

Where *N* = total number of scores, *T* = sum of ranks for each sample, and *n* = number of scores in each sample
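The three steps can be followed directly on hypothetical ordinal scores from three independent groups; with no ties, `scipy.stats.kruskal` returns the same H.

```python
# Kruskal-Wallis H: rank all scores together, sum ranks per group,
# then H = 12/(N(N+1)) * sum(T^2 / n) - 3(N+1).
groups = [[1, 4, 6], [3, 5, 8], [2, 7, 9]]   # hypothetical scores
pooled = [s for g in groups for s in g]
N = len(pooled)

def rank(v):
    # average rank of v in the pooled scores (handles ties)
    return sum(x < v for x in pooled) + (sum(x == v for x in pooled) + 1) / 2

T = [sum(rank(s) for s in g) for g in groups]   # sum of ranks per group
H = (12 / (N * (N + 1)) * sum(t ** 2 / len(g) for t, g in zip(T, groups))
     - 3 * (N + 1))
df = len(groups) - 1
```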
169
If the null hypothesis of a Kruskal-Wallis test is true what do we expect?
If the null hypothesis is true, we would expect the sums of ranks (T’s) to be more or less equal (aside from differences due to the sizes (n’s) of the samples). Thus, the Kruskal-Wallis statistic measures the degree to which the T’s differ from one another.
170
True or False: Kruskal-Wallis assumes normality and homogeneity of variance
False Kruskal-Wallis **does not** assume normality and homogeneity of variance
171
When ranking scores for a Kruskal-Wallis test, what do you do to tied scores?
Give tied scores the average of the affected rank positions
172
Like with the Mann-Whitney U-test, the _________ provide information about which groups had larger values than others.
Like with the Mann-Whitney U-test, the **Mean Ranks** provide information about which groups had larger values than others.
173
How do you calculate the number of pairwise comparisons?
number of comparisons = k(k − 1) / 2

Where k = number of treatment conditions
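A quick sanity check of the formula: enumerating every pair of four hypothetical conditions gives 4(4 − 1)/2 = 6 comparisons.

```python
from itertools import combinations

conditions = ["A", "B", "C", "D"]           # hypothetical conditions
k = len(conditions)
pairs = list(combinations(conditions, 2))   # every pairwise comparison
count = k * (k - 1) // 2                    # formula: k(k - 1) / 2
```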
174
What type of post hoc test do you conduct for a Kruskal-Wallis test?
Pairwise Mann-Whitney U-tests
175
The ___________ is used to evaluate differences between three or more treatment conditions using ordinal data from a repeated-measures design
The **Friedman test** is used to evaluate differences between three or more treatment conditions using ordinal data from a repeated-measures design
176
What is the difference between a one-way repeated-measures ANOVA and a Friedman test?
ANOVA requires interval or ratio scale scores that can be used to calculate means and variances. The Friedman test, on the other hand, simply requires that you are able to rank order the individuals across treatments
177
A __________ can be used as the nonparametric alternative to a one-way repeated-measures ANOVA if the assumptions of the ANOVA are violated
A **Friedman test** can be used as the nonparametric alternative to a one-way repeated-measures ANOVA if the assumptions of the ANOVA are violated
178
For both a Kruskal-Wallis and a Friedman test what must interval/ratio scale data be converted to?
Ordinal data
179
The Friedman test is similar to the Kruskal-Wallis test. However, the ___________ is used for independent-measures designs, whereas the _____________ is used for repeated-measures designs
The Friedman test is similar to the Kruskal-Wallis test. However, the **Kruskal-Wallis test** is used for independent-measures designs, whereas the **Friedman test** is used for repeated-measures designs
180
What is the null hypothesis for a Friedman test?
The ranks in one treatment condition should not be systematically higher or lower than the ranks in any other treatment condition.
181
What is the alternative hypothesis for a friedman test?
The ranks in at least one treatment condition should be systematically higher or lower than the ranks in another treatment condition.
182
How do you calculate the Friedman statistic, χ²F?
1. Each individual's scores must be ranked across the treatment conditions
   * i.e., for each participant, rank the scores in the treatment conditions from smallest to largest
2. Compute the sum of ranks (R) for each treatment condition
   * i.e., add up the ranks for each treatment condition
3. Compute the Friedman statistic, which is distributed as a chi-square statistic with degrees of freedom equal to the number of treatments minus one:

χ²F = [12 / *nk*(*k* + 1)] ΣR² − 3*n*(*k* + 1)

Where *n* = number of participants, *k* = number of treatment conditions, and R = sum of ranks for each condition
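The same steps on hypothetical repeated-measures data, one row per participant and one column per condition; with no ties, `scipy.stats.friedmanchisquare` agrees with this value.

```python
# Friedman chi-square: rank each participant's scores across the k
# conditions, sum ranks per condition, then
# chi2_F = 12/(n*k*(k+1)) * sum(R^2) - 3*n*(k+1).
data = [[3, 1, 2],    # hypothetical scores: one row per participant,
        [2, 1, 3],    # one column per treatment condition
        [3, 2, 1],
        [3, 1, 2]]
n, k = len(data), len(data[0])

def row_ranks(row):
    # rank within one participant, averaging tied ranks
    return [sum(x < v for x in row) + (sum(x == v for x in row) + 1) / 2
            for v in row]

ranked = [row_ranks(row) for row in data]
R = [sum(col) for col in zip(*ranked)]   # sum of ranks per condition
chi2_F = 12 / (n * k * (k + 1)) * sum(r ** 2 for r in R) - 3 * n * (k + 1)
df = k - 1
```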
183
If the null hypothesis of a Friedman test is true what do we expect?
If the null hypothesis is true, we would expect the sums of ranks (R’s) to be more or less equal. Thus, the Friedman statistic measures the degree to which the R’s differ from one another.
184
What post hoc tests do you conduct following a significant Friedman test?
Wilcoxon signed-ranks test