2nd exam Flashcards

(141 cards)

1
Q

Validity looks at what

A

Accuracy: is the test measuring what it is intended to measure?

2
Q

The overarching validity; all of the others fall under this one

A

Construct Validity

3
Q

Does validity look at entire test or item quality?

A

Validity looks at the entire test; item quality falls under reliability

4
Q

What are the validities under construct validity

A

Content, convergent, discriminant, criterion related, incremental, ecological

5
Q

Construct validity

A

assessing the accuracy of the test to measure certain personality or psychological constructs

6
Q

One problem with construct validity

A

the test must be robust enough to accurately measure psychological constructs that often are not stable

7
Q

With construct validity, what do we need to understand when the theories are changing?

A

that the testing is changing too

8
Q

What does a test need in order to truly measure what it is intended to measure?

A

it has to have a certain level of timelessness

9
Q

Content validity

A

items need to cover the material that the instrument is supposed to cover; how relevant are the items to the construct?

10
Q

Content validity is looking at 2 questions

A

does the test cover a representative sample of specified skills and knowledge? Is the test performance reasonably free from influence of irrelevant variables?

11
Q

What does content validity assume?

A

assumes that you have a good, detailed description of the content domain, which is not always possible

12
Q

What are the issues with using experts for content validity?

A

Experts may be too invested in or biased toward the construct, or they lack lived experience and have only theoretical knowledge

13
Q

With content validity, what happens if the measure does not have appropriate content?

A

you will make incorrect or erroneous clinical judgments based on the measure

14
Q

Why is content validity important to consider

A

The development of a measure may make it valid only under certain circumstances or time periods, such as changing views of gender nowadays
ex: cultural variability: not all cultures experience depression the same way

15
Q

Differences between trait-based depression and state-based depression?

A

Trait: endogenous; no precipitating factors, no situation caused it, it happened chemically
State: there is a stressor; it has precipitating factors

16
Q

One way we come up with content validity is by using

A

Focus groups

17
Q

Focus groups

A

allow you to go to individuals who have experienced the construct so that, in a group setting, they can help generate appropriate items for that construct

18
Q

How are focus groups beneficial?

A

you get a deeper understanding of the construct, and people may feel comfortable sharing their experiences in a group

19
Q

Lived experience within focus groups helps with

A

the accuracy of the construct; it creates better validity

20
Q

Why were focus groups not used as much when tests were created between 1967 and 2002?

A

It is hard to find a sample and it is hard to get funding for it

21
Q

What are some drawbacks of content validity focus groups?

A

some people are limited in their ability to participate (e.g., the chronically mentally ill), it is hard to develop focus groups for rare constructs, and some facilitators will lead the group in a biased fashion

22
Q

Examples of questions for focus groups

A

“What was it like when you were deployed?” (asked of veterans, to help with the language content in the test)

23
Q

What happens in content validity after the items have been generated through experts and focus groups?

A

experts will evaluate your scales and response options and help you write clearer questions

24
Q

Criterion related validity

A

assesses the degree to which the scores on an instrument accurately compare with a relevant criterion variable (a real-life variable)

25
Criterion
real world implication
26
Two types of criterion related validity
Predictive and concurrent validity; both have a validity coefficient, and the closer it is to 1, the more accurate the prediction
27
Predictive validity
predicts an outcome, the criterion (ex: the SAT predicts first-year college GPA, a real-life outcome)
28
In predictive validity, validity coefficients look at
correlating the test scores and the criterion variable for each person
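(Illustration, not one of the cards: a minimal Python sketch of the idea above, using invented SAT and GPA numbers. The validity coefficient is simply the Pearson correlation between each person's test score and their criterion score.)

```python
import numpy as np

# Invented data: SAT scores (predictor) and first-year college GPA
# (criterion) for the same ten students.
sat = np.array([1050, 1190, 1280, 1330, 1400, 1440, 1500, 1520, 1560, 1580])
gpa = np.array([2.4, 2.7, 2.9, 3.0, 3.2, 3.1, 3.5, 3.4, 3.7, 3.8])

# The validity coefficient is the correlation between test scores
# and the criterion variable across test takers.
r = np.corrcoef(sat, gpa)[0, 1]
print(f"validity coefficient r = {r:.2f}")  # the closer to 1, the better the prediction
```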
29
Concurrent validity
a new test is administered and the criterion variable is also collected at the same time; you want to see if the measurement matches the criterion (the real-life variable)
30
Example of concurrent validity
ex: Chicago School interview day: we wrote a paper and had an interview (the interview is the real-life criterion because it is an interaction); you want to see if the two are correlated
31
Convergent validity
the test is shown to be correlated with another measure that examines the same construct (the BDI and the Hamilton Depression Inventory should have high convergent validity)
32
2nd type of convergent validity
measuring the same construct with multiple different measurement methods (ex: self-report of impulse control and family reports of impulse-control problems)
33
convergent validity should be close to
1, which indicates that both measures are considered to be measuring the same construct
34
Discriminant validity
where the test is shown to be minimally correlated with another measure that examines a dissimilar construct
35
With discriminant validity we want the validity to be
close to 0, to show there is no overlap between the measures, or only minimal association
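(Illustration, not one of the cards: a simulated Python sketch of both ideas. Two measures built from the same latent construct should correlate near 1, convergent; a measure of a dissimilar construct should correlate near 0, discriminant. All data and scale names here are invented analogues.)

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Two depression measures built from the same latent construct,
# plus one measure of an unrelated construct (all simulated).
depression = rng.normal(size=n)
bdi = depression + rng.normal(scale=0.3, size=n)       # BDI analogue
hamilton = depression + rng.normal(scale=0.3, size=n)  # Hamilton analogue
extraversion = rng.normal(size=n)                      # dissimilar construct

r = np.corrcoef([bdi, hamilton, extraversion])
print(f"convergent   r(BDI, Hamilton)     = {r[0, 1]:.2f}")  # expect near 1
print(f"discriminant r(BDI, extraversion) = {r[0, 2]:.2f}")  # expect near 0
```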
36
The problem with discriminant validity
it is very hard to find constructs that are opposites of one another; many overlap
37
Face validity (not a real validity)
pertains to whether the test looks valid to the examinees taking it; a subjective and imprecise form of validity
38
Procedures to ensure validity
contrasted groups, previously available tests, criterion established by the rater, age differentiation, physical evidence of the behavior, real time observations in the world, controlled stimuli that depict variations in behaviors
39
Contrasted groups
give a test to two different samples that differ with regard to a specific trait; the groups contrast
40
Previously available tests
using previous tests as the criterion against which the new test is compared (ex: the WAIS is always compared to the WISC; they should be correlated with one another)
41
Age differentiation
the test should be checked against chronological age or developmental milestones to determine whether scores increase with advancing age (ex: a 14-year-old cannot do what a 17-year-old can do)
42
Criterion established by the rater
a clinical interview/diagnosis is done and should match the instrument that was given; what you find in the instrument should also be found in a clinical intake interview
43
Criterion contamination
when the rater doing the clinical intake also knows what the testing has shown, it influences their diagnosis
44
The criterion established by the rater can sometimes cause
criterion contamination; this is a problem for ensuring validity
45
physical evidence of the behavior
have people wear pedometers and check whether their self-reported walking behavior (exercise) matches
46
real time observations in the world
very labor intensive (ex: GRE and college performance)
47
Controlled stimuli can be created that depict variations behaviors
videotape marital interactions and then see if the tapes correspond to the couples' self-report measures of marital satisfaction
48
Cautions when interpreting validity
1. Testing procedures are not always reproduced accurately
2. The criterion variable means nothing unless it is important and reliable
3. Make sure the sample is representative of the population
4. You need an adequate sample size
5. Don't confuse the criterion with the predictor (doing well on the SAT does not mean someone will get a 4.0 in college)
6. Check for restricted range on both predictor and criterion; you want the full normal curve
7. Ask whether it is generalizable
8. Consider differential predictions (it may not be the SAT that predicts GPA; it could be motivation)
49
sample size does what to validity
A larger sample size increases it
50
outliers do what to validity
Outliers can increase validity: with the full normal curve, you get all of the data
51
Incremental validity
a statistical method that measures how much more predictive a new assessment is than existing ones
52
Incremental validity is often seen in a chart of a
a regression table, looking at R squared
53
R squared means
the amount of variance in the criterion that can be predicted by a test
54
What happens when all of the R squared values in a regression table are added up?
The increase in R squared as each measure is added shows its incremental validity
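(Illustration, not one of the cards: a sketch of incremental validity with simulated data. The old test alone gives one R squared; adding the new test gives a larger one, and the increase, delta R squared, is the new test's incremental validity.)

```python
import numpy as np

def r_squared(X, y):
    """R squared from an ordinary least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return 1 - (y - X @ beta).var() / y.var()

rng = np.random.default_rng(1)
n = 300
old_test = rng.normal(size=n)
new_test = rng.normal(size=n)
criterion = 0.6 * old_test + 0.3 * new_test + rng.normal(size=n)  # invented outcome

r2_old = r_squared(old_test.reshape(-1, 1), criterion)
r2_both = r_squared(np.column_stack([old_test, new_test]), criterion)
print(f"R2 old test only: {r2_old:.2f}")
print(f"R2 old + new:     {r2_both:.2f} (delta R2 = {r2_both - r2_old:.2f})")
```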
55
Ecological Validity is related to what type of psychology?
Neuropsychology
56
Can your measure be valid and not reliable?
No, if you have accuracy you also have consistency
57
Can your measure be reliable and not valid?
Yes, you can have a test that is reliable but not valid
58
Ecological Validity
how well tests generalize to real-life or real-world settings
59
Why do you give tests looking at Ecological Validity to neuropsych patients?
They have issues with daily living
60
Two ways to establish Ecological Validity
Verisimilitude & Veridicality
61
Verisimilitude Pt. 1
concerned with the equivalence of tests to the everyday activities they simulate (ex: giving a grocery-list task to neuropsych patients)
62
Veridicality Pt. 2
the degree to which the test shows an empirical relation to measures of cognitive functioning (it should map onto every part of problem solving)
63
Standardization group
group of test takers who represent the population for which the test was intended
64
Norms
Performance of standardization group
65
Why are norms important?
norms give you a point of comparison between the individual you are testing and a larger group
66
Mental age
how far along the normal developmental path one has progressed
67
Example of mental age
on a math test given to 7-year-olds, those getting 15 out of 30 questions correct are developmentally 7, but those scoring lower seem developmentally younger; this can be a problem
68
Why is mental age a problematic norm?
Because it labels a child as delayed when they may simply have an issue with math or spelling
69
What happens if Norms are flawed?
The test becomes flawed; you have to know what the norms are, because otherwise the test is applied inappropriately
70
Tracking
determining growth on a specific biological path (ex: formula-fed babies rank at higher percentiles and breastfed babies at lower ones because, for profit from formula companies, the norming studies only looked at formula-fed infants; this creates a problematic norm)
71
Ordinal Scale
designed to identify the stage the child has reached in the development of specific behaviors or functions; generally seen in infancy
72
Grade equivalent
determined by computing the raw scores obtained by children in each grade, then converting them to the grade-placement scale
73
Problems with grade equivalent norms
often misinterpreted; they do not mean the same percentile rank for each content area
74
Gifted
gifted means they are strong in certain areas or tasks, performing above the norm
75
Example of grade equivalent
on a 30-question math test for 4th graders, Johnny got 24 answers correct, a 9th-grade equivalent. A parent may say "my child is performing at a 9th-grade level in 4th grade," but this is incorrect: the child is doing extremely well in 4th-grade math, not doing 9th-grade math
76
What is the gold standard of norms?
Within-group norms; they are easy to use
77
Within group norms
the individual's performance is evaluated in terms of the most nearly comparable standardization group (comparing a child's raw score with those of other children of a similar age)
78
How is percentile expressed in within group norms?
expressed as the percentage of persons in the standardization sample who fall below a given raw score
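(Illustration, not one of the cards: a minimal sketch of the percentile idea, with an invented norm group. The percentile rank of a raw score is the percentage of the standardization sample scoring below it.)

```python
import numpy as np

def percentile_rank(raw_score, standardization_sample):
    """Percentage of the standardization sample falling below raw_score."""
    sample = np.asarray(standardization_sample)
    return 100.0 * np.mean(sample < raw_score)

# Invented raw scores from same-age children in a standardization group.
norm_group = [12, 15, 18, 20, 21, 23, 24, 26, 28, 30]
print(percentile_rank(24, norm_group))  # 60.0 -> 60th percentile
```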
79
Problem with within group norms
there is an inequality of units at the extremes of the distribution
80
How do we determine a stratified sample?
Looking at census data
81
Standard score
expresses the individual's distance from the mean in terms of the standard deviation of the distribution
82
Scaled scores
takes into account grade-level norms and the difficulty of the test; these range from 1 to 100 or 100 to 1
83
Does the percentile change in scaled scores when you earn more points?
Not necessarily; you may get more points in your score, but the percentile may not change
84
Stanine
consists of single digits ranging from 1 to 9, with a mean score of 5 and a standard deviation of 2; it cuts the normal curve into 9 pieces
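(Illustration, not one of the cards: a simplified sketch using the mean-5 / SD-2 relationship from the card, converting standard scores to stanines and clipping to the 1-9 range. Real stanine assignment is often done from fixed percentage bands of the normal curve; the data are invented.)

```python
import numpy as np

def stanines(raw_scores):
    """Standardize scores, rescale to mean 5 / SD 2, and clip to 1-9."""
    scores = np.asarray(raw_scores, dtype=float)
    z = (scores - scores.mean()) / scores.std()  # standard (z) scores
    return np.clip(np.rint(z * 2 + 5), 1, 9).astype(int)

print(stanines([55, 60, 65, 70, 75, 80, 85, 90, 95]))  # e.g. [2 3 3 4 5 6 7 7 8]
```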
85
Who are domain-referenced tests best used for?
children with developmental delays or autism, because these children are not compared to others of the same age; they are compared only to their own best performance
86
Domain referenced test
describes the specific types of skills or tasks that the test taker demonstrates; the information is used to determine where the child needs growth
87
Do domain referenced tests have norms?
They have no norms
88
Standardization requires
a large enough sample that is representative of the population
89
What could be an issue with standardization and norms?
tests are normed differently; even if different tests are designed to measure the same construct, they may have very different standardization groups
90
Who was Cicchetti?
He developed the overlap sample, which means you use the same standardization group (the same people) for multiple tests
91
What two major developmental areas did Cicchetti focus on?
cognitive functioning (cognitive abilities) & adaptive functioning (daily-living skills, such as tying your shoes); these both need to be in line with each other
92
Why do overlap samples tend to not happen?
they are very costly and time consuming, so other ways to develop norms have to be found
93
Criterion referenced testing
another kind of test that does not have any norms; scores are compared to specified content (ex: mastery testing)
94
Fixed reference groups
scores on a test are compared to a fixed group at a specific time (ex: the SAT is a fixed-reference-norm test; the 1941 norming group was 11,000 white males, and their mean score became the mean of the SAT)
95
Cohort effect
everyone thinking the same thing (group thinking)
96
Does a fixed group norm have standardization?
No
97
Criterion Referenced Testing (mastery testing 2 basic components)
determining the proportion of items that must be correct to establish mastery, and how many items are necessary to determine mastery
98
Do fixed referenced norm tests work?
They are not fair to everyone!
99
Is SAT biased?
Yes because of fixed referenced norms
100
Anything based on census data is not what?
Fixed anymore
101
With criterion-referenced testing there are cutoff points; what are their disadvantages?
they may increase erroneous judgments (you should always have more than one testing measure to determine mastery)
102
Test bias applies to both
It looks at the item level, but at the quality of the overall test as well
103
Two types of test bias
gender bias with regard to math tests & racial bias against African Americans on cognitive tests (these have the most empirical data)
104
Gender Bias w/ math tests
test scores are systematically different for women who take standardized math tests compared to men
105
Racial Bias
systematic differences in African Americans' scores on cognitive tests as compared to their white counterparts
106
What happens to most African Americans when they get to college?
More than half drop out
107
With PHD, how is that racially biased?
Only 7% of PhDs go to African Americans, while 56% go to Whites
108
Test bias
systematic error that occurs in test scores when tests are applied to other ethnicities
109
How do African Americans score on IQ tests?
they score 15 points lower on IQ tests than their white counterparts (putting them 1 SD lower)
110
The Bell Curve book said that
the reason African Americans score lower on IQ tests is that they have predisposed cognitive deficits (a very racist claim)
111
Claude Steele
developed the concept of stereotype threat
112
Stereotype Threat
certain stereotypes are "in the air" and get activated under certain situations; those stereotypes dictate the way one behaves
113
Steele did a study with 2 conditions (one was described as problem solving, the other as a math test)
It was the same test, but women underperformed on the "math test" whereas men overperformed (this is stereotype threat): because of the gender stereotype, women believe they cannot do well on math
114
Race example of stereotype threat by Steele
He gave the same SAT test to individuals split into two conditions: one group was told to write down their race before starting, the other was told to just start the test. Black test takers did worse when they wrote down their race, compared to white test takers
115
There is no way around what?
Stereotype threat; it becomes activated in certain situations
116
Disidentification (With Stereotype Threat) Steele
rejecting specific traits of one's identity in order to maintain the status quo
117
Example of disidentification
People who may be bad at academics want to go to school to be an athlete; they reject the specific trait of academics, which closes the door to more academic opportunities
118
How do you stop from rejecting or disavowing specific traits?
By saying you are working on them
119
Issues with stereotype threat
According to Steele and Aronson, it does not follow that taking away the stereotype threat will eliminate the bias in scoring between Blacks and Whites
120
With test bias, what happened when biased questions were removed?
It was thought removing them would fix the bias, but there were no real differences in scoring
121
Differential item functioning
Attempts to identify on standardized tests those items that are biased against ethnic minority populations
122
Helms (2006) female African American psychologist
believes the issue is not psychometric equivalence; rather, testing by its nature poses issues of fairness, especially as it is applied to ethnically diverse populations
123
What does Helm feel is unfair?
to view African Americans who perform 1 SD below Whites as part of the achievement gap
124
According to Helms why are tests not fair?
not because they are psychometrically unsound, but because they minimize the effect of "internalized racial or cultural experience" that affects the test taker and the testing process
125
What is the reason for ethnic suppression?
African Americans and other ethnic minority groups experience cultural discrimination and prejudice that affect their testing performance
126
Reasons for test bias? Helms
It is sample dependent and certain ethnic groups respond in specific ways
127
What does Frisby (1999) say about test bias?
When stereotype threat and Helms's model are in play, they can affect test scores even if the test is theoretically sound
128
How do African Americans suffer from test bias?
They would not be selected or promoted for educational or career advancement about 85% of the time, since the gold standard of achievement is how Whites perform
129
Predictive bias
test scores show different predictions or classifications based on the group (majority vs. minority)
130
Slope Bias
two different regression lines, one for each group, create differential predictions; the test or measurement procedure yields systematically different validity coefficients for members of different groups
131
Intercept Bias
when two groups have similar slopes but their intercepts differ; in other words, they score similarly on the test but differ on the criterion score
132
What do you need with slope bias?
you need the two lines to have different slopes, meaning they are going different ways
133
Biases in tables look at
p (significance) less than .05 means there is significance, showing some type of bias against African Americans
134
Y= mx+b what does this mean?
M= slope & b= y-intercept (where the line hits the y-axis)
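(Illustration, not one of the cards: a simulated sketch of checking for slope and intercept bias by fitting y = mx + b separately for two groups. Group names and numbers are invented; similar slopes with different intercepts suggest intercept bias, while clearly different slopes suggest slope bias.)

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_line(x, y):
    """Least-squares fit of y = m*x + b; returns (m, b)."""
    m, b = np.polyfit(x, y, deg=1)
    return m, b

# Invented test scores (x) and criterion outcomes (y) for two groups.
x_a = rng.uniform(0, 100, 80)
y_a = 0.9 * x_a + 5 + rng.normal(scale=8, size=80)
x_b = rng.uniform(0, 100, 80)
y_b = 0.9 * x_b - 10 + rng.normal(scale=8, size=80)  # same slope, lower intercept

m_a, b_a = fit_line(x_a, y_a)
m_b, b_b = fit_line(x_b, y_b)
print(f"group A: slope {m_a:.2f}, intercept {b_a:.1f}")
print(f"group B: slope {m_b:.2f}, intercept {b_b:.1f}")
```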
135
When is there an overprediction of a groups scores in a graph?
when one group's line is higher or farther along than the other group's line
136
When do you see intercept bias?
when the lines hit the y-axis at different points, there is intercept bias
137
When is criterion the same in a graph?
when the scatterplot points center on the same horizontal line of the criterion axis
138
Scatterplot divided into four quadrants
correctly accepted: points scattered close together
incorrectly accepted: next to correctly accepted, below it
correctly rejected: many points, but more scattered from each other
incorrectly rejected: fewer points, next to correctly rejected
139
in a validity chart, the ones in parenthesis are
reliability coefficients of internal consistency
140
in a validity chart, the ones that are underlined?
convergent validity: different measurement methods looking at the same constructs
141
Heart of scaling and classification & heart of test construction
Content validity