Test Bias & Test Utility Flashcards

1
Q

If we discover that one group of people score higher on a test than another group of people, what are the possible underlying reasons for this?

A

The test may be biased or the two groups may actually be different

2
Q

If our test is biased then what are our options for dealing with this?

A

Have an accommodation within existing tests (e.g. extra time, point correction); redevelop the test; develop an alternative special test for the group; avoid testing altogether (but the alternatives may not be less biased)

3
Q

What are some issues with alternative tests like the BITCH (Black Intelligence Test of Cultural Homogeneity) and the Chitling Test?

A

They may not exhibit psychometric validity; tend to yield much higher scores for groups at which they’re directed, but haven’t demonstrated predictive validity (e.g. not predictive of real world outcomes, like job performance or academic achievement)

4
Q

Describe how we can use regression lines to model test bias

A

We can look at regression lines for criterion validity scatterplots separately for different groups; e.g. look at the relationship between how well people are predicted to do at a job (test score) and how well they actually do (job performance)

5
Q

If all of group A are scoring higher on both test score and job performance than group B, what does this mean?

A

Either the test isn’t biased, or both the job performance measure and the test score are equivalently biased (discriminating against group B)

6
Q

When does Intercept Bias occur, and what does it mean?

A

When the regression lines for the groups have the same slope but cross the vertical axis at different points; the test is biased against one group (the same test score corresponds to a different level of job performance in each group), but it is still equally predictive of job performance within each group (although scores can’t be compared directly across groups)

7
Q

When does Slope Bias occur, and what does it mean?

A

When the slopes of the regression lines for the groups differ; the test is biased against one group and is also less predictive of job performance for that group; the test is differentially valid for the two groups
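
As a rough illustration of the intercept-bias and slope-bias cards, the sketch below fits a separate least-squares line to each group's (test score, job performance) data and compares slopes and intercepts. The data, the function names `fit_line` and `describe_bias`, and the exact-match tolerance are all assumptions made for this example; with real, noisy data the difference between lines would need a proper statistical test rather than a simple comparison.

```python
from statistics import mean


def fit_line(test_scores, performance):
    """Least-squares fit of performance = intercept + slope * test_score."""
    mx, my = mean(test_scores), mean(performance)
    sxx = sum((x - mx) ** 2 for x in test_scores)
    sxy = sum((x - mx) * (y - my) for x, y in zip(test_scores, performance))
    slope = sxy / sxx
    return slope, my - slope * mx


def describe_bias(line_1, line_2, tol=1e-9):
    """Compare two (slope, intercept) pairs; tol is an illustrative threshold."""
    (slope_1, intercept_1), (slope_2, intercept_2) = line_1, line_2
    if abs(slope_1 - slope_2) > tol:
        return "slope bias"
    if abs(intercept_1 - intercept_2) > tol:
        return "intercept bias"
    return "no difference detected"


# Hypothetical, noise-free data: same test scores, different job performance.
scores = [10, 20, 30, 40]
line_a = fit_line(scores, [25, 35, 45, 55])  # performance = 15 + 1.0 * score
line_b = fit_line(scores, [15, 25, 35, 45])  # performance = 5 + 1.0 * score
line_c = fit_line(scores, [20, 25, 30, 35])  # performance = 15 + 0.5 * score

print(describe_bias(line_a, line_b))  # same slope, different intercept
print(describe_bias(line_a, line_c))  # different slope
```

Groups A and B share a slope, so the test predicts equally well within each, but the same score maps to different performance levels. Groups A and C differ in slope, which is the slope-bias case.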

8
Q

What does “differentially valid” mean?

A

The test’s validity differs between the groups: the scatter of points is greater for one of the groups, so the correlation between job performance and test score is lower for that group
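
One way to see differential validity in numbers is to compute the test-criterion correlation separately for each group. The sketch below does this with a hand-written Pearson correlation; the two small data sets and the helper name `pearson_r` are invented for illustration only.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical test scores and job performance for two groups.
test_scores = [1, 2, 3, 4, 5]
performance_tight = [2, 4, 6, 8, 10]      # points fall exactly on a line
performance_scattered = [6, 2, 10, 4, 8]  # much more scatter around the line

r_tight = pearson_r(test_scores, performance_tight)
r_scattered = pearson_r(test_scores, performance_scattered)
print(round(r_tight, 2), round(r_scattered, 2))  # 1.0 0.3
```

The group with more scatter has the lower validity coefficient, so the same test is a weaker predictor of job performance for that group.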

9
Q

Some people have argued that the construct of race is primarily social and has no biological meaning (especially in societies like the US). What argument has been made for the idea that race differences in intelligence test scores are due to genetics?

A

Black children raised in white families with white education tend to still do worse at school and score lower on IQ tests (argued to show innate race differences)

10
Q

What arguments have been made against the idea that race differences in intelligence scores are due to genetics?

A

School achievement and even IQ are partly a function of teacher expectation (e.g. Pygmalion effect); a minority group member may still face many disadvantages in such a situation, especially where group membership is superficially obvious (e.g. race)

11
Q

Describe the Pygmalion Effect found in Rosenthal and Jacobson’s experiment

A

School children were given a non-verbal IQ test; teachers were given a list of children said to have performed in the top 20% and identified as bloomers (the children were actually chosen at random); in the earliest grades, bloomers scored significantly higher at the end of the year

12
Q

Steele and Aronson had students complete the Graduate Record Examination and divided them into 2 groups. How did they demonstrate the effect of self-stereotyping?

A

Group 1 were told the test measured intellectual ability and group 2 were told it was about problem-solving; African-Americans did worse than white Americans when in group 1 but didn’t differ in group 2

13
Q

Describe the experiment that Shih et al. carried out to demonstrate the effect of self-stereotyping with Asian American women

A

They gave them a maths test; when previously given a questionnaire relating to racial identity they did better on the maths test than controls, but performed worse when previously completing a questionnaire relating to gender

14
Q

List the Commonwealth of Australia Acts that impact the use of psychological tests for employment purposes

A

Racial Discrimination Act; Age Discrimination Act; Human Rights and Equal Opportunity Commission Act; Sex Discrimination Act; Disability Discrimination Act; Fair Work Act

15
Q

According to the Disability Discrimination Act, to overcome a claim of discrimination, what must the deficit be directly tied to?

A

An inherent requirement of the job

16
Q

What disabilities does the Disability Discrimination Act include?

A

Physical, intellectual, psychiatric, sensory, neurological, learning disabilities, physical disfigurement, and the presence of disease-causing organisms in the body

17
Q

Describe the requirements for psychological testing in relation to employment under Australian law

A

All tests must measure the person for the inherent requirements of the job, not the person in the abstract (content and criterion validity needed); aptitude and personality tests must relate directly to the genuine requirements of the job; should only be used to assess an applicant’s suitability for the position based on selection criteria; other information (e.g. personality/private life) shouldn’t be used when deciding about their suitability

18
Q

The only Australian case that discussed personality tests in the context of employment law was Australian Industrial Relations Commission vs. Coms21 (1999). What occurred?

A

Employees from a consulting firm were unfairly dismissed on the basis of a personality test and an inaccurate consultant’s report, without regard to objective, reasonable and justifiable selection criteria (no associated skill or competency tests were used)

19
Q

Discuss the issues that have arisen when the use of psychological testing has been taken to court on the grounds of bias

A

Overrepresentation of mentally-retarded children with Spanish surnames; IQ tests used to place children in educable mentally retarded classes (disproportionate effect on black children); When WISC-R and Stanford-Binet were argued to be racially biased, judge ruled evidence as unconvincing; Inconsistencies in court decisions are commonplace; validity/reliability of many psychological tests currently being decided by the courts

20
Q

What does Test Utility refer to, and why is it important?

A

The practical usefulness of a test; it’s not enough for a test to be good (i.e. reliable and valid), it has to yield a worthwhile benefit that outweighs its costs (e.g. money, time, inconvenience, etc.)

21
Q

Is a test with poor reliability and validity likely to have good utility?

A

Not if the utility includes interpreting the test score, but there may be situations where the test score is less important, such as when the test is being used for some other purpose (e.g. lie detector tests can be useful even if they don’t work)

22
Q

What is utility analysis?

A

A family of different techniques that can be used to decide the usefulness of a test; can also be applied to interventions (e.g. to decide most preferable training/therapy program)

23
Q

What is an expectancy table, and how do you calculate one?

A

It’s a utility analysis strategy generated from a criterion validity scatterplot; the scatterplot is divided into categories, based on test performance and job performance criteria, a cut-off (pass mark) is determined, and the number of correct hits, false positives, false negatives and correct negatives are calculated
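
The counting step described above can be sketched in a few lines. In this minimal example the applicant data, the two cut-off values, and the function name `expectancy_counts` are all hypothetical; it only illustrates how each point in the scatterplot falls into one of the four cells.

```python
def expectancy_counts(applicants, test_cutoff, performance_cutoff):
    """Count the four expectancy-table cells for (test score, job performance) pairs."""
    counts = {
        "correct hits": 0,       # passed the test and performs well
        "false positives": 0,    # passed the test but performs poorly
        "false negatives": 0,    # failed the test but performs well
        "correct negatives": 0,  # failed the test and performs poorly
    }
    for test_score, performance in applicants:
        passed = test_score >= test_cutoff
        good = performance >= performance_cutoff
        if passed and good:
            counts["correct hits"] += 1
        elif passed:
            counts["false positives"] += 1
        elif good:
            counts["false negatives"] += 1
        else:
            counts["correct negatives"] += 1
    return counts


# Hypothetical (test score, job performance) pairs.
applicants = [(80, 9), (75, 4), (60, 8), (55, 3), (90, 7), (40, 2), (65, 6), (50, 7)]

table = expectancy_counts(applicants, test_cutoff=60, performance_cutoff=6)
print(table)
```

With these made-up numbers the table has 4 correct hits, 1 false positive, 1 false negative and 2 correct negatives.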

24
Q

What are the selection ratios in the context of expectancy tables?

A

The ratio between available job positions and number of applicants (e.g. 2 positions: 4 applicants = selection ratio of .5)
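
The arithmetic in the example above can be written as a one-line function. The name `selection_ratio` is just a label chosen for this sketch.

```python
def selection_ratio(positions, applicants):
    """Available positions divided by number of applicants."""
    if applicants <= 0:
        raise ValueError("there must be at least one applicant")
    return positions / applicants


print(selection_ratio(2, 4))   # 0.5, the example from the card
print(selection_ratio(2, 40))  # 0.05, a much more selective situation
```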

25
Q

What are false positives and false negatives in the context of expectancy tables?

A

False positives: test incorrectly identifies person as being good when they’re not;
False negatives: test incorrectly identifies person as being no good when they’re good

26
Q

What is a cut off in the context of expectancy tables?

A

The minimum test score needed to be hired (pass mark)

27
Q

Imagine you are an HR manager. Describe how you could use expectancy table techniques to deal with (1) a high selection ratio and (2) a low selection ratio

A
  1. High selection ratio: if we had lots of job positions, we could lower the pass mark of the test, so that fewer good applicants miss out
  2. Low selection ratio: if we only had a few job positions to fill, we could raise the pass mark of the test, so that fewer bad applicants are hired
28
Q

What risks would you be running with either a high or low selection scenario?

A

High selection: lowering the pass mark could lead to an increase in false positives (more bad people hired);
Low selection: raising the pass mark could lead to an increase in false negatives (good people not hired)
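
The trade-off between the two kinds of error can be seen by sweeping the pass mark over a small data set. The sketch below is illustrative only: the applicant data, the performance threshold of 6, and the function name `errors_at_cutoff` are assumptions, not values from any real selection study.

```python
def errors_at_cutoff(applicants, test_cutoff, performance_cutoff=6):
    """Return (false positives, false negatives) for a given pass mark."""
    false_positives = sum(
        1 for score, perf in applicants
        if score >= test_cutoff and perf < performance_cutoff
    )
    false_negatives = sum(
        1 for score, perf in applicants
        if score < test_cutoff and perf >= performance_cutoff
    )
    return false_positives, false_negatives


# Hypothetical (test score, job performance) pairs.
applicants = [(80, 9), (75, 4), (60, 8), (55, 3), (90, 7), (40, 2), (65, 6), (50, 7)]

results = {cutoff: errors_at_cutoff(applicants, cutoff) for cutoff in (45, 60, 80)}
for cutoff, (fp, fn) in results.items():
    print(f"pass mark {cutoff}: {fp} false positives, {fn} false negatives")
```

In this made-up data set, lowering the pass mark to 45 removes the false negatives but admits two poor performers, while raising it to 80 removes the false positives but rejects three good performers.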

29
Q

How many categories can an expectancy table have?

A

Multiple categories rather than just two (good/bad or pass/fail)

30
Q

What are the disadvantages of using the expectancy table technique?

A

It assumes a linear relationship between job performance and test score; doesn’t take into account other factors (e.g. minority status, physical health of applicant, etc)

31
Q

What influence does the criterion validity of a particular selection test have on its usefulness in recruiting people?

A

It can tell us if we’re choosing the right people; a test with very low criterion validity is little better than selecting applicants at random (more scatter); a test with high criterion validity is very good at choosing the right people (strong positive correlation)

32
Q

How did Schmidt et al. use utility analysis to save employers millions of dollars?

A

Through analysis of the Programmer Aptitude Test (PAT) used to select computer programmers; when compared with previous selection procedures, it yielded a much higher validity coefficient, saving the employer $6 million per year