Assessment Flashcards

(101 cards)

1
Q

Item difficulty index

A

percentage of people who got an item CORRECT

The lower the score, the more difficult the question is
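
As a minimal sketch of the definition above, the index is just the proportion of correct responses to an item. The function name and the 1/0 response coding are assumptions made for illustration.

```python
def item_difficulty(responses):
    """Proportion of examinees who answered the item correctly.

    `responses` is a list of 1 (correct) and 0 (incorrect); this coding
    is an assumption made for the example.
    """
    if not responses:
        raise ValueError("responses must not be empty")
    return sum(responses) / len(responses)


# Four of five examinees correct: a relatively easy item.
easy_item = item_difficulty([1, 1, 1, 1, 0])
# One of five examinees correct: a relatively difficult item.
hard_item = item_difficulty([1, 0, 0, 0, 0])
print(easy_item, hard_item)
```

The lower value belongs to the harder item, which matches the card: a lower index means a more difficult question.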

2
Q

Who Americanized the Binet?

A

Lewis Terman
at Stanford University
thus, Stanford-Binet

3
Q

The Buckley Amendment

A

AKA FERPA (Family Educational Rights and Privacy Act of 1974)
Those over 18 can view their own school record (including test data)
Parents can view their children's test data
Can demand corrections to their file
Educational testing information cannot be released without adult consent

4
Q

Projective tests may also be called

A

self-expressive (e.g., sentence completion or word association)

5
Q

reactivity

A

clients/participants monitoring their own behavior and thus giving inaccurate answers

6
Q

How can you increase reliability?

A

Increase the test’s length

7
Q

Spearman-Brown Prophecy formula

A

used to estimate the impact that lengthening or shortening a test will have on a test’s reliability coefficient

when estimating split-half reliability, the Spearman-Brown prophecy formula can be used to compensate mathematically for the shorter length
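
The standard form of the formula is r_new = (n x r) / (1 + (n - 1) x r), where r is the current reliability and n is the factor by which the test length changes. A minimal sketch follows; the function and parameter names are assumptions for illustration.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length.

    reliability   -- current reliability coefficient (r)
    length_factor -- new length divided by old length (n); 2 doubles the test
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Split-half correction: the two halves correlate at .60, so the
# full-length test (n = 2) is estimated at 2(.60) / (1 + .60).
full_length = spearman_brown(0.60, 2)
# Halving a test with reliability .80 (n = 0.5) lowers the estimate.
half_length = spearman_brown(0.80, 0.5)
print(round(full_length, 2), round(half_length, 2))
```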

8
Q

Aptitude-achievement tests

A

GRE, MAT, MCAT, SAT

Ex: GRE attempts to predict graduate school performance but also tests level of current knowledge

9
Q

Generally, school selection tests assess

A

aptitude

10
Q

The ACA division for testing

A

AMECD

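Association for Measurement and Evaluation in Counseling and Development (an earlier name; the division is now AARC, the Association for Assessment and Research in Counseling)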

11
Q

Interests and abilities are ____ correlated.

A

not highly

12
Q

Bender Visual Motor Gestalt Test

A

named after Lauretta Bender
expressive projective measure, though known most for its ability to discern whether brain damage is present. Suitable for ages 4+ — client copies 9 geometric figures

13
Q

Interest inventories

A

Work best with high-school age and beyond

Interests are not stable until age 25

14
Q

Aptitude tests

A

assess Potential and Predict (aPtitude)

15
Q

Tests that analyze data outside of a given theory

A

factor-analytic tests

16
Q

Raymond Cattell

A

developed 16 Personality Factors

Responsible for defining fluid and crystallized intelligence

17
Q

James Cattell

A

coined the term “mental test”

18
Q

Projective tests use one of 3 formats

A

Association (word)
Completion (sentence)
Construction (draw a person)

19
Q

Projective tests use ___ stimuli

A

vague

20
Q

MMPI-2 for adolescents

A

MMPI-A

suitable for 14 to 18 y.o.s

21
Q

Arthur Jensen

A

caused tremendous controversy with his 1969 Harvard Educational Review article
Claimed Whites score 11-15 IQ points higher than Blacks, arguing that, due to slavery, Blacks were bred for strength rather than intelligence
Claimed that heredity contributes 80% to IQ and environment only 20%

22
Q

Robert Williams

A

made the BITCH (Black Intelligence Test of Cultural Homogeneity) to demonstrate that Blacks often excel when given a test with questions familiar to their community. Argued that IQ tests were part of “scientific racism”

23
Q

John Ertl

A

weirdo who claimed he invented an electronic machine to take the place of paper and pencil IQ tests. It literally had a strobe light on it.

24
Q

Group tests are ___ accurate and have __ reliability, compared to individual tests

A

less accurate and less reliable

25
Means and SDs of Wechsler and Binet IQ tests | Difference between the two
Wechsler: M = 100, SD = 15 | Stanford-Binet: M = 100, SD = 16 | The Binet is considered less suitable for adults, so the Wechsler is the most popular
26
Forms of the Wechsler IQ tests
WPPSI - preschool and primary; ages 2.6-7 | WISC - for children; 6-16.11 years | WAIS - age 16+
27
A 9 year old task on the Binet is one that X of 9-year-olds could answer correctly
50%
28
Today's Binet is scored...
with a Standard Age Score (SAS) | mean of 100 | SD of 16
29
IQ is calculated by
IQ = (Mental Age / Chronological Age) x 100
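
A quick worked example of this ratio formula, as a sketch; the function name and the use of ages in years are assumptions for illustration.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return mental_age / chronological_age * 100


# An 8-year-old performing at the level of a typical 10-year-old.
above_average = ratio_iq(10, 8)
# Mental age equal to chronological age always gives 100.
average = ratio_iq(9, 9)
print(above_average, average)
```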
30
Alternatives to the split-half method of measuring internal consistency (inter-item consistency) of a test
Cronbach's alpha | Kuder-Richardson-20 or KR-21
31
Cross-validation
When a researcher further examines a test's criterion validity by administering the test to a new sample. This helps ensure the test is applicable to other populations who will take the exam. Helps guard against error factors, which are likely to be present if the original sample was small. The cross-validation coefficient will likely be smaller than the initial validity coefficient. This is called "shrinkage"
32
J. P. Guilford
isolated 120 factors that added up to intelligence | Two of the dimensions are divergent thinking (coming up with new ideas) and convergent thinking (when divergent thoughts and ideas are combined into a singular concept)
33
Charles Spearman
in 1904, said that two factors were applicable to any mental task: g - general ability | s - specific ability
34
Francis Galton
cousin of Darwin!
first intelligence theory
believed intelligence was a single "unitary" factor and that exceptional abilities were genetic and ran in families
eugenics :/
Hereditary Genius (1869)
35
coefficient of determination
proportion of variance in one variable accounted for by another; square the correlation
ex: the same test is given twice to the same group of people and the correlation between the administrations is .70. The % of shared variance is .70 squared, which is .70 x .70 = .49 (49%)
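
The arithmetic on this card can be sketched in a couple of lines; the function name is an assumption for illustration.

```python
def coefficient_of_determination(r):
    """Shared variance between two variables: the correlation squared."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r ** 2


# The card's example: a correlation of .70 between two administrations.
shared = coefficient_of_determination(0.70)
print(f"{shared:.0%} shared variance")
```

Because the value is squared, a correlation of -.70 gives the same shared variance as +.70.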
36
For psychological tests, an acceptable reliability coefficient is X. For admissions to jobs/schools (achievement), it is X.
Psychological - .70 reliability is good | For admissions to jobs/schools (achievement) - .80 or even .90
37
A reliability coefficient of .70 means...
70% of the obtained score on the test represents the true score
30% of the obtained score can be accounted for by error
AKA 70% is true variance while 30% is error variance
(NOT that 70% of people who are tested will get their true score)
38
A reliable test is ___ valid. | A valid test is ___ reliable.
A reliable test is not always valid. A valid test is always reliable.
39
Incremental validity (2 definitions)
1. The process by which a test is refined and becomes more valid as contradictory items are dropped
2. ALSO refers to a test's ability to improve predictions when compared to existing measures. When a test has incremental validity, it gives you additional good info that wasn't available from other tests.
40
According to the 1974 committee that drafted Standards for Educational and Psychological Tests, face validity is ___
not required
41
A construct is any trait that ___
you cannot measure or observe directly
42
What is the #1 consideration in test construction
Validity
43
5 types of validity
content validity
construct validity
concurrent validity
predictive validity
consequential validity
44
content validity
AKA rational or logical validity | does the test examine the behavior under scrutiny?
45
construct validity
a test's ability to measure a theoretical construct (like intelligence, self-esteem, etc)
46
predictive validity
AKA empirical validity | test's ability to predict future behavior
47
concurrent and predictive validity may be lumped under ___ validity.
criterion validity, which is an estimate of the extent to which a measure agrees with a gold standard (i.e., an external criterion of the phenomenon being measured)
48
concurrent validity
relationship between an instrument's results and another currently obtainable criterion (give a new depression assessment to people you already know are depressed)
49
consequential validity
tries to ascertain the social implications of using tests
50
horizontal vs vertical tests
horizontal - assess for different things (math, language) | vertical - versions for different age brackets or levels of education (preschooler, middle-school math assessments)
51
spiral test
the items get progressively more difficult
52
cyclical test
you have several sections that spiral in nature, the items within each spiral get progressively more difficult
53
ipsative test
does not measure absolute strengths
measures a person's progress in relation to themselves
comparing their score to another person's is meaningless
items are independent of one another
54
convergent validity
established when measures of constructs that theoretically should be related are observed to be related | e.g., scores on a GAD measure are related to scores on another anxiety measure
55
discriminant validity
established when measures of constructs that are not theoretically related are observed to have no relationship
56
standard error of estimate
a statistic that gives the expected margin of error in a predicted criterion score due to the imperfect validity of the test
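
The usual formula is SEE = SD_y x sqrt(1 - r^2), where SD_y is the standard deviation of the criterion scores and r is the validity coefficient. The sketch below applies it; the function and parameter names are assumptions for illustration.

```python
import math


def standard_error_of_estimate(criterion_sd, validity_coefficient):
    """Expected margin of error in a predicted criterion score."""
    if not -1.0 <= validity_coefficient <= 1.0:
        raise ValueError("validity_coefficient must lie between -1 and 1")
    return criterion_sd * math.sqrt(1 - validity_coefficient ** 2)


# Hypothetical values: criterion SD of 10 and a validity coefficient of .60.
see = standard_error_of_estimate(10, 0.60)
print(round(see, 2))
```

A perfectly valid test (r = 1) would give an error of 0, and a test with no validity (r = 0) would give an error equal to the criterion SD.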
57
validity coefficient
correlation between a test score and the criterion measure
58
a person's observed score (X) = ?
true score + error | X = T + e
59
standard error of measurement
SEM is used to estimate how scores from repeated administrations of the same instrument to the same individual are distributed around the true score
computed using the SD and the reliability coefficient (r): SEM = SD x sqrt(1 - r)
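
A short sketch of the SEM computation, with a rough band around an observed score. The function names, the example values (SD of 15, reliability of .84), and the plus-or-minus one SEM band are assumptions for illustration.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r), where r is the reliability coefficient."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1 - reliability)


def score_band(observed, sem, n_sems=1):
    """Band of +/- n_sems SEMs around an observed score."""
    return observed - n_sems * sem, observed + n_sems * sem


sem = standard_error_of_measurement(15, 0.84)
low, high = score_band(100, sem)
print(round(sem, 2), round(low, 2), round(high, 2))
```

Higher reliability shrinks the SEM, so the band around an observed score narrows.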
60
factors that influence reliability
test length
homogeneity of test items (reliability goes up when items are homogeneous)
range expansion (reliability is lowered by a restriction of range)
heterogeneity of test group (higher reliability)
speed tests (high reliability because nearly everyone gets everything right)
61
reliability coefficient
reliability is expressed in this coefficient | closer to 1.00, the more reliable the scores
62
NOIR
nominal - no order or equal intervals
ordinal - order, but no equal intervals
interval - equal intervals, but no true 0
ratio - equal intervals, true 0
63
Semantic differential
respondent places a mark between two opposite adjectives at the point that matches how they feel, e.g., Good _ _ _ _ _ _ Bad | like a Likert scale, but without numbered points
64
Thurstone scale
Agree or Disagree only
65
Guttman scale
measures the intensity of a variable because items are presented in a progressive order, so that a respondent who agrees with one statement will also agree with all previous, less extreme items
66
percentile rank
indicates the % of scores falling at or below a given score | range from 1 to 99+ and have a mean of 50
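
Following the "at or below" definition on this card, a percentile rank can be sketched as a simple count. The function name and the sample scores are assumptions for illustration; other definitions (for example, counting only half of the tied scores) also exist.

```python
def percentile_rank(score, all_scores):
    """Percentage of scores in `all_scores` falling at or below `score`."""
    if not all_scores:
        raise ValueError("all_scores must not be empty")
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100 * at_or_below / len(all_scores)


# Hypothetical norm group of five scores.
norm_group = [50, 60, 70, 80, 90]
rank = percentile_rank(70, norm_group)
print(rank)
```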
67
z-score
mean = 0 | SD = 1 | z = (X - M)/SD
68
T-score
mean = 50 | SD = 10 | T = 10(z) + 50
69
deviation IQ
also known simply as standard score (SS); used to interpret scores from achievement and aptitude tests | mean = 100 | SD = 15 | SS = 15(z) + 100
70
stanine
mean = 5 | SD = 2 | range from 1 to 9 | round to the nearest whole # | stanine = 2(z) + 5
71
normal curve equivalent
developed for the US Department of Education and used to measure student achievement | range from 1 to 99 | mean = 50 | SD = 21.06 | NCE = 21.06(z) + 50
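
The standard-score cards (z-score, T-score, deviation IQ, stanine, normal curve equivalent) all convert from z with the same pattern: new score = SD x z + mean. The sketch below collects them; the function names are illustrative, and rounding the stanine to the nearest whole number and clamping it to 1-9 is an assumption about how the formula is applied.

```python
import math


def z_score(raw, mean, sd):
    """z = (X - M) / SD"""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (raw - mean) / sd


def t_score(z):
    return 10 * z + 50


def deviation_iq(z):
    return 15 * z + 100


def stanine(z):
    # Round half up to a whole number, then keep the result within 1-9.
    rounded = math.floor(2 * z + 5 + 0.5)
    return max(1, min(9, rounded))


def normal_curve_equivalent(z):
    return 21.06 * z + 50


# A raw score of 115 on a scale with mean 100 and SD 15 is one SD above the mean.
z = z_score(115, 100, 15)
print(z, t_score(z), deviation_iq(z), stanine(z), round(normal_curve_equivalent(z), 2))
```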
72
___ tests are usually used in high stakes testing
criterion-referenced (have you learned X curriculum)
73
Mental Status Exam
AAMMTPTJI
Appearance
Attitude
Movement and behavior
Mood and affect
Thought content
Perceptions
Thought process
Judgment and insight
Intellectual functioning and memory
74
suicide assessment acronyms - 3
PIMP (Plan, Intent, Means, Prior attempts)
SLAP (Suicidal ideations, Lethality, Access, Plan)
SAD PERSONS (sex, age, depression, previous attempt, ethanol abuse, rational thought loss, social supports lacking, organized plan, no spouse, sickness)
75
types of test bias
examiner - examiner's beliefs or behavior influence test administration
interpretive - interpretation of results is unfair
response - when clients answer one thing to all questions
situational - testing conditions
ecological - global systems affect results (e.g., giving all students a test in English)
76
Army Alpha vs. Army Beta
Alpha - English speakers | Beta - non-English speakers and recruits who could not read | used to test intelligence of military recruits during WWI
77
Arthur Otis
developed the first scientifically reliable intelligence test for groups: the Otis Group Intelligence Test
78
Frank Parsons
father of vocational guidance and counseling
79
NBCC and ACA ethical guidelines for assessment
1. competence to use and interpret
2. informed consent
3. release of results to qualified professionals
4. instrument selected
5. conditions of administration
6. scoring and interpretation of assessments
7. obsolete assessments and outdated results
8. assessment construction
80
the Joint Committee on Testing Practices (JCTP) developed...
Rights and Responsibilities of Test Takers
Test User Qualifications
Code for Fair Testing Practices in Education
81
IDEA
Individuals with Disabilities Education Improvement Act of 2004
right of students with disabilities to receive testing at the expense of the public school
right to an IEP (individualized education program)
82
ADA
Americans with Disabilities Act (1990)
employment testing must accurately measure a person's ability to perform relevant job tasks
people with disabilities get appropriate accommodations for testing
83
Carl D. Perkins Act
Vocational and Technical Education Act of 1984 | provides vocational assessment, counseling, and placement for people with low SES, people with disabilities, single parents, those with limited English proficiency, and incarcerated individuals
84
Civil Rights Act of 1964 and 1972, 1978, and 1991 amendments
assessments used to determine employability must relate strictly to the duties outlined in the job description and cannot discriminate based on race, color, religion, pregnancy, gender, or national origin
85
criterion validity
effectiveness of an instrument in predicting an individual's performance on a specific criterion
86
item discrimination
Performance of the top quarter of total scores minus the bottom quarter
An item has good discrimination when high-scorers get it right and low-scorers get it wrong (positive item discrimination)
Items with 0 or negative item discrimination are poor
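
One way to sketch this index: rank examinees by total score, then subtract the proportion answering the item correctly in the bottom quarter from the proportion in the top quarter. The function name, the (total score, item correct) record layout, and the sample data are assumptions for illustration.

```python
def item_discrimination(records, fraction=0.25):
    """Proportion correct in the top group minus the bottom group.

    `records` is a list of (total_score, item_correct) pairs, where
    item_correct is 1 or 0. `fraction` sets the size of each group.
    """
    if not records:
        raise ValueError("records must not be empty")
    ranked = sorted(records, key=lambda rec: rec[0], reverse=True)
    group_size = max(1, int(len(ranked) * fraction))
    top = ranked[:group_size]
    bottom = ranked[-group_size:]
    p_top = sum(correct for _, correct in top) / group_size
    p_bottom = sum(correct for _, correct in bottom) / group_size
    return p_top - p_bottom


# Hypothetical data: high scorers get the item right, low scorers miss it.
good_item = [(95, 1), (90, 1), (80, 1), (70, 0), (60, 1), (50, 0), (40, 0), (30, 0)]
# Reversed pattern: only low scorers get the item right.
poor_item = [(95, 0), (90, 0), (80, 0), (70, 1), (60, 0), (50, 1), (40, 1), (30, 1)]
print(item_discrimination(good_item), item_discrimination(poor_item))
```

A positive result marks a discriminating item; 0 or a negative result flags an item worth reviewing.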
87
classical test theory
observed score = true score + error
88
item response theory
emphasizes applying mathematical models to the data collected from assessments to see how well individual items work | AKA modern test theory
89
construct-based validity model
AKA unified construct model
validity is a holistic construct; it does not have the separate components that classical test theory assumes (e.g., the 3: content, criterion, and construct validity)
90
what are the 3 types of test theory
classical
item response (AKA modern)
construct-based validity model
91
criterion-referenced assessment
provides info about a person's score by comparing it to a predetermined standard or set criterion
e.g., A = 90-100; B = 80-89, and so on
the NCE (National Counselor Examination) and CPCE are criterion-referenced assessments
as opposed to norm-referenced tests, which make meaning by comparing a person's score to the norm group
92
achievement vs. aptitude tests
achievement - what one has learned at the time of testing | aptitude - what a person is capable of learning (GRE, SAT)
93
ASVAB
Armed Services Vocational Aptitude Battery | the most widely used multiple aptitude test in the world. Measures aptitude for military and civilian jobs
94
Louis Thurstone
unlike Charles Spearman's two-factor approach to intelligence (g, s - general and specific factors), Louis Thurstone identified 7 primary mental abilities
95
Howard Gardner
theory of multiple intelligences — 8
96
Cattell-Horn-Carroll (CHC)
theory of cognitive abilities - the most empirically validated theoretical model of intelligence
intelligence is hierarchical and consists of 3 strata:
general intelligence "g"
broad cognitive abilities
narrow cognitive abilities
97
high-stakes testing usually uses ___ assessment
criterion-referenced
98
performance assessments
non-verbal form of assessment
client completes a task
good for foreign language speakers
ex: Draw-a-Man test; (Raymond) Cattell Culture Fair Intelligence Test; Test of Non-Verbal Intelligence (TONI)
99
computer-adaptive testing
the computer adapts the test structure and items to the examinee's ability level | ex: GRE
100
the 3 main types of validity
content | criterion | construct
101
____ is the most widely used intelligence test
Wechsler scales