Psychological Testing & Assessment Flashcards

(91 cards)

1
Q

What are the 7 assumptions about psychological testing and assessment?

A

Psychological traits & states exist.
These can be quantified and measured.
Test behavior predicts non-test behavior.
Tests have strengths and weaknesses.
Sources of error are part of the process.
Testing & assessment can be done in a fair and unbiased way.
Testing & assessment benefit society.

2
Q

Any distinguishable, relatively enduring way in which individuals vary from one another.

A

Trait

3
Q

Distinguishes one person from another but is relatively less enduring.

A

State

4
Q

The key to insight.

A

Item content

5
Q

Long-standing assumption that factors other than what a test attempts to measure will influence performance on the test.

A

Error

6
Q

Refers to the component of a test score attributable to sources other than the trait or ability measured.

A

Error variance

7
Q

The stability or consistency of measurement.

A

Reliability

8
Q

Elements of error

A

Observer’s error
Environmental changes
Participant’s changes

9
Q

Reliability coefficient should not go beyond _.

A

+/- 1

10
Q

The __ the coefficient alpha, the higher the reliability.

A

Higher

11
Q

Test scores gain reliability as the number of _ increases.

A

Items

12
Q

It means dependability, consistency, or stability.

A

Reliability

13
Q

Defined as a test on which test takers fall in the same positions relative to one another across administrations.

A

Reliable test

14
Q

Reliability assumes that test scores reflect 2 factors which are:

A

True characteristics
Random measurement error

15
Q

Stable characteristics of the individual.

A

True characteristics

16
Q

Chance features of the individual or the situation.

A

Random measurement error

17
Q

Tools used to estimate or infer the extent to which an observed score deviates from a true score.

A

Standard error of estimate/measurement
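
Beyond the card itself: the standard error of measurement is conventionally estimated from a test's standard deviation and its reliability coefficient as SEM = SD * sqrt(1 - r). A minimal Python sketch with invented numbers:

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical-test-theory estimate of how far observed scores
    typically deviate from true scores: SEM = SD * sqrt(1 - r)."""
    return sd * math.sqrt(1 - reliability)

# Invented example: SD = 15, reliability = .91 -> SEM = 15 * sqrt(.09) = 4.5
print(round(standard_error_of_measurement(15, 0.91), 2))  # 4.5
```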

18
Q

Mathematical representation of random measurement error.

A

X = T + E

19
Q

In a reliable test, the value of E should be close to _ and the value of T should be close to the _.

A

0
Actual test score X.

20
Q

Formula for the proportion of test score reflecting a person’s true characteristics.

A

T/X

21
Q

Formula for the proportion of test score reflecting random error.

A

E/X
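
A worked example with invented numbers, tying cards 18, 20, and 21 together: if an observed score is X = 80 with true score T = 76 and error E = 4, then X = T + E holds, the proportion reflecting true characteristics is T/X = 76/80 = .95, and the proportion reflecting random error is E/X = 4/80 = .05.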

22
Q

The reliability of the test is actually based on _.

A

Performance of people

23
Q

Differences between people in test scores reflect differences between them in _ plus differences in the effect of _ factors.

A

Actual knowledge/characteristics
Chance factors

24
Q

What are the sources of error variance?

A

Test construction
Test administration
Test scoring and interpretation
Other sources of error

25
Q

It correlates performance on two interval-scale measures and uses that correlation to indicate the extent of true score differences.

A

Reliability analysis

26
Q

Used to evaluate the error associated with administering a test at two different times.

A

Test-retest method
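
Not part of the deck: in practice, the test-retest estimate is simply the correlation between scores from the two administrations. A minimal Python sketch with invented data:

```python
import numpy as np

# Invented scores for the same five examinees tested on two occasions.
time1 = np.array([12, 15, 9, 20, 17])
time2 = np.array([13, 14, 10, 19, 18])

# The test-retest reliability estimate is the Pearson correlation
# between the first and second administrations.
r_tt = np.corrcoef(time1, time2)[0, 1]
print(round(r_tt, 2))
```
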
27
Q

The test-retest method is of value only when we measure traits or characteristics that _.

A

Do not change over time.

28
Q

Ideally, the test-retest method should have an interval of _ months or more.

A

6

29
Q

When the interval between testing is greater than 6 months, the estimate is called the _.

A

Coefficient of stability

30
Q

It compares two equivalent forms of a test that measure the same attribute.

A

Parallel/alternative forms method

31
Q

The 2 forms in the parallel/alternative forms method use _ items, while the rules used to select items of a particular difficulty level are _. (same/different)

A

Different; same

32
Q

A test is given once and divided into halves that are scored separately.

A

Split-half method

33
Q

One subscore is obtained for the odd-numbered items in the test and another for the even-numbered items. Used in the split-half method.

A

Odd-even system

34
Q

Allows you to estimate what the correlation between the 2 halves would have been if each half had been the length of the whole test.

A

Spearman-Brown formula
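
Not part of the deck: a minimal Python sketch of the Spearman-Brown formula in its usual form, r' = n*r / (1 + (n - 1)*r), where n = 2 projects a half-test correlation to full-test length:

```python
def spearman_brown(r: float, n: float = 2.0) -> float:
    """Predicted reliability when a test is lengthened by a factor n;
    n = 2 projects a split-half correlation to the full test."""
    return (n * r) / (1 + (n - 1) * r)

# Invented example: a split-half correlation of .70 projects to about .82.
print(round(spearman_brown(0.70), 2))  # 0.82
```
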
35
Q

Refers to the degree of correlation among all the items on a scale.

A

Inter-item consistency

36
Q

Inter-item consistency is calculated using a _ administration of a _ test form. (frequency)

A

Single; single

37
Q

Inter-item consistency is useful in assessing the _ of a test.

A

Homogeneity

38
Q

What are the methods used to obtain estimates of internal consistency?

A

KR-20
Cronbach alpha

39
Q

Cronbach developed a more general reliability estimate, which he called _.

A

Cronbach alpha

40
Q

Cronbach alpha values typically range from _ to _; _ values are theoretically impossible.

A

0; 1; negative
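
Not part of the deck: a minimal Python sketch of coefficient alpha under its usual definition, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), with invented data:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for a respondents-by-items score matrix."""
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented data: 4 respondents x 3 items.
data = np.array([[2, 3, 3],
                 [4, 4, 5],
                 [1, 2, 2],
                 [3, 3, 4]])
print(round(cronbach_alpha(data), 2))
```
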
41
Q

The degree of agreement or consistency between two or more scorers with regard to a particular measure.

A

Inter-scorer reliability

42
Q

Inter-scorer reliability is used for what kinds of tests?

A

Creativity or projective tests

43
Q

Judgment or estimate of how well a test measures what it purports to measure in a particular context.

A

Validity

44
Q

A judgment based on evidence about the appropriateness of inferences drawn from test scores.

A

Validity

45
Q

Process of gathering and evaluating evidence about validity.

A

Validation

46
Q

Validation process undertaken when test users plan to alter the format, instructions, language, or content of a test. An example is adapting a national standardized test to braille.

A

Local validation

47
Q

As the reliability of a test increases, the highest possible value of the _ increases.

A

Validity coefficient

48
Q

What are the 5 measures of reliability?

A

Test-retest method
Parallel/alternative forms method
Split-half method
Inter-item consistency
Inter-scorer reliability

49
Q

The first theory behind validity analysis is that the true score component reflects factors producing _ in test scores, while E, error, reflects factors producing _ in test scores.

A

Stability; instability

50
Q

The second theory behind validity analysis focuses specifically on the variables producing _.

A

True score differences

51
Q

What are the 2 components of true score?

A

Stable characteristics of the individual relevant to the purpose of the test.
Stable characteristics of the individual irrelevant to the purpose of the test.

52
Q

Formula for systematic measurement error.

A

T = R + I

53
Q

Formula with both error components (systematic and random error).

A

X = R + I + E
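
Reading the two formulas together (a restatement of cards 18, 51, 52, and 53, with R and I read as the relevant and irrelevant stable characteristics of card 51): the true score decomposes as T = R + I, so the observed score is X = T + E = R + I + E, where E is the random measurement error.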
54
Q

When the items look like they measure what they are supposed to measure.

A

Face validity

55
Q

The judgment about item appropriateness in face validity is made by _.

A

Test takers

56
Q

It is important whenever a test is used to make inferences about the broader domain of knowledge or skills represented by a sample of items.

A

Content validity

57
Q

Content validity is important to what kinds of performance tests?

A

Maximal and typical performance tests

58
Q

The ability of a test to predict performance on another measure.

A

Criterion validity

59
Q

In criterion validity, the test is referred to as the _, labeled X, and the validation measure as the _, labeled Y.

A

Predictor; criterion

60
Q

It is important whenever a test is used to make decisions by predicting future performance.

A

Criterion validity

61
Q

A judgment of how adequately a test score can be used to infer an individual's most probable standing on some measure of interest, that measure being the criterion.

A

Criterion-related validity

62
Q

What are the 2 types of validity evidence?

A

Predictive validity
Concurrent validity

63
Q

An index of the degree to which a test score predicts some criterion measure. A type of validity evidence.

A

Predictive validity

64
Q

An index of the degree to which a test score is related to some criterion measure obtained at the same time (concurrently).

A

Concurrent validity

65
Q

Used to determine whether a test measures what it is intended to measure, such as a personality dimension or trait.

A

Construct validity

66
Q

What are the 2 construct validation techniques?

A

Congruent validity
Discriminant or divergent validity

67
Q

Construct validation technique in which the test is correlated with measures of the same construct.

A

Congruent validity

68
Q

Construct validation technique in which the test is correlated with measures of inconsistent constructs.

A

Discriminant or divergent validity

69
Q

An informed, scientific idea developed or hypothesized to describe or explain behavior.

A

Construct

70
Q

What are the 5 types of validity?

A

Face validity
Content validity
Criterion validity
Criterion-related validity
Construct validity

71
Q

What are the types of statistical test bias?

A

Intercept bias
Slope bias

72
Q

A judgment resulting from the intentional or unintentional misuse of a rating scale.

A

Rating error

73
Q

Error in rating that arises from the tendency on the part of the rater to be lenient in scoring.

A

Leniency error

74
Q

Systematic reluctance to give ratings at either the positive or negative extreme.

A

Central tendency error

75
Q

Tendency to give a particular ratee a higher rating than he or she objectively deserves because of the rater's failure to discriminate among conceptually distinct aspects of the ratee's behavior.

A

Halo effect

76
Q

The extent to which a test is used in an impartial, just, and equitable way.

A

Fairness

77
Q

Refers to a group of statistics that can be calculated for individual test items.

A

Item analysis

78
Q

What are the 2 commonly used techniques of item analysis?

A

Item difficulty
Item discrimination

79
Q

A commonly used technique of item analysis that is appropriate for maximal performance tests (achievement and aptitude tests). It requires that test items be scored as correct or incorrect.

A

Item difficulty

80
Q

Percentage of the pupils who got the item right. It can also be interpreted as how easy or how difficult an item is.

A

Difficulty index / item difficulty index

81
Q

Formula for the item difficulty index.

A

p = number of persons answering the item correctly / N (total number of people taking the test)
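
A minimal Python sketch of the index defined above (example numbers invented):

```python
def item_difficulty(num_correct: int, total: int) -> float:
    """p = proportion of test takers answering the item correctly."""
    return num_correct / total

# Invented example: 30 of 50 pupils answered correctly -> p = .60,
# which falls in the moderately difficult range.
print(item_difficulty(30, 50))  # 0.6
```
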
82
Q

What is the value for optimal item difficulty?

A

Between .5 and .7

83
Q

What does a high p value mean in the item difficulty index?

A

Most people answered the item correctly.

84
Q

What does a low p value mean in the item difficulty index?

A

Most people answered the item incorrectly.

85
Q

Table of equivalents in interpreting the difficulty index: _ = very difficult; .21-.80 = moderately difficult; .81-1.00 = _.

A

.00-.20; Very easy

86
Q

The presence of many items at the p = _ level limits the potential reliability and validity of the test.

A

0

87
Q

The presence of many items at the p = _ level reduces variability and limits the test's potential reliability and validity.

A

1

88
Q

A commonly used technique of item analysis that is appropriate for almost any type of test. It indicates the extent to which different types of people answer an item in different ways.

A

Item discrimination

89
Q

It separates the bright test takers from the poor ones.

A

Discrimination index

90
Q

What are the 2 approaches for measures of item discrimination?

A

Item discrimination index
Item-total correlation

91
Q

The value obtained by comparing the performance of 2 subgroups of test takers; used in maximal performance testing. Also known as D.

A

Extreme group method
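
A minimal Python sketch of D as conventionally computed under the extreme group method, the difference between the proportions of upper and lower scorers answering the item correctly (example values invented):

```python
def discrimination_index(p_upper: float, p_lower: float) -> float:
    """D = p(upper group) - p(lower group); positive values mean the
    item separates high scorers from low scorers."""
    return p_upper - p_lower

# Invented example: 75% of top scorers but 25% of bottom scorers
# answer correctly -> D = .50, a positively discriminating item.
print(discrimination_index(0.75, 0.25))  # 0.5
```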