Week 3, Measurement, Key Terms Flashcards
(34 cards)
Measurement
The assignment of numbers to objects or events according to a set of rules
Indicators
In psychological measurement we do not measure constructs directly (you cannot put a finger on IQ).
Instead we measure the characteristics or properties associated with individuals.
We measure indicators (signs that point to something else).
Why not measure organizational constructs directly?
We lose specificity as we move from the micro to the macro level – direct measurement is easier at the individual level than at the organizational level
Scales of Measurement
Psychological measurement varies in precision.
Differences in precision are reflected in the types of scales on which particular characteristics are being measured.
Four levels of measurement
Nominal
Ordinal
Interval
Ratio
Nominal measurement
Lowest level of measurement
Represent differences in kind
Individuals are assigned or classified into qualitatively different categories
Merely labels
Frequently used to identify or catalog individuals and events
Ex.
SS#
Assign 1 to males and 2 to females
The classes must be mutually exclusive
Ordinal Measurement
Not only allows classification by category, but also provides an indication of magnitude
Rank ordered according to greater or lesser amounts of some dimension
If (a>b) and (b>c) then (a>c)
In top-down selection this may be all the information we need
Interval Measurement
In addition to ordinal properties, interval scales have equal intervals between scale points
Scores can be transformed in any linear fashion (aX + b, a > 0) without altering the relationships between the scores
Allows two scores from different tests to be compared directly on a common metric
Standardization
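As a sketch of why linear transformation matters, the following hypothetical example (assuming NumPy is available) standardizes raw scores from two different tests to z-scores, putting them on a common metric:

```python
import numpy as np

def standardize(scores: np.ndarray) -> np.ndarray:
    """Linear transformation to z-scores: z = (x - mean) / sd."""
    return (scores - scores.mean()) / scores.std(ddof=1)

# Hypothetical raw scores from two tests with different metrics
test_a = np.array([80.0, 90.0, 100.0, 110.0, 120.0])  # mean 100
test_b = np.array([40.0, 45.0, 50.0, 55.0, 60.0])     # mean 50

print(standardize(test_a))
print(standardize(test_b))
```

Because these two hypothetical score sets have the same relative spacing, they map to identical z-scores, so a score from test A can be compared directly with a score from test B.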
Ratio Measurement
Highest level of measurement
In addition to equality, transitivity, and additivity, the ratio scale has a natural or absolute zero point.
Height, distance, & weight are all ratio scales
These scales are rarely seen in psychological measurement
Psychological Measurement
Principally concerned with individual differences in traits, attitudes, or behaviors.
Trait – a descriptive label applied to a group of interrelated behaviors
Based on standardized samples of individual behavior we infer the position or standing of the individual on the trait in question
Systematic Nature of Measurement
TEST - a systematic procedure for measuring a sample of behavior.
Procedures are systematic in order to minimize the effects of unwanted contaminants (error or bias)
What is the difference between a personality “test” and a test of cognitive ability?
Found in:
-Mental Measurements Yearbook
-Publishers
-Third parties (e.g., Rocket-Hire)
-Authors* (Taking the Measure of Work)
Classifying tests
Content
Tests may be classified in terms of the task inherent in the scale
Ex. Cognitive ability tests
Achievement
Aptitude
vs.
Non-cognitive instruments (or inventories)
Tests may also be classified in terms of the efficiency with which they can be administered.
E.g.
Individual vs. Group
Speed vs. Power – both designed to prevent perfect scores (we always want variability on measurement tools)
Speed test – more items than you can answer in the time allotted
Power test – take as long as you want to answer the items; scored by number correct. With no time limit, someone could take 24 hours because they want to do their best, so unlimited time can introduce too much unwanted variance in the scores.
Likert Scales
When I am stressed, sometimes I get high.
A. strongly disagree
B. disagree
C. agree
D. strongly agree
Self-report measure
Behavioral Observation
The other end of the continuum
Best predictor of future behavior…
Issue of Obtrusiveness:
-Observer effect (often loosely analogized to the Heisenberg uncertainty principle)
–When people see that you’re paying attention to them, their behavior will change
-Hawthorne effect
–Turned the heat up – performance went up; turned the lights up – performance went up; turned the heat down – performance still went up. Why? Because people knew their performance was being observed.
Can be cumbersome with a large N
To capture behavior you must be there when it occurs
Naturalistic observation
Situational Judgment Test
The purpose is to identify a respondent’s intentions
Presents the person with a series of relevant incidents, and asks what he/she would do in that situation
The typical question is “ what would you do if …”
Often used to assess intelligence in a more “real world” fashion
Can assess a variety of constructs
Theory Based
Goal setting theory
Intentions or goals are the immediate precursor of a person’s behavior
Added benefit of content validity
Attitudes → Intentions → Behavior
Assessment Centers
Simulate the situation in which the individual will be performing
Predicts how successful that person will be in the actual situation
Exercises vary in fidelity and immersion
Assessment Center Examples
AT & T developed and operated the Advanced Management Potential Assessment Program (AMPA) for itself and the Bell System Operating Companies. The program was used by all the Bell System companies from 1979 through 1983.
Dr. Rich’s example of a study he conducted in the early 2000s: he and his team immersed executives in situations all over Baltimore to test their adaptability. For example, they were told to talk to a man about a problem; when they reached him, they realized he was deaf. Some people simply gave up because they could not use sign language; others grabbed a napkin and a pen so they could communicate with him.
The CEO could then see who was needed at the company and who was not – for example, the person who gave up when they couldn’t figure out a situation.
Psychometrics
RELIABILITY
If measurement procedures are to be useful, they must produce dependable scores
Consistency
Freedom from unsystematic (random) errors of measurement
Methods to assess reliability
Test Re-test
Parallel (alternate) forms
Internal consistency
-Split half
–Splitting a test in half – the test can be split any way
-Kuder-Richardson 20 (KR-20)
–For tests with right and wrong answers
-Alpha
–The average of all possible split-half reliabilities
-Omega
Test re-test is a good way to assess reliability.
The downside to giving someone the same test twice is the practice effect – they will do better the second time because they have already taken it once.
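The internal-consistency idea can be illustrated with a short sketch (hypothetical Likert responses, assuming NumPy is available) that computes coefficient alpha from an item-score matrix:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for an (n_respondents x k_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 5 respondents answering 4 Likert items
scores = np.array([
    [4, 4, 3, 4],
    [2, 2, 2, 1],
    [3, 3, 3, 3],
    [4, 3, 4, 4],
    [1, 2, 1, 2],
])
print(round(cronbach_alpha(scores), 2))  # → 0.94
```

Items that rise and fall together across respondents inflate the total-score variance relative to the summed item variances, which pushes alpha toward 1.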
Issues Related To Reliability
No fixed value indicates what is acceptable
Reliabilities often range from .70 – .90
Range of scores (need variability)
-Greater variability in scores yields higher reliability estimates
Sample size & number of items
-The more observations you have, the higher the reliability
Reliability & Validity
Theoretically it would be possible to develop a perfectly reliable measure whose scores were completely uncorrelated with any other variable.
This measure would have no practical value.
It would be highly reliable but would have no validity.
Limit on validity
Validity is reduced by the unreliability in a set of measures
Ex. performance appraisal
-Typical reliabilities are low (around .60)
-This sets a ceiling on possible criterion-related validity
-We can statistically correct for this type of unreliability
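That statistical fix is the classical correction for attenuation. A minimal sketch with hypothetical numbers (an observed validity of .40 and a criterion reliability of .60):

```python
import math

def correct_for_attenuation(r_xy: float, r_yy: float, r_xx: float = 1.0) -> float:
    """Estimate the correlation if the measures were perfectly reliable.

    r_xy: observed test-criterion correlation
    r_yy: reliability of the criterion (e.g., performance ratings)
    r_xx: reliability of the predictor (leave at 1.0 to correct
          for criterion unreliability only)
    """
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical: observed validity .40, criterion reliability .60
print(round(correct_for_attenuation(0.40, 0.60), 2))  # → 0.52
```

The corrected value estimates what the validity would be if the criterion (e.g., a performance appraisal) were measured without error; it is used for interpretation, not as a substitute for the observed coefficient.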
What is Validity ?
The extent to which a measurement procedure actually measures what it is designed to measure
Degree to which evidence and theory support the interpretation of test scores for their intended purpose
The process of gathering and evaluating data to assess this is called validation.
Really concerned with two issues: 1. What a test measures 2. How well it measures it.
Validity
Test scores are typically used to draw inferences about applicant behavior in situations beyond the testing environment
Test user must be able to justify the inferences drawn by having a cogent rationale or empirical support linking the test score to the inferred outcome
Nobody cares about the test score itself – what they care about are the inferences (and their consequences) drawn from it