List 4 examples of measurement devices.
- Test
- Questionnaire
- Interview schedule/protocol
- Personality scale
List 2 factors of validity
- extent to which a measure/instrument measures what it is designed to measure
- accurately performs the function(s) it is purported to perform
List 3 KEY points about validity
- validity is relative to the purpose of testing
- validity is a matter of degree
- no measure/instrument is perfectly valid
What is a ‘construct’?
4 features
- an abstract concept used in a particular theoretical manner to relate different behaviors according to their underlying features or causes
- used to describe, organize, summarize and communicate our interpretations of behavior
- abstract term used to summarize and describe behaviors that share certain attributes
- collection of related behaviors that are associated in a meaningful way
Why is validity important in quantitative research?
researchers reduce constructs to numerical scores
Why is validity important in qualitative research?
researchers must describe results in enough detail so that readers can picture the meanings that have been attached to a construct
List 3 types of validity.
- judgmental
- empirical
- judgmental-empirical
List and define 2 types of judgmental validity.
- Content: expert judgment
- Face: participant judgment
List and define 4 types of Empirical validity.
- criterion-predictive: correlation
- criterion-concurrent: correlation
- Convergent: correlation
- Divergent: correlation
Judgmental-Empirical Validity is what type?
Construct validity
Judgmental-Empirical construct validity is established by what 2 things?
- hypothesize about the relationship
- test the hypothesis
Judgmental validity is an approach to establishing validity that uses ______________, usually of _____________ and therefore is only as good as the ____________. (6, 10)
- judgments
- experts
- judges
Content Validity is a type of _______________ validity.
judgmental
Content validity is ______________.
the degree to which measurements actually reflect the variable of interest
What two questions does content validity answer?
- Are we tapping the appropriate contents by the measure?
- Does the instrument cover all the areas needed to be observed AND does it cover them equally or proportionally to the interest?
Three principles for writing tests with high content validity
- Broad content
- focus to reflect importance
- appropriate level of language (vocabulary, sentence length) for the audience
Face Validity is a type of _____________ validity.
judgmental
Face Validity is __________.
the degree to which an instrument appears to be valid on the face of it
The _______________ Test does not have very good face validity.
Rorschach
What is the question that Face validity answers?
On superficial inspection, does the instrument appear to measure what it purports to measure?
The Rorschach Test is designed to measure ____________.
psychopathology
Who are the judges of face validity for the Rorschach Test?
the person taking the test
Making the measurement tool LOOK like it's measuring what it claims to be measuring is important to ___________ Validity.
Face
When is low Face Validity desirable?
when researchers want to disguise the true purpose of the research from the respondents because the participant might answer inaccurately due to socially acceptable expectations
What is Empirical Validity?
an approach to establishing validity that relies on, or is based on, observation or planned data collection rather than theory or subjective judgment
Empirical validity is usually reported as a ____________ ____________.
Validity Coefficient
What is the Validity Coefficient?
a correlation coefficient used to express validity
A correlation coefficient can range from ______ to _____ to ______.
-1 to 0 to +1
Validity coefficients are typically low because _________ and _________.
- performance on many criteria is complex, involving many traits
- criterion measures themselves may not be highly valid
The closer a correlation coefficient is to zero, the _________ the correlation.
lower
The closer a correlation coefficient is to -1 or +1, the _________ valid the measurement.
MORE
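The validity coefficient described above is a Pearson correlation between test scores and a criterion. A minimal sketch in Python (the function name and the data are illustrative assumptions, not from the source):

```python
from math import sqrt

def validity_coefficient(test_scores, criterion_scores):
    """Pearson correlation between test scores and a criterion measure."""
    n = len(test_scores)
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n
    # Sum of products of deviations from each mean (covariance term)
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, criterion_scores))
    # Square roots of the sums of squared deviations for each variable
    ss_x = sqrt(sum((x - mean_x) ** 2 for x in test_scores))
    ss_y = sqrt(sum((y - mean_y) ** 2 for y in criterion_scores))
    return cov / (ss_x * ss_y)

# Hypothetical aptitude scores and later achievement (criterion) scores;
# a value near +1 or -1 indicates a more valid measurement
aptitude = [52, 60, 71, 75, 88]
achievement = [2.1, 2.4, 3.0, 2.9, 3.8]
r = validity_coefficient(aptitude, achievement)
```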
Validate the measurement against some kind of criterion, such as (3) a __________, a _________, or an ___________.
rule
standard
already existing test
Criterion Validity is a type of ______________ Validity.
Empirical
Criterion Validity is ____________.
the extent to which the scores obtained from a procedure correlate with an observable behavior
What is a criterion?
- a rule or standard for making a judgment
- the standard by which the test is being judged
The two types of Criterion Validity are __________ and ________.
- Predictive
- Concurrent
Predictive (criterion validity) is ________
the extent to which a procedure allows for accurate predictions about a participant’s future behavior
Concurrent (criterion validity) is ____________.
the extent to which a procedure correlates with the present behavior of participants
Convergent Validity is _______
correlating a new instrument with an already established valid instrument, in order to establish the new instrument as equally valid
Divergent Validity is __________
correlating a measure with a valid measure of an opposite construct; a negative correlation supports validity
What is Judgmental-Empirical Validity?
an approach to establishing validity that relies on subjective judgments and data based on observation
*combo: expert and observation
Construct Validity is a type of ______________ Validity.
Judgmental-Empirical
Construct validity is _____.
the extent to which a measurement reflects the hypothetical construct of interest
** not observable
What is a construct?
- an abstract concept used in a particular theoretical manner to relate different behaviors according to their underlying features or causes
- used to describe, organize, summarize and communicate our interpretations of behavior
- term used to summarize and describe behaviors that share certain attributes
- a collection of related behaviors that are associated in a meaningful way
A ___________ does not have a physical being outside of its indicators.
construct
Researchers infer the existence of a construct by observing the ____________ of related indicators.
collection
What is the collection of indicators in a construct?
- historical facts: family, medical, social
- symptoms: behaviors, family reports
- Clinical judgment and observation
Two factors in determining construct validity.
- Judgment about the nature of the relationship: hypothesize about how the construct, in the form of the instrument designed to measure it, should affect or relate to other variables
- Empirical evidence: test the hypothesis using empirical methods
The method for determining construct validity offers only ____________ evidence regarding the validity of a measure.
indirect
Often construct validity is found through ___________ evidence.
indirect
Because the evidence for construct validity is indirect, researchers should be very cautious about declaring a measure to be valid on the basis of a ____________ study.
single
Construct validity is _________ secure
less
In construct validity researchers usually test a number of ___________ about the construct before determining construct validity.
hypotheses
A synonym for Reliability is ______.
consistency
An ____________ measure is more reliable than a subjective one.
objective
Reliability is __________.
the degree to which measurements are consistent
Types of Reliability errors are ___________.
- Random
- Chance
- Unsystematic
** interchangeable terms
Two important facts about reliability errors
- since such errors are in principle random and unbiased, they tend to cancel each other out.
- the sum of chance errors, when a sufficiently large number of cases is considered, approaches zero
The more concerning type of Reliability Error is ___________.
- Systematic Error
- Constant Error
Definition of systematic error
an error produced by some factor that affects ALL observations similarly so that the errors are always in one direction and do not cancel each other out
A systematic error is usually a constant error and can be detected and _________________ for during statistical analysis.
corrected
What is the relationship between Reliability and Validity
reliability is a precursor of validity
A test cannot be valid if it is not first ____________.
reliable
Reliability comes ________, before it can be ________.
first
valid
______ before ______
R before V
High reliability means ________ random error.
little
High validity correlates with _______ true score
HIGH
Low reliability means ___________ random error
High
Can you have low reliability and high validity?
No, because you MUST have high reliability BEFORE validity can be considered
Two factors in the classic model for measuring reliability.
- measure twice
- check to see that the scores are consistent with each other usually done with a correlation coefficient, known as a reliability coefficient
What is the range of reliability?
-1 to 0 to +1
What are the three ways of measuring Reliability?
- Inter observer or Inter-rater
- Test-retest
- Parallel forms
Describe an inter observer or inter-rater method.
the extent to which raters agree on the scores they assign to a participant’s behavior
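One simple way to quantify inter-rater agreement is the proportion of observations on which two raters assign the same score. This sketch and its data are illustrative only; the source does not specify a formula, and more robust indices (e.g., Cohen's kappa) also exist:

```python
def percent_agreement(rater_a, rater_b):
    """Proportion of observations on which two raters give the same score."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

# Hypothetical behavior ratings from two observers of the same participants
rater_a = [3, 4, 4, 2, 5, 3, 4, 4]
rater_b = [3, 4, 3, 2, 5, 3, 4, 5]
agreement = percent_agreement(rater_a, rater_b)  # 6 of 8 match -> 0.75
```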
Describe Test-retest method.
the consistency with which participants obtain the same overall score when tested at different times
Describe Parallel forms method.
the consistency with which participants obtain the same overall score when given two forms of the same test, spaced slightly apart in time
How high should the reliability coefficient be?
.80 for individuals
.50 for groups of 25 or more
Why can the reliability coefficient for groups be lower than for individuals?
- reliability coefficients indicate the reliability for individuals’ scores
- Group scores are averages
* statistical theory indicates that averages are more reliable than the scores that underlie them (individual scores) because when computing an average, the negative errors tend to cancel out the positive errors
What is internal consistency/reliability?
using the scores from a single administration of a test to examine the consistency of test scores
*examines the consistency within the test itself
List two methods for establishing internal consistency/reliability.
- split-half
- Cronbach's Alpha (preferred)
What is the Split-half method of establishing internal consistency/reliability?
correlate scores on one half of the test with scores on the other half of the test
What is the Cronbach’s alpha method of establishing internal consistency/reliability?
mathematical procedure used to obtain the equivalent of the average of all possible split-half reliability coefficients
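Cronbach's alpha can be computed directly from item scores with the standard formula: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. A minimal sketch, assuming hypothetical item data:

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per test item, aligned by respondent."""
    k = len(items)
    sum_item_vars = sum(variance(item) for item in items)
    # Each respondent's total score across all k items
    totals = [sum(scores) for scores in zip(*items)]
    return (k / (k - 1)) * (1 - sum_item_vars / variance(totals))

# Hypothetical 3-item test answered by 5 respondents
items = [
    [4, 3, 5, 2, 4],  # item 1
    [4, 2, 5, 3, 4],  # item 2
    [5, 3, 4, 2, 4],  # item 3
]
alpha = cronbach_alpha(items)  # aim for .80 or more
```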
Larger number of items leads to a _________ result.
better
Cronbach’s alpha is a formula used frequently in the social sciences because it measures one particular _____________.
attribute
High internal consistency/reliability is desirable when a researcher has developed a test designed to measure a __________ unitary variable
single
Alphas should be ______ or more.
.80
In a test that measures several attributes, you can still segment out each attribute’s questions and perform a ____________ on those items for each attribute.
Cronbach’s
List three types of tests, classified by how scores are interpreted.
- Norm-referenced
- Standardized
- Criterion-referenced
What is a norm-referenced test?
tests designed to facilitate a comparison of an individual’s performance with that of a norm group
What is a standardized test?
tests that come with standard directions for administration and interpretation
What is a criterion-referenced test?
tests designed to measure the extent to which individual examinees have met performance standards (i.e., a specific criterion)
List 3 attributes of Achievement Tests
- measures knowledge and skills individuals have already acquired
- Reliability: dependent on objectivity of scoring
- Validity: dependent on comprehensiveness of coverage of stated knowledge or skill domain
What is an achievement test?
a measure of optimal performance
What is an Aptitude Test?
a measure of potential performance
List 4 attributes of Aptitude Tests
- predict some specific type of achievement
- measure the likelihood that an individual will be able to acquire knowledge and skills in a particular area
- Reliability: r = .80 or higher for published tests
- Validity: determined by correlating scores with a measure of achievement obtained at a later time (r = .20 - .60 for published tests)
List 4 attributes of Intelligence Tests
- predict achievement in general, not any one specific type
- measure the likelihood that an individual will be able to acquire knowledge and skills in general
- Reliability: no information provided
- Validity: published tests have low to modest validity for predicting achievement in school
List 4 criticisms of Intelligence Tests.
- tap into culturally bound knowledge and skills rather than innate (inborn) intelligence
- Slanted towards dominant racial or ethnic groups
- measure knowledge and skills that are acquired with instruction/formal schooling
- don’t measure all important aspects of intelligence
What is a Likert-Type Scale?
- 5-point scale ranging from 1 to 5
- use verbal anchors for each number
- reduce response bias by providing positive and negative statements
Likert scale is an __________ level scale.
interval
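The mix of positive and negative statements described above implies reverse-scoring: on a 1-5 scale, a negatively worded item's raw response r is recoded as 6 - r before summing. A sketch (the function name, item indices, and data are hypothetical):

```python
def score_likert(responses, reverse_coded):
    """Total score on a 5-point Likert-type scale.

    responses: raw 1-5 answers, one per item.
    reverse_coded: indices of negatively worded items to reverse-score.
    """
    return sum(6 - r if i in reverse_coded else r
               for i, r in enumerate(responses))

# Item 0 is negatively worded, so a raw 1 ("strongly disagree") scores as 5
total = score_likert([1, 4, 5, 2], reverse_coded={0})  # 5 + 4 + 5 + 2 = 16
```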