Assessment And Test Flashcards
Appraisal can be defined as
a. the process of assessing or estimating attributes.
b. testing which is always performed in a group setting.
c. testing which is always performed on a single individual.
d. a pencil and paper measurement of assessing attributes.
The process of assessing or estimating attributes.
Appraisal is a broad term which includes more than merely “testing clients.” Appraisal could include a survey, observations, or even clinical interviews.
A test can be defined as a systematic method of measuring a sample of behavior. Test format refers to the manner in which test items are presented. The format of an essay test is considered a(n) ________ format.
a. subjective
b. objective
c. very precise
d. concise
Subjective.
A “subjective” paradigm relies mainly on the scorer’s opinion. If the rater knows the test taker’s attributes, the rater’s “personal bias” can significantly affect the rating. For example, an attractive examinee might be given a higher rating. (This is the so-called halo effect.)
The National Counselor Exam (NCE) is a(n) ________ test because the scoring procedure is specific.
a. subjective
b. objective
c. projective
d. subtest
Objective.
A short answer test is a(n) ________ test.
a. objective
b. culture-free
c. forced choice
d. free choice
Free choice.
Some exams will call this a “free response” format. In any case, the salient point is that the person taking the test can respond in any manner he or she chooses. Although free choice response patterns can yield more information, they often take more time to score and increase subjectivity (i.e., there is more than one correct answer).
The NCE and the CPCE would be examples of a(n) ________ test.
a. free choice
b. forced choice
c. projective
d. intelligence
Forced choice.
“Forced choice” items are sometimes known as “recognition items.” This book is composed of forced choice/recognition items. On some tests this format is used to control for the “social desirability phenomenon,” which asserts that the person puts the answer he or she feels is socially acceptable (i.e., the test provides alternatives that are all equal in terms of social desirability). The MMPI-2 (Minnesota Multiphasic Personality Inventory), for example, uses forced choices to create a “lie scale” composed of human frailties we all possess. This scale, therefore, ferrets out those individuals who tried to make themselves look good (i.e., the way they believe they “should” be).
The ________ index indicates the percentage of individuals who answered each item correctly.
a. difficulty
b. critical
c. intelligence
d. personal
Difficulty.
The higher the number of people who answer a question correctly, the easier the item is, and vice versa. A 0.5 difficulty index (also called a difficulty value) would suggest that 50% of those tested answered the question correctly, while 50% did not. Most theorists agree that a “good measure” provides a wide range of item difficulties, including some items that even a poor performer will answer correctly.
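The arithmetic behind the difficulty index can be sketched in a few lines of Python. This is only an illustration: the response matrix and the function name difficulty_index are hypothetical, and the data are made up for the example.

```python
# Hypothetical scored responses: one row per examinee, one column per item.
# 1 = answered correctly, 0 = answered incorrectly.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
]


def difficulty_index(rows):
    """Return the proportion of examinees who answered each item correctly."""
    n_examinees = len(rows)
    n_items = len(rows[0])
    return [sum(row[i] for row in rows) / n_examinees for i in range(n_items)]


indices = difficulty_index(responses)
for item_number, value in enumerate(indices, start=1):
    print(f"Item {item_number}: difficulty index = {value:.2f}")
```

In this made-up data set the first item is the easiest (everyone answered it correctly) and the third is the hardest (only one of four examinees answered it correctly).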
Short answer tests and projective measures utilize free response items. The NCE and the CPCE use forced choice or so-called ________ items.
a. vague
b. subjective
c. recognition
d. numerical
Recognition.
Recognition items give the examinee two or more alternatives.
A true/false test has ________ recognition items.
a. similar
b. free choice
c. dichotomous
d. no
Dichotomous.
“Dichotomy” simply means that you are presented with two opposing choices. This explains why choice “a” is definitely incorrect. When a test gives the person taking the exam three or more forced choices (e.g., the NCE, the CPCE, or this book) then psychometricians call it a “multipoint item.”
A test format could be normative or ipsative. In the normative format
a. each item depends on the item before it.
b. each item depends on the item after it.
c. the client must possess an IQ within the normal range.
d. each item is independent of all other items.
Each item is independent of all other items.
Ipsative measures compare traits within the same individual; they do not compare a person to other persons who took the instrument. The Kuder Career Planning instruments are often cited as falling into this category. The ipsative measure allows the person being tested to compare items.
A client who takes a normative test
a. cannot legitimately be compared to others who have taken the test.
b. can legitimately be compared to others who have taken the test.
c. could not have taken an IQ test.
d. could not have taken a personality test.
Can legitimately be compared to others who have taken the test.
In an ipsative measure the person taking the test must compare items to one another. The result is that
a. an ipsative measure cannot be utilized for career guidance.
b. you cannot legitimately compare two or more people who have taken an ipsative test.
c. an ipsative measure is never a forced choice format.
d. an ipsative measure is never reliable.
You cannot legitimately compare two or more people who have taken an ipsative test.
Since the ipsative measure does not reveal absolute strengths, comparing one person’s score to another is relatively meaningless.
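A small Python sketch may make the point concrete. It assumes a hypothetical ipsative inventory in which each person ranks three interest areas (3 = most preferred); the names, areas, and rankings are invented for illustration.

```python
# Hypothetical ipsative results: each person ranks the same three areas,
# so every profile uses the ranks 1, 2, and 3 exactly once.
alex = {"artistic": 3, "mechanical": 2, "clerical": 1}
sam = {"artistic": 3, "mechanical": 1, "clerical": 2}


def strongest_preference(profile):
    """Return the area this person ranked highest relative to the others."""
    return max(profile, key=profile.get)


# Every profile adds up to the same total, whatever the person's
# absolute level of interest might be.
print(sum(alex.values()), sum(sam.values()))
print(strongest_preference(alex), strongest_preference(sam))
```

Both people give "artistic" a 3, but that only says it is each person's top choice among the three areas. It does not show that the two have the same absolute level of artistic interest, which is why comparing ipsative scores across people is relatively meaningless.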
Tests are often classified as speed tests versus power tests. A timed typing test used to hire secretaries would be
a. a power test.
b. neither a speed test nor a power test.
c. a speed test.
d. a fine example of an ipsative measure.
A speed test.
In terms of difficulty, a speed test is really intended to be fairly easy. The difficulty is induced by time limitations, not the difficulty of the tasks or the questions themselves.
A counseling test consists of 300 forced response items. The person taking the test can take as long as he or she wants to answer the questions.
a. This is most likely a projective measure.
b. This is most likely a speed test.
c. This is most likely a power test.
d. This is most likely an invalid measure.
This is most likely a power test.
An achievement test, also called an attainment test, measures maximum performance or present level of skill. A personality test or interest inventory measures
a. typical performance.
b. minimum performance.
c. unconscious traits.
d. self-esteem by always relying on a Q-Sort design.
Typical performance.
In a spiral test
a. the items get progressively easier.
b. the difficulty of the items remains constant.
c. the client must answer each question in a specified period of time.
d. the items get progressively more difficult.
The items get progressively more difficult.
Just remember that a spiral staircase seems to get more difficult to climb as you walk up higher.
In a cyclical test
a. the items get progressively easier.
b. the difficulty of the items remains constant.
c. you have several sections which are spiral in nature.
d. the client must answer each question in a specified period of time.
You have several sections which are spiral in nature.
In each section the questions would go from easy ones to those which are more difficult.
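The difference between the spiral and cyclical formats can be sketched in Python. The sketch assumes each item carries a hypothetical hardness rating from 1 (easiest) to 5 (hardest); note that this rating is invented for the example and is not the difficulty index, where a higher value means an easier item.

```python
def is_spiral(hardness):
    """True if every item is harder than the one before it."""
    return all(a < b for a, b in zip(hardness, hardness[1:]))


def is_cyclical(sections):
    """True if there are several sections and each one is spiral."""
    return len(sections) > 1 and all(is_spiral(s) for s in sections)


# Hypothetical hardness ratings (1 = easiest, 5 = hardest).
spiral_test = [1, 2, 3, 4, 5]
cyclical_test = [[1, 2, 3], [1, 3, 5]]

print(is_spiral(spiral_test))
print(is_cyclical(cyclical_test))
# Read straight through, the cyclical test is not spiral,
# because the hardness drops back down at the start of section two.
print(is_spiral([1, 2, 3, 1, 3, 5]))
```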
A test battery is considered
a. a horizontal test.
b. a vertical test.
c. a valid test.
d. a reliable test.
A horizontal test.
In a test battery, several measures are used to produce results that could be more accurate than those derived from merely using a single source.
In a counseling research study, two groups of subjects took a test with the same name. However, when they talked with each other they discovered that the questions were different. The researcher assured both groups that they were given the same test. How is this possible?
a. The researcher is not telling the truth. The groups could not possibly have taken the same test.
b. The test was horizontal.
c. The test was not a power test.
d. The researcher gave parallel forms of the same test.
The researcher gave parallel forms of the same test.
When a test has two versions or forms that are interchangeable, they are termed parallel forms or equivalent forms of the same test. From a statistical/psychometric standpoint, each form must have the same mean, standard error, and other statistical components.
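As a rough illustration of the statistical requirement, the Python sketch below compares the mean and standard deviation of two sets of scores. The scores, the function name looks_parallel, and the tolerance of one point are all assumptions made for the example; establishing that two real forms are equivalent takes considerably more than this check.

```python
from statistics import mean, stdev


def looks_parallel(scores_a, scores_b, tolerance=1.0):
    """Check whether two forms have similar means and standard deviations.

    The tolerance is an arbitrary value chosen for illustration.
    """
    similar_mean = abs(mean(scores_a) - mean(scores_b)) <= tolerance
    similar_spread = abs(stdev(scores_a) - stdev(scores_b)) <= tolerance
    return similar_mean and similar_spread


# Hypothetical scores from two groups that took different forms.
form_a = [70, 75, 80, 85, 90]
form_b = [72, 74, 80, 86, 88]
much_harder_form = [50, 55, 60, 65, 70]

print(looks_parallel(form_a, form_b))
print(looks_parallel(form_a, much_harder_form))
```

Here form_a and form_b share a mean of 80 and have similar spreads, so they pass the check, while the third set of scores has a much lower mean and does not.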
The most critical factors in test selection are
a. the length of the test and the number of people who took the test in the norming process.
b. horizontal versus vertical.
c. validity and reliability.
d. spiral versus cyclical format.
Validity and reliability.
Validity refers to whether the test measures what it says it measures, while reliability tells how consistently a test measures an attribute.
Which is more important, validity or reliability?
a. Reliability.
b. They are equally important.
c. Validity.
d. It depends on the test in question.
Validity.
Experts nearly always consider validity the number one factor in the construction of a test. A test must measure what it purports to measure.
In the field of testing, validity refers to
a. whether the test really measures what it purports to measure.
b. whether the same test gives consistent measurement.
c. the degree of cultural bias in a test.
d. the fact that numerous tests measure the same traits.
Whether the test really measures what it purports to measure.
A counselor peruses a testing catalog in search of a test which will repeatedly give consistent results. The counselor
a. is interested in reliability.
b. is interested in validity.
c. is looking for information which is not available.
d. is magnifying an unimportant issue.
Is interested in reliability.
Which measure would yield the highest level of reliability?
a. The TAT, a projective test popular with psychodynamic helpers.
b. The WAIS-IV, a popular IQ test.
c. The MMPI-2, a popular personality test.
d. A very accurate postage scale.
A very accurate postage scale.
In the real world physical measurements are more reliable than psychological ones.
Construct validity refers to the extent that a test measures an abstract trait or psychological notion. An example would be
a. height.
b. weight.
c. ego strength.
d. the ability to name all men who have served as U.S. presidents.
Ego strength.
Any trait you cannot “directly” measure or observe can be considered a construct.