CH 5, 6 Flashcards

1
Q

A criterion-referenced achievement test would be least useful for

(A) planning classroom instruction.
(B) analyzing individual achievement.
(C) comparing individuals’ performance in a group.
(D) determining minimum competency in a content area.

A

(C) comparing individuals’ performance in a group.

*A criterion-referenced achievement test is used to determine if the students mastered the subject matter, but not for comparing individuals’ performance in a group.

2
Q

What measurement scale is ethnicity?

(A) Nominal scale
(B) Ordinal scale
(C) Interval scale
(D) Ratio scale

A

(A) Nominal scale

*Ethnicity categorizes individuals into mutually exclusive groups. Thus, it is a nominal scale.

3
Q

What measurement scale is household income in the unit of U.S. dollars?

(A) Nominal scale
(B) Ordinal scale
(C) Interval scale
(D) Ratio scale

A

(D) Ratio scale

*Household income is a ratio scale variable because it has equal intervals and it has an absolute zero.

4
Q

Scholastic aptitude tests are useful in schools because they

(A) can be used to predict achievement.
(B) provide a measure of academic achievement.
(C) provide a measure of ability uninfluenced by academic experience.
(D) are not influenced by subject’s motivation, home background, and so on.

A

(A) can be used to predict achievement.

*Scholastic aptitude tests measure the potential for learning a body of knowledge and can be used to predict future achievement.

5
Q

Standardized and researcher-made tests share some of the same characteristics. Which of the following is not usually characteristic of a researcher-made test?

(A) the minimal influence of random errors of measurement
(B) the use of objective-type items
(C) the availability of norms for comparison
(D) the availability of raw scores which can be converted to percentile rank

A

(C) the availability of norms for comparison

*Norms for comparison are available for standardized tests, but not for researcher-made tests.

6
Q

The ratings that three teachers made of the leadership ability of a particular high school senior agreed closely. This agreement among raters is referred to as interrater

(A) validity.
(B) reliability.
(C) objectivity.
(D) convergence.

A

(B) reliability.

*The close agreement or high correlation between raters is called interrater reliability.

7
Q

A teacher reports having a kindergarten child who is withdrawn and does not interact with other students. What measurement tool would a researcher use to get a better grasp of the problem before suggesting behavior modification therapy?

(A) attitude scale
(B) direct observation
(C) personality inventory
(D) semantic differential

A

(B) direct observation

*Direct observation is the best measurement if we want to measure the degree to which the child interacts with people around him or her.

8
Q

Which one of the following statements would be most suitable for a Likert-type scale measuring students’ attitudes toward math?

(A) Math is easy for some students and difficult for others.
(B) Math is fun.
(C) Math is one of the basic skills.
(D) Some students like math.

A

(B) Math is fun.

*Among the four statements, only “Math is fun” expresses an opinion that students can agree or disagree with, so it is the most suitable item for a Likert-type scale measuring attitudes toward math.

9
Q

Predictive validity evidence ________.

(A) is the relationship between scores on a measure and criterion scores available at a future time.
(B) is evidence based on internal structure.
(C) is evidence based on relationship to other variables.
(D) is the relationship between two scores that measure the construct at the same time.

A

(A) is the relationship between scores on a measure and criterion scores available at a future time.

*Predictive validity evidence is the relationship between scores on a measure and criterion scores that become available at a future time.

10
Q

The standard error of measurement is based on the test’s

(A) validity.
(B) difficulty.
(C) reliability.
(D) discriminability.

A

(C) reliability.

*The standard error of measurement is computed from the test’s reliability coefficient and the standard deviation of the test scores, so it is based on the test’s reliability.
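The dependence of the standard error of measurement (SEM) on reliability can be sketched with the classical test theory formula SEM = SD × sqrt(1 − reliability). The function name, the standard deviation of 10 points, and the two reliability values below are assumptions chosen for illustration; they do not come from the card.

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)

# Hypothetical test whose scores have a standard deviation of 10 points.
sem_high = standard_error_of_measurement(10.0, 0.91)  # more reliable test
sem_low = standard_error_of_measurement(10.0, 0.64)   # less reliable test

print(f"SEM at reliability 0.91: {sem_high:.1f}")
print(f"SEM at reliability 0.64: {sem_low:.1f}")
```

With the same score spread, the more reliable test has the smaller SEM (about 3 points versus about 6 points here).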

11
Q

A test has a reliability coefficient of 0.84. What percent of test variance is error?

(A) 4%
(B) 16%
(C) 32%
(D) 84%

A

(B) 16%

*The proportion of error variance equals 1 minus the reliability coefficient: 1 − 0.84 = 0.16, or 16%.
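The arithmetic for this card can be checked in a few lines. This is only a sketch; the function name is hypothetical, and the 0.84 is the reliability coefficient given in the question.

```python
def error_variance_share(reliability: float) -> float:
    """Proportion of observed-score variance attributed to error."""
    return 1.0 - reliability

share = error_variance_share(0.84)
print(f"{share:.0%}")  # prints 16%
```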

12
Q

Adding 10 items similar to those already in the test would

(A) raise reliability.
(B) lower reliability.
(C) neither raise nor lower reliability.
(D) cannot be determined.

A

(A) raise reliability.

*Usually a longer test with good items yields higher reliability.
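One common way to estimate the effect of test length is the Spearman-Brown prophecy formula, which the card does not name; it is used here as an illustration under the assumption that the added items are parallel to the existing ones. The original test length (20 items) and original reliability (0.70) are hypothetical, since the card gives neither.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by length_factor."""
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )

original_items = 20          # hypothetical original test length
added_items = 10             # the 10 similar items from the card
original_reliability = 0.70  # hypothetical starting reliability

# Lengthening factor: new length divided by old length (1.5 here).
factor = (original_items + added_items) / original_items
predicted = spearman_brown(original_reliability, factor)

print(round(predicted, 3))
```

Under these assumptions the predicted reliability rises from 0.70 to roughly 0.78, in line with the answer that adding similar items raises reliability.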

13
Q

Reliability is defined as the ratio between the variance of

(A) the true scores and observed scores.
(B) error scores and observed scores.
(C) two sets of scores from identical or equivalent tests.
(D) error scores and true scores.

A

(A) the true scores and observed scores.

*Reliability is defined as the ratio of the variance of true scores to the variance of observed scores.
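The variance-ratio definition can be illustrated with a tiny made-up data set. In practice true scores are never observed directly; the true scores and errors below are hypothetical and were chosen so that the errors are uncorrelated with the true scores, as classical test theory assumes.

```python
from statistics import pvariance

# Hypothetical true scores and measurement errors for four examinees.
true_scores = [40.0, 40.0, 60.0, 60.0]
errors = [-5.0, 5.0, -5.0, 5.0]

# Observed score = true score + error.
observed = [t + e for t, e in zip(true_scores, errors)]

var_true = pvariance(true_scores)
var_error = pvariance(errors)
var_observed = pvariance(observed)

# Reliability is true-score variance over observed-score variance.
reliability = var_true / var_observed
error_share = var_error / var_observed

print(f"reliability = {reliability:.2f}, error share = {error_share:.2f}")
```

Because the errors are uncorrelated with the true scores, the observed variance is the sum of the true and error variances, and the two proportions add up to 1.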

14
Q

The score validity of a test is related highly to the

(A) test format.
(B) number of items.
(C) purpose of the test.
(D) availability of equivalent forms.

A

(C) purpose of the test.

*Validity is highly related to what the test intends to measure, that is, the purpose of the test.

15
Q

Which of the following would contribute the best evidence for the score validity of a new group intelligence test?

(A) The correlation between Form A and Form B of the test.
(B) The correlation between test scores and grades in reading.
(C) The correlation between test scores and scores from the Stanford-Binet intelligence test.
(D) An examination of the homogeneity of scores on the test.

A

(C) The correlation between test scores and scores from the Stanford-Binet intelligence test.

*The validity of an intelligence test can be established by the high correlation between the new test and an established IQ test. This is also referred to as criterion-related validity evidence.
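A correlation of this kind could be computed as a Pearson coefficient between the two sets of scores. The sketch below uses invented scores for five examinees; the variable names and all values are assumptions for illustration, not data from any real test.

```python
import math

def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for the same five examinees on both tests.
new_test = [95.0, 100.0, 105.0, 110.0, 120.0]
established_test = [98.0, 101.0, 103.0, 112.0, 118.0]

r = pearson_r(new_test, established_test)
print(f"r = {r:.2f}")
```

A coefficient close to 1, as in this invented example, would count as strong criterion-related evidence; a low coefficient would suggest the new test is not measuring the same construct as the established one.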

16
Q

A test-retest reliability coefficient can be affected by the following factors except

(A) practice effects from taking the test on more than one occasion.
(B) day-to-day fluctuations in a person’s behavior.
(C) long-term change in a person’s behavior.
(D) the particular sample of items used on the test.

A

(D) the particular sample of items used on the test.

*Test-retest uses the same sample of items, thus the sample items will not affect test-retest reliability. All the other three factors can affect a test-retest reliability coefficient.