Reliability Flashcards

1
Q

Question ID #10236: A test has a standard deviation of 12, a mean of 60, a reliability coefficient of .91, and a validity coefficient of .60. The test’s standard error of measurement is equal to:
Select one:

A.
12

B.
9.6

C.
3.6

D.
2.8

A

The correct answer is C.

To calculate the standard error of measurement, you need to know the standard deviation of the test scores and the test’s reliability coefficient. The standard deviation of the test scores is 12 and the reliability coefficient is .91. To calculate the standard error of measurement, you multiply the standard deviation by the square root of one minus the reliability coefficient: 1 minus .91 is .09; the square root of .09 is .3; and .3 times 12 is 3.6.

Answers A, B, and D: These values are incorrect - a test with a standard deviation of 12 and a reliability coefficient of .91 has a standard error of measurement of 3.6; the test’s mean and validity coefficient are not needed for this calculation.
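
The arithmetic above can be checked with a minimal Python sketch; the function name is illustrative, and the values are the ones given in this item:

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1 - reliability)

# The mean (60) and validity coefficient (.60) are not needed for the SEM.
print(standard_error_of_measurement(sd=12, reliability=0.91))  # approximately 3.6
```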

2
Q

Question ID #12391: A researcher correlates scores on two alternate forms of an achievement test and obtains a correlation coefficient of .80. This means that ___% of observed test score variability reflects true score variability.
Select one:

A.
80

B.
64

C.
36

D.
20

A

The correct answer is A.

A reliability coefficient is interpreted directly (not squared) as the proportion of observed test score variability that reflects true score variability - a reliability coefficient of .80 means that 80% of observed variability is true score variability.

Answers B, C, and D: These responses provide incorrect percentages.
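
A minimal sketch of the interpretation, using the value from this item; the squared value is shown only to flag the common trap of treating a reliability coefficient like a validity coefficient:

```python
reliability = 0.80

# A reliability coefficient is interpreted directly as the proportion of
# observed score variability that is true score variability.
print(f"True score variability: {reliability:.0%}")              # 80%

# Squaring the coefficient (as is done to obtain shared variance from a
# validity coefficient) would give 64%, which is the distractor in answer B.
print(f"Squared (not applicable here): {reliability ** 2:.0%}")  # 64%
```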

3
Q

Question ID #12392: To estimate the effects of lengthening a 50-item test to 100 items on the test’s reliability, you would use which of the following?
Select one:

A.
Eta

B.
KR-20

C.
Kappa coefficient

D.
Spearman-Brown formula

A

The correct answer is D.

The Spearman-Brown prophecy formula is used to estimate the effects of lengthening or shortening a test on its reliability coefficient.

Answer A: Eta is used to determine the degree of association between two continuous variables when their relationship is nonlinear.

Answer B: The Kuder-Richardson Formula 20 (KR-20) is a measure of internal consistency reliability that can be used when test items are scored dichotomously.

Answer C: The kappa coefficient is a measure of inter-rater reliability.
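
The item does not state the formula itself; as a sketch, the Spearman-Brown prophecy formula is commonly written as new reliability = (n * r) / (1 + (n - 1) * r), where n is the factor by which the test is lengthened. The original reliability of .70 below is a hypothetical value used only for illustration:

```python
def spearman_brown(reliability, length_factor):
    """Estimated reliability after lengthening (or shortening) a test by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Doubling a 50-item test to 100 items means length_factor = 100 / 50 = 2.
# If the 50-item test had a reliability of .70 (hypothetical), the estimate would be:
print(round(spearman_brown(reliability=0.70, length_factor=2), 2))  # 0.82
```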

4
Q

Question ID #12393: To assess the internal consistency reliability of a test that contains 50 items that are each scored as either “correct” or “incorrect,” you would use which of the following?
Select one:

A.
KR-20

B.
Spearman-Brown

C.
Kappa statistic

D.
Coefficient of concordance

A

The correct answer is A.

The Kuder-Richardson Formula 20 (KR-20) is a measure of internal consistency reliability that can be used when test items are scored dichotomously (correct or incorrect).

Answer B: The Spearman-Brown formula is used to estimate the effects of lengthening or shortening a test on its reliability.

Answer C: The kappa statistic (also known as the kappa coefficient) is a measure of inter-rater reliability.

Answer D: The coefficient of concordance is another measure of inter-rater reliability.
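
A minimal KR-20 sketch, assuming the standard formula KR-20 = (k / (k - 1)) * (1 - sum of p*q / total score variance), where p is the proportion of examinees passing each item and q = 1 - p; the response data are hypothetical:

```python
import numpy as np

def kr20(item_scores):
    """item_scores: examinees x items array of dichotomous (0/1) scores."""
    item_scores = np.asarray(item_scores, dtype=float)
    k = item_scores.shape[1]
    p = item_scores.mean(axis=0)                    # proportion correct per item
    q = 1 - p
    total_variance = item_scores.sum(axis=1).var()  # variance of examinees' total scores
    return (k / (k - 1)) * (1 - (p * q).sum() / total_variance)

# Hypothetical data: 5 examinees x 4 items scored correct (1) or incorrect (0).
responses = [[1, 1, 0, 1],
             [1, 0, 0, 0],
             [1, 1, 1, 1],
             [0, 0, 0, 1],
             [1, 1, 1, 0]]
print(round(kr20(responses), 2))
```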

5
Q

Question ID #12394: You administer a test to a group of examinees on April 1st and then re-administer the same test to the same group of examinees on May 1st. When you correlate the two sets of scores, you will have obtained a coefficient of:
Select one:

A.
internal consistency.

B.
determination.

C.
equivalence.

D.
stability.

A

The correct answer is D.

Test-retest reliability indicates the stability of scores over time, and the test-retest reliability coefficient is also known as the coefficient of stability.

Answer A: Split-half reliability and coefficient alpha are methods for evaluating internal consistency.

Answer B: The coefficient of determination is the proportion of variance in the dependent variable that is predicted by the independent variable.

Answer C: The alternate forms reliability coefficient is also called the coefficient of equivalence when the two forms are administered at the same time.
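
A brief sketch of the procedure this item describes, with hypothetical scores for the same five examinees on the two administration dates:

```python
import numpy as np

# Hypothetical scores for the same examinees tested on April 1st and May 1st.
april_scores = [52, 61, 47, 70, 58]
may_scores = [55, 60, 50, 68, 57]

# The correlation between the two administrations is the test-retest
# reliability coefficient, i.e., the coefficient of stability.
coefficient_of_stability = np.corrcoef(april_scores, may_scores)[0, 1]
print(round(coefficient_of_stability, 2))
```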

6
Q

Question ID #12395: The kappa statistic for a test is .95. This means that the test has:
Select one:

A.
adequate inter-rater reliability.

B.
adequate internal consistency reliability.

C.
inadequate inter-rater reliability.

D.
inadequate alternate forms reliability.

A

The correct answer is A.

The kappa statistic (coefficient) is a measure of inter-rater reliability. Reliability coefficients range in value from 0 to +1.0, with values closer to +1.0 indicating greater reliability. Therefore, a kappa statistic of .95 indicates a high degree of inter-rater reliability.

Answer B: Coefficient alpha is used to evaluate internal consistency.

Answer C: A low kappa statistic would indicate inadequate inter-rater reliability.

Answer D: The alternate forms reliability coefficient is used to evaluate the equivalence of alternate forms of a test, not inter-rater reliability.

7
Q

Question ID #12396: For a newly developed test of cognitive flexibility, coefficient alpha is .55. Which of the following would be useful for increasing the size of this coefficient?
Select one:

A.
Adding more items that are similar in terms of content and quality

B.
Adding more items that are similar in terms of quality but are different in terms of content

C.
Reducing the heterogeneity of the tryout sample

D.
Using a true or false format for the items rather than a multiple-choice format

A

The correct answer is A.

A test’s reliability is increased when the test is lengthened by adding items of similar content and quality, when the range of scores is unrestricted (i.e., when the heterogeneity of the tryout sample is maximized), and when the ability to choose the correct answer by guessing is reduced (e.g., by using multiple-choice items rather than true-false items).

Answers B, C, and D: These responses present incorrect information - a test’s reliability is increased when the test is lengthened by adding items of similar content and quality.
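
As a rough illustration of option A, the Spearman-Brown formula (see the earlier card on lengthening a test) can be used to estimate the gain from adding items of similar content and quality; the length factors below are hypothetical:

```python
def spearman_brown(reliability, length_factor):
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Estimated coefficient alpha if the test is lengthened with comparable items.
for factor in (1.5, 2, 3):
    print(f"x{factor}: {spearman_brown(0.55, factor):.2f}")
# x1.5: 0.65, x2: 0.71, x3: 0.79 (approximately)
```

Spearman-Brown assumes the added items are comparable to the existing items, which is what "similar in terms of content and quality" implies.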

8
Q

Question ID #12397: A student receives a score of 450 on a college aptitude test that has a mean of 500 and standard error of measurement of 50. The 68% confidence interval for the student’s score is:
Select one:

A.
400 to 450

B.
400 to 500

C.
450 to 550

D.
350 to 550

A

The correct answer is B.

The standard error of measurement is used to construct a confidence interval around an obtained test score. To construct the 68% confidence interval, one standard error of measurement is added to and subtracted from the obtained score. Since the student obtained a score of 450 on the test, the 68% confidence interval for the score is 400 to 500.

Answers A, C, and D: These responses present incorrect information - since the student obtained a score of 450 on the test, the 68% confidence interval for the score is 400 to 500.
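
A minimal sketch of the interval arithmetic; the 1.96 multiplier for a 95% interval is included only as an extra illustration:

```python
def confidence_interval(obtained_score, sem, z=1.0):
    """z = 1.0 gives the 68% interval; z = 1.96 gives the 95% interval."""
    return obtained_score - z * sem, obtained_score + z * sem

# The interval is built around the obtained score (450), not the test mean (500).
print(confidence_interval(obtained_score=450, sem=50))          # (400.0, 500.0)
print(confidence_interval(obtained_score=450, sem=50, z=1.96))  # (352.0, 548.0)
```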

9
Q

Question ID #12398: Consensual observer drift tends to:
Select one:

A.
increase the probability of answering a test item correctly by chance alone.

B.
decrease the possibility of answering a test item by chance alone.

C.
produce an overestimate of a test’s inter-rater reliability.

D.
produce an underestimate of a test’s inter-rater reliability.

A

The correct answer is C.

Consensual observer drift occurs when two or more observers working together influence each other’s ratings on a behavioral rating scale so that they assign ratings in a similar idiosyncratic way. Consensual observer drift makes the ratings of different raters more similar, which artificially increases inter-rater reliability.

Answers A, B, and D: These responses provide incorrect information - consensual observer drift tends to artificially inflate inter-rater reliability.

10
Q

Question ID #12413: To determine a test’s internal consistency reliability by calculating coefficient alpha, you would:
Select one:

A.
administer the test to a single sample of examinees two times.

B.
administer two alternate forms of the test to a single sample of examinees.

C.
administer the test to a single sample of examinees and have the tests scored by two raters.

D.
administer the test to a single sample of examinees one time.

A

The correct answer is D.

Determining internal consistency reliability with coefficient alpha involves administering the test one time to a single sample of examinees and using the coefficient alpha formula to determine the average degree of inter-item consistency.

Answer A: Administering the test to a single sample of examinees on two occasions would be the procedure for assessing test-retest reliability.

Answer B: Administering two alternate forms of the test to a single sample of examinees is the procedure for assessing alternate (equivalent) forms reliability.

Answer C: Having a test that was administered to a single sample of examinees scored by two raters is the procedure for assessing inter-rater reliability.
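
A minimal coefficient alpha sketch, assuming the standard formula alpha = (k / (k - 1)) * (1 - sum of item variances / total score variance); note that it requires only one administration to one sample, and the responses below are hypothetical:

```python
import numpy as np

def coefficient_alpha(scores):
    """scores: examinees x items array from a single test administration."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical responses: 4 examinees x 3 items, administered one time.
responses = [[3, 4, 3],
             [2, 2, 3],
             [5, 4, 4],
             [1, 2, 2]]
print(round(coefficient_alpha(responses), 2))
```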

11
Q

Question ID #12414: A problem with using percent agreement as a measure of inter-rater reliability is that it doesn’t take into account the effects of:
Select one:

A.
sample heterogeneity.

B.
test length.

C.
chance agreement among raters.

D.
inter-item consistency.

A

The correct answer is C.

Inter-rater reliability can be assessed using percent agreement or by calculating the kappa statistic. A disadvantage of percent agreement is that it does not take into account the amount of agreement that could have occurred among raters by chance alone, which can provide an inflated estimate of the measure’s reliability. The kappa statistic is more accurate because it adjusts the reliability coefficient for the effects of chance agreement.

Answers A, B, and D: These responses present incorrect information - percent agreement as a measure of inter-rater reliability doesn’t take into account the effects of chance agreement among raters.
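
A minimal two-rater sketch contrasting percent agreement with Cohen's kappa, which adjusts for chance agreement; the ratings are hypothetical:

```python
import numpy as np

def percent_agreement(ratings_1, ratings_2):
    ratings_1, ratings_2 = np.asarray(ratings_1), np.asarray(ratings_2)
    return (ratings_1 == ratings_2).mean()

def cohens_kappa(ratings_1, ratings_2):
    """kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    ratings_1, ratings_2 = np.asarray(ratings_1), np.asarray(ratings_2)
    observed = (ratings_1 == ratings_2).mean()
    categories = np.union1d(ratings_1, ratings_2)
    chance = sum((ratings_1 == c).mean() * (ratings_2 == c).mean() for c in categories)
    return (observed - chance) / (1 - chance)

# Hypothetical presence (1) / absence (0) ratings from two raters.
rater_1 = [1, 1, 1, 0, 1, 1, 0, 1]
rater_2 = [1, 1, 0, 0, 1, 1, 1, 1]
print(percent_agreement(rater_1, rater_2))       # 0.75
print(round(cohens_kappa(rater_1, rater_2), 2))  # 0.33 - much lower once chance agreement is removed
```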

12
Q

Question ID #13226: According to classical test theory, total variability in obtained test scores is composed of:
Select one:

A.
true score variability plus random error.

B.
true score variability plus systematic error.

C.
a combination of communality and specificity.

D.
a combination of specificity and error.

A

The correct answer is A.

As defined by classical test theory, total variability in test scores is due to a combination of true score variability plus measurement (random) error: X = T + E.

Answers B, C, and D: These responses provide incorrect information - total variability in obtained test scores is composed of true score variability plus random error.
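
A small simulation sketch of X = T + E; the distributions, sample size, and seed below are arbitrary choices made only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

true_scores = rng.normal(loc=60, scale=10, size=10_000)   # T
random_error = rng.normal(loc=0, scale=5, size=10_000)    # E, uncorrelated with T
observed_scores = true_scores + random_error               # X = T + E

# Observed score variance is (approximately) true score variance plus error variance.
print(round(observed_scores.var(), 1))
print(round(true_scores.var() + random_error.var(), 1))

# The true-score share of observed variance is the reliability (about .80 here).
print(round(true_scores.var() / observed_scores.var(), 2))
```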

13
Q

Question ID #13227: Which of the following methods for evaluating reliability is most appropriate for speed tests?
Select one:

A.
Split-half

B.
Coefficient alpha

C.
Kappa statistic

D.
Coefficient of equivalence

A

The correct answer is D.

Of the methods for evaluating reliability, the coefficient of equivalence (also known as alternate or equivalent forms reliability) is most appropriate for speed tests.

Answer A: Split-half reliability is a type of internal consistency reliability, and measures of internal consistency reliability overestimate the reliability of speed tests.

Answer B: Coefficient alpha is a type of internal consistency reliability, and measures of internal consistency reliability overestimate the reliability of speed tests.

Answer C: The kappa statistic is a measure of inter-rater reliability.
