Standardized Tests and Ethics of Test Prep Flashcards

(17 cards)

1
Q

Standardized Test

A

A test designed to be administered to a given group of students with a predetermined set of curricular aims, difficulties, formats, and scale measurement procedures.

2
Q

Norm-Referenced

A

A test designed to measure students’ performance in one or more skills comparatively, that is, relative to the performance of the other students in a norm group.

3
Q

Criterion-Referenced

A

A type of test designed to determine how well a student currently performs on predetermined curricular aims.

4
Q

Mean

A

The average of a set of scores: add the scores together and divide by the total number of scores.

5
Q

Median

A

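The middle number of a score set that has been sequenced from low to high. If the set has an even number of scores, the median is the average of the two middle scores.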
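As a quick illustration of the mean and the median, here is a minimal sketch using Python's standard `statistics` module. The scores are made-up illustrative data, not taken from any real test.

```python
from statistics import mean, median

# Hypothetical raw scores for a small class (illustrative data only).
scores = [70, 85, 90, 65, 80]

# Mean: add the scores together and divide by the number of scores.
class_mean = mean(scores)

# Median: the middle value once the scores are put in order.
class_median = median(scores)

# With an even number of scores, the median is the average of the two middle ones.
even_median = median([65, 70, 80, 85])

print(class_mean, class_median, even_median)
```

A single very high or very low score pulls the mean more than the median, which is one reason the two are often reported together.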

6
Q

Stanine

A

A result reported on the stanine scale, which divides a score distribution that roughly follows a standard bell curve into nine sections. The fewest scores fall in the lowest and highest sections, and the majority fall in the central sections, four through six. Hence, a student who scores a stanine of 9 has attained a level achieved by only about 4% of the tested population, while a student who scores a stanine of 4 ranks with about 17% of the population. Stanines are approximations of skill, which gives them a degree of flexibility that can be interpreted as a strength.
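The nine sections can be sketched as a lookup from percentile rank to stanine. This is a minimal sketch that assumes the commonly cited stanine shares of 4, 7, 12, 17, 20, 17, 12, 7, and 4 percent. The function name and the handling of scores that land exactly on a boundary are illustrative assumptions, not part of the card.

```python
from bisect import bisect_right

# Cumulative percentage of test takers at or below stanines 1 through 8,
# built from the shares 4, 7, 12, 17, 20, 17, 12, 7, 4 (assumed standard shares).
CUMULATIVE_UPPER_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine_from_percentile(percentile_rank: float) -> int:
    """Map a percentile rank (0-100) to a stanine (1-9).

    Assumption: a rank exactly on a boundary is placed in the higher stanine.
    """
    if not 0 <= percentile_rank <= 100:
        raise ValueError("percentile rank must be between 0 and 100")
    return bisect_right(CUMULATIVE_UPPER_BOUNDS, percentile_rank) + 1


print(stanine_from_percentile(50))  # middle of the distribution
print(stanine_from_percentile(98))  # top section
```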

7
Q

Percentile

A

A comparative number that states what percent of the norm group you outperformed on whatever aim is being measured.
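A minimal sketch of that idea follows, assuming "outperformed" means "scored strictly higher than". Test publishers may handle tied scores differently. The function name and the norm-group scores are hypothetical.

```python
def percentile_rank(score, norm_group_scores):
    """Percent of the norm group that scored lower than the given score."""
    if not norm_group_scores:
        raise ValueError("norm group must not be empty")
    # Count only the scores strictly below the student's score (assumption).
    outperformed = sum(1 for s in norm_group_scores if s < score)
    return 100 * outperformed / len(norm_group_scores)


# Hypothetical norm group of ten students.
norm_group = [55, 60, 62, 70, 71, 75, 80, 84, 90, 95]
rank = percentile_rank(80, norm_group)  # six of the ten scores are lower
print(rank)
```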

8
Q

Normal Curve Equivalent

A

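Uses raw score information to generate a percentile-like score on a scale of 1 to 99, calculated as if the scores followed a standard bell curve. It requires a normally distributed set of scores to offer valid judgments. If the scores are not normally distributed, the NCE is not meaningful.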
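The card does not give a formula. The sketch below uses the conventional definition of the NCE scale (mean 50, standard deviation 21.06) and converts a percentile rank into an NCE by way of the normal curve. The function name is illustrative, and the whole conversion assumes normally distributed scores.

```python
from statistics import NormalDist

# Conventional NCE scale: mean 50 and standard deviation 21.06, chosen so that
# NCEs of 1, 50, and 99 line up with percentile ranks of 1, 50, and 99.
NCE_MEAN = 50.0
NCE_SD = 21.06


def nce_from_percentile(percentile_rank: float) -> float:
    """Convert a percentile rank (strictly between 0 and 100) to an NCE."""
    if not 0 < percentile_rank < 100:
        raise ValueError("percentile rank must be strictly between 0 and 100")
    # z is the position on the standard bell curve for this percentile rank.
    z = NormalDist().inv_cdf(percentile_rank / 100)
    return NCE_MEAN + NCE_SD * z


for pr in (1, 25, 50, 75, 99):
    print(pr, round(nce_from_percentile(pr), 1))
```

Unlike percentile ranks, NCEs are spaced evenly along the scale, so they can reasonably be averaged.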

9
Q

Standard Deviation

A

Shows how spread out the scores are around the mean score. Higher numbers = More deviation. Lower numbers = Less deviation.
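A small sketch of the idea follows, using two made-up score sets that share the same mean but differ in spread. It uses `statistics.pstdev`, which treats the scores as the whole group rather than a sample. That choice is an assumption.

```python
from statistics import mean, pstdev

# Two hypothetical classes with the same mean score (illustrative data only).
tight = [78, 79, 80, 81, 82]   # scores cluster near the mean
wide = [60, 70, 80, 90, 100]   # scores are spread far from the mean

print(mean(tight), round(pstdev(tight), 2))
print(mean(wide), round(pstdev(wide), 2))
```

The second class has the larger standard deviation even though both classes have the same mean.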

10
Q

Standard Error of Measurement

A

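An estimate of how consistent an individual student’s performance would be on equally difficult tests over the same content. The smaller the standard error of measurement, the more consistent the student’s score is likely to be.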
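The card does not give a formula. A commonly used one is SEM = SD × √(1 − reliability), where SD is the standard deviation of the test scores and reliability is the test's reliability coefficient. The sketch below assumes that formula. The function name and the example values are hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability), the commonly used formula (assumed here)."""
    if sd < 0:
        raise ValueError("standard deviation must not be negative")
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1 - reliability)


# Hypothetical test: standard deviation of 10 points, reliability of 0.91.
sem = standard_error_of_measurement(10, 0.91)
print(round(sem, 2))
```

A more reliable test gives a smaller SEM, so a student's score would be expected to vary less from one equally difficult test to the next.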

11
Q

P-Value

A

A difficulty measurement rendered by dividing the total number of correct answers on an item by the number of students attempting the item. If ten students attempted a question and seven of them answered correctly, then the p-value of the question would be .7.
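The calculation on this card can be sketched in a few lines. The function name is illustrative.

```python
def item_p_value(num_correct: int, num_attempted: int) -> float:
    """Item difficulty: correct answers divided by students attempting the item."""
    if num_attempted <= 0:
        raise ValueError("at least one student must attempt the item")
    if not 0 <= num_correct <= num_attempted:
        raise ValueError("correct answers must be between 0 and the number attempting")
    return num_correct / num_attempted


# The example from the card: ten students attempt, seven answer correctly.
print(item_p_value(7, 10))
```

A higher p-value means more students answered the item correctly, so the item was easier for that group.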

12
Q

Scaled Score

A

Converts a raw score to a different set of numbers that is used consistently across a particular battery of assessments. Generally, these scores are higher for better performance and lower for poorer performance. Scale scoring can take item response theory (IRT) into account, allowing the difficulty of items to be factored into results; however, IRT is not accessible to the layperson without significant assistance. The NCE and the stanine scale are both types of scaled scores.

13
Q

Item Response Theory

A

A complex variety of scaling that permits the evaluator to factor in the challenge (or other chosen properties) of questions on an assessment. This permits evaluators to assign scale ranges throughout the grades, but results are challenging to interpret for both educators and students’ families without significant framing.

14
Q

Grade Equivalent

A

A developmental score composed of a grade level and a month of the school year, separated by a period. For example, in a score of 6.3, the 6 indicates the grade level and the 3 indicates the third month of the school year. These scores are frequently misunderstood. They should be framed in terms of the precise standards being measured and of relative developmental equivalency with students at that grade level, at that point in time, on that examination. They should not be taken as an indication that the student is performing at that grade level in a completely comprehensive and literal sense. Popham (2020) explains that when a student receives a grade equivalent higher than his or her actual grade, it should be read as a general estimate of how an average student at the higher grade level would have performed on the test for the student’s own grade, not as how the student would perform on a test written for the “achieved” grade level, whose standards and rigor would likely be different.
Popham, W.J. (2020). Classroom assessment: What teachers need to know. Pearson.

15
Q

Item Discrimination

A

A judgment of how well an item discriminates between high and low scorers on the assessment. As a rule, a question that is answered correctly by those who perform well on the test, and incorrectly by those who perform poorly, is considered a “positive discriminator,” which is desirable. Questions that are answered correctly more often by those who perform poorly than by those who score well are “negative discriminators,” which are generally regarded as troublesome. Questions that high and low scorers answer correctly at about the same rate are “nondiscriminators” and are relatively neutral. Ideally, positive discriminators should be the most common items on the assessment (Popham, 2020, p. 273).

16
Q

Item Discrimination Procedure

A
  1. Sequence the scores from high to low and split the stack in half to form a high group and a low group.
  2. Figure the item’s p-value for each group (number of correct answers / number of students in the group).
  3. Subtract the low group’s p-value from the high group’s p-value to get D (the discrimination index).

Popham, W.J. (2020). Classroom assessment: What teachers need to know. Pearson.
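The three steps of the procedure can be sketched as follows. The data layout (a list of total-score and correct/incorrect pairs), the function name, and the choice to leave the middle student out when the group size is odd are all assumptions made for illustration.

```python
def discrimination_index(results):
    """Compute D for one item.

    results: list of (total_test_score, answered_item_correctly) pairs,
    one pair per student.
    """
    # Step 1: sequence from high to low total score and split in half.
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    half = len(ranked) // 2
    if half == 0:
        raise ValueError("need at least two students")
    high_group = ranked[:half]
    low_group = ranked[-half:]  # with an odd count, the middle student is left out

    # Step 2: p-value of the item within each group.
    p_high = sum(1 for _, correct in high_group if correct) / half
    p_low = sum(1 for _, correct in low_group if correct) / half

    # Step 3: D is the high group's p-value minus the low group's p-value.
    return p_high - p_low


# Hypothetical results for ten students on a single item.
results = [
    (95, True), (90, True), (88, True), (85, False), (80, True),
    (70, True), (65, False), (60, True), (55, False), (50, False),
]
d = discrimination_index(results)
print(round(d, 2))
```

In this made-up example, four of the five high scorers and two of the five low scorers answered correctly, so the item is a positive discriminator.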

17
Q

Discrimination Index Evaluations

A

.40 and higher: Strong
.30 - .39: Reasonably good, can still be refined
.20 - .29: Okay, should probably be improved
.19 and lower: Weak. Reject or revise.

Popham, W.J. (2020). Classroom assessment: What teachers need to know. Pearson.
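The bands above can be sketched as a simple lookup. The function name and the short labels are illustrative. The index is rounded to two decimal places first, since the bands are stated to two decimals, and that rounding step is an assumption.

```python
def evaluate_discrimination_index(d: float) -> str:
    """Return a short label for a discrimination index, using the bands above."""
    d = round(d, 2)  # bands are stated to two decimals; also absorbs float noise
    if d >= 0.40:
        return "strong"
    if d >= 0.30:
        return "reasonably good"
    if d >= 0.20:
        return "okay"
    return "weak"


for value in (0.45, 0.35, 0.25, 0.10, -0.20):
    print(value, evaluate_discrimination_index(value))
```

Negative values fall into the lowest band, which matches the card's advice to reject or revise such items.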