TMT Flashcards

1
Q

Dictation techniques (Characteristics)

A
  • Traditionally: spelling measurement
  • Now: listening and writing ability measurement
  • Direct measurement of language ability
  • For beginners: fill-in-the-blanks dictation
2
Q

Why is it important to specify texts in detail?

A
  • Representativeness (texts candidates should be able to read)
  • Authenticity (whether test texts are nativelike or not depends on what the items are intended to measure)
3
Q

Non-Manifestation of Receptive Skills (problems)

A
  • Reading and listening (the skills involved are uncertain)
  • Difficult to know whether items succeed in measuring the skills
4
Q

Receptive Skills (Problems)

A
  • Exercise of receptive skills doesn’t manifest itself in overt behaviour
  • Challenge to make tests that demand the use of the skills and result in behaviour that demonstrates their successful use
5
Q

What are 3 Direct Test Techniques?

A

1 - Oral interview (everything)

2 - Composition writing (writing, vocabulary, grammar, etc.)

3 - Summarizing - valid and direct (oral + reading comprehension, writing)

6
Q

How to Improve Positive Washback

A
  • Test the abilities you want to develop
  • Sample widely and unpredictably
  • Use criterion-referenced / direct testing
  • Measure course objectives
  • Provide support for teachers
  • Weigh impact against practicability
7
Q

7 Specifications of Test Development When Stating the Testing Problem

A
  • Type
  • Purpose
  • Test-Takers
  • Context
  • Abilities to be tested (part of purpose)
  • Test score consequences
  • Limitations (of context)
8
Q

How is the validity of a test determined?

A

in relation to the purpose of the test

  • What the test should measure
9
Q

What are Test Specifications?

A
  • Blueprint of a test
  • Details and information that enable the development of a test
  • Pre-test development Stage
10
Q

Who needs test specifications?

A
  • Teachers and test developers
  • Item writers
  • Test Validators
  • Test Users (e.g. Schools)
11
Q

What is Interactiveness?

A
  • Quality of the test itself
  • A task is interactive if it corresponds to the test taker's language ability
  • If a task is not interactive = not valid
12
Q

What is the purpose of Construct Definition in Test Specification?

A
  • Specify Abilities to be Assessed
  • Determines What we have to ask learners to do
13
Q

Construct Definition (Example)

For a test of reading

A

The macro-construct is reading.

  • Reading Comprehension
    – Skim
    – Scan
    – Speed

14
Q

Why provide as much information on content as possible in test specs?

A

More information on content =

  • fewer arbitrary decisions about what to include in any version of the test
15
Q

What are the 7 test specifications?

A
  • Content
  • Structure
  • Timing
  • Medium/Channel
  • Techniques
  • Criterial Levels of Performance
  • Scoring Procedures
16
Q

What is Reliability?

A
  • Quality of test score
  • Consistency of measurement across different characteristics of testing situations
17
Q

What are the 7 reliability factors?

A
  • Make Longer Tests
  • Choose Items that Discriminate
  • Provide Clear Instructions
  • Write Clear Items
  • Do Not give Much Choice
  • Make Objective Tests
  • Make Well Laid-Out and Legible Tests
18
Q

Authenticity and Interactiveness in Direct Testing

A

Authenticity - Using Construct Definition

Interactiveness - Avoiding under- or over-representation of the construct (measure nothing more, nothing less)

19
Q

Discrete-Point vs. Integrative Items

A

DP - Items that measure One thing at a time

II - Items that measure Many things at Once

20
Q

Items that measure Many things at Once

A

Integrative

21
Q

Items that measure One thing at a time

A

Discrete-Point

22
Q

Norm-Referenced vs. Criterion-Referenced

A

N-R - not meaningful in itself - candidate compared to other test takers (estimates a grade)

C-R - meaningful - tells what students know in each language ability

23
Q

High test Usefulness (3 Principles)

A
  • Overall Usefulness Important (not only 1 quality)
  • Evaluate Combined effect of test qualities to get overall test usefulness
  • Test Usefulness and Qualities Balance must be adapted to each testing situation
24
Q

Test Usefulness (6 qualities)

A
  • Reliability
  • Validity
  • Authenticity
  • Interactiveness
  • Impact
  • Practicality
25
Q

What is validity? (+ what approach)

A
  • Quality of test scores
  • The test should measure what it's supposed to measure
  • Ongoing process
  • Construct validity (unitary/unified concept)
26
Q

What is Authenticity?

A
  • Quality of the test itself
  • How the test measures abilities through tasks authentic to real life
  • Tied to validity: whether authenticity measures fewer or more constructs
27
Q

Authenticity (Validity)

A
  • If authenticity measures only some of the constructs = construct under-representation
  • If authenticity measures more than the constructs = construct over-representation
28
Q

What is Holistic Scoring?

A
  • Point of departure = ability
  • Compensatory = all elements tested in one criterial level (one weak ability can be offset by mastery of another, so the score can still be high)
29
Q

What is the most important characteristic or quality of a test?

A

Construct validity - test what should be tested
30
Q

Testing (Frequency, Objectives, Tasks)

A
  • Frequency: one time, summative
  • Objectives: vague, broad subjects
  • Tasks: not ideal for evaluating language abilities
31
Q

Assessment (Frequency, Objectives, Tasks)

A
  • Frequency: many times, gradual, formative
  • Objectives: focused, leads to adapted activities
  • Tasks: easily evaluate language abilities
32
Q

Validity (in Direct Testing)

A
  • More direct = more validity
  • More direct = more subjective
  • Make the test as valid as possible before addressing reliability
33
Q

Validity's Purpose

A
  • Specify the test constructs
  • Make sure tasks measure nothing more, nothing less
34
Q

Reliability (in Direct Testing)

A
  • Too direct = less reliability
  • More subjective = less reliability
  • Once the test is valid, make it reliable and objective with VALID SCORING
35
Q

Valid Scoring

A

Developing good scales:

  • Restrict teachers and raters (so they don't rate according to their own taste)
  • The scale is imposed and used by many raters
36
Q

Valid Scoring (important)

A
  • Important for measuring direct assessment (writing & oral production)
37
Q

How to Ensure Reliability in Testing

A
  • Maximum number of tasks
  • Restrict the candidates
  • Give no choice of task
  • Create scoring scales
  • Train scorers
  • Have multiple scoring
38
Q

Content Specification (Elements)

A
  • Operations/Instructions
  • Input materials
  • Addressees of texts
  • Length of test and tasks
  • Dialect, accent, style & speed
39
Q

Content Specification (Operations/Instructions)

A

The tasks candidates have to be able to carry out
40
Q

Content Specification (Input Materials)

A
  • Types of texts
  • Topics
  • Vocabulary range
  • Structural range
41
Q

Content Specification (Addressees of Texts)

A

Who candidates are expected to be able to speak to
42
Q

Content Specification (Speed)

A

Speed of processing
43
Q

Multiple-Choice Items (Disadvantages)

A
  • Test only recognition knowledge
  • Guessing may have a considerable but unknowable impact on test scores
  • Restrict what can be tested
  • Difficult to write successful items
  • Backwash may be harmful
  • Cheating may be facilitated
44
Q

Functions Test Techniques Should Have

A
  • Elicit behaviour that is a reliable and valid indicator of the ability
  • Elicit behaviour that can be reliably scored
  • Be economical of time and effort
  • Have a beneficial backwash effect
45
Q

Cloze Procedure (Characteristics)

A
  • Not completely direct
  • Used in reading tests
  • Preferred over multiple-choice tests
  • Easy to develop & score
46
Q

Too Specific Content Specification (Danger, Hughes)

A
  • May go beyond our understanding of the components of language ability and their relationships
47
Q

Content Specification (Hughes, Safest Procedure)

A

Include only elements whose contribution is well established
48
Q

Test Techniques Definition (Hughes)

A

Means of eliciting behaviour from candidates that will tell us about their language abilities
49
Q

Writing and Moderating Items (Steps)

A

1 - Sampling

2 - Writing items

3 - Moderating items
50
Q

Sampling (Writing and Moderating Items)

A
  • Choose content (content validity + positive backwash)
  • Sample widely & unpredictably
51
Q

Writing Items (Writing and Moderating Items)

A
  • Write items based on the test specification, through the eyes of test-takers, to avoid misunderstanding
  • Consider all possible responses + provide a key
52
Q

Moderating Items (Writing and Moderating Items)

A
  • Show items to colleagues to identify problems
  • If a faulty item cannot be modified = delete the item
53
Q

Specification of Criterial Levels of Performance (Elements)

A
  • Accuracy
  • Appropriacy
  • Range
  • Flexibility
  • Size
54
Q

Criterial Level of Performance (Accuracy)

A

Pronunciation, grammatical & lexical accuracy
55
Q

Criterial Level of Performance (Appropriacy)

A

Use of language appropriate to function
56
Q

Criterial Level of Performance (Range)

A

Range of language available; not searching for words
57
Q

Criterial Level of Performance (Flexibility)

A

Ability to initiate conversation and adapt to new topics
58
Q

Criterial Level of Performance (Size)

A

Can give long answers, explain, and develop