Test Development Flashcards Preview

Flashcards in Test Development Deck (93)
1
Q

Define

Test revision

A

Action taken to modify a test’s content or format for the purpose of improving the test’s effectiveness as a tool of assessment

2
Q

Definition

a plan of the number and type of items that are required for a test, as indicated in the test specification

A

Plan for item writing

3
Q

Which item would be the best to remove?

A

Item 3

4
Q

How do you go about forming a test concept?

A
  • Review existing tests
  • Review literature regarding existing tests
  • Review the need for a test
  • Decide to develop/adapt a test
5
Q

Define

Rasch model

A

a model that relates the probability of response of a particular sort (e.g. right/wrong) to the difference between a person’s standing on a latent variable and the difficulty of the item
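A minimal sketch of the relationship described above, assuming the common one-parameter logistic form of the Rasch model. The function name `rasch_probability` and the symbols `theta` (the person's standing) and `b` (item difficulty) are illustrative choices, not taken from the deck.

```python
import math

def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct (keyed) response under the Rasch model.

    theta: person's standing on the latent variable (assumed logit scale)
    b:     item difficulty on the same scale
    """
    # Only the difference theta - b enters the model.
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When the person's standing equals the item difficulty, the probability is one half.
print(rasch_probability(0.0, 0.0))  # 0.5
```

Because only the difference between the two values matters, a person one unit above an item's difficulty has the same success probability whatever the item's absolute difficulty.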

6
Q

What are the advantages of a Likert scale?

A
  • Degree of trait can be measured
  • Lots of information
  • Easy to use and administer
  • Works best with strong (but not extreme) statements
7
Q

What sort of factor analysis would you use when number of factors is known?

A

Confirmatory factor analysis (CFA)

8
Q

Define

Model of measurement

A

the formal statement of observations of objects mapped to numbers that represent relationships among the objects

9
Q

Definition

The decrease in item validities that inevitably occurs after cross-validation

A

Validity shrinkage

10
Q

Definition

the assignment of numbers to objects according to a set of rules for the purpose of quantifying an attribute

A

Measurement

11
Q

Define

Plan for item writing

A

a plan of the number and type of items that are required for a test, as indicated in the test specification

12
Q

Definition

Action taken to modify a test’s content or format for the purpose of improving the test’s effectiveness as a tool of assessment

A

Test revision

13
Q

Definition

the extent to which the score on an item correlates with an external criterion relevant to the attribute or construct that is the subject of test construction

A

Item validity

14
Q

Definition

a way of constructing psychological tests that relies on collecting and evaluating data about how each of the items from a pool of items discriminated between groups of respondents who are thought to show or not show the attribute the test is to measure; also an approach to personality that relates the reports that people make about their characteristic behaviours to their social functioning and thereby provides tools for personality prediction

A

Empirical approach

15
Q

Give an example of a double-barrelled item

A

e.g. I support civil rights because discrimination is a crime against God.

16
Q

Define

Classical test theory

A

the set of ideas, expressed mathematically and statistically, that grew out of attempts in the first half of the twentieth century to measure psychological variables; and that turns on the central idea of a score on a psychological test comprising both true and error score components
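The central idea can be written compactly. This is the standard textbook decomposition, with $X$ the observed score, $T$ the true score and $E$ the error score. The symbols are conventional and do not appear in the deck itself, and the variance and reliability expressions assume that error is uncorrelated with the true score.

```latex
X = T + E, \qquad
\sigma^2_X = \sigma^2_T + \sigma^2_E, \qquad
r_{XX} = \frac{\sigma^2_T}{\sigma^2_X}
```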

17
Q

Definition

a way of constructing psychological tests that relies on both reasoning from what is known about the psychological construct to be measured in the test, and collecting and evaluating data about how the test and the items that comprise it actually behave when administered to a sample of respondents

A

Rational-empirical approach

18
Q

What are the disadvantages of a written/essay format test?

A
  • Narrow content
  • Bluffing possible
  • Hiding behind good writing
  • Time consuming scoring
  • Inter-rater reliability issues
19
Q

Definition

the percentage of the total group that got the item correct

A

Item-difficulty index

20
Q

Which item is better? Why?

‘I get tired after soccer’ vs. ‘I get tired after exercise’

A

‘I get tired after exercise’

21
Q

Define

Construct

A

a specific idea or concept about a psychological process or underlying trait that is hypothesised on the basis of a psychological theory

22
Q

Define

Item-difficulty index

A

the percentage of the total group that got the item correct
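The quantity in the answer above can be computed directly from scored responses. This is a small sketch assuming items are scored 1 (correct) or 0 (incorrect). The function name `item_difficulty` is hypothetical, and it returns a proportion, so multiply by 100 for a percentage.

```python
def item_difficulty(responses: list[int]) -> float:
    """Proportion of test-takers who answered the item correctly.

    responses: one entry per test-taker, 1 = correct, 0 = incorrect.
    """
    if not responses:
        raise ValueError("responses must not be empty")
    return sum(responses) / len(responses)

# Hypothetical data: 3 of 4 test-takers answered correctly.
print(item_difficulty([1, 1, 0, 1]))  # 0.75
```

Note that a higher value means an easier item, since more of the group answered it correctly.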

23
Q

Why is it often recommended to have the initial item pool reviewed by experts prior to administering the questionnaire to the target sample?

A
  1. Confirm or invalidate your definition of the construct (by asking how relevant is each item to what you intend to measure).
  2. Evaluate the items' clarity and conciseness
  3. Identify other items that you have failed to include
24
Q

Define

Test manual

A

the document that accompanies a psychological test and that records the way in which the test was developed, how the test is to be administered (including the groups for which it is relevant), information on the reliability and validity of the test when used for specific purposes, and norms for test interpretation

25
Q

Definition

a family of theories that specifies the functional relationship between a response to a single test item and the strength of the underlying latent trait

A

Item response theory (IRT)

26
Q

Define

Item characteristic curve

A

the term for a trace line in item response theory

27
Q

Definition

the formal statement of observations of objects mapped to numbers that represent relationships among the objects

A

Model of measurement

28
Q

Definition

the various forms the content of a psychological test can take

A

Item

29
Q

Definition

The test is administered to a representative sample of test-takers under conditions that simulate those under which the final version of the test will be administered

A

Test tryout

30
Q

These questions are based on which test assumption?

  • What is the test designed to measure?
  • Is there a need for this test?
  • What content will the test cover?
A

Psychological traits exist

31
Q

Definition

any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice

A

Cross validation

32
Q

Define

Attribute

A

the consistent set of behaviours, thoughts or feelings that is the target of a psychological test

33
Q

What sort of factor analysis would you use to identify a manageable number of factors to extract?

A

exploratory factor analysis (EFA)

34
Q

Define

Item

A

the various forms the content of a psychological test can take

35
Q

Definition

refers to how many attributes a dataset has

A

Dimensionality

36
Q

What is a paired comparison?

A
  • Test-taker has to choose one of two options (e.g., a statement, object, picture) on the basis of some rule
  • The value (e.g., 1 or 0) of each option in each paired comparison is determined by judges prior to test administration
37
Q

Define

Latent trait

A

the hypothesised continuously and normally distributed dimension of individual differences that is the sole source of a consistent set of observable behaviours, thoughts and feelings, which is the target of a psychological test

38
Q

Define

Validity shrinkage

A

The decrease in item validities that inevitably occurs after cross-validation

39
Q

These questions are based on which test assumption?

  • Who benefits from this test?
  • Is there any potential for harm?
A

Testing/assessment can be fair and benefit society

40
Q

Definition

the consistent set of behaviours, thoughts or feelings that is the target of a psychological test

A

Attribute

41
Q

What are some item writing guidelines for test construction?

A
  • Write items using straightforward language that is appropriate for the reading level of the population
  • Avoid double-barrelled items
  • Avoid slang and colloquial expressions that may quickly become obsolete
  • Consider if using positively and negatively worded items is a good idea
  • Write items that the majority of respondents can respond to appropriately
  • Ask about sensitive issues using straightforward and nonjudgemental language
  • Choose the item response carefully
42
Q

Define

Item analysis

A

the process of studying the behaviour of items when administered to a group of respondents, usually with a view to the selection of some of the items to form a psychological test

43
Q

Definition

the set of ideas, expressed mathematically and statistically, that grew out of attempts in the first half of the twentieth century to measure psychological variables; and that turns on the central idea of a score on a psychological test comprising both true and error score components

A

Classical test theory

44
Q

Definition

the extent to which items on a test represent the universe of behaviour the test was designed to measure

A

Content validity

45
Q

What is the aim of item development during test construction?

A

The researcher aims to generate an item pool with good content validity

46
Q

Define

Differential item functioning

A

the possibility that a psychological test item may behave differently for different groups of respondents

47
Q

Definition

a model that relates the probability of response of a particular sort (e.g. right/wrong) to the difference between a person’s standing on a latent variable and the difficulty of the item

A

Rasch model

48
Q

Define

Likert scale

A

a graphical scale originally with five points used by a respondent to represent the strength of an underlying attitude or emotion

49
Q

Definition

a graphical scale originally with five points used by a respondent to represent the strength of an underlying attitude or emotion

A

Likert scale

50
Q

Definition

a specific idea or concept about a psychological process or underlying trait that is hypothesised on the basis of a psychological theory

A

Construct

51
Q

Define

Dimensionality

A

refers to how many attributes a dataset has

52
Q

Define

Content validity

A

the extent to which items on a test represent the universe of behaviour the test was designed to measure

53
Q

What does the item-discrimination index tell you?

A

Does the item separate ‘high’ and ‘low’ scorers?
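One common way to quantify this separation is the index d = (U - L) / n, where U and L are the numbers of test-takers in the high-scoring and low-scoring groups who answered the item correctly and n is the size of each group. The sketch below assumes equally sized groups that have already been formed from total test scores. The function and parameter names are hypothetical.

```python
def discrimination_index(upper_correct: int, lower_correct: int, group_size: int) -> float:
    """d = (U - L) / n for equally sized upper and lower scoring groups."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (upper_correct - lower_correct) / group_size

# Hypothetical counts: 18 of 20 high scorers and 6 of 20 low scorers correct.
print(discrimination_index(18, 6, 20))  # 0.6
```

Values near +1 suggest the item separates the groups well; a negative value means low scorers outperformed high scorers on the item, which usually flags it for review.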

54
Q

What are the 5 broad steps to test development?

A

Test conceptualisation

Test construction

Test tryout

Item analysis

Test revision

55
Q

What does oblique rotation assume?

A

assumes factors are correlated

56
Q

Definition

the document that accompanies a psychological test and that records the way in which the test was developed, how the test is to be administered (including the groups for which it is relevant), information on the reliability and validity of the test when used for specific purposes, and norms for test interpretation

A

Test manual

57
Q

What are the disadvantages of a binary scale?

A
  • Allows guessing (T/F)
  • Only suits content where a dichotomous response can be made
  • Content not as rich
58
Q

What types of questions should you ask yourself during test conceptualisation?

A
  • What is the test designed to measure?
  • What is the objective of the test?
  • Is there a need for the test?
  • Who will use the test?
  • Who will take the test?
  • What content will the test cover?
  • How will the test be administered?
  • What is the ideal format of the test?
  • Should more than one form of the test be developed?
  • What special training will be required of test users?
  • What types of responses will be required of test takers?
  • Who benefits from this test?
  • Is there any potential for harm in developing this test?
  • How will meaning be attributed to scores on this test?
59
Q

Define

Test tryout

A

The test is administered to a representative sample of test-takers under conditions that simulate those under which the final version of the test will be administered

60
Q

Define

Test construction

A

A stage in the process of test development that entails writing test items (or rewriting or revising existing items), as well as formatting items, setting scoring rules, and otherwise designing and building a test

61
Q

True or False

Over inclusion of items during test construction is recommended

A

True

62
Q

What are the disadvantages of a Likert scale?

A
  • Number of response options needs to be considered
  • Odd vs even number of responses
63
Q

What does orthogonal rotation assume?

A

assumes factors are uncorrelated

64
Q

Definition

A stage in the process of test development that entails writing test items (or rewriting or revising existing items), as well as formatting items, setting scoring rules, and otherwise designing and building a test

A

Test construction

65
Q

Define

Empirical approach

A

a way of constructing psychological tests that relies on collecting and evaluating data about how each of the items from a pool of items discriminated between groups of respondents who are thought to show or not show the attribute the test is to measure; also an approach to personality that relates the reports that people make about their characteristic behaviours to their social functioning and thereby provides tools for personality prediction

66
Q

Define

Rational-empirical approach

A

a way of constructing psychological tests that relies on both reasoning from what is known about the psychological construct to be measured in the test, and collecting and evaluating data about how the test and the items that comprise it actually behave when administered to a sample of respondents

67
Q

Define

Test conceptualisation

A

the first stage of test development where the idea for a test begins

68
Q

Definition

the use of factor analysis inductively to identify the factor structure of a set of variables

A

Exploratory factor analysis

69
Q

These questions are based on which test assumption?

  • What is the ideal format of the test?
  • What types of responses will be required of test takers?
A

Traits/states can be measured

70
Q

What item properties might be investigated during item analysis?

A
  • Item difficulty/distribution
  • Dimensionality (i.e. factor analysis)
  • Item reliability
  • Item validity
  • Item discrimination
71
Q

Definition

a graph of the probability of response to an item as a function of the strength of or position on a latent trait

A

Trace line

72
Q

What can factor analysis provide?

A
  • Determine the number of underlying latent variables or constructs
  • Help condense information
  • Define the content or meaning of the factors
  • Helps identify items that are performing better or worse
    • Items that do not fit into any factor, or those that fit into more than one can be considered for elimination
73
Q

When looking at item distributions, what are the characteristics of items that should be flagged for removal?

A
  • Consider removing items with a highly skewed distribution
  • These are items that virtually everyone answers in the same way
    • Item conveys little information
    • Limited variability so will correlate weakly with other items (impacts on FA).
  • Keep items with a high variance/distribution
    • Likely to discriminate between the different levels of the construct
  • Keep items with a mean close to the centre of the range of possible scores
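The checks listed above can be sketched as a small summary function. It assumes responses on a fixed integer rating scale. The name `summarise_item` and the example ratings are hypothetical, and no cut-off for "too skewed" is implied, since that remains a judgement call.

```python
from statistics import mean, pvariance

def summarise_item(responses: list[int], scale_min: int, scale_max: int) -> dict:
    """Describe an item's response distribution on a fixed rating scale."""
    midpoint = (scale_min + scale_max) / 2
    item_mean = mean(responses)
    return {
        "mean": item_mean,
        "variance": pvariance(responses),
        # Distance of the mean from the centre of the possible range.
        "offset_from_midpoint": abs(item_mean - midpoint),
    }

# Hypothetical 1-5 ratings: nearly everyone answers the same way.
skewed = summarise_item([5, 5, 5, 5, 4], 1, 5)
# Hypothetical 1-5 ratings: responses spread across the range.
spread = summarise_item([1, 2, 3, 4, 5], 1, 5)
print(skewed["variance"] < spread["variance"])  # True
```

Items like the first, with little variance and a mean far from the midpoint, are the ones the list suggests considering for removal.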
74
Q

Define

Test specification

A

a written statement of the attribute or construct that the test constructor is seeking to measure and the conditions under which it will be used

75
Q

Definition

the first stage of test development where the idea for a test begins

A

Test conceptualisation

76
Q

Definition

a written statement of the attribute or construct that the test constructor is seeking to measure and the conditions under which it will be used

A

Test specification

77
Q

In what ways do tests ‘age’?

A
  • Domains change
  • Interpretations change
  • The stimuli age
  • Certain words change in their meaning
  • Test norms become outdated
  • Theories behind the test change
78
Q

What must be considered when deciding on the optimal item-difficulty index?

A

The probability of guessing correctly is taken into account when deciding the optimal item-difficulty index.
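A commonly cited rule of thumb takes the optimal difficulty to be the midpoint between the chance success rate and 1.0. The sketch below assumes that rule, exactly one correct option per item, and uniform random guessing. The function name is hypothetical.

```python
def optimal_item_difficulty(n_options: int) -> float:
    """Midpoint between the chance success rate and 1.0.

    n_options: number of response options, assuming exactly one is
    correct and a guesser picks uniformly at random.
    """
    if n_options < 2:
        raise ValueError("n_options must be at least 2")
    chance = 1.0 / n_options
    return (chance + 1.0) / 2.0

print(optimal_item_difficulty(2))  # 0.75  (true/false item)
print(optimal_item_difficulty(4))  # 0.625 (four-option multiple choice)
```

The more options an item has, the lower the chance rate and so the lower the optimal difficulty under this rule.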

79
Q

Definition

the possibility that a psychological test item may behave differently for different groups of respondents

A

Differential item functioning

80
Q

Define

Item response theory (IRT)

A

a family of theories that specifies the functional relationship between a response to a single test item and the strength of the underlying latent trait

81
Q

What are the advantages of a binary scale?

A
  • Easy to construct
  • Easy to score
  • Quick to administer
  • Large number of questions
82
Q

Definition

the process of studying the behaviour of items when administered to a group of respondents, usually with a view to the selection of some of the items to form a psychological test

A

Item analysis

83
Q

Definition

the hypothesised continuously and normally distributed dimension of individual differences that is the sole source of a consistent set of observable behaviours, thoughts and feelings, which is the target of a psychological test

A

Latent trait

84
Q

Definition

the term for a trace line in item response theory

A

Item characteristic curve

85
Q

Define

Trace line

A

a graph of the probability of response to an item as a function of the strength of or position on a latent trait

86
Q

Define

Exploratory factor analysis

A

the use of factor analysis inductively to identify the factor structure of a set of variables

87
Q

Define

Item validity

A

the extent to which the score on an item correlates with an external criterion relevant to the attribute or construct that is the subject of test construction
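The correlation in this definition can be sketched as a Pearson coefficient between item scores and criterion scores. Pearson's r is an assumption here (for a dichotomous item it corresponds to the point-biserial correlation), and the function name `item_validity` is hypothetical.

```python
import math

def item_validity(item_scores: list[float], criterion: list[float]) -> float:
    """Pearson correlation between item scores and an external criterion."""
    n = len(item_scores)
    if n != len(criterion) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(item_scores) / n
    mean_y = sum(criterion) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(item_scores, criterion))
    var_x = sum((x - mean_x) ** 2 for x in item_scores)
    var_y = sum((y - mean_y) ** 2 for y in criterion)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return cov / math.sqrt(var_x * var_y)
```

An item that everyone answers identically has zero variance, so its validity cannot be estimated at all, which is one more reason such items are candidates for removal.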

88
Q

These questions are based on which test assumption?

  • How will meaning be attributed to scores on this test?
A

Test behaviour is predictive

89
Q

Define

Cross validation

A

any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice
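One of the techniques this definition covers is k-fold cross-validation, where each case is held out exactly once. The sketch below only produces the index splits and leaves the model fitting out. The function name `k_fold_indices` is hypothetical, and in test development cross-validation often means collecting a fresh sample rather than splitting one, so treat this as an illustration of the general idea only.

```python
def k_fold_indices(n_cases: int, k: int) -> list[tuple[list[int], list[int]]]:
    """Split case indices 0..n_cases-1 into k (train, holdout) pairs."""
    if not 2 <= k <= n_cases:
        raise ValueError("k must be between 2 and n_cases")
    # Deal indices into k folds in round-robin order.
    folds = [list(range(i, n_cases, k)) for i in range(k)]
    splits = []
    for i, holdout in enumerate(folds):
        # Training set is every fold except the one held out.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, holdout))
    return splits

for train, holdout in k_fold_indices(6, 3):
    print(train, holdout)
```

A model or item set would be fitted on each `train` portion and evaluated on the matching `holdout` portion; the drop in performance on the holdout data is what produces validity shrinkage.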

90
Q

What are the advantages of a written/essay format test?

A
  • Complex, imaginative or original knowledge
  • Written communication
  • Information is generated, not merely recognised
91
Q

Define

Measurement

A

the assignment of numbers to objects according to a set of rules for the purpose of quantifying an attribute

92
Q

What does cross-validation tell you?

A

Is the test applicable to this population?

93
Q

These questions are based on which test assumption?

  • What is the ideal format of the test?
  • What types of responses will be required of test takers?
A

Tests have strengths/weaknesses/error