CHAPTER 4: OF TESTS AND TESTING Flashcards
It has been defined as “any distinguishable, relatively enduring way in which one individual varies from another.”
Trait
It distinguishes one person from another, but is relatively less enduring.
State
It is an informed, scientific concept developed or constructed to describe or explain behavior.
Construct
It refers to an observable action or the product of an observable action, including test- or assessment-related responses.
Overt Behavior
This assumption holds that psychological traits (like shyness or intelligence) and states (like anxiety or happiness) are real and meaningful ways to describe how people differ from one another. Traits are relatively stable over time, while states are more temporary. Although traits and states are not directly observable, their existence is inferred from behavior, whether through observation, test answers, or self-reports. These constructs help psychologists explain and predict behavior. However, traits do not appear 100% of the time and can be influenced by the situation and the person’s environment. Importantly, how we label traits (like “shy” or “outgoing”) often depends on the context and the comparison group being used.
Assumption 1: Psychological Traits and States Exist
It refers to a method of interpreting test results where each response contributes to a total score that reflects the strength of a specific trait, ability, or state.
Cumulative Scoring
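As a minimal sketch of cumulative scoring, the example below totals a test taker's item responses into one score. The function name `cumulative_score`, the optional per-item weights, and the five-item "shyness" scale rated 0 to 4 are hypothetical and chosen only for illustration.

```python
def cumulative_score(responses, weights=None):
    """Sum item responses into a single total score.

    Each response adds to the total, so a higher total is read as
    more of the trait, ability, or state being measured.
    """
    if weights is None:
        # Assumption: every item counts equally unless weights are given.
        weights = [1] * len(responses)
    if len(weights) != len(responses):
        raise ValueError("need exactly one weight per response")
    return sum(r * w for r, w in zip(responses, weights))


# Hypothetical 5-item shyness scale, each item rated 0-4.
answers = [3, 2, 4, 1, 3]
print(cumulative_score(answers))  # 13
```

Weights are included because a test developer may decide that some items should contribute more than others to the final score; with no weights, the score is a plain sum.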
This assumption holds that once psychological traits or states are defined—such as aggression, intelligence, or anxiety—they can be measured in numerical terms using well-designed tests. Since traits like “aggression” can have different meanings depending on the context, test developers must create clear operational definitions and select behaviors that best represent those definitions. Then, they design test items that reflect these behaviors and determine how much each item should contribute to the final score. Through cumulative scoring, a person’s responses are totaled, and their score reflects the strength or level of the trait being measured. In essence, this assumption supports the idea that even abstract psychological qualities can be reliably quantified and evaluated.
Assumption 2: Psychological Traits and States Can Be Quantified and Measured
This assumption states that behavior shown during a test is meaningful because it can predict behavior outside of the test. While test tasks may seem simple or unrelated, like answering multiple-choice questions or pressing keyboard keys, they are designed to reflect or correlate with broader psychological traits or behaviors. For instance, personality test responses can indicate the likelihood of certain mental health issues, and job-related tests can predict future work performance. In some cases, tests are used not to predict but to postdict behavior—helping understand past behavior, such as in forensic settings. Essentially, test results serve as samples that help forecast or explain real-world behaviors.
Assumption 3: Test-Related Behavior Predicts Non-Test-Related Behavior
This assumption emphasizes that no test is perfect—every psychological test or measurement tool has its own strengths, limitations, and appropriate uses. Competent and ethical test users must fully understand the test’s development, purpose, proper administration, and interpretation. They must also be aware of what a test cannot do and be able to compensate for its weaknesses by using other sources of information when needed. This assumption highlights the importance of responsible and knowledgeable use of testing tools in psychological assessment.
Assumption 4: Tests and Other Measurement Techniques Have Strengths and Weaknesses
This assumption highlights that error is an unavoidable part of psychological testing: not necessarily a mistake, but a natural part of the measurement process. Test scores can be influenced by many factors other than the trait being measured, such as the examinee’s health, mood, environment, the assessor’s behavior, or even test flaws. These influences contribute to error variance, which reflects how much of a test score is due to factors unrelated to the actual trait. Measurement theories such as Classical Test Theory (CTT) and Item Response Theory (IRT) both account for this inherent variability. Essentially, error is not a flaw in testing but a factor that must always be considered in interpreting results.
Assumption 5: Various Sources of Error Are Part of the Assessment Process
It refers to the part of a test score that is caused by factors unrelated to the trait or ability being measured. It is the “noise” in a score that makes it less accurate or reliable because it reflects influences other than the actual construct being assessed.
Error Variance
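The sketch below illustrates error variance using the standard Classical Test Theory decomposition, in which an observed score is a true score plus error and, when the two are uncorrelated, observed variance is true variance plus error variance. The numbers are invented for illustration; in practice true scores are never observed directly, so this is a toy demonstration rather than an estimation procedure.

```python
from statistics import pvariance

# Hypothetical true scores and errors for four test takers.
# They are chosen so that true score and error are uncorrelated.
true_scores = [10, 10, 20, 20]
errors = [1, -1, 1, -1]

# Observed score = true score + error (the basic CTT model).
observed = [t + e for t, e in zip(true_scores, errors)]

var_true = pvariance(true_scores)   # variance due to the trait
var_error = pvariance(errors)       # error variance (the "noise")
var_observed = pvariance(observed)  # total variance in observed scores

print(var_true, var_error, var_observed)  # 25 1 26

# Share of observed-score variance that is error.
error_share = var_error / var_observed
print(round(error_share, 3))  # 0.038
```

Here only about 4% of the variability in observed scores is noise, so these made-up scores would count as highly reliable.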
This assumption asserts that psychological testing and assessment can be fair and unbiased, but fairness depends heavily on how tests are developed and used. Although most modern test publishers strive for fairness by adhering to standardized guidelines, fairness issues still arise—especially when tests are used with populations they weren’t designed for. Sometimes these concerns are less about the test itself and more about societal goals, like those seen in debates over affirmative action. Ultimately, tests are tools, and their fairness depends on whether they are used appropriately and ethically.
Assumption 6: Testing and Assessment Can Be Conducted in a Fair and Unbiased Manner
This assumption emphasizes that psychological testing and assessment ultimately benefit society. While some may see tests as burdensome, especially in academic settings, their absence would lead to chaos and inefficiency in critical fields like medicine, education, aviation, and the military. Without tests, there would be no standardized way to evaluate competence, diagnose difficulties, or make fair and informed decisions. Thus, testing serves as an essential tool for ensuring safety, fairness, and effective functioning in many areas of life.
Assumption 7: Testing and Assessment Benefit Society
It refers to the consistency or stability of test scores over time or across raters. A good test (or, more generally, a good measuring tool or procedure) is reliable, meaning that it measures consistently.
Reliability
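One common way to put a number on consistency over time is to correlate scores from two administrations of the same test (a test-retest coefficient). The sketch below assumes that approach; the helper `pearson_r` and the two score lists are hypothetical, and other reliability indices exist.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length, at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Note: raises ZeroDivisionError if either list has no variation.
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five people tested twice, a few weeks apart.
time_1 = [10, 12, 14, 16, 18]
time_2 = [11, 12, 15, 15, 19]

retest_r = pearson_r(time_1, time_2)
print(round(retest_r, 2))  # 0.96
```

A coefficient near 1 means people kept roughly the same relative standing across the two sessions, which is what "stability over time" refers to.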
It refers to the accuracy of a test in measuring what it is intended to measure. A test is considered valid for a particular purpose if it does, in fact, measure what it purports to measure.
Validity
It is a method of evaluation and a way of deriving meaning from test scores by evaluating an individual test taker’s score and comparing it to the scores of a group of test takers.
Norm-referenced Testing and Assessment
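To make the comparison concrete, the sketch below interprets a raw score relative to a group of scores in two common ways: a z-score and a percentile rank. The function names and the ten norm-group scores are hypothetical. Percentile rank is defined here as the percentage of the group scoring strictly below the raw score, which is one of several conventions.

```python
from statistics import mean, pstdev


def z_score(raw, norm_scores):
    """Distance of a raw score from the group mean, in standard deviations."""
    # Assumption: the norm group is treated as the whole reference
    # population, so the population standard deviation is used.
    return (raw - mean(norm_scores)) / pstdev(norm_scores)


def percentile_rank(raw, norm_scores):
    """Percentage of the norm group scoring strictly below the raw score."""
    below = sum(1 for s in norm_scores if s < raw)
    return 100 * below / len(norm_scores)


# Hypothetical scores from a ten-person norm group (mean = 50).
norm_scores = [40, 45, 50, 50, 55, 60, 50, 45, 55, 50]

print(percentile_rank(55, norm_scores))  # 70.0
print(round(z_score(60, norm_scores), 2))  # 1.83
```

The same raw score would earn a different z-score and percentile rank against a different group, which is why the choice of comparison group matters in norm-referenced interpretation.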
It pertains to the test performance data of a particular group of test takers that are designed for use as a reference when evaluating or interpreting individual test scores.
Norms
It is the group of people whose performance on a particular test is analyzed for reference in evaluating the performance of individual test-takers.
Normative Sample
It refers to the process of deriving norms.
Norming
It is the controversial practice of norming on the basis of race or ethnic background.
Race Norming
It consists of descriptive statistics based on a group of test-takers in a given period of time, rather than norms obtained by formal sampling methods.
User Norms or Program Norms
It is the process of administering a test to a representative sample of test-takers for the purpose of establishing norms.
Standardization or Test Standardization
It is a subset of a larger population, selected to represent the characteristics of the whole population.
Sample
It is the process of selecting a portion of the universe deemed to be representative of the whole population.
Sampling
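As a rough illustration of sampling, the sketch below draws a simple random sample from a population of ID numbers. Simple random sampling is only one possible method and does not by itself guarantee a representative sample; the function name `draw_sample`, the population of 1,000 IDs, and the sample size of 50 are assumptions made for the example.

```python
import random


def draw_sample(population, k, seed=None):
    """Draw k distinct members from the population at random.

    A seed makes the draw repeatable, which is useful for checking work.
    """
    rng = random.Random(seed)
    return rng.sample(population, k)


# Hypothetical population: 1,000 potential test takers, identified by number.
population = list(range(1, 1001))

sample = draw_sample(population, 50, seed=42)
print(len(sample))  # 50
```

Every member of the population has the same chance of being chosen, and no one is chosen twice.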