Psychometrics Flashcards
(45 cards)
Name and define 2 approaches to the study of IDs
1) Nomothetic: seeking general laws of e.g. personality
2) Idiographic: seeking to understand the unique person in his/her glowing individuality
Give 3 uses of personality or IQ tests
1) Their scientific use in evaluating theories
2) Their practical use in educational, clinical & personnel assessment
3) The factors influencing their design are relevant to other areas of psychology & other social sciences
Why do tests face ethical & political dilemmas? Give an e.g. Tests are also used to justify the ___ ___ ___
Because they are used to allocate jobs, resources and educational opportunities e.g. the debate regarding why some groups have higher IQs than others. Political status quo
Who began the scientific study of IDs with the ‘Classification of Men According to their Natural Gifts’ in 1869?
Galton (cousin of Darwin)
Allport (1937) labelled psychology’s nomothetic approach as ___-___. It ignored…. Murray (1938) similarly argued that any human being is in some respects…1), 2) & 3)
Pseudo-scientific. The outstanding characteristic of man which is his individuality. 1) like all other humans, 2) like some other humans & 3) like no other humans
Define a psychological test
An objective and standardised measure of a small but carefully chosen sample of a person’s behaviour
Psychological test items must be ___ of the entire domain of behaviour which the test intends to measure
Representative
Name 4 uses of achievement testing
1) Diagnosis of strengths & weaknesses
2) Evaluation of a person’s progress
3) Info for educational policy
4) Accountability
Define standardisation. Why is it necessary? Give 5 e.g.s of what should be standardised
Ensuring uniformity of the procedure used to administer & score a psychological test. To allow results from different people to be directly compared. Materials used, time limits set, oral instructions, preliminary demonstrations & ways of handling queries
After administering the test to a large sample of people we can calculate the ___ & ___ of scores so that we can compare each person’s score directly to these calculated ___ to specify that person’s strengths and weaknesses
Mean. SD. Norms
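A minimal sketch of comparing one person's score to the norms, assuming the norm sample is just a list of raw scores. The function name `z_score` and the numbers are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev

def z_score(raw_score, norm_scores):
    """Express a raw score in SD units above or below the norm-sample mean."""
    m = mean(norm_scores)
    sd = stdev(norm_scores)  # sample SD of the norm group
    return (raw_score - m) / sd

# Hypothetical norm sample (mean 100); not real test data.
norm_sample = [85, 90, 100, 110, 115]

print(z_score(115, norm_sample))  # positive: above the norm-sample mean
print(z_score(100, norm_sample))  # zero: exactly at the norm-sample mean
```

A positive value suggests a relative strength and a negative value a relative weakness, always relative to this particular norm group.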
What is dynamic testing? Why is it useful?
It’s when the difficulty of the next Q presented by the computer is determined by whether the Pp answered the previous Q correctly or not. Dynamic testing begins with Qs of moderate difficulty. This more personalised testing procedure prevents floor or ceiling effects from being reached
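The selection rule described above can be sketched as a simple up/down staircase. The 1 to 10 difficulty scale, the step size of 1 and the function names are assumptions made for illustration; the card does not say how a real computerised test picks the next Q.

```python
def next_difficulty(current, was_correct, lowest=1, highest=10):
    """Move one level up after a correct answer, one level down after an error."""
    step = 1 if was_correct else -1
    # Clamp so the level never leaves the available difficulty range.
    return max(lowest, min(highest, current + step))

def difficulties_presented(answers, start=5):
    """Return the difficulty level shown for each Q, starting at a moderate level."""
    level = start
    presented = []
    for was_correct in answers:
        presented.append(level)
        level = next_difficulty(level, was_correct)
    return presented

# Hypothetical Pp: two correct answers, one error, then one correct answer.
print(difficulties_presented([True, True, False, True]))  # [5, 6, 7, 6]
```

Because the level tracks the Pp's performance, a strong Pp is moved towards harder Qs and a weak Pp towards easier ones, rather than both working through the same fixed set.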
Name 3 ways in which we must control the use of a psychological test
1) Ensure the test is administered by a qualified examiner
2) Prevent the test from reaching the public domain so as to prevent general familiarity with test content which would invalidate it
3) Ethical issues e.g. individual feedback should not be given where unnecessary e.g. in experiments
Why is invalidation of a test problematic?
Because it results in having to spend time & money designing a new test
Name 5 disadvantages of testing. Why was testing unnecessary in the past?
1) Qs may be biased towards a particular cultural group or gender, 2) when a person gets a Q wrong we don’t know why, 3) test anxiety confounds results, 4) being familiar with psych tests in general may provide an advantage, 5) ecological validity. Because close communities relied on LT family reputations
Define reliability in the normal sense & in the sense of sources of measurement error. A test is reliable when…
The extent to which a test is consistent in what it measures. The test’s relative freedom from (tendency not to be affected by) unsystematic errors of measurement. When it gives the same results for the same person under varying conditions that can produce unsystematic measurement errors
Distinguish between and give e.g.s of systematic and unsystematic errors. Which affect reliability?
U: affect test scores in a random, unpredictable manner from situation to situation & so lower test reliability. E.g. Pps’ motivation & attention. S or constant errors affect test scores in a fixed way & so do not affect reliability e.g. practice effects & resultant learning
What is error variance?
Any distortions to scores which are irrelevant to the test’s purpose e.g. fluctuations in mood or gender. Note that a factor classified as providing error variance in exp 1) may be classified as providing true variance in exp 2)
Name 4 sources of error variance which threaten reliability
1) Time sampling
2) Content sampling
3) Content heterogeneity
4) Observer differences
How do we check for error variance caused by time sampling? How long should the inter-test time interval be? Why?
Administer the same test to the same Pps at different points in time, which gives us a test-retest reliability coefficient. 6 months. To prevent recall of test answers whilst also ensuring the person’s IQ/personality won’t have changed due to a change in their situation
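A minimal sketch of a test-retest reliability coefficient, assuming it is computed as the Pearson correlation between the two administrations (the card does not name the statistic). The scores and the helper name `pearson_r` are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores for the same 5 Pps at time 1 and at retest.
time1 = [98, 105, 110, 120, 92]
time2 = [100, 103, 112, 118, 95]

r = pearson_r(time1, time2)
print(round(r, 2))  # close to 1 when Pps keep their rank order across time
```

A coefficient near 1 suggests little error variance from time sampling; a low coefficient suggests scores fluctuate between occasions.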
From a 6 to 42 year test-retest interval, the Big Five have…
high test-retest reliability coefficients e.g. 0.78 to 0.85
How do we check for error variance caused by content sampling? What is this procedure called?
Administer to the same Pps at the same time point 2 equivalent tests which contain the same kind of items of equal difficulty but not the exact same items & calculate a between-forms reliability coefficient. Parallel forms
What is the most desirable procedure for estimating reliability which takes into account error variance caused by both time and content sampling?
To correlate scores between two parallel forms at different points in time
Why are parallel forms problematic? Therefore we may use…instead of parallel forms
Because they are difficult and expensive to construct. A split-half reliability check
How can we check for error variance caused by content heterogeneity?
By calculating a split-half reliability coefficient: a single test is viewed as consisting of 2 parts or parallel forms, each of which should be measuring the same construct. The correlation between scores for two arbitrarily selected halves of a test is calculated e.g. odd & even Q numbers. This checks for content homogeneity
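The odd/even split can be sketched as follows, assuming each Pp's responses are stored as a row of 1s (correct) and 0s (incorrect) and that the two half-scores are correlated with Pearson's r. The data and the function names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def half_scores(responses):
    """Score the odd-numbered and even-numbered Qs separately for each Pp."""
    odd = [sum(row[0::2]) for row in responses]   # Qs 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # Qs 2, 4, 6, ...
    return odd, even

def split_half_r(responses):
    """Correlate the two half-test scores across Pps."""
    odd, even = half_scores(responses)
    return pearson_r(odd, even)

# Hypothetical data: 5 Pps x 6 Qs, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
print(round(split_half_r(responses), 2))
```

A high correlation between the halves suggests the Qs are measuring the same construct, i.e. the content is homogeneous.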