Psychometrics Flashcards by Robert Blakey

Name and define 2 approaches to the study of individual differences (IDs)

1) Nomothetic: seeking general laws of e.g. personality

2) Idiographic: seeking to understand the unique person in his/her glowing individuality

Give 3 uses of personality or IQ tests

1) Their scientific use in evaluating theories
2) Their practical use in educational, clinical & personnel assessment
3) The factors influencing their design are relevant to other areas of psychology & other social sciences

Why do tests face ethical & political dilemmas? Give an e.g. Tests are also used to justify the ___ ___ ___

Because they are used to allocate jobs, resources and educational opportunities, e.g. the debate regarding why some groups have higher IQs than others. Political status quo

Who began the scientific study of IDs with the ‘Classification of Men According to their Natural Gifts’ in 1869?

Galton (cousin of Darwin)

Allport (1937) labelled psychology’s nomothetic approach as ___-___. It ignored…. Murray (1938) similarly argued that any human being is in some respects…1), 2) & 3)

Pseudo-scientific. The outstanding characteristic of man which is his individuality. 1) like all other humans, 2) like some other humans & 3) like no other humans

Define a psychological test

An objective and standardised measure of a small but carefully chosen sample of a person’s behaviour

Psychological test items must be ___ of the entire domain of behaviour which the test intends to measure

Representative

Name 4 uses of achievement testing

1) Diagnosis of strengths & weaknesses
2) Evaluation of a person’s progress
3) Info for educational policy
4) Accountability

Define standardisation. Why is it necessary? Give 5 e.g.s of what should be standardised

Ensuring uniformity of the procedure used to administer & score a psychological test. To allow results from different people to be directly compared. Materials used, time limits set, oral instructions, preliminary demonstrations & ways of handling queries

After administering the test to a large sample of people we can calculate the ___ & ___ of scores so that we can compare each person’s score directly to these calculated ___ to specify that person’s strengths and weaknesses

Mean. SD. Norms
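
A minimal sketch of comparing one person's score to norms, assuming the norms are simply the mean and SD of a norm sample. The function name `z_score` and the sample values are hypothetical and invented for illustration only.

```python
from statistics import mean, stdev

def z_score(raw_score, norm_scores):
    """Express a raw score in SD units relative to the norm sample."""
    norm_mean = mean(norm_scores)
    norm_sd = stdev(norm_scores)  # sample SD (n - 1 denominator)
    return (raw_score - norm_mean) / norm_sd

# Hypothetical norm sample, invented for illustration only.
norm_sample = [8, 10, 12, 14, 16]
print(z_score(16, norm_sample))  # positive: above the norm mean
print(z_score(12, norm_sample))  # zero: exactly at the norm mean
```

A positive value suggests a relative strength and a negative value a relative weakness, in SD units.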

What is dynamic testing? Why is it useful?

It’s when the difficulty of the next Q presented by the computer is determined by whether the Pp answered the previous Q correctly or not. Dynamic testing begins with Qs of moderate difficulty. This more personalised testing procedure prevents floor or ceiling effects from being reached
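
The procedure described above can be sketched as a simple step rule. This is a simplification under stated assumptions: the 1 to 10 difficulty scale, the step size of 1 and the starting level of 5 are all hypothetical, and real computerised tests typically use more sophisticated item selection.

```python
def next_difficulty(current, answered_correctly, lowest=1, highest=10):
    """Step difficulty up after a correct answer, down after an incorrect one."""
    step = 1 if answered_correctly else -1
    # Clamp to the (assumed) available difficulty range.
    return max(lowest, min(highest, current + step))

def run_session(responses, start=5):
    """Return the difficulty level presented for each question.

    responses: list of booleans, True if that question was answered correctly.
    """
    levels = []
    level = start  # begin with a question of moderate difficulty
    for correct in responses:
        levels.append(level)
        level = next_difficulty(level, correct)
    return levels

print(run_session([True, True, False, True]))
```

Because the level follows the Pp's performance, few Pps spend the whole test on items that are all too easy or all too hard.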

Name 3 ways in which we must control the use of a psychological test

1) Ensure the test is administered by a qualified examiner
2) Prevent the test from reaching the public domain so as to prevent general familiarity with test content which would invalidate it
3) Ethical issues e.g. individual feedback should not be given where unnecessary e.g. in experiments

Why is invalidation of a test problematic?

Because it results in having to spend time & money designing a new test

Name 5 disadvantages of testing. Why was testing unnecessary in the past?

1) Qs may be biased towards a particular cultural group or gender, 2) when a person gets a Q wrong we don’t know why, 3) test anxiety confounds results, 4) being familiar with psych tests in general may provide an advantage, 5) low ecological validity. Because close communities relied on long-term family reputations

Define reliability in the normal sense & in the sense of sources of measurement error. A test is reliable when…

The extent to which a test is consistent in what it measures. The test’s relative freedom from (tendency not to be affected by) unsystematic errors of measurement. When it gives the same results for the same person under varying conditions that can produce unsystematic measurement errors

Distinguish between and give e.g.s of systematic and unsystematic errors. Which affect reliability?

Unsystematic errors affect test scores in a random, unpredictable manner from situation to situation & so lower test reliability, e.g. Pps’ motivation & attention. Systematic (constant) errors affect test scores in a fixed way & so do not affect reliability, e.g. practice effects & resultant learning

What is error variance?

Any distortions to scores which are irrelevant to the test’s purpose e.g. fluctuations in mood, or gender. Note that a factor classified as providing error variance in one experiment may be classified as providing true variance in another

Name 4 sources of error variance which threaten reliability

1) Time sampling
2) Content sampling
3) Content heterogeneity
4) Observer differences

How do we check for error variance caused by time sampling? How long should the inter-test time interval be? Why?

Administer the same test to the same Pps at different points in time, which gives us a test-retest reliability coefficient. 6 months. To prevent recall of test answers whilst also ensuring the person’s IQ/personality won’t have changed due to a change in their situation

From a 6 to 42 year test-retest interval, the Big Five have…

high test-retest reliability coefficients e.g. 0.78 to 0.85

How do we check for error variance caused by content sampling? What is this procedure called?

Administer to the same Pps at the same time point 2 equivalent tests which contain the same kind of items of equal difficulty but not the exact same items & calculate a between-forms reliability coefficient. Parallel forms

What is the most desirable procedure for estimating reliability which takes into account error variance caused by both time and content sampling?

To correlate scores between two parallel forms at different points in time

Why are parallel forms problematic? Therefore we may use…instead of parallel forms

Because they are difficult and expensive to construct. A split-half reliability check

How can we check for error variance caused by content heterogeneity?

By calculating a split-half reliability coefficient: a single test is viewed as consisting of 2 parts or parallel forms, each of which should be measuring the same construct. The correlation between scores for two arbitrarily selected halves of a test is calculated e.g. odd & even Q numbers. This checks for content homogeneity
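
The odd/even split described above can be sketched as follows. This is an illustration only: the helper names `pearson_r` and `split_half_r` and the item scores are hypothetical, and the sketch reports the raw correlation between halves without any further correction.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_r(item_scores):
    """item_scores: one list of item scores per participant."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # Qs 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # Qs 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)

# Hypothetical data: 4 Pps x 4 items (1 = correct, 0 = incorrect).
scores = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
print(split_half_r(scores))
```

A high correlation between the two halves is taken as evidence that the items are measuring the same construct.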

What is the official name of the co-efficient used in the split-half reliability check? Is it high for the Big Five?

Cronbach's alpha. Yes
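
For reference, a minimal sketch of Cronbach's alpha using the standard textbook formula (number of items, sum of item variances and variance of total scores). The function name and the data are hypothetical; nothing here is taken from the flashcard source beyond the name of the coefficient.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one list of item scores per participant."""
    k = len(item_scores[0])  # number of items
    # Sample variance of each item across participants.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of participants' total scores.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings: 4 Pps x 3 items on a 1-5 scale.
ratings = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
]
print(cronbach_alpha(ratings))
```

Values closer to 1 indicate that the items vary together, i.e. greater content homogeneity.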

How do we check for error variance caused by inter-scorer differences in observation-based tests? When is this not necessary in observation studies? When there are more than 2 observers, a...is conducted

Correlate the test scores of 2 different observers to give an inter-rater reliability coefficient (a measure of observer agreement). When the test is objective and so only clerical error can produce inter-scorer differences. An intraclass correlation or coefficient of concordance

Parallel forms is also called...and comes in 2 previously mentioned variants:.... In total we have covered ___ reliability check techniques

Alternate-forms. Immediate or delayed. 6

Using different reliability co-efficients it is possible to assign different proportions of variance to different causes. E.g. An overall reliability co-efficient of 0.85 would indicate that...

85% of the variance in test scores is attributable to true variance in people's traits & 15% is attributable to error variance

How high must the reliability co-efficient be? It depends on the purpose of the test: if comparing...

2 group means then .65 is satisfactory. If comparing 2 individuals then .85 is necessary

Name 3-5 ways in which reliability can be increased. Some of the answers reflect ways of increasing true variance and therefore the % of total variance which is true

1) by ensuring items are of moderate difficulty (.5 probability of correct answering), 2) by lengthening the questionnaire, 3) by training observers in a more standardised way, 4) by changing test items, 5) by selecting a more heterogeneous group to attain a variety of scores (1, 2 & 5 are from the slides)

Define validity

The extent to which the test measures what it is designed to measure

What is validity affected by: systematic or unsystematic error? Can a test be valid without it being reliable? Can a test be reliable without it being valid?

Both! No. Yes

Name 7 types of validity!

1) face validity, 2) content validity, 3) predictive validity, 4) concurrent validity, 5) incremental validity, 6) criterion-related validity, 7) construct validity

Define face validity

The extent to which a test looks good for a specific purpose; this is important when marketing a test

Define content validity. Who might determine a test's content validity & so contribute to test ___ & ___?

The extent to which a test includes a wide range of items (calling upon a range of responses & skills) to represent the entire target domain. Subject-matter experts. Construction. Evaluation

How do we check for criterion-related validity?

By comparing test scores with performance scores on other criterion measures e.g. other test scores, tutor ratings, coursework marks

Concurrent validity refers to a type of criterion-related validity which is measured whenever the...

Criterion measure is available at the time of testing e.g. an IQ test taken in March and teacher reports from March or computer anxiety vs. general anxiety measures administered at the same time point

Whenever a test is applied to a new category for the first time we must.... We can then use test scores as a more efficient means of categorising people into their likely groups i.e. group membership becomes a way of assessing the...e.g. The MMPI (Minnesota Multiphasic Personality Inventory) can be used to classify...

Calculate the average and SD of test scores for this group of Pps. Criterion-related concurrent validity of the test. Mental disorders

Predictive validity is a form of criterion-related validity which is used whenever.... The Q asked is.... The correlation is rarely greater than 0.___ i.e. not more than ___% of the variance in the outcome variable can be accounted for by the predictor (test score)

The criterion measure is unavailable until some time after the test has been administered. Do test scores predict criterion scores? 0.6. 36%
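
The step from a correlation of 0.6 to 36% is just the square of the correlation. A short sketch, with a hypothetical function name:

```python
def variance_explained(r):
    """Proportion of variance in the criterion accounted for by the predictor."""
    return r ** 2

# A predictive validity correlation of 0.6, as in the card above.
print(f"{variance_explained(0.6):.0%}")  # prints 36%
```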

Define incremental validity. Give an e.g. of an incrementally valid personality dimension

The extent to which using this test adds to the predictive validity of pre-existing predictive measures, i.e. whether the new test is worth the expense and time required to administer it given the tools we already have e.g. conscientiousness predicts GPA over & above SAT scores alone
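
One way to put a number on incremental validity is the gain in variance explained (R squared) when the new predictor is added to an existing one. The sketch below uses the standard two-predictor formula for R squared from pairwise correlations. The function names and all three correlations are hypothetical, invented purely for illustration.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 for a criterion regressed on two predictors, from their correlations.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors (must not be +1 or -1).
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

def incremental_validity(r_y1, r_y2, r_12):
    """Gain in R^2 from adding predictor 2 to a model that already has predictor 1."""
    return r_squared_two_predictors(r_y1, r_y2, r_12) - r_y1 ** 2

# Hypothetical correlations, invented for illustration only:
# SAT-GPA = 0.50, conscientiousness-GPA = 0.30, SAT-conscientiousness = 0.10
gain = incremental_validity(0.50, 0.30, 0.10)
print(f"Extra variance in GPA explained: {gain:.3f}")
```

A gain close to zero would suggest the new test is not worth its cost given the existing measure.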

Define construct validity

The most general type of validity which involves the slow, laborious process of collating evidence from studies of content validity and criterion-related validity

How is construct validity usually tested? E.g. with the construct aggressiveness...

The strength & direction of a construct's correlations with directly & indirectly characteristic behaviours & uncharacteristic behaviours are predicted & tested e.g. a weak +ve correlation between making decisions (indirectly linked via need for power) & aggression & no correlation between honesty & aggression

A test with high construct validity should have high ___ & ___ validity too. Define these 2 types of validity

Convergent validity = strong correlations between the test & other measures of the same characteristic. Discriminant validity = weak/no correlations between the test & measures of different characteristics

Which 4 types of correlations are relevant for assessing convergent and discriminant validity i.e. construct validity?

1) Measuring the same trait with the same method, 2) measuring different traits with the same method, 3) measuring the same traits with different methods, 4) measuring different traits with different methods

Using this ___-___ approach (Campbell & Fiske, 1959), evidence for construct validity is obtained when correlations ___ & ___ are higher than ___ & ___ from the previous Q

Multitrait-multimethod. 1 & 3 > 2 & 4
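
The comparison in the answer above can be sketched as a simple check. This assumes a strict reading in which both same-trait correlations (1 & 3) must exceed both different-trait correlations (2 & 4). The function name and the example values are hypothetical.

```python
def supports_construct_validity(same_trait_same_method,
                                diff_trait_same_method,
                                same_trait_diff_method,
                                diff_trait_diff_method):
    """True if correlations 1 & 3 are both higher than correlations 2 & 4."""
    same_trait = (same_trait_same_method, same_trait_diff_method)  # 1 & 3
    diff_trait = (diff_trait_same_method, diff_trait_diff_method)  # 2 & 4
    return min(same_trait) > max(diff_trait)

# Hypothetical correlations, invented for illustration only.
print(supports_construct_validity(0.85, 0.30, 0.60, 0.10))
print(supports_construct_validity(0.85, 0.70, 0.60, 0.10))
```

The second call fails the check because a different-trait, same-method correlation (0.70) exceeds a same-trait, different-method correlation (0.60), which would point to shared method variance rather than a shared trait.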
