Study Cards Flashcards

(189 cards)

1
Q

What is the APGAR test? What does it measure?

A

Evaluates the health of a newborn based on appearance, pulse, grimace, activity, and respiration

2
Q

Who is Alfred Binet? Why is he important?

A

French psychologist. Introduced the idea of intelligence testing

3
Q

What is an operational definition?

A

The exact way a construct is measured, and what qualifies something as being in/out of a given category

4
Q

What is an operational measure?

A

The exact way in which something is tested, and how it should always be tested (think procedure)

5
Q

What is a normative group?

A

Aka reference group. The sample of the population used to attain a base/average score

6
Q

What is a normal distribution? What is it used for?

A

A distribution that, when mapped out, forms a bell curve in which the mean, median, and mode are equal.
Used as the default assumption about how scores in a group are laid out

7
Q

What are deviations?

A

The difference between the observed values and the mean

8
Q

What was the first version of the Binet-Simon intelligence test? What did results show?

A

A group of children were asked to perform a series of tasks to assess the knowledge they had acquired

9
Q

What were Binet’s original concerns with his intelligence test?

A

That it would be misused, and that children who were behind would be labeled “idiots” and unteachable.

10
Q

What are some of Binet’s contributions to the natural and social sciences?

A
  1. The development of scales of measurement
  2. The formal operationalization of constructs
  3. The development of non-verbal intelligence tests
  4. The proposal that intelligence is both acquired and innate
  5. The operationalization of terms and concepts
  6. The development of mental age
  7. The idea and use of normative groups
  8. Established the dominance of psychology in the field of testing
11
Q

Who is Francis Galton? What did he contribute to psychology?

A

He was a psychologist with a fascination for data collection and variability. He pioneered large-scale data collection

12
Q

What is the law of error? Is it 100% true?

A

In any group or set of measurements, the outliers tend to cancel each other out, forming a normal distribution. It is not always true, but it is used as a working assumption

13
Q

What are distributions of error (deviations) and how do you calculate them?

A

A deviation shows how far, on a scale from -3 to +3, scores are away from the mean.
Observed score - mean = deviation
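
Not from the deck: a minimal Python sketch of this calculation, using invented scores.

```python
# Deviation = observed score - mean, on made-up scores.
scores = [72, 85, 90, 65, 88]
mean = sum(scores) / len(scores)          # 80.0
deviations = [x - mean for x in scores]   # [-8.0, 5.0, 10.0, -15.0, 8.0]
# Deviations always sum to (essentially) zero around the mean.
print(mean, deviations, sum(deviations))
```
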

14
Q

What are the first 3 principles of psychometrics?

A
  1. Defining and operationalizing is central to understanding if a claim is justifiable - always ask how a construct is measured and defined
  2. Variability exists everywhere - this is the essence of the law of error
  3. There is always a normative group - ask who is the sample and who created the sample
15
Q

How does Anne Anastasi define a psychological test? Define the different aspects

A

An objective and standardized measure of a sample of behaviour
Objective: free of bias, clearly defined, little to no interpretation
Standardized: everyone gets the same test and is measured the same way
Sample of behaviour: This should be how they would act regularly, but the sample may not be representative

16
Q

How does Lee Cronbach define psychological tests? How does it compare to the aspects of Anastasi’s definition?

A

Psychological tests are a systematic procedure for comparing the behaviour of two or more people.
Systematic vs standardized and objective: Cronbach recognized that tests cannot be 100% objective

17
Q

What is psychometrics according to Thurstone? (2 parts)

A

A construction of instruments and procedures for measurement
The development and refinement of theoretical approaches to measurement

18
Q

What is a construct? And how do they relate to the definition of psychometrics?

A

A construct is any idea or concept we’d like to measure
A. Constructing tests to measure these constructs
B. The methods and approaches must be refined when measuring these constructs

19
Q

What are the 4th and 5th principles of psychometrics?

A
  4. Most (if not all) test questions, in any format, are imperfect indicators of the construct being measured
  5. Assigning numbers to data imposes a relationship among indicators that may not be justifiable
20
Q

What does it mean to measure something? What are the 4 main scales of measurement?

A

The assigning of numbers to individual scores in a systematic way, according to one or another rule or convention
1. Ratio
2. Interval
3. Nominal
4. Ordinal

21
Q

Explain the 4 main scales of measurement

A

Ratio: Equal intervals with a true zero
Interval: Equal intervals with NO true zero
Nominal: a categorical form of organizing data
Ordinal: Determined rank or order, numbers have no value, intervals may be unequal

22
Q

What is the 5th principle of psychometrics?

A

The leap of faith principle. By assigning numbers to data, you impose a relationship among indicators that might not be justifiable

23
Q

What does a distribution measure in psychometrics?

A

The performance of the entire test

24
Q

What are the 3 factors that ALWAYS affect variability?

A

Systematic effect, systematic bias, random effect

25
What is systematic effect?
It is the primary cause of the score. How much of the construct you have
26
What is systematic bias? Give an example
An effect that affects a subgroup. EX: a delayed train affects commuters
27
What is random effect? Give an example
Random factors that affect the score of an individual, but have no relationship to the construct. EX: poor sleep
28
What is the difference between a formal and an operational definition?
A formal definition defines the construct for what it is while an operational definition defines how it is measured
29
How does Plato’s allegory of the cave help us understand constructs?
It captures the challenges we face when measuring constructs that cannot be directly seen. Like the shadows, only symptoms are observed, and interpretations must be made
30
What is Novak’s classical test theory?
A person’s true score is different from their observed score (due to error)
31
How do you calculate true score (Novak)?
X = T + E, so T = X - E; because error can be positive or negative, this is often written T = X ± E
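
Not from the deck: a small Python simulation of the idea, with a made-up true score and error spread, showing why averaging repeated observations approaches T.

```python
import random

# Classical test theory sketch: observed = true + random error (X = T + E).
# The true score and error spread are invented illustration values.
random.seed(42)
T = 100  # hypothetical true score
observed = [T + random.gauss(0, 5) for _ in range(10_000)]

# Because error is random (equally likely + or -), the average of
# repeated observed scores converges on the true score.
print(sum(observed) / len(observed))  # ~100
```
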
32
How does Galton’s law of error play in classical test theory?
Error is just as likely to be positive as negative
33
What is item response theory?
An attempt to directly estimate an individual’s ‘true score’ by examining how individuals respond to questions - as a function of their ability
34
What does an item response graph show?
Shows the minimum required ability to get an answer correct
35
What is scientific model testing?
The evaluation of different approaches to find which one best explains the data in that case
36
How does Ockham’s razor fit with scientific model testing?
When two theories explain the data equally well, the simpler explanation is usually the better one
37
What is the definition of criterion validity?
Correlating test scores with some external criterion that is relevant to the purpose of the test
38
What is scale validation?
The methods used to test validity
39
What are the features of scale validation according to Rulon?
  1. A test cannot be labeled as valid or invalid without respect to a given purpose
  2. Assessments of validity must include an assessment of the content of the instrument and its relation to the purpose
  3. Different forms of validity evidence are required for different types of instruments
  4. Some measures are obviously valid (face validity) and require no further study
40
What are the 4 main domains of validity?
  1. Content validity
  2. Structural validity
  3. External validity
  4. Item validity
41
What are the 3 types of content validity?
  1. Domain representativeness
  2. Domain relevance
  3. Face validity
42
What is content validity?
The content represented by the construct; the degree to which a test measures all aspects of a criterion
43
What is domain representativeness?
The extent to which the questions/tasks/etc. measure the entire domain
44
What is domain relevance?
The extent to which the questions are relevant to assessing the construct
45
What is inclusionary criteria?
The signs and symptoms that MUST be present to have the construct
46
What is exclusionary criteria?
The signs and symptoms that CANNOT be present to have the construct
47
What type of validity includes inclusionary and exclusionary criteria? What is the interaction?
Domain relevance; these criteria are considered more important or more relevant
48
What is face validity?
Whether the test APPEARS to measure a given construct
49
What is structural validity?
The components that a test measures
50
What are the 2 components of structural validity?
1. Dimensionality 2. Order
51
What is dimensionality?
The number of factors the questions can be attributed to (pieces of the cake)
52
What is order?
The number of tiers that are needed to explain how the different factors are interrelated (layers of the cake)
53
What are the 4 factors of external validity?
  1. Criterion validity
  2. Convergent and divergent validity
  3. Predictive validity
  4. Incremental validity
54
What is external validity?
The manner in which test scores are related to other constructs
55
What is criterion validity?
The extent to which test scores on questionnaire are related to some other outcome or condition
56
What is convergent validity?
The degree to which a measure is correlated with other measures
57
What is divergent validity?
The degree to which a measure does not correlate with other measures
58
Explain the relationship chart of convergent and divergent validity?
Should converge: r > 0.70 = good convergent validity, r < 0.30 = poor convergent validity
Should diverge: r > 0.70 = poor divergent validity, r < 0.30 = good divergent validity
Anything in between is mild, and depends on the theory.
59
What does a multi-trait multi-method matrix show?
It shows the correlates of different traits and how well they converge to measure the same construct
60
How do you read the multi-trait multi-method matrix table?
The traits are listed down the side and along the top, grouped by test (method), and shows the correlation coefficient in the cross section of each individual trait
61
What are the factors of predictive validity? Define them
Concurrent (predicts a criterion measured at the same time) and prospective (predicts a criterion observed in the future) validity
62
What is incremental validity?
The degree to which a new (additional) measure adds to the prediction of a criterion - beyond what can be predicted by some other measure
63
What are closed format tests?
Tests that have preset answers that cannot be changed or elaborated
64
What does it mean to have a dichotomous response?
The answer can only be yes or no
65
What is a Likert scale response style?
A range of replies (typically from strongly agree to strongly disagree) in which a person rates how much they agree with a statement
66
What does it mean if a test response is rank-ordered?
The subject must rank each statement (example: most important - least)
67
What are open format tests?
The questions do not have predetermined responses, allowing for elaboration
68
What are open ended questions?
Questions that allow the participants to come up with their own responses
69
What is a visual-analogue response style?
When the respondents rate their level of a construct on a continuous scale
70
What are anchors? - give an example
They are statements that help specify what each number refers to in the real world. EX: 1 = rarely or never
71
What is standard deviation?
The variability within a group - differences in individual scores
72
What is standard error?
Variability across distributions - differences between groups
73
What is estimated true score?
How ability and probability of correctness correlate
74
What is the mean - and the equation for it?
Mean: the average
Mean = the sum of the scores / the number of scores
μ = Σx / N
75
What is the equation for standard deviation?
Stand. Dev = the square root of [the sum of (score - mean) squared / the number of scores]
σ = √( Σ(x - μ)² / N )
76
What is variance - and the equation for it?
The spread of differences in scores
Variance = the sum of (score - mean) squared / the total number of scores
σ² = Σ(x - μ)² / N (variance is the standard deviation squared)
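
Not from the deck: the three population formulas above in a short Python sketch, with invented scores.

```python
import math

# mean = Σx / N, variance = Σ(x - μ)² / N, SD = √variance.
scores = [4, 8, 6, 5, 7]
N = len(scores)
mu = sum(scores) / N
variance = sum((x - mu) ** 2 for x in scores) / N
sd = math.sqrt(variance)
print(mu, variance, sd)  # 6.0 2.0 1.414...
```
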
77
What is the line of best fit?
A line through a scatter plot that minimizes the discrepancy between observed and predicted scores.
Measures the degree of mis-fit between scores
78
What is a predicted score? How do you calculate it?
An estimated score for future tests
Predicted score = y-intercept + slope × X
Ŷ = b0 + b1·X (equivalently Y = aX + b)
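
Not from the deck: a hand-rolled least-squares sketch in Python on invented x/y scores; the predict helper is hypothetical.

```python
# Least-squares slope and intercept, then a predicted score Ŷ = b0 + b1·X.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.8]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
b0 = my - b1 * mx

def predict(new_x):
    return b0 + b1 * new_x

print(round(predict(6), 2))  # predicted score for a new X of 6
```
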
79
What is effect size? How do you calculate it?
The magnitude of the difference between groups
Effect size = (mean of group 1 - mean of group 2) / standard deviation
D = (x̄1 - x̄2) / s
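
Not from the deck: a Python sketch of the formula on invented groups. The card only says "standard deviation"; the pooled SD used here is one common choice, not necessarily the course's.

```python
import math
import statistics

# Cohen's d sketch: d = (x̄1 - x̄2) / s, with a pooled SD for s.
g1 = [10, 12, 11, 13, 14]
g2 = [8, 9, 10, 9, 9]
n1, n2 = len(g1), len(g2)
s1, s2 = statistics.stdev(g1), statistics.stdev(g2)
s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
d = (statistics.mean(g1) - statistics.mean(g2)) / s_pooled
print(d)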
80
What is sensitivity? How do you calculate it?
Of the people who actually have the condition, how many were designated to have it
Sensitivity = A / (A + C)
81
What is specificity? How do you calculate it?
Of the people who don’t actually have the condition, how many were designated not to have it
Specificity = D / (B + D)
82
What is positive predictive value? How do you calculate it?
Of the positive results, how many actually have the condition
PPV = A / (A + B)
83
What is negative predictive value? How do you calculate it?
Of the negative results, how many really don’t have the condition
NPV = D / (C + D)
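
Not from the deck: the four formulas above computed from one invented 2×2 table in Python.

```python
# 2x2 classification table, using the cards' cell labels:
#                 condition present   condition absent
# test positive          A                  B
# test negative          C                  D
A, B, C, D = 40, 10, 5, 45  # invented counts

sensitivity = A / (A + C)  # of those with the condition, proportion flagged
specificity = D / (B + D)  # of those without it, proportion cleared
ppv = A / (A + B)          # of positive results, proportion truly positive
npv = D / (C + D)          # of negative results, proportion truly negative
print(sensitivity, specificity, ppv, npv)
```
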
84
What is base rate?
The baseline rate of prevalence of a condition in a population
85
What is a self report test?
A test completed by someone who reports their own experiences
86
What kind of test is the BDI? Key features
The Beck Depression Inventory is a self-report test that measures depression
A unidimensional test; the use of cutoff scores indicates a discrete condition; any combination of items can be used to designate the presence of depression
87
What is an informant based test?
A test completed on behalf of someone else
88
What are projective tests?
Tests that measure SUBCONSCIOUS impulses, emotions, difficulties, etc
89
What are objective tests? Why were they created?
Tests that use standardized measures that allow little to no interpretation
Created to account for the limits of projective tests
90
What is the RORS?
A projective test in which the patient interprets inkblots
91
What is an aptitude test?
A test designed to measure individual aptitudes, attitudes, preferences, etc
92
What is the MBTI? Key features?
The Myers-Briggs Type Indicator is a self-report measure of psychological preferences in how people see the world and make decisions
Measures innate aptitudes that are either mental or physical
93
What are structured tests?
Tests in which the questions and structure are predetermined, no changes or follow up can be made
94
What are semi structured tests?
Tests in which the procedure and questions are predetermined, but the doctor is able to add or remove questions at their discretion
95
What is the SCID? Key features?
The Structured Clinical Interview for DSM is a semi-structured test that helps clinicians assess the presence or absence of psychiatric symptoms to render formal diagnoses
It is semi-structured, allowing for follow-up and the adding/removing of questions
96
What is information variance?
The way in which questions are asked and how tests are presented changes the amount of information that comes out of a test
97
What is criterion variance?
Differences in how a doctor interprets the information to draw conclusions, which can result in changes between scores
98
What are personality tests?
Tests designed to assess personality characteristics
99
What is the NEO PI-R? Key features?
A test that measures the degree of OCEAN - openness, conscientiousness, extraversion, agreeableness, neuroticism
Uses a Likert scale for questions; multidimensional - assesses each personality characteristic based on multiple smaller factors
100
What is OCEAN in the NEO PI-R?
Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism
101
What is the MMPI? Key features?
Minnesota Multiphasic Personality Inventory. Designed to address concerns with existing self-report measures; it assesses psychopathology and personality in a clinical setting, prioritizing criterion validity over face validity
102
What is the act frequency approach?
A measure of how behaviour and personality traits correlate
103
What is the BAI? Key features?
The Behavioural Acts Inventory. Designed to measure actions and behaviours to identify their correlations with personality
104
What are normative tests?
Tests designed to measure quantitative personality characteristics, comparing them to patterns of normality
105
What are the WAIS and WISC?
Intelligence tests for adults (WAIS) and children (WISC) which evaluate intelligence and cognitive ability
106
What are achievement tests?
Tests that measure developed skills or knowledge
107
What is the GRE? Key features?
The Graduate Record Examination measures the acquired knowledge of students
Evaluates verbal reasoning, quantitative reasoning, analytical writing, critical thinking, and knowledge
108
What makes a test reliable?
When it produces the same score consistently over time
109
What does reliability measure?
How close our observed score approaches the true score
110
What is the expected score?
An estimate of true score
111
How do you calculate true score?
E(X) = T: the expected value of the observed score is the estimate of the true score
112
What is the ‘fast move’ of classical test theory?
If error is uncorrelated with test scores, then error from two different tests is also uncorrelated, meaning errors from one test will be uncorrelated with the True Score of another test
113
What are the 5 types of reliability?
Test-retest, Inter-rater, Parallel forms, Split half, Internal consistency
114
What is test-retest reliability?
The ability for a test to produce consistent scores from one time to another
115
What is inter-observer reliability?
The degree to which different observers give consistent estimates of the same construct
116
What is parallel forms reliability?
The consistency of two separate but similar tests
117
What is split half reliability?
The consistency between two halves of the same test
118
What is internal consistency (reliability)?
The consistency of the results across items of a test
119
How do you estimate reliability?
By comparing two different groups of items
120
What are the ways you can estimate reliability? Explain
Within a single test - one part vs. another part
Across multiple tests - test 1 vs. test 2
121
What is used to measure internal consistency?
Cronbach’s alpha (a) and Cohen’s Kappa (k)
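
Not from the deck: the cards don't give alpha's formula, so this Python sketch uses the standard α = k/(k-1)·(1 - Σ item variances / total variance) on invented item scores.

```python
# Cronbach's alpha on invented item scores (rows = people, cols = items).
data = [
    [3, 4, 3, 5],
    [2, 2, 3, 2],
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 4, 5, 5],
]
k = len(data[0])

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

item_vars = [var([row[i] for row in data]) for i in range(k)]
total_var = var([sum(row) for row in data])
alpha = k / (k - 1) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))
```
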
122
How do you calculate Cohen’s kappa (k)?
(Observed agreement - chance agreement)/(1-chance agreement)
123
How do you calculate chance agreement?
[probability of ‘yes’ from Dr. A × probability of ‘yes’ from Dr. B] + [probability of ‘no’ from Dr. A × probability of ‘no’ from Dr. B]
124
How do you calculate observed agreement?
(‘Yes’ from both + ‘No’ from both) / N
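
Not from the deck: the kappa, chance-agreement, and observed-agreement formulas from the three cards above, on invented ratings from two raters.

```python
# Cohen's kappa from two raters' yes/no calls (invented ratings).
a = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes"]
b = ["yes", "no",  "no", "no", "yes", "yes", "yes", "yes", "no", "yes"]
n = len(a)

observed = sum(x == y for x, y in zip(a, b)) / n
# Chance agreement: P(both say yes) + P(both say no).
both_yes = (a.count("yes") / n) * (b.count("yes") / n)
both_no = (a.count("no") / n) * (b.count("no") / n)
chance = both_yes + both_no
kappa = (observed - chance) / (1 - chance)
print(round(kappa, 2))  # ~0.58 for these ratings
```
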
125
What is item analysis?
The analysis of how each individual item on a test performs
126
What assumptions are made when calculating true score?
That T = the average score on a test if taken repeatedly, that error is random and independent
127
What score would be ‘excellent’ for reliability?
a > 0.9
128
What score would be ‘good’ for reliability?
0.9 > a > 0.8
129
What score would be ‘acceptable’ for reliability?
0.8 > a > 0.7
130
What score would be ‘questionable’ for reliability?
0.7 > a > 0.6
131
What score would be ‘poor’ for reliability?
0.6 > a > 0.5
132
What score would be ‘unacceptable’ for reliability?
0.5 > a
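
Not from the deck: the six cutoffs above as a small Python lookup; how exact boundary values are binned is my choice.

```python
# The deck's alpha cutoffs as a qualitative label.
def alpha_label(a: float) -> str:
    if a > 0.9: return "excellent"
    if a > 0.8: return "good"
    if a > 0.7: return "acceptable"
    if a > 0.6: return "questionable"
    if a > 0.5: return "poor"
    return "unacceptable"

print(alpha_label(0.84))  # good
```
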
133
What is item analysis?
The analysis of how each individual item performs and the correlation of individual items with the total score
134
What is item analysis used for?
To determine which items are the best measurement of a construct
135
What is item total correlation?
The correlation of an individual item with the total score - the cumulative degree of agreement for the construct
136
How do we calculate item total correlation?
Each item’s scores are averaged (across the group) for an item average. Item scores are summed to give each person’s total score. Each item is then plotted against the total score to find r(total, item)
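
Not from the deck: a Python sketch correlating each item with the total score, on invented data; corrected item-total correlations (dropping the item from its own total) are a common refinement not shown here.

```python
import statistics

# Item-total correlation: Pearson r between one item's scores and totals.
data = [
    [3, 4, 3, 5],
    [2, 2, 3, 2],
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 4, 5, 5],
]
totals = [sum(row) for row in data]

def pearson(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

for i in range(len(data[0])):
    item = [row[i] for row in data]
    print(f"item {i + 1}: r = {pearson(item, totals):.2f}")
```
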
137
What is distinctiveness in item analysis?
When items are more highly correlated with one factor than with the others
138
What is the item response model?
The probability of choosing an option correlated with the level of a construct required to choose a given option
139
How can the item response model be explained?
The amount of knowledge you need to get an answer right.
140
What are the features of item response curves?
Discriminability, difficulty, precision
141
What is discriminability in item response?
The slope. The point at which changes are easily observed
142
When is discriminability better and worse?
Better: steep slopes
Worse: flattened regions
143
What is difficulty in item response?
How much of the construct is needed before you choose that option (answer the question correctly)
144
How do you observe the difficulty?
Using the 0.5 threshold. The point on the x-axis at which the curve is at 0.5
145
What is more difficult and less difficult in item response?
More: when the slope is very shallow for a while, or it begins further down the x-axis
Less: when the slope begins early on the x-axis and/or is very steep right away
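
Not from the deck: a standard two-parameter logistic item response curve in Python (a = discriminability, b = difficulty), illustrating the 0.5 threshold; this specific functional form is an assumption, not the course's stated model.

```python
import math

# 2PL item response curve: P(correct) = 1 / (1 + e^(-a(θ - b))).
def p_correct(theta: float, a: float, b: float) -> float:
    return 1 / (1 + math.exp(-a * (theta - b)))

# At θ = b the probability is exactly 0.5 — the difficulty threshold.
print(p_correct(theta=1.0, a=1.5, b=1.0))  # 0.5
# A steeper slope (bigger a) means better discriminability around b.
print(p_correct(0.5, a=3.0, b=1.0), p_correct(1.5, a=3.0, b=1.0))
```
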
146
What is precision in item response?
An estimate of your level of ability
147
How do you determine precision in item response?
Using the area under the curve. The space between -2 to 2 (95%)
148
What does the 95% precision tell us?
Based on the option picked, we can infer with 95% certainty that their severity level falls within the 95% of the area under the curve
149
In what ways can you describe a curve in item analysis?
Is it flat? Sharp? Where is the peak (most common area)? Does one curve override another? Is a curve high for too long?
150
What is principal components analysis (PCA)?
The examination of the degree to which individual items are related to one or more underlying dimensions of variation (factors)
151
What are the goals of PCA?
Variable reduction
Structural analysis
152
Why do we use PCA?
To reduce the redundancy in tests and see if the same construct can be better explained by a short form test
153
What is a factor pattern matrix?
A visual representation of the relation of items to the factor(s) on a test
154
What is the example of a factor pattern matrix that we have seen in class?
The red and blue squares of the NEO PI-R
155
Using a factor pattern matrix, how do we know if the items are good indicators of the factor(s)?
Strong blue squares
Using eigenvalues
156
What are Eigenvalues?
Numbers that show the proportion of variance that each factor contributes
157
What is a good eigenvalue?
Any above 1
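
Not from the deck: eigenvalues of an invented correlation matrix in Python, applying the deck's keep-above-1 rule (often called the Kaiser criterion).

```python
import numpy as np

# Eigenvalues of a correlation matrix, as in PCA. Each eigenvalue is the
# proportion of variance that factor contributes; keep those above 1.
R = np.array([
    [1.0, 0.6, 0.5, 0.1],
    [0.6, 1.0, 0.5, 0.1],
    [0.5, 0.5, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
])
eigenvalues = np.linalg.eigvalsh(R)[::-1]  # largest first
print(eigenvalues)
print("retain:", int((eigenvalues > 1).sum()), "factor(s)")
```
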
158
When creating a short form, how do we know what eigenvalues to get rid of?
Those under 1, or where the curve goes flat; those with the smallest correlations; items that correlate with multiple factors
159
How do we observe incremental validity?
By comparing two measures - an existing one and a new one - to a gold standard
160
How can incremental validity be represented?
Graphically, through models
161
What are the sections of a graphical representation of incremental validity?
Just the gold standard, measure 1, or measure 2
The single overlaps: GS-M1, GS-M2, M1-M2
The total overlap
162
What is model testing in terms of incremental validity?
The ability to create a predicted score on the gold standard, based on observations on the other two+ measures
163
Based on model testing, how do we know if a test has incremental validity?
If adding this scale to the calculation of predicted score on the GS closes the gap between the predicted and observed score, there is incremental validity
164
What is the theory about adding measures when model testing for incremental validity?
The more tests you add, the closer you SHOULD be to the observed score on the GS
165
How do you calculate predicted score when model testing for incremental validity?
SSE = Σ(y - ŷ)², summed over all N observations
Sum of squared errors = the sum of (observed - predicted) scores, squared
166
What are the two prediction models when model testing for incremental validity? Explain
  1. Benchmark - the existing test vs. the GS
  2. (Existing test + new test) vs. the GS - does adding your test contribute anything?
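
Not from the deck: a Python sketch of the two models on simulated scores, comparing SSE for the benchmark (M1 only) against M1 + M2; all data and the sse helper are invented.

```python
import numpy as np

# Incremental-validity model test: does adding a new measure (M2) shrink
# SSE against the gold standard beyond the benchmark that uses only M1?
rng = np.random.default_rng(0)
m1 = rng.normal(size=100)
m2 = rng.normal(size=100)
gs = 0.6 * m1 + 0.3 * m2 + rng.normal(scale=0.5, size=100)

def sse(X, y):
    X = np.column_stack([np.ones(len(y)), X])  # add intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ beta) ** 2))

print("benchmark (M1 only):", sse(m1, gs))
print("M1 + M2:", sse(np.column_stack([m1, m2]), gs))  # should be smaller
```
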
167
How can model testing be represented by the line of best fit?
Data points are the observed scores on the GS
Each measure has its line of best fit
The space between a point and the line shows the discrepancy between observed and predicted scores
168
How can you make your measure look better in model testing? Why?
Compare it to a poor benchmark
If the benchmark does a poor job when compared to the GS, it will make your scale look better
169
What are the outcomes of model testing?
  1. Both measures have incremental utility, one is not better than the other - retain both
  2. One measure has more incremental utility than the other - keep the better measure
  3. The measures do not contribute uniquely - choose one
  4. The measures have completely unique proportions of variation - retain both
170
How would you write a comparison of two tests? USE- GS:HRSD, M1: CESD, M2:BDI
The CESD accounts for variance in the HRSD above and beyond the variance accounted for by the BDI
171
What is confirmatory factor analysis?
Examining the structure of questionnaires and deciding which model best fits the data
172
What is used for confirmatory factor analysis?
Structural equation models
173
What is a structural equation model?
The imposition of a model on the data to evaluate fit
174
What is a latent variable?
The factors of a construct that cannot be directly observed, they are inferred using related questions
175
What are the observed indicators?
The questions
176
What are factor loadings in structural equation models?
Values that show how the latent variables relate to each other, and how the questions relate to the variables
177
In a visual SEM, what are the different parts?
Latent variables - circles - factors
Factor loadings - top r scores - correlations
Error - bottom r scores
178
What is the saturated model of SEM?
An explanatory model in which EVERYTHING is related
The benchmark
179
What is the independence model in SEM?
A model in which none of the variables are correlated
180
For the saturated and independence models of SEM, r=what?
Saturated: r = 1
Null (independence): r = 0
Other: 1 > r > 0
181
How does dimensionality factor into SEM?
Models can be uni-factorial or multi-factorial
182
What is a uni-factorial model in SEM?
Only one latent variable (circle)
183
What is a multi-factorial model in SEM?
Multiple latent variables (circles)
184
What is a nested model in SEM?
A model within another
185
How do you calculate the fit of a model in SEM?
By comparing the discrepancy between predicted and observed values to find which pattern of correlations is actually close to what has been observed
186
What are the 3 big psychometric wrongdoings?
  1. Creating a test that does not account for the behaviours of the target population
  2. Not having enough items
  3. Not using a test as it was intended
187
What is an example of no accounting for the behaviour of the target population?
TeenScreen - used to screen teens for risk of suicide, but the at-risk ones typically don’t show up
188
Why are 1 item tests not a good measure?
Responses might be wrong, and there is nothing else to verify them against
189
What is an example of not using a test how it was intended?
Using the WISC to identify children that are gifted