Test Construction Flashcards

Question

2 types of single-subject designs

Answer 1

Case study (describes an individual by using tests or naturalistic observation) or experimental (determine how the introduction of a factor affects behavior)

Answer 2

Autocorrelation (when measured on the same variable multiple times, the variable becomes correlated with itself); Time-intensive (multiple assessments or intense observations are time-consuming); Generalizability (may not generalize); Practice effects (scores may increase from practice)

Answer 3

An approach to personality that focuses on groups of individuals and tries to find the commonalities between individuals.

Answer 4

Very high multiple correlations among some of all predictors in an equation.

Answer 5

Systematic empirical exploration or relationships; deductive, rather than inductive. Involves the collection and statistical analysis of quantitative data, whose results can often be generalized.

Answer 6

Refers to the consistence or repeatability of data; pertains to quantitative research

Answer 7

Test for differences in the mean scores of groups based on one or more variables. DV must be continuous and IV must be categorical. Tests the null hypothesis that the means of the group are equal.

Answer 8

Independence of observations (each participant in only one cell); Normality (distribution of scores cluster around the mean with fewer observations fallen farther from the mean; AKA bell-chaped curve); and Homogeneity of Variance (variance of every group is same as variance of every other group, AKA homoscedasticity.

Answer 9

One-Way ANOVA (test the main effect of one IV); or Two-Way ANOVA (tests main effects of first IV (A), second IV (B), and the interaction of the two (A*B))

Answer 10

The effect of one IV on the DV differs depending on the level of the other IV

Answer 11

Ratios of effect variance to error variance. In One-Way ANOVA, there is one F-ratio of the effect of the IV. In a Two-Way ANOVA, there are three F-ratios (main effect A, B, and interaction effect)

Answer 12

Includes interaction effects; increases power; reduces familywise error rate.

Answer 13

The violation of the assumption of homogeneity, such that the variances of the groups are not equal. ANOVA is robust to such a violation, if there are no outliers, sample sizes are large and fairly equal, sample sizes within levels are relatively equal, and the hypothesis is two-tailed.

Answer 14

Statistical method of testing for an association between categorical variables; specifically, it tests for the equality of expected and observed frequencies or proportions.

Answer 15

An extension of ANOVA methods to cover cases where there is more than one DV and where the DVs cannot simply be combined. The MANOVA combines the DVs in such a way as to maximize differences between groups. In addition to identifying whether changes in the IV have a significant effect on the DV, the technique seeks to identify the interactions among the IVs and the DVs, if any.

Answer 16

A general linear model with one continuous DV and one or more IVs, plus a covariate. ANCOVA is a merger of ANOVA and regression for continuous variables. ANCOVA test where IVs have an effect after removing the variance for which one of more covariates account; the inclusion of covariates can increase statistical power because it accounts for some of the variability.

Answer 17

Continuous variables assume an intermediate value between two other values and there can be an infinite amount of possible values between those two values. Dichotomous variables have only two values (yes or no)

Answer 18

Examines the relationship between a dichotomous variable and a continuous variable. Can only be used with TRUE dichotomous variables.

Answer 19

Examine the relationship between an artificially-created (made form a continuous variable) dichotomous variable and a continuous variable

Answer 20

This correlation coefficient is used when measuring the relationship between two ranked variables giving a rank-order correlation.

Answer 21

This correlation coefficient is used when measuring the relationship between two continuous variables

Answer 22

Measures the amount of variance in a set of tests or items that can be accounted for by an underlying factor. Used in factor analysis and principal components analysis. Often converted into percentages to determine percentage of variance in a set of test items accounted for by an underlying factor. Factor analysis will provide same # of eigenvalues as there are items or tests.

Answer 23

An underlying factor is explaining a large amount of variance in a set of items or tests.

Answer 24

Deal with formulating conclusions and making inferences from collected data

Answer 25

A manner of regression analysis in which one or more predictor variables are used to predict a single criterion variable.

Answer 26

A statistical technique that identifies underlying patterns in a data set.

Answer 27

Identify underlying factors that are responsible for variation in a set of items, variables, or tests; Reduce a large set of variables to a smaller number of underlying factors.

Answer 28

Produced by factor analysis (along with eigenvalues) and provide a measure of the correlation between an item and an underlying construct. Higher factor loadings indicate that the underlying factor is accounting for a large amount of variance in the item.

Answer 29

To aid in interpretation of factor loadings from a factor analysis

Answer 30

A statistical method utilized to predict group membership.

Answer 31

A correlational technique that tests directional hypotheses among multiple IVs and multiple DVs simultaneously.

Answer 32

Changes the relationship between a predictor and a criterion variable; equivalent to an interaction effect in ANOVA; background variables such as gender, and SES, are common moderators. When moderator variables are present, a test has differential validity (validity differs depending on the level of the moderator, such as whether one is male or female)

Answer 33

Administering a test to a new sample, one that is different from the original validating sample, so as to evaluate the test's validity on another sample of subjects

Answer 34

A misleading increase in a test's validity, in which raters give subjects scores on the criterion variable after being privy to the subject's scores on the predictor variable.

Answer 35

A variable that is measure on either nominal or ordinal scales

Answer 36

A result of corss-validation in which there is a decrease in the validity coefficient due to sample differences.

Answer 37

A variable that affects the dependent or criterion variable, but is of now interest to the researcher

Answer 38

AKA directional test; test for rejection in only one tail; greater chance of rejecting null hypothesis

Answer 39

AKA non-directional test; tests for rejection in both tails; able to reject null hypothesis in both tails, but each tail has a greater chance of rejecting the null hypothesis

Answer 40

Standardized score that allows for a participant's score to be compared to the norm group. Mean of 50 with a standard deviation of 10

Answer 41

Mean of 10 and standard deviation of 1

Answer 42

68% of scores, or T-score between 40-60.

Answer 43

95% or T-scores between 30-70

Answer 44

Mean of 5 and standard deviation of 2.

Answer 45

An extension of ANOVA. Identifies trends in data when the IV varies from highest to lowest. (Ex. if one group is given 5 mg of meds, a 2nd group gets 10 mg and a third group is given 15 mg.

Answer 46

Means are arranged in a line

Answer 47

Means arranged in a U shape

Answer 48

Means arranged around two points of inflection

Answer 49

Means arranged around three points of inflection

Answer 50

Means arranged around four points of inflection

Answer 51

Deals with the correlation between an optimally weighted linear combination of predictors and a criterion. (Multiple regression deals with defining optimal weighting and is a test of prediction).

Answer 52

A universal measure of relationship that can be used regardless of the form of the relationship; it is obtained by computing the variance in Y about any curve of the relationship. Eta is a universal measure of relationship because it (1) applies regardless of the form of the relationship, (2) can be used with either a predicted cure of a relationship or a best-fitting curve obtained after the data are collected, and (3) applies equally well to continuous or categorical independent variables.

Answer 53

Rejecting the null hypothesis when it is true. Usually set to 0.05 in the social sciences

Answer 54

Failing to reject the null hypothesis when it is false. Related to Type II error is power (1-B), the probability of rejecting the null hypothesis when it is false.

Answer 55

Rejecting the null hypothesis, but for the wrong reason. Because of sampling error, two groups can be correctly identified as being significantly different, but the direction of the difference is the opposite of reality. These are relatively rare.

Answer 56

Type II Error becomes larger

Answer 57

Sample size (larger samples increase power); Alpha (smaller alpha levels decrease power, e.g. 0.01 or 0.001 rather than 0.05); Effect size (greater effect sizes increase power, in other words, larger difference between the two groups); and Test used (different statistical tests have more power, two-way ANOVA is more powerful than a one-way ANOVA)

Answer 58

Compares the test-taker's performance to an objective standard of achievement. Can be Domain-Referenced (examines the degree to which the test taker has mastered a specific area) or Objectives-Referenced (examines the degree to which the test taker has achieved instructional objectives)

Answer 59

Compares test-taker's performance to other test-taker's performance. Requires large standardized sample that is representative of population.

Answer 60

High correlations between measures of the same trait

Answer 61

Low correlations between measures of different traits

Answer 62

Correlation between two tests that measure one trait using one method

Answer 63

Correlation between two tests that measure one trait using different methods

Answer 64

Correlation between two tests that measure different traits using one method

Answer 65

Correlation between two tests that measure different traits using different methods

Answer 66

Developed by Cronbach and Meehl (1995) stating that in order to prove that a given measure had construct validity, a "lawful network" for the measure had to be developed; this network includes the theoretical framework for what the instrument is attempting to measure (the construct), an empirical framework for how the construct will be measured (observable manifestations), and the interrelationships among and between the the two frameworks.

Answer 67

Synonymous with IV; the variable that is sued to predict variance in the criterion; plotted on the X axis

Answer 68

Synonymous with DV; variance of the criterion is predicted by the predictor; potted on the Y axis

Answer 69

Beta (B) weights (strength of a predictor when all other predictors are held constant), R2 (unique predicitive strength of a predictor); Zero-order correlation (relationship between predictor and criterion ignoring all other predictors); Multicollinearity (i.e. highly correlated predictors, may not reduce predictive ability of predictors)

Answer 70

Correlation between predictor and criterion; squared validity coeffecient indicates the proportion of variance in criterion that is accounted for by the predictor; Greater ranges of scores in both predictor and criterion increases validity coefficient, restricted range decreases validity coefficient; Few validity coeffecients exceed 0.60

Answer 71

Theoretical standard that researchers seek to understand

Answer 72

Operational or actual standard that researcher actually assess

Answer 73

Portion of the conceptual criterion that is not measured by the actual criterion

Answer 74

Degree of overlap between the actual criterion and the conceptual criterion

Answer 75

Available criterion measure is a composite of separable attributes

Answer 76

A criterion that inaccurately differentiates between groups, resulting in majority of members being overrepresented in comparison to minority groups

Answer 77

The number of individuals in a given group who exceed cutoff on both predictor and criterion

Answer 78

The number of individuals in a given group who exceed cutoff on predictor but fail to exceed cutoff on criterion.

Answer 79

Number of individuals in a given group who fail to exceed cutoff on both predictor and criterion

Answer 80

The number of individuals in a given group who fail to exceed cutoff on predictor but exceed cutoff on criterion

Answer 81

Error in the employed values of a variable due to the presence of distorting influences on the assessment, such as momentary distractions, error in recording or understanding, and influences of other variables on responses to particular items. These are uncorrelated with the "true scores" by definition and treated as "random". A reduction of the correlation coefficient because of error is known as shrinkage.

Answer 82

History (any event between pretest and posttest), maturation (natural changes in participants), testing (practice effects), mortality (dropping out), selection, regression effects, demand characteristics

Answer 83

Simples version of this design in which a baseline (A) is tracked, and then some treatment (B) is applied; if there is a change then the treatment is said to have effect. Weak design because it is subject to many different hypotheses.

Answer 84

AKA quai-experiemental design, refers tot he pretesting and posttesting of one group of subjects at different intervals. The purpose might be to determine long-term effects of treatment, and, therefore, the number of pretests and posts can vary from one to many. Sometimes there is an interruption (follow up test) to assess strength of treatment over time.

Answer 85

Usually used for multiple choice exams; Corrected score = R-W/(n-1). R is the number of right answers obtained, W is the number of wrong answers, and n is the number of possible answers per question.

Test Construction Flashcards

(109 cards)