Flashcards in Stats Deck (127):

1

## 2 Basic Mathematical Principles important for EPPP

###
Squaring Decimals

Square rooting Decimals

2

## Critical Factor in determining the type of stat test to be used

### Type of data, particularly for the DV

3

##
4 Types of Data

*NOIR

###
Nominal

Ordinal

Interval

Ratio

4

## Nominal data

###
Non ordered categorical data, assigned a number for identification purposes but no further meaning to numbers

Sex, political party, race

Can compute percentages

5

## Ordinal Data

###
Ordered categorical data

Ex-grouped according to SES

6

## Interval Data

### Numerical scores, but there is no zero score or the zero is not absolute (e.g., temperature in Celsius or Fahrenheit)

7

## Ratio data

###
Numerical score, has an absolute zero

Ex- money in bank, EPPP score, weight

Means can be calculated, as well as comparisons across values

8

## 2 Broad classes of statistics

###
Descriptive

Inferential

9

## With descriptive stats, the data collected is ____, whereas with inferential stats, the goal is to make inferences about the ___ from the ___

###
simply described

population

sample

10

## 2 basic groups of Descriptive stats

###
1. Stats on the whole group's data

2. Stats describing ind's score relative to the group

11

## Descriptive stats on group data include

###
measures of central tendency

measures of variability

Graphs

12

## Measures of Central Tendency

###
Mean-avg score

Median- score at 50th percentile

Mode-most frequently occurring score

13

## The best measure of central tendency is typically the ___

### mean

14

## If data is skewed (extreme scores present) the most accurate measure of central tendency is ___

### median
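A minimal sketch of why the median is preferred for skewed data, using Python's standard `statistics` module. The scores are made up for illustration; the single extreme score of 40 creates a positive skew.

```python
import statistics

# Hypothetical scores with one extreme high value (positive skew)
scores = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(scores)      # pulled upward by the extreme score
median = statistics.median(scores)  # middle score, unaffected by the outlier
mode = statistics.mode(scores)      # most frequently occurring score

print(mean, median, mode)
```

With these values the mode is lowest and the mean is highest, which matches the ordering given for a positive skew.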

15

## Measure of Variability

###
Standard Deviation-avg spread from the mean

Variance-standard deviation squared

Range-diff between lowest & highest score obtained

16

## Standard deviation is the __ __ of the variance

### square root

17

## Variance is the standard deviation ___

### squared
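The relationship between variance, standard deviation, and range can be checked with a short sketch. The scores are hypothetical, and the population versions of the formulas (`pvariance`, `pstdev`) are an assumption made here for simplicity.

```python
import statistics

# Hypothetical set of scores
scores = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(scores)   # average squared deviation from the mean
sd = statistics.pstdev(scores)            # square root of the variance
score_range = max(scores) - min(scores)   # highest score minus lowest score

print(variance, sd, score_range)
```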

18

## Data that are not normally distributed are ___ or ___, meaning that scores are not equally distributed above & below the mean

### skewed, kurtotic

19

## In a positive skew, how are measures of central tendency impacted?

### Mode is lowest, mean is highest

20

## In a negative skew, how are measures of central tendency impacted?

### Mode is highest, mean is lowest

21

## Leptokurtic distribution

### Very sharp peak

22

## Platykurtic Distribution

### Flattened

23

## Normal Distribution

### Bell shaped

24

## Norm referenced score

### provides info as to how a person scored relative to the group

25

## The most informative norm referenced score is the ___ ___.

### Percentile rank

26

## Graphs for percentile ranks are ___ or ___

### flat, rectangular

27

## Standard scores

### based on standard deviation of the sample

28

## Examples of standard scores

###
z-scores

t-scores

IQ scores

SAT scores

EPPP scores

29

## z-score

###
most basic standard score

corresponds directly to standard deviation units, mean of 0, SD of 1

Ex- z score of +2 means the score is 2 SDs above the mean

Shape of z score distribution always same as raw score distribution

30

## z-score formula

### z = (score - mean) / standard deviation
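A small sketch of the z-score calculation. The function name `z_score` and the example values (a scale with mean 100 and SD 15) are illustrative assumptions, not part of the deck.

```python
def z_score(score: float, mean: float, sd: float) -> float:
    """Return how many standard deviations `score` lies from `mean`."""
    # Subtract the mean first, then divide by the standard deviation
    return (score - mean) / sd

# Illustrative scale with mean 100 and SD 15
z = z_score(130, 100, 15)
print(z)
```

Note that the subtraction happens before the division; a score equal to the mean always gives a z-score of 0.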

31

## Parameters vs. Statistics

### Population values vs Sample Values

32

## mu

### population mean

33

## sigma

### population standard deviation

34

## Sampling Error

### Samples are not perfectly representative of the population (sample means not identical to pop mean)

35

## Standard Error of the Mean

### The avg amount of deviation in a distribution of sample means

36

## Standard Error of the Mean formula

### SD population/square root of N
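The formula can be sketched as a one-line function. The name `standard_error_of_mean` and the example values are assumptions chosen for illustration.

```python
import math

def standard_error_of_mean(population_sd: float, n: int) -> float:
    """SEM = population SD divided by the square root of the sample size."""
    return population_sd / math.sqrt(n)

# Illustrative values: population SD of 15, samples of 25 and 100 people
sem_small_sample = standard_error_of_mean(15, 25)
sem_large_sample = standard_error_of_mean(15, 100)
print(sem_small_sample, sem_large_sample)
```

Quadrupling the sample size halves the standard error, since N enters the formula under a square root.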

37

## Central Limit Theorem

###
If an infinite number of equal sized samples are drawn from a population, the means of these samples will form a normal distribution.

The mean of the means (the grand mean) will equal the population mean

The standard deviation of the means will equal the SD of the population divided by the square root of the sample size (standard error of the mean)

*the shape of a sampling distribution of means approaches normality as sample size increases

38

## Standard Error of the mean helps us to determine

###
If an obtained mean is most likely due to treatment/experimental effects vs chance (sampling error)

Ex: if the SEM for IQ is 3 and a test of an IQ enhancement program yields a sample mean IQ of 103, the difference from the population mean of 100 is only 1 standard error and is likely due to chance. A sample mean IQ of 110 would be more than 3 standard errors away from the mean, which is likely statistically significant.
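The IQ example can be worked through in a few lines. This sketch assumes a population mean IQ of 100, which the card implies but does not state; the function name is hypothetical.

```python
def standard_errors_from_mean(sample_mean: float, population_mean: float, sem: float) -> float:
    """Express the distance between a sample mean and the population mean in SEM units."""
    return (sample_mean - population_mean) / sem

# Assumed population mean of 100 and SEM of 3, as in the example above
small_difference = standard_errors_from_mean(103, 100, 3)  # 1 standard error
large_difference = standard_errors_from_mean(110, 100, 3)  # a little over 3 standard errors

print(small_difference, round(large_difference, 2))
```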

39

## Key concepts in hypothesis testing

###
Null Hypothesis

Alternative Hypothesis

Rejection Region

40

## Null Hypothesis

###
States that there are no differences between groups, experimental research always hopes to reject the null hyp

*results almost always stated in terms of the null hypothesis

41

## Alternative Hypothesis

### Directly states that there are differences between groups

42

## Rejection region/Region of Unlikely Values

### The tail end of the curve; unlikely that a researcher will obtain means in this region simply by chance. Suggests that treatment did have an effect & null hyp is rejected

43

## Size of the rejection region corresponds to the ___ ___

###
alpha level

Ex: alpha of .05 indicates that rejection region is 5% of the curve
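As a rough illustration of how alpha sets the boundary of the rejection region on a normal curve, the sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+). The function name `critical_z` and the split between one-tailed and two-tailed tests are assumptions added for illustration.

```python
from statistics import NormalDist

def critical_z(alpha: float, two_tailed: bool = True) -> float:
    """Return the z value that cuts off the rejection region for a given alpha."""
    # A two-tailed test splits alpha evenly between both tails of the curve
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

two_tailed_cutoff = critical_z(0.05)                    # approximately 1.96
one_tailed_cutoff = critical_z(0.05, two_tailed=False)  # approximately 1.645
print(round(two_tailed_cutoff, 2), round(one_tailed_cutoff, 3))
```

A smaller alpha pushes the cutoff further into the tail, shrinking the rejection region.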

44

## Acceptance/Retention region

### No sig diffs between groups, null hyp is accepted

45

## 2 Factors contributing to conclusions re: stat significance

###
1. Treatment Effects

2. Sampling Error

46

## The only way to know w/certainty if a tx effect is significant is to:

### Replicate study numerous times

47

## 4 Possible Outcomes in terms of Correctness of Research Findings

###
Type I Error

Type II Error

Power

Correct Decision w/no name

48

## Type I Error

### Null is rejected, but later turns out to be a mistake, or diffs are found when they do not actually exist

49

## The size of ___directly corresponds to likelihood of making Type I Error

### Alpha

50

## Conventional cutoffs for alpha (.05, .01, .001) indicate that:

### obtained means are different enough to be attributed to tx effects and not to chance

51

## Type II Error

### Null is accepted, but this is a mistake, or no diffs are found where differences do actually exist

52

## The value of ___ corresponds to the probability of making Type II error

### beta

53

## Power

###
Null is rejected, and this is correct

Defined as the ability to correctly reject the null

54

## Factors affecting Power

###
Increased w/:

Large Sample Size

Small random error

Magnitude of intervention is large

Statistical test is parametric

Test is one tailed

55

## ___ has the most sig measurable effect on power; as ___ increases, so does power.

### Beta; Alpha

56

## Correct Decision w/no name

### Null is accepted and this is correct

57

## In determining the appropriate statistical test, you must first determine:

### what type of question is being addressed in the research

58

## Commonly asked questions in research

###
Questions of Difference between groups

Questions of Relationship & Prediction

Questions of Structure or Fit

59

## Steps to Select the Appropriate Test of Difference

###
1. Type of Data of the DV (Nominal, Ordinal, Interval, Ratio)

2. Number of IVs and Levels of IVs

3. Sample/Group Independence vs. Correlation

60

## If the DV is Nominal or Ordinal, a ___ test will be used

### non-parametric, for example chi-square, Mann-Whitney, Wilcoxon

61

## If the DV is interval or ratio data, a ___ test will be used

### parametric, for example t-test or ANOVA

62

## If there is more than one DV (interval or ratio data), a ___ will be the stat test of choice

### MANOVA

63

## Independent Groups

### Subjects randomly assigned to conditions or are grouped based on a pre-existing characteristic (gender or ethnicity)

64

## 3 Factors Resulting in Correlated Groups

###
1. Repeated measures

2. Subjects matched prior to assignment to groups (i.e. matched on income, IQ, etc)

3. Inherent relationship between subjects (twins, siblings, spouses)

65

## In order to use a parametric test, what 3 assumptions must be met?

###
1. Data is interval or ratio

2. Homoscedasticity-similar variability or SDs in the different groups

3. Data must be normally distributed

*If one of these is not met, stat of choice will typically be one used for ordinal data

66

## Assumption for the chi square test

###
Non parametric test

Answer: Independence of observations (no repeated measures design)

67

## Degrees of freedom

###
# of possible variations in outcome that can be obtained

*calculated differently based on the type of stat test

68

## Single Sample Chi Square

###
Nominal data collected for 1 IV

Ex: 100 psychologists sampled as to their political affiliation (political party seen as columns or groups)

69

## Single Sample Chi Square degrees of freedom formula

### df= #columns - 1

70

## Multiple Sample Chi Square degrees of freedom formula

###
Nominal data collected for 2 IVs

df= (#rows - 1) x (#columns -1)
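The two chi-square df formulas can be sketched as small helper functions. The function names are hypothetical.

```python
def df_single_sample_chi_square(n_columns: int) -> int:
    """df for a single sample chi-square: number of columns minus 1."""
    return n_columns - 1

def df_multiple_sample_chi_square(n_rows: int, n_columns: int) -> int:
    """df for a multiple sample chi-square: (rows - 1) x (columns - 1)."""
    return (n_rows - 1) * (n_columns - 1)

# Illustrative cases: 3 political parties; a 4 x 2 table
print(df_single_sample_chi_square(3))       # 2
print(df_multiple_sample_chi_square(4, 2))  # 3
```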

71

## Standard Error of the mean has a direct relationship with the ____ ____ ____ and an indirect relationship with ___ ___

###
population standard deviation

sample size

*SEM increases as SD increases and sample size decreases

72

## 2 Way ANOVA calculates:

### 3 F ratios (one for each main effect and one for the interaction)

73

## df formula for single sample t test

###
df=N - 1

(N- number of subjects)

74

## when do we use a one sample t test?

###
interval or ratio data collected for one group of subjects

Ex-BDI obtained for 30 subjects

75

## when do we use a t test for matched or correlated samples?

###
interval or ratio data collected for 2 correlated groups of subjects

Ex- BDI obtained for 2 matched groups of 15 people (so 30 total)

76

## df formula for matched samples t test

### df= #pairs - 1

77

## when do we use a Multiple sample chi square?

###
nominal data collected for 2 IVs

Ex- 100 psychologists sampled as to voting pref and ethnicity

78

## when do we use a t test for independent samples?

###
interval or ratio data collected for 2 independent groups of subjects

Ex-BDI obtained for 2 groups of 15 randomly assigned subjects (30 total)

79

## df formula for t test for independent samples

### df= N -2
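For comparison, the df formulas for the three t tests covered in this deck can be placed side by side. The function names are hypothetical, and the sample sizes follow the BDI examples (30 subjects in total).

```python
def df_one_sample_t(n_subjects: int) -> int:
    """Single sample t test: df = N - 1."""
    return n_subjects - 1

def df_matched_samples_t(n_pairs: int) -> int:
    """Matched (correlated) samples t test: df = number of pairs - 1."""
    return n_pairs - 1

def df_independent_samples_t(n_subjects: int) -> int:
    """Independent samples t test: df = N - 2, where N is the total across both groups."""
    return n_subjects - 2

print(df_one_sample_t(30))           # 29
print(df_matched_samples_t(15))      # 14
print(df_independent_samples_t(30))  # 28
```

The same 30 subjects give different df depending on the design, which is why group independence has to be settled before choosing the test.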

80

## One Way ANOVA

###
interval or ratio data collected for more than 2 groups of subjects

Ex- 60 subjects assigned to one of 4 tx groups

81

## Formulas for df in one way ANOVA

###
df total= N - 1

df between groups= #groups - 1

df within groups= dftotal - dfbetweengroups
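A brief sketch of the three df values for a one way ANOVA, using the earlier example of 60 subjects assigned to 4 treatment groups. The function name is hypothetical.

```python
def one_way_anova_df(n_subjects: int, n_groups: int) -> dict:
    """Return total, between-groups, and within-groups df for a one way ANOVA."""
    df_total = n_subjects - 1
    df_between = n_groups - 1
    df_within = df_total - df_between
    return {"total": df_total, "between": df_between, "within": df_within}

# 60 subjects assigned to one of 4 tx groups
print(one_way_anova_df(60, 4))  # {'total': 59, 'between': 3, 'within': 56}
```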

82

## Formula for Expected Frequency in Chi Square when N & the groups are given

###
Expected Freq= N/total # of cells

Ex- 4x2 chi square with a sample of 160

total # of cells is 8

160/8=20

expected freq in each cell=20

83

## Formula for expected freq in any cell when data are given for a chi square

### Expected freq for any cell= (sum of the row x sum of the column)/ N
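The row-by-column formula can be applied to a whole table at once. This is a minimal sketch; the function name and the observed counts are made up for illustration.

```python
def expected_frequencies(observed: list) -> list:
    """Expected frequency for each cell = (row sum x column sum) / N."""
    row_sums = [sum(row) for row in observed]
    column_sums = [sum(column) for column in zip(*observed)]
    n = sum(row_sums)
    return [[r * c / n for c in column_sums] for r in row_sums]

# Hypothetical 2 x 2 table of observed counts (N = 100)
observed = [[30, 20],
            [10, 40]]
expected = expected_frequencies(observed)
print(expected)  # [[20.0, 30.0], [20.0, 30.0]]
```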

84

## When do you use a one-way ANOVA?

###
when more than 2 groups are being compared on one IV

Ex- comparing 4 diff depression txs

preferable to using multiple t tests to avoid increasing probability of Type I error

85

## Stat for One Way ANOVA

###
F Ratio

Want to find high variability between groups and low within

86

## Formula for F Ratio; Guidelines for significance

###
F ratio= Mean Square between groups/Mean Square within groups

*Mean square is measure of avg variability

F Ratio= 1, no significance

Typically sig when above 2.0
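To make the F ratio concrete, here is a sketch that computes it from raw scores for a one way ANOVA. The function name and the three small groups are hypothetical, and the mean squares use the usual df (groups - 1 between, N - groups within).

```python
import statistics

def f_ratio(groups: list) -> float:
    """F = mean square between groups / mean square within groups."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.mean(all_scores)
    n_groups = len(groups)
    n_total = len(all_scores)

    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of individual scores around their own group mean
    ss_within = sum((score - statistics.mean(g)) ** 2 for g in groups for score in g)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    return ms_between / ms_within

# Hypothetical scores for 3 treatment groups
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(round(f_ratio(groups), 2))
```

Here the group means differ widely while scores within each group are tightly clustered, so the F ratio comes out well above 1. Whether a given F is significant still depends on the df and the chosen alpha.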

87

## A significant F Ratio with an ANOVA means:

### There are differences between groups, but you do not know which ones. Must perform post hoc analyses

88

## Post hoc analyses following significant ANOVA involve:

### many pairwise comparisons

89

## Possible post hoc tests following sig ANOVA, in order from most to least protection from Type I error

###
Scheffe

Tukey

Duncan

Dunnett

Newman-Keuls

Fisher's least sig diff

*reverse order for protection from Type II error

90

## When to use a Two Way ANOVA & main advantage over 2 separate one way ANOVAs

### Groups are being compared on 2 IVs (ex- sex and treatment); examines main effects for each IV and interaction effects

91

## In a 2 way ANOVA, if there are sig main & interaction effects, which is interp first?

### Interactions

92

## To calculate Main & Interaction effects of a 2 Way ANOVA on the test you:

###
1. Find the sum of each column (if sums are different, there is a main effect for that IV)

2. Find the sum of each row (if sums are different, there is a main effect for the second IV)

3. Divide the table into squares and find the diagonal sums for each square (if the diagonal sums are different, there is an interaction effect for those IVs)
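The three steps above can be sketched for a single 2 x 2 table of cell means. The function name and the cell values are hypothetical. This is only the descriptive shortcut for reading a table, not a significance test.

```python
def describe_effects(table: list) -> dict:
    """Screen a 2 x 2 table of cell means for main and interaction effects.

    Rows are the levels of one IV; columns are the levels of the other IV.
    """
    column_sums = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    row_sums = [sum(table[0]), sum(table[1])]
    diagonal_sums = [table[0][0] + table[1][1], table[0][1] + table[1][0]]
    return {
        "main_effect_columns": column_sums[0] != column_sums[1],
        "main_effect_rows": row_sums[0] != row_sums[1],
        "interaction": diagonal_sums[0] != diagonal_sums[1],
    }

# Hypothetical cell means: a crossover pattern with no main effects
crossover = [[10, 20],
             [20, 10]]
print(describe_effects(crossover))

# Hypothetical cell means: a main effect for the column IV only
column_only = [[10, 20],
               [10, 20]]
print(describe_effects(column_only))
```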

93

## When do we use a MANOVA?

### When there is more than one outcome measure or DV

94

## When an IV is quantitative, how do we analyze the data?

###
Trend Analysis

Ex: IV is dosage of a drug, length of time, etc

Data may be non-linear, so the focus is on trends in the data rather than group diffs

95

## Stats depicting relationships between variables are termed ____, while stats that predict are termed ___ or ___

###
correlations

regressions/analyses

96

## Bivariate correlations

### look at relationship between variables, X (predictor) and Y (criterion)

97

## Range of Correlation Coefficient

### -1.0 to +1.0 (describes strength and direction of the correlation)

98

## Graphic depictions of correlations

### data point reps ind's score on both X and Y, the closer the points are clustered, the stronger the correlation

99

## Correlation coefficient tells you

###
how the variability or spread of Y scores for any given X score compares to the total variability of Y scores

Ex- if there is no correlation at all (coefficient of 0.0), for any given X, the range of possible Y could be anywhere from bottom to top of possible scores

100

## Coefficient of Determination

###
correlation coefficient squared

Represents amount of variability in Y that is explained or accounted for by X

Ex- correlation coefficient of .50 for level of education and income

.5 squared= .25, meaning that 25% of variability in income is explained by education level
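The education and income example can be reproduced in one line of arithmetic. The function name is hypothetical.

```python
def coefficient_of_determination(r: float) -> float:
    """Square the correlation coefficient to get the proportion of variability explained."""
    return r ** 2

# Correlation of .50 between level of education and income
explained = coefficient_of_determination(0.50)
print(f"{explained:.0%} of the variability in Y is explained by X")
```

Because the coefficient is squared, a negative correlation of the same size explains the same proportion of variability.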

101

## Simple Linear Regression Equation

### Derived anytime the correlation coefficient is other than 0.0, based on line of best fit through the scatter plot of scores

102

## 3 basic assumptions of bivariate correlations

###
Linear relationship between X and Y

Homoscedasticity-similar spread of scores across scatter plot

Unrestricted range of scores on both X and Y

103

## Impact of restriction of range

### Correlation, reliability and validity coefficients are typically lower when the range of either variable is restricted

104

## For Bivariate correlations, if both X and Y are interval or ratio data, you use

### Pearson r

105

## For Bivariate correlations, if both X and Y are ordinal (rank ordered) data, you use

### Spearman's rho or Kendall's Tau

106

## Zero Order Correlation

###
most basic correlation

analyzes rel btwn X and Y when no extraneous variables affect the relationship

107

## Partial Correlation ( First Order)

###
examines rel btwn X and Y when effect of a third, confounding variable is removed

Ex: examine relationship btwn GPA & SAT scores after removing impact of parental education

108

## Part (Semipartial) Correlation

### examines rel btwn X and Y when the effect of a third, confounding variable is removed from only one of the orig variables

109

## Moderator Variable (in Bivariate Corr)

###
A variable that influences the strength of relationship between predictor & criterion

Ex- relationship between income & smoking may be different strength at diff ages

110

## Mediator Variable (in Bivar Corr)

###
Explains why there is a rel between predictor & criterion

Ex- if effect of education removed from link btwn SES and smoking, corr goes down to almost 0

111

## Multivariate Tests of correlation & prediction

###
Involve several predictors or IVs & one or more criterions or DVs

Multiple R

Multiple Regression

Canonical R & Canonical Analysis

Discriminant Function Analysis

Loglinear Analysis

Path Analysis

Structural Equation Modeling

112

## Multiple R

### Correlation btwn 2 or more IVs and one DV, where Y is always interval or ratio data and at least one X is interval or ratio data

113

## Coefficient of Multiple Determination

### Index of amt of variability in criterion Y that is accounted for by all predictors (Xs).

114

## Multiple Regression

###
Uses Multiple R to derive equation that allows prediction of the criterion based on values of the predictors

*To optimally predict, want low corr btwn predictors (Xs) and moderate to high corr btwn each predictor and the criterion

*Compensatory technique b/c low scores on one predictor can be compensated for by high scores on another

115

## Multicollinearity

### Problem that occurs w/multiple regression equation when predictors are highly correlated with one another

116

## 2 most common subtypes of multiple regression

###
Stepwise-computerized, forward or backward

Hierarchical-researcher controls, adds variables to regr analysis in order most consistent w/theory proposed

117

## Canonical R & Canonical Analysis

###
Extension of multiple R

Corr btwn 2 or more IVs (predictor set) and 2 or more DVs (criterion set)

*compensatory approach

118

## Discriminant Fx Analysis

###
Used when there are 2 or more predictors (Xs) and one nominal (categorical) criterion variable

Ex: predicting likelihood of passing or failing EPPP (categorical Y) based on time spent studying and number of practice tests completed

*compensatory

119

## Loglinear Analysis

###
Used to predict categorical criterion (Y) based on categorical predictors (Xs)

Ex: type of grad program (categorical X) and sex (categorical X) used as predictors for passing or failing EPPP (cat Y)

*compensatory

120

## 2 Approaches that apply correlational techniques to causal modeling

###
Path Analysis

Structural Equation Modeling

121

## Tests of Structure

###
determine which variables in the set fit best together or form coherent subsets that are relatively independent of one another

Includes:

Factor Analysis, Cluster Analysis

122

## Factor Analysis

### Extracts as many sig factors as possible from the data (strongest to weakest); the stronger the factor, the more variability in scores it accounts for

123

## Eigenvalue

### indicates strength of a factor, less than 1.0 are not interpreted

124

## Factor Analysis starts w/___ ___ and computes ___ ___, which are correlations between a variable and the underlying factor

###
correlation matrix

factor loadings

125

## Factor Rotation

### Makes factor loadings more distinct & interpretable

126

## 2 types of factor rotation

###
Orthogonal (axes remain perpendicular)

Oblique

127