Statistics Flashcards

Question

µ

Answer 1

population mean

Answer 2

sample mean

Answer 3

population standard deviation

Answer 4

sample standard deviation

Answer 5

population variance

Answer 6

sample variance: SS/(n - 1), where n - 1 is the sample df

Answer 7

population proportion that has a particular attribute

Answer 8

sample proportion that has a particular attribute

Answer 9

population correlation coefficient

Answer 10

sample correlation coefficient

Answer 11

population number of elements

Answer 12

sample number of elements

Answer 13

null hypothesis

Answer 14

alternative hypothesis

Answer 15

alpha: probability of a type 1 error

Answer 16

beta: probability of a type 2 error

Answer 17

incorrect rejection of a true null hypothesis - false positive - thinking there is an effect when there isn't

Answer 18

incorrectly retaining a false null hypothesis - false negative - thinking there isn't an effect when there is one

Answer 19

organized tabulation of the number of individual scores located in each category on the scale of measurement - takes disorganized scores and places them in order from highest to lowest - lets you see the entire set of scores at a glance - categories are based on the measurement scale - can be a graph or a table

Answer 20

used when the data cover a wide range of values and it is unrealistic to list individual scores - rule 1: ~10 class intervals - rule 2: relatively simple width (2, 5, 10) - rule 3: each interval starts with a score that is a multiple of the width - rule 4: all intervals should be the same width
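The four rules above can be sketched in a few lines of Python. This is only an illustration: the scores are made up, the function name is hypothetical, and whole-number scores are assumed.

```python
from collections import Counter

def grouped_frequencies(scores, width):
    """Count scores per class interval; each interval starts at a multiple of width."""
    counts = Counter((s // width) * width for s in scores)
    lowest, highest = min(counts), max(counts)
    # Keep empty intervals too, so every class has the same width (rule 4).
    return [(lo, lo + width - 1, counts.get(lo, 0))
            for lo in range(highest, lowest - 1, -width)]

# Made-up whole-number scores, for illustration only.
scores = [53, 58, 61, 64, 67, 70, 72, 75, 75, 78, 81, 84, 86, 91, 94]
table = grouped_frequencies(scores, width=5)
for lo, hi, f in table:          # listed from highest interval to lowest
    print(f"{lo}-{hi}: {f}")
```

With a width of 5 these scores fall into 9 intervals, which is close to the ~10 suggested by rule 1.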

Answer 21

uses horizontal or vertical bars to show comparisons among categories - nominal/ordinal

Answer 22

curve of the cumulative frequency distribution or cumulative relative frequency distribution - express each simple frequency as a percentage of the total frequency - cumulate and plot these percentages (e.g. the lowest score makes up 5% and the next score makes up 6%, so the cumulative percentage is 11%, and that is what is plotted for score 2)
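A minimal sketch of the cumulating step, using made-up scores and a hypothetical helper name. It returns the (score, cumulative %) points that would be plotted.

```python
from collections import Counter
from itertools import accumulate

def cumulative_percents(scores):
    """Return (score, cumulative percentage) pairs, lowest score first."""
    counts = Counter(scores)
    values = sorted(counts)
    n = len(scores)
    running = accumulate(counts[v] for v in values)   # cumulative frequencies
    return [(v, 100 * c / n) for v, c in zip(values, running)]

scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]   # made-up data
points = cumulative_percents(scores)
print(points)
```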

Answer 23

a line drawn to join the midpoints of the tops of the bars of a histogram - like an ogive, but does not use cumulative frequencies or smooth lines - to convert to an ogive, add the percentages of all preceding bars to each bar

Answer 24

an area diagram -> bars portray frequencies of possible values of a variable - continuous variables (this is why the bars touch) - set of rectangles along the intervals between class boundaries - areas proportional to the frequencies in corresponding classes

Answer 25

can't find absolute frequencies but can find relative frequencies, e.g. we don't know how many fish make up the population in a lake -> don't know how many trout or salmon there are, but after research we can say that there are twice as many trout as salmon

Answer 26

score point below which a specified % of the scores in a distribution fall - compute the percent * N - round this figure so that it ends in .0 or .5, whichever is closer - if the rounded value ends in .5, the desired centile is the next higher value; if it ends in .0, split the difference with the next higher score

Answer 27

percent of cases that fall below a specific point in the distribution - write down the exact limits of the interval that contains the score whose rank is to be obtained - interpolate between the cumulative percents to find the desired CR: pair each value with its cum % (exact limit Y with A, score X with B, exact limit Z with C), then solve (X - Z)/(Y - Z) = (B - C)/(A - C) for B
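The interpolation can be sketched as follows. The function and argument names are hypothetical and the numbers are invented. The code solves (X - Z)/(Y - Z) = (B - C)/(A - C) for B, taking Z as the lower exact limit and Y as the upper exact limit.

```python
def interpolated_rank(x, lower_limit, upper_limit, cum_pct_lower, cum_pct_upper):
    """Linearly interpolate the cumulative percent at score x within its interval."""
    fraction = (x - lower_limit) / (upper_limit - lower_limit)   # (X - Z)/(Y - Z)
    return cum_pct_lower + fraction * (cum_pct_upper - cum_pct_lower)

# Invented interval: exact limits 4.5 and 9.5, with cumulative percents 20 and 60.
rank = interpolated_rank(7, 4.5, 9.5, 20, 60)
print(rank)   # the score sits halfway through the interval
```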

Answer 28

descriptive statistical measure that determines a single score that defines the center of a distribution - goal: find the one score that is most representative of the group - most common method of summarizing/describing a distribution

Answer 29

average; sum of scores divided by number of scores - appropriate when... no extreme outliers, no nominal scales - ∑X/N

Answer 30

the score that divides the distribution of scores exactly in half - appropriate when... there are extreme outliers, no nominal scales, skewed distribution - N/2

Answer 31

score or category that has the greatest frequency - appropriate when... you want your answer to be correct as often as possible, nominal scales, discrete variables (hair color frequency)
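A quick comparison of the three measures using Python's standard `statistics` module. The data are made up and include one extreme outlier, to show why the median is preferred in that case.

```python
import statistics

scores = [2, 3, 3, 4, 5, 6, 40]   # made-up data; 40 is an extreme outlier

mean = statistics.mean(scores)      # sum of scores / number of scores
median = statistics.median(scores)  # middle score of the ordered list
mode = statistics.mode(scores)      # most frequent score

print(mean, median, mode)
```

The outlier pulls the mean well above the median, while the median and mode stay near the bulk of the scores.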

Answer 32

will change mean, unless score is the same as the mean

Answer 33

the same constant is added to/subtracted from the mean, e.g. 1, 2, 3: M = 2; now add 2 to each score: 3, 4, 5: M = 4

Answer 34

mean changes in the same way, e.g. 1, 2, 3: M = 2; now multiply all scores by 2: 2, 4, 6: M = 4

Answer 35

when choosing which measure is most valuable... normal dist: all equal - skewed dist: median - negatively skewed: mean < median < mode - positively skewed: mode < median < mean

Answer 36

quantitative measure of the degree to which scores in a distribution are spread out or clustered together - no variability: no difference between scores - small variability: small differences - large variability: large differences

Answer 37

the distance between the largest score and the smallest score - must be computed in terms of real limits - problem: determined solely by the two most extreme scores of the distribution - calculate: subtract the lowest score from the highest score

Answer 38

ignores any extreme outlier scores -> measures the range covered by the middle 50% of the distribution - separates scores into 4 equal parts with "cuts" either between or on certain scores - interquartile range is the distance between Q1 and Q3 (from the cutoff for the lowest 25% to the cutoff for the top 25%) - calculate: order from least to greatest, find the median/middle number, calculate the median of the first half (Q1), calculate the median of the second half (Q3), subtract Q1 from Q3

Answer 39

half of the interquartile range - middle 25% - calculate: divide the interquartile range in half
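A small sketch of the median-of-halves procedure for the interquartile range and its half. The data and the helper name are made up. Other quartile conventions exist and can give slightly different values.

```python
import statistics

def quartiles(scores):
    """Q1 and Q3 as the medians of the lower and upper halves (n >= 2 assumed)."""
    ordered = sorted(scores)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]   # skips the middle score itself when n is odd
    return statistics.median(lower), statistics.median(upper)

scores = [12, 2, 9, 4, 7, 4, 10, 5, 8]   # made-up data
q1, q3 = quartiles(scores)
iqr = q3 - q1          # interquartile range
semi_iqr = iqr / 2     # half of the interquartile range
print(q1, q3, iqr, semi_iqr)
```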

Answer 40

most commonly used and most important measure of variability - takes into account all values of a variable - mean = reference point; measures variability by considering the distance between each score and the mean - determines whether scores are generally near or far from the mean, i.e. how much they deviate from the mean

Answer 41

∑(X - µ)² - find the deviation score for each score: X - µ (be mindful of +/-) - square each deviation score: (X - µ)² - add up all the squared deviation scores: ∑(X - µ)² - this is SS

Answer 42

take SS and divide by N: ∑(X - µ)²/N - larger value = more variability = scores are more spread out = BAD

Answer 43

take the square root of the variance - SS/N = σ² <- this is the variance - √σ² = σ <- this is the standard deviation
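The population formulas (SS, then σ² = SS/N, then σ = √σ²) worked through on a tiny made-up data set that is treated as a whole population:

```python
import math

scores = [1, 9, 5, 8, 7]   # made-up data, treated as an entire population
N = len(scores)
mu = sum(scores) / N                          # population mean
SS = sum((x - mu) ** 2 for x in scores)       # sum of squared deviations
variance = SS / N                             # sigma squared
sd = math.sqrt(variance)                      # sigma

print(mu, SS, variance, sd)
```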

Answer 44

find the deviation score for each score: x - M - square each deviation score: (x - M)² - add up all the squared deviation scores: ∑(x - M)² <- this is SS

Answer 45

take SS and divide by n - 1: ∑(x - M)²/(n - 1) = s²

Answer 46

take the square root of the variance to get the standard deviation: s = √s²
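The same made-up scores treated as a sample instead, so SS is divided by n - 1. The result is cross-checked against `statistics.variance`, which also uses n - 1.

```python
import math
import statistics

sample = [1, 9, 5, 8, 7]   # made-up data, treated as a sample
n = len(sample)
M = sum(sample) / n                           # sample mean
SS = sum((x - M) ** 2 for x in sample)        # sum of squared deviations
s2 = SS / (n - 1)                             # sample variance, df = n - 1
s = math.sqrt(s2)                             # sample standard deviation

print(SS, s2, s)
print(math.isclose(s2, statistics.variance(sample)))
```

Dividing by n - 1 instead of n makes the sample variance larger (10 instead of 8 here).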

Answer 47

an unbiased statistic is, on average, an accurate representation of the population parameter - using n - 1 in the sample variance corrects for bias in sample variability

Answer 48

provides a precise description of a location in a distribution - describes the number of SDs from the mean - describes how common/exceptional a score is compared to others - positive z-score = above the mean, negative z-score = below the mean

Answer 49

lets you compare scores across test forms - same shape as the original distribution (scores are renamed, but keep the same location) - e.g. z-score distribution: when transforming x scores to z-scores, new M = 0, new s = 1
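A short check that transforming scores to z-scores gives a mean of 0 and a standard deviation of 1. The data are made up, and the population SD is used for the transformation.

```python
import statistics

scores = [1, 9, 5, 8, 7]                  # made-up data
mu = statistics.mean(scores)
sigma = statistics.pstdev(scores)         # population standard deviation

z_scores = [(x - mu) / sigma for x in scores]

new_mean = statistics.mean(z_scores)
new_sd = statistics.pstdev(z_scores)
print(z_scores)
print(new_mean, new_sd)                   # approximately 0 and 1
```

Scores above the mean get positive z-scores and scores below it get negative ones, and their order is unchanged.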

Answer 50

likelihood that something will happen - way to quantify randomness - smaller # -> less likely over the long run - p = (# of a certain outcome)/(# of all possible outcomes) - finding a probability is similar to finding a percentile rank: the probability of having an IQ below 120 is the same as the percentile rank of x = 120

Answer 51

act of flipping a coin or rolling a die

Answer 52

cannot happen at the same time - rolling a 2 and a 6 on a single roll of a die can't happen simultaneously

Answer 53

probability of being selected is independent of the individuals already selected - each individual in the population has an equal chance of being selected - ensures that the probability of a particular outcome does not depend on previous outcomes

Answer 54

returning selections back to the population - probability of picking out a red m&m is 1/10 - pick out an m&m, then replace it: the probability is still 1/10 instead of 1/9, 1/8, etc.

Answer 55

transform the score to a z-score (z = (x - M)/s; x = M + zs) - look up the z-score in the unit normal table - proportions are always positive, even if the z-score is negative - negative z-score: tail is on the left, body on the right - positive z-score: tail on the right, body on the left
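Instead of a printed unit normal table, the lookup can be sketched with `statistics.NormalDist` from the Python standard library. The IQ mean of 100 and SD of 15 are assumptions chosen for illustration, and the helper name is hypothetical.

```python
from statistics import NormalDist

def body_and_tail(x, mean, sd):
    """Return (z, body proportion, tail proportion) for score x."""
    z = (x - mean) / sd
    below = NormalDist().cdf(z)          # proportion of the distribution below z
    body = max(below, 1 - below)         # the larger section
    tail = min(below, 1 - below)         # the smaller section
    return z, body, tail

# Assumed IQ scale: mean 100, SD 15.
z, body, tail = body_and_tail(120, 100, 15)
print(round(z, 2), round(body, 4), round(tail, 4))
```

For a negative z-score the same body and tail proportions come back, which matches the point that proportions are always positive.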

Answer 56

set of means from all possible random samples (w/ replacement) of size n from a population - the larger the n, the smaller the standard error of the mean (means from multiple trials) -> because there is less error between the sample mean and the population mean: the more people in the study, the less error between the sample and the population - sample means should be centered around the population mean - the expected value of M is µ - the sample mean is an unbiased estimator of the population mean - the distribution of sample means approaches a normal distribution as n increases, even if the original dist. is skewed

Answer 57

σM = σ/√n

Answer 58

each sample mean, M, has a location in the distribution of sample means that can be described by a z-score - calculate: z = (M - µ)/σM, i.e. (sample mean - population mean)/standard error of the mean
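A minimal sketch combining σM = σ/√n with z = (M - µ)/σM. The numbers are invented and the function name is hypothetical.

```python
import math

def z_for_sample_mean(M, mu, sigma, n):
    """Locate a sample mean within the distribution of sample means."""
    standard_error = sigma / math.sqrt(n)     # sigma_M = sigma / sqrt(n)
    return (M - mu) / standard_error

# Invented example: population mu = 100, sigma = 15; sample of n = 25 with M = 106.
z_small_n = z_for_sample_mean(106, 100, 15, 25)
z_large_n = z_for_sample_mean(106, 100, 15, 100)
print(z_small_n, z_large_n)
```

The same 6-point difference is twice as many standard errors from µ when n grows from 25 to 100, because the standard error is halved.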

Answer 59

determining whether the sample is representative of the population or merely the result of chance

Answer 60

suggests that there is no difference between groups - no effect - assume the null hypothesis is true unless the data indicate otherwise

Answer 61

suggests there IS a difference between groups - there is an effect

Answer 62

number of standard errors the sample value is removed from the null value - used to determine whether to reject the null - compares your data with what is expected under the null - e.g. z-score

Answer 63

probability of making a type 1 error - decreasing the significance level -> decreases the chance of a type 1 error but increases the chance of a type 2 error

Answer 64

composed of the extreme sample values that are very unlikely to be obtained if the null is true - boundaries are determined by the alpha level - if the sample data fall in the critical region, the null is rejected - calculate: 1. define alpha 2. use the unit normal table to find the z-score boundary that the sample z-score must be larger than (+) or smaller than (-) to fall in the critical region

Answer 65

1. state hypotheses (one-tailed or two-tailed - e.g. lower response vs. has an effect) 2. set the criteria - alpha level - find critical regions 3. collect data and evaluate - calculate standard error - calculate z-score 4. make a decision - reject null -> sample data in critical region, tx had an effect - fail to reject null -> sample data not in critical region, not enough evidence that tx had an effect
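The four steps can be sketched as a two-tailed z-test. The critical boundary comes from `statistics.NormalDist` in place of the unit normal table. All numbers and the function name are invented for illustration.

```python
import math
from statistics import NormalDist

def two_tailed_z_test(M, mu, sigma, n, alpha=0.05):
    """Return (z, critical z, reject_null) for H0: the population mean is mu."""
    # Step 2: critical region boundaries are +/- critical for a two-tailed test.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 3: standard error, then the z-score for the sample mean.
    standard_error = sigma / math.sqrt(n)
    z = (M - mu) / standard_error
    # Step 4: reject the null only if z falls in the critical region.
    return z, critical, abs(z) > critical

z, critical, reject = two_tailed_z_test(M=106, mu=100, sigma=15, n=25)
print(round(z, 2), round(critical, 2), reject)
```

With alpha = .05 the boundaries are about ±1.96, so a sample z of 2.0 lands in the critical region.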

Answer 66

magnitude of the treatment effect

Answer 67

.2 = small effect - .5 = medium effect - .8 = large effect - calculate: (µtx - µnotx)/s
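A one-line illustration of the calculation with invented means and an invented standard deviation (the function name is hypothetical):

```python
def effect_size_d(mean_tx, mean_no_tx, sd):
    """Mean difference expressed in standard deviation units."""
    return (mean_tx - mean_no_tx) / sd

d = effect_size_d(mean_tx=106, mean_no_tx=100, sd=15)
print(round(d, 2))   # falls between the small (.2) and medium (.5) benchmarks
```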

Answer 68

probability that the test will correctly reject a false null hypothesis - helps determine # of participants needed - related to effect size -> higher effect size = higher chance of rejecting the null (both reflect magnitude of tx effect) - decrease standard error between two distributions -> increase # of subjects - factors that affect power: sample size, alpha level, 1-tailed vs. 2-tailed

Answer 69

another way to calculate effect size - the amount of variability/percentage of variance accounted for - .01 = small effect - .09 = medium effect - .25 = large effect

Answer 70

z stat is used with an unknown population mean and a known standard deviation - t stat is used to test hypotheses about an unknown population mean when the standard deviation is unknown - the only difference between t and z is the estimated standard error - calculate: t = (M - µ)/sM - difference between sample mean and population mean divided by the difference expected by chance

Answer 71

1. set up hypotheses - H0: M1 = M2; H1: M1 ≠ M2 2. set the criteria - set alpha - find critical region 3. collect data and evaluate - calculate variance or SD (s² = SS/(n - 1) = SS/df) - calculate estimated standard error (sM = s/√n) - calculate t-stat (t = (M - µ)/sM) 4. make a decision

Answer 72

r² = t²/(t² + df)
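The single-sample t computation and the r² formula can be sketched together. The sample and the hypothesized µ = 10 are made up, and the function name is hypothetical. The critical value would still come from a t table (2.306 for df = 8, two-tailed, alpha = .05).

```python
import math

def one_sample_t(sample, mu):
    """Return (t, df, r_squared) for a single-sample t-test against mu."""
    n = len(sample)
    M = sum(sample) / n
    SS = sum((x - M) ** 2 for x in sample)
    df = n - 1
    s2 = SS / df                      # sample variance
    s_M = math.sqrt(s2 / n)           # estimated standard error, same as s / sqrt(n)
    t = (M - mu) / s_M
    r2 = t ** 2 / (t ** 2 + df)       # percentage of variance accounted for
    return t, df, r2

sample = [9, 9, 11, 13, 13, 13, 15, 17, 17]   # made-up data: M = 13, SS = 72
t, df, r2 = one_sample_t(sample, mu=10)
print(round(t, 3), df, round(r2, 3))
```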

Answer 73

comparing means of 2 independent groups - uses a separate sample for each of the tx populations compared - examines the difference between population means of 2 independent groups - assumptions - independent observations -> one observation doesn't affect the probability of other observations - normal distribution - populations have equal variance -> homogeneity of variance

Answer 74

1. state H0 and H1 - H0: µ1 = µ2 OR µ1 - µ2 = 0 - H1: µ1 ≠ µ2 OR µ1 - µ2 ≠ 0 2. identify critical regions based on alpha - calculate total df (df = df1 + df2) - find critical region boundaries in the t distribution table 3. evaluate assumptions 4. compute statistics - pooled variance - estimated standard error - independent-samples t statistic 5. make a decision regarding H0 - the independent-measures t test gives us the total amount of error involved in using 2 sample means to estimate 2 population means - tells the average distance between the sample difference and the population difference - estimate the standard error using the sample standard deviation or variance and, since there are two samples, we must average the two sample variances.

Answer 75

account for both standard errors: find the error term for each sample separately (pooled variance divided by that sample's n), add them together, then take the square root.
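A sketch of the independent-samples computation: pooled variance, estimated standard error, then t. The two groups are made up and the function name is hypothetical. The pooled variance is taken as (SS1 + SS2)/(df1 + df2), which is the usual way of averaging the two sample variances.

```python
import math

def independent_t(group1, group2):
    """Return (t, total df) for an independent-samples t-test of H0: mu1 = mu2."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    ss1 = sum((x - m1) ** 2 for x in group1)
    ss2 = sum((x - m2) ** 2 for x in group2)
    df1, df2 = n1 - 1, n2 - 1
    pooled_variance = (ss1 + ss2) / (df1 + df2)
    # One error term per sample, added together, then the square root.
    standard_error = math.sqrt(pooled_variance / n1 + pooled_variance / n2)
    t = (m1 - m2) / standard_error
    return t, df1 + df2

t, df = independent_t([8, 10, 12], [3, 5, 7])   # made-up groups
print(round(t, 3), df)
```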

Answer 76

measures treatment effect - mean difference divided by the standard deviation (estimated from the sample because it's a t-test) - (M - µ)/s

Answer 77

repeatedly measures same individuals to assess change (within-subjects) - same sample, test twice, before/after tx - same subjects are being tested under different conditions

Answer 78

difference score (D) - change in an individual's score between two measures 1. state null and alternative - H0: µD = 0 - H1: µD ≠ 0 2. select alpha and critical values 3. compute the t statistic (do not have to compute pooled variance because it is one group) - estimated standard error - dependent-samples t statistic 4. make your decision
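A sketch of the repeated-measures computation: it reduces each pair of scores to a difference score D and then runs a single-sample t on the D values against 0. The before/after scores are made up and the function name is hypothetical.

```python
import math

def paired_t(before, after):
    """Return (t, df) for a dependent-samples t-test of H0: mean difference = 0."""
    diffs = [a - b for a, b in zip(after, before)]   # difference scores D
    n = len(diffs)
    mean_d = sum(diffs) / n
    ss = sum((d - mean_d) ** 2 for d in diffs)
    df = n - 1
    variance = ss / df
    standard_error = math.sqrt(variance / n)         # estimated standard error of mean D
    return mean_d / standard_error, df

before = [10, 12, 9, 11]   # made-up scores before tx
after = [13, 14, 10, 15]   # same individuals after tx
t, df = paired_t(before, after)
print(round(t, 3), df)
```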

Answer 79

advantages - allows researcher to exclude effects of individual differences (each participant is their own control) - requires fewer participants -> easier to recruit - study individuals over time - disadvantages - order effects - variance reduced - other things can affect results -> history, maturation, attrition, testing, instrumentation

Answer 80

advantages - order effects are not a problem - does not require as many materials as repeated measures because different people are being studied, so you can reuse materials - disadvantages - individual differences

Answer 81

measures and describes a relationship between two variables

Answer 82

definitional: calculate the mean for X and Y, find the deviation scores (X - MX) and (Y - MY), multiply each pair of deviation scores, and add up these products: ∑(X - MX)(Y - MY) - computational: take ∑XY minus (∑X)(∑Y)/n - all together... ∑XY - (∑X)(∑Y)/n

Answer 83

spearman uses ranks; used when one or both variables are ordinal - d = difference in rank scores - tied scores? - list scores from smallest to largest - assign ranks - if tied, compute the mean of their ranked positions and assign this value as the final rank for each tied score

Answer 84

line of best fit y = mx + b - m = slope of the line - b = y-intercept

Answer 85

approach in regression to find the approximate solution of overdetermined systems (sets of equations with more equations than unknowns)

Answer 86

all you need is the slope and the y-intercept to create a line of best fit - y = bx + a - b = SP/SSx
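A sketch of fitting y = bx + a with b = SP/SSx, where SP is computed as ∑XY - (∑X)(∑Y)/n. The intercept formula a = MY - b·MX is not on the card; it is the standard least-squares intercept and is added here as an assumption. The data and function name are made up.

```python
def fit_line(xs, ys):
    """Return (slope b, intercept a) of the least-squares line y = b*x + a."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sp = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n   # sum of products
    ss_x = sum(x * x for x in xs) - sum_x ** 2 / n                # SS for X
    b = sp / ss_x
    a = sum_y / n - b * sum_x / n     # assumed standard form: a = MY - b * MX
    return b, a

xs = [1, 2, 3, 4, 5]   # made-up data
ys = [2, 4, 5, 4, 5]
b, a = fit_line(xs, ys)
print(round(b, 2), round(a, 2))
```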

Answer 87

used to evaluate the difference between two or more sample means by comparing variances - ANOVA is used because multiple t-tests -> more error - compares between-tx variance with within-tx variance - advantage: performs all tests with one hypothesis and one alpha, avoids the problem of inflated experiment-wise alpha - hypotheses: null = all means are equal, alternative = there is at least one mean difference among the populations

Answer 88

number of independent variables - between subjects = different subjects used for different levels of the factor - within subjects = same subjects used for the different levels of the factor

Answer 89

number of conditions

Answer 90

measures differences caused by - systematic tx effects - random, unsystematic factors

Answer 91

measures differences caused by - random, unsystematic factors

Answer 92

post hoc tests are used when significant results are found and when additional exploration of the differences among means is needed - provide specific info on which means are significantly different from each other

Answer 93

r² = SSb/SStotal - this is the percentage of variance accounted for by the treatment
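A sketch of a one-way, between-subjects ANOVA on made-up groups, ending with r² = SSb/SStotal. The F-ratio is written in its standard form, between-treatments variance divided by within-treatments variance. The cards do not spell that formula out, so treat it as an assumption. The function name is hypothetical.

```python
def one_way_anova(groups):
    """Return (F, r_squared) for a one-way between-subjects design."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    r_squared = ss_between / (ss_between + ss_within)   # SSb / SStotal
    return f_ratio, r_squared

groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # made-up treatment conditions
F, r2 = one_way_anova(groups)
print(F, r2)
```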

Answer 94

determines association between 2 categorical variables - used when scores violate assumptions of a parametric test -> not normally distributed -> unequal variances -> unusually high variance - undetermined or infinite scores - this test determines how well the obtained sample proportions fit the population proportions specified by the null hypothesis, e.g. relationship between personality and color preference

Answer 95

hypotheses - H0: equal proportions or no difference from a known population (example: men 50%, women 50%) - H1: unequal proportions or a difference from a known population - fo = observed frequency - represents real individuals - always whole numbers - fe = expected frequency (proportion times n) - predicted from the proportions in the null hypothesis and the sample size - defines an ideal, hypothetical sample distribution that would be obtained if the sample proportions were in perfect agreement with the proportions specified in the null - chi-square stat: df = C - 1 (C = # of categories) - use the table to determine if the stat is in the critical region

Answer 96

small - small value for chi-square - conclude there is a good fit between data and hypothesis - fail to reject null - large - large chi-square - reject the null - want a large value for chi-square!
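A sketch of the goodness-of-fit computation using the men 50% / women 50% null. The observed counts and the function name are made up. The statistic is the usual sum of (fo - fe)²/fe, with fe = proportion times n and df = C - 1.

```python
def chi_square_goodness_of_fit(observed, null_proportions):
    """Return (chi-square, df) comparing observed counts with the null proportions."""
    n = sum(observed)
    expected = [p * n for p in null_proportions]          # fe = proportion * n
    chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    df = len(observed) - 1                                # C - 1
    return chi2, df

observed = [30, 20]                    # made-up counts: men, women
chi2, df = chi_square_goodness_of_fit(observed, [0.5, 0.5])
print(chi2, df)
```

With df = 1 and alpha = .05 the table value is 3.84, so a chi-square of 2.0 is a small value: good fit, fail to reject the null.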

Answer 97

variables are independent when there is no consistent, predictable relationship between them - two variables independent -> frequency distribution for one variable has the same shape for each category of the second variable - if there is no relationship between 2 variables (null) -> distributions have equal proportions (null) - each individual is classified on each of the 2 variables - frequency distribution for the sample tests hypotheses about the corresponding frequency distribution for the population - H0: distributions are the same (no differences, no relationship)

Answer 98

.1 = small .3 = medium .5 = large

Answer 99

df    small    medium    large
1     .1       .3        .5
2     .07      .21       .35
3     .06      .17       .29

Answer 100

r² = t²/(t² + df)

(129 cards)