Flashcards in 5.Review of Descriptive statistics and hypothesis Deck (117):

1

## what are descriptive statistics?

### statistics used simply to describe the data collected, whether sample or population data. They provide a screen of the data and an observation of trends

2

## what are inferential statistics?

###
use sample statistics to infer something about a population

to test whether a difference/relationship seen in sample data is sufficiently large to accept it may be real in the population

allows us to test hypotheses and make decisions based on sample data

3

## what do equations aim to do?

### achieve specific things for specific purposes

4

## what are equations made up of?

### subcomponents all of which do something useful for achieving that purpose

5

## what do equations produce?

### numbers that are meaningful with respect to that purpose

6

## what is the first thing we want to do when we imagine a set of data?

### have a look at its distribution and we might want to think about how to characterise that distribution numerically

7

## what are the characteristics of a data set?

### central tendency, variability and shape

8

## what is central tendency

###
mean

median

mode

9

## what is variability?

###
sum of squares

variance

standard deviation

range

standard error

10

## what is the normal distribution?

### a function that represents the distribution of many random variables as a symmetrical bell-shaped graph.

11

## what is modality (with regard to the shape of a distribution)?

### the number of central clusters that a distribution possesses

12

## what are the two types of modality?

### unimodal and bimodal

13

## unimodal

### scores vary around one central point

14

## bimodal

### scores vary around two "central" points

15

## what does kurtosis mean?

### "Peakedness" - how tightly clustered are scores around the mean?

16

## skew

### the symmetry of the tails of the distribution

17

## what are the characteristics of a "normal" curve?

###
distribution is unimodal

has moderate peakedness

and has symmetric tails.

18

## what does Sigma designate?

### "The sum of" - so simply add them up

19

## what is the symbol for sigma?

### ∑

20

## what does ∑x mean?

### the sum of all values of x

21

## what does the mean tell us?

### something useful about the center of the data-set

22

## what does the mean not tell us?

### it doesn't tell us anything about the variability around the mean

23

## what is the equation of the mean?

### mean=M= (∑x)/n

24

## what is a simple way we can calculate how each participant's score varies with respect to the mean?

###
subtract the mean from each participant's score.

X-M

25

## how does subtracting the mean from each participant's score characterise the data set as a whole?

###
when using sigma

thus ∑(X-M)

This will always sum to zero.

This is because we have subtracted the mean from each score that contributes to the mean. The positive and negative deviations cancel exactly, so the summed deviation around the mean is always 0 and cannot describe variability on its own
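
The cancelling of deviations can be checked numerically. The sketch below is only an illustration: the `scores` list is made up, and the helper names `mean` and `deviations` are hypothetical, not taken from the deck.

```python
# Hypothetical scores, chosen only for illustration.
scores = [4, 8, 6, 5, 7]


def mean(values):
    """M = (sum of x) / n"""
    return sum(values) / len(values)


def deviations(values):
    """X - M for every score."""
    m = mean(values)
    return [x - m for x in values]


m = mean(scores)
devs = deviations(scores)

# The negative and positive deviations cancel, so the total is zero.
print(m, devs, sum(devs))
```

Because the total is always zero, the deviations have to be squared before they are summed if the result is to say anything about variability.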

26

## what is ^2 (to the power of 2) also known as?

### squared

27

## what does X^2 designate?

### X squared or X * X (X multiplied by itself)

28

## why is using "square" handy?

### because the square of negative numbers is positive

29

## what is the abbreviation of sum of squared deviation?

### SS

30

## what is another way to say "Sum of squared deviation"

### sum of squares

31

## what is the equation for sum of squares or sum of squared deviations?

### SS= ∑(X-M)^2

32

## what does the sum of squared tell us?

### it tells us something about the total variability in the data set, but does not really characterise the degree to which each participant varies around the mean

33

## what is the abbreviation for variance?

###
SD^2

or

σ^2

34

## how do we calculate the variance?

###
by dividing the sum of squares by the number of observations minus 1

That is:

σ^2= SS/(n-1)

35

## what is the complete equation for variance?

###
σ^2 = ∑(X-M)^2 / (n-1)

36

## what happens when you take the square root of the variance?

### we can calculate the standard deviation

37

## what is the abbreviation for standard deviation?

### σ or SD

38

## what is the complete equation for the standard deviation?

### σ = √( SS / (n-1) )
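
The chain from sum of squares to variance to standard deviation can be sketched in a few lines. The data and function names below are assumptions made for illustration; Python's `statistics` module is used only as a cross-check, since its `variance` and `stdev` also divide by n - 1.

```python
import math
import statistics

# Hypothetical scores, chosen only for illustration.
scores = [4, 8, 6, 5, 7]


def sum_of_squares(values):
    """SS = sum of (X - M)^2"""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)


def variance(values):
    """Variance = SS / (n - 1)"""
    return sum_of_squares(values) / (len(values) - 1)


def standard_deviation(values):
    """SD = square root of the variance"""
    return math.sqrt(variance(values))


print(sum_of_squares(scores), variance(scores), standard_deviation(scores))

# Cross-check against the standard library (also uses n - 1).
print(statistics.variance(scores), statistics.stdev(scores))
```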

39

## what is the standard deviation?

### the average amount of variability around the mean. This is useful as any information about the degree of variability around the mean is important

40

## what is the degrees of freedom?

### the number of values in the final calculations of a statistic that are free to vary

41

## what is the abbreviation of degrees of freedom?

### df

42

## what is the initial degrees of freedom equal to?

### the number of observations

43

## what is the abbreviation for the number of observations?

### N

44

## what is the equation for degrees of freedom when testing variability?

### N-1

45

## why do we subtract 1 from N when calculating variability (standard deviation) using degrees of freedom?

### because when calculating the SD you first have to calculate the mean. In doing so, you use up one of your degrees of freedom. Therefore the df that remains for calculating the SD is N-1

46

## what does using a degree of freedom where N-1 allow?

### more accurate estimate of population parameters, which is what we want to do since we want to make inferences

47

## what is the usual chosen measure of central tendency?

### the mean

48

## what does the chosen measure of central tendency provide?

### provides an estimate of the level of performance in each condition

49

## what is the usual measure of variability?

### standard deviation

50

## what does the measure of variability tell us?

### how reliable the estimate is

51

## What can outliers or extreme scores do?

### affect both the measures of central tendency (especially the mean) and variability

52

## what is the golden rule when measuring central tendency?

### a measure of central tendency without an accompanying measure of variability cannot be accurately interpreted

53

##
finish the sentence:

Depending on the characteristics of the distribution the measure of central tendency may...

### not be a good indicator of how the subjects performed

54

## what is the measure of central tendency an estimate of?

### effect size

55

## what is the measure of variability an estimate of?

### error

56

## what is the general form that most of the statistical tests can use?

###
Stat=

(Estimate of Effect Size) / (Estimate of Error)

This is known as the stat value

57

## how do we do inferential statistics?

### compare the stat value against an appropriate probability distribution

58

## what can we infer if the stat value is sufficiently far from the center of the probability distribution?

### that the stat value is significantly different from the mean

59

## what does it mean to be significantly different or significantly far?

### when p

60

## when do we decide our p value (or significant level)

### before we do our statistical test

61

## what is the central tendency characteristic of a normal distribution?

### mean = median = mode

62

## what is the standard normal distribution?

###
Mean = 0

SD = 1

where every score or point on the distribution is associated with a probability of how often that score arises

63

## what is the Z score?

### a score expressed in standard deviation units on the standardised normal distribution. It tells us how many SDs away from the mean (M) a particular score is

64

## how can we calculate the z score?

### if we know the mean (M) and the standard deviation (SD) of our data set, we can convert any score (X) to a Z-score simply by subtracting the mean and then scaling (i.e. dividing) by the SD

65

## what is the equation for a z score?

### Z= (X-M) / SD
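
As a minimal sketch of the formula, the function below converts one score to a Z-score. The numbers (a score of 130 in a data set with M = 100 and SD = 15) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def z_score(x, m, sd):
    """Z = (X - M) / SD: how many SDs the score x lies from the mean m."""
    return (x - m) / sd


# Hypothetical values: a score of 130 where M = 100 and SD = 15.
z = z_score(130, 100, 15)
print(z)  # 30 points above the mean is two SDs of 15
```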

66

## how does one find the Z score?

### at the back of any leading stats book or using an online Z calculator

67

## How do we know if a score is an outlier

### if the Z score is > 3

68

## what do we do with a Z > 3

### we would exclude these scores from further analysis
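
A screening step along these lines might look like the sketch below. Two things are assumptions rather than part of the card: the data are invented, and the check uses the absolute value of Z so that extremely low scores are flagged as well as extremely high ones.

```python
import statistics


def flag_outliers(scores, cutoff=3.0):
    """Return the scores whose Z-score is more than `cutoff` SDs from the mean."""
    m = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD, divides by n - 1
    return [x for x in scores if abs((x - m) / sd) > cutoff]


# Hypothetical data: eighteen ordinary scores and one extreme score.
scores = [9, 10, 11] * 6 + [30]

outliers = flag_outliers(scores)
kept = [x for x in scores if x not in outliers]
print(outliers, len(kept))
```

Note that the extreme score is itself included when the mean and SD are calculated, so in very small samples a single outlier can inflate the SD enough to hide itself.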

69

## what is an appropriate estimate of effect size?

###
the difference between an individual's score and the mean of the distribution of the group of (individual's) scores

This is appropriate because we are treating the group as the population of interest

70

## what is an appropriate estimate of error?

###
the SD of the distribution of the group of (individual) scores

this is appropriate because we are essentially treating the group as the population of interest

71

## what is finding a z score a case of?

### hypothesis testing

72

## what is the Z test asking?

###
does this particular individual belong to or differ from a particular population (of which we know the mean and SD).

more generally, we are asking questions about a group of people, where the population mean and the SD may be unknown

73

##
what do we need to do if we want to compare the mean of a group of peoples' scores?

This is normally the case in an experiment

###
we need to compare this against a distribution of group mean scores

74

## what is another way to say a distribution of group mean scores?

### a distribution of means

75

## the larger the set of means...

### the smaller the variability

76

## what is the comparison between a distribution of sampling means and a distribution of any given sample?

### the distribution of sample means has much lower variability. Its spread is the spread of individual scores divided by the square root of the number of observations in each sample
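
A small simulation can make the reduced variability of sample means visible. This is a sketch under stated assumptions: the population values (mean 100, SD 15), the sample size of 25, the 2000 repetitions and the random seed are all arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

population_mean, population_sd, n = 100, 15, 25

# Draw many samples of size n and record each sample's mean.
sample_means = []
for _ in range(2000):
    sample = [random.gauss(population_mean, population_sd) for _ in range(n)]
    sample_means.append(statistics.mean(sample))

sd_of_means = statistics.stdev(sample_means)
expected = population_sd / n ** 0.5  # 15 / 5 = 3

# The spread of the means should be close to SD / sqrt(n),
# far smaller than the spread of individual scores (15).
print(round(sd_of_means, 2), expected)
```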

77

## what should we do if we want to test a sample mean?

### compare it to a distribution of sample means. But we do not need a whole bunch of sample means to form a distribution to test our particular sample mean of interest against

78

## what does the behaviour of a normal distribution allow us to do?

### to make an estimate of the variance (error) of the distribution of sample means

79

## What is the standard error of the mean?

### is the sample SD divided by the square root of the number of observations in the sample

80

## what is the abbreviation of the standard error of the mean?

### S_M (S with a subscript M)

81

## what is the equation of the standard error of the mean?

### S_M= σ / √n
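
The effect of sample size on the standard error can be sketched as follows. The SD of 10 and the sample sizes are hypothetical; the point is only that quadrupling n halves S_M.

```python
import math


def standard_error(sd, n):
    """S_M = SD / sqrt(n)"""
    return sd / math.sqrt(n)


# Hypothetical SD of 10 with increasing sample sizes.
for n in (4, 16, 64):
    print(n, standard_error(10, n))
```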

82

## If we have a known population mean ( μ =100) and standard deviation (SD =10), we can determine whether a sample mean (M=104.75, n=20) is “significantly” different to the population. How can we do this?

###
using the Z equation.

Z= (M-μ) / S_M

= (104.75 – 100) / (10 / √20)

= 4.75 / 2.24

= 2.12

as Z = 2.12 is inside the critical region (below -1.96 or above 1.96) we can reject the null hypothesis and say there is a significant difference
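
The worked example can be reproduced in code. This sketch uses `NormalDist` from Python's standard library (available from Python 3.8) to obtain the two-tailed probability; the function name `z_test` and the significance level of .05 are assumptions, the latter chosen because it matches the ±1.96 critical values.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, pop_mean, pop_sd, n):
    """Z = (M - mu) / S_M, with S_M = SD / sqrt(n). Returns (Z, two-tailed p)."""
    se = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Values from the card: mu = 100, SD = 10, M = 104.75, n = 20.
z, p = z_test(104.75, 100, 10, 20)

# Critical value for a two-tailed test at an assumed alpha of .05.
critical_z = NormalDist().inv_cdf(0.975)

print(round(z, 2), round(critical_z, 2), abs(z) > critical_z)
```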

83

## what is the z equation for determining whether a sample mean is significantly different to the population?

### Z= (M-μ) / S_M

84

## when would you use a one sample t-test?

### sometimes the population parameters are not all known. Where the population mean is known but the SD is not, we can use a one-sample t-test

85

## what is the equation for a one sample t-test?

### t = (M-μ) / ( S / √n )
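
A one-sample t statistic can be sketched with the standard library alone. The scores and the population mean of 100 are hypothetical. Python's standard library has no t distribution, so the critical value 2.262 (two-tailed, alpha = .05, df = 9) is taken from a standard t table rather than computed.

```python
import math
import statistics


def one_sample_t(scores, pop_mean):
    """t = (M - mu) / (S / sqrt(n)). Returns (t, df)."""
    n = len(scores)
    m = statistics.mean(scores)
    s = statistics.stdev(scores)  # estimated population SD, divides by n - 1
    se = s / math.sqrt(n)
    return (m - pop_mean) / se, n - 1


# Hypothetical sample of ten scores tested against mu = 100.
scores = [102, 108, 97, 110, 105, 99, 104, 111, 103, 101]
t, df = one_sample_t(scores, pop_mean=100)

critical_t = 2.262  # table value for df = 9, two-tailed, alpha = .05
print(round(t, 2), df, abs(t) > critical_t)
```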

86

## what is S in the one sample t-test equation?

### S = estimated population standard deviation

87

## what is not needed when estimating the population distribution?

### the standard normal distribution (since one or more population parameter is unknown)

88

## What is used instead of the standard normal distribution when estimating the population distribution?

### a special family of distributions called t distribution

89

## what are t distributions

### approximations of the Z distribution, which change shape according to the size of the degrees of freedom

90

## why do t distributions change shape according to the size of the degrees of freedom?

### because the larger our sample, the more accurate our sample statistics estimate the population parameters

91

## what is the degrees of freedom?

### the number of observations (N) minus the number of estimates made (e.g. the mean)

92

## what do we need when using a table of the t distribution?

### need to know the df and need to specify if we want a one-tailed (p

93

## what does a larger degrees of freedom do to a t distribution?

### makes the distribution taller, and when the df is smaller makes it flatter

94

## how do t distributions and normal distributions differ?

### they are similar, but the t distribution is slightly different for each sample size. It gets closer to normal as the sample size increases.

95

## why is there more error involved in a t distribution?

### because we have estimated the population variance, so slightly more of the distribution lies in the tails

96

## what does a smaller sample size of a t distribution indicate?

### the smaller the df, the larger the critical t value that must be exceeded

97

## what are the types of t tests

###
single sample t test

repeated measures t test

98

## what is the equation for a single sample t test?

### t= (M-μ) / S_M

99

## what is the equation for a repeated measures t test?

###
same as single sample (t= (M-μ) / S_M ) but calculated from difference scores not raw scores.

Remember that µ = 0 under H0 (no difference)
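
A repeated-measures t can be sketched by first reducing each participant to a single difference score and then running the single-sample calculation with µ = 0. The before and after scores below are invented for illustration, and the function name is hypothetical.

```python
import math
import statistics


def repeated_measures_t(before, after):
    """Single-sample t on the difference scores, with mu = 0. Returns (t, df)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    m_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # S_M of the difference scores
    return (m_diff - 0) / se, n - 1


# Hypothetical scores for five participants measured twice.
before = [10, 12, 9, 14, 11]
after = [12, 15, 10, 17, 12]

t, df = repeated_measures_t(before, after)
print(round(t, 2), df)
```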

100

## when dealing with difference between means, what is something we need

### a corresponding distribution and error term

101

## what is the equation of independent groups design t test?

### t = (M_1 - M_2) / S_Diff

102

## what is the equation for S_Diff?

### S_Diff = √( S^2_M1 + S^2_M2 )
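
The independent groups calculation can be sketched as below, taking each squared standard error as that group's variance divided by its n and combining them under the square root. The two groups are hypothetical and the function names are illustrative only.

```python
import math
import statistics


def squared_standard_error(scores):
    """S_M^2 = variance / n, with the variance computed using n - 1."""
    return statistics.variance(scores) / len(scores)


def independent_groups_t(group_1, group_2):
    """t = (M_1 - M_2) / S_Diff, where S_Diff = sqrt(S_M1^2 + S_M2^2)."""
    s_diff = math.sqrt(
        squared_standard_error(group_1) + squared_standard_error(group_2)
    )
    return (statistics.mean(group_1) - statistics.mean(group_2)) / s_diff


# Hypothetical scores for two separate groups of five participants.
group_1 = [4, 8, 6, 5, 7]
group_2 = [3, 5, 4, 2, 6]

t = independent_groups_t(group_1, group_2)
print(t)
```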

103

## how do you calculate effect size for a z distribution?

### individual score - sample mean?

104

## what is the error of a z distribution?

### sample standard deviation

105

## what is the general statistic form?

### Stat = (Estimate of effect size) / (Estimate of error)

106

## what is the equation for testing an individual against a sample?

###
Z=(X-M)/SD

107

## what is the equation for testing a sample against a known population (where the population SD is unknown)

### t= (M-μ) /(S/√n)

108

## what is the equation for testing a sample where population parameters are known?

### Z= (M-μ) / (σ/√n)

109

## what is the equation for testing two samples against each other?

### t= (M_1-M_2) / S_Diff

110

## what is the question of error and statistical significance?

### Is the difference we see sufficiently large given the amount of associated error? Is it more likely to be an effect of the IV or just sampling error?

111

## what are the three assumptions to be made when making a statistical test?

###
1. all observations are independent

2. Distributions are normally distributed

3. Variance of one group is not too much larger than the other

112

## what is the assumption that all observations are independent?

### usually a methodological question. Ensure no one person’s performance is affected by or affects someone else's

113

## what is the assumption that distributions are normal?

###
o check histograms of both groups + skewness & kurtosis

o if samples N>30 then sample distribution less important as theoretical distribution of the difference between the means will be normal

o Homogeneity of variance

114

## what is the assumption that variance of one group is not too much larger than the other

###
o If doing manually: a largest variance more than 4 times the smallest variance is problematic

o SPSS checks this automatically using Levene’s test

o Breaches to homogeneity assumption can inflate Type 1 Error
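
The manual rule of thumb can be sketched as a simple ratio check. This is not Levene's test, only the "largest variance more than 4 times the smallest" comparison; the groups and the function name are hypothetical.

```python
import statistics


def variance_ratio(group_1, group_2):
    """Largest sample variance divided by the smallest."""
    v1 = statistics.variance(group_1)
    v2 = statistics.variance(group_2)
    return max(v1, v2) / min(v1, v2)


# Hypothetical groups: the second is far more spread out than the first.
group_1 = [4, 8, 6, 5, 7]
group_2 = [1, 11, 6, 3, 9]

ratio = variance_ratio(group_1, group_2)
problematic = ratio > 4  # rule-of-thumb threshold from the card
print(round(ratio, 2), problematic)
```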

115

## what does a statistically significant result not prove?

### IV caused DV

116

## what does causation and interpretation of results depend heavily on?

### the nature and integrity of the research design

117