5.Review of Descriptive statistics and hypothesis Flashcards Preview

Flashcards in 5.Review of Descriptive statistics and hypothesis Deck (117):
1

what are descriptive statistics?

statistics used simply to describe the data collected, whether it is sample or population data. They provide a screen of the data and a way to observe trends

2

what are inferential statistics?

use sample statistics to infer something about a population
to test whether a difference/relationship seen in sample data is sufficiently large to accept it may be real in the population
allows us to test hypotheses and make decisions based on sample data

3

what do equations aim to do?

achieve specific things for specific purposes

4

what are equations made up of?

subcomponents all of which do something useful for achieving that purpose

5

what do equations produce?

numbers that are meaningful with respect to that purpose

6

what is the first thing we want to do when we imagine a set of data?

have a look at its distribution and we might want to think about how to characterise that distribution numerically

7

what are the characteristics of a data set?

central tendency, variability and shape

8

what are the measures of central tendency?

mean
median
mode

9

what are the measures of variability?

sum of squares
variance
standard deviation
range
standard error

10

what is the normal distribution?

a function that represents the distribution of many random variables as a symmetrical bell-shaped graph.

11

what is modality (with regard to the shape of a distribution)?

the number of central clusters that a distribution possesses

12

what are the two types of modality?

unimodal and bimodal

13

unimodal

scores vary around one central point

14

bimodal

scores vary around two "central" points

15

what does kurtosis mean?

"Peakedness" - how tightly clustered are scores around the mean?

16

skew

the symmetry of the tails of the distribution

17

what are the characteristics of a "normal" curve?

distribution is unimodal
has moderate peakedness
and has symmetric tails.

18

what does Sigma designate?

"The sum of" - so simply add them up

19

what is the symbol for sigma?

∑

20

what does ∑x mean?

the sum of all values of x

21

what does the mean tell us?

something useful about the center of the data-set

22

what does the mean not tell us?

it doesn't tell us anything about the variability around the mean

23

what is the equation of the mean?

mean=M= (∑x)/n

24

what is a simple way we can calculate how each participant's score varies with respect to the mean?

subtract the mean from each participant's score.

X-M

25

how does subtracting the mean from each participant's score characterise the data set as a whole?

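we sum the deviations using sigma: ∑(X-M)
This will always sum to zero.
This is because we have subtracted the mean from each score that contributes to the mean, so the positive and negative deviations cancel out. The summed deviation around the mean is therefore always 0, which makes it useless as a measure of variability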
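
A minimal sketch of this point, using five made-up scores (the data are hypothetical, chosen only so the arithmetic is easy to follow):

```python
# Hypothetical scores, invented purely for illustration.
scores = [4, 8, 6, 5, 7]

n = len(scores)
mean = sum(scores) / n                    # M = (∑x) / n

deviations = [x - mean for x in scores]   # X - M for each participant
sum_of_deviations = sum(deviations)       # ∑(X - M)

print(mean)               # 6.0
print(deviations)         # [-2.0, 2.0, 0.0, -1.0, 1.0]
print(sum_of_deviations)  # 0.0
```

The negative and positive deviations cancel, which is why the next cards square each deviation before summing.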

26

what is ^2 (to the power of 2) also known as?

squared

27

what does X^2 designate?

X squared or X * X (X multiplied by itself)

28

why is using "square" handy?

because the square of negative numbers is positive

29

what is the abbreviation of sum of squared deviation?

SS

30

what is another way to say "Sum of squared deviation"

sum of squares

31

what is the equation for sum of squares or sum of squared deviations?

SS= ∑(X-M)^2

32

what does the sum of squares tell us?

it tells us something about the total variability in the data set, but does not really characterise the degree to which each participant varies around the mean

33

what is the abbreviation for variance?

SD^2

or

σ^2

34

how do we calculate the variance?

by dividing the sum of squares by the number of observations minus 1

That is:
σ^2= SS/(n-1)

35

what is the complete equation for variance?

σ^2=

(∑(X-M)^2 )
----------------
(n-1)

36

what happens when you take the square root of the variance?

we can calculate the standard deviation

37

what is the abbreviation for standard deviation?

σ or SD

38

what is the complete equation for the standard deviation?

σ = √( SS / (n-1) )
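
The chain from SS to variance to SD can be sketched as follows. The scores are hypothetical, and the cross-check against Python's `statistics` module is only there to confirm that the hand calculation uses the same n-1 divisor:

```python
import math
import statistics

# Hypothetical scores, invented purely for illustration.
scores = [4, 8, 6, 5, 7]

n = len(scores)
mean = sum(scores) / n

ss = sum((x - mean) ** 2 for x in scores)   # SS = ∑(X-M)^2
variance = ss / (n - 1)                     # SS / (n-1)
sd = math.sqrt(variance)                    # √( SS / (n-1) )

print(ss, variance, round(sd, 3))           # 10.0 2.5 1.581

# statistics.variance and statistics.stdev also divide by n-1.
assert math.isclose(variance, statistics.variance(scores))
assert math.isclose(sd, statistics.stdev(scores))
```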

39

what is the standard deviation?

the average amount of variability around the mean. This is useful because any information about the degree of variability around the mean is important

40

what is the degrees of freedom?

the number of values in the final calculations of a statistic that are free to vary

41

what is the abbreviation of degrees of freedom?

df

42

what is the initial degrees of freedom equal to?

the number of observations

43

what is the abbreviation for the number of observations?

N

44

what is the equation for degrees of freedom when testing variability?

N-1

45

why do we subtract 1 from N when calculating variability (standard deviation) using degrees of freedom?

because when calculating the SD you first have to calculate the mean. In doing so, you use up one of your degrees of freedom. Therefore the df that remains for calculating the SD is N-1

46

what does using a degree of freedom where N-1 allow?

more accurate estimate of population parameters, which is what we want to do since we want to make inferences

47

what is the usual chosen measure of central tendency?

the mean

48

what does the chosen measure of central tendency provide?

provides an estimate of the level of performance in each condition

49

what is the usual measure of variability?

standard deviation

50

what does the measure of variability tell us?

how reliable the estimate is

51

What can outliers or extreme scores do?

affect both the measures of central tendency (especially the mean) and variability

52

what is the golden rule when measuring central tendency?

a measure of central tendency without an accompanying measure of variability cannot be accurately interpreted

53

finish the sentence:

Depending on the characteristics of the distribution the measure of central tendency may...

not be a good indicator of how the subjects performed

54

what is the measure of central tendency an estimate of?

effect size

55

what is the measure of variability an estimate of?

error

56

what is the general form that most of the statistical tests can use?

Stat=
(Estimate of Effect Size) / (Estimate of Error)

This is known as the stat value

57

how do we do inferential statistics?

compare the stat value against an appropriate probability distribution

58

what can we infer if the stat value is sufficiently far from the center of the probability distribution?

that the stat value is significantly different from the mean

59

what does it mean to be significantly different or significantly far?

when p is less than the chosen significance level

60

when do we decide our significance level (the p value we will treat as significant)?

before we do our statistical test

61

what is the central tendency characteristic of a normal distribution?

mean = median = mode

62

what is the standard normal distribution?

Mean = 0
SD = 1
where every score or point on the distribution is associated with a probability of how often that score arises

63

what is the Z score?

a score expressed on the standardised normal distribution. It basically tells us how many SDs away from the mean (M) a particular score is

64

how can we calculate the z score?

if we know the mean (M) and the standard deviation (SD) of our data set, we can convert any score (X) to a Z-score simply by subtracting the mean, and then scaling (i.e. dividing) by the SD

65

what is the equation for a z score?

Z= (X-M) / SD
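
As a small sketch, the conversion is a one-line function. The score, mean and SD below are hypothetical values, not taken from the deck:

```python
def z_score(x, mean, sd):
    """Convert a raw score X to a Z score: Z = (X - M) / SD."""
    return (x - mean) / sd

# Hypothetical example: a score of 130 in a data set with M = 100, SD = 15.
z = z_score(130, 100, 15)
print(z)   # 2.0, i.e. two SDs above the mean
```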

66

how does one find the Z score?

look up the probability associated with it in the Z table at the back of any leading stats book, or use an online Z calculator

67

How do we know if a score is an outlier

if the absolute value of its Z score is greater than 3 (Z > 3 or Z < -3)

68

what do we do with a Z > 3

we would exclude these scores from further analysis
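
A rough sketch of this screening rule, assuming a cut-off of 3 SDs in either direction. The data are hypothetical, and the sample is kept reasonably large because a single score cannot reach |Z| > 3 in a very small sample:

```python
import statistics

# Hypothetical scores with one extreme value (40), invented for illustration.
scores = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10,
          10, 11, 9, 10, 12, 8, 10, 11, 9, 40]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)          # sample SD, n-1 divisor

CUTOFF = 3                             # the |Z| > 3 rule from the cards above

outliers = [x for x in scores if abs((x - mean) / sd) > CUTOFF]
kept = [x for x in scores if abs((x - mean) / sd) <= CUTOFF]

print(outliers)    # [40]
print(len(kept))   # 19
```

Note that the extreme score is itself included when the mean and SD are computed, so it inflates both; that is one reason the rule is only a screening heuristic.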

69

what is an appropriate estimate of effect size?

the difference between an individual's score and the mean of the distribution of the group of (individual's) scores

This is appropriate because we are treating the group as the population of interest

70

what is an appropriate estimate of error?

the SD of the distribution of the group of (individual) scores

this is appropriate because we are essentially treating the group as the population of interest

71

what is finding a z score a case of?

hypothesis testing

72

what is the Z test asking?

does this particular individual belong to or differ from a particular population (of which we know the mean and SD).
more generally, we are asking questions about a group of people, where the population mean and the SD may be unknown

73

what do we need to do if we want to test the mean of a group of people's scores?

This is normally the case in an experiment

we need to compare this against a distribution of group mean scores

74

what is another way to say a distribution of group mean scores?

a distribution of means

75

the larger the set of means...

the smaller the variability

76

what is the comparison between a distribution of sampling means and a distribution of any given sample?

it has much lower variability. Its standard deviation is smaller than that of any single sample by a factor equal to the square root of the number of observations in each sample
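
A small simulation can illustrate this. It is only a sketch: the population parameters, sample size and number of samples are all assumptions chosen for the demonstration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 100, 15   # hypothetical population parameters
N = 25                       # observations per sample
NUM_SAMPLES = 5000           # how many sample means to collect

sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_means.append(statistics.mean(sample))

sd_of_means = statistics.stdev(sample_means)   # spread of the distribution of means
predicted = POP_SD / math.sqrt(N)              # 15 / 5 = 3.0

print(round(sd_of_means, 2), predicted)
```

The spread of the simulated means should come out close to 3, far smaller than the SD of 15 for individual scores.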

77

what should we do if we want to test a sample mean?

compare it to a distribution of sample means. But we do not need a whole bunch of sample means to form a distribution to test our particular sample mean of interest against

78

what does the behaviour of a normal distribution allow us to do?

to make an estimate of the variance (error) of the distribution of sample means

79

What is the standard error of the mean?

is the sample SD divided by the square root of the number of observations in the sample

80

what is the abbreviation of the standard error of the mean?

S_M (S with a subscript M)

81

what is the equation of the standard error of the mean?

S_M= σ / √n
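
A minimal sketch of the formula, borrowing SD = 10 and n = 20 from the worked Z example in this deck. The second call (n = 80) is an added illustration showing that quadrupling the sample size halves the standard error:

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: S_M = SD / sqrt(n)."""
    return sd / math.sqrt(n)

se_20 = standard_error(10, 20)
se_80 = standard_error(10, 80)   # four times as many observations

print(round(se_20, 3))   # 2.236
print(round(se_80, 3))   # 1.118
```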

82

If we have a known population mean ( μ =100) and standard deviation (SD =10), we can determine whether a sample mean (M=104.75, n=20) is “significantly” different to the population. How can we do this?

using the Z equation.

Z= (M-μ) / S_M

= (104.75 - 100) / (10 / √20)
= 4.75 / 2.24
= 2.12

As Z = 2.12 is inside the critical region (below -1.96 or above 1.96), we can reject the null hypothesis and say there is a significant difference
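
The same calculation can be re-checked in a few lines. This is a sketch that assumes a two-tailed test at the .05 level, which is what the ±1.96 critical values imply; the variable names are illustrative:

```python
import math
from statistics import NormalDist

mu, sigma = 100, 10      # known population mean and SD (from the card)
M, n = 104.75, 20        # sample mean and sample size (from the card)

s_m = sigma / math.sqrt(n)          # standard error of the mean
z = (M - mu) / s_m                  # Z = (M - μ) / S_M

# Assumption: two-tailed test with alpha = .05, so 2.5% in each tail.
critical = NormalDist().inv_cdf(0.975)
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
reject_null = abs(z) > critical

print(round(s_m, 2), round(z, 2))   # 2.24 2.12
print(round(critical, 2))           # 1.96
print(reject_null)                  # True
```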

83

what is the z equation for determining whether a sample mean is significantly different to the population?

Z= (M-μ) / S_M

84

when would you use a one sample t-test?

sometimes not all of the population parameters are known. When the population mean is known but the SD is not, we can use a one-sample t-test

85

what is the equation for a one sample t-test?

t = (M-μ) / ( S / √n )
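
A sketch of the calculation with ten hypothetical scores and an assumed population mean of 100. The critical value 2.262 is the standard two-tailed table entry for df = 9 at the .05 level; it is hard-coded here because Python's standard library has no t distribution:

```python
import math
import statistics

# Hypothetical sample, invented for illustration; mu is an assumed known mean.
scores = [102, 98, 107, 111, 95, 104, 109, 100, 106, 108]
mu = 100

n = len(scores)
M = statistics.mean(scores)          # sample mean
S = statistics.stdev(scores)         # estimated population SD (n-1 divisor)

t = (M - mu) / (S / math.sqrt(n))    # t = (M - μ) / (S / √n)
df = n - 1

CRITICAL_T = 2.262                   # t table: df = 9, two-tailed, alpha = .05
significant = abs(t) > CRITICAL_T

print(M, round(S, 3), round(t, 3), df)   # 104 5.164 2.449 9
print(significant)                       # True
```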

86

what is S in the one sample t-test equation?

S = estimated population standard deviation

87

what can we not use when the population distribution has to be estimated?

the standard normal distribution (since one or more population parameter is unknown)

88

What is used instead of the standard normal distribution when estimating the population distribution?

a special family of distributions called t distributions

89

what are t distributions

approximations of the Z distribution, which change shape according to the size of the degrees of freedom

90

why do t distributions change shape according to the size of the degrees of freedom?

because the larger our sample, the more accurate our sample statistics estimate the population parameters

91

what is the degrees of freedom?

the number of observations (N) minus the number of estimates made (e.g. the mean)

92

what do we need when using a table of the t distribution?

we need to know the df and to specify whether we want a one-tailed or a two-tailed test

93

what does a larger degrees of freedom do to a t distribution?

makes the distribution taller, and when the df is smaller makes it flatter

94

how do t distributions and normal distributions differ?

they are similar, but the t distribution is slightly different for each sample size. It gets closer to normal as the sample size increases.

95

why is there more error involved in a t distribution?

because we have estimated the population variance, so slightly more of the distribution lies in the tails

96

what does a smaller sample size mean for a t distribution?

the smaller the df, the larger the critical t value that must be exceeded

97

what are the types of t tests

single sample t test
repeated measures t test

98

what is the equation for a single sample t test?

t= (M-μ) / S_M

99

what is the equation for a repeated measures t test?

same as single sample (t= (M-μ) / S_M ) but calculated from difference scores not raw scores.

Remember that μ = 0 under H0 (no difference)
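
A sketch of the repeated measures case, with hypothetical before and after scores for six participants. The only change from the single sample test is that the difference scores are analysed and μ is set to 0. The critical value 2.571 is the standard two-tailed table entry for df = 5 at the .05 level:

```python
import math
import statistics

# Hypothetical paired scores (same six participants measured twice).
before = [10, 12, 9, 14, 11, 13]
after = [12, 15, 9, 17, 12, 16]

diffs = [a - b for a, b in zip(after, before)]   # difference scores
n = len(diffs)

M_diff = statistics.mean(diffs)                  # mean difference
S_M = statistics.stdev(diffs) / math.sqrt(n)     # standard error of the mean difference

mu = 0                                           # no difference under H0
t = (M_diff - mu) / S_M
df = n - 1

CRITICAL_T = 2.571                               # t table: df = 5, two-tailed, alpha = .05

print(diffs)                   # [2, 3, 0, 3, 1, 3]
print(round(t, 3), df)         # 3.873 5
print(abs(t) > CRITICAL_T)     # True
```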

100

when dealing with difference between means, what is something we need

a corresponding distribution and error term

101

what is the equation of independent groups design t test?

t = (M_1 - M_2) / S_Diff

102

what is the equation for S_Diff?

S_Diff = √ (S^2_M1 + S^2_M2)
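
A sketch of the independent groups calculation with two small hypothetical groups. S^2_M for each group is taken to be that group's variance divided by its n, i.e. the squared standard error of its mean:

```python
import math
import statistics

# Hypothetical scores for two independent groups, invented for illustration.
group_1 = [12, 15, 11, 14, 13]
group_2 = [9, 11, 8, 10, 12]

M_1 = statistics.mean(group_1)
M_2 = statistics.mean(group_2)

# Squared standard error of each mean: variance / n.
s2_m1 = statistics.variance(group_1) / len(group_1)
s2_m2 = statistics.variance(group_2) / len(group_2)

s_diff = math.sqrt(s2_m1 + s2_m2)    # S_Diff = √(S^2_M1 + S^2_M2)
t = (M_1 - M_2) / s_diff             # t = (M_1 - M_2) / S_Diff

print(M_1, M_2, s_diff, t)           # 13 10 1.0 3.0
```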

103

how do you calculate effect size for a z distribution?

individual score - sample mean?

104

what is the error of a z distribution?

sample standard deviation

105

what is the general statistic form?

Stat = (Estimate of effect size) / (Estimate of error)

106

what is the equation for testing an individual against a sample?

Z=(X-M)/SD

107

what is the equation for testing a sample against a known population (where the population SD is unknown)

t= (M-μ) /(S/√n)

108

what is the equation for testing a sample where population parameters are known?

Z= (M-μ) / (σ/√n)

109

what is the equation for testing two samples against each other?

t= (M_1-M_2) / S_Diff

110

what is the question of error and statistical significance?

Is the difference we see sufficiently large given the amount of associated error? Is it more likely to be an effect of the IV or just sampling error?

111

what are the three assumptions to be made when making a statistical test?

1. all observations are independent
2. Distributions are normally distributed
3. Variance of one group is not too much larger than the other

112

what is the assumption that all observations are independent?

usually a methodological question. Ensure no one person’s performance is affected by or affects someone else's

113

what is the assumption that distributions are normal?

o check histograms of both groups + skewness & kurtosis
o if samples N>30 then sample distribution less important as theoretical distribution of the difference between the means will be normal
o Homogeneity of variance

114

what is the assumption that variance of one group is not too much larger than the other

o If checking manually: a largest variance more than 4 times the smallest variance is problematic
o SPSS checks this automatically using Levene’s test
o Breaches to homogeneity assumption can inflate Type 1 Error
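
The manual check in the first point can be sketched as a simple ratio. The function name and the example groups are hypothetical, and this is not a substitute for Levene's test:

```python
import statistics

def variance_ratio(group_a, group_b):
    """Return the larger sample variance divided by the smaller one."""
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    return max(var_a, var_b) / min(var_a, var_b)

# Hypothetical groups, invented for illustration.
narrow = [1, 2, 3, 4, 5]       # variance 2.5
wide = [0, 3, 6, 9, 12]        # variance 22.5

ratio = variance_ratio(narrow, wide)
problematic = ratio > 4        # rule of thumb from the card above

print(ratio, problematic)      # 9.0 True
```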

115

what does a statistically significant result not prove?

IV caused DV

116

what does causation and interpretation of results depend heavily on?

the nature and integrity of the research design

117

what does statistical significance indicate?

that the result seen is highly unlikely to have happened by chance alone.