Module 14: Descriptive statistics Flashcards

1
Q

2 most common ways to summarize data

A
  1. Measure of central tendency
  2. Measure of variability
1
Q

Measure of central tendency (4)

what it is + represented by

A

A measure of the typical value in a collection of numbers or a data set
- measured by mean, median and mode

2
Q

Mean (2)

+ how to find?

A

The average
Sum of all the scores divided by the total number of scores
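
A minimal sketch of the calculation above, using made-up scores (the list `scores` is hypothetical, not from the deck):

```python
scores = [4, 8, 6, 5, 7]  # hypothetical scores

# Sum of all the scores divided by the total number of scores
mean = sum(scores) / len(scores)
print(mean)  # 6.0
```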

3
Q

Population mean

A
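
For reference, the population mean is conventionally written as below. The symbols are the standard textbook ones and are an assumption here, not taken from the card: μ is the population mean, N the population size, and x_i each score.

```latex
\mu = \frac{\sum_{i=1}^{N} x_i}{N}
```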
4
Q

Sample mean

A
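
For reference, the sample mean is conventionally written as below. The symbols are the standard textbook ones and are an assumption here, not taken from the card: x̄ is the sample mean, n the sample size, and x_i each score.

```latex
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}
```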
5
Q

Median (2)

How to find?

A

The value that lies in the middle of the data when the data set is ordered
- First rank the data; the position of the median is then (number of entries + 1) divided by 2

6
Q

Odd number of entries when calculating median:

A

median is the middle data entry

7
Q

Even number of entries when calculating median:

A

Median is the mean of the 2 middle data entries
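
The odd and even cases can be sketched together as follows. This is an illustrative helper (the function name and the sample lists are hypothetical) and it assumes a non-empty list of numbers:

```python
def median(values):
    ordered = sorted(values)   # first rank the data
    n = len(ordered)
    position = (n + 1) / 2     # 1-based position of the median
    if n % 2 == 1:
        # Odd number of entries: the middle entry
        return ordered[int(position) - 1]
    # Even number of entries: mean of the two middle entries
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


odd = median([7, 1, 5])      # ranked: 1, 5, 7
even = median([7, 1, 5, 3])  # ranked: 1, 3, 5, 7
print(odd, even)             # 5 4.0
```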

8
Q

Mode

A

The most frequent value

9
Q

If no entry is repeated, then the data set has no

A

mode

10
Q

If two entries occur with the same greatest frequency, each entry is a ____ and the data set is called ____

A
  • mode
  • bimodal
11
Q

Finding the mode

A

finding the greatest frequency
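
The mode rules from the last few cards can be sketched as below. The helper name `modes` and the sample data are hypothetical; returning an empty list for "no mode" is a choice made for this sketch.

```python
from collections import Counter


def modes(values):
    counts = Counter(values)      # frequency of each entry
    top = max(counts.values())    # the greatest frequency
    if top == 1:
        return []                 # no entry is repeated: no mode
    return [v for v, c in counts.items() if c == top]


print(modes([2, 3, 3, 5]))     # [3]
print(modes([1, 1, 2, 2, 3]))  # [1, 2]  (bimodal)
print(modes([1, 2, 3]))        # []
```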

12
Q

Advantage of using the mean (2)

A
  • most common statistic
  • Takes into account every entry of a data set
13
Q

Disadvantage of using the mean (2)

A
  • greatly affected by extreme scores (outliers)
  • Knowledge about individual cases is lost with averages
14
Q

Advantages of using the median (2)

A
  • Little influence by extreme scores
  • Reasonable estimate of what most people mean by the center of a distribution
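
A small illustration of how one extreme score moves the mean far more than the median. The income figures are invented for the example:

```python
from statistics import mean, median

incomes = [30, 32, 35, 38, 40]   # hypothetical values, in thousands
with_outlier = incomes + [500]   # add one extreme score

print(mean(incomes), median(incomes))            # 35 35
print(mean(with_outlier), median(with_outlier))  # 112.5 36.5
```

The mean jumps from 35 to 112.5, while the median barely moves.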
15
Q

Disadvantage of using the median

A
  • may not be good to ignore extreme values
16
Q

Advantages of using the mode (2)

A
  • the most frequently obtained score
  • not influenced by extreme scores
17
Q

Disadvantages of using the mode (2)

A
  • may not represent a large proportion of the scores
  • ignores extreme values
18
Q

Variability

A

numbers which describe how spread out a set of data is

19
Q

Examples of variability measures (4)

A
  • range (interquartile range)
  • deviation
  • variance
  • standard deviation
20
Q

Range+ formula (2)

A

length of the smallest interval that contains all the data

range = largest value - smallest value
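
As a quick sketch with made-up scores (the list is hypothetical):

```python
scores = [12, 7, 3, 15, 9]  # hypothetical scores

# Largest value minus smallest value
data_range = max(scores) - min(scores)
print(data_range)  # 12
```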

21
Q

range is sensitive to

A
  • sample size: small samples tend to give a smaller, less representative range
  • extreme scores (tells you smallest and largest but not bulk)
22
Q

Interquartile range (2)

+ formula

A

Measure of distance between first and third quartiles
- IQR = Q3 - Q1

23
Q

second quartile is the

A

median

24
Q

benefits of IQR (2)

A
  • less affected by extreme values
  • helpful for identifying outliers
25
Q

Quartile (2)

What it is+median

A
  • positions in a range of values representing multiples of 25%
  • 50% of scores fall below median, 50% scores above
26
Q

First quartile (Q1)

A

25% of scores fall below Q1, 75% above

27
Q

Third quartile (Q3)

A

75% of scores fall below Q3, 25% above
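
The quartiles and the IQR can be sketched with Python's standard library. Several quartile conventions exist; this example assumes the `"exclusive"` method of `statistics.quantiles`, which uses (n + 1)-based positions. The data are hypothetical.

```python
from statistics import quantiles

scores = [1, 3, 5, 7, 9, 11, 13]  # hypothetical data, 7 entries

# n=4 splits the data into quartiles; Q2 is the median
q1, q2, q3 = quantiles(scores, n=4, method="exclusive")
iqr = q3 - q1  # IQR = Q3 - Q1

print(q1, q2, q3)  # 3.0 7.0 11.0
print(iqr)         # 8.0
```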

28
Q

deviation

A

The difference between each score and the mean of the data set

How far you are from the mean

29
Q

deviation formula

A

deviation of x_i = x_i - μ (each score x_i minus the mean μ)

30
Q

deviation scores always sum to

A

0
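
A short check of the deviation formula and the sum-to-zero property, using made-up scores:

```python
scores = [2, 4, 6, 8]  # hypothetical scores

mu = sum(scores) / len(scores)         # mean: 5.0
deviations = [x - mu for x in scores]  # each score minus the mean

print(deviations)       # [-3.0, -1.0, 1.0, 3.0]
print(sum(deviations))  # 0.0
```

With less tidy data, floating-point rounding can leave a sum that is extremely close to, but not exactly, zero.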

31
Q

Difference between deviation and IQR/boxplots

A

Deviation scores show dispersion around the mean, IQR and boxplot show dispersion around the median

32
Q

Variance

A

single number representing the average amount of variation in a set of scores/ how spread out the scores are

33
Q

Steps for finding the sample variance (5)

A
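
One common textbook procedure for the sample variance is sketched below. The five steps in the comments are an assumption about what the card intends, and the scores are hypothetical. The result is cross-checked against `statistics.variance`.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample

mean = sum(scores) / len(scores)            # 1. find the mean
deviations = [x - mean for x in scores]     # 2. subtract the mean from each score
squared = [d ** 2 for d in deviations]      # 3. square each deviation
sum_of_squares = sum(squared)               # 4. add up the squared deviations
sample_variance = sum_of_squares / (len(scores) - 1)  # 5. divide by n - 1

print(sum_of_squares)  # 32.0
print(math.isclose(sample_variance, statistics.variance(scores)))  # True
```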
34
Q

Standard deviation

A

Measure of the spread of scores out from the mean of the sample

35
Q

How to calculate standard deviation

A
  1. calculate the variance
  2. find the square root
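
The two steps can be sketched as follows. The scores are hypothetical, and the population form of the variance (dividing by N) is used here to keep the numbers round; dividing by n - 1 instead would give the sample standard deviation.

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

mean = sum(scores) / len(scores)

# 1. calculate the variance (population form: divide by N)
variance = sum((x - mean) ** 2 for x in scores) / len(scores)

# 2. find the square root
std_dev = math.sqrt(variance)

print(variance, std_dev)  # 4.0 2.0
```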
36
Q

Population standard deviation formula

A
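
For reference, the population standard deviation is conventionally written as below. The symbols are the standard textbook ones and are an assumption here, not taken from the card: σ is the population standard deviation, μ the population mean, N the population size.

```latex
\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}
```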
37
Q

Standard deviation is a measure of the typical amount an entry deviates from the mean, thus the more entries are spread out, the

A

greater the standard deviation

38
Q

Descriptive statistics (2)

A
  • cannot make predictions or generalizations
  • only drawing conclusions about current sample and not extrapolating or going beyond
39
Q

inferential statistics (2)

A
  • can make predictions or generalizations
  • allow conclusions about the population based on data from a sample
40
Q

Data matrices

A

a table or worksheet that organizes the data together with all the variables of interest

41
Q

Frequency distributions

A

A table indicating the frequency of each value in a data set
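
A frequency distribution can be sketched with `collections.Counter`. The scores are hypothetical:

```python
from collections import Counter

scores = [3, 1, 2, 3, 3, 2]  # hypothetical scores

freq = Counter(scores)  # maps each value to how often it occurs

# Print the table in order of value
for value in sorted(freq):
    print(value, freq[value])
# 1 1
# 2 2
# 3 3
```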

42
Q

Histogram (3)

What it is+ illustrates+can help identify

A
  • A graphical representation of the frequency of a variable
  • illustrates the distribution of scores
  • can help identify outliers or violations of normal distribution assumptions
43
Q
A

symmetrical

44
Q
A

Negative skew or left skew

45
Q
A

Positive skew/right skew

46
Q

Central tendency

A

helps identify the typical or most common value in data

47
Q

Measures of central tendency

A

Mean
median
mode

48
Q

measure of central tendency for symmetrical distribution/ skewed

A
49
Q

If the average is 100 and the standard deviation is 10, then there is

A

about 2/3 (roughly 68%) of the data falling between 90 and 110, assuming the data are approximately normally distributed
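
The "about two thirds" figure can be checked under the assumption that the scores follow a normal distribution. The flashcard does not state this assumption explicitly; it is added here for the sketch.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=10)  # mean 100, standard deviation 10

# Share of the distribution within one standard deviation of the mean
share = dist.cdf(110) - dist.cdf(90)
print(round(share, 4))  # 0.6827
```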

50
Q

For data that is skewed or has outliers, ____ may be a better choice to describe the centre of the distribution

A

median

51
Q

Q position

A

Q position = [(Q#)(n + 1)] / 4

Q# = number of the quartile you're trying to find
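
The position formula can be sketched as below. The function name and data are hypothetical, and the data set is chosen so that every position is a whole number; fractional positions need a rounding or interpolation rule that this sketch does not cover.

```python
def quartile_position(quartile_number, n):
    """1-based position of a quartile in the ranked data."""
    return quartile_number * (n + 1) / 4


ranked = sorted([13, 1, 9, 5, 11, 3, 7])  # hypothetical data, 7 entries
positions = [quartile_position(q, len(ranked)) for q in (1, 2, 3)]
values = [ranked[int(p) - 1] for p in positions]  # look up each position

print(positions)  # [2.0, 4.0, 6.0]
print(values)     # [3, 7, 11]
```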

52
Q

Round Q position to

A

the median

53
Q

How to find outliers with IQR

A
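
One widely used convention, assumed here rather than taken from the card, flags any value more than 1.5 × IQR below Q1 or above Q3. The function name, the multiplier `k`, and the data are hypothetical choices for this sketch.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    # Quartiles by the (n + 1)-based "exclusive" method
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    low = q1 - k * iqr   # lower fence
    high = q3 + k * iqr  # upper fence
    return [v for v in values if v < low or v > high]


data = [1, 3, 5, 7, 9, 11, 40]  # hypothetical data
print(iqr_outliers(data))       # [40]
```

Here Q1 = 3 and Q3 = 11, so the IQR is 8 and the fences are -9 and 23; only 40 falls outside them.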
54
Q

Scatterplots

A
  • visualize the form, direction and strength of 2 variable relationships
55
Q

correlation coefficients

A

indicate the degree of covariance between variables: how much one variable changes in relation to another

56
Q

Data points that are more closely positioned around the best fit line represent

A

a stronger relationship than when data points are farther from the line
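
To illustrate, the sketch below computes Pearson's r, one common correlation coefficient (the deck does not say which coefficient it means), for points that sit exactly on a line and for points scattered around it. The function name and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables vary together
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)


xs = [1, 2, 3, 4]
tight = pearson_r(xs, [2, 4, 6, 8])  # points exactly on a line
loose = pearson_r(xs, [2, 8, 3, 9])  # points scattered around a line

print(round(tight, 3), round(loose, 3))  # 1.0 0.588
```

The tightly clustered points give an r at the maximum of 1, while the scattered points give a noticeably smaller value.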