Describing Data with Numbers Flashcards Preview

Flashcards in Describing Data with Numbers Deck (57):
1

What are quartiles?

Quartiles are the values that divide the ordered data into 4 equal parts, each containing the same number of observations.

2

What units do you use for standard deviation?

The same units as the data.

3

What is the formula used to calculate the coefficient of variation?

CV = (s/x̄)*100
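
As a minimal sketch of the formula above, the following computes the CV for two made-up samples measured in different units. The function name `coefficient_of_variation` and the data are illustrative assumptions, and s is taken to be the sample standard deviation (n - 1 denominator), which the deck does not specify.

```python
import statistics

def coefficient_of_variation(values):
    """Return CV = (s / x̄) * 100 as a percentage."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample SD (n - 1 denominator); an assumption
    return s / mean * 100

# Made-up data in different units
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)
print(f"Height CV: {cv_height:.1f}%  Weight CV: {cv_weight:.1f}%")
```

Both samples have the same standard deviation, but the weights vary more relative to their mean, so their CV is larger.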

4

What is Q3?

The median of the top half of the ordered results (the third quartile).

5

What are three statistical tools?

Graphs
Measures of centre
Measures of dispersion

6

What does the five-number summary consist of?

Minimum
Q1 value
Median
Q3 value
Maximum
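
A short sketch of how the five numbers could be computed. It assumes the "middle of each half" definition of Q1 and Q3 used elsewhere in this deck, and it leaves the median out of both halves when n is odd, which is one common convention among several. The function name `five_number_summary` is illustrative.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # bottom half (median excluded when n is odd)
    upper = data[half + n % 2:]   # top half (median excluded when n is odd)
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )

summary = five_number_summary([7, 1, 3, 9, 5, 11, 13])
print(summary)
```

Software packages often use slightly different quartile rules, so their Q1 and Q3 may not match this sketch exactly.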

7

When using the coefficient of variation, what is the variability relative to?

The typical value.

8

What is Σx?

The sum of all the observed values of x.

9

What type of data are bar graphs associated with?

Categorical data

10

What is Exploratory Data Analysis?

The process of using statistical tools to investigate data sets in order to understand their important characteristics

11

What are the two main measures of centrality?

Mean
Median

12

What are outliers?

Sample values that lie far away from the vast majority of other sample values.

13

What does the standard deviation measure?

The typical deviation of observations from the mean (the square root of the average squared deviation).
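
The calculation can be sketched step by step as below. The deck does not say whether to divide by n or n - 1; this example assumes the sample version (n - 1), and the function name `sample_sd` and the data are illustrative.

```python
import math

def sample_sd(values):
    """Sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1))."""
    n = len(values)
    mean = sum(values) / n
    squared_devs = [(x - mean) ** 2 for x in values]  # squared distance from the mean
    return math.sqrt(sum(squared_devs) / (n - 1))

sd = sample_sd([2, 4, 4, 4, 5, 5, 7, 9])
print(round(sd, 3))
```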

14

What is the most common weakness with calculating a range?

The range fails to capture the main features of a data set. It only reflects the extremes.

15

What is the calculation for the position of the median when the number (n) of observations is even?

Median = average of the (n/2)th and ((n/2)+1)th observations

16

What is a less-commonly considered weakness of calculating the range as a degree of spread?

Two completely different data sets can have the same range.
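
A small illustration with made-up data: the two samples below have the same range but clearly different spread, which the standard deviation picks up. The names `value_range`, `clustered` and `spread_out` are illustrative.

```python
import statistics

def value_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)

clustered = [0, 50, 50, 50, 50, 50, 100]    # most values sit at the centre
spread_out = [0, 10, 30, 50, 70, 90, 100]   # values spread across the interval

print(value_range(clustered), value_range(spread_out))
print(statistics.stdev(clustered), statistics.stdev(spread_out))
```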

17

For what types of data sets would you use the mean?

Symmetric distributions without outliers.

18

What two things is the coefficient of variation used to do?

Measure changes in a population over time
Compare variability of two populations with different units

19

What is the calculation for the position of the median when the number (n) of observations is odd?

Median = value of the ((n+1)/2)th observation
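
The two position rules (odd n here, even n in card 15) can be sketched together as follows. Positions in the rules are 1-based, while Python lists are 0-based, hence the `- 1`. The function name `median_by_position` is illustrative.

```python
def median_by_position(values):
    """Find the median by locating its position in the ordered data."""
    data = sorted(values)
    n = len(data)
    if n % 2 == 1:
        position = (n + 1) // 2            # odd n: the ((n+1)/2)th observation
        return data[position - 1]
    lower_pos = n // 2                     # even n: the (n/2)th observation
    upper_pos = n // 2 + 1                 # and the ((n/2)+1)th observation
    return (data[lower_pos - 1] + data[upper_pos - 1]) / 2

print(median_by_position([5, 1, 3]))       # odd n
print(median_by_position([4, 1, 3, 2]))    # even n
```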

20

What is the relationship between the mean and median for symmetric data sets?

Mean=median

21

What is the relationship between the mean and median for data sets which are skewed right?

Mean>median
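
A quick check with made-up data: one large value in the right tail pulls the mean up, while the median stays near the bulk of the data.

```python
import statistics

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]   # long tail to the right

mean_value = statistics.mean(right_skewed)
median_value = statistics.median(right_skewed)
print(mean_value, median_value)
```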

22

How is the five-number summary represented?

Graphically with a boxplot.

23

What measure of variability is used with the median?

Interquartile Range

24

What is Q1?

The median of the bottom half of the ordered results (the first quartile).

25

What are the three measures of centre?

Mean
Median
Mode

26

What is the formula for calculating mean?

x̄=Σx/n
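
The formula translates directly into code: Σx is the sum of the values and n is how many there are. The function name `mean` and the data are illustrative.

```python
def mean(values):
    """x̄ = Σx / n"""
    total = sum(values)   # Σx
    n = len(values)       # n
    return total / n

x_bar = mean([4, 8, 15, 16, 23, 42])
print(x_bar)
```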

27

When would a distribution curve appear rectangular rather than bell-shaped?

When all values occur equally frequently (a uniform distribution).

28

What is the mean?

The arithmetic average value

29

What is n?

The number of observations

30

What are the four measures of spread (or measures of dispersion)?

Standard Deviation
Interquartile Range
Range
Coefficient of Variation

31

How do you choose the maximum value in a box and whisker plot?

Choose whichever is smaller out of:
The maximum value
Or
Q3+1.5*IQR
Values above Q3+1.5*IQR are treated as outliers and plotted as individual points, so the whisker is not stretched by extreme values.

32

What type of data is mean most useful for?

Symmetrical data.

33

What is Σ?

On its own, nothing: Σ is the summation symbol, meaning "the sum of" whatever follows it (e.g. Σx).

34

What measure of variability is used with the mean, when comparing data that has one consistent unit?

Standard Deviation

35

What is the interquartile range?

The difference between the Q3 value and the Q1 value (IQR = Q3 - Q1). It measures the spread of the middle 50% of the values.

36

What measure of centre is rarely used in numerical data?

Mode

37

What type of data are histograms associated with?

Numerical data.

38

Can you calculate the median?

Not directly with a formula. You calculate the position of the median, then read off the value at that position in the ordered data.

39

What is the interquartile range (put simply)?

The range for the middle 50%

40

What do quartiles do?

Divide the ordered data into four parts, each with the same number of observations.

41

What measure of variability is used with the mean, when comparing data that has different units?

Coefficient of Variation

42

What is the median?

The middle value in a data set, when arranged in order of magnitude.

43

What is the first decision to make when choosing how to represent your data?

Choose whether you are going to use mean or median. The measure of variability will depend on which you choose.

44

What is the relationship between the mean and median for data sets which are skewed left?

Median>Mean

45

How do you choose the minimum value in a box and whisker plot?

Choose whichever is larger out of:
The minimum value
Or
Q1-1.5*IQR
Values below Q1-1.5*IQR are treated as outliers and plotted as individual points, so the whisker is not stretched by extreme values.
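
The rules for both whisker ends (this card and card 31) can be sketched as below. Quartiles are computed as the median of each half, leaving out the overall median when n is odd, which is an assumed convention. The function name `whisker_ends` and the data are illustrative.

```python
import statistics

def whisker_ends(values):
    """Return (lower whisker end, upper whisker end, outliers)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = statistics.median(data[:half])
    q3 = statistics.median(data[half + n % 2:])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    lower_end = max(data[0], lower_fence)    # larger of minimum and Q1 - 1.5*IQR
    upper_end = min(data[-1], upper_fence)   # smaller of maximum and Q3 + 1.5*IQR
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return lower_end, upper_end, outliers

low, high, outliers = whisker_ends([1, 3, 5, 7, 9, 11, 40])
print(low, high, outliers)
```

Many plotting libraries draw the whisker to the most extreme observation inside the fence rather than to the fence itself, so a plotted box plot may differ slightly from these values.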

46

When the distribution is rectangular (uniform), what is the relationship between the mean and median?

Mean=median

47

What does the box plot look like for a rectangular (uniform) distribution?

The median is in the centre (and equals the mean)
Q1 and Q3 are equidistant from the median
The maximum and minimum are equidistant from the median
The distance from Q1 to the minimum and from Q3 to the maximum is the same as the distance from the median to Q1 or Q3.

48

What is the only appropriate measure of variability when comparing data measured in different units?

Coefficient of variation.

49

What does the box plot look like for a symmetric bell curve?

The median is in the centre (and equals the mean)
Q1 and Q3 are equidistant from the median
The maximum and minimum are equidistant from the median
The distance from Q1 to the minimum and from Q3 to the maximum is larger than the distance from the median to Q1 or Q3.

50

What are the two main weaknesses to using the mode as a measure of the centre?

The mode is not independently determined (may not represent the centre at all)
There may be no mode (particularly in numerical data)

51

For what types of data sets would you use the median?

Skewed distributions or distributions with outliers.

52

What is the mode?

The most frequently occurring value
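
A brief sketch using Python's standard library (`statistics.multimode`, available from Python 3.8). The data are made up. It also shows the weakness noted in card 50: when no value repeats, every value is tied as "most frequent", so there is no useful mode.

```python
import statistics

blood_types = ["A", "O", "O", "B", "O", "A"]   # categorical data
modes = statistics.multimode(blood_types)      # list of the most frequent value(s)

measurements = [1.2, 3.4, 5.6]                 # numerical data with no repeats
no_clear_mode = statistics.multimode(measurements)

print(modes, no_clear_mode)
```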

53

Which measure of dispersion is least frequently used?

Range.

54

What can be said about how the data are distributed around the median?

About 50% of the data have a value less than the median
About 50% of the data have a value greater than the median.

55

What units are used to represent the coefficient of variation?

The coefficient of variation is expressed as a percentage rather than in units of the particular data.

56

When comparing centre and dispersion of distributions, when do we use the median and Interquartile Range?

If at least one distribution is skewed or has outliers.

57

What is the range?

The difference between the maximum and minimum values in a data set