Flashcards in Statistics - Location and spread Deck (37):

1

## What is a measure of location?

### Single value which describes a position in a data set

2

## What is a measure of central tendency?

### Single value that describes the centre of the data

3

## What is the mean?

### Sum of data values / number of data values

4

## What is the median?

### The middle value when the data values are put in order

5

## What is the mode?

### The value or class that occurs most often
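
As a quick check of the three definitions above, here is a minimal Python sketch using the standard `statistics` module. The data values are made up for illustration; the value 40 is included as an extreme value so the gap between mean and median is visible.

```python
from statistics import mean, median, multimode

# Illustrative data only (not from the deck); 40 is an extreme value.
data = [2, 3, 3, 5, 7, 10, 40]

data_mean = mean(data)        # sum of data values / number of data values
data_median = median(data)    # middle value once the data are in order
data_modes = multimode(data)  # list of the value(s) that occur most often

print(data_mean, data_median, data_modes)
```

The mean is pulled towards the extreme value, while the median and mode are not.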

6

## When should median be used?

###
For quantitative data

Preferred when there are extreme values, because it is not distorted by them

7

## When should mean be used?

###
For quantitative data; uses every data value

But is affected by extreme values

8

## When should mode be used?

###
Qualitative data, or quantitative data

Most useful when there is a single mode or two modes (bimodal)

Not very informative if each value occurs once

9

## How can the mean of data in a frequency table be calculated?

### Mean = Sum of products of data and their frequencies / sum of the frequencies
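
A minimal sketch of the frequency-table mean, assuming the table is stored as a dictionary from data value to frequency. The table itself is hypothetical.

```python
# Hypothetical frequency table: data value -> frequency.
freq_table = {0: 3, 1: 5, 2: 8, 3: 4}

total_fx = sum(x * f for x, f in freq_table.items())  # sum of (value x frequency)
total_f = sum(freq_table.values())                    # sum of the frequencies
table_mean = total_fx / total_f

print(total_fx, total_f, table_mean)
```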

10

## What is the lower quartile?

###
Q1

One-quarter of the way through the data set

11

## What is the upper quartile?

###
Q3

Three-quarters of the way through the data set

12

## How does the 85th percentile split the data?

###
85% of the data is less than the 85th percentile

15% of the data is more than the 85th percentile

13

## How can you calculate the lower quartile for discrete data?

###
n/4

If whole number, Q1 is halfway between this point and the one above

If not whole number, round UP and pick this data point

14

## How can you calculate the upper quartile for discrete data?

###
3n/4

If whole number, Q3 is halfway between this point and the one above

If not whole number, round UP and pick this data point
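
The n/4 and 3n/4 rules for discrete data can be sketched in Python as follows. The function name `quartile` and both data sets are illustrative assumptions, and the data must already be sorted. Positions are counted from 1, as on the cards.

```python
import math

def quartile(sorted_data, which):
    """Return Q1 (which=1) or Q3 (which=3) using the n/4 or 3n/4 position rule."""
    n = len(sorted_data)
    numerator = which * n  # the position is numerator / 4, counted from 1
    if numerator % 4 == 0:
        k = numerator // 4
        # Whole number: halfway between the k-th value and the one above it.
        return (sorted_data[k - 1] + sorted_data[k]) / 2
    # Not a whole number: round up and pick that data value.
    k = math.ceil(numerator / 4)
    return sorted_data[k - 1]

eight_values = [1, 3, 4, 6, 7, 9, 12, 15]  # n/4 = 2 and 3n/4 = 6 (whole numbers)
seven_values = [2, 4, 5, 7, 8, 10, 11]     # n/4 = 1.75 and 3n/4 = 5.25 (round up)

q1_eight, q3_eight = quartile(eight_values, 1), quartile(eight_values, 3)
q1_seven, q3_seven = quartile(seven_values, 1), quartile(seven_values, 3)
print(q1_eight, q3_eight, q1_seven, q3_seven)
```

Library routines such as `statistics.quantiles` use different position conventions, so their results may not match this rule exactly.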

15

## How can you calculate Q1, Q2 and Q3 from a cumulative frequency table?

###
Q1 = (n/4)th data value

Q2 = (n/2)th data value

Q3 = (3n/4)th data value

NO ROUNDING

16

## Define percentile

### The value below which a percentage of data falls

17

## What is interpolation?

###
Technique to estimate the quartiles (Q1 to Q3) and percentiles of grouped data

This assumes the data values are evenly distributed within each class

18

## What is the equation for linear interpolation?

### Lower class boundary + ((position of quartile - cumulative freq. below the class) / freq. of the class) x class width
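
A minimal sketch of linear interpolation, written as: lower class boundary + (position - cumulative frequency below the class) / class frequency x class width. The function name, parameter names, and the grouped table are assumptions made for illustration.

```python
def interpolate(position, cum_freq_below, class_freq, lower_boundary, class_width):
    """Estimate the data value at a given position inside one class."""
    fraction_into_class = (position - cum_freq_below) / class_freq
    return lower_boundary + fraction_into_class * class_width

# Hypothetical grouped data: 0-10 (f = 5), 10-20 (f = 10), 20-30 (f = 5), so n = 20.
n = 20
median_position = n / 2  # the 10th value lies in the 10-20 class
median_estimate = interpolate(
    position=median_position,
    cum_freq_below=5,   # frequency of the classes below 10-20
    class_freq=10,
    lower_boundary=10,
    class_width=10,
)
print(median_estimate)
```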

19

## What is the range?

### Difference between largest and smallest values in the data set

20

## What is the interquartile range IQR?

###
Difference between upper and lower quartile

Q3 - Q1

21

## Why is IQR used?

###
It does not include extreme values

Only considers spread of middle 50% of the data

22

## What is the inter-percentile range?

### Difference between the values for two given percentiles

23

## What is variance a measure of?

### The spread of the data about the mean

24

## What is the equation for variance?

### (Sum of x^2)/n - ((Sum of x)/n)^2

25

## What is the equation for standard deviation?

###
Sqrt of [(Sum of x^2)/n - ((Sum of x)/n)^2]

Square root of variance
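
A short sketch of the variance and standard deviation formulas, read as (Sum of x^2)/n minus the square of (Sum of x)/n. The data are illustrative, and `statistics.pvariance` is used only as a cross-check.

```python
import math
from statistics import pvariance

# Illustrative data only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

sum_x = sum(data)                   # Sum of x
sum_x2 = sum(x * x for x in data)   # Sum of x^2

variance = sum_x2 / n - (sum_x / n) ** 2  # mean of the squares minus square of the mean
std_dev = math.sqrt(variance)             # square root of the variance

library_variance = pvariance(data)  # population variance from the standard library
print(variance, std_dev, library_variance)
```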

26

## How is variance/standard dev. different for grouped data in a frequency table?

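### Each x and x^2 term is multiplied by its frequency f, and n is replaced by the sum of the frequencies (for grouped data, x is the class midpoint)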
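
For a frequency table, the same calculation can be sketched with every term weighted by its frequency. The table is hypothetical and maps class midpoints to frequencies; using midpoints for grouped data is an assumption of this sketch.

```python
import math

# Hypothetical grouped data: class midpoint -> frequency.
midpoint_freqs = {5: 5, 15: 10, 25: 5}

sum_f = sum(midpoint_freqs.values())                        # replaces n
sum_fx = sum(f * x for x, f in midpoint_freqs.items())      # Sum of f*x
sum_fx2 = sum(f * x * x for x, f in midpoint_freqs.items()) # Sum of f*x^2

grouped_variance = sum_fx2 / sum_f - (sum_fx / sum_f) ** 2
grouped_std_dev = math.sqrt(grouped_variance)
print(grouped_variance, grouped_std_dev)
```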

27

## What is coding?

###
A technique to simplify statistical calculations

Gives smaller or simpler numbers that are easier to work with

28

## What is the equation for coding data?

### y = (x-a)/b

29

## What is the equation for the mean of coded data?

### mean of y = (mean of x - a)/b

30

## What is the standard dev. of coded data?

### Standard dev. of y = standard dev. of x / b (for positive b)

31

## What affects measures of location and spread in coding?

###
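Adding/subtracting a constant changes measures of location (e.g. the mean) but not measures of spread

Multiplying/dividing by a constant changes both location and spread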
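
The effect of coding can be checked numerically. This sketch assumes the coding y = (x - a)/b with a = 1000 and b = 10; those values and the data are chosen purely for illustration.

```python
import math
from statistics import mean, pstdev

a, b = 1000, 10  # illustrative coding constants (b is positive)
x_values = [1010, 1020, 1030, 1040, 1050]
y_values = [(x - a) / b for x in x_values]  # coded data

mean_x, mean_y = mean(x_values), mean(y_values)
sd_x, sd_y = pstdev(x_values), pstdev(y_values)

# Subtracting a shifts the mean only; dividing by b scales both mean and spread.
mean_matches = math.isclose(mean_y, (mean_x - a) / b)
sd_matches = math.isclose(sd_y, sd_x / b)
print(y_values, mean_matches, sd_matches)
```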

32

## What is the formula for sample variance?

### [Sum of (x - mean of x)^2] / (n-1)

33

## What is the formula for sample standard deviation?

### Square root of [Sum of (x - mean of x)^2 / (n-1)]
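
A minimal sketch of the sample versions, which divide by n - 1 rather than n. The data are illustrative, and `statistics.variance` and `statistics.stdev` serve only as cross-checks.

```python
import math
from statistics import mean, stdev, variance

# Illustrative sample only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(sample)
sample_mean = mean(sample)

sum_sq_dev = sum((x - sample_mean) ** 2 for x in sample)  # Sum of (x - mean of x)^2
sample_variance = sum_sq_dev / (n - 1)
sample_std_dev = math.sqrt(sample_variance)

# The standard library uses the same n - 1 divisor for these two functions.
agrees_with_library = (
    math.isclose(sample_variance, variance(sample))
    and math.isclose(sample_std_dev, stdev(sample))
)
print(sample_variance, sample_std_dev, agrees_with_library)
```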

34

## What are the advantages and disadvantages of range?

###
Adv: Easiest measure of dispersion to calculate

Dis: Heavily affected by extreme values, no info on spread of the rest of the values

35

## What are the advantages and disadvantages of interquartile range?

###
Adv: Not affected by extreme values (used when outliers present)

Dis: Difficult to calculate for grouped data

36

## What are the advantages and disadvantages of variance?

###
Adv: Depends on all data values

Dis: Harder to calculate, affected by outliers, measured in squared units rather than the units of the data values

37