Flashcards in Statistics - Representations of data Deck (24):

1

## What is an outlier?

### An extreme value that lies outside the overall pattern of the data

2

## What is the common definition of an outlier?

###
Any value that is:

Greater than Q3 + k × IQR

Less than Q1 - k × IQR
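
A minimal Python sketch of this rule, assuming the quartiles are already known. The function names are hypothetical, and k = 1.5 in the usage lines is only an illustrative value; the card leaves k unspecified.

```python
def outlier_bounds(q1, q3, k):
    """Return the (lower, upper) bounds Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1  # interquartile range
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(value, q1, q3, k):
    """True if value lies strictly below the lower bound or above the upper bound."""
    lower, upper = outlier_bounds(q1, q3, k)
    return value < lower or value > upper


# Illustrative values only (k = 1.5 is an assumption, not from the card)
print(outlier_bounds(10, 20, 1.5))  # (-5.0, 35.0)
print(is_outlier(36, 10, 20, 1.5))  # True
```

A value exactly on a bound is not counted as an outlier here, which matches the "greater than" and "less than" wording.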

3

## What is an anomaly?

### An outlier that is removed from the data because it is clearly an error and would be misleading to keep

4

## What is cleaning the data?

### The process of removing anomalies from a data set

5

## What features does a box plot show with lines?

###
Lowest value

Highest value

Q1, Q2, Q3

Outliers are marked with a cross

6

## What should be done when comparing two box plots?

###
Use the same scale

Compare the medians, IQRs and extreme values
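
The quantities being compared can be sketched in Python as below. The function name `summarise` and the two data sets are hypothetical, and the quartiles use the "inclusive" method of `statistics.quantiles`, which is an assumption; other quartile conventions can give slightly different values.

```python
import statistics


def summarise(data):
    """Return the values read off a box plot: extremes, median and IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "lowest": min(data),
        "highest": max(data),
        "median": statistics.median(data),
        "iqr": q3 - q1,
    }


# Made-up data sets for illustration
group_a = summarise([1, 2, 3, 4, 5])
group_b = summarise([2, 4, 6, 8, 10])

# Compare location (median) and spread (IQR, extremes) on the same scale
print(group_a["median"] < group_b["median"])
print(group_a["iqr"] < group_b["iqr"])
```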

7

## What is bivariate data?

### Data which has pairs of values for two variables

8

## What is usually plotted on each axis?

###
x-axis - independent variable

y-axis - dependent variable

9

## What is a causal relationship?

###
When a change in one variable causes a change in the other

Correlation doesn't show causation

10

## What is the regression line?

### The straight line that minimises the sum of the squares of the vertical distances of the data points from the line

11

## What is the equation of the regression line?

### y = a + bx
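
One way to find a and b is the standard least-squares result b = Sxy / Sxx and a = ȳ - b x̄. The sketch below assumes that formula; the function name and the data are hypothetical.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sxy and Sxx: sums of products and squares of deviations from the means
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx          # gradient
    a = mean_y - b * mean_x  # intercept
    return a, b


# Made-up points that lie exactly on y = 1 + 2x
print(regression_line([1, 2, 3, 4], [3, 5, 7, 9]))  # (1.0, 2.0)
```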

12

## When should you use a regression line to make predictions?

### When the value of the independent variable lies within the range of the given data (interpolation)
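
A small Python sketch of this check, assuming a line y = a + bx has already been found. The function name, its parameters and the numbers are hypothetical.

```python
def predict_within_range(a, b, x, x_min, x_max):
    """Predict y = a + b*x, but only for x inside the observed data range."""
    if not (x_min <= x <= x_max):
        # Outside the data range the prediction would be an extrapolation
        raise ValueError("x is outside the range of the given data")
    return a + b * x


# Data observed for x between 1 and 4 (illustrative values)
print(predict_within_range(1, 2, 3, x_min=1, x_max=4))  # 7
```

Raising an error is just one design choice; returning the prediction with a warning that it is unreliable would serve the same purpose.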

13

## What are histograms used for?

### Representing continuous variables

14

## What is the equation for frequency density?

### Frequency density = frequency / class width
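
The formula can be applied to a grouped frequency table as in this sketch. The function name and the classes are made up for illustration.

```python
def frequency_densities(classes):
    """classes: list of (lower bound, upper bound, frequency) tuples."""
    return [freq / (upper - lower) for lower, upper, freq in classes]


# Illustrative classes with unequal widths 10, 5 and 20
table = [(0, 10, 20), (10, 15, 30), (15, 35, 10)]
print(frequency_densities(table))  # [2.0, 6.0, 0.5]
```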

15

## What is the equation for frequency and area?

### Frequency = k x area of the bar
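
If the frequency of one bar is known, k can be found from that bar and then used for the others. The sketch below assumes this approach; the function names and numbers are hypothetical. When the bar height is exactly the frequency density, k = 1.

```python
def scale_factor(known_frequency, known_area):
    """Find k from a bar whose frequency and area are both known."""
    return known_frequency / known_area


def frequency_from_area(area, k):
    """Frequency = k * area of the bar."""
    return k * area


# A bar of area 12 represents 30 observations (illustrative numbers)
k = scale_factor(30, 12)
print(k)                          # 2.5
print(frequency_from_area(8, k))  # 20.0
```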

16

## What does unimodal mean?

### The distribution of the data has a single peak

17

## What does bimodal mean?

### The distribution of the data has two peaks

18

## What is bivariate data?

###
Data made up of pairs

(x,y)

19

## What is a scatter diagram?

### A graph of bivariate data in which each variable is plotted along one of the axes

20

## What are scatter diagrams used for?

### Showing whether two variables are correlated

21

## What is important to remember about correlation?

###
Correlation does not mean causation

The two variables could both be linked to another factor

22

## What is linear regression?

### The process used to find the equation of the regression line (line of best fit)

23

## What is the explanatory variable?

### The independent variable: the variable that affects the other. It is always plotted on the horizontal axis

24