Flashcards in Statistics - Representations of data Deck (12):

1

## What is an outlier?

### An extreme value that lies outside the overall pattern of the data

2

## What is the common definition of an outlier?

###
Any value that is:

Greater than Q3 + k × IQR

Less than Q1 - k × IQR
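
As a rough illustration of this rule, the sketch below flags values outside the two fences. The card leaves k unspecified, so `k=1.5` is an assumed default here, and the function names and sample data are made up for the example. Quartile conventions vary between textbooks and software; this version uses the "inclusive" method from Python's `statistics` module, which may give slightly different quartiles from a hand calculation.

```python
from statistics import quantiles


def outlier_fences(data, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is an assumed default; the rule itself leaves k open.
    """
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(data, k=1.5):
    """Return the values lying strictly outside the fences."""
    lower, upper = outlier_fences(data, k)
    return [x for x in data if x < lower or x > upper]


# Hypothetical sample data: Q1 = 5, Q3 = 8, so IQR = 3.
values = [2, 4, 5, 5, 6, 7, 8, 9, 30]
print(outlier_fences(values))  # (0.5, 12.5)
print(find_outliers(values))   # [30]
```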

3

## What is an anomaly?

### An outlier that is removed from the data because it is clearly an error and would be misleading to keep

4

## What is cleaning the data?

### The process of removing anomalies from a data set

5

## What features does a box plot show with lines?

###
Lowest value

Highest value

Q1, Q2, Q3

Outliers are marked with a cross
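
The five values a box plot is drawn from can be computed directly. This is a minimal sketch: the function name and sample data are hypothetical, and the quartiles again use the "inclusive" convention from Python's `statistics` module, which may differ slightly from other methods. It reports the raw lowest and highest values; when outliers are drawn as crosses, some conventions stop the lines at the most extreme non-outlier values instead.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return the lowest value, Q1, Q2 (median), Q3 and highest value."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return {
        "lowest": min(data),
        "Q1": q1,
        "Q2": q2,
        "Q3": q3,
        "highest": max(data),
    }


# Hypothetical sample data.
summary = five_number_summary([2, 4, 5, 5, 6, 7, 8, 9, 30])
print(summary)
# {'lowest': 2, 'Q1': 5.0, 'Q2': 6.0, 'Q3': 8.0, 'highest': 30}
```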

6

## What should be done when comparing two box plots?

###
Use the same scale

Compare medians, IQR and extremes

7

## What is bivariate data?

### Data which has pairs of values for two variables

8

## What is usually plotted on each axis?

###
x-axis - independent variable

y-axis - dependent variable

9

## What is a causal relationship?

###
When a change in one variable causes a change in the other

Correlation doesn't show causation

10

## What is the regression line?

### The straight line that minimises the sum of the squares of the vertical distances of the data points from the line
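
Written as a formula, the quantity being minimised can be sketched as below. The notation is assumed for illustration: $(x_i, y_i)$ are the $n$ data points, $a$ and $b$ are the intercept and gradient of a candidate line, and distance is measured vertically (the usual convention for a regression line of y on x).

$$
\sum_{i=1}^{n} \bigl( y_i - (a + b x_i) \bigr)^2
$$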

11

## What is the equation of the regression line?

### y = a + bx
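
One way to find a and b for a small data set is with the standard least-squares formulas b = Sxy / Sxx and a = ȳ - b x̄. The sketch below assumes those formulas; the function name, variable names, and sample data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sxy and Sxx: sums of products and squares of deviations from the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # gradient
    a = y_bar - b * x_bar    # y-intercept
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x.
a, b = least_squares_line([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(a, b)  # 1.0 2.0
```

Here a is the y-intercept and b is the gradient, so b estimates the change in y for each unit increase in x.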

12