Statistics - Representations of data Flashcards Preview


Flashcards in Statistics - Representations of data Deck (24):
1

What is an outlier?

An extreme value that lies outside the overall pattern of the data

2

What is the common definition of an outlier?

Any value that is:
Greater than Q3 + k × IQR
Less than Q1 - k × IQR
(k is a given constant, often 1.5)
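
As a rough illustration of the rule above, the sketch below computes the two boundaries and picks out the values beyond them. The function names, the default k = 1.5 and the data are assumptions made for the example, and the quartiles are taken as already known because different courses and software use slightly different quartile conventions.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) boundaries beyond which a value is an outlier."""
    iqr = q3 - q1  # interquartile range
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3, k=1.5):
    """Return the values lying strictly outside the two boundaries."""
    lower, upper = outlier_fences(q1, q3, k)
    return [v for v in values if v < lower or v > upper]


# Made-up data; Q1 = 10 and Q3 = 20 are assumed to be given.
data = [2, 9, 10, 14, 15, 20, 21, 40]
print(outlier_fences(10, 20))       # (-5.0, 35.0)
print(find_outliers(data, 10, 20))  # [40]
```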

3

What is an anomaly?

An outlier that is removed from the data because it is clearly an error and keeping it would be misleading

4

What is cleaning the data?

The process of removing anomalies from a data set

5

What features does a box plot show with lines?

Lowest value
Highest value
Q1, Q2, Q3
Outliers are each marked with a cross

6

What should be done when comparing two box plots?

Use the same scale
Compare medians, IQR and extremes

7

What is bivariate data?

Data which has pairs of values for two variables

8

What is usually plotted on each axis?

x-axis - independent variable
y-axis - dependent variable

9

What is a causal relationship?

When a change in one variable causes a change in the other
Correlation doesn't show causation

10

What is the regression line?

The straight line that minimises the sum of the squares of the vertical distances of the data points from the line

11

What is the equation of the regression line?

y = a + bx, where a is the y-intercept and b is the gradient
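
To make the equation concrete, here is a minimal sketch of finding a and b by least squares, using the standard results b = Sxy / Sxx and a = (mean of y) - b × (mean of x). The function name and the data are made up for the example; at this level the values are usually read from a calculator rather than computed by hand.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least-squares regression line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sxy and Sxx: sums of products and squares of deviations from the means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx        # gradient
    a = y_bar - b * x_bar  # intercept: the line passes through (x_bar, y_bar)
    return a, b


# Made-up data lying exactly on y = 1 + 2x
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = regression_line(xs, ys)
print(a, b)  # 1.0 2.0
```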

12

When should you use a regression line to make predictions?

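Only when the value of the independent variable is within the range of the given data (interpolation)
Predictions outside this range (extrapolation) are unreliable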
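
One way to picture this restriction is a prediction helper that refuses to go beyond the data. The sketch below is hypothetical: the function name, the coefficients a = 1 and b = 2, and the data range of 1 to 4 are all assumed for illustration.

```python
def predict(a, b, x, x_min, x_max):
    """Predict y = a + b*x, but only for x inside the range of the data."""
    if not (x_min <= x <= x_max):
        raise ValueError(f"x = {x} is outside the data range [{x_min}, {x_max}]")
    return a + b * x


# Assumed line y = 1 + 2x, fitted to data with x between 1 and 4
print(predict(1.0, 2.0, 2.5, x_min=1, x_max=4))  # 6.0

try:
    predict(1.0, 2.0, 10, x_min=1, x_max=4)
except ValueError as err:
    print("Refused:", err)
```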

13

What are histograms used for?

Representing grouped data for continuous variables

14

What is the equation for frequency density?

Frequency density = frequency / class width
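
A short sketch of this formula applied to a grouped frequency table follows. The class boundaries and frequencies are invented for the example, and the tuple layout (lower bound, upper bound, frequency) is just one convenient assumption.

```python
def frequency_densities(classes):
    """classes: list of (lower_bound, upper_bound, frequency) tuples."""
    # frequency density = frequency / class width
    return [freq / (upper - lower) for lower, upper, freq in classes]


# Made-up table with unequal class widths of 10, 5 and 15
classes = [(0, 10, 5), (10, 15, 10), (15, 30, 6)]
print(frequency_densities(classes))  # [0.5, 2.0, 0.4]
```

The middle class has the tallest bar even though its width is the smallest, which is why histograms plot frequency density rather than frequency.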

15

What is the equation for frequency and area?

Frequency = k × area of the bar, where k is a constant for the whole histogram
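
A typical use of this relationship is to find k from one bar whose frequency is known, then apply it to another bar. The sketch below assumes made-up areas and frequencies, and the function names are hypothetical.

```python
def scale_factor(known_frequency, known_area):
    """Find k from a bar whose frequency and area are both known."""
    return known_frequency / known_area


def frequency_from_area(k, area):
    """Frequency represented by a bar (or part of a bar) with the given area."""
    return k * area


# Assumed: a bar of area 6 represents a frequency of 12, so k = 2
k = scale_factor(known_frequency=12, known_area=6)
print(k)                                 # 2.0
print(frequency_from_area(k, area=4.5))  # 9.0
```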

16

What does unimodal mean?

The distribution has a single peak

17

What does bimodal mean?

The distribution has two peaks

18

What is bivariate data?

Data made up of pairs of values (x, y)

19

What is a scatter diagram?

A graph of bivariate data in which each pair of values is plotted as a point, with one variable along each axis

20

What are scatter diagrams used for?

Showing whether data is correlated

21

What is important to remember about correlation?

Correlation does not mean causation
The two variables could both be linked to another factor

22

What is linear regression?

The process used to find the equation of the regression line (line of best fit)

23

What is the explanatory variable?

The independent variable: the variable that affects the other, plotted on the horizontal axis

24

What is the response variable?

The dependent variable: the variable that is affected, plotted on the vertical axis