Probability, Correlation And Hypothesis Testing Flashcards by Tegan Woods

Comparative pie charts formula

Outliers formula
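This card has no answer text. A minimal sketch, assuming the common 1.5 × IQR rule (some courses use mean ± 2 standard deviations instead); the names `outlier_fences` and `find_outliers` and the data are illustrative only, and the quartiles are supplied by hand:

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences; values outside them count as outliers.

    Assumption: the k = 1.5 x IQR rule. A course may specify a different rule.
    """
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(data, q1, q3, k=1.5):
    """List the values that fall below the lower fence or above the upper fence."""
    low, high = outlier_fences(q1, q3, k)
    return [x for x in data if x < low or x > high]


# Illustrative data with quartiles given, not computed
data = [2, 4, 5, 6, 7, 8, 30]
fences = outlier_fences(q1=4, q3=8)
outliers = find_outliers(data, q1=4, q3=8)
print(fences, outliers)
```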

Comparative pie charts

The ratio of the sample sizes is the same as the ratio of the areas of the two pie charts
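Because area is proportional to radius squared, the radii are in the ratio of the square roots of the sample sizes. A minimal sketch of that calculation; the function name `second_radius` and the numbers are illustrative assumptions:

```python
import math


def second_radius(r1, n1, n2):
    """Radius of the second pie chart so that area ratio = sample-size ratio.

    From (pi * r2**2) / (pi * r1**2) = n2 / n1 we get r2 = r1 * sqrt(n2 / n1).
    """
    return r1 * math.sqrt(n2 / n1)


r1, n1, n2 = 5.0, 100, 400
r2 = second_radius(r1, n1, n2)
area_ratio = (math.pi * r2 ** 2) / (math.pi * r1 ** 2)
print(r2, area_ratio)
```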

Population mean

Sample mean

‘Sum of’

The sample mean when xi occurs with a frequency fi
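The answer to this card is missing. A minimal sketch, assuming the standard definitions: the mean is the "sum of" the values divided by how many there are, and with frequencies it is the sum of fi × xi divided by the sum of fi. Function names and data are illustrative:

```python
def mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


def frequency_mean(xs, fs):
    """Mean when each value x_i occurs with frequency f_i: sum(f*x) / sum(f)."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)


xs = [1, 2, 3]
fs = [2, 3, 5]
from_table = frequency_mean(xs, fs)

# The same data written out in full gives the same mean
from_raw = mean([1, 1, 2, 2, 2, 3, 3, 3, 3, 3])
print(from_table, from_raw)
```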

What is discrete data?

Data that can only take certain separate values, which are often integers but do not have to be, for example shoe size

What is continuous data?

Data that can take any numerical value within a range, such as height

What is the range?

Highest value - lowest value

What is IQR?

Q3 - Q1

Standard deviation formulas

Variance formulas
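The standard deviation and variance cards have no answer text. A minimal sketch, assuming the divide-by-n definitions (variance = mean of the squares minus the square of the mean, standard deviation = its square root); a course using the n - 1 sample version would give different numbers:

```python
import math
import statistics


def variance(values):
    """Variance as (sum of x^2) / n minus (mean)^2. Assumes division by n."""
    n = len(values)
    m = sum(values) / n
    return sum(x * x for x in values) / n - m * m


def std_dev(values):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]
var = variance(data)
sd = std_dev(data)
print(var, sd)

# Cross-check against the standard library's divide-by-n variance
library_var = statistics.pvariance(data)
```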

What is probability?

What is a set?

A collection of distinct elements (often numbers); no element is repeated

What is a subset?

A is a subset of S if all the elements in ‘A’ are also in ‘S’

What is an empty set?

The set with no elements

What is a sample space?

All the possible outcomes of a random experiment

Complement of A

A’ (not A): all the outcomes in the sample space that are not in A

B is a subset of A

If B occurs so does A

Mutually exclusive

The occurrence of one event excludes the possibility that any of the other events could occur (they cannot happen at the same time)
If A and B are mutually exclusive, the probability of A or B occurring is the sum of their probabilities: P(A ∪ B) = P(A) + P(B)

For three mutually exclusive events: P(A ∪ B ∪ C) = P(A) + P(B) + P(C)

Independent events

The probability of event A occurring is unaffected by whether or not B occurs
If A and B are independent then P(A ∩ B) = P(A) × P(B)
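The product rule can be used as a check for independence. A minimal sketch on one roll of a fair die, assuming equally likely outcomes; the events A, B, C and the helper names are illustrative:

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))


def independent(a, b):
    """True when P(A and B) equals P(A) x P(B)."""
    return prob(a & b) == prob(a) * prob(b)


A = {2, 4, 6}      # even number
B = {1, 2, 3, 4}   # at most 4
C = {4, 5, 6}      # at least 4

a_b_independent = independent(A, B)  # 1/3 versus 1/2 x 2/3
a_c_independent = independent(A, C)  # 1/3 versus 1/2 x 1/2
print(a_b_independent, a_c_independent)
```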

The addition law of probability
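The answer to this card is missing. A minimal sketch, assuming the usual statement of the law, P(A ∪ B) = P(A) + P(B) - P(A ∩ B), checked on one roll of a fair die; the events and helper names are illustrative:

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))


A = {2, 4, 6}  # even number
B = {4, 5, 6}  # at least 4

lhs = prob(A | B)                       # P(A or B)
rhs = prob(A) + prob(B) - prob(A & B)   # addition law
print(lhs, rhs)

# Mutually exclusive events share no outcomes, so the last term is zero
D, E = {1}, {2}
exclusive_sum = prob(D) + prob(E)
```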

Multiplication law
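The answer to this card is missing. A minimal sketch, assuming the usual statement of the law, P(A ∩ B) = P(A) × P(B | A), for drawing two red counters without replacement from a bag; the bag contents are made up for illustration:

```python
from fractions import Fraction
from itertools import permutations

bag = ["R", "R", "R", "B", "B"]  # 3 red and 2 blue counters (illustrative)

# Multiplication law: P(first red and second red) = P(first red) x P(second red | first red)
p_first_red = Fraction(3, 5)
p_second_red_given_first_red = Fraction(2, 4)
p_both_red = p_first_red * p_second_red_given_first_red

# Cross-check by listing every ordered pair of distinct counters
pairs = list(permutations(range(len(bag)), 2))
both_red = [pair for pair in pairs if bag[pair[0]] == "R" and bag[pair[1]] == "R"]
p_by_counting = Fraction(len(both_red), len(pairs))
print(p_both_red, p_by_counting)
```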

What is Pearson’s Product Moment Correlation Coefficient?

The PMCC is denoted by R and measures the strength and direction of the linear correlation between two variables. It is named after Pearson, an applied mathematician who worked on the application of statistics to genetics and evolution

PMCC formulas
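The formulas for this card are missing. A minimal sketch, assuming the standard form PMCC = Sxy / √(Sxx × Syy), where Sxy, Sxx and Syy are sums of products of deviations from the means; the function name and data are illustrative:

```python
import math


def pmcc(xs, ys):
    """Product moment correlation coefficient: Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4]
perfect_positive = pmcc(xs, [2, 4, 6, 8])  # points on a rising straight line
perfect_negative = pmcc(xs, [8, 6, 4, 2])  # points on a falling straight line
print(perfect_positive, perfect_negative)
```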

Interpreting PMCC values

R = 1: perfect positive correlation
R = -1: perfect negative correlation
R = 0: no linear correlation

What does a measure of correlation indicate?

An association between the two variables; however, it does not indicate a causal relationship

Spearman’s correlation coefficient formula

Spearman’s

Makes no assumptions about the distribution of the original data, and the relationship in the original data does not need to be linear because it is calculated from ranks
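The formula card above has no answer text. A minimal sketch, assuming the usual rank-difference formula 1 - 6Σd² / (n(n² - 1)) and no tied values; the helper names and data are illustrative. The first example is curved (y = x²) but always increasing, so the coefficient is still 1:

```python
def ranks(values):
    """Rank each value from 1 (smallest) upwards. Assumes there are no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1)), no ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Non-linear but always increasing: the ranks agree exactly
curved = spearman([10, 20, 30, 40, 50], [1, 4, 9, 16, 25])

# Two neighbouring pairs swapped: sum of d^2 is 4
swapped = spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(curved, swapped)
```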

PMCC

A hypothesis test on the PMCC is only valid if the variables are jointly normally distributed

H0 and H1

H0: null hypothesis (there is no correlation in the population)
H1: alternative hypothesis (there is correlation in the population)

Hypothesis testing

What is a regression line?

It is the straight line of best fit for bivariate data and should pass through the double mean point (mean of x, mean of y).
The equation for the linear regression line is given as: y = ax + b
where a is the gradient and b is the y-intercept.
x is the independent variable and y is the dependent variable.
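A minimal sketch of fitting y = ax + b, assuming the usual least-squares results a = Sxy / Sxx and b = (mean of y) - a × (mean of x); the function name and data are illustrative. The last line checks that the fitted line passes through the double mean point:

```python
def fit_line(xs, ys):
    """Least-squares y-on-x regression line y = a*x + b. Returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx            # gradient
    b = mean_y - a * mean_x  # y-intercept
    return a, b


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
print(a, b)

# The line goes through (mean of x, mean of y) = (3, 4)
y_at_mean_x = a * 3 + b
```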

Things to consider when analysing the regression model

How do we interpret the model?
How can we interpret in context the coefficient of x?
How can we interpret in context the constant term?

What is a residual?

An error the model produces when trying to predict a data point.
It is the vertical distance between the data point and the regression line.
For y on x regression it is only sensible to consider predictions for y.

How to calculate a residual?

What does a positive residual indicate?

The model is giving an underprediction (the actual value is above the predicted value)

What does a negative residual indicate?

An overprediction
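The "How to calculate a residual?" card has no answer text. The sign convention on the cards above implies residual = actual y - predicted y. A minimal sketch of that, using a hypothetical model y = 0.6x + 2.2 chosen only for illustration:

```python
def predict(x, a=0.6, b=2.2):
    """Predicted y from a hypothetical regression line y = a*x + b."""
    return a * x + b


def residual(x, y):
    """Residual = actual y minus predicted y."""
    return y - predict(x)


def describe(res):
    """Positive residual: the model underpredicted; negative: it overpredicted."""
    if res > 0:
        return "underprediction"
    if res < 0:
        return "overprediction"
    return "exact"


res_high = residual(2, 4)  # actual 4, predicted 3.4
res_low = residual(1, 2)   # actual 2, predicted 2.8
print(res_high, describe(res_high))
print(res_low, describe(res_low))
```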

What should we see when we plot predicted vs actual?

Strong positive correlation

What should we see when we plot predicted vs residual?

A random scatter of points clustered around zero with no patterns

Anscombe’s quartet

Four data sets that have nearly identical summary statistics but look clearly different when plotted

Unstructured statistics

Each data set has the same summary statistics but they are visually different

The normal distribution diagram

The normal distribution formula

What is the z-value?

The number of standard deviations a value is above/below the mean.
Because the normal distribution is symmetrical, we can use the probability for the positive z-value to calculate the probability for the negative one.

We can only use the z-table when…

The z-value is positive (on the right of the graph)
We’re finding the probability to the left of this z-value

Changing the direction of the inequality

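Changing the sign of the z-value or the direction of the inequality turns the probability p into 1 - p, for example P(Z > z) = 1 - P(Z < z) and P(Z < -z) = 1 - P(Z < z).
If we do both they cancel out: P(Z > -z) = P(Z < z).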
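These rules can be checked numerically. A minimal sketch using Python's `statistics.NormalDist` in place of a printed z-table; the value z = 1.5 is arbitrary:

```python
from statistics import NormalDist

phi = NormalDist().cdf  # P(Z < z) for the standard normal, like a z-table lookup

z = 1.5
p = phi(z)  # P(Z < 1.5), the value a z-table gives directly

flip_inequality = 1 - phi(z)  # P(Z > 1.5)  = 1 - p
flip_sign = phi(-z)           # P(Z < -1.5) = 1 - p
flip_both = 1 - phi(-z)       # P(Z > -1.5) = p, the two changes cancel
print(p, flip_inequality, flip_sign, flip_both)
```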

Standardising formula

To find the z score?
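The standardising cards have no answer text. A minimal sketch, assuming the standard formula z = (x - μ) / σ; the mean, standard deviation and value used are made up for illustration:

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Standardise x: number of standard deviations x lies from the mean."""
    return (x - mu) / sigma


# Illustrative values: mean 100, standard deviation 15, observed value 130
z = z_score(130, mu=100, sigma=15)

# P(X < 130) is the same as P(Z < z) on the standard normal
p_below = NormalDist().cdf(z)
print(z, p_below)
```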

To find the z value for a probability?

Use the z-table backwards: find the probability in the body of the table and read off the z-value it corresponds to
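A minimal sketch of the reverse lookup, using `statistics.NormalDist.inv_cdf` in place of reading a printed table backwards; the probability 0.975 and the distribution parameters are illustrative:

```python
from statistics import NormalDist

# Which z has probability 0.975 to its left?
z = NormalDist().inv_cdf(0.975)

# Undo the standardising to get back to the original scale: x = mu + z * sigma
mu, sigma = 100, 15  # illustrative mean and standard deviation
x = mu + z * sigma
print(z, x)
```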

Central limit theorem

If we repeatedly take samples of the same size and record their sample means, the sample means will themselves be approximately normally distributed around the population mean, provided the sample size is large enough

How is the sample mean normally distributed?

Standard deviation
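The answers to the two cards above are missing. A minimal sketch, assuming the standard result that the sample mean has mean μ and standard deviation σ / √n; the population values are made up for illustration:

```python
import math
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)


# Illustrative population: mean 50, standard deviation 12, samples of size 36
mu, sigma, n = 50, 12, 36
se = standard_error(sigma, n)

# Distribution of the sample mean, then P(sample mean < 52)
sample_mean_dist = NormalDist(mu, se)
p_below_52 = sample_mean_dist.cdf(52)
print(se, p_below_52)
```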

Continuity corrections

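A continuity correction lets us approximate a discrete distribution with a continuous one: each whole number k is treated as the interval from k - 0.5 to k + 0.5, so P(X ≤ 5) becomes P(Y < 5.5)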

Approximating

To approximate a binomial distribution X ~ B(n, p) as a normal we can copy over the mean np and variance np(1 - p) of the binomial, giving Y ~ N(np, np(1 - p)).
We must change the letter as it is a different distribution.
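A minimal sketch of the approximation with a continuity correction, assuming X ~ B(100, 0.5) as an illustrative example and comparing the normal estimate of P(X ≤ 55) with the exact binomial sum:

```python
import math
from statistics import NormalDist

n, p = 100, 0.5  # illustrative binomial X ~ B(n, p)

# Copy over the mean and variance to get Y ~ N(np, np(1 - p))
mean = n * p
variance = n * p * (1 - p)
y = NormalDist(mean, math.sqrt(variance))

# Continuity correction: P(X <= 55) becomes P(Y < 55.5)
approx = y.cdf(55.5)

# Exact binomial probability for comparison
exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(56))
print(approx, exact)
```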

(56 cards)