Flashcards in STA2300 Deck (75)

1

## M1-4: What are quantitative, categorical and ordinal variables?

###
Quantitative: Take on numerical values, so an average makes sense (e.g. height, heart rate).

Categorical: Fall into distinct categories (e.g. male or female). It doesn't make sense to average them, even if they are coded as numbers in SPSS.

Ordinal: Categorical data with a set order (e.g. survey responses: disagree, neutral, agree).

2

## M1-4: What graphs should we use for quantitative variables?

### Stem and leaf plot & histogram

3

## M1-4: What graphs should we use for Categorical variables?

### Bar chart & pie chart

4

## M1-4: What 3 features do we look at in graphs of quantitative variables (stem and leaf, boxplot & histogram)?

###
i) Shape - number of modes / peaks, symmetry, and any deviations from the overall pattern (e.g. outliers or gaps).

ii) Centre - a typical approximate value

iii) Spread - the range of values the data can take.

5

## M1-4: What is the 5 number summary?

### Minimum, Quartile 1, Median, Quartile 3, Maximum
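
As a rough illustration, the five values can be computed with Python's standard library. The data list is made up, and the `inclusive` quartile method is an assumption: textbooks and SPSS may use a slightly different quartile rule and so give slightly different quartiles.

```python
from statistics import quantiles

# Hypothetical data, listed in order for readability.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# quantiles(..., n=4) returns the three cut points: Q1, the median and Q3.
q1, median, q3 = quantiles(data, n=4, method="inclusive")

# Minimum, first quartile, median, third quartile, maximum.
five_number_summary = (min(data), q1, median, q3, max(data))
print(five_number_summary)
```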

6

## M1-4: What characterises the Normal model?

### Its mean (mu) and SD (sigma), together with its symmetric, bell-shaped curve.

7

## M1-4: A z-score is the number of standard deviations an observation lies above (or below) the mean. Converting to a z-score is a process called ________. What is the formula for this?

###
standardising

z = (y-μ) / σ

8

## M1-4: Converting z-scores to y is a process called _______? What is the formula for this? ** (not on formula sheet) **

###
unstandardising

y = μ + z σ
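
A minimal sketch of both conversions, to show that unstandardising simply reverses standardising. The function names and the example values (mean 170, SD 10) are hypothetical, chosen only for illustration.

```python
def standardise(y, mu, sigma):
    """Return the z-score: how many SDs y lies above (+) or below (-) the mean."""
    return (y - mu) / sigma


def unstandardise(z, mu, sigma):
    """Convert a z-score back to the original scale."""
    return mu + z * sigma


# Hypothetical example: an observation of 185 from a model with mean 170, SD 10.
z = standardise(185, mu=170, sigma=10)
y = unstandardise(z, mu=170, sigma=10)
print(z, y)
```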

9

## M1-4: What is correlation?

### It measures the direction and strength of the linear relationship between two quantitative variables. It is measured using the coefficient r, which is only appropriate if the relationship is linear.

10

## M1-4: R^2 measures what?

### The proportion of the variation in y that is explained by the linear relationship with x. It indicates strength only (not direction) and is normally expressed as a percentage.

11

## M1-4: What is the general form of a regression line? What do the components represent?

###
ŷ = b0 + b1x

ŷ denotes predicted value of y

b0 is the intercept

b1 is the slope
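
As a sketch of where b0 and b1 come from, the block below fits a least-squares line by hand and also computes r and R^2. The data are made up, and the least-squares formulas used here are the standard ones rather than anything quoted from the course formula sheet.

```python
from math import sqrt

# Hypothetical data: x is the explanatory variable, y is the response.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx              # slope
b0 = y_bar - b1 * x_bar     # intercept
r = sxy / sqrt(sxx * syy)   # correlation coefficient


def predict(x_new):
    """Predicted value: y-hat = b0 + b1 * x."""
    return b0 + b1 * x_new


print(b0, b1, r, r ** 2, predict(6))
```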

12

## M?? - What are the 5 guidelines to supporting P-values and conclusions?

###
> 10%: Insufficient evidence to support Ha (re-state Ha)

5-10%: Slight evidence to support Ha (re-state Ha)

1-5%: Moderate evidence to support Ha (re-state Ha)

0.1 - 1%: Strong evidence to support Ha (re-state Ha)

< 0.1%: Very strong evidence to support Ha (re-state Ha)
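
The guidelines above can be sketched as a small lookup. The function name is hypothetical, and how to treat a P-value that falls exactly on a boundary (e.g. exactly 5%) is an assumption, since the guidelines do not say.

```python
def evidence_for_ha(p_value):
    """Map a P-value (as a proportion, e.g. 0.03 for 3%) to the strength of evidence."""
    if p_value > 0.10:
        return "insufficient evidence"
    if p_value > 0.05:
        return "slight evidence"
    if p_value > 0.01:
        return "moderate evidence"
    if p_value > 0.001:
        return "strong evidence"
    return "very strong evidence"


# Example P-values, one from each band.
for p in (0.20, 0.07, 0.03, 0.005, 0.0004):
    print(p, "->", evidence_for_ha(p), "to support Ha")
```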

13

## M??? - The main page of the formula sheet is laid out in rows. What does each row provide the formulas and symbols for, for both hypothesis testing and Confidence Intervals?

###
- The first row is for proportions

- The second row is for one-sample mean

- The third row is for two-sample means

- The last row is for paired means

14

## M1-4: What are response and explanatory variables? What axis do they go on?

###
A response (dependent) variable is a particular quantity that we ask a question about in our study. We put it on the Y-AXIS.

An explanatory (independent) variable is any factor that can influence the response variable. We put it on the X-AXIS.

15

## M1-4: What are formulas for mean and standard deviations of a binomial?

###
The mean µ of a binomial is np.

The SD σ of a binomial is √(npq), where q = 1 - p.
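
A short sketch of both formulas. The function name and the example values (n = 20 trials, p = 0.3) are hypothetical.

```python
from math import sqrt


def binomial_mean_sd(n, p):
    """Return (mean, SD) of a binomial with n trials and success probability p."""
    q = 1 - p  # probability of failure
    return n * p, sqrt(n * p * q)


mean, sd = binomial_mean_sd(n=20, p=0.3)
print(mean, round(sd, 3))
```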

16

## M7: What is p-hat?

###
p̂ is the sample proportion, a statistic. It varies from sample to sample, so it has its own distribution. With larger sample sizes, the mean of that distribution stays similar, the spread gets smaller and the distribution looks more Normal.

It is calculated by X / n, where n is the sample size and X is the number of occurrences of the desired event in the sample.

17

## M7: What is SD(y bar)?

###
SD(y bar) = sigma / square root of n.

It is the standard deviation of the sampling distribution of the sample mean, not the standard deviation of the sample itself.

Used in questions like: "The annual household income in Brisbane is known to be $72000 with a standard deviation of $12000. If we randomly select 80 incomes from this population, what is the probability that the average income in the sample is more than $75000?"
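
One way to work the quoted question, as a sketch: it treats $72000 as the population mean and assumes the sampling distribution of the sample mean is approximately Normal (reasonable for n = 80). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 72000      # population mean income
sigma = 12000   # population standard deviation
n = 80          # sample size

# Standard deviation of the sampling distribution of the sample mean.
sd_ybar = sigma / sqrt(n)

# Standardise the cut-off of $75000, then take the upper-tail probability.
z = (75000 - mu) / sd_ybar
prob = 1 - NormalDist().cdf(z)

print(round(sd_ybar, 2), round(z, 2), round(prob, 4))
```

The probability comes out at a little over 1%, so a sample mean above $75000 would be unusual.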

18

## M7: Law of large numbers states that as sample size increases from a population with mean µ, what happens to sample mean ȳ of observed values?

### It gets closer and closer to the population mean μ.

19

## M7: What is a standard error?

###
The estimated SD of a statistic's sampling distribution. For a sample proportion, it is found by the square root of (p hat x q hat / n).

For example, where the question is:

Suppose that 20% of a random sample of n = 64 Data Analysis students receive an A for the subject. What is the standard error of the sample proportion?

We get square root of ((0.2 x 0.8) / 64) = 0.05

20

## M7: How would you describe the distribution of sample proportions?

### The distribution of sample proportions is approximately Normal with mean = p and standard deviation = square root of (pq / n). When p is unknown, p-hat is used in its place, which gives the standard error.

21

## M7: What is a sample proportion and how can it be identified?

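### A sample proportion is the fraction of the sample that has the characteristic of interest. A question is about proportions when it gives a proportion or percentage (p or p-hat) rather than an average. Questions about means use μ and y-bar instead of p and p-hat.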

22

## M7: What is x bar in statistics?

### x-bar is used to represent the sample mean, a statistic, which is used to estimate the true population parameter, μ.

23

## M8: The statement "we are 95% confident that the population mean is between 350 and 400" may also mean what?

### The 95% confidence interval for the population mean is (350, 400).

24

## M8: Does increasing the sample size increase or decrease the confidence interval width, and why?

### It decreases the width. A larger n decreases the STANDARD ERROR (n is in its denominator), which in turn decreases the margin of error.

25

## * M8: What does statistical inference refer to?

### Drawing conclusions about population parameters from sample statistics.

26

## M8: What is the Standard Error of the sampling distribution of a proportions question?

### SE(p-hat) = square root of ((p-hat x q-hat) / n)

27

## * M8: To halve the margin of error at the same level of confidence, what do you need to do?

### Quadruple the sample size. ME = critical value x SE(statistic), and SE(statistic) has the square root of n in its denominator, so 4 times the n gives half the SE and half the ME.
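
A quick numerical check, as a sketch for a sample proportion: because the SE has the square root of n in its denominator, making n four times larger halves the ME. The function name is hypothetical, and the critical value 1.96 is the usual approximate z* for 95% confidence, assumed here for illustration.

```python
from math import sqrt


def margin_of_error(p_hat, n, critical_value=1.96):
    """ME = critical value x SE(p-hat), with SE(p-hat) = sqrt(p-hat * q-hat / n)."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    return critical_value * se


me_original = margin_of_error(0.2, n=64)
me_quadrupled = margin_of_error(0.2, n=256)  # 4 times the sample size

print(round(me_original, 3), round(me_quadrupled, 3))
```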

28

## M8: How do you find the ME?

### ME can be found from critical value x SE(statistic).

29

## M9: As the sample size increases, the Margin of Error ______ ?

###
Decreases.

The more observations / information you have, the more precise your estimate is going to be, hence a smaller ME.

As the sample gets very large, the ME nears zero.

30