QM LM7 Estimation and inference Flashcards

1
Q

What is a sample?

A
  • A subset of a population, drawn to obtain information about the population’s parameters (μ and σ)
  • The parameters are estimated through sample statistics (x̄ and s)
2
Q

What is probability sampling?

A

Sampling in which every member of a population has an equal chance of being selected
- Samples therefore tend to be more representative of the population than under non probability sampling

2
Q

What is simple random sampling?

A
  • A form of probability sampling
  • Where a subset of a larger population is created such that each element has an equal probability of being selected
  • E.g. if the population size is N = 500
  • A random number generator selects 50 distinct numbers between 1 and 500
  • The elements with those numbers form a sample of 50
  • This method is useful when data are homogeneous
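
The N = 500, n = 50 example above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `simple_random_sample` and the seed are assumptions made for the example.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw `sample_size` distinct IDs from 1..population_size, each equally likely."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    # random.sample draws without replacement, so no ID is selected twice
    return rng.sample(range(1, population_size + 1), sample_size)

sample_ids = simple_random_sample(population_size=500, sample_size=50, seed=42)
print(sorted(sample_ids))
```

Each of the 500 IDs has the same 50/500 chance of ending up in the sample.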
3
Q

What is systematic sampling?

A
  • A form of probability sampling used when the population is too large to code (i.e., to number every member)
  • Select every kth element until the desired sample size is reached
  • E.g. an auditor reviewing a company’s accounts might look at every 10th account receivable, because there are too many to examine them all
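
A rough sketch of the every-kth-element idea, using the auditor example. The invoice IDs are made up, and the random starting point within the first k items is an assumption (a common convention, not something the card states).

```python
import random

def systematic_sample(items, k, seed=None):
    """Return every k-th item, starting from a random position in the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed: random start so the first items are not always favoured
    return items[start::k]

# Hypothetical ledger of 200 accounts receivable
receivables = [f"AR-{i:04d}" for i in range(1, 201)]
audited = systematic_sample(receivables, k=10, seed=1)
print(len(audited), audited[:3])
```

With 200 items and k = 10 the auditor reviews 20 accounts, evenly spaced through the ledger.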
4
Q

What is stratified random sampling?

A
  • A form of probability sampling in which the population is subdivided into subpopulations (strata) based on one or more classifications
  • E.g., if surveying a large group of people we might subdivide by sex, age, and income level
  • Simple random samples are drawn from each subpopulation, each in proportion to the size of that subpopulation
  • The subsamples are then pooled to form the main sample
  • This guarantees that every population subdivision is represented in the sample, making the statistics more precise
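
The proportional draw-then-pool procedure can be sketched as follows. The records, the `age_band` field, and the 10% sampling fraction are all hypothetical, chosen only to make the proportions easy to check.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, fraction, seed=None):
    """Draw a simple random sample of `fraction` from each stratum, then pool them."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)  # group records by stratum
    pooled = []
    for members in strata.values():
        n_draw = round(len(members) * fraction)  # proportional to stratum size
        pooled.extend(rng.sample(members, n_draw))
    return pooled

# Hypothetical population: 600 people under 40, 400 aged 40 or over
people = [{"id": i, "age_band": "under_40" if i < 600 else "40_plus"} for i in range(1000)]
sample = stratified_sample(people, lambda p: p["age_band"], fraction=0.1, seed=7)
print(len(sample))
```

The pooled sample keeps the 60/40 split of the population, which is what guarantees each subdivision is represented.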
5
Q

What is sampling error?

A
  • The difference between the observed value of a sample statistic and the population parameter it estimates
  • It arises because we use just a subset of the population
6
Q

How do we find sampling distribution of the sample means?

A
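  • Take many samples of the same size from a population
  • Find their means. The means will differ from sample to sample, so the sample mean is itself a random variable
  • Put the means together; for a sufficiently large sample size they form an approximately normal distribution (central limit theorem)
  • The standard deviation of this distribution is the standard error of the sample mean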
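
A small simulation of these steps, as a sketch under stated assumptions: the population is synthetic (a skewed exponential distribution, chosen to show that the means still cluster symmetrically), and the sample size of 50 and the 2,000 repetitions are arbitrary.

```python
import random
import statistics

rng = random.Random(0)
# Assumed synthetic population: deliberately skewed, not normal
population = [rng.expovariate(1.0) for _ in range(20_000)]

n = 50  # size of each sample
# Take many samples and record each sample's mean
sample_means = [statistics.fmean(rng.sample(population, n)) for _ in range(2_000)]

mean_of_means = statistics.fmean(sample_means)     # centre of the sampling distribution
sd_of_means = statistics.stdev(sample_means)       # its spread: the standard error
theoretical_se = statistics.pstdev(population) / n ** 0.5

print(mean_of_means, sd_of_means, theoretical_se)
```

The spread of the simulated means should land close to the population standard deviation divided by sqrt(n).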
7
Q

What is standard error?

A
  • Take the sample standard deviation s and divide by the square root of the sample size n: s / sqrt(n)
  • It measures the precision we can attach to an estimate created by sampling the population
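
As a worked illustration of s / sqrt(n), here is a minimal sketch. The eight observations are made up, and `standard_error` is a helper name introduced for the example.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(len(sample))

# Hypothetical observations
returns = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se = standard_error(returns)
print(round(se, 4))
```

Here the squared deviations sum to 32, so s² = 32/7 and the standard error is sqrt(32/56), about 0.756.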
8
Q

What is cluster sampling?

A

Where the population is divided into clusters, each of which is a mini representation of the whole population
- Certain clusters are then selected as a whole using simple random sampling. This is called “one stage cluster sampling”.
- If we also sample WITHIN each selected cluster, this is called “two stage cluster sampling”
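
The one stage and two stage variants can be contrasted in a short sketch. The branch and customer names are hypothetical, and `per_cluster` is an illustrative parameter: leaving it unset keeps whole clusters (one stage), while setting it samples within each chosen cluster (two stage).

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Pick whole clusters at random; optionally subsample within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: choose clusters
    sample = []
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            sample.extend(members)  # one stage: keep every member
        else:
            sample.extend(rng.sample(members, per_cluster))  # two stage
    return chosen, sample

# Hypothetical population: 10 branches with 20 customers each
branches = {f"branch{b}": [f"branch{b}_cust{i}" for i in range(20)] for b in range(10)}
_, one_stage = cluster_sample(branches, n_clusters=3, seed=3)
_, two_stage = cluster_sample(branches, n_clusters=3, per_cluster=5, seed=3)
print(len(one_stage), len(two_stage))
```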

9
Q

What are the drawbacks of cluster sampling?

A
  • Usually results in the lowest precision of the probability sampling methods, since a cluster may not be representative of the population
  • It is, however, time and cost effective
10
Q

What is non probability sampling?

A
  • Depends on factors such as judgement or convenience (in terms of access to data)
  • Runs the risk that samples may be non representative
11
Q

What is convenience sampling?

A
  • A form of non probability sampling
  • Observations are selected that are easy to obtain or are accessible
  • Not necessarily representative, but low cost
12
Q

What is judgemental sampling?

A
  • A form of non probability sampling
  • Select observations based on experience and knowledge
  • Useful when there is a time constraint and/or the specialty of the researcher would result in better representation
  • E.g., during an audit an auditor may look at specific accounts or kinds of transactions, knowing that if these are okay the rest usually are too
13
Q

How do we estimate population mean based on samples?

A
  • Take the mean of the sample means
14
Q

What happens when we take many samples from the population?

A
  • As the sample size n increases, the distribution of sample means (plotted as a histogram) narrows: the tails shrink and the peak gets taller
  • When the sample size is something like 1000, the sampling distribution of the sample means is almost a spike at the centre, so the estimate of the population mean is very precise
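
The narrowing can be checked numerically. This is a sketch on an assumed synthetic population (normal with mean 100 and standard deviation 15); the sample sizes 10, 100, and 1000 and the 500 repetitions are arbitrary choices.

```python
import random
import statistics

rng = random.Random(1)
# Assumed synthetic population
population = [rng.gauss(100, 15) for _ in range(20_000)]

def spread_of_sample_means(n, n_samples=500):
    """Standard deviation of the means of `n_samples` samples of size n."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(n_samples)]
    return statistics.stdev(means)

spreads = {n: spread_of_sample_means(n) for n in (10, 100, 1000)}
for n, spread in spreads.items():
    print(n, round(spread, 2))
```

The spread falls roughly in proportion to 1 / sqrt(n), so a hundredfold increase in sample size cuts it by about a factor of ten.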
15
Q

Why is it incorrect to say that standard error is the same thing as the standard deviation of the sample?

A
  • SD measures the dispersion of observations around their mean; it describes the data and does not systematically shrink as more observations are added
  • Standard error is used for inference about the population; it is not constant and falls as the sample size increases (s / sqrt(n))
16
Q

How do we deal with having only a single sample?

A
  • We can repeatedly draw samples from the original data sample (resampling) in order to estimate population parameters
  • One way is the bootstrap method, which resamples with replacement thousands of times to CREATE a sampling distribution
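
A minimal bootstrap sketch, assuming resampling with replacement at the original sample size. The ten observations are made up, and 2,000 resamples is an arbitrary choice.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples=2000, seed=None):
    """Means of resamples drawn with replacement, each the size of the original."""
    rng = random.Random(seed)
    n = len(sample)
    # random.choices samples WITH replacement, so observations can repeat
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)]

# Hypothetical single sample
observed = [12.1, 9.8, 15.3, 11.0, 13.7, 10.4, 14.9, 12.6, 8.9, 13.2]
boot = bootstrap_means(observed, seed=0)
boot_se = statistics.stdev(boot)  # bootstrap estimate of the standard error
print(round(statistics.fmean(boot), 2), round(boot_se, 2))
```

The spread of the resampled means serves as an estimate of the standard error without drawing any new data from the population.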
17
Q

What is the jackknife method?

A
  • A resampling method for estimating population parameters when you have only one sample
  • You omit one observation from the sample at a time, recomputing the statistic each time
  • With this method we can only create as many samples as we have observations (n)
  • It produces the same (or very similar) results from run to run, whereas the bootstrap method may not (two bootstrap samples may contain completely different numbers)
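
The leave-one-out procedure can be sketched for the sample mean. The data are hypothetical, and the helper names are introduced for the example; the standard error uses the usual jackknife formula sqrt((n - 1)/n × Σ(θᵢ − θ̄)²).

```python
import math
import statistics

def jackknife_means(sample):
    """Mean of the sample with each observation left out in turn (n values)."""
    return [statistics.fmean(sample[:i] + sample[i + 1:]) for i in range(len(sample))]

def jackknife_se(sample):
    """Jackknife standard error of the mean."""
    n = len(sample)
    loo = jackknife_means(sample)
    centre = statistics.fmean(loo)
    return math.sqrt((n - 1) / n * sum((m - centre) ** 2 for m in loo))

# Hypothetical single sample
observed = [12.1, 9.8, 15.3, 11.0, 13.7, 10.4, 14.9, 12.6, 8.9, 13.2]
print(len(jackknife_means(observed)), round(jackknife_se(observed), 4))
```

No random numbers are involved, which is why repeated runs give identical results. For the mean, this jackknife standard error coincides with s / sqrt(n).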
18
Q

How do we find a 95% confidence interval for sample size 63 and sample mean 15?

A
  • It will be sample mean ± 1.96 × (s / sqrt(n)), i.e. 15 ± 1.96 × (s / sqrt(63))
  • 1.96 is the number of standard deviations either side of the mean required to capture 95% of a normal distribution
  • s / sqrt(n), where n is 63 and s is the standard deviation of our sample, is the standard error of our sample mean
    - This works because s / sqrt(n) is a close estimate of the standard deviation of the sample means we would find if we took lots of samples from the population
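
The calculation can be sketched as below. The card gives n = 63 and a sample mean of 15 but no sample standard deviation, so s = 4 is purely an assumed value for illustration; `confidence_interval_95` is a helper name introduced here.

```python
import math

def confidence_interval_95(sample_mean, s, n, z=1.96):
    """Sample mean ± z × (s / sqrt(n)), using the normal critical value 1.96."""
    standard_error = s / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# s = 4.0 is an assumption; the card does not state the sample standard deviation
lower, upper = confidence_interval_95(sample_mean=15, s=4.0, n=63)
print(round(lower, 2), round(upper, 2))
```

With the assumed s = 4, the standard error is about 0.504 and the interval runs from roughly 14.01 to 15.99.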
19
Q

What is the difference between descriptive and inferential statistics?

A
  • Descriptive statistics merely describe the sample
  • We just use sample mean and sample standard deviation
  • Inferential statistics attempt to make inferences about the whole population based on the sample/s
  • In this case we use the standard deviation of the sampling distribution of the mean, estimated by s / sqrt(n)
  • This is the standard error
20
Q

Is a confidence interval used in descriptive or inferential statistics?

A
  • Inferential
  • We use a confidence interval when we have samples of the population but not the whole population
  • A confidence interval is a range, derived from our samples, that we expect to contain a population parameter such as the mean with a stated level of confidence (e.g. 95%)
21
Q

What is standard error?

A
  • The standard deviation of the distribution of sample means we get when sampling our population multiple times