Statistics Flashcards

(50 cards)

1
Q

Define population

A

The whole set of items that are of interest

2
Q

Define census

A

Observes or measures every member of a population

3
Q

Define sample

A

A selection of observations taken from a subset of the population which is used to find out information about the population as a whole

4
Q

Advantages and disadvantages of a census and of a sample

A

Advantages of census
• It should give a completely accurate result

Disadvantages of census
• Time consuming and expensive
• Cannot be used when the testing process destroys the item
• Hard to process large quantity of data

Advantages of sample
• Less time consuming and expensive than a census
• Fewer people have to respond
• Less data to process than in a census

Disadvantages of sample
• The data may not be as accurate
• The sample may not be large enough to give information about small subgroups of the population

5
Q

Define sampling units

A

Individual units of a population

6
Q

Define sampling frame

A

List where sampling units of a population are individually named or numbered

7
Q

3 methods of random sampling

A

• Simple random sampling
• Systematic sampling
• Stratified sampling

8
Q

Define simple random sampling and give its advantages and disadvantages

A

The researcher randomly selects a subset of participants from a population

Advantages
• Free of bias
• Easy and cheap to implement for small populations and small samples
• Each sampling unit has a known and equal chance of selection

Disadvantages
• Not suitable when the population size or the sample size is large as it is potentially time consuming, disruptive and expensive.
• A sampling frame is needed

9
Q

Advantages and disadvantages of systematic sampling

A

Advantages
• Simple and quick to use
• Suitable for large samples and large populations

Disadvantages
• A sampling frame is needed
• It can introduce bias if the sampling frame is not random

10
Q

Advantages and disadvantages of stratified sampling

A

Advantages
• Sample accurately reflects the population structure
• Guarantees proportional representation of groups within a population

Disadvantages
• Population must be clearly classified into distinct strata
• Selection within each stratum suffers from the same disadvantages as simple random sampling

11
Q

Define a simple random sample of size n

A

Every sample of size n has an equal chance of being selected
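
A minimal Python sketch of this definition, assuming a hypothetical sampling frame of 20 numbered units. The standard library's `random.sample` draws without replacement, so every subset of size n is equally likely to be chosen. The function name and the `seed` parameter are illustrative, not part of the flashcard.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n distinct units so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame: units numbered 1 to 20.
frame = list(range(1, 21))
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```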

12
Q

Define systematic sampling

A

The required elements are chosen at regular intervals from an ordered list
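
One way this could look in Python, as a sketch under stated assumptions: the interval k is taken as (population size) // (sample size), and the starting point is chosen at random from the first k items. The frame of 100 numbered units and the function name are hypothetical.

```python
import random

def systematic_sample(ordered_list, n, seed=None):
    """Take every k-th element, starting at a random position among the first k."""
    k = len(ordered_list) // n   # sampling interval (assumes n <= len(ordered_list))
    rng = random.Random(seed)
    start = rng.randrange(k)     # random index from 0 to k - 1
    return [ordered_list[start + i * k] for i in range(n)]

# Hypothetical ordered sampling frame: units numbered 1 to 100.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=0)
print(sample)
```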

13
Q

Define stratified sampling

A

The population is divided into mutually exclusive strata (e.g. males and females) and a random sample is taken from each

14
Q

Formula to calculate the number of people we should sample from each stratum

A

Number sampled in a stratum = (number in stratum / number in population) × overall sample size
If the result is not a whole number, round to the nearest whole number.
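
The formula can be sketched in Python as follows. The strata names and sizes are made-up illustration data, and the built-in `round` is used for rounding (note that Python rounds exact halves to the nearest even number, which may differ from a textbook convention).

```python
def stratum_sample_size(stratum_size, population_size, sample_size):
    """(number in stratum / number in population) x overall sample size, rounded."""
    return round(stratum_size * sample_size / population_size)

# Hypothetical population of 200 split into two strata, overall sample of 50.
strata = {"Year 12": 120, "Year 13": 80}
population = sum(strata.values())
allocation = {name: stratum_sample_size(size, population, 50)
              for name, size in strata.items()}
print(allocation)  # {'Year 12': 30, 'Year 13': 20}
```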

15
Q

2 types of non-random sampling

A

• Quota sampling
• Opportunity sampling

16
Q

Define quota sampling

A

An interviewer or researcher selects a sample that reflects the characteristics of the whole population

17
Q

Define opportunity sampling

A

Taking the sample from people who are available at the time the study is carried out and who fit the criteria you are looking for

18
Q

Advantages and disadvantages of quota sampling

A

Advantages
• Allows a small sample to still be representative of the population
• No sampling frame required
• Quick, easy and inexpensive
• Allows for easy comparison between different groups within a population

Disadvantages
• Non-random sampling can introduce bias
• Population must be divided into groups, which can be costly or inaccurate
• Increasing scope of study increases number of groups, which adds time and expense
• Non-responses are not recorded as such

19
Q

Advantages and disadvantages of opportunity sampling

A

Advantages
• Easy to carry out
• Inexpensive

Disadvantages
• Unlikely to provide a representative sample
• Highly dependent on individual researcher

20
Q

Define quantitative variables/data

A

Variables or data associated with numerical observations

21
Q

Define qualitative variables/data

A

Variables or data associated with non-numerical observations

22
Q

Define continuous variable

A

A variable that can take any value in a given range

23
Q

Define discrete variable

A

A variable that can take only specific values in a given range

24
Q

Define mode or modal class

A

The value or class that occurs most often

25
Define median
The middle value when the data values are put in order
26
Formula of mean
$\bar{x} = \dfrac{\sum x}{n}$
27
Formula for mean in frequency table
$\bar{x} = \dfrac{\sum fx}{\sum f}$
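
A short Python sketch of the frequency-table mean (sum of fx divided by sum of f). The table values are invented for illustration.

```python
def mean_from_frequency_table(values, frequencies):
    """Mean = (sum of f * x) / (sum of f)."""
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    total_f = sum(frequencies)
    return total_fx / total_f

# Hypothetical table: the value x occurs f times.
values = [0, 1, 2, 3]
frequencies = [2, 5, 2, 1]
mean = mean_from_frequency_table(values, frequencies)
print(mean)  # sum of fx = 12, sum of f = 10, so the mean is 1.2
```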
28
Find median of both listed data and of grouped data
Listed data: find n/2. If it is not a whole number, round up and take that item; if it is a whole number, take the value halfway between that item and the one after.
Grouped data: find n/2, then use linear interpolation.
29
Linear interpolation
Lower class boundary + ((number of values into the class / frequency of class) × class width)
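
As a rough illustration, the interpolation formula can be written as a small Python function. The parameter names and the grouped data in the example are assumptions; "number of values into the class" is taken to be the target position minus the cumulative frequency before the class.

```python
def interpolate(lower_boundary, class_width, class_frequency,
                cumulative_before, position):
    """Lower class boundary + (values into class / class frequency) x class width."""
    values_into_class = position - cumulative_before
    return lower_boundary + (values_into_class / class_frequency) * class_width

# Hypothetical grouped data: n = 40, so the median is at position n/2 = 20.
# The class 10 to 20 has frequency 16, and 12 values lie below it.
median = interpolate(lower_boundary=10, class_width=10, class_frequency=16,
                     cumulative_before=12, position=20)
print(median)  # 10 + (8 / 16) * 10 = 15.0
```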
30
P_57, n = 43
43 × 0.57 = 24.51, so use interpolation to find the 24.51th value
31
Q_1 of 100 numbers
100 / 4 = 25, so use interpolation to find the 25th value
32
P_10 of 41 numbers
41 × 10% = 4.1, so use interpolation to find the 4.1th value
33
Variance formula
$\sigma^2 = \dfrac{\sum x^2}{n} - \bar{x}^2$ (sum of squared values divided by the number of values, minus the mean squared)
34
Standard deviation
$\sigma = \sqrt{\text{variance}}$
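
A minimal Python sketch of the variance and standard deviation formulas (mean of the squares minus the square of the mean, then the square root). The data set is made up for illustration.

```python
import math

def variance(values):
    """(sum of squared values / number of values) - mean squared."""
    n = len(values)
    mean = sum(values) / n
    return sum(x * x for x in values) / n - mean ** 2

def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))

# Hypothetical data: sum = 40, mean = 5, sum of squares = 232.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data))            # 232 / 8 - 25 = 4.0
print(standard_deviation(data))  # 2.0
```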
35
Coding standard deviation
Only multiplying or dividing affects the standard deviation; adding or subtracting a constant does not change it
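
The effect of coding can be checked numerically. This sketch assumes a coding of the form y = (x - a) / b with the arbitrary choices a = 5 and b = 2, and uses `statistics.pstdev` (population standard deviation) from the standard library.

```python
import statistics

# Hypothetical data and coding y = (x - a) / b with a = 5, b = 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
a, b = 5, 2

shifted = [x - a for x in data]      # subtraction only
coded = [(x - a) / b for x in data]  # subtraction, then division

sd_data = statistics.pstdev(data)
sd_shifted = statistics.pstdev(shifted)  # unchanged by the subtraction
sd_coded = statistics.pstdev(coded)      # divided by b

print(sd_data, sd_shifted, sd_coded)
```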
36
Common definition of an outlier
Either greater than Q_3 + k(Q_3 - Q_1)
Or less than Q_1 - k(Q_3 - Q_1)
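
A sketch of this outlier rule in Python. The quartiles are assumed to have been found already and are passed in; k = 1.5 is used here as an assumed, commonly chosen value, and the data are invented.

```python
def find_outliers(values, q1, q3, k):
    """Return values above Q3 + k(Q3 - Q1) or below Q1 - k(Q3 - Q1)."""
    iqr = q3 - q1
    lower_limit = q1 - k * iqr
    upper_limit = q3 + k * iqr
    return [x for x in values if x < lower_limit or x > upper_limit]

# Hypothetical data with Q1 = 12 and Q3 = 20, so the limits are 0 and 32.
data = [1, 12, 14, 15, 18, 20, 45]
outliers = find_outliers(data, q1=12, q3=20, k=1.5)
print(outliers)  # [45]
```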
37
Cleaning the data
The process of removing anomalies from a data set
38
Formula to calculate the height of each bar (frequency density) on a histogram
Frequency density (height of bar) = frequency / class width
Area of bar = k × frequency
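
A brief Python sketch, assuming the usual definition frequency density = frequency / class width, which corresponds to k = 1 in the area relation. The class widths and frequencies are illustrative.

```python
def frequency_density(frequency, class_width):
    """Height of a histogram bar: frequency divided by class width."""
    return frequency / class_width

# Hypothetical classes with unequal widths.
class_widths = [5, 10, 20]
frequencies = [10, 30, 20]
heights = [frequency_density(f, w) for f, w in zip(frequencies, class_widths)]
areas = [h * w for h, w in zip(heights, class_widths)]

print(heights)  # [2.0, 3.0, 1.0]
print(areas)    # each area equals the class frequency when k = 1
```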
39
Frequency polygon from histogram
For a histogram with equal class widths, join the midpoints of the tops of the bars
40
When comparing data sets comment on:
• A measure of location
• A measure of spread
41
What is bivariate data?
Data which has pairs of values for two variables
42
What does correlation describe?
The nature of the linear relationship between two variables
43
causal relationship
44
regression line
45
The coefficient b tells you the change in y for each unit change in x. How does correlation change b?
• If the data is positively correlated, b will be positive
• If the data is negatively correlated, b will be negative
46
When should you use the regression line?
To make predictions for values of the dependent variable that are within the range of the given data
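
A sketch of the idea in Python, assuming the line is written as y = a + bx and fitted by least squares (a standard method, not stated on the card). The `predict` helper and the data are hypothetical; it refuses to extrapolate outside the range of the given x values.

```python
def fit_regression_line(xs, ys):
    """Least-squares estimates of a and b in y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx          # change in y for each unit change in x
    a = mean_y - b * mean_x
    return a, b

def predict(a, b, x, xs):
    """Predict y only for x within the range of the given data."""
    if not min(xs) <= x <= max(xs):
        raise ValueError("x is outside the range of the data (extrapolation)")
    return a + b * x

# Hypothetical data lying exactly on y = 1 + 2x (positively correlated).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = fit_regression_line(xs, ys)
print(a, b)                    # 1.0 2.0
print(predict(a, b, 2.5, xs))  # 6.0
```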
47
Venn diagram
48
Mutually exclusive events
P(A or B) = P(A) + P(B)
49
Independent events
P(A and B) = P(A) × P(B)
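
Both rules can be checked on a small example. This Python sketch assumes one roll of a fair six-sided die and uses exact fractions; the events A, B and C are chosen purely for illustration.

```python
from fractions import Fraction

# Sample space for one roll of a fair die (an assumed example).
outcomes = set(range(1, 7))

def prob(event):
    """Probability of an event as an exact fraction."""
    return Fraction(len(event & outcomes), len(outcomes))

A = {2, 4, 6}  # even number
B = {1, 2}     # one or two
C = {1, 3}     # shares no outcomes with A

# Independent events: P(A and B) = P(A) x P(B)
independent = prob(A & B) == prob(A) * prob(B)

# Mutually exclusive events: P(A or C) = P(A) + P(C)
exclusive = not (A & C) and prob(A | C) == prob(A) + prob(C)

print(independent, exclusive)  # True True
```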
50
Tree diagram