Maths Flashcards

1
Q

Upper Quartile

A

The 75th percentile: 75% of the data lies at or below it, so it marks the start of the top 25%.

2
Q

What is Bias?

A

Prejudice or unconscious opinions that favor one option over another.

3
Q

What is a census?

A

The process of collecting data from every member of the population.

4
Q

What does inference mean?

A

Drawing conclusions from evidence and reasoning.

5
Q

Categorical variables
Numerical variables - give a definition of the two types of numerical variables

A

Categorical - data sorted into groups (e.g. male, female).
Numerical - discrete (counted values, usually whole numbers, e.g. 5)
Numerical - continuous (measured values that can include decimals, e.g. 3.4)

6
Q

Dot Plot: what is the modal number?

A

The most common number: the value with the highest frequency.
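
As a small illustration, the modal number can be found by counting how often each value appears. This is only a sketch: the function name `modal_values` and the sibling data are made up for the example, and it returns every value that ties for the highest frequency.

```python
from collections import Counter

def modal_values(data):
    """Return every value that shares the highest frequency, in sorted order."""
    counts = Counter(data)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)

# Hypothetical dot plot data: number of siblings for 10 students.
siblings = [0, 1, 1, 2, 2, 2, 3, 3, 4, 2]
print(modal_values(siblings))  # [2]
```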

7
Q

What does PPDAC stand for?

A

Problem, Plan, Data, Analysis, Conclusion

8
Q

Sampling Type: Simple random sampling

A

Random sample from the population where each member has an equal chance of being selected.
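
A minimal sketch of simple random sampling, assuming the population is a list of student ID numbers (the IDs, the function name and the seed are hypothetical). Python's `random.sample` picks without replacement, so every member has the same chance of being chosen.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Pick sample_size distinct members; every member is equally likely."""
    rng = random.Random(seed)  # a seed only makes the example repeatable
    return rng.sample(population, sample_size)

# Hypothetical population: student ID numbers 1 to 200.
students = list(range(1, 201))
sample = simple_random_sample(students, 30, seed=1)
print(len(sample), len(set(sample)))  # 30 30
```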

9
Q

List 3 advantages of Simple random sampling

A

Easy to conduct because you don’t need to overcomplicate the selecting process.
Good representation of the population due to less biased results.
Convenient to randomly sample participants.

10
Q

Sampling Type: Cluster Sampling

A

Separate the population into clusters (groups) before randomly selecting which clusters to sample.
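
The idea can be sketched as follows, assuming the clusters are form classes stored in a dictionary (the class names, student names and function name are invented for the example). Whole clusters are chosen at random, and everyone in a chosen cluster is sampled.

```python
import random

def cluster_sample(clusters, number_of_clusters, seed=None):
    """Randomly choose whole clusters, then keep every member of them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), number_of_clusters)
    members = [member for name in chosen for member in clusters[name]]
    return chosen, members

# Hypothetical population: students grouped by form class.
classes = {
    "9A": ["Ana", "Ben", "Cho"],
    "9B": ["Dev", "Eli", "Fay"],
    "9C": ["Gus", "Hana", "Ivy"],
    "9D": ["Jai", "Kiri", "Leo"],
}
chosen, members = cluster_sample(classes, 2, seed=3)
print(chosen, members)
```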

11
Q

List 3 advantages of Cluster Sampling

A

Time efficient.
Suitable for large populations.
Convenient for geographically dispersed populations.

12
Q

Sampling Type: Stratified Sampling
What does this use?

A

Divide the population into smaller subgroups (strata), then choose people within each stratum at random to form a sample.
Uses proportional representation.
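
A rough sketch of drawing a stratified sample, assuming the number to take from each stratum has already been worked out in proportion to its size. The year groups, ID numbers, sizes and function name are all hypothetical.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Randomly choose sizes[name] members from within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, sizes[name])
            for name, members in strata.items()}

# Hypothetical population: 120 Year 9 and 80 Year 10 students (ID numbers).
strata = {"Year 9": list(range(1, 121)), "Year 10": list(range(121, 201))}
# Proportional sizes for a sample of 100: 60% Year 9, 40% Year 10.
sizes = {"Year 9": 60, "Year 10": 40}

sample = stratified_sample(strata, sizes, seed=2)
print({name: len(chosen) for name, chosen in sample.items()})
# {'Year 9': 60, 'Year 10': 40}
```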

13
Q

List 3 advantages of Stratified Sampling

A

Reflects the population by capturing key population characteristics.
Ensures each subgroup in a population is represented in the sample.
Increased precision and accuracy by ensuring the subgroups of a population are proportionately represented.

14
Q

Formula for Stratified Sampling

A

(# of members in stratum / total population) x desired sample size
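
As a worked example of the formula, assume a made-up population of 200 students (120 in Year 9, 80 in Year 10) and a desired sample size of 100. The function name is hypothetical, and the result is rounded because you cannot sample part of a person.

```python
def stratum_sample_size(stratum_count, population_total, desired_sample_size):
    """(# of members in stratum / total population) x desired sample size."""
    return round(stratum_count * desired_sample_size / population_total)

# Hypothetical numbers: 120 Year 9 and 80 Year 10 students, sample of 100.
print(stratum_sample_size(120, 200, 100))  # 60
print(stratum_sample_size(80, 200, 100))   # 40
```

When the answers are not whole numbers, the rounded group sizes may add up to one more or one less than the desired sample size, so check the total.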

15
Q

Stratified Sampling: What does the sample size need to be per group of participants?

A

30+ from each group/variable

16
Q

PPDAC cycle - Purpose.
How do you find the purpose?

A

Research the variables to gain an understanding of potential contexts.
Research the context of the variables in the investigation, and link this information to the observations and conclusions in the investigation.

17
Q

PPDAC cycle - Purpose
Provide 3 examples of research questions to ask when investigating the purpose of an investigation.

A

How does Year level affect the weights of students?
Which year level tends to have larger/smaller weight?
Who is going to benefit from this investigation?

18
Q

PPDAC: Problem Outline

A

Is the median (variable 1) of (population) larger than the median (variable 2) of (population)?

19
Q

Prediction Outline

A

I predict that the (variable) of (group 1) will be larger than the (variable) of (group 2) because… . Research to back it up.

20
Q

PPDAC Cycle - Plan Outline: Pt 1
Stating what sampling method you are going to use.

A

I am going to use the stratified sampling method to extract a sample size of __ from __ on NZ Grapher.

21
Q

PPDAC Cycle - Plan Outline: Pt 2
Reasons for using the sampling method.

A

I will use the stratified sampling method because… (list advantages of stratified sampling method)
…it reduces selection bias and is an efficient, accurate, and fair sampling method that ensures each group from the population (state population in brackets here) is properly represented in the analysis.

22
Q

PPDAC Cycle - Plan Outline: Pt 3
Discussing the sample size.

A

My sample size will be __ because this ensures that I will have a sample of at least 30 __ from each subgroup (state two groups here). It is crucial that I have a sample of at least 30 from each group, to give enough data from both to be confident in my conclusions.

23
Q

PPDAC Cycle - Plan Outline: Pt 4
Confidence in sample selection.

A

I am confident that my sample reflects the population characteristics accurately because it has more than 30 from each group and also shows the same proportion of __ and __ as the population. I will use NZ Grapher to generate the box and whisker graph and dot plot and analyze them to draw an appropriate conclusion.

24
Q

PPDAC Cycle - Plan Outline: Pt 5
Stratified sampling calculations

A

Group 1 (group 1): (# of members in stratum / total population) x desired sample size
Group 2 (group 2): (# of members in stratum / total population) x desired sample size
The sample size of (group 1) in this investigation will be (#) and the sample size of (group 2) will be (#).

25
PPDAC cycle - Analysis: Name the 5 features of a sample distribution to discuss
Centre (median), Shape, Spread, Overlap, Unusual Features.
26
PPDAC cycle - Analysis: What criteria (3) should you consider when analyzing the Centre?
Which subgroup is higher? State the median and how big the difference is between group 1 median and group 2 median. Provide reasons (prior knowledge, or what you noticed that supports your point) and online research (with link) that supports your point.
27
PPDAC cycle - Analysis: Centre Outline Pt 1 "I notice" statement.
Looking at the sample distribution, I notice that the median (variable) for (group 1) (group 1 median) is (difference) (input calculation: larger median - smaller median) greater than the median (variable) for (group 2) (group 2 median).
28
PPDAC cycle - Analysis: Centre Outline Pt 2 Discussing which group's median is larger, and how big the difference is.
This means that on average, the (variable) for (group) tends to be larger/smaller than that of (group). However, it should be noted that in this sample, the difference in the (variable) for (group) and (group) is (choose one) relatively small / clear but not very large / large and significant.
29
PPDAC cycle - Analysis: Centre Outline Pt 3 Expectations with justification (reasons). Alternative addition in the event of an opposite trend.
I expected the (group) (variable) to be larger than the (group) (variable) because... (list reasons). However, the data shows the opposite trend, suggesting factors such as sampling variation and __ may influence the results.
30
PPDAC cycle - Analysis: Centre Outline Pt 4 Final part of Centre.
Research, with link, that confirms/supports reasons from Pt 3.
31
PPDAC cycle - Analysis: Shape List 5 attributes to take note of.
1) Look at the overall shape of the sample.
2) Write what you see: give numerical values as to where there is more data.
3) State if the box and whisker plot is left or right skewed, or symmetrical.
4) State if the dot plot is bimodal, unimodal, etc.
5) State the approximate shape using the phrase, "Tends to be..."
32
PPDAC cycle - Analysis: Shape List 7 types of shapes and a description of the shape.
Normal Distribution: hill/mound shape, symmetrical, bell-shaped curve.
Right Skew: Tail on the right side.
Left Skew: Tail on the left side.
Bimodal: 2 peaks.
Triangular: Triangular shape.
Irregular: No real pattern.
33
PPDAC cycle - Analysis: Shape Outline 1 Describing the shape for group 1 dot plot. Describing the shape for group 1 box and whisker plot. Reasons. Describing the shape for group 2 dot plot. Describing the shape for group 2 box and whisker plot. Reasons. Research.
In the sample distribution, I notice that the dot plot for (group 1) tends to be (shape). The box and whisker plot also shows that the (group 1) sample is left/right skewed. This may be due to some (group 1) having a much larger/smaller (variable) than the rest of the (group 1). In particular, (outlier description). This may be due to (plausible reason). The (group 2) dot plot shows a (shape) distribution, with the box and whisker plot being left/right skewed. This may be due to (reason/outlier description). This again may be due to (reason). Research to back it up.
34
PPDAC cycle - Analysis: Spread What two features do you talk about? How do you calculate the IQR?
Talk about the position and size of the IQR.
IQR = UQ - LQ (show calculations)
35
PPDAC cycle - Analysis: Spread Outline 1 Group 1 IQR. Group 2 IQR. Difference between IQR of two groups. Indication. Reasons and research.
Looking at the sample distribution, I notice that the box and whisker plot for (group 1) shows that the width of the box (which represents the interquartile range, IQR) is (IQR) (UQ - LQ). In comparison, the IQR for (group 2) is (IQR) (UQ - LQ). This means that the IQR for (group) is (difference between the IQRs) wider than for (group), indicating that the middle 50% of (group) are more/less spread out compared to (group). Incorporate reasons, then research to back them up.
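
The IQR comparison can be sketched in a few lines, assuming two made-up groups of weights. The quartile method used here (`statistics.quantiles` with `method="inclusive"`) is an assumption: NZ Grapher and other tools may use a slightly different convention and give slightly different quartiles.

```python
from statistics import quantiles

def iqr(data):
    """Interquartile range: upper quartile minus lower quartile."""
    lq, _median, uq = quantiles(data, n=4, method="inclusive")
    return uq - lq

# Hypothetical weights (kg) for two groups of 9 students each.
group_1 = [50, 52, 55, 58, 60, 63, 66, 70, 75]
group_2 = [48, 50, 51, 53, 55, 56, 58, 60, 62]

print(iqr(group_1), iqr(group_2))   # 11.0 7.0
print(iqr(group_1) - iqr(group_2))  # 4.0
```

Here the middle 50% of group 1 is more spread out than the middle 50% of group 2.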
36
PPDAC cycle - Analysis: Overlap
Refers to how much the datasets overlap.
37
PPDAC cycle - Analysis: Overlap Outline 1 Percentage of box plot for Group 1 that is greater than the median of Group 2. State which group has a majority variable over the other group. State which group has a greater average variable. Reasons and research.
The sample distribution box plot shows that _% of (group 1) (variable) is greater than the median (variable) for (group 2). This means that the majority of the sampled (group 1 as a population) have a greater (measurement: weight, percentage, height, etc) of (variable) than (group 2). This suggests that (group 1 as a population) will on average have a greater (variable) than (group 2 as a population) out of all (participant title: students, kiwis, rugby players, etc) from (source of data). A reason for this could be that...(research). This will mean that...(one sentence explanation/result of research).
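
On a box plot the percentage is usually read off in steps of 25%, but with the raw data it can be counted directly. A minimal sketch with made-up weights and a hypothetical function name:

```python
from statistics import median

def percent_above(group, threshold):
    """Percentage of values in group that are greater than threshold."""
    above = sum(1 for value in group if value > threshold)
    return 100 * above / len(group)

# Hypothetical weights (kg) for the two groups.
group_1 = [50, 52, 55, 58, 60, 63, 66, 70, 75, 80]
group_2 = [48, 50, 51, 53, 55, 56, 58, 60, 62]

group_2_median = median(group_2)
print(group_2_median)                          # 55
print(percent_above(group_1, group_2_median))  # 70.0
```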
38
PPDAC cycle - Analysis: Unusual Features List 4 types of unusual features. Formula for identifying outliers. Formula for identifying extreme values. NZ Grapher tip for identifying outliers.
Can be unusual clusters, unusual points, outliers, or extreme values.
Identifying outliers: more than 1.5 x IQR above the UQ or below the LQ.
Identifying extreme values: more than 3 x IQR above the UQ or below the LQ.
NZ Grapher tip: Select “Box plot (no outlier)” to easily identify outliers.
39
PPDAC cycle - Analysis: Unusual Features Outline Identify the number of outliers and associated group. Show calculations for the selected unusual features formula to identify the value. State whether the unusual feature can be considered an outlier with reference to the value "1.5x".
Looking at the sample distribution for (occupation: rugby players, students, etc) for (group 1) and (group 2), I can see that there is (#) (occupation) in (group 1) who has a (measurement: weight, height, etc) of (#). The IQR for (group 1) is (IQR) (UQ - LQ). 1.5 x IQR = 1.5 x (IQR) = (V). The UQ for (group 1) is (UQ) and therefore, (UQ) + (V) = (#). Therefore, this (occupation) can/cannot be considered an outlier because the (measurement) is/is not more than 1.5 times the IQR above the UQ. However, this can be considered an unusual point. Reasons and research to back it up.
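
The outlier calculation can be checked with a short sketch. The weights, function names and the `multiplier` parameter are made up for the example, and the quartile method (`method="inclusive"`) is an assumption that may differ slightly from NZ Grapher. The lower limit (LQ minus 1.5 x IQR) is included so that unusually small values are caught as well.

```python
from statistics import quantiles

def fences(data, multiplier=1.5):
    """Lower and upper limits: LQ - multiplier x IQR and UQ + multiplier x IQR."""
    lq, _median, uq = quantiles(data, n=4, method="inclusive")
    step = multiplier * (uq - lq)
    return lq - step, uq + step

def values_outside(data, multiplier=1.5):
    """Values beyond the limits (1.5 for outliers, 3 for extreme values)."""
    lower, upper = fences(data, multiplier)
    return [value for value in data if value < lower or value > upper]

# Hypothetical weights (kg) for 9 rugby players.
weights = [50, 52, 55, 58, 60, 63, 66, 70, 98]

print(fences(weights))             # (38.5, 82.5)
print(values_outside(weights))     # [98]
print(values_outside(weights, 3))  # []
```

In this made-up sample, 98 kg is an outlier (above 66 + 16.5 = 82.5) but not an extreme value (not above 66 + 33 = 99).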
40
What is a confidence interval?
Interval within which we expect the population median to lie.
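The cards do not give a formula for the interval. The sketch below assumes the informal confidence interval commonly taught alongside PPDAC, median ± 1.5 x IQR / √n, where n is the sample size. The weights, the function name and the quartile method (`method="inclusive"`) are also assumptions.

```python
from math import sqrt
from statistics import median, quantiles

def informal_confidence_interval(data):
    """Assumed formula: median - 1.5*IQR/sqrt(n) to median + 1.5*IQR/sqrt(n)."""
    lq, _median, uq = quantiles(data, n=4, method="inclusive")
    margin = 1.5 * (uq - lq) / sqrt(len(data))
    centre = median(data)
    return centre - margin, centre + margin

# Hypothetical weights (kg): median 60, IQR 11, n = 9.
weights = [50, 52, 55, 58, 60, 63, 66, 70, 75]
print(informal_confidence_interval(weights))  # (54.5, 65.5)
```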
41
Comparing Confidence Intervals Outline: Expectations of group 1 median value, and group 2 median value. Compare the confidence intervals, referring to evidence, overlap of intervals, and conclusion of which group median is greater.
We would expect the (group 1) median (measurement) to lie between (#) and (#), and we would expect the (group 2) median (measurement) to lie between (#) and (#). Comparing the confidence intervals, we can see that they do not/do overlap, and therefore we can/cannot conclude that the median (measurement) for (group 1) is greater/smaller than the median (measurement) for (group 2).
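The comparison rule can be sketched as a simple check, assuming each interval is written as a (low, high) pair. The function names and the interval values are made up for the example.

```python
def intervals_overlap(interval_1, interval_2):
    """True when two (low, high) intervals share at least one value."""
    low_1, high_1 = interval_1
    low_2, high_2 = interval_2
    return low_1 <= high_2 and low_2 <= high_1

def make_the_call(interval_1, interval_2):
    """Return the conclusion the outline allows for the two group medians."""
    if intervals_overlap(interval_1, interval_2):
        return "cannot make the call"
    if interval_1[0] > interval_2[1]:
        return "group 1 median is greater"
    return "group 2 median is greater"

# Hypothetical confidence intervals for group 1 and group 2.
print(make_the_call((54.5, 65.5), (51.5, 58.5)))  # cannot make the call
print(make_the_call((60.0, 66.0), (50.0, 55.0)))  # group 1 median is greater
```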
42
PPDAC cycle - Conclusion: List the 5 characteristics of the conclusion
Describe confidence intervals. Make the call (second part of confidence interval outline). Describe sampling variability. Describe how the confidence intervals will be affected by a bigger sample size. Add reflections.
43
PPDAC - Conclusion: Sampling Variability Definition
Each time a sample is taken from the same population (with the same sample size), it is likely that the sample estimate (mean/median) and the confidence intervals will be different because different members of the population can be selected, which will give different sample data. However, we would expect the median of the population to still lie within the confidence intervals generated by the sample.
44
PPDAC - Conclusion: Sampling Variability Outline: State the result of taking another sample from the same population, and the reason behind it. State position of the median.
If we take another sample from the population and investigate if (variables in context), we will get a different mean, median, etc. This is because each time we take a sample, different (variables in context) will be selected (each data point has an equal chance of being selected). However, we would expect the median of the population to still lie within the confidence interval generated by each sample.
45
PPDAC - Conclusion: Name the two scenarios where the confidence intervals are affected by a bigger sample size:
No overlap. Overlap.
46
PPDAC - Conclusion: Two scenarios where the confidence intervals are affected by a bigger sample size: No Overlap
The confidence intervals will be shorter if I increase my sample size. A larger sample will produce shorter informal confidence intervals, decrease sample variability, and result in more accurate sample estimates. Since my results already do not overlap, further shortening the intervals will make it even less likely that they will overlap. Therefore, I do not think increasing the sample size will change my overall inference.
47
PPDAC - Conclusion: Two scenarios where the confidence intervals are affected by a bigger sample size: Overlap
The confidence intervals will be shorter if I increase my sample size. A larger sample will produce shorter informal confidence intervals, decrease sample variability, and result in more accurate sample estimates. Since my intervals currently overlap, shortening them could reduce or remove the overlap. Therefore, increasing the sample size might change my inference, because I may then be able to make the call.
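To see why a bigger sample gives shorter intervals, here is a rough sketch. It assumes the informal interval median ± 1.5 x IQR / √n (a formula not stated on these cards) and a made-up IQR of 12 that stays about the same as the sample grows.

```python
from math import sqrt

def interval_width(iqr, sample_size):
    """Width of the assumed informal interval: 2 x 1.5 x IQR / sqrt(n)."""
    return 2 * 1.5 * iqr / sqrt(sample_size)

# Hypothetical IQR of 12 at three different sample sizes.
for n in (25, 100, 400):
    print(n, interval_width(12, n))
# 25 7.2
# 100 3.6
# 400 1.8
```

Under these assumptions, taking four times as many data points halves the width of the interval.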
48
PPDAC - Conclusion: List 8 reflection questions
Was my sample size adequate?
Was there any bias in my samples?
How did I minimize any bias in my samples?
Was my sample representative of the population?
Was the source of my data reliable?
Is my sample data taken from my target population?
Do my conclusions seem reasonable?
What are some limitations of my investigation?
49
PPDAC - Conclusion: Reflection: Adequate sample size outline
I believe that my sample size of (#), consisting of (#) (group 1) and (#) (group 2), was appropriate given that the population included only (population total). If my sample size had been smaller, I would not have had enough data to draw meaningful conclusions. I felt that my sample provided adequate representation.
50
PPDAC - Conclusion: Reflection: Minimize bias in sample outline
I minimized bias in my samples by using the stratified sampling method, ensuring (advantages of stratified sampling method).
51
PPDAC - Conclusion: Reflection: reliable data source outline
My data source is reliable... (reasons)
52
PPDAC - Conclusion: Reflections and improvements
If I were to repeat my investigation into the median (variable) of (group 1) and the median (variable) of (group 2) from NZ, using a sample of 100 (group category, e.g. students) from the NZ Grapher (URL to NZ Grapher relevant data) database, I would perhaps consider (at least 3 considerations):
- The location of (group category) because (reasons). This means (explanation). This external/internal factor would influence the investigation because (explain/reasons, e.g. it is not one of the variables focused on).
- Anything else that can affect the chosen variables of the investigation (e.g. species, fertility stage, gender, etc.), with reasons.
Therefore, if I were to repeat my investigation, I would consider the 3 factors of (list 3 factors).