Data presentation and interpretation Flashcards
(27 cards)
What is discrete data ?
Data that can take only specific, separate values, so it is counted rather than measured (for example, the number of siblings).
What is the purpose of a frequency table for grouped data ?
It is used for large amounts of continuous data to show the frequency of data values within particular groups or classes.
How can you clarify grouping for discrete data ?
Use groups like 10–19 and 20–29 to avoid overlap
How should continuous data be grouped to avoid ambiguity ?
Use inequalities like 10 ≤ x < 20 and 20 ≤ x < 30.
How should you adjust boundaries if there are gaps in continuous data ?
Modify the boundaries to close the gaps, such as changing
10 ≤ x ≤ 19 and 20 ≤ x ≤ 29 to
9.5 ≤ x < 19.5 and 19.5 ≤ x < 29.5
How do you find the modal class in a grouped frequency table ?
The modal class is the class with the greatest frequency.
What can you estimate from a grouped frequency table ?
You can estimate the mean and median, but you cannot find their exact values because the individual data values are not known.
How do you estimate the mean from a grouped frequency table ?
- Find the midpoint of each class.
- Multiply each midpoint by its corresponding frequency.
- Find the sum of these values and divide by n (the total frequency).
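The steps above can be sketched in Python. This is a minimal illustration only: the class boundaries and frequencies are made-up values, and the function name `estimated_mean` is hypothetical.

```python
# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [(10, 20, 4), (20, 30, 10), (30, 40, 6)]


def estimated_mean(classes):
    """Estimate the mean as sum(frequency * midpoint) / total frequency."""
    n = sum(f for _, _, f in classes)  # total frequency
    total = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total / n


print(estimated_mean(classes))  # 26.0
```

The result is only an estimate because each value is treated as if it sat at the midpoint of its class.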
What is variance ?
Variance measures how spread out or varied a set of data is from the mean.
What does a high variance indicate ?
A high variance means the data is more spread out from the mean.
How is standard deviation related to variance ?
The standard deviation is the square root of the variance.
What are the symbols for population standard deviation and variance ?
Standard deviation: σ , Variance: σ²
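As a rough illustration of how σ² and σ relate, the sketch below computes the population variance as the mean of the squared deviations from the mean, then takes its square root. The data values and the function name `population_variance` are assumptions made for the example.

```python
import math

# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]


def population_variance(values):
    """Mean of the squared deviations from the mean (sigma squared)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


variance = population_variance(data)  # sigma squared
std_dev = math.sqrt(variance)         # sigma

print(variance, std_dev)  # 4.0 2.0
```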
What are the main features of a cumulative frequency graph ?
Cumulative frequency is plotted on the y-axis.
The x-axis typically shows the upper boundaries of the classes.
The graph accumulates frequencies from each class, including those below it.
Data points are connected with a smooth curve (or straight lines).
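A short Python sketch of how the plotted points might be built: each class's frequency is added to the running total, and the total is paired with that class's upper boundary. The boundaries and frequencies here are hypothetical.

```python
from itertools import accumulate

# Hypothetical grouped data: upper class boundaries and class frequencies.
upper_boundaries = [20, 30, 40, 50]
frequencies = [4, 10, 6, 2]

# Running total of the frequencies, one value per class.
cumulative = list(accumulate(frequencies))

# Points to plot: (upper boundary, cumulative frequency).
points = list(zip(upper_boundaries, cumulative))

print(points)  # [(20, 4), (30, 14), (40, 20), (50, 22)]
```

The final cumulative frequency equals the total frequency, which is a quick check on the table.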
What is the difference between a histogram and a bar chart ?
A histogram displays grouped continuous data, whereas a bar chart is for discrete or qualitative data.
In a histogram, there are no gaps between bars, unlike in bar charts.
The height of a histogram bar represents frequency density, not frequency.
What are the key features of a histogram ?
No gaps between bars (unless a class contains no data).
Class widths may vary.
Frequency density is plotted on the y-axis.
Area of each bar represents the frequency for that class.
How do you calculate frequency density ?
Frequency density = Frequency / Class width
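A minimal Python sketch of the calculation, taking frequency density as frequency divided by class width so that each bar's area (density × width) gives back the frequency. The classes and the function name `frequency_density` are hypothetical.

```python
# Hypothetical classes with unequal widths: (lower, upper, frequency).
classes = [(0, 10, 5), (10, 30, 12), (30, 40, 8)]


def frequency_density(lower, upper, frequency):
    """Frequency density = frequency / class width."""
    return frequency / (upper - lower)


densities = [frequency_density(lo, hi, f) for lo, hi, f in classes]

# Check: the area of each bar (density * width) recovers the frequency.
areas = [d * (hi - lo) for d, (lo, hi, _) in zip(densities, classes)]

print(densities)  # [0.5, 0.6, 0.8]
```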
How are outliers calculated using the interquartile range (IQR) ?
A value is an outlier if it is:
Less than Q1 − 1.5 × IQR
Greater than Q3 + 1.5 × IQR
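The rule can be sketched in Python as below. The quartiles are passed in as already-known values, since methods for finding quartiles vary between courses and software. The data set and the function names are hypothetical.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the lower and upper outlier boundaries: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3):
    """List the values lying outside the two boundaries."""
    low, high = iqr_fences(q1, q3)
    return [x for x in values if x < low or x > high]


# Hypothetical data with quartiles assumed to be Q1 = 13 and Q3 = 20.
data = [3, 12, 14, 15, 16, 18, 19, 21, 40]
print(iqr_fences(13, 20))           # (2.5, 30.5)
print(find_outliers(data, 13, 20))  # [40]
```

Here 3 is not an outlier because it is above the lower boundary of 2.5.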
How are outliers calculated using standard deviation ?
A value is an outlier if it is:
Less than x̄ − 2σ
Greater than x̄ + 2σ
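A small Python sketch of this rule, using the standard library's `statistics.mean` and `statistics.pstdev` (population standard deviation). The data values and the function name `sd_outliers` are assumptions for illustration.

```python
import statistics


def sd_outliers(values, k=2):
    """Return values more than k standard deviations away from the mean."""
    mean = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    low, high = mean - k * sigma, mean + k * sigma
    return [x for x in values if x < low or x > high]


# Hypothetical data set with one unusually large value.
data = [10, 12, 11, 13, 12, 11, 12, 30]
print(sd_outliers(data))  # [30]
```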
What is linear regression ?
Linear regression is used when a scatter diagram shows a strong linear correlation. A line of best fit (the regression line) is drawn to approximate the relationship between the two variables.
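One common way to find a line of best fit numerically is the least squares method, sketched below. The card itself does not specify a method, so treat this as an assumption. The data points and the function name `least_squares_line` are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx          # gradient
    a = mean_y - b * mean_x  # intercept
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = least_squares_line(xs, ys)
print(a, b)  # 1.0 2.0
```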
What is the Product Moment Correlation Coefficient (PMCC) ?
The PMCC is a numerical measure of the linear correlation between two variables in bivariate data. It is denoted by r.
What range can the PMCC (r) take ?
−1 ≤ r ≤ 1
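As an illustration, r can be computed from the summary quantities Sxy, Sxx and Syy as r = Sxy / √(Sxx × Syy). The Python sketch below assumes this standard formula. The data and the function name `pmcc` are hypothetical.

```python
import math


def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical bivariate data.
r = pmcc([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 3))  # 0.775

# Points lying exactly on a line with positive gradient give r = 1.
print(pmcc([1, 2, 3], [2, 4, 6]))  # 1.0
```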
Why do we use hypothesis testing for correlation ?
Hypothesis testing is used to determine if the product moment correlation coefficient (PMCC) from a sample is representative of the relationship in the entire population. It’s often impractical to collect data from the whole population.
What is the PMCC for the whole population and the sample ?
The PMCC for the whole population is denoted by ρ, and the PMCC for a sample is denoted by r.
What is the null hypothesis (H₀) for a correlation test ?
The null hypothesis is always: H₀: ρ = 0