6 - Data Analysis and Probability Skills Flashcards
This deck focuses on interpreting data from charts, graphs, tables, and spreadsheets, identifying trends and patterns, and making inferences. It also covers calculating and comparing mean, median, mode, and range, as well as solving various problems using probability concepts.
Identify:
What are 3 real-world examples of the application of statistics and probability?
- College acceptance rates.
- Baseball analysis.
- Probability of winning the lottery.
Define:
graph
A visual representation of data, plotted on axes, that shows relationships between variables.
Identify:
The three major types of graphs.
- Bar graphs
- Line graphs
- Pie charts
Explain:
The purpose of a graph title.
To explain what the graph is about.
Explain:
What does the x-axis represent in a graph?
It is horizontal and can be made up of categories or numbers.
Explain:
What does the y-axis represent in a graph?
It is vertical and typically made up of numbers.
Explain:
The function of a graph legend.
To help interpret results, especially when there are multiple lines or bars.
Explain:
When should you use a bar graph?
When the values being compared belong to separate categories that are independent of each other.
Explain:
When is a line graph most appropriate to use?
To show change over time with connected data points.
Explain:
When is a pie chart most appropriate to use?
When representing parts of a whole, often in percentages.
Identify:
Where does the x-axis run along the graph?
bottom
Identify:
Where does the y-axis run along the graph?
left
Describe:
line graph
A graph with a line connecting data points.
Line graphs can show trends over time and provide visual insights into data.
Explain:
How to read a line graph.
Look at the x-axis, draw an imaginary line up to the data point, and then draw another line to the y-axis for the value.
This method allows you to accurately determine the value associated with a specific data point.
Explain:
What does it mean if the line on a graph increases?
It suggests that the measured variable is increasing over time.
This can indicate growth, improvement or an upward trend in performance.
Explain:
What is indicated if the line on a graph decreases?
The variable being measured is declining.
This indicates a decrease or reduction in the performance or value over time.
Explain:
The primary use of a bar graph.
To compare different sets of data within a group.
Bar graphs can also show changes over time.
Explain:
The primary use of a pie chart.
To show how different sets of data compare to each other and to the whole.
Identify:
What do bars in a bar graph represent?
Data measurements.
The length of each bar relates to the measurement in the data.
Describe:
How is data presented in a bar graph?
One axis displays the categories, while the other axis shows the range of values.
Identify:
What is an important aspect of reading a bar graph?
Intervals on the scale.
The scale often increases in steps larger than one unit, with each step indicated by a tick mark.
Identify:
What does a pie chart represent?
The entire amount or 100%.
It uses sectors to show the size or measurement of each part of the data.
Identify:
What is the main requirement for all sectors in a pie chart?
They should add up to 100%.
This allows for easy comparison of the parts.
Explain:
How can you calculate the number represented by a sector in a pie chart?
Multiply the percentage (as a decimal) by the total.
For example, if Starbucks accounts for 14% of a total of $85 billion in sales, its sales are found by writing 14% as 0.14 and multiplying by the total: 0.14 × $85 billion = $11.9 billion.
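The sector calculation can be sketched in Python. The helper name `sector_value` is hypothetical; the 14% and $85 billion figures are the ones used on this card.

```python
def sector_value(percent, total):
    """Return the amount that a pie-chart sector represents."""
    # Convert the percentage to a decimal, then multiply by the total.
    return (percent / 100) * total

# Starbucks sector: 14% of a total of 85 (billions of dollars)
starbucks_sales = sector_value(14, 85)
print(round(starbucks_sales, 1))  # 11.9
```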
Identify:
What type of graph uses rectangular bars to represent data?
bar graph
Bar graphs can be displayed horizontally or vertically.
Define:
measures of central tendency
A set of numerical values that best represent or summarize the middle of a data set.
Measures of central tendency include mean, median and mode.
Define:
mean
The average of the individual values of a data set.
Calculated by dividing the sum of all values by the total number of values.
Explain:
3 steps to find the mean.
- Determine the total number of values within the dataset.
- Determine the sum of the values in the data set.
- Divide the sum of all the values by the total number of values in the dataset.
This is synonymous with finding the average.
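The three steps for finding the mean can be sketched in Python. The data set is made up for illustration.

```python
values = [4, 8, 15, 16, 23, 42]  # hypothetical data set

count = len(values)   # step 1: total number of values
total = sum(values)   # step 2: sum of the values
mean = total / count  # step 3: divide the sum by the count

print(mean)  # 18.0
```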
Define:
median
The number in the middle of a data set with an equal number of values higher and lower than it.
It can be found differently depending on whether the data set has an odd or even number of values.
Explain:
How can you find the median when the data set has an odd quantity of numbers?
- Organize the values in the set “by magnitude” (from least to greatest).
- Find the value located directly in the middle of the set.
Explain:
How can you find the median when the data set has an even quantity of numbers?
- Organize values in the set by magnitude.
- Find the two values that are located directly in the middle of the set.
- Add the two values together, then divide the sum by two.
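Both median procedures (odd and even quantity of numbers) can be sketched in one Python function. The function name and sample values are illustrative assumptions.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)  # organize by magnitude, least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd quantity: the single value directly in the middle.
        return ordered[mid]
    # Even quantity: add the two middle values, then divide by two.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 3, 9]))     # 7
print(median([7, 3, 9, 1]))  # 5.0
```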
Define:
mode
The number or value that occurs the most in a data set.
Useful for categorical data, which can be organized into groups.
Explain:
Steps to find the mode.
- Organize data by groups or magnitude.
- Identify the value that occurs most frequently within the set.
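One way to sketch the mode steps in Python is with `collections.Counter`, which groups the data and counts each value. The helper name `modes` is hypothetical. It returns a list because a data set can have more than one value tied for most frequent.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs most often in the data set."""
    counts = Counter(values)          # group the data and count each value
    highest = max(counts.values())    # the largest frequency
    return [value for value, count in counts.items() if count == highest]

print(modes(["red", "blue", "red", "green"]))  # ['red']
print(modes([1, 1, 2, 2, 3]))                  # [1, 2]
```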
Define:
range
The numerical difference between the maximum and minimum values of a data set.
It indicates the spread of the data but can be misleading if there are extreme outliers.
Identify:
Formula for finding the range.
Maximum value minus minimum value.
Range provides the spread of a data set.
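A minimal Python sketch of the range formula, using made-up values. The second call shows how a single extreme outlier can inflate the range.

```python
def data_range(values):
    """Return the maximum value minus the minimum value."""
    return max(values) - min(values)

print(data_range([3, 7, 12, 5]))       # 9
print(data_range([3, 7, 12, 5, 100]))  # 97, inflated by the outlier 100
```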
Identify:
In what type of distribution are the mean, median and mode all at the center?
Symmetrical distribution.
Often visualized as a bell curve.
Describe:
skewed distribution
When the tail on one side of a distribution is longer than the tail on the other side.
Can be left-skewed or right-skewed, depending on which side has the longer tail.
Define:
descriptive statistics
A type of statistics used to describe and summarize values in a data set.
It does not provide information about individual values.
Define:
categorical data
Data that can be organized into groups but does not have mathematical meaning.
Examples include zip codes and phone numbers.
Define:
outlier
A value that is significantly higher or lower than the rest of the values in the data set.
It can impact measures like the mean.
Explain:
How does skewness affect the mean?
The more a distribution is skewed, the less accurately the mean represents the center of the data.
The median may provide a better representation in skewed distributions.
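The effect of an outlier on the mean and the median can be illustrated with Python's standard `statistics` module. The values are hypothetical.

```python
from statistics import mean, median

typical = [30, 32, 35, 38, 40]   # hypothetical values with no outlier
with_outlier = typical + [500]   # the same data plus one extreme value

print(mean(typical), median(typical))
print(mean(with_outlier), median(with_outlier))
```

Adding the outlier moves the mean from 35 to 112.5, while the median only moves from 35 to 36.5.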
Define:
probability
Probability is the likelihood that an event will happen, ranging from impossible (0) to certain (1).
Examples include rolling a die or predicting weather outcomes.
Identify:
The range of probability values.
0 to 1
Zero indicates an impossible event, while one indicates a certain event.
Identify:
The formula for calculating probability.
Probability = Favorable Outcomes / Total Outcomes
This formula helps determine the likelihood of specific events.
Define:
simple probability
The probability of one event happening.
It uses the formula Probability = Favorable Outcomes / Total Outcomes.
Identify:
simple probability example
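Probability of rolling an even number on a die.
On a six-sided die, three of the six faces (2, 4 and 6) are even, so the chance is 3 out of 6, or 1/2. By comparison, the chance of rolling one specific number, such as a two, is 1 out of 6.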
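The formula Probability = Favorable Outcomes / Total Outcomes can be sketched in Python for a six-sided die. The helper name `probability` is hypothetical, and `fractions.Fraction` is used so the result stays an exact fraction.

```python
from fractions import Fraction

def probability(favorable, total):
    """Probability = Favorable Outcomes / Total Outcomes."""
    return Fraction(favorable, total)

faces = [1, 2, 3, 4, 5, 6]
even_faces = [face for face in faces if face % 2 == 0]

p_even = probability(len(even_faces), len(faces))  # rolling an even number
p_two = probability(1, len(faces))                 # rolling exactly a two

print(p_even)  # 1/2
print(p_two)   # 1/6
```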
Define:
sequential probability
The probability of two or more events occurring.
It can involve dependent or independent events.
Define:
dependent events
Events where the outcome of one affects the outcome of another.
Also called conditional events.
Example: Taking a marble from a bag without replacement.
Define:
independent events
Events where the outcome of one does not affect the outcome of another.
Example: Flipping a coin and rolling dice.
Explain:
How do you calculate sequential probability?
- Find the probability of each event, as in simple probability. For dependent events, use the probability of each event given the events that came before it.
- Multiply the probabilities together.
Keep the probabilities in fraction form when multiplying.
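The sequential probability steps can be sketched in Python, keeping every probability in fraction form. The helper name, the coin-and-die example, and the bag of marbles (3 red, 2 blue) are illustrative assumptions.

```python
from fractions import Fraction

def sequential_probability(*probabilities):
    """Multiply the probabilities of the individual events together."""
    result = Fraction(1)
    for p in probabilities:
        result *= p
    return result

# Independent events: flipping heads, then rolling a six.
p_heads_then_six = sequential_probability(Fraction(1, 2), Fraction(1, 6))

# Dependent events: drawing two red marbles without replacement from a
# bag of 3 red and 2 blue. The second draw has 2 red left out of 4.
p_two_reds = sequential_probability(Fraction(3, 5), Fraction(2, 4))

print(p_heads_then_six)  # 1/12
print(p_two_reds)        # 3/10
```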
Identify:
3 ways to express probability
- Fraction
- Ratio
- Percentage
Each format is useful in different contexts.
Explain:
What is the probability of picking an apple from a bowl containing 4 apples and 10 total fruits, expressed as a fraction?
4/10
This can also be expressed as a ratio (4:10) or percentage (40%).
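As a small sketch, the three formats for the apple example can be produced in Python. The variable names are illustrative.

```python
apples, total_fruits = 4, 10

as_fraction = f"{apples}/{total_fruits}"        # fraction form
as_ratio = f"{apples}:{total_fruits}"           # ratio form
as_percentage = f"{apples / total_fruits:.0%}"  # percentage form

print(as_fraction, as_ratio, as_percentage)  # 4/10 4:10 40%
```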
Explain:
What does a probability line indicate?
It visually represents the likelihood of events from impossible (0) to certain (1).
Markings on the line can show specific probabilities like 1/4 or 1/2.
Explain:
Why is probability relevant in daily life?
It helps predict outcomes in situations like weather forecasts or sports events.
Understanding probability aids decision-making based on likelihood.
Define:
marginal probability
The likelihood of an event occurring without the influence of other events, notated as P(A).
P stands for probability and A for an event occurring.
Define:
The complement of an event.
The event not occurring, commonly notated as A′ or Aᶜ.
Define:
joint probability
The likelihood of two events occurring at the same time, notated as P(A⋂B).
Define:
conditional probability
The likelihood of an event occurring given that a different event has already occurred, notated as P(A|B).
Identify:
The formula for basic (marginal) probability.
P(A) = number of ways A can occur / total number of possible outcomes.
Identify:
The formula for complement probability.
P(A′) = 1 - P(A)
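A minimal Python sketch of the complement formula, assuming a fair six-sided die where A is "rolling a six".

```python
from fractions import Fraction

p_six = Fraction(1, 6)  # P(A): rolling a six
p_not_six = 1 - p_six   # P(A′) = 1 - P(A)

print(p_not_six)  # 5/6
```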
Identify:
The joint probability formula for independent events.
P(A⋂B) = P(A) ⋅ P(B)
Explain:
What does it mean for events to be mutually exclusive?
They can never happen simultaneously, notated as P(A⋂B) = 0.
Explain:
What is the addition law of probability?
It is useful when we want to calculate the probability of either one event or another happening.
P(A⋃B) = P(A) + P(B) - P(A⋂B).
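The addition law can be checked with a short Python sketch. The events are assumptions chosen for illustration: on a fair six-sided die, A is "rolling an even number" and B is "rolling a number greater than 3".

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}
a = {2, 4, 6}  # event A: even number
b = {4, 5, 6}  # event B: greater than 3

def p(event):
    """Probability of an event as favorable outcomes / total outcomes."""
    return Fraction(len(event), len(outcomes))

# Add the two probabilities, then subtract the overlap {4, 6}
# so it is not counted twice.
p_a_or_b = p(a) + p(b) - p(a & b)

print(p_a_or_b)  # 2/3
print(p(a | b))  # 2/3, counting the union {2, 4, 5, 6} directly
```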
Identify:
The formula for the probability that any one of three (or more) disjoint events occurs.
P(A⋃B⋃C) = P(A) + P(B) + P(C)
Explain:
What is conditional probability?
If the probability of an event occurring depends on another or a prior event, it is said to be conditional.
Identify:
Conditional probability formula.
P(A|B) = P(A⋂B) / P(B)
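The conditional probability formula can be sketched in Python. The events are assumptions chosen for illustration: on a fair six-sided die, A is "rolling an even number" and B is "rolling a number greater than 3".

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}
a = {2, 4, 6}  # event A: even number
b = {4, 5, 6}  # event B: greater than 3

def p(event):
    """Probability of an event as favorable outcomes / total outcomes."""
    return Fraction(len(event), len(outcomes))

# P(A|B) = P(A and B) / P(B)
p_a_given_b = p(a & b) / p(b)

print(p_a_given_b)  # 2/3
```

Once we know the roll is 4, 5 or 6, two of those three outcomes are even, which matches the result.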
Identify:
The formula for the Law of Total Probability.
P(A) = P(A⋂B) + P(A⋂C), where B and C are mutually exclusive and together cover all possible outcomes.
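A hedged Python sketch of the Law of Total Probability, using a made-up setup: one of two bags (B or C) is chosen at random, then a marble is drawn. Event A is "drawing a red marble". All numbers are assumptions.

```python
from fractions import Fraction

p_b = Fraction(1, 2)  # chance of choosing bag B
p_c = Fraction(1, 2)  # chance of choosing bag C (B and C cover all cases)

p_red_given_b = Fraction(3, 5)  # bag B: 3 of 5 marbles are red
p_red_given_c = Fraction(1, 5)  # bag C: 1 of 5 marbles is red

p_red_and_b = p_red_given_b * p_b  # P(A and B)
p_red_and_c = p_red_given_c * p_c  # P(A and C)

# P(A) = P(A and B) + P(A and C)
p_red = p_red_and_b + p_red_and_c

print(p_red)  # 2/5
```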
Explain:
Bayes’ Theorem
It is a formula used to find the probability of an event based on prior knowledge of related events.
It converts one conditional probability to another.
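In its standard form, Bayes’ Theorem is P(B|A) = P(A|B) ⋅ P(B) / P(A). A minimal Python sketch follows, using a made-up setup: a bag (B or C) is chosen at random, bag B holds 3 red marbles out of 5, and the overall chance of drawing red is 2/5. The function name and all numbers are assumptions.

```python
from fractions import Fraction

def bayes(p_a_given_b, p_b, p_a):
    """Return P(B|A) = P(A|B) * P(B) / P(A)."""
    return p_a_given_b * p_b / p_a

p_red_given_b = Fraction(3, 5)  # P(red | bag B)
p_b = Fraction(1, 2)            # P(bag B), the prior knowledge
p_red = Fraction(2, 5)          # P(red) overall

# Convert P(red | bag B) into P(bag B | red).
p_b_given_red = bayes(p_red_given_b, p_b, p_red)

print(p_b_given_red)  # 3/4
```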
Explain:
Law of Total Probability
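It states that the probability of an event is equal to the sum of the probabilities of its mutually exclusive components, that is, the parts of the event that fall within each case of a partition of the sample space.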
Identify:
3 laws of probability
- Multiplication rule
- Addition rule
- Complement rule
Describe:
The multiplication law of probability.
It is used when calculating the probability of A and B. For independent events, the two probabilities are multiplied together; for dependent events, P(A) is multiplied by P(B|A).
Describe:
The addition rule of probability.
- It is used when calculating the probability of A or B.
- The two probabilities are added together.
- Then, the overlap is subtracted so it is not counted twice.
Describe:
The complement rule of probability.
It is used when calculating the probability of anything besides A. The probability of A not occurring is 1 - P(A).