Exam 1 Flashcards

(62 cards)

1
Q

Descriptive

A

-Descriptive Statistics: Numbers that describe and summarize data.
Ex: The unemployment rate in December 2024 was 4.1%.

-Descriptive Statistics – “Just describing the data”

-Think of descriptive statistics as summarizing and organizing the information you already have. It doesn’t try to make predictions—just explains what’s in front of you.

-Example: You survey 100 students about their favorite ice cream flavor.
40 like chocolate
35 like vanilla
25 like strawberry

-That’s descriptive statistics! You’re just reporting what’s in the data.

  • Other examples of descriptive stats:
    The average test score in a class is 85.
    The tallest player on a basketball team is 6’8”.
    The most common car color in a parking lot is blue.
2
Q

Inferential

A

-Inferential Statistics: Make an inference from the data; infer from a sample what is going on with the population.
Ex: Political polling

-Inferential Statistics – “Making guesses about a bigger group”

-Inferential statistics takes a small group (sample) and makes a prediction about a bigger group (population).

-Example: You survey 100 students at your school about their favorite ice cream flavor, and 40% say chocolate.
You then infer (or guess) that about 40% of all students in your school like chocolate, even though you didn’t ask everyone.

-Other examples of inferential stats:
A political poll surveys 1,000 voters and predicts that 55% of the whole country supports a candidate.
A researcher studies 50 patients and concludes that a new drug works for most people.
A scientist tests a few drops of ocean water and estimates how much pollution is in the entire ocean.

3
Q

Descriptive vs. Inferential

A

The Key Difference
Descriptive Statistics = “Just the facts” (describing the data you collected).

Inferential Statistics = “Making an educated guess” (using a sample to predict something about a whole population).

4
Q

Population vs. Sample

A

Population: Entire group you are interested in studying

Sample: Set of individuals selected from a population

Researchers analyze samples because studying an entire population is often impractical. The goal is to use inferential statistics to generalize findings from a sample to the full population.

5
Q

Independent and Dependent variables

A

Independent variable (IV): manipulate
Dependent variable (DV): measure

Independent Variable (IV): The variable manipulated by the researcher (the cause).

Dependent Variable (DV): The outcome variable that is measured (the effect).
Example: A study tests the effect of sleep (IV) on test scores (DV).

6
Q

Random Assignment

A

Assign participants to a group based on a random process.
Minimizes confounds.
Confound: an extraneous variable that may influence the DV and explain the results.

In an experiment, participants are randomly assigned to different conditions to ensure that groups are similar at the start.

Random assignment helps reduce bias and increase the validity of research findings.

7
Q

Random Sampling

A

Each member of the population has an equal chance of being in the sample.

Important for 2 reasons:
1. Estimate parameters of the population
2. Confidence in the accuracy of your estimates

Random sampling helps reduce bias and increase the validity of research findings.

8
Q

Correlational Methods

A

Correlational designs are those that look at the relationships between two variables

Is your height related to your shoe size?
Variable: something that can be measured or counted

9
Q

Laboratory Experiments

A

Examine causal relationships
Independent variable (IV): manipulate
Dependent variable (DV): measure

10
Q

Variable

A

Something that can be measured or counted

11
Q

measurement

A

The process of assigning numbers or categories to variables.

12
Q

Scales of Measurement

A

Distinguish variables that have different kinds of values

Four scales:
Nominal
Ordinal
Interval
Ratio

13
Q

Nominal

A

Nominal Scale

What it is: Just names or labels for different groups.

Key idea: There’s no order or ranking.

Example: Sorting people by their favorite color (red, blue, green).
You can’t say one color is “more” than another: they’re just different.

14
Q

Ordinal

A

What it is: Categories that do have an order.

Key idea: You can rank them (like 1st, 2nd, 3rd), but the difference between ranks isn’t always equal.

Example: A race where runners finish 1st, 2nd, and 3rd.

We know 1st is better than 2nd, but we don’t know how much better.

15
Q

Interval

A

An interval scale is a way of measuring things where:
1️⃣ The difference between numbers is always the same (equal intervals).
2️⃣ There is NO true zero (zero doesn’t mean “nothing”).
3️⃣ You can add and subtract, but you can’t multiply or divide meaningfully.

Ex.1
Think of a Thermometer! 🌡️
Imagine a thermometer that shows temperature in Celsius or Fahrenheit:

The difference between 10°C and 20°C is the same as between 20°C and 30°C (equal steps ✅).
0°C doesn’t mean “no temperature”, it’s just another point on the scale (NO true zero ❌).
🔹 That’s why temperature in Celsius/Fahrenheit is an interval scale!

ex.2
✔️ IQ Scores – A person with 140 IQ isn’t “twice as smart” as someone with 70 IQ.
✔️ Years on a Calendar – The year 0 doesn’t mean “no time,” it’s just a point in history.
✔️ Shoe Sizes – The difference between size 7 and 8 is the same as between 10 and 11, but size 0 doesn’t mean “no shoe.”

16
Q

Ratio

A

A ratio scale is a way of measuring things where:
1️⃣ The difference between numbers is always the same (equal steps).
2️⃣ There is a true zero (zero means “nothing” or “none of the thing”).
3️⃣ You can add, subtract, multiply, and divide (all math works!).

Think of a Measuring Tape! 📏
Imagine measuring height in centimeters:

The difference between 150 cm and 160 cm is the same as between 160 cm and 170 cm (equal steps ✅).
0 cm means “no height at all” (true zero ✅).
Someone who is 180 cm is literally twice as tall as someone who is 90 cm (multiplication makes sense ✅).
🔹 That’s why height is a ratio scale!

17
Q

Operational Definition

A

Way to describe and define your variable in a clear and measurable manner

Example: Instead of defining “intelligence” abstractly, a researcher might operationally define it as a score on an IQ test.

18
Q

Discrete Variables

A

1️⃣ Discrete Variables
🔹 Definition: Whole numbers; can be counted, but not divided into smaller meaningful parts.
🔹 Examples:

Number of students in a class (You can’t have 2.5 students!)
Number of pets (You can’t have 3.7 dogs.)
Number of cars in a parking lot

19
Q

Continuous Variables

A

2️⃣ Continuous Variables
🔹 Definition: Can take any value within a range, including decimals and fractions.
🔹 Examples:

Height (Someone can be 170.5 cm tall.)
Weight (You can weigh 65.7 kg.)
Temperature (It can be 98.6°F.)

20
Q

Discrete vs. Continuous variables

A

Discrete Whole numbers, counted People, pets, cars

Continuous Any value, measured Height, weight, temperature

21
Q

Statistical Notation

A

X, Y = variables (each one is a set of scores)
N = number of scores in a population
n = number of scores in a sample
Σ (sigma) = Summation
μ (mu) = Population mean
M (or X̄) = Sample mean
σ (sigma) = Population standard deviation
s = Sample standard deviation

22
Q

Summation

A

where X = [2, 5, 3, 6]:

Σ3X = (3×2) + (3×5) + (3×3) + (3×6) = 6 + 15 + 9 + 18 = 48

ΣX = 2 + 5 + 3 + 6 = 16

Σ(X − 2) = (2 − 2) + (5 − 2) + (3 − 2) + (6 − 2) = 0 + 3 + 1 + 4 = 8

ΣX² = 2² + 5² + 3² + 6² = 4 + 25 + 9 + 36 = 74

where X = [2, 5, 3, 6] and Y = [1, 4, 2, 3]:

ΣXY = (2×1) + (5×4) + (3×2) + (6×3) = 2 + 20 + 6 + 18 = 46

Σ(1/X) = 1/2 + 1/5 + 1/3 + 1/6 ≈ 0.500 + 0.200 + 0.333 + 0.167 = 1.2

where A = [ 1 2 3 ]
          [ 4 5 6 ]

ΣΣa = 1 + 2 + 3 + 4 + 5 + 6 = 21
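
The sums on this card can be checked with a few lines of Python. This is only a sketch for self-checking; the variable names are made up for illustration and are not part of the course notation.

```python
# Scores from the card above
x = [2, 5, 3, 6]
y = [1, 4, 2, 3]

sum_x = sum(x)                                   # sum of X
sum_3x = sum(3 * xi for xi in x)                 # sum of 3X
sum_x_minus_2 = sum(xi - 2 for xi in x)          # sum of (X - 2)
sum_x_sq = sum(xi ** 2 for xi in x)              # sum of X squared
sum_xy = sum(xi * yi for xi, yi in zip(x, y))    # sum of XY
sum_recip = sum(1 / xi for xi in x)              # sum of 1/X

# Double sum: add every entry of a table with 2 rows and 3 columns
A = [[1, 2, 3], [4, 5, 6]]
sum_a = sum(sum(row) for row in A)

print(sum_x, sum_3x, sum_x_minus_2, sum_x_sq, sum_xy, sum_a)
print(round(sum_recip, 3))
```

Note the order of operations: ΣX² squares each score first and then adds (74), which is not the same as (ΣX)², where you add first and then square (16² = 256).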

23
Q

Charts & Graphs:

A

Simplify the organization and presentation of data

24
Q

Frequency Distribution Tables

A

A Frequency Distribution Table shows how often each value appears in a dataset.

🔹 The left column (X) lists the values (scores).

🔹 The right column (f) shows how many times each value appears.

✅ Used to: Organize data before making a graph.
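
As a rough illustration, a frequency table can be built by counting how often each score appears. The scores below are made up for the example, and `Counter` is just one convenient way to do the counting.

```python
from collections import Counter

# Made-up quiz scores (not from the course)
scores = [8, 9, 8, 7, 10, 9, 8, 6, 9, 8]

freq = Counter(scores)   # maps each value X to its frequency f

print("X   f")
for value in sorted(freq, reverse=True):   # list the highest score first
    print(f"{value:<3} {freq[value]}")

# The frequencies must add up to the number of scores
total_f = sum(freq.values())
print("sum of f =", total_f)
```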

25
Histogram
Key Features of a Histogram
✅ Bars Touch Each Other → Unlike a bar graph, the bars in a histogram touch because the data are continuous.
✅ Shows Frequency of Data in Ranges → Instead of showing individual categories, it groups numbers into intervals (bins).
✅ Best for Interval & Ratio Data → It works well for exam scores, weights, heights, temperatures, etc.
26
Bar Graph
Used with nominal or ordinal data
X-axis = category
Y-axis = frequency/count
27
Frequency Distribution Graphs
These graphs visually display a frequency table.
Types of Frequency Graphs
📊 Histogram (For Continuous Data)
Bars touch each other because the data are continuous.
Used for age, weight, temperature, etc.
📈 Polygon (Line Graph)
A dot is placed above each score at the height of its frequency, and the dots are connected with a line.
Good for showing the overall shape of the distribution.
Ratio Data (✔️ Best), Interval Data (✔️ Good)
📉 Bar Graph (For Categorical Data)
Bars do not touch each other because the categories are separate.
Used for colors, brands, favorite foods, etc.
📄 Pie Chart (Less Common in Statistics)
A circle divided into sections.
Used for percentages or proportions.
Best with nominal data.
28
Group Frequency Table & Histogram Rules
1. Between 6 and 12 intervals
2. Simple interval width
3. All intervals the same width
4. No gaps or overlaps
29
Symmetrical
A normal distribution has perfect symmetry.
Bell-shaped curve that looks the same on both sides of the center.
Example: Heights of people in a group.
Mean = Median = Mode (all three are the same).
30
Unimodal
Has one peak (one mode).
Unlike a symmetrical distribution, the two sides do not have to look the same: a unimodal distribution can be symmetrical or skewed.
Example: Most students score around the same on a test.
31
Bimodal
Two Peaks (Modes): The graph has two high points where the frequency of values is highest.
Data Grouped into Two Distinct Areas: Each peak represents a group of data that is more frequent than other values.
Possible Causes: Bimodal distributions can occur when there are two different groups or populations within the dataset.
32
Positive Skewness
📌 Tail on the right (more low values, few high values).
✅ Example: Income (most people earn low amounts, few earn high amounts).
The high point is on the left.
The mode is at the high peak, the mean is pulled toward the tail on the right, and the median falls between the mode and the mean.
33
Negative Skewness
📌 Tail on the left (more high values, few low values).
✅ Example: Test scores where most students did well.
The high point is on the right.
The mode is at the high peak, the mean is pulled toward the tail on the left, and the median falls between the mode and the mean.
34
How to deal with skewness
Log transformation: a non-linear transformation that changes the shape of the distribution by replacing each score X with its logarithm, e.g., log10(X).
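
A small sketch of what a log transformation does to a positively skewed set of scores. The income values are invented for the example, and base 10 is assumed here because the card mentions "Log 10".

```python
import math
import statistics

# Made-up, positively skewed "incomes": one very large value in the right tail
incomes = [10, 100, 100, 1000, 100000]

# Replace each score with its base-10 logarithm
logged = [round(math.log10(v), 6) for v in incomes]

print("raw:   mean =", statistics.mean(incomes), "median =", statistics.median(incomes))
print("log10: mean =", statistics.mean(logged), "median =", statistics.median(logged))
```

On the raw scale the mean is pulled far above the median by the one extreme value; after the transformation the mean and median are much closer together, which is the sense in which the shape of the distribution changes.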
35
Steps for Summing of the Deviations about the Mean
1. Calculate the mean (M)
2. Subtract the mean from each score (X − M)
3. Add all the deviations
4. The sum must equal zero!
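
The four steps can be sketched in Python as a quick self-check. The scores are just example values, not ones given on this card.

```python
scores = [2, 5, 3, 6]                      # example scores

M = sum(scores) / len(scores)              # step 1: calculate the mean
deviations = [x - M for x in scores]       # step 2: X - M for every score
total = sum(deviations)                    # step 3: add all the deviations

print("M =", M)
print("deviations =", deviations)
print("sum of deviations =", total)        # step 4: should be zero
```

With scores that do not divide evenly, the sum may print as a tiny number very close to zero because of floating-point rounding; that still counts as zero.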
36
Mean and its properties
Sum of all the scores (ΣX) divided by the number of scores (n): M = ΣX/n
Properties of the mean
-Most reliable and most used measure of central tendency
-The mean does not need to be a value in the distribution
-Strongly influenced by outliers
-Sum of the deviations about the mean equals zero
37
Median and its properties
Middle value when the numbers are in order
-Not affected by outliers
-Splits the data into two equal halves
-With an even number of scores there are two numbers in the middle: add both numbers and divide the total by 2, and that is the median
38
Mode and its properties
Most common number in a dataset
-Can be used with categorical data
-There can be multiple modes (bimodal/multimodal)
-Not affected by outliers
39
Outlier
An extreme score.
Several outliers can skew the distribution.
40
Selecting a measure of central tendency: which measure is best when you have categories? When there are outliers?
Nominal (categories: gender, colors, brands): best is the Mode
Ordinal (ranked: race positions, satisfaction levels): best is the Median
Interval (equal spacing, but no true zero: IQ, temperature in °C/°F): best is the Mean
Ratio (has a true zero: height, weight, age, income): best is the Mean
When there are outliers in your data, it's generally better to use the median as the measure of central tendency.
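
A short sketch of why the choice matters, using Python's built-in `statistics` module. The colors and scores are invented for the example.

```python
import statistics

# Nominal data: only the mode makes sense
colors = ["red", "blue", "blue", "green"]
print("most common color:", statistics.mode(colors))

# Interval/ratio data, without and with one extreme score
scores = [70, 75, 80, 80, 85]
with_outlier = scores + [200]

for label, data in [("no outlier", scores), ("with outlier", with_outlier)]:
    print(label,
          "| mean =", round(statistics.mean(data), 2),
          "| median =", statistics.median(data),
          "| mode =", statistics.mode(data))
```

Adding the single outlier moves the mean a long way, while the median and mode stay at 80, which is why the median is preferred when outliers are present.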
41
Graphs appropriate for each measure of central tendency
1️⃣ Mode → Best for Nominal Data
Best Graphs:
Bar Graph (compares categories, highlights the most frequent one)
Pie Chart (shows the most common category as the largest section)
Shapes of Distributions: bimodal
2️⃣ Median → Best for Ordinal Data or Skewed Distributions
Best Graphs:
Boxplot (Box-and-Whisker Plot) (shows the median as a middle line)
Histogram (useful for skewed data, where the median is a better center)
Shapes of Distributions: unimodal with negative or positive skewness
Reasons why: The median is less affected by outliers than the mean. The median is simply the middle value of the ordered dataset, and outliers (extremely high or low values) do not significantly change its position.
3️⃣ Mean → Best for Interval & Ratio Data (Symmetrical Distributions)
Best Graphs:
Histogram (shows the distribution and where the mean falls)
Line Graph (tracks mean changes over time)
Boxplot (shows the mean as a marker, often with the median)
Shapes of Distributions: symmetrical and unimodal
42
Transformations of scales
Adding/Subtracting a constant
The mean and median shift by the same constant. The mode also shifts.
Example: If you add 5 to every test score, the mean increases by 5.
Multiplying/Dividing by a constant
The mean, median, and mode are multiplied/divided by the same constant.
Example: If all salaries are doubled, the mean, median, and mode also double.
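
The two rules can be checked on a small made-up set of scores. This is only an illustration with hypothetical numbers.

```python
import statistics

scores = [70, 80, 80, 90]                  # made-up test scores
plus5 = [x + 5 for x in scores]            # add a constant to every score
doubled = [x * 2 for x in scores]          # multiply every score by a constant

for label, data in [("original", scores), ("+5", plus5), ("x2", doubled)]:
    print(label,
          "| mean =", statistics.mean(data),
          "| median =", statistics.median(data),
          "| mode =", statistics.mode(data))
```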
43
Range
The difference between the highest value and the lowest value.
The simplest measure of variability.
Formula: Range = Highest Value − Lowest Value
Example: If the scores are 5, 8, 10, and 15, then Range = 15 − 5 = 10.
Limitations: The range only considers the highest and lowest values, ignoring the rest of the dataset.
44
Variance
Indicates how much the values are spread out or clustered around the mean.
The mean is the reference point.
σ² = variance
45
Standard deviation
“Average deviation” from the mean.
The typical distance between each score and the mean.
σ = standard deviation
46
Population vs. sample formulas
Population
Mean: μ = ΣX/N
Variance: σ² = Σ(X − μ)²/N
Standard deviation: σ = √σ²
Sample
Mean: M = ΣX/n
Variance: s² = Σ(X − M)²/(n − 1)
Standard deviation: s = √s²
-Use population formulas when you have data for the entire population (e.g., all students in a school).
-Use sample formulas when you have a subset of the population (e.g., 100 students from the school).
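
A minimal sketch of both sets of formulas, written out by hand so the only difference (dividing by N versus n − 1) is visible. The function names and the scores are made up for the example.

```python
import math

def population_variance(scores):
    """Sum of squared deviations from mu, divided by N."""
    N = len(scores)
    mu = sum(scores) / N
    return sum((x - mu) ** 2 for x in scores) / N

def sample_variance(scores):
    """Sum of squared deviations from M, divided by n - 1."""
    n = len(scores)
    M = sum(scores) / n
    return sum((x - M) ** 2 for x in scores) / (n - 1)

scores = [2, 5, 3, 6]   # example scores

pop_var = population_variance(scores)
samp_var = sample_variance(scores)

print("population: variance =", pop_var, "SD =", round(math.sqrt(pop_var), 3))
print("sample:     variance =", round(samp_var, 3), "SD =", round(math.sqrt(samp_var), 3))
```

For the same scores the sample variance comes out larger than the population variance, because the same sum of squares is divided by a smaller number.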
47
When to use the n − 1 correction, and why we use it
The n − 1 correction is used in the sample variance and sample standard deviation to correct for bias when estimating the population parameter.
Why? A sample typically underestimates the variability in the population.
Effect: Dividing by n − 1 instead of n makes the variance slightly larger, which compensates for this underestimation and gives a better estimate of the true population variance.
48
Degrees of freedom
df = n − 1
The number of scores in a sample that are independent and free to vary.
Example: If you have 5 numbers with a known mean, 4 of them can vary freely, but the 5th number is determined by the mean.
Example to Understand Degrees of Freedom (5 numbers whose mean must be 10)
✅ The first four numbers can be anything (they are free to vary).
✅ But the 5th number is forced, to make sure the mean stays at 10.
If we choose the four numbers 8, 9, 12, and 11, the fifth number must be:
X5 = (5 × 10) − (8 + 9 + 12 + 11) = 50 − 40 = 10
So, even though we started with 5 values, only 4 numbers were free to vary. Thus, the degrees of freedom (df) = n − 1 = 5 − 1 = 4.
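
The same example as a short Python sketch; the variable names are only for illustration.

```python
n = 5
target_mean = 10
free_scores = [8, 9, 12, 11]     # these four can be anything

# The total must be n * mean, so the last score is forced
last_score = n * target_mean - sum(free_scores)
all_scores = free_scores + [last_score]
check_mean = sum(all_scores) / n

df = n - 1   # only n - 1 scores were free to vary

print("last score =", last_score)
print("mean =", check_mean)
print("df =", df)
```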
49
Sum of Deviations
The sum of each value’s difference from the mean.
Always equals 0 because positive and negative deviations cancel out.
Step 1: Deviations = X − M
Step 2: Sum the deviations: Σ(X − M) = 0
The same holds for a population: Σ(X − μ) = 0
50
Sum of Squares (SS) – you will need to know/recognize this formula
The sum of squared deviations from the mean.
Used in calculating variance and standard deviation.
Step 3a: Square the deviations: (X − M)²
Step 3b: SS (Sum of Squares): SS = Σ(X − M)²
How to Calculate Sum of Squares (SS)
1. Find the mean (M): add up all the numbers and divide by how many there are.
2. Subtract the mean from each value: this gives you how far each value is from the mean (called a deviation). Some deviations will be negative, some will be positive.
3. Square each deviation: this makes all values positive (so they don’t cancel out).
4. Add up all the squared deviations: this final sum is the Sum of Squares (SS).
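
The steps for SS can be sketched as follows, using example scores that are not from the card.

```python
scores = [2, 5, 3, 6]                          # example scores

M = sum(scores) / len(scores)                  # 1. find the mean
deviations = [x - M for x in scores]           # 2. subtract the mean from each value
squared = [d ** 2 for d in deviations]         # 3. square each deviation
SS = sum(squared)                              # 4. add up the squared deviations

print("deviations =", deviations)
print("squared deviations =", squared)
print("SS =", SS)
```

Dividing this SS by n − 1 gives the sample variance, and dividing it by N gives the population variance.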
51
Transformations of scales
How transformations affect variability:
Adding/Subtracting a Constant
Does not change the standard deviation or variance.
Example: If all numbers increase by 5, the spread remains the same.
Multiplying/Dividing by a Constant
The standard deviation changes by the same factor.
The variance changes by the square of the factor.
Example: If all numbers are doubled, the standard deviation doubles, but the variance quadruples.
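
A quick check of both rules on made-up scores. The population versions (`pstdev`, `pvariance`) from Python's `statistics` module are used here as an assumption; the same pattern holds for the sample versions.

```python
import statistics

scores = [2, 5, 3, 6]                      # example scores
plus5 = [x + 5 for x in scores]            # add 5 to every score
doubled = [x * 2 for x in scores]          # double every score

for label, data in [("original", scores), ("+5", plus5), ("x2", doubled)]:
    sd = statistics.pstdev(data)
    var = statistics.pvariance(data)
    print(label, "| SD =", round(sd, 3), "| variance =", round(var, 3))
```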
52
z scores: Definition
A z-score tells you how many standard deviations a value is above or below the mean.
z = (X − M)/s for a sample
z = (X − μ)/σ for a population
If you are given a whole set of X values instead of the standard deviation, find the standard deviation first: square each deviation (X − M), add up all the squared deviations to get SS, divide by n − 1 (or by N for a population), and take the square root.
53
Standard scores
Z-scores are standardized scores, meaning they allow us to compare scores from different distributions.
A z-score converts a raw score into a common scale with:
A mean of 0
A standard deviation of 1
Example Calculation
Let’s say:
X = 80 (your score)
μ = 70 (class average)
σ = 10 (standard deviation)
z = (80 − 70)/10 = 10/10 = 1
Interpretation: A z-score of +1.0 means your score is one standard deviation above the mean.
54
Uses of z
✅ Comparing Scores: Example: If you scored 85 on Test A (mean = 75, SD = 5) and 90 on Test B (mean = 85, SD = 10), z-scores can tell you which performance was better.
✅ Identifying Outliers: A z-score greater than +2 or less than −2 is often considered unusual.
✅ Converting to Percentiles: A z-score tells us where a value falls in a normal distribution (e.g., about 95% of the data fall between z = −1.96 and z = +1.96).
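
The Test A versus Test B comparison can be worked through in Python. The helper name `z_score` is made up for this sketch, and `NormalDist` (from Python's standard `statistics` module) stands in for the z-table.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """How many standard deviations x is above or below the mean."""
    return (x - mean) / sd

z_a = z_score(85, mean=75, sd=5)     # Test A
z_b = z_score(90, mean=85, sd=10)    # Test B
print("Test A z =", z_a, "| Test B z =", z_b)

# Proportion of a normal distribution between z = -1.96 and z = +1.96
standard_normal = NormalDist()       # mean 0, SD 1
central = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)
print("between -1.96 and +1.96:", round(central, 3))
```

Even though 90 is the higher raw score, the Test A score sits further above its own mean in standard deviation units, so it is the better performance relative to the class.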
55
Properties of standardized distribution
Mean = 0
Standard Deviation = 1
The shape of the distribution remains the same when converting to z-scores.
If the original distribution is normal, the standardized version will also be normal.
56
How to compute z scores
Example: Let’s say a test has:
Mean = 50
Standard Deviation = 10
Your score = 65
Step 1: Apply the formula
z = (X − μ)/σ = (65 − 50)/10 = 15/10 = 1.5
Step 2: Interpret the z-score
A z-score of 1.5 means your score is 1.5 standard deviations above the mean. This means you scored better than most students!
57
What are the requirements for Probability and Sampling?
Random Sampling – Every individual in the population has an equal chance of being selected.
Independent Events – The probability of one event does not affect the probability of another.
Probability Range – Probabilities are always between 0 and 1 (or 0% to 100%).
Sum of Probabilities – The total probability of all possible outcomes must equal 1 (or 100%).
Independent random sampling – The probability of being selected stays constant from one selection to the next.
58
Probability and the Normal Distribution - For scores greater than X, less than X, or between two values of X * With X being a score (e.g., 500 on the SAT)
Many real-world data sets follow a normal distribution, which is a bell-shaped curve. In a normal distribution:
The mean (μ) is at the center.
Probabilities (areas under the curve) can be found using z-scores.
The total area under the curve = 1 (100%).
Example 1: Probability of a Score Greater than X
Problem: Mean SAT score = 500, standard deviation = 100. What is the probability of scoring above 600?
Step 1: Convert to a z-score: z = (600 − 500)/100 = 100/100 = 1.0
Step 2: Find the probability from the z-table: a z-score of 1.0 corresponds to 0.8413 (area to the left). Since we need scores greater than 600, we subtract from 1: 1 − 0.8413 = 0.1587
Final Answer: 15.87% of people score above 600.
Example 2: Probability of a Score Less than X
Find the probability of scoring below 400:
Step 1: Convert to a z-score: z = (400 − 500)/100 = −100/100 = −1.0
Step 2: Find the probability from the z-table: a z-score of −1.0 corresponds to 0.1587 (area to the left).
Final Answer: 15.87% of people score below 400.
Example 3: Probability Between Two Scores (450 and 550)
Step 1: Convert to z-scores: z450 = (450 − 500)/100 = −0.5 and z550 = (550 − 500)/100 = 0.5
Step 2: Find the probabilities from the z-table: z = −0.5 → 0.3085 (area to the left); z = 0.5 → 0.6915 (area to the left).
Step 3: Find the area between them: 0.6915 − 0.3085 = 0.3830
Final Answer: 38.30% of people score between 450 and 550.
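
The three SAT examples can be double-checked without a printed z-table. This sketch assumes Python 3.8 or later, where the standard `statistics` module provides `NormalDist`; its `cdf(x)` method returns the area to the left of x.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)           # SAT scores from the examples above

p_above_600 = 1 - sat.cdf(600)                # area to the right of 600
p_below_400 = sat.cdf(400)                    # area to the left of 400
p_between = sat.cdf(550) - sat.cdf(450)       # area between 450 and 550

print("P(X > 600)       =", round(p_above_600, 4))
print("P(X < 400)       =", round(p_below_400, 4))
print("P(450 < X < 550) =", round(p_between, 4))
```

The last value can differ from the table answer in the fourth decimal place, because the z-table entries are already rounded before they are subtracted.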
59
how to use the z score table - Tail, body, and mean to z
Mean (μ) to Z: The area between the mean and a z-score represents the probability from the center of the distribution to that score.
Body: The larger portion of the normal curve (closer to the mean).
Tail: The smaller portion of the normal curve (further from the mean).
Example: If a score falls at z = +1.5, the body (left side) is 0.9332 (93.32%), and the tail (right side) is 0.0668 (6.68%).
When drawing the graph:
-For "greater than", shade the tail on the right.
-For "less than", shade the tail on the left.
-For "between", shade the area between the two numbers given.
60
Formulas for Midterm #1 PSYC 3400: What are these formulas for? M = ΣX/n; s² = Σ(X − M)²/(n − 1); μ = ΣX/N; σ² = Σ(X − μ)²/N; z = (X − μ)/σ; σ = √σ²; s = √s²; SS = Σ(X − M)²
-M = ΣX/n: Sample Mean
-s² = Σ(X − M)²/(n − 1): Sample Variance
-μ = ΣX/N: Population Mean
-σ² = Σ(X − μ)²/N: Population Variance
-z = (X − μ)/σ: Z-Score
-σ = √σ²: Population Standard Deviation
-s = √s²: Sample Standard Deviation
-SS = Σ(X − M)²: Sum of Squares (SS)
61
How to calculate the z score if given probability of a tail
Example 1: Given a Tail Probability
Problem: The right-tail probability is 0.10 (10%). What is the z-score?
✅ Step 1: Convert the tail probability to a body probability: Body Probability = 1 − 0.10 = 0.90
✅ Step 2: Look up 0.9000 in the Z-table → z = 1.28
✅ Answer: The z-score is 1.28
Example 3: Given a Left-Tail Probability
Problem: The left-tail probability is 0.025 (2.5%). What is the z-score?
✅ Step 1: Since it's a left tail, look up 0.0250 directly in the Z-table.
✅ Step 2: The Z-table gives z = −1.96
✅ Answer: The z-score is −1.96
62
How to calculate the z score if given probability of a body
Example 2: Given a Body Probability
Problem: The body probability is 0.70 (70%). What is the z-score?
✅ Step 1: Find 0.7000 inside the Z-table → z = 0.52
✅ Answer: The z-score is 0.52
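
Going from a probability back to a z-score (this card and the tail-probability card before it) can be checked in Python. This sketch assumes `statistics.NormalDist` is available (Python 3.8 or later); its `inv_cdf(p)` method returns the z-score that has area p to its left, so each problem is first restated as an area to the left.

```python
from statistics import NormalDist

standard_normal = NormalDist()    # mean 0, SD 1

# Right tail of 0.10 means the area to the left (the body) is 0.90
z_right_tail = standard_normal.inv_cdf(1 - 0.10)

# Left tail of 0.025 is already an area to the left
z_left_tail = standard_normal.inv_cdf(0.025)

# Body of 0.70, with the body taken to be on the left
z_body = standard_normal.inv_cdf(0.70)

print(round(z_right_tail, 2), round(z_left_tail, 2), round(z_body, 2))
```

If the body were on the right side instead, the z-score would have the same size but a negative sign.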