Chapter 2 - Summarizing And Comparing Distributions (Lecture Slides) Flashcards

1
Q

What are descriptive statistics?

A

Statistics that quantitatively describe or summarize features of data in a compact, easily understood fashion.

2
Q

What is a distribution?

A

Describes the concentration of different score values on the same variable.

We can use graphs, tables, or mathematical functions to describe how likely each score is to be observed.

A graph, table, or mathematical function describing the frequency of each score.

Variables can be discrete or continuous.

3
Q

What do discrete variables measure?

A

Frequency = count

4
Q

What do continuous variables measure?

A

Density

5
Q

What is a normal distribution?

A

A theoretical function that describes many physical, physiological, and psychological traits.

6
Q

What is the anatomy of a normal distribution? What does it look like?

A

This is a bell shape

Whatever value is at the very peak of the bell shape is called the “center.”

The far left side is considered “low” / “lower tail.”

The far right side is considered “high” / “upper tail.”

The higher the curve sits on the Y-axis at a given score, the more often/likely that score is to occur.

The lower the curve sits on the Y-axis at a given score, the less often/likely that score is to occur.
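
As a rough illustration of this anatomy, the sketch below evaluates a normal curve with Python's standard-library `statistics.NormalDist`. The mean of 100 and standard deviation of 15 are assumed values chosen only for the example; they do not come from the lecture.

```python
from statistics import NormalDist

# Hypothetical normal distribution: center (mean) 100, spread (SD) 15.
curve = NormalDist(mu=100, sigma=15)

center_height = curve.pdf(100)      # height of the curve at the peak ("center")
lower_tail_height = curve.pdf(55)   # far left side ("lower tail")
upper_tail_height = curve.pdf(145)  # far right side ("upper tail")

print(f"center:     {center_height:.5f}")
print(f"lower tail: {lower_tail_height:.5f}")
print(f"upper tail: {upper_tail_height:.5f}")
```

The curve is highest at the center and equally low at the same distance into either tail, which is the bell shape described above.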

7
Q

What does the center of a distribution represent?

A

A common or typical score value.

A distribution’s center is often where scores are highly concentrated (e.g., at the typical value).

The center of the distribution is often used as the basis for comparison because it represents a “typical” individual in each group.

8
Q

What does the distribution’s spread capture?

A

The degree to which scores are similar or different.

The variability or dissimilarity of scores.

High spread (variability) indicates greater score differences among individuals and will look like a wider, flatter bell shape.

Low spread indicates similarity among the scores and will look like a narrower, taller bell shape.
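
A minimal sketch of the idea, using two made-up groups of scores that share the same center but differ in spread. The standard deviation is used here as one common measure of spread; that choice is an assumption, since this card does not name a specific statistic.

```python
from statistics import mean, stdev

# Hypothetical scores: both groups are centered at 50.
low_spread = [48, 49, 50, 51, 52]    # scores are similar to one another
high_spread = [30, 40, 50, 60, 70]   # scores differ a lot from one another

for name, scores in [("low spread", low_spread), ("high spread", high_spread)]:
    print(f"{name}: mean = {mean(scores)}, SD = {stdev(scores):.2f}")
```

Both groups have the same typical score, so only the spread distinguishes them.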

9
Q

What does the distribution shape describe?

A

The symmetry or asymmetry of scores.

Distribution shape characterizes the concentration of scores at different values.

Distributions can be symmetric or asymmetric (skewed).

Shape is a useful descriptive tool that conveys a picture of the data.

10
Q

Describe a symmetric distribution

A

The left side of the distribution mirrors the right side (e.g., normal distribution).

Everything is the same on both sides. You can think of a perfect bell curve for example.

11
Q

Describe a uniform distribution

A

This is also considered a symmetric distribution and looks like a flat line all the way across.

All scores are equally likely; no single score occurs more frequently than others. The frequency is the same at 1 as it is at 2, 3, 4, 5, and so on.

12
Q

Describe a positively skewed distribution (AKA right-skewed distribution)

A

Most scores are in the low range (the left side, where you’ll find the peak of the curve), and higher scores in the upper tail (far right side) are less frequent.

It is also called a right-skewed distribution: the long tail is on the right side.

13
Q

Describe a negatively skewed distribution (AKA left-skewed distribution)

A

Most scores are in the high range (the right side, where you’ll find the peak of the curve), and lower scores in the lower tail (far left side) are less frequent.

It is also called a left-skewed distribution: the long tail is on the left side.

14
Q

What is a bar plot (bar chart)?

A

A bar plot (bar chart) uses bar height to represent the frequency or count of responses in each bin (category).

You’ve seen this many times; it’s just a graph with bars separated by gaps along the horizontal axis.

It shows the relationship between a numeric and a categorical variable. Each category of the categorical variable is represented as a bar. The size of the bar represents its numeric value.

It presents categorical data with rectangular bars with heights or lengths proportional to the values that they represent. The bars can be plotted vertically or horizontally.

The ordering of categories in nominal variables (e.g., marital status) is arbitrary in the bar chart; you can put them in any order.

The ordering of categories in ordinal variables (home value) is NOT arbitrary! They must go in order.

Bar plots are used for categorical (nominal and ordinal) variables.
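
The ordering rule can be sketched with a text-only bar chart. The responses and the three category labels below are hypothetical; the point is only that an ordinal variable is plotted in its natural order rather than alphabetically or by count.

```python
from collections import Counter

# Hypothetical ordinal responses (home value categories).
responses = ["medium", "low", "high", "medium", "medium", "low", "high", "medium"]

# Ordinal categories are listed in their natural order, not alphabetically.
category_order = ["low", "medium", "high"]

counts = Counter(responses)

# Bar length represents the count of responses in each category.
bars = [
    f"{category:<7}{'#' * counts[category]} ({counts[category]})"
    for category in category_order
]
print("\n".join(bars))
```

For a nominal variable such as marital status, `category_order` could be any permutation of the categories without changing what the chart communicates.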

15
Q

What is a frequency distribution?

A

A frequency distribution is a tabular display of the information from the bar graph.

The table includes a column of scores and their corresponding counts and percentages.

It’s a visual display that organizes and presents frequency counts so that the information can be interpreted more easily.

It looks like a table.

Frequency tables are most often used for nominal and ordinal variables, although numeric scores can be tabulated the same way.
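
A small sketch of such a table, built with the standard-library `collections.Counter`. The marital-status responses are invented for illustration.

```python
from collections import Counter

# Hypothetical nominal responses.
responses = ["married", "single", "married", "divorced",
             "single", "married", "widowed", "married"]

counts = Counter(responses)
total = len(responses)

# Each row holds a score, its count, and its percentage of all responses.
frequency_table = [
    (category, count, 100 * count / total)
    for category, count in counts.most_common()
]

print(f"{'Status':<10}{'Count':>6}{'Percent':>9}")
for category, count, percent in frequency_table:
    print(f"{category:<10}{count:>6}{percent:>8.1f}%")
```

The counts column always sums to the number of responses and the percentages sum to 100, which is a quick check that no response was dropped.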

16
Q

What is a histogram?

A

A histogram is a bar plot/chart for numeric variables (no breaks in the horizontal axis - they all touch one another).

A histogram is the most commonly used graph to show frequency distributions.

Helps you to more clearly see the shape of the data’s distribution.

They make the most sense when you have an interval or ratio scale variable and what you want to do is get an overall impression of the variable.

If your data are nominal scale then histograms are useless.

All you do is divide up the possible values into bins and then count the number of observations that fall within each bin.

This count is referred to as the frequency of the bin and is displayed as a vertical bar. When bar heights are rescaled so that the bar areas sum to 1, the height is called the density.
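
The binning step can be sketched in a few lines. The function name `histogram_counts`, the scores, and the bin settings are all hypothetical choices for this example; plotting libraries pick bins with their own rules.

```python
def histogram_counts(values, start, bin_width, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    counts = [0] * n_bins
    upper_edge = start + n_bins * bin_width
    for value in values:
        if value == upper_edge:
            index = n_bins - 1  # put the top edge in the last bin
        else:
            index = int((value - start) // bin_width)
        if 0 <= index < n_bins:  # values outside the range are ignored
            counts[index] += 1
    return counts


# Hypothetical exam scores on an interval scale.
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 82, 85, 91]
bin_width = 10

frequencies = histogram_counts(scores, start=50, bin_width=bin_width, n_bins=5)

# Density rescales each count so that the bar areas add up to 1.
densities = [count / (len(scores) * bin_width) for count in frequencies]

for i, count in enumerate(frequencies):
    low = 50 + i * bin_width
    print(f"{low}-{low + bin_width}: {'#' * count} ({count})")
```

Changing `bin_width` or `start` can change the apparent shape, which is why the number of bins matters when reading a histogram.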

17
Q

What is a density plot/graph?

A

A density plot/graph applies a smoothing algorithm (kernel smoothing) that connects the histogram bars with a continuous function or curve for continuous variables.

A plot/graph displaying a smooth function connecting the tops of bars in a graph.

It’s a smooth curve that shows the distribution of the data. The curve represents the proportion of the data in each range, rather than the frequency.

The peaks of a density plot help display where values are concentrated over the interval.

An advantage density plots have over histograms is that they are better at showing the distribution shape because they’re not affected by the number of bins used (the bars in a typical histogram). They do, however, depend on how much smoothing is applied.

Density ≈ Bar height (similar to frequency).

Density plots are used for numeric (continuous) variables.
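
Kernel smoothing can be sketched without any plotting library. The version below assumes a Gaussian kernel and a hand-picked bandwidth of 5; both are illustrative assumptions, and real software typically chooses the bandwidth automatically.

```python
import math


def gaussian_kde(data, bandwidth):
    """Return a function giving the smoothed density estimate at any x."""
    norm = len(data) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        # Each observation contributes a small bell-shaped bump centered on it.
        bumps = (math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)
        return sum(bumps) / norm

    return density


# Hypothetical exam scores.
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 82, 85, 91]
density = gaussian_kde(scores, bandwidth=5.0)

# Approximate the area under the curve on a grid from 0 to 150.
step = 0.5
grid = [i * step for i in range(301)]
area = sum(density(x) * step for x in grid)

print(f"density near the middle (70): {density(70):.4f}")
print(f"density in the upper tail (95): {density(95):.4f}")
print(f"approximate area under the curve: {area:.3f}")
```

The curve is a proportion-based summary: its total area is about 1, and it is highest where scores are concentrated. A larger bandwidth gives a smoother curve, and a smaller one gives a bumpier curve.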

18
Q

What is skewness?

A

Skewness is asymmetry (positive or negative). Asymmetry in the distribution, positive (long tail pointing positive) or negative (long tail pointing negative).

A symmetric distribution (e.g., normal distribution, uniform distribution) has a skewness of 0 because the two sides are mirror images, so deviations above and below the center cancel out.

Rules of thumb vary, but skewness and kurtosis values divided by their standard errors that exceed ± 2 are often defined as large (a normal curve has values of zero).

It’s a descriptive statistic

19
Q

Positive, negative, and symmetric distribution:

A
  • If there are relatively more values that are far greater than the mean (the tail is on the positive/right side of the central value/mean), the distribution is positively skewed or right skewed, with a tail stretching to the right. The skewness value for a positively skewed distribution is positive.
  • Negative or left skew is the opposite (the tail is on the negative/left side of the central value/mean). The skewness value for a negatively skewed distribution is negative.
  • A symmetric distribution has a skewness of 0 because the data are distributed equally on both sides of the peak or center (e.g., a normal distribution or bell curve).
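
The sign of the skewness value can be checked with a short sketch. It assumes the simple moment-based formula (third central moment divided by the 1.5 power of the second); statistical packages often apply a small-sample correction, so their values may differ slightly. The data sets are invented.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data sets.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10]      # one value far above the mean
left_skewed = [-v for v in right_skewed]      # mirror image: tail on the left
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right skewed", right_skewed),
                   ("left skewed", left_skewed),
                   ("symmetric", symmetric)]:
    print(f"{name}: skewness = {skewness(data):.3f}")
```

Cubing the deviations keeps their sign, so a few values far above the mean pull the statistic positive and a few far below pull it negative.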
20
Q

What is kurtosis?

A

Put simply, kurtosis is a measure of the “pointiness” of a data set.

Kurtosis is excessive peakedness (positive values) or flatness (negative values).

A normal distribution has an excess kurtosis of 0: its raw kurtosis of 3 is the baseline that is subtracted, so it is neither unusually peaked nor unusually flat.

By convention, we say that the “normal/bell curve” has zero kurtosis, so the pointiness of a data set is assessed relative to this curve.

Rules of thumb vary, but skewness and kurtosis values divided by their standard errors that exceed ± 2 are often defined as large (a normal curve has values of zero).

It’s a descriptive statistic

21
Q

Kurtosis: Platykurtic, Leptokurtic & Mesokurtic

A

Imagine a perfect bell curve…

  • If the data is not pointy enough (doesn’t reach the top of the bell curve, looks flat), the kurtosis is negative and we call the data Platykurtic.
  • If the data is too pointy (goes past the top of the bell curve), the kurtosis is positive and we say that the data is Leptokurtic.
  • If the data presents as a perfect bell curve in the middle (just pointy enough), we say it is Mesokurtic and has zero kurtosis.
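
The three labels can be illustrated with a sketch that computes excess kurtosis (raw kurtosis minus 3, so a normal curve scores 0). The moment-based formula, the `tolerance` cutoff for calling a value "close to zero", and the data are all assumptions made for this example.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2 ** 2 - 3 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


def classify(kurtosis_value, tolerance=0.5):
    """Label a kurtosis value; the tolerance is an arbitrary illustrative cutoff."""
    if kurtosis_value > tolerance:
        return "leptokurtic"
    if kurtosis_value < -tolerance:
        return "platykurtic"
    return "mesokurtic"


# Hypothetical data sets.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]         # evenly spread, no peak
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]     # sharp peak with two extreme values

for name, data in [("flat", flat), ("peaked", peaked)]:
    value = excess_kurtosis(data)
    print(f"{name}: excess kurtosis = {value:.3f} ({classify(value)})")
```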
22
Q

What is a between-group design?

A

Participants are randomly assigned to one of two experimental conditions.

23
Q

What are scale scores?

A

A scale score is a composite variable computed by summing or averaging the responses to questionnaire items measuring the same construct.

A score computed by summing or averaging questionnaire items that measure the same attribute.

Scale scores do a better job of capturing psychological characteristics than do single items.

A scale score variable is usually treated as a numeric variable.
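
A minimal sketch of both ways to compute a scale score. The four item responses are hypothetical ratings from one participant on items assumed to measure the same construct.

```python
from statistics import mean

# Hypothetical responses from one participant to four items
# measuring the same construct (e.g., rated 1 to 5).
item_responses = [4, 5, 3, 4]

sum_score = sum(item_responses)    # scale score by summing
mean_score = mean(item_responses)  # scale score by averaging

print(f"sum score:  {sum_score}")
print(f"mean score: {mean_score}")
```

Averaging keeps the scale score on the same metric as the individual items, while summing makes it depend on the number of items; either way the result is treated as a numeric variable.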