Unit 1 Flashcards

1
Q

Categorical data

A

Data that represents groups or labels (also called qualitative).

2
Q

Quantitative data

A

Data that takes on numerical values or amounts that are measured.

3
Q

Discrete vs. continuous variable

A

Discrete variables take on a countable number of values with gaps between them, while continuous variables can take on any value in an interval (no gaps).

4
Q

Variable

A

A characteristic that changes from one individual to another

5
Q

Frequency vs. Relative Frequency

A

Frequency represents the number of individuals in categories, while relative frequency represents the proportion or percent of individuals in categories.
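
A minimal sketch of the difference, using a made-up list of pet preferences (the data and variable names are hypothetical, not from the deck):

```python
from collections import Counter

# Hypothetical sample: favorite pet reported by 8 students
pets = ["dog", "cat", "dog", "fish", "dog", "cat", "dog", "cat"]

# Frequency: how many individuals fall in each category
frequency = Counter(pets)

# Relative frequency: each count divided by the total number of individuals
total = len(pets)
relative_frequency = {category: count / total for category, count in frequency.items()}

print(frequency)           # counts per category
print(relative_frequency)  # proportions per category
```

The relative frequencies always add up to 1 (or 100%), which makes groups of different sizes comparable.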

6
Q

Bar graph vs. Histogram

A

Bar graphs present categorical data (categories on bottom), while histograms represent numerical data (numbers on bottom)

7
Q

Which graphs most often display categorical data?

A

Bar graphs & pie charts

8
Q

Dot plot vs. Stem & leaf plot

A

Dot plots use one dot per data value above a number line, while stem and leaf plots split each value into a stem (the leading digits) and a leaf (the final digit).

9
Q

Marginal distribution

A

The percent or proportion of individuals that have a specific value for one categorical variable, ignoring the other variable (read from the row or column totals of a two-way table).

10
Q

Conditional distribution

A

The percent or proportion of individuals that have a specific value for one categorical variable among individuals who share the same value for another categorical variable (dependent on the other variable; read within a single row or column of a two-way table).
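
To make the contrast between marginal and conditional distributions concrete, here is a small sketch on an invented two-way table (the grades, snacks, and counts are all hypothetical):

```python
# Hypothetical two-way table of counts: rows = grade, columns = preferred snack
table = {
    "junior": {"chips": 30, "fruit": 20},
    "senior": {"chips": 10, "fruit": 40},
}
snacks = ["chips", "fruit"]
grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of snack: column totals divided by the grand total
marginal_snack = {
    s: sum(table[grade][s] for grade in table) / grand_total for s in snacks
}

# Conditional distribution of snack among seniors:
# the senior row divided by the senior row total
senior_total = sum(table["senior"].values())
conditional_snack_given_senior = {
    s: table["senior"][s] / senior_total for s in snacks
}

print(marginal_snack)
print(conditional_snack_given_senior)
```

In this made-up table, 40% of all students prefer chips (marginal), but only 20% of seniors do (conditional).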

11
Q

Describing the distribution should include:

A

Shape, center, variability (spread), and unusual features.

12
Q

Shapes:

A

Symmetric, skewed left (tail to the left, high point to the right), skewed right (tail to the right, high point to the left).

Unimodal (single peak), bimodal (two peaks), uniform (no peaks).

13
Q

Center

A

A typical or middle value of the data, usually measured by the mean or median.

14
Q

Variability

A

How spread out the data are, measured by the range, IQR, or standard deviation.

15
Q

Unusual Features

A

Outliers (individual values far from the rest of the data), gaps, clusters.

16
Q

Mean

A

Average of all data values (the sum of the values divided by the number of values).

17
Q

Median

A

Middle value of the ordered data.
Middle value of the set if the count is odd, average of the two middle values if even.

18
Q

Q1

A

Median of the lower half of the ordered data (don't include the overall median).

19
Q

Q3

A

Median of the upper half of the ordered data (don't include the overall median).

20
Q

Range

A

Maximum value minus minimum value in the dataset.

21
Q

IQR

A

Q3 - Q1
(median of second half - median of first half).
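
The median, Q1, Q3, and IQR cards can be sketched together. This follows the convention on the cards (leave the overall median out of both halves when the count is odd); other textbooks and software use slightly different quartile rules. The helper name `quartiles` and the data are hypothetical:

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3), excluding the median from each half."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd count, skip the middle value; for an even count, split evenly
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)

data = [2, 4, 4, 5, 7, 8, 9, 10, 12]  # hypothetical dataset, n = 9
q1, med, q3 = quartiles(data)
iqr = q3 - q1        # Q3 minus Q1
data_range = max(data) - min(data)

print(q1, med, q3, iqr, data_range)
```

Here the lower half is 2, 4, 4, 5 and the upper half is 8, 9, 10, 12, with the median 7 left out of both.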

22
Q

Standard deviation

A

The typical distance of data values from the mean.
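
As a rough illustration, the sketch below computes the standard deviation on a made-up dataset and compares it with the literal average distance from the mean. The two are close but not identical, because the standard deviation squares the distances before averaging and then takes a square root:

```python
from statistics import mean, pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical dataset

mu = mean(data)

# Population SD: square root of the average squared distance from the mean
population_sd = pstdev(data)

# Sample SD: same idea, but divides by n - 1 instead of n
sample_sd = stdev(data)

# Literal "average distance from the mean" (mean absolute deviation)
mean_abs_dev = sum(abs(x - mu) for x in data) / len(data)

print(mu, population_sd, sample_sd, mean_abs_dev)
```

Which divisor a course expects (n or n - 1) depends on whether the data are treated as a whole population or a sample.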

23
Q

Five-number summary

A

min, Q1, median, Q3, max.

24
Q

Percentile

A

The percent of data values that are less than or equal to a given value
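
A minimal sketch of that definition, using a hypothetical helper `percentile_of` and invented test scores:

```python
def percentile_of(value, data):
    """Percent of data values that are less than or equal to `value`."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

scores = [55, 60, 65, 70, 70, 80, 85, 90, 95, 100]  # hypothetical scores

print(percentile_of(80, scores))
```

Six of the ten scores are at or below 80, so a score of 80 sits at the 60th percentile in this made-up set.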

25
Q

Z-score

A

(Data value - mean) / SD
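
The formula can be sketched directly. The function name `z_score` and the numbers (a mean of 70 and an SD of 10) are assumptions chosen for illustration:

```python
def z_score(value, mean, sd):
    """Number of standard deviations `value` lies above (+) or below (-) the mean."""
    return (value - mean) / sd

# Hypothetical exam with mean 70 and standard deviation 10
print(z_score(85, 70, 10))  # above the mean, so positive
print(z_score(60, 70, 10))  # below the mean, so negative
```

A z-score has no units, so it can be used to compare values from distributions with different means and spreads.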
