Module 1 (Ch. 1-3) Flashcards

(45 cards)

1
Q

Population

A

All of the entities of interest in a study (people, households, machines, etc.)

2
Q

Sample

A

A subset of the population, often randomly chosen and representative of the population as a whole

3
Q

Data Set

A

A rectangular array of data, usually with variables in columns and observations in rows

4
Q

Variable

A

Also known as a field or attribute. A characteristic of members of a population (height, gender, salary). Stored as a COLUMN (up and down).

5
Q

Observation

A

Also known as a case or record. A list of all variable values for a single member of a population. Stored as a ROW (left to right).

6
Q

Numerical Variable

A

A variable on which meaningful arithmetic can be performed (age, number of children, salary)

7
Q

Categorical Variable

A

A variable on which NO meaningful arithmetic can be performed (gender or state). Can be either ORDINAL or NOMINAL. Can be coded numerically or left uncoded. Opinion variables with responses such as “strongly disagree” are categorical.

8
Q

Date Variable

A

A variable whose values are dates. Treated differently from typical numbers.

9
Q

Ordinal (Categorical Variable)

A

A categorical variable with a natural ordering of its possible values

10
Q

Nominal (Categorical Variable)

A

A categorical variable with no natural ordering of its possible values

11
Q

Dummy Variable

A

A 0-1 coded variable for a specific category (1 for every observation in the category, 0 for every observation not in it)
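
A minimal Python sketch of this 0-1 coding. The state codes are made up for illustration, and the helper name dummy_code is hypothetical.

```python
def dummy_code(values, category):
    """Return a 0-1 list: 1 where the value equals `category`, else 0."""
    return [1 if v == category else 0 for v in values]

# Hypothetical categorical data, one value per observation.
states = ["OH", "IN", "OH", "MI"]

# Dummy variable for the category "OH".
is_ohio = dummy_code(states, "OH")
print(is_ohio)
```

Summing a dummy variable gives the count of observations in the category.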

12
Q

Binned or Discretized Numerical Variable

A

A NUMERICAL variable that has been categorized by putting its values into discrete categories (called BINS)
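
One way binning might look in Python. The ages, cutoffs, and bin labels are assumptions chosen for illustration, and bin_value is a hypothetical helper.

```python
def bin_value(x, cutoffs, labels):
    """Return the label of the first bin whose upper cutoff is >= x.

    `labels` must have one more entry than `cutoffs`; the last label
    catches every value above the highest cutoff.
    """
    for cutoff, label in zip(cutoffs, labels):
        if x <= cutoff:
            return label
    return labels[-1]

# Hypothetical numerical variable and bin definitions.
ages = [22, 35, 47, 68]
cutoffs = [30, 50]
labels = ["young", "middle", "senior"]

binned = [bin_value(a, cutoffs, labels) for a in ages]
print(binned)
```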

13
Q

Discrete Numerical Variable

A

A numerical variable that results from a count (number of children)

14
Q

Continuous Variable

A

A numerical variable that results from an essentially continuous measurement (weight or height)

15
Q

Cross-Sectional Data

A

Data on a cross section of a population at a distinct point in time

16
Q

Time Series

A

Data collected over time.

17
Q

Count of Categories

A

Count the number of observations (rows) in each category
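
A short Python sketch of counting categories, using made-up opinion responses.

```python
from collections import Counter

# Hypothetical categorical variable, one response per observation.
responses = ["agree", "disagree", "agree", "neutral", "agree"]

# Counter tallies how many observations fall in each category.
counts = Counter(responses)

for category, count in counts.items():
    print(category, count)
```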

18
Q

Mean

A

The average of all values. In Excel: the AVERAGE function.

19
Q

Sample Mean

A

The mean of a sample drawn from some larger population. Denoted by x̄ (an X with a line above).

20
Q

Population Mean

A

The mean of the entire population. Denoted by μ (the Greek letter mu).

21
Q

Median

A

Middle observation when the data are sorted from smallest to largest (the average of the two middle observations when the count is even). In Excel: the MEDIAN function.

22
Q

Mode

A

The value that appears most often. In Excel: the MODE function.
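
The cards point to Excel functions for the mean, median, and mode. As a rough Python equivalent, the standard statistics module offers the same three measures. The numbers below are made up for illustration.

```python
import statistics

values = [2, 3, 3, 5, 12]  # hypothetical data

mean_value = statistics.mean(values)      # sum of the values divided by their count
median_value = statistics.median(values)  # middle value after sorting
mode_value = statistics.mode(values)      # most frequent value

print(mean_value, median_value, mode_value)
```

The single large value (12) pulls the mean above the median here.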

23
Q

Range

A

A measure of variability (spread).

Maximum value minus minimum value.

Very sensitive to extreme values

24
Q

Interquartile Range (IQR)

A

3rd quartile minus 1st quartile.

Less sensitive to extreme values than the range
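
A small Python sketch comparing the range and the IQR on made-up data with one extreme value. Note the assumption: statistics.quantiles uses its default quartile method here, and other tools (Excel included) may use a slightly different quartile convention, so exact quartile values can differ.

```python
import statistics

values = [1, 2, 3, 4, 5, 6, 7, 100]  # hypothetical data with one extreme value

# Range: maximum minus minimum.
data_range = max(values) - min(values)

# IQR: 3rd quartile minus 1st quartile.
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

print(data_range, iqr)
```

The extreme value dominates the range but barely moves the IQR.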

25
Variance
The average of the squared deviations from the mean, where each squared deviation is (x - mean)²
26
Sample Variance
Denoted by s²; the variance computed from a sample (the sum of squared deviations is divided by n - 1)
27
Population Variance
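Denoted by σ² (sigma squared); the variance computed from the entire population (the sum of squared deviations is divided by n)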
28
Standard Deviation
The square root of the variance
29
Sample Standard Deviation
Denoted by s, the square root of the sample variance
30
Population Standard Deviation
Denoted by σ (sigma), the square root of the population variance
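
The variance and standard deviation cards can be sketched in Python as follows. The data are made up, and the divisors follow the usual convention: n for a population, n - 1 for a sample.

```python
import math

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
n = len(values)
mean_value = sum(values) / n

# Sum of squared deviations from the mean.
sum_sq_dev = sum((x - mean_value) ** 2 for x in values)

pop_variance = sum_sq_dev / n           # population variance (sigma squared)
sample_variance = sum_sq_dev / (n - 1)  # sample variance (s squared)

pop_sd = math.sqrt(pop_variance)        # population standard deviation (sigma)
sample_sd = math.sqrt(sample_variance)  # sample standard deviation (s)

print(pop_variance, sample_variance, pop_sd, sample_sd)
```

The sample versions are always slightly larger than the population versions computed from the same numbers.
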
31
Empirical Rules
If the values of a variable are approximately normally distributed, then about 68% of them fall within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations
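
These percentages can be checked against an exact normal distribution. The sketch below uses NormalDist from Python's standard statistics module; share_within is a hypothetical helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(k, round(share, 4))
```

Real data are only approximately normal, so observed shares will be close to these values rather than equal to them.
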
32
Mean Absolute Deviation (MAD)
The average of the absolute deviations from the mean. Empirical rule: for many variables, the standard deviation is approximately 25% larger than the MAD.
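
A minimal Python sketch of the MAD on made-up data, with the standard deviation computed alongside for comparison. The population form of the standard deviation (dividing by n) is an assumption made here for simplicity.

```python
values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
n = len(values)
mean_value = sum(values) / n

# MAD: average of the absolute deviations from the mean.
mad = sum(abs(x - mean_value) for x in values) / n

# Standard deviation (population form) for comparison.
pop_sd = (sum((x - mean_value) ** 2 for x in values) / n) ** 0.5

ratio = pop_sd / mad
print(mad, pop_sd, ratio)
```

For this small data set the standard deviation is about a third larger than the MAD, in the same neighborhood as the 25% rule of thumb.
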
33
Skewness
Occurs when there is a lack of symmetry.
34
Skewed to the Right
Positively skewed. A variable can be skewed to the right because of a few really large values (ex: large baseball salaries)
35
Skewed to the Left
Negatively skewed. A variable can be skewed to the left because of a few really small values (ex: temperature lows in Antarctica)
36
Kurtosis
The "fatness" of the tails of a distribution relative to the tails of a normal distribution. A distribution with high kurtosis has many more extreme observations.
37
Histogram
Most common type of chart for showing the distribution of a numerical, cross-sectional variable. Based on binning the variable: a column chart of the counts in the various bins. Great for showing the shape of a distribution.
38
Box Plot
Also known as a box-whisker plot. An alternative type of chart for showing the distribution of cross-sectional variables.
39
Time Series Graph
A graph of a time series variable. The main interest is to see how the variable changes over time.
40
Outlier
a value or an entire observation (row) that lies well outside of the norm
41
Filtering
finding records that match particular criteria
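
Filtering can be sketched in Python as a list comprehension over records. The records and the criteria (state is "OH" and age above 40) are made up for illustration.

```python
# Hypothetical records, one dictionary per observation.
records = [
    {"name": "Ann", "age": 34, "state": "OH"},
    {"name": "Bob", "age": 51, "state": "IN"},
    {"name": "Cal", "age": 47, "state": "OH"},
]

# Keep only the records that match both criteria.
matches = [r for r in records if r["state"] == "OH" and r["age"] > 40]

print([r["name"] for r in matches])
```
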
42
Crosstabs
Also known as a contingency table. A way to examine relationships between two categorical variables through counts and corresponding charts of the counts
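
A rough Python sketch of the counts behind a crosstab, assuming two made-up categorical variables (gender and a yes/no answer). Each cell of the table is the count of observations with that pair of values.

```python
from collections import Counter

# Hypothetical observations: (gender, answer) for each row.
observations = [
    ("F", "yes"),
    ("M", "no"),
    ("F", "no"),
    ("F", "yes"),
    ("M", "yes"),
]

# Count how many observations fall in each (gender, answer) cell.
crosstab = Counter(observations)

for gender in ("F", "M"):
    row = [crosstab[(gender, answer)] for answer in ("yes", "no")]
    print(gender, row)
```
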
43
Comparison Problem
When you want to compare a numerical measure across two or more subpopulations
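
A comparison problem might look like this in Python: the mean of a numerical variable is computed separately for each subpopulation. The regions and salary figures are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations: (region, salary in thousands).
salaries = [("East", 50), ("West", 70), ("East", 60), ("West", 90)]

# Group the numerical values by subpopulation.
groups = defaultdict(list)
for region, salary in salaries:
    groups[region].append(salary)

# Compare the same measure (here, the mean) across the groups.
group_means = {region: mean(vals) for region, vals in groups.items()}
print(group_means)
```
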
44
Trendline
A line or curve that fits the scatterplot
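
The card does not say how the line is fitted. One common choice, assumed here, is a least-squares straight line; the sketch below computes its slope and intercept by hand on made-up points.

```python
# Hypothetical scatterplot points.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Least-squares slope: co-variation of x and y divided by variation of x.
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
# The fitted line passes through the point (x_bar, y_bar).
intercept = y_bar - slope * x_bar

print(slope, intercept)
```

These points lie exactly on a line, so the fitted trendline is y = 2x + 1.
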
45