Intro Flashcards
Questionable conclusions could be a result of:
- Insufficient number of data points
- “Bad” data points
- Incomplete data points
- Misinformation
Inferential Statistics is...
Inferential Statistics: Drawing conclusions about a population based on sample data
Why samples?
– Obtaining information on the entire population is expensive
– It may be impossible to examine every member of the population (e.g. battery life)
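A minimal sketch of the idea, assuming a made-up population of battery lifetimes (the values, the sample size of 50, and the variable names are illustrative and not from the course material). Testing a battery uses it up, so only a sample is examined and its mean serves as an estimate of the population mean:

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 battery lifetimes in hours (made-up values).
population = [random.gauss(100, 15) for _ in range(10_000)]

# Examine only a small random sample instead of every battery.
sample = random.sample(population, 50)

# The sample mean is used to draw a conclusion about the population mean.
sample_mean = statistics.mean(sample)
population_mean = statistics.mean(population)

print(f"sample mean:     {sample_mean:.1f} hours")
print(f"population mean: {population_mean:.1f} hours")
```

In practice the population mean is unknown; it is computed here only to show that the sample estimate lands close to it.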
Two Data Types:
– Cross-Sectional Data
– Time-Series Data
Cross-Sectional Data:
Cross-Sectional Data: recording characteristics of many subjects at the same point in time, or without regard to differences in time
Time-Series Data:
Time-Series Data: recording characteristics of a subject (a certain group of people, events, or objects) over a period of time
Structured Data:
Well-defined structure (length, format); rows/columns with specific characteristics
- numbers, dates, groups of words
Unstructured Data:
No defined structure
texts, images, videos, social media posts, blogs
Variable:
A characteristic of interest that differs in kind or degree
Quantitative Variables:
discrete (countable: # of people) or continuous (uncountable: weight, height, time) variables
Measurement Scales for Qualitative Data:
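Nominal Scale: CATEGORIZE - values differ by name/label only; numbers can be substituted as codes, but they carry no order
Ordinal Scale: CATEGORIZE & ORDER/rank (numbers can be substituted to mean a level: ratings 1-4 - poor, fair, good, excellent); differences between ranks are not meaningful, so arithmetic operations are not valid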
Measurement Scales for Quantitative Data:
Interval Scale:
- categorize
- rank
- differences between scale values are meaningful
- Ex. temperature scales (Celsius, Fahrenheit)
- ratios are not meaningful (zero is arbitrary, not a true zero)
Ratio Scale:
- categorize
- rank
- arithmetic is meaningful
- value of zero is true zero
- ratios are meaningful
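A short worked example of why ratios fail on an interval scale but hold on a ratio scale. The temperatures and weights are made-up values, and `c_to_f` is a helper introduced here for illustration:

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval scale: temperature. Zero is arbitrary, so ratios depend on the unit.
a_c, b_c = 10.0, 20.0
a_f, b_f = c_to_f(a_c), c_to_f(b_c)   # 50.0 and 68.0

ratio_c = b_c / a_c                   # 2.0 in Celsius
ratio_f = b_f / a_f                   # 1.36 in Fahrenheit: "twice as hot" is not meaningful

# Ratio scale: weight. Zero means "no weight", so ratios survive a unit change.
KG_TO_LB = 2.20462
a_kg, b_kg = 10.0, 20.0
ratio_kg = b_kg / a_kg
ratio_lb = (b_kg * KG_TO_LB) / (a_kg * KG_TO_LB)

print(ratio_c, ratio_f)    # the two ratios disagree
print(ratio_kg, ratio_lb)  # the two ratios agree (up to rounding)
```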
What is a frequency distribution table used for?
Frequency distribution table:
- qualitative data
- groups into categories
- Ex. Weather in Seattle for Feb. 2010
When do you use a relative frequency table?
When comparing frequency distributions whose totals are not the same.
- divide each frequency by its table's total to get relative frequencies (proportions or percents), which can then be compared
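As an illustration, the sketch below compares weather counts for two months of different lengths. The data are invented for the example, and `relative_frequencies` is a hypothetical helper name:

```python
from collections import Counter

# Made-up daily weather observations for two months with different totals.
month_a = ["Rainy"] * 12 + ["Cloudy"] * 10 + ["Sunny"] * 6    # 28 days
month_b = ["Rainy"] * 12 + ["Cloudy"] * 9 + ["Sunny"] * 10    # 31 days

def relative_frequencies(observations):
    """Return each category's share of the total number of observations."""
    counts = Counter(observations)          # the plain frequency distribution
    total = len(observations)
    return {category: count / total for category, count in counts.items()}

rel_a = relative_frequencies(month_a)
rel_b = relative_frequencies(month_b)

# Both months have 12 rainy days, but the shares differ because the totals differ.
for category in ("Rainy", "Cloudy", "Sunny"):
    print(f"{category:7s} {rel_a[category]:.1%} vs {rel_b[category]:.1%}")
```

The raw counts alone would suggest the two months were equally rainy; the relative frequencies show that rain made up a larger share of the shorter month.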
Guidelines for Constructing a Frequency Distribution
- classes are mutually exclusive (no overlap: each observation falls into exactly one class, signified by one class limit using a strict inequality, i.e. no equals sign)
- classes are exhaustive
- total number of classes in a frequency distribution usually ranges from 5 to 20 (not too much detail, but not too little)
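The guidelines above can be sketched as follows, assuming equal-width classes that include the lower limit and exclude the upper limit. The scores, the class width of 10, and the function name `frequency_distribution` are assumptions made for this example:

```python
def frequency_distribution(values, lower, width, num_classes):
    """Count values into classes [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * num_classes
    for v in values:
        # Half-open classes: a value on a boundary belongs to the upper class only,
        # which keeps the classes mutually exclusive.
        index = int((v - lower) // width)
        if not 0 <= index < num_classes:
            # A value outside every class means the classes are not exhaustive.
            raise ValueError(f"{v} does not fall into any class")
        counts[index] += 1
    return [(lower + i * width, lower + (i + 1) * width, counts[i])
            for i in range(num_classes)]

scores = [52, 60, 61, 67, 70, 70, 74, 79, 80, 85, 88, 93, 99]  # made-up data
table = frequency_distribution(scores, lower=50, width=10, num_classes=5)

for low, high, count in table:
    print(f"{low} up to {high}: {count}")
```

Five classes sits at the low end of the 5 to 20 range, which suits a data set this small; a larger data set would usually support more classes.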