Module 1: Introduction to Data Flashcards
Concept
Answer
A frequency table exhibits how…
frequencies are distributed over various categories (known as a frequency distribution)
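A minimal sketch of building a frequency table in Python, assuming the observations arrive as a plain list of category labels. The `colors` data and the `frequency_table` helper are invented for illustration.

```python
from collections import Counter

def frequency_table(values):
    """Return {category: count} for a list of categorical observations."""
    return dict(Counter(values))

# Hypothetical sample data, made up for this example
colors = ["red", "blue", "red", "green", "blue", "red"]
table = frequency_table(colors)

for category, count in table.items():
    print(category, count)
```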
Associated variables
When two variables show some connection/relationship with one another
Blocking (experimental design)
Grouping the sample based on variables that may affect the outcome, then randomizing within each group
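One way to picture blocking is the sketch below: subjects are first grouped by a blocking variable, and treatment/control labels are then randomized inside each group. The function name `blocked_assignment`, the `(id, risk level)` subjects, and the half-and-half split are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def blocked_assignment(subjects, block_of, seed=0):
    """Split subjects into blocks, then randomize treatment/control within each block."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of(subject)].append(subject)

    assignment = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)  # randomization happens inside the block
        half = len(shuffled) // 2
        for subject in shuffled[:half]:
            assignment[subject] = "treatment"
        for subject in shuffled[half:]:
            assignment[subject] = "control"
    return assignment

# Hypothetical subjects: (id, risk level), where risk level is the blocking variable
subjects = [(1, "low"), (2, "low"), (3, "low"), (4, "low"),
            (5, "high"), (6, "high"), (7, "high"), (8, "high")]
assignment = blocked_assignment(subjects, block_of=lambda s: s[1])
```

Because the shuffle runs separately per block, each risk level ends up evenly represented in both arms.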
Categorical variable
The individual entries are categories; the possible values are called “levels”
Cluster sample
Break the population into groups and then sample a fixed number of those groups and include all observations from each group; helpful when there’s a lot of variability between cases within a cluster but the clusters themselves don’t differ much from one another
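The cluster-sampling procedure can be sketched as follows, assuming the population is already stored as a mapping from cluster name to its observations. The `cluster_sample` helper and the village data are hypothetical.

```python
import random

def cluster_sample(clusters, k, seed=0):
    """Pick k clusters at random and keep every observation in each chosen cluster."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    chosen = rng.sample(sorted(clusters), k)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population: cluster name -> observations in that cluster
clusters = {
    "village_a": [1, 2, 3],
    "village_b": [4, 5],
    "village_c": [6, 7, 8, 9],
    "village_d": [10],
}
sample = cluster_sample(clusters, k=2)
```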
Confounding variable
A variable that is correlated with both the explanatory and the response variables
Continuous variable
A numerical variable that can take any value within a range, to arbitrary decimal precision; e.g. height, weight (think how much)
Controlling (experimental design)
Mitigating differences between groups other than the treatment being studied
Convenience sample bias
When individuals who are more accessible are more likely to be included in the sample
Cumulative frequency
The total of a frequency and all frequencies below it in a frequency distribution; the running total of frequencies
Cumulative relative frequency
The cumulative frequency for a category divided by the sum of all frequencies
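The two cumulative definitions above can be sketched together, assuming the frequencies are given in the order of the frequency distribution. The `cumulative_table` helper and the `freqs` values are made up for illustration.

```python
from itertools import accumulate

def cumulative_table(frequencies):
    """Return (cumulative frequencies, cumulative relative frequencies)."""
    total = sum(frequencies)
    cumulative = list(accumulate(frequencies))  # running total of frequencies
    cumulative_relative = [c / total for c in cumulative]
    return cumulative, cumulative_relative

# Hypothetical frequencies for three ordered categories
freqs = [2, 5, 3]
cum, cum_rel = cumulative_table(freqs)
print(cum)      # [2, 7, 10]
print(cum_rel)  # [0.2, 0.7, 1.0]
```

The last cumulative relative frequency is always 1, since the running total has reached the sum of all frequencies.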
Data
Information we gather with experiments and with surveys
Description
Summarizing the data that are obtained
Descriptive statistics
Refers to methods for summarizing the data; describes the sample only (graphs, numerical summaries)
Design
Planning how to obtain data to answer the questions of interest (experimental design, sample size, power, etc.)
Discrete variable
A numerical variable that only takes number values in jumps (e.g. whole numbers); e.g. the number that appears when throwing a die (think how many)
Experiment
Used to investigate the possible causal connection between variables
Explanatory variable
The variable (first) that causally affects the other
Frequency
The number of elements that belong in a certain category
Graphical methods
Histogram, boxplot, bar graph, etc.
Graphs (categorical)
Bar chart, pie chart; focuses on frequencies or relative frequencies of the levels of the variable
Graphs (numerical/scale)
Dot chart (discrete variable), stem-and-leaf plot, histogram, boxplot, scatterplot
Histogram
A bar chart that gives the frequencies or relative frequencies of occurrences of a scale variable in certain intervals; the heights of the bars in the histogram are called the distribution of the sample
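As a rough sketch of what a histogram computes before any bars are drawn, the code below counts how many values fall in each interval. It assumes equal-width intervals of the form [start, start + width) and ignores values outside the covered range; the `histogram_counts` helper and the `heights` data are hypothetical.

```python
def histogram_counts(values, lower, width, n_bins):
    """Count values per interval [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - lower) // width)
        if 0 <= index < n_bins:  # values outside the covered range are skipped
            counts[index] += 1
    return counts

# Hypothetical heights in cm, made up for this example
heights = [150, 155, 162, 168, 171, 175, 179, 183]
counts = histogram_counts(heights, lower=150, width=10, n_bins=4)
relative = [c / len(heights) for c in counts]
print(counts)    # [2, 2, 3, 1]
print(relative)  # [0.25, 0.25, 0.375, 0.125]
```

Here `counts` holds the bar heights for a frequency histogram and `relative` holds them for a relative-frequency histogram.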