Exam 1 Flashcards
(91 cards)
Data file
the format in which statistical data is organized, typically a spreadsheet. Each row contains the measurements for one subject; each column contains the measurements for one characteristic
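A minimal sketch of the row/column layout described above. The variable names and values are made up for illustration; they are not from the course material.

```python
# Hypothetical data file: each dict is one row (one subject),
# each key is one column (one characteristic).
rows = [
    {"subject": 1, "height_cm": 170.2, "eye_color": "brown"},
    {"subject": 2, "height_cm": 158.9, "eye_color": "blue"},
    {"subject": 3, "height_cm": 181.5, "eye_color": "brown"},
]

def column(rows, variable):
    """Return every subject's measurement for one characteristic."""
    return [row[variable] for row in rows]

print(rows[0])                     # all measurements for one subject
print(column(rows, "eye_color"))   # one characteristic across all subjects
```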
Simulation
use of a computer to mimic what would actually happen if you selected a sample and computed statistics in real life. Simulations are used when it is not practical to physically perform an experiment. Probability sampling is used in designing simulations
Response variable
variable we are interested in measuring
component
what you are simulating through use of a random device
trial
One repetition of a simulation/experiment
Steps for building simulations
- Identify component to be repeated/simulated
- Explain how you will model the component’s outcome
- State response variable clearly
- Explain how to combine the components into a trial to model the response variable
- Run several trials
- Collect and summarize the results of the trials
- State your conclusion
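The steps above can be sketched in code. This is one possible illustration, assuming a made-up question ("how many heads appear in 10 flips of a fair coin?"); the function names, the number of trials, and the seed are all assumptions chosen for the example.

```python
import random

def run_trial(rng, flips_per_trial=10):
    """One trial: combine 10 components and return the response variable."""
    # Component: a single coin flip, modeled as a random draw of 0 (tails) or 1 (heads).
    # Response variable: the number of heads in the trial.
    return sum(rng.randint(0, 1) for _ in range(flips_per_trial))

def simulate(num_trials=1000, flips_per_trial=10, seed=0):
    """Run several trials and collect the response variable from each."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [run_trial(rng, flips_per_trial) for _ in range(num_trials)]

# Collect and summarize the results of the trials.
results = simulate()
mean_heads = sum(results) / len(results)
share_seven_plus = sum(r >= 7 for r in results) / len(results)

# State a conclusion based on the summary.
print(f"Average heads per trial: {mean_heads:.2f}")
print(f"Share of trials with 7 or more heads: {share_seven_plus:.3f}")
```

Changing `num_trials` shows how the summary settles down as more trials are run.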
3 reasons for studying stats
- being informed
- making good decisions
- evaluating decisions that affect you
Definition of statistics
The science of learning from data in the presence of variability. Variability is everywhere
Statistical problem solving process
- formulate a statistical research question
- collect data
- analyze data
- interpret results
Main components of statistics
- design: plan on how to obtain data to answer the question
- description: summarize and analyze the data
- probability: determine how sample differs from population
- Inference: make decisions and predictions
Variable
any characteristic observed in a study
data
the values of a variable for one or more people or things
Observation
(subject) an individual piece of data
data set
the collection of all observations for a particular variable
Categorical variable
(qualitative) A variable whose values fall into categories rather than numerical measurements. A category can still be labeled with a number (e.g., a zip code), depending on what that number represents
Quantitative variable (and types)
a numerical variable
Types
- Discrete: values form a set of separate numbers. Typically something we count
- Continuous: values form a continuum with an infinite number of possible values. Typically something we measure
Reasons for identifying different data types
- Choose appropriate graphical display
- Choose correct statistical method for inferential procedures
W's and H for data
How, What, Where, When, Why, Who
Frequency distribution
A listing of distinct categories and their frequencies
Relative frequency distribution
A listing of distinct values and their relative frequencies (proportions or percentages). Used to compare samples of unequal size
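A short sketch of both kinds of distribution, showing why relative frequencies help when sample sizes differ. The two samples are invented for illustration, and the function names are assumptions.

```python
from collections import Counter

def frequency_distribution(values):
    """Map each distinct category to how many times it occurs."""
    return dict(Counter(values))

def relative_frequency_distribution(values):
    """Map each distinct category to its proportion of the sample."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}

# Two made-up samples of unequal size (4 and 10 observations).
sample_a = ["red", "blue", "red", "green"]
sample_b = ["red"] * 5 + ["blue"] * 3 + ["green"] * 2

print(frequency_distribution(sample_a))            # red appears 2 times
print(frequency_distribution(sample_b))            # red appears 5 times
print(relative_frequency_distribution(sample_a))   # red is 0.5 of the sample
print(relative_frequency_distribution(sample_b))   # red is 0.5 of the sample
```

The raw counts for "red" differ (2 vs. 5), but the proportions match, which is what makes the two samples comparable.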
Joint event
Event with two or more characteristics
How to tell if there is an association or not?
Association: relative frequencies differ
No association: relative frequencies are similar
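One way to make this comparison concrete is to compute relative frequencies within each group of a two-way table and compare them side by side. The table below is invented for illustration, and the function name is an assumption. How large a difference counts as "differ" is a judgment call that this sketch does not settle.

```python
def conditional_relative_frequencies(table):
    """Convert counts to proportions within each group.

    table maps group -> {outcome: count}.
    """
    result = {}
    for group, counts in table.items():
        total = sum(counts.values())
        result[group] = {outcome: count / total for outcome, count in counts.items()}
    return result

# Made-up counts: two groups, one yes/no outcome.
table = {
    "group A": {"yes": 30, "no": 70},
    "group B": {"yes": 60, "no": 40},
}

props = conditional_relative_frequencies(table)
for group, outcome_props in props.items():
    print(group, outcome_props)
```

Here the proportion of "yes" is 0.3 in one group and 0.6 in the other, so the relative frequencies differ, which suggests an association between group and outcome.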
Dot plots
- easy to make
- useful for comparing 2 or more data sets
- display individual values of data set
- good for smaller data sets
- shows raw data
Stem plots
- not useful with large data sets
- Usually display more info than histograms
- include raw data
- useful for comparing 2 or more data sets
- Have a “stem” (can have more than one digit) and a “leaf” (cannot have more than one digit)
- arranged in ascending order
- must have a key
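A minimal text-based sketch of a stem plot that follows the rules above. It assumes a non-empty list of non-negative whole numbers and uses the last digit as the leaf; the function name and the sample values are made up for illustration.

```python
from collections import defaultdict

def stem_plot(values):
    """Return the lines of a stem plot; the leaf is the last digit of each value."""
    leaves_by_stem = defaultdict(list)
    for value in sorted(values):           # ascending order
        stem, leaf = divmod(value, 10)     # stem may have several digits, leaf has one
        leaves_by_stem[stem].append(leaf)

    lines = []
    # Include every stem in the range, even those with no leaves.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = "".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())

    # The key explains how to read a stem and leaf together.
    smallest = min(values)
    key_stem, key_leaf = divmod(smallest, 10)
    lines.append(f"Key: {key_stem} | {key_leaf} means {smallest}")
    return lines

for line in stem_plot([42, 45, 51, 38, 67, 42]):
    print(line)
```

Because every leaf is printed, the raw data can be read back from the plot, which is not possible with a histogram.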