Chapter 5 - Statistics Flashcards
(85 cards)
What are you conducting if you want to obtain data for every element of your population? Why is it not generally done?
You’re conducting a census. It’s not generally done because checking every single member of your chosen population requires huge resources.
You’ve identified that you need to analyze data about films directed by Steven Spielberg. What would your population be?
The population would be all films directed by Steven Spielberg.
After finding all films directed by Spielberg, you want to concentrate your analysis on the ones made with a particular camera model. What name is given to this type of analysis?
Univariate Analysis
Pick the right words for the gaps:
______ pertain to the sample and ________ pertain to the population
Statistics pertain to the sample and parameters pertain to the population.
Which branch of statistics summarizes and describes data?
Descriptive Statistics
What type of statistics do you use to help you understand the characteristics of your data?
Descriptive Statistics
In descriptive statistics, the first step is applying measures of what to your sample data? Why?
Measures of frequency (like the count), to determine the size of the data set.
Knowing the size helps you determine whether you can analyze the data on your laptop or will need more processing power than a laptop provides.
What’s the most common measure of frequency?
Count
When measuring the count of a dataset, what must you decide how to handle?
Null values
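As a minimal sketch of why null handling matters for the count, the snippet below counts a small made-up list both with and without its nulls. The `ratings` data is hypothetical and `None` is assumed to stand in for a null value.

```python
# Hypothetical sample: None marks a missing (null) observation.
ratings = [7.5, None, 8.1, 6.9, None, 7.2]

# Counting every row includes the nulls.
total_rows = len(ratings)

# Counting only usable values excludes the nulls.
non_null_count = sum(1 for r in ratings if r is not None)

# The difference is the number of nulls you have to decide how to treat.
null_count = total_rows - non_null_count

print(total_rows, non_null_count, null_count)
```

Which of the two counts is the "right" one depends on the objective of the analysis.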
What are the 3 measures of frequency mentioned in the book?
Count
Percentage
Frequency
A histogram is typically used to visualize what measure when conducting what kind of analysis?
Used to visualize frequency when conducting univariate analysis
Which measure of frequency can help you identify biases in your dataset? Bias must be taken in the context of what?
Percentage measures
Bias must be taken within the context of your OBJECTIVES. It’s fine if the percentage of males in a sample is 100% if you’re only concerned with data that should only include men!
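A percentage measure can be sketched as each category's count divided by the total. The helper name `percentages` and the sample values are illustrative assumptions, not taken from the book.

```python
from collections import Counter

def percentages(values):
    """Return each distinct value's share of the total, as a percentage."""
    counts = Counter(values)
    total = len(values)
    return {value: 100 * count / total for value, count in counts.items()}

# Hypothetical sample of a categorical variable.
sample = ["M", "M", "F", "M"]
pct = percentages(sample)
print(pct)
```

Whether a lopsided split like this counts as bias still depends on your objectives.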
What are the 3 measures of central tendency?
Mean
Median
Mode
The Mean is also known as?
The average
When calculating the mid-point value (Median) of an even number of data observations, what must you do?
Add together the two middle values of the ordered list, then divide by 2.
What is the calculation that tells you which POSITION (not value) in an ordered list with an odd number of observations is the median? Describe what ‘n’ is.
(n + 1) divided by 2, where n = the number of observations.
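The two median rules from the cards above can be sketched in one function. The function name `median_by_position` is an illustrative assumption. Positions are 1-based, as in the (n + 1) / 2 formula, so the code subtracts 1 when indexing.

```python
def median_by_position(values):
    """Median via the position rules for odd and even numbers of observations."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    if n % 2 == 1:
        # Odd n: the median sits at 1-based position (n + 1) / 2.
        position = (n + 1) // 2
        return ordered[position - 1]
    # Even n: average the two middle values.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

print(median_by_position([5, 1, 9]))      # odd number of observations
print(median_by_position([4, 1, 3, 2]))   # even number of observations
```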
Which central tendency measure identifies the most frequently occurring observation?
The Mode.
What are the measures of dispersion mentioned in the book?
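As a small illustration, Python's standard `statistics` module can find the mode. The sample lists are made up. `multimode` (Python 3.8+) is used for the second list because a dataset can have more than one most frequent value.

```python
import statistics

# Hypothetical observations with a single most frequent value.
single = [1, 2, 2, 3]
single_mode = statistics.mode(single)

# Hypothetical observations where two values tie for most frequent.
tied = ["a", "b", "b", "c", "c"]
all_modes = statistics.multimode(tied)

print(single_mode, all_modes)
```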
Range
Distribution
Variance
Standard Deviation
What’s the name given to the difference between a variable’s max and min values?
The Range
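The range, along with the variance and standard deviation listed among the measures of dispersion, can be computed as in the sketch below. The `temps` values are invented. The population forms (`pvariance`, `pstdev`) are an assumption made here for simplicity; for a sample you would typically use `statistics.variance` and `statistics.stdev` instead.

```python
import statistics

# Hypothetical observations of a single variable.
temps = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Range: difference between the maximum and minimum values.
value_range = max(temps) - min(temps)

# Variance: average squared distance from the mean (population form).
variance = statistics.pvariance(temps)

# Standard deviation: square root of the variance, in the data's own units.
std_dev = statistics.pstdev(temps)

print(value_range, variance, std_dev)
```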
Why is it that calculating the range on temperature values by themselves won’t help you identify invalid data?
Because temperature values can vary widely and can be positive or negative. You need additional information, like location and time of year, to give context.
Which tool is effective for visualizing a probability distribution? Why?
A histogram, because the shape you see provides additional insight into how to proceed with the analysis.
Which theorem states that as sample size increases, the sampling distribution of the mean becomes closer to a normal distribution?
Central Limit Theorem.
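The theorem can be illustrated with a rough simulation. This is a sketch under stated assumptions: the parent population is taken to be exponential (strongly right-skewed), the seed and sample counts are arbitrary, and `skewness` and `sample_means` are helper names invented for this example. A skewness near 0 suggests a symmetric, more normal-looking shape.

```python
import random
import statistics

def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric shape)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)

def sample_means(sample_size, n_samples, rng):
    """Draw n_samples samples from a skewed population; return each sample's mean."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

rng = random.Random(42)  # fixed seed so the run is repeatable

skew_small = skewness(sample_means(2, 2000, rng))    # tiny samples
skew_large = skewness(sample_means(50, 2000, rng))   # larger samples

# The sample means from larger samples should be noticeably less skewed.
print(skew_small, skew_large)
```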
Whilst they look very similar, a frequency histogram and a distribution histogram are different. How?
Frequency histograms focus on the raw count of observations that fall in each interval.
Distribution histograms focus on shape and spread by looking at how often values fall in an interval relative to the total number of values.
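The difference can be sketched by binning the same values twice: once as raw counts and once as shares of the total. The data, the bin width of 2, and the variable names are illustrative assumptions.

```python
from collections import Counter

# Hypothetical observations and an assumed interval (bin) width.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
bin_width = 2

# Frequency view: raw count of observations per interval,
# keyed by the interval's starting value.
counts = Counter((v // bin_width) * bin_width for v in values)

# Distribution view: each interval's count relative to the total.
total = len(values)
relative = {start: count / total for start, count in counts.items()}

for start in sorted(counts):
    print(f"[{start}, {start + bin_width}): count={counts[start]}, share={relative[start]:.2f}")
```

Both views produce bars of the same shape; only the vertical scale differs.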
Jon is sampling from a parent population that is normal. He takes several samples of varying sizes, some of them smaller than 30. Would the distribution of these sample means be skewed?
No. The sample means would be normally distributed at any sample size because the parent population is normal.