Important Flashcards
(19 cards)
Frequency
A frequency is the number of times a data value occurs. For example, if ten students score 80 in statistics, then the score of 80 has a frequency of 10. Frequency is often represented by the letter f.
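A minimal sketch of counting frequencies in Python, assuming the scores are held in a plain list. The list contents other than the ten scores of 80 are made up for illustration.

```python
from collections import Counter

# Hypothetical scores: ten students score 80 (as in the card), plus a few others.
scores = [80] * 10 + [65, 72, 72, 90, 95]

# Counter maps each data value to its frequency f.
frequency = Counter(scores)

print(frequency[80])  # frequency of the score 80
print(frequency[72])  # frequency of the score 72
```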
Cross tabulation
Also known as contingency tables or cross tabs, cross tabulation groups observations by the categories of two or more variables to show how the variables are related. It also shows how that relationship changes from one variable grouping to another.
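As a rough illustration, a cross tabulation of two categorical variables can be built by counting each combination of categories. The survey records and the names `records` and `table` below are hypothetical.

```python
from collections import Counter

# Hypothetical survey records: (year of study, preferred study method).
records = [
    ("first", "flashcards"),
    ("first", "flashcards"),
    ("first", "notes"),
    ("second", "notes"),
    ("second", "notes"),
    ("second", "flashcards"),
    ("second", "notes"),
]

# Count how often each (row category, column category) pair occurs.
cell_counts = Counter(records)

rows = sorted({year for year, _ in records})
cols = sorted({method for _, method in records})

# Nested dicts: table[row][column] -> number of observations in that cell.
table = {r: {c: cell_counts[(r, c)] for c in cols} for r in rows}

for r in rows:
    print(r, table[r])
```

Reading across a row shows how the second variable is distributed within one group of the first.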
Explanatory variable
Also known as the independent or predictor variable, it explains variations in the response variable; in an experimental study, it is manipulated by the researcher.
Response Variable
Also known as the dependent or outcome variable, its value is predicted or its variation is explained by the explanatory variable; in an experimental study, this is the outcome that is measured following manipulation of the explanatory variable.
Confounding Variable
A confounding variable is an “extra” variable that you didn’t account for. Confounding variables can ruin an experiment and give you useless results. They can suggest there is a correlation when in fact there isn’t, and they can introduce bias. That’s why it’s important to know what one is, and how to keep them out of your experiment in the first place.
IQR
The interquartile range is a measure of where the “middle fifty percent” of a data set lies. Where the range measures the distance between the smallest and largest values in a set, the interquartile range measures the spread of the middle half of the values. IQR = Q3 – Q1.
Compute the IQR
Step 1: Put the numbers in order. 1, 2, 5, 6, 7, 9, 12, 15, 18, 19, 27. Step 2: Find the median. Here the median is 9. Step 3: Place parentheses around the numbers below and above the median. This is not necessary statistically, but it makes Q1 and Q3 easier to spot. (1, 2, 5, 6, 7), 9, (12, 15, 18, 19, 27). Step 4: Find Q1 and Q3. Think of Q1 as the median of the lower half of the data and Q3 as the median of the upper half. Q1 = 5 and Q3 = 18. Step 5: Subtract Q1 from Q3 to find the interquartile range. 18 – 5 = 13.
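The steps on this card can be sketched in Python as follows. The function name `iqr_median_of_halves` is made up for this example, and it assumes at least two values. Other quartile conventions (such as interpolation-based ones used by some libraries) can give slightly different Q1 and Q3.

```python
from statistics import median

def iqr_median_of_halves(values):
    """Return (Q1, Q3, IQR) using the median-of-halves method from the card."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the median position
    upper = ordered[-half:]   # values above the median position
    q1 = median(lower)
    q3 = median(upper)
    return q1, q3, q3 - q1

data = [1, 2, 5, 6, 7, 9, 12, 15, 18, 19, 27]
q1, q3, iqr = iqr_median_of_halves(data)
print(q1, q3, iqr)  # 5 18 13
```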
standard deviation vs interquartile range
Standard deviation: takes into account all the values of a dataset, including any outliers. It depends on the mean, because it measures how much the data deviate from the mean of the dataset. Interquartile range: tells us how spread out the middle half of the data is. The larger this value, the more spread out the data; the smaller the value, the less spread out. Unlike the standard deviation, it does not take into account all the values in the dataset, but mainly their positions when the data are ordered. It is not affected as much by outliers or by data that are skewed or not normally distributed.
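A small sketch of the difference in outlier sensitivity, assuming the median-of-halves quartile method and the population standard deviation. The `with_outlier` data set is hypothetical: it is the list 1, 2, 5, ..., 27 with the largest value replaced by an extreme one.

```python
from statistics import median, pstdev

def iqr(values):
    """Interquartile range via medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[-half:]) - median(ordered[:half])

original = [1, 2, 5, 6, 7, 9, 12, 15, 18, 19, 27]
# Hypothetical variant: replace the largest value with an extreme outlier.
with_outlier = original[:-1] + [270]

# The IQR depends only on positions in the ordered data, so it does not move.
print(iqr(original), iqr(with_outlier))

# The standard deviation uses every value, so the outlier inflates it.
print(pstdev(original), pstdev(with_outlier))
```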
compute interquartile range
Subtract Q1 from Q3 to find the interquartile range.
What is the standard deviation?
Standard deviation is a measure of how spread out a set of data is around its mean. It measures the absolute variability of a distribution: the greater the dispersion, the larger the standard deviation. A low standard deviation means the data are clustered around the mean, and a high standard deviation indicates the data are more spread out.
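A minimal sketch of the calculation, assuming the population form (divide by n). The sample form divides by n - 1 instead and is available as `statistics.stdev`. Both data sets are made up for illustration.

```python
from math import sqrt
from statistics import pstdev

def population_sd(values):
    """Square root of the mean squared deviation from the mean."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sqrt(sum(squared_deviations) / len(values))

# Hypothetical data sets with the same mean (10) but different spread.
clustered = [9, 10, 10, 10, 11]
spread_out = [2, 6, 10, 14, 18]

print(population_sd(clustered))   # small: values sit close to the mean
print(population_sd(spread_out))  # larger: values lie far from the mean

# The standard library's pstdev computes the same quantity.
print(pstdev(clustered), pstdev(spread_out))
```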
Levels of Measurement
Nominal, ordinal, interval, and ratio. Different types of variables are measured differently. To measure the time taken to respond to a stimulus, you might use a stopwatch. To measure someone’s attitude towards a political candidate, a rating scale (with labels like “very favorable,” “somewhat favorable,” etc.) is more appropriate.
Ordinal level of measurement
A scale whose categories have a meaningful order, for example “very dissatisfied,” “somewhat dissatisfied,” “somewhat satisfied,” or “very satisfied.” The items in this scale are ordered, ranging from least to most satisfied. This ordering is what distinguishes ordinal from nominal scales.
Nominal level of measurement
Gender, favorite color, and religion are examples of variables measured on a nominal scale. Nominal scales do not imply any ordering among the responses.
Interval level of measurement
Interval scales are numerical scales in which intervals have the same interpretation throughout. An example is the Fahrenheit scale of temperature: the difference between 30 degrees and 40 degrees represents the same temperature difference as the difference between 80 degrees and 90 degrees.
Ratio level of measurement
An example of a ratio scale is the amount of money you have in your pocket right now (25 cents, 55 cents, etc.). Money is measured on a ratio scale because, in addition to having the properties of an interval scale, it has a true zero point: if you have zero money, this implies the absence of money.
Inferential statistics
Using data from a sample to draw conclusions or make predictions about a larger population.
Normal distribution
Also called the Gaussian distribution. The standard normal distribution has a mean of 0 and a standard deviation of 1.
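As a quick illustration, Python's standard library can represent a normal distribution with mean 0 and standard deviation 1 and report how much probability lies within one or two standard deviations of the mean. The variable names are chosen for this sketch only.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Probability of falling within 1 and within 2 standard deviations of the mean.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_two_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(round(within_one_sd, 4))  # roughly 0.68
print(round(within_two_sd, 4))  # roughly 0.95
```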
What is statistical significance?
Statistical significance helps quantify whether a result is likely due to chance or to some factor of interest. There are two main contributors to sampling error: the size of the sample and the variation in the underlying population. Sample size may be intuitive enough. Think about flipping a coin five times versus flipping it 500 times. The more times you flip, the less likely you’ll end up with a great majority of heads. The same is true of statistical significance: with bigger sample sizes, you’re less likely to get results that reflect randomness.
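The coin-flip comparison can be checked with an exact binomial calculation. This sketch assumes a fair coin and, as an arbitrary choice for illustration, treats "a great majority" as at least 80% heads. The function name `prob_at_least` is hypothetical.

```python
from math import comb

def prob_at_least(heads, flips):
    """Exact probability of at least `heads` heads in `flips` fair coin flips."""
    favourable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favourable / 2 ** flips

# At least 80% heads in 5 flips versus in 500 flips.
p_small_sample = prob_at_least(4, 5)
p_large_sample = prob_at_least(400, 500)

print(p_small_sample)  # 0.1875
print(p_large_sample)  # vanishingly small
```

With five flips, a lopsided result happens by chance almost one time in five; with 500 flips it is practically impossible, which is why a lopsided result in a large sample is stronger evidence of a real effect.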
Find the mean from a scatterplot