Module 1 (Describing & Summarizing Data) Flashcards Preview

Flashcards in Module 1 (Describing & Summarizing Data) Deck (10)
1
Q

What is the standard deviation equal to?

A

The square root of the variance

The standard deviation is a measure of the spread of a data set’s values around its mean value. It is measured in the same units (such as dollars or hours) as the observations in the data.
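
As a quick check of the relationship above, here is a minimal sketch using Python's standard `statistics` module. The observations are hypothetical (they are not from the course data), and the population versions of the statistics are assumed; the sample versions (`statistics.variance` and `statistics.stdev`) satisfy the same square-root relationship.

```python
import math
import statistics

# Hypothetical observations, measured in hours (not from the course data).
hours = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population variance is in squared units (hours squared).
variance = statistics.pvariance(hours)

# Population standard deviation is back in the original units (hours).
std_dev = statistics.pstdev(hours)

# The standard deviation equals the square root of the variance.
print(variance, std_dev, math.sqrt(variance))
```

For this made-up data the variance is 4 and the standard deviation is 2.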

2
Q

What does the coefficient of variation measure, and how is it calculated?

A

It measures the size of the standard deviation relative to the size of the mean.

It is the best statistic to compute when comparing the variability of two data sets with different distributions, because it expresses variation relative to each data set’s mean.

Coefficient of variation = standard deviation / mean
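
The formula can be sketched in a few lines of Python. The helper name `coefficient_of_variation` and both data sets are hypothetical, and the sample standard deviation is assumed.

```python
import statistics

def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical data sets measured on very different scales.
salaries_usd = [40_000, 50_000, 60_000]
commute_minutes = [10, 30, 50]

cv_salaries = coefficient_of_variation(salaries_usd)
cv_commute = coefficient_of_variation(commute_minutes)

print(cv_salaries, cv_commute)
```

In this made-up example the salaries have a much larger standard deviation in absolute terms, yet the commute times vary more relative to their mean (a coefficient of variation of about 0.67 versus 0.2).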

3
Q

Which of the following formulas calculates the statistic that is MOST APPROPRIATE for comparing the variability of two data sets with different distributions?

A - Mean/Standard Deviation
B - Standard Deviation/Mean
C - Mean-Median
D - Median-Mean
E - Variance/Mean

A

B - Standard Deviation/Mean

This is the formula for the coefficient of variation, the best statistic to compute to compare the variability of two data sets with different distributions. Dividing by the mean provides a measure of the distribution’s variation relative to the mean.

4
Q

Consider the four outliers in the 2012 revenue data: companies with revenue of $237 billion, $246 billion, $447 billion, and $453 billion.

If we removed these companies from the data set, what would happen to the standard deviation?

A - The standard deviation would remain the same.
B - The standard deviation would increase.
C - The standard deviation would decrease.
D - The answer cannot be determined without further information.

A

C - The standard deviation would decrease.

The standard deviation gives more weight to observations that are further from the mean. Therefore, removing the outliers would decrease the standard deviation.
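
A small sketch of this effect follows. Only the four outlier values come from the card; the remaining revenues are made up for illustration, and the sample standard deviation is assumed.

```python
import statistics

# Revenues in billions of dollars. Only the four outliers come from the
# card; the "typical" values are hypothetical.
typical = [5, 8, 12, 15, 20, 26, 31, 40]
outliers = [237, 246, 447, 453]

std_with_outliers = statistics.stdev(typical + outliers)
std_without_outliers = statistics.stdev(typical)

# Removing the observations farthest from the mean shrinks the spread.
print(std_with_outliers, std_without_outliers)
```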

5
Q

What does the correlation coefficient measure?

A

The strength and direction of the linear relationship between two variables. It ranges from -1 to 1.
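
One way to compute the correlation coefficient by hand is sketched below. The function name `pearson_r` and the data are hypothetical; the data are chosen only to show a negative linear relationship.

```python
import math

def pearson_r(xs, ys):
    """Return the correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviations from each mean.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(ss_x * ss_y)

# Hypothetical percentages: as x rises, y tends to fall.
x_values = [10, 20, 30, 40, 50]
y_values = [95, 90, 92, 85, 88]

r = pearson_r(x_values, y_values)
print(r)
```

The result is negative and lies between -1 and 1, as every correlation coefficient must.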

6
Q

What can be concluded from the fact that the correlation coefficient between the acceptance rate at the top 100 U.S. MBA programs and the percent of students in those programs who are employed upon graduation is -0.32?

A - On average, as the acceptance rate increases, the percent of students employed upon graduation increases.
B - On average, as the acceptance rate decreases, the percent of students employed upon graduation decreases.
C - On average, as the acceptance rate decreases, the percent of students employed upon graduation increases.
D - On average, as the acceptance rate increases, the percent of students employed upon graduation remains the same.

A

C - On average, as the acceptance rate decreases, the percent of students employed upon graduation increases.

A correlation coefficient of -0.32 is negative, which indicates that, on average, as the acceptance rate decreases, the percent of students employed upon graduation increases.

7
Q

An internet marketing firm compiled a data set of the number of seconds website visitors stay on a client’s homepage before abandoning the site. The firm presented the summary statistics for the data set to the client.

The client asked why the mean of the data set is so much larger than the median. Which of the following is most likely true?

A - The distribution of the data is symmetric
B - The distribution of the data is skewed to the left
C - The distribution of the data is skewed to the right
D - The distribution of the data is bimodal

A

C - The distribution of the data is skewed to the right

When the distribution of data is skewed to the right, the mean is most likely greater than the median. The extreme values in the right tail pull the mean towards them.
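
The pull of the right tail can be seen in a minimal sketch. The visit durations below are hypothetical, not the firm's data.

```python
import statistics

# Hypothetical seconds spent on a homepage: most visits are short,
# but a few very long visits form a right tail.
seconds_on_page = [3, 4, 5, 5, 6, 7, 8, 10, 120, 300]

mean_seconds = statistics.mean(seconds_on_page)
median_seconds = statistics.median(seconds_on_page)

# The two long visits pull the mean well above the median.
print(mean_seconds, median_seconds)
```

Here the mean is 46.8 seconds while the median is only 6.5 seconds.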

8
Q

True or false: if the right tail of a graph is longer, we say the distribution is skewed to the right or “right-tailed.”

A

True

9
Q

True or false: the convention is to plot the so-called “independent” variable on the vertical axis and the “dependent” variable on the horizontal axis.

A

False.

The convention is to plot the so-called “dependent” variable on the vertical axis and the “independent” variable on the horizontal axis.

10
Q

True or false: this graph represents a correlation coefficient of -1.

A

True.

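A negative correlation corresponds to a downward-sloping pattern. A coefficient of exactly -1 means the points fall on a straight, downward-sloping line.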