Exam 1 Flashcards

1
Q

Variables that take categories as their values, such as “yes” and “no” or blue, brown, and green

A

Categorical (qualitative)

2
Q

Variables that have values that represent a counted or measured quantity

A

Numerical (quantitative)

3
Q

Variables that arise from a counting process

A

Discrete / numerical (quantitative)

4
Q

Variables that arise from a measuring process

A

Continuous / numerical (quantitative)

5
Q

Facts and figures collected, analyzed, and summarized for presentation and interpretation

A

Data

6
Q

All the data collected in a particular study are referred to as the __________ for the study

A

Data set

7
Q

The entities on which data are collected

A

Elements

8
Q

A characteristic of interest for the elements

A

Variable

9
Q

The set of measurements obtained for a particular element is called _____

A

An observation

10
Q

A data set with n ________ contains n __________.

A

Elements, observations

11
Q

How do you calculate the total number of data values?

A

The total number of data values in a complete data set is the number of elements multiplied by the number of variables
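
As a quick check of this rule, here is a minimal Python sketch. The data set, its company names, and its variables are made up for illustration; each dictionary stands for one element and each key for one variable.

```python
# Hypothetical data set: one dictionary per element, one key per variable.
data_set = [
    {"company": "A", "industry": "Retail", "revenue_musd": 12.5},
    {"company": "B", "industry": "Energy", "revenue_musd": 40.1},
    {"company": "C", "industry": "Retail", "revenue_musd": 7.3},
    {"company": "D", "industry": "Health", "revenue_musd": 22.0},
]

n_elements = len(data_set)        # number of elements (also the number of observations)
n_variables = len(data_set[0])    # number of variables recorded for each element

# Total data values in a complete data set = elements x variables.
total_values = n_elements * n_variables
print(n_elements, n_variables, total_values)
```

The sketch assumes the data set is complete, that is, every element has a value for every variable.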

12
Q

Nominal data

A

Defined categories such as eye color, political party, marital status

13
Q

Ordinal categories

A

Categorical - Ordered categories such as good, better, best or low, medium, high

14
Q

Data classified into distinct categories in which no ranking is implied

A

A nominal scale. EX: Do you have a Facebook profile? (Y or N); cellular provider? (Verizon, AT&T, etc.)

15
Q

Classifies data into distinct categories in which ranking is implied

A

Ordinal data - EX: grades, ratings, product satisfaction

16
Q

Data that has the properties of ordinal data and the ___ between observations is expressed in terms of a fixed unit of measure. It is always ___. The scale ____ contain a ____ value that indicates that nothing exists for the variable at the _____ point.

A

Interval, numeric, does not, zero, zero

17
Q

Data that has all the properties of interval data and the ___ of two values is meaningful. The scale ____ contain a zero value that indicates that nothing exists for the variable at the zero point.

A

Ratio, must

18
Q

Data that is collected at the same or approximately the same point in time.

A

Cross-sectional data. EX: data detailing the number of building permits issued in Nov. 2019 in each of the counties of Ohio.

19
Q

Data that is collected over several time periods

A

Time series. EX: data detailing the number of building permits issued in Lucas County, Ohio in each of the last 36 months. Graphs of time series help analysts understand what happened in the past, identify any trends over time, and project future levels for the time series.

20
Q

The set of all elements of interest in a particular study

A

Population

21
Q

A subset of the population

A

Sample

22
Q

The process of using data obtained from a sample to make estimates and test hypotheses about the characteristics of a population

A

Statistical inference

23
Q

Collecting data for the entire population

A

Census

24
Q

Collecting data for a sample

A

Sample survey

25
Q

Collecting data via sampling is used when doing so is:

A

Less time-consuming than selecting every item in the population, less costly than selecting every item in the population, and less cumbersome and more practical than analyzing the entire population.

26
Q

Summarizes the value of a specific variable for a population.

A

Population Parameter

27
Q

Summarizes the value of a specific variable for sample data

A

Sample statistic

28
Q

Tallies the frequencies or percentages of items in a set of categories so that you can see differences between categories

A

A summary table

29
Q

A summary of data showing the number (frequency) of observations in each of several non-overlapping categories or classes.

A

Frequency distribution

30
Q

The ____ ______ of a class is the fraction or proportion of the total number of data items belonging to the class. What is the equation?

A

Relative frequency. Equation: relative frequency of a class = frequency of the class / n

31
Q

How do you calculate percent frequency of a class?

A

The relative frequency multiplied by 100
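
The frequency, relative frequency, and percent frequency cards can be tied together with a short Python sketch. The survey responses below are invented for illustration, and the variable names are assumptions rather than anything from the course material.

```python
from collections import Counter

# Hypothetical yes/no survey responses (a categorical variable).
responses = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes", "yes", "yes"]

n = len(responses)
frequency = Counter(responses)  # frequency distribution: count per category

# Relative frequency of a class = frequency of the class / n
relative_frequency = {cls: count / n for cls, count in frequency.items()}

# Percent frequency = relative frequency multiplied by 100
percent_frequency = {cls: rel * 100 for cls, rel in relative_frequency.items()}

for cls in frequency:
    print(cls, frequency[cls], relative_frequency[cls], percent_frequency[cls])
```

Because every observation falls into exactly one class, the relative frequencies sum to 1 and the percent frequencies sum to 100.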

32
Q

Used to study patterns that may exist between the responses of two or more categorical variables.

A

A contingency table

33
Q

It cross tabulates or tallies jointly the responses of the categorical variables

A

A contingency table.

34
Q

A tabular summary of data for two variables.

A

A crosstabulation

35
Q

Contingency table - For two variables, the tallies for one variable are located in the ____ and the tallies for the second variable are located in the ________.

A

Rows, columns
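
A contingency table for two categorical variables can be sketched in Python by tallying pairs of responses. The variables (`plan`, `satisfied`) and the responses are hypothetical; the first variable is placed in the rows and the second in the columns.

```python
from collections import Counter

# Hypothetical paired responses: (plan type, satisfied?)
pairs = [
    ("prepaid", "yes"), ("contract", "no"), ("prepaid", "yes"),
    ("contract", "yes"), ("prepaid", "no"), ("contract", "yes"),
]

joint_counts = Counter(pairs)  # joint tally of the two variables

row_labels = sorted({plan for plan, _ in pairs})            # first variable -> rows
col_labels = sorted({answer for _, answer in pairs})        # second variable -> columns

# table[row][column] holds the number of responses in that cell.
table = {
    row: {col: joint_counts[(row, col)] for col in col_labels}
    for row in row_labels
}

for row in row_labels:
    print(row, table[row])
```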

36
Q

A sequence of data, in rank order, from the smallest value to the largest value.

A

An ordered array.

37
Q

It shows range (minimum value to maximum value)

A

Ordered array.

38
Q

May help identify outliers (unusual observations)

A

An ordered array

39
Q

A summary table in which the data are arranged into numerically ordered classes

A

Frequency distribution

40
Q

You must give attention to selecting the appropriate number of _____ ______ for the table, determining a suitable width of a class grouping, and establishing the boundaries of each class grouping to avoid overlapping.

A

Class groupings.

41
Q

How do you determine the width of a class interval?

A

Divide the range (highest value-lowest value) of the data by the number of class groupings desired
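
The class-width rule can be sketched in Python as follows. The data values and the choice of five classes are assumptions made for illustration.

```python
# Hypothetical numerical data and a desired number of class groupings.
data = [12, 15, 22, 28, 33, 35, 41, 47, 52, 62]
number_of_classes = 5

data_range = max(data) - min(data)              # highest value - lowest value
class_width = data_range / number_of_classes    # width of each class interval

print(data_range, class_width)
```

In practice the computed width is often rounded to a convenient number so that the class boundaries are easy to read.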

42
Q

A ______ visualizes a categorical variable as a series of bars. The length of each bar represents either the _____ or ___ of values for each category. Each bar is separated by a space called a ______.

A

Bar chart, frequency or percentage, gap

43
Q

A ___ ___ is a circle that is broken up into slices that represent categories. The size of each ___ varies according to the percentage in each category.

A

Pie chart, slice

44
Q

A ___ ____ is the outer part of a circle broken up into pieces that represent categories. The size of each piece varies according to the percentage in each category.

A

Donut chart

45
Q

Used to portray categorical data. A vertical bar chart where categories are shown in descending order of frequency. A cumulative polygon is shown in the same graph. Used to separate the “___ ___” from the “___ ___.”

A

The Pareto chart. It separates the “vital few” from the “trivial many.”

46
Q

Represents data from a contingency table

A

Side-by-side chart

47
Q

Can be used to represent the data from a contingency table

A

Doughnut chart

48
Q

Organizes data in groups (called ___) so that values within each group (the ____) branch out to the right of each row.

A

Stem-and-leaf display; stems, leaves

49
Q

A vertical bar chart of the data in a frequency distribution is called a _______.

A

Histogram

50
Q

Formed by having the midpoint of each class represent the data in that class and then connecting the sequence of midpoints at their respective class percentages.

A

Percentage polygon

51
Q

Displays the variable of interest along the X axis, and the cumulative percentages along the Y axis. Useful when there are two or more groups to compare.

A

Cumulative percentage polygon, or ogive.

52
Q

Used for numerical data consisting of paired observations taken from two numerical variables.

A

Scatter plots

53
Q

Used to examine possible relationships between two numerical variables.

A

Scatter plots

54
Q

Used to study patterns in the values of a numeric variable over time.

A

Time-series plot

55
Q

Constructed by tallying the responses of three or more categorical variables.

A

Multidimensional contingency table.

56
Q

Provides a measure of central location

A

Mean

57
Q

The average of all the data values

A

Mean

58
Q

Perhaps the most important measure of location.

A

The mean

59
Q

The value in the middle when the data items are arranged in ascending order

A

Median

60
Q

Whenever a data set has extreme values, the ____ is the preferred measure of central location

A

Median

61
Q

For an odd number of observations (in ____ order) the median is the ____ value

A

Ascending, middle

62
Q

For an even number of observations (in ____order), the median is the ______ of the two middle values

A

Ascending, average. EX: with middle values 19 and 26, median = (19 + 26)/2 = 22.5
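
The odd and even cases of the median can be sketched in Python. The helper name `median_by_hand` and the data are hypothetical; the even-length list is chosen so that its two middle values are 19 and 26, and the result is compared against the standard library's `statistics.median`.

```python
from statistics import median

def median_by_hand(values):
    """Median of a non-empty list, following the odd/even rule on the cards."""
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]        # odd n: the middle value
    # even n: the average of the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_data = [26, 12, 19]               # hypothetical, odd number of observations
even_data = [40, 12, 26, 19]          # hypothetical, two middle values are 19 and 26

print(median_by_hand(odd_data), median_by_hand(even_data))
print(median(odd_data), median(even_data))
```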

63
Q

The ____ of a data set is the value that occurs with the greatest frequency

A

Mode

64
Q

The greatest frequency can occur at two or more different values

A

The mode

65
Q

If the data have exactly two modes, the data are ____.

A

Bimodal

66
Q

If the data have more than two modes, the data are ______.

A

multimodal
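
The mode cards can be illustrated with a small Python sketch built on `statistics.multimode` (available in Python 3.8 and later). The helper `describe_modes` and its labels are assumptions for illustration, not terminology from the course beyond "bimodal" and "multimodal".

```python
from statistics import multimode

def describe_modes(values):
    """Return the list of modes and a label based on how many there are."""
    modes = multimode(values)   # every value tied for the greatest frequency
    if len(modes) == 1:
        label = "one mode"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label

print(describe_modes([1, 2, 2, 3]))
print(describe_modes([1, 2, 2, 3, 3]))
print(describe_modes([1, 1, 2, 2, 3, 3]))
```

Note that when every value occurs exactly once, `multimode` returns all of them, so this simple labeling is only meaningful when some values repeat.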

67
Q

Excel's mean function

A

=AVERAGE(data cell range)

68
Q

Excel's median function

A

=MEDIAN(data cell range)

69
Q

Excel's mode function

A

=MODE.SNGL(data cell range)

70
Q

How does one calculate the geometric mean?

A

Finding the nth root of the product of n values

71
Q

What is Excel's geometric mean function?

A

=GEOMEAN(data cell range)
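
The "nth root of the product of n values" rule can be sketched in Python and compared with the standard library's `statistics.geometric_mean` (Python 3.8 and later). The helper name and the growth factors are hypothetical.

```python
import math
from statistics import geometric_mean

def nth_root_of_product(values):
    """Geometric mean of positive values: nth root of the product of n values."""
    n = len(values)
    return math.prod(values) ** (1 / n)

growth_factors = [1.10, 0.95, 1.20]   # hypothetical yearly growth factors

manual_result = nth_root_of_product(growth_factors)
library_result = geometric_mean(growth_factors)
print(manual_result, library_result)
```

The two results should agree up to floating-point rounding. The sketch assumes all values are positive.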

72
Q

The _______ of a data set is a value such that at least _ percent of the items take on this value or less and at least (100- __) percent of the items take on this value or more.

A

pth percentile, p, p

73
Q

Excel function used to compute percentiles

A

=PERCENTILE.EXC(data range, p/100)

74
Q

Quartiles examples

A

First quartile = 25th percentile, second quartile = 50th percentile = median, third quartile = 75th percentile
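
The quartiles can be sketched in Python with `statistics.quantiles`. The data are hypothetical. The sketch uses the `"exclusive"` method, which places the pth percentile at position p(n + 1)/100 in the ordered data; this is assumed here to be the same convention as the exclusive Excel percentile function, so treat the correspondence as an assumption to verify rather than a guarantee.

```python
from statistics import median, quantiles

data = [1, 2, 3, 4, 5, 6, 7]   # hypothetical data, already in ascending order

# n=4 splits the data into four parts and returns the three cut points.
first_quartile, second_quartile, third_quartile = quantiles(
    data, n=4, method="exclusive"
)

print(first_quartile, second_quartile, third_quartile)
print(second_quartile == median(data))   # second quartile = 50th percentile = median
```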

75
Q

Measures of ____ give information on the ____ or _____ or _____ of the data values

A

Variation; spread, variability, or dispersion

76
Q

The difference between the largest and smallest data values

A

The range

77
Q

What is the range calculation?

A

Range = largest value - smallest value

78
Q

The simplest measure of variability

A

Range

79
Q

Is very sensitive to the smallest and largest data values

A

Range

80
Q

A measure of variability that utilizes all the data

A

Variance

81
Q

Based on the difference between the value of each observation and the mean

A

Variance

82
Q

The ____ _____ of a data set is the positive square root of the variance

A

Standard deviation

83
Q

Measured in the same units as the data, making it more easily interpreted than the variance

A

Standard deviation

84
Q

Excel function for sample variance

A

=VAR.S(data cell range)

85
Q

Excel function for sample standard deviation

A

=STDEV.S(data cell range)
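
The sample variance and sample standard deviation can be sketched in Python. The data are hypothetical; the manual calculation divides the sum of squared deviations from the mean by n - 1 and is checked against `statistics.variance` and `statistics.stdev`, which also compute the sample versions.

```python
import math
from statistics import stdev, variance

data = [1, 2, 3, 4, 5]   # hypothetical sample

n = len(data)
mean = sum(data) / n

# Sample variance: squared deviations from the mean, divided by n - 1.
sample_variance = sum((x - mean) ** 2 for x in data) / (n - 1)

# Sample standard deviation: positive square root of the variance.
sample_std_dev = math.sqrt(sample_variance)

print(sample_variance, sample_std_dev)
print(variance(data), stdev(data))
```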

86
Q

Indicates how large the standard deviation is in relation to the mean

A

Coefficient of variation

87
Q

The number of standard deviations a data value is from the mean

A

Z-score
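
The coefficient of variation and the z-score can both be sketched from the mean and the sample standard deviation. The data are hypothetical, and expressing the coefficient of variation as a percentage is a common convention assumed here.

```python
from statistics import mean, stdev

data = [1, 2, 3, 4, 5]   # hypothetical sample

data_mean = mean(data)
data_std_dev = stdev(data)   # sample standard deviation

# Coefficient of variation: standard deviation relative to the mean, as a percentage.
coefficient_of_variation = data_std_dev / data_mean * 100

# z-score: number of standard deviations each value lies from the mean.
z_scores = [(x - data_mean) / data_std_dev for x in data]

print(coefficient_of_variation)
print(z_scores)
```

A value equal to the mean has a z-score of 0, values above the mean have positive z-scores, and values below it have negative z-scores.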

88
Q

Describes how data are distributed

A

Shape of a distribution

89
Q

Measures the extent to which data values are not symmetrical

A

Skewness

90
Q

Measures the peakedness of the curve of the distribution, that is, how sharply the curve rises approaching the center of the distribution

A

Kurtosis

91
Q

Summary measures describing a population, called ____, are denoted with Greek letters

A

Parameters

92
Q

The sum of the values in the population divided by the population size, N

A

Population mean

93
Q

The ___ ___ approximates the variation of data in a bell-shaped distribution

A

Empirical rule

94
Q

The _____ measures the strength of the linear relationship between two ____ variables

A

Covariance, numerical

95
Q

Excel function for the sample covariance

A

=COVARIANCE.S(X,Y)

96
Q

Excel function for the coefficient of correlation

A

=CORREL(X,Y)
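
The sample covariance and the coefficient of correlation can be sketched by hand in Python. The paired data are hypothetical and perfectly linear, so the correlation should come out at (approximately) 1. The formulas used are the usual sample versions: covariance divides the sum of cross-deviations by n - 1, and correlation divides the covariance by the product of the two sample standard deviations.

```python
import math

# Hypothetical paired observations from two numerical variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Sample covariance: sum of products of deviations, divided by n - 1.
sample_covariance = sum(
    (xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)
) / (n - 1)

# Sample standard deviations of x and y.
x_std_dev = math.sqrt(sum((xi - x_mean) ** 2 for xi in x) / (n - 1))
y_std_dev = math.sqrt(sum((yi - y_mean) ** 2 for yi in y) / (n - 1))

# Coefficient of correlation: covariance scaled by both standard deviations.
correlation = sample_covariance / (x_std_dev * y_std_dev)

print(sample_covariance, correlation)
```

Unlike the covariance, the coefficient of correlation is unit-free and always lies between -1 and +1, which makes it easier to compare across data sets.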