Chapter 1: Data & Statistics Flashcards

1
Q

analytics

A

The scientific process of transforming data into insights for making better decisions.

2
Q

The scientific process of transforming data into insights for making better decisions.

A

analytics

3
Q

big data

A

A set of data that cannot be managed, processed, or analyzed with commonly available software in a reasonable amount of time. Big data are characterized by great volume (a large amount of data), high velocity (fast collection and processing), or wide variety (could include nontraditional data such as video, audio, and text).

4
Q

A set of data that cannot be managed, processed, or analyzed with commonly available software in a reasonable amount of time.

A

big data

5
Q

categorical data

A

Labels or names used to identify an attribute of each element. Categorical data use either the nominal or ordinal scale of measurement and may be nonnumeric or numeric.

6
Q

Labels or names used to identify an attribute of each element.

A

categorical data

7
Q

categorical variable

A

A variable with categorical data.

8
Q

A variable with categorical data.

A

categorical variable

9
Q

census

A

A survey to collect data on the entire population.

10
Q

A survey to collect data on the entire population.

A

census

11
Q

cross-sectional data

A

Data collected at the same or approximately the same point in time.

12
Q

Data collected at the same or approximately the same point in time.

A

cross-sectional data

13
Q

data

A

The facts and figures collected, analyzed, and summarized for presentation and interpretation.

14
Q

The facts and figures collected, analyzed, and summarized for presentation and interpretation.

A

data

15
Q

data mining

A

The process of using procedures from statistics and computer science to extract useful information from extremely large databases.

16
Q

The process of using procedures from statistics and computer science to extract useful information from extremely large databases.

A

data mining

17
Q

data set

A

All the data collected in a particular study.

18
Q

All the data collected in a particular study.

A

data set

19
Q

descriptive analytics

A

Analytical techniques that describe what has happened in the past.

20
Q

Analytical techniques that describe what has happened in the past.

A

descriptive analytics

21
Q

descriptive statistics

A

Tabular, graphical, and numerical summaries of data.

22
Q

Tabular, graphical, and numerical summaries of data.

A

descriptive statistics

23
Q

elements

A

The entities on which data are collected.

24
Q

The entities on which data are collected.

A

elements

25
Q

interval scale

A

The scale of measurement for a variable if the data demonstrate the properties of ordinal data and the interval between values is expressed in terms of a fixed unit of measure. Interval data are always numeric.

26
Q

The scale of measurement for a variable if the data demonstrate the properties of ordinal data and the interval between values is expressed in terms of a fixed unit of measure. Always numeric.

A

interval scale

27
Q

nominal scale

A

The scale of measurement for a variable when the data are labels or names used to identify an attribute of an element. Nominal data may be nonnumeric or numeric.

28
Q

The scale of measurement for a variable when the data are labels or names used to identify an attribute of an element. May be nonnumeric or numeric.

A

nominal scale

29
Q

observation

A

The set of measurements obtained for a particular element.

30
Q

The set of measurements obtained for a particular element.

A

observation

31
Q

ordinal scale

A

The scale of measurement for a variable if the data exhibit the properties of nominal data and the order or rank of the data is meaningful. Ordinal data may be nonnumeric or numeric.

32
Q

The scale of measurement for a variable if the data exhibit the properties of nominal data and the order or rank of the data is meaningful. May be nonnumeric or numeric.

A

ordinal scale

33
Q

population

A

The set of all elements of interest in a particular study.

34
Q

The set of all elements of interest in a particular study.

A

population

35
Q

predictive analytics

A

Analytical techniques that use models constructed from past data to predict the future or assess the impact of one variable on another.

36
Q

Analytical techniques that use models constructed from past data to predict the future or assess the impact of one variable on another.

A

predictive analytics

37
Q

prescriptive analytics

A

Analytical techniques that yield a course of action.

38
Q

Analytical techniques that yield a course of action.

A

prescriptive analytics

39
Q

quantitative data

A

Numeric values that indicate how much or how many of something. Quantitative data are obtained using either the interval or ratio scale of measurement.

40
Q

Numeric values that indicate how much or how many of something. Obtained using either the interval or ratio scale of measurement.

A

quantitative data

41
Q

quantitative variable

A

A variable with quantitative data.

42
Q

A variable with quantitative data.

A

quantitative variable

43
Q

ratio scale

A

The scale of measurement for a variable if the data demonstrate all the properties of interval data and the ratio of two values is meaningful. Ratio data are always numeric.

44
Q

The scale of measurement for a variable if the data demonstrate all the properties of interval data and the ratio of two values is meaningful. Always numeric.

A

ratio scale

45
Q

sample

A

A subset of the population.

46
Q

A subset of the population.

A

sample

47
Q

sample survey

A

A survey to collect data on a sample.

48
Q

A survey to collect data on a sample.

A

sample survey

49
Q

statistical inference

A

The process of using data obtained from a sample to make estimates or test hypotheses about the characteristics of a population.

50
Q

The process of using data obtained from a sample to make estimates or test hypotheses about the characteristics of a population.

A

statistical inference

51
Q

statistics

A

The art and science of collecting, analyzing, presenting, and interpreting data.

52
Q

The art and science of collecting, analyzing, presenting, and interpreting data.

A

statistics

53
Q

time series data

A

Data collected over several time periods.

54
Q

Data collected over several time periods.

A

time series data

55
Q

variable

A

A characteristic of interest for the elements.

56
Q

A characteristic of interest for the elements.

A

variable

57
Q

Examples of categorical data include all of the following except:

a. extra large

b. 90210 (Beverly Hills)

c. blonde

d. 228 lbs.

A

d. 228 lbs.

58
Q

The collection of all elements of interest in a particular study is:

a. descriptive statistics

b. the sample of interest

c. the population of interest

d. statistical inference

A

c. the population of interest

59
Q

All of the following are examples of observational studies except:

a. a Gallup poll measuring the approval rating of the president

b. the number of cars running a stop sign in a residential area during rush hour

c. the behavior of Walmart shoppers after they are given a $20 gift card from the store

d. an online survey to record your satisfaction with a company’s service

A

c. the behavior of Walmart shoppers after they are given a $20 gift card from the store

60
Q

Data obtained from a nominal scale:

a. must be numeric

b. must be alphabetic

c. must rank and order the data

d. can be either numeric or nonnumeric

A

d. can be either numeric or nonnumeric

61
Q

What is the principal connection between a sample and a population?

a. Small samples can be drawn to describe small populations.

b. A population describes all sample data.

c. A population cannot be accurately described by any sample.

d. A random sample drawn from a population seeks to describe the characteristics of that population.

A

d. A random sample drawn from a population seeks to describe the characteristics of that population.

62
Q

The American Statistical Association describes eight general topic areas and specifies important ethical considerations under each topic. One area is “Professionalism.” Professionalism points out the need for competence, judgment, diligence, self-respect, and worthiness of the respect of other people. Which of the following does not adhere to upholding the ethical guidelines for statistical practice?

a. Use only statistical methodologies suitable to the data and to obtaining valid results. For example, address the multiple potentially confounding factors in observational studies and use due caution in drawing causal inferences.

b. Guard against the possibility that a predisposition (bias) by investigators or data providers might predetermine the analytic result.

c. Account for all data considered in a study and explain the sample(s) actually used.

d. Engage in discrimination based on personal characteristics.

A

d. Engage in discrimination based on personal characteristics.

63
Q

Some hotels ask their guests to rate the hotel’s services as excellent, very good, good, and poor. This is an example of the:

a. ratio scale.

b. interval scale.

c. nominal scale.

d. ordinal scale.

A

d. ordinal scale.
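
As a rough illustration of why the rating labels above are ordinal rather than nominal, the sketch below ranks them. The guest responses are made up for the example; only the four labels come from the card.

```python
# The four labels from the card, listed from lowest to highest rank.
scale = ["poor", "good", "very good", "excellent"]
rank = {label: position for position, label in enumerate(scale)}

# Hypothetical guest responses (not from the card).
responses = ["good", "excellent", "poor", "very good", "good"]

# Ordinal data can be sorted and compared by rank...
ordered = sorted(responses, key=rank.get)
best = max(responses, key=rank.get)
print(ordered)
print(best)

# ...but the gaps between ranks have no fixed unit, so arithmetic such as
# "excellent minus good" is not meaningful on this scale.
```

Sorting works because the order of the labels is meaningful; that is the property that separates the ordinal scale from the nominal scale.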

64
Q

The entities on which data are collected are:

a. observations.

b. elements.

c. samples.

d. populations.

A

b. elements.

65
Q

The summaries of data, which may be tabular, graphical, or numerical, are referred to as:

a. inferential statistics.

b. quantitative statistics.

c. descriptive statistics.

d. categorical statistics.

A

c. descriptive statistics.

66
Q

The major applications of data mining have been made by companies with a strong _____ focus.

a. research and development

b. human resource

c. electronic design

d. consumer

A

d. consumer

67
Q

Big data is often defined according to the four v’s of data: volume, variety, veracity, and ___________.

a. visibility

b. velocity

c. validity

d. vacancy

A

b. velocity

68
Q

The largest experimental statistical study ever conducted is believed to be for:

a. diphtheria.

b. polio.

c. flu vaccine.

d. cholera.

A

b. polio.

69
Q

Statistical studies in which researchers do not control variables of interest are:

a. observational studies.

b. experimental studies.

c. existing sources studies.

d. completely randomized studies.

A

a. observational studies.

70
Q

In data mining, statistical models play an important role in developing _____.

a. financial organizations

b. human resources

c. businesses

d. predictive models

A

d. predictive models

71
Q

Which of the following is not a type of data acquisition error?

a. A particularly extreme value is verified, and is still included in the data set.

b. The person asking a survey question may place undue emphasis on one of the answer choices.

c. Data being used before the source is properly vetted.

d. A subject’s answers may be transcribed incorrectly.

A

a. A particularly extreme value is verified, and is still included in the data set.

72
Q

The Department of Transportation of a city has noted that on average there are 17 accidents per day. The average number of accidents is an example of:

a. a population.

b. descriptive statistics.

c. a sample.

d. statistical inference.

A

b. descriptive statistics.

73
Q

The Department of Homeland Security has noted that on average 1120 suspicious vehicles are stopped and searched each day in the United States. This number is used to estimate the number of cars stopped in an average yearly period. The average number of cars stopped is not an example of:

a. statistical inference.

b. descriptive statistics.

c. a population.

d. a sample.

A

b. descriptive statistics.

74
Q

Anyone who wants to use the data and statistical analysis as aids to decision making must be aware of the time and cost issues. If important data are not readily available, it would be best to:

a. do nothing at all due to cost.

b. conduct a time series analysis.

c. borrow another company’s data.

d. use a cross-sectional data set.

A

d. use a cross-sectional data set.

75
Q

Data collected through a survey attached to this month’s pay stub:

a. will be useless because not everyone receives the survey.

b. is experimental because the control is the time of month it was administered.

c. will have no data acquisition error.

d. is considered an observational study because no control is imposed.

A

d. is considered an observational study because no control is imposed.

76
Q

A company wants to enhance the benefits for its yearly healthcare package offering. It observes the set of employees who smoke on their break times at 10 a.m., 12 p.m., and 2 p.m. It records the number of cigarettes smoked by each individual. This is an example of:

a. an observational study.

b. a controlled experiment.

c. a study using existing sources.

d. a randomized study.

A

a. an observational study.

77
Q

Data mining deals with methods for developing useful decision-making information from large databases. It performs all actions except the:

a. reselling of the packaged data.

b. extraction of predictive information.

c. warehousing of data.

d. collection of useful data.

A

a. reselling of the packaged data.

78
Q

The number of observations in a complete data set having 20 elements and 3 variables is:

a. 20.

b. 20*3 or 60.

c. 20 – 3 or 17.

d. 20 + 3 or 23.

A

a. 20.
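
A minimal sketch of the counting behind this answer, assuming a made-up data set of 20 employees with 3 variables each (the variable names and values are hypothetical). An observation is the full set of measurements for one element, so there is one observation per element.

```python
# Hypothetical data set: 20 elements (employees), 3 variables per element.
variables = ["department", "years_of_service", "salary"]
data_set = [
    {"department": "sales", "years_of_service": i % 5, "salary": 40_000 + 1_000 * i}
    for i in range(20)
]

num_observations = len(data_set)                     # one observation per element
num_variables = len(variables)
num_data_values = num_observations * num_variables   # individual measurements

print(num_observations, num_variables, num_data_values)  # 20 3 60
```

The 60 in option b counts individual data values, not observations.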

79
Q

In a sample of 200 students in a university, 40, or 20%, are communications majors. Based on the above information, the school’s paper reported that “an estimated 20% of all the students at the university are communications majors.” This report is an example of:

a. an experiment.

b. a sample.

c. statistical inference.

d. a population.

A

c. statistical inference.

80
Q

Examples of cross-sectional data include all of the following except:

a. the comparison of five different variables for the 60 World Trade Organization nations on January 1st.

b. the comparison of sales output of all 10 salespeople in the Western Sales Region for the 3rd quarter.

c. the comparison of mpg data gathered in a study on all 2015 cross-over SUVs.

d. the comparison of performance of all telecommunications stocks today.

A

b. the comparison of sales output of all 10 salespeople in the Western Sales Region for the 3rd quarter.

81
Q

Examples of quantitative data include all of the following except:

a. the number of pairs of shoes in your closet.

b. the temperature outside to the nearest degree.

c. your zip code.

d. your weight rounded to the nearest gram.

A

c. your zip code.

82
Q

The owner of a factory regularly requests a graphical summary of all employees’ salaries. The graphical summary of salaries is an example of:

a. an experiment.

b. a sample.

c. statistical inference.

d. descriptive statistics.

A

d. descriptive statistics.

83
Q

A time series is a sequence of data points, typically consisting of successive measurements made over a time interval. Examples of the time series include all of the following except:

a. point differential against opponents for a football team this year.

b. ocean tides.

c. the volume of shares traded today in the stock market.

d. a stock’s opening price for each month.

A

c. the volume of shares traded today in the stock market.

84
Q

Which of the following is used for data-driven decision making?

a. descriptive statistics

b. statistical inference

c. time series data

d. analytics

A

d. analytics

85
Q

Statistical Inference:

a. is the same as a description using tabular and graphical statistics.

b. refers to the process of drawing conclusions about the sample based on the characteristics of the population.

c. is the process of drawing conclusions about the population based on the evidence taken from the sample.

d. is the same as a census.

A

c. is the process of drawing conclusions about the population based on the evidence taken from the sample.

86
Q

In a random sample of 200 items, 5 items were defective. An estimate of the percentage of defective items in the population is:

a. 20.0%.

b. 2.5%.

c. 10.0%.

d. 5.0%.

A

b. 2.5%.
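
The arithmetic behind this answer can be sketched as follows. The variable names are illustrative; the counts are the ones given on the card.

```python
sample_size = 200   # items in the random sample
defective = 5       # defective items found in the sample

# The sample proportion serves as the point estimate of the population proportion.
p_hat = defective / sample_size
print(f"{p_hat:.1%}")  # 2.5%
```

Using the sample proportion to say something about the whole population is a small instance of statistical inference.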

87
Q

Which of these is a continuous and quantitative variable?

a. The number of internet sales on Cyber Monday

b. The number of keys on a piano

c. The time it takes to grill a steak

d. The pairs of shoes in your closet

A

c. The time it takes to grill a steak

88
Q

The main difference between prescriptive analysis and descriptive/predictive analysis is that prescriptive analysis:

a. yields a course of action.

b. uses past data to predict the future.

c. uses linear regression and time series analysis.

d. describes what has happened in the past.

A

a. yields a course of action.

89
Q

In a sample of 1600 registered voters, 912, or 57%, approve of the way the President is doing his job. The 57% approval rating is an example of:

a. a population.

b. a sample.

c. statistical inference.

d. descriptive statistics.

A

d. descriptive statistics.

90
Q

An experiment is conducted to study the effects of a new blood pressure medicine on subjects. A random sample of 60 people is placed into three groups of 20 according to their age (young, middle-aged, and senior). Each group of 20 people is randomly assigned to two treatment groups (new medicine and placebo). How many total treatment groups are there?

a. 6

b. 1

c. 4

d. 2

A

a. 6
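
One way to see where the 6 comes from is to enumerate every combination of age group and treatment. This is only a counting sketch; the even split within each age group is an assumption that the card does not state.

```python
from itertools import product

age_groups = ["young", "middle-aged", "senior"]   # blocks described on the card
treatments = ["new medicine", "placebo"]          # treatments described on the card

# Each (age group, treatment) pair is one treatment group.
treatment_groups = list(product(age_groups, treatments))
print(len(treatment_groups))  # 6

# Assumption (not stated on the card): each block of 20 is split evenly.
people_per_group = 60 // len(treatment_groups)
print(people_per_group)  # 10
```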

91
Q

A group of employees at a major grocery chain is tasked with analyzing point of sale data to predict the success of a marketing campaign they are considering launching. This is an example of:

a. prescriptive analysis.

b. descriptive analysis.

c. predictive analysis.

d. descriptive statistics.

A

c. predictive analysis.

92
Q

The sample size:

a. is always equal to the population size.

b. can be larger than the population size.

c. can be larger or smaller than the population size.

d. is always smaller than the population size.

A

d. is always smaller than the population size.

93
Q

The American Statistical Association describes eight general topic areas and specifies important ethical considerations under each topic. One area is “Responsibilities to Research Subjects.” It describes the requirements for protecting the interests of human and animal subjects of research not only during data collection but also in the analysis, interpretation, and publication of results of the findings. Which of the following is not an example of this?

a. Avoiding or minimizing the use of deception

b. Avoiding excessive risk to research subjects and excessive imposition on their time and privacy

c. Knowing about and adhering to appropriate animal welfare guidelines in research involving animals

d. Assuming legal privacy and confidentiality protections where they may not apply

A

d. Assuming legal privacy and confidentiality protections where they may not apply

94
Q

What is the principal difference between time series data and cross-sectional data?

a. Time series data and cross-sectional data are concerned with different sized samples.

b. Cross-sectional data looks at only a particular “cross-section” of the population.

c. Cross-sectional data are limited to an approximate window of time, while time series data are collected over several time periods.

d. Time series data seeks only to capture data within a snapshot in time.

A

c. Cross-sectional data are limited to an approximate window of time, while time series data are collected over several time periods.
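
The distinction can be illustrated with a small, entirely hypothetical set of stock prices: slicing by one day gives cross-sectional data, while slicing by one stock gives time series data.

```python
# Hypothetical records: (stock, date, closing price). All values are made up.
records = [
    ("AAA", "2024-01-02", 10.0), ("AAA", "2024-01-03", 10.5), ("AAA", "2024-01-04", 10.2),
    ("BBB", "2024-01-02", 55.0), ("BBB", "2024-01-03", 54.0), ("BBB", "2024-01-04", 56.5),
    ("CCC", "2024-01-02", 7.8),  ("CCC", "2024-01-03", 8.1),  ("CCC", "2024-01-04", 8.0),
]

# Cross-sectional: many elements observed at (approximately) the same point in time.
cross_section = {stock: price for stock, date, price in records if date == "2024-01-03"}

# Time series: one element observed over several time periods.
time_series = [(date, price) for stock, date, price in records if stock == "AAA"]

print(cross_section)
print(time_series)
```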
