Final Exam Flashcards

(107 cards)

1
Q

According to the text, which of the following is NOT true?

A

An example of an SSBI tool is PowerPoint.

2
Q

When preparing data, analysts use the ETL process. ETL stands for Explore, Transfer, Load.

A

False

3
Q

According to the text, the data analysis process comprises three equally important stages. Which of the following is NOT one of those stages?

A

Review

4
Q

Understanding “why” something is happening in your analysis is called _________ analytics.

A

Diagnostic

5
Q

A visualization of a chart that compares actual vs expected monthly revenue would probably be found in the _________ area

A

auditing

6
Q

In preparing data, the process of reviewing the data for possible issues is called

A

profiling

7
Q

In the data analysis process, “C” in the MOSAIC model stands for “Cleaning”.

A

False

8
Q

The CPA Exam and the CMA Exam both include topics on data analytics

A

True

9
Q

Which is consistent with the Data Analytics Mindset?

A

all of these

10
Q

Which of the following is best defined as a measure of dispersion

A

variance
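
To make "dispersion" concrete, here is a minimal sketch comparing two data sets that share a mean but differ in spread. The numbers are made up for illustration, and population variance is used as an assumption (the card does not say whether sample or population variance is meant).

```python
import statistics

# Two hypothetical data sets with the same mean (100) but different spread.
tight = [98, 99, 100, 101, 102]
wide = [60, 80, 100, 120, 140]

def describe(values):
    """Return (mean, population variance) for a list of numbers."""
    return statistics.mean(values), statistics.pvariance(values)

tight_mean, tight_var = describe(tight)
wide_mean, wide_var = describe(wide)

# Same center, very different dispersion.
print(tight_mean, tight_var)
print(wide_mean, wide_var)
```

The means are identical, so only the variance reveals that the second series is far more spread out.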

11
Q

Most of the data you will work with will come from

A

relational databases

12
Q

Questions with single dimensions should be answered with pivot tables; questions with multiple dimensions should be answered with Excel functions.

A

False

13
Q

Database elements can be represented in the REA model. The model’s elements are:

A

resources, events, agents

14
Q

Which of the following is NOT one of the basic Excel functions used in foundational analysis?

A

DISPLAYIF

15
Q

In a relational database table, a primary key is

A

a unique value

16
Q

A __________ is a bar chart of frequency distributions where the height of the bar represents the count of items in the interval

A

histogram
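
A minimal sketch of how the counts behind a histogram can be built: each value is assigned to an interval, and the bar height is the number of values in that interval. The invoice amounts and the bin width of 20 are assumptions chosen for illustration.

```python
from collections import Counter

def histogram_counts(values, bin_width):
    """Count how many values fall in each interval [start, start + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical invoice amounts.
invoice_amounts = [12, 18, 25, 27, 29, 41, 45, 58]
bin_width = 20
counts = histogram_counts(invoice_amounts, bin_width)

# Print a text histogram: one '#' per item in the interval.
for start, count in counts.items():
    print(f"{start}-{start + bin_width - 1}: {'#' * count}")
```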

17
Q

There are four types of joins used to link tables together. Which type of join does NOT result in any null values being produced?

A

Inner
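
The difference can be sketched in plain Python. The tables, field names, and helper functions below are hypothetical, and a left join stands in for the join types that can introduce nulls: an inner join keeps only rows that match in both tables, so no unmatched row needs to be padded.

```python
# Hypothetical tables linked by customer_id.
customers = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
    {"customer_id": 3, "name": "Cy"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 1},
    {"order_id": 12, "customer_id": 3},
]

def inner_join(left, right, key):
    """Keep only row pairs whose key value appears in both tables."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

def left_join(left, right, key, right_fields):
    """Keep every left row; pad right-side fields with None when unmatched."""
    result = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        if matches:
            result.extend({**l, **r} for r in matches)
        else:
            result.append({**l, **{field: None for field in right_fields}})
    return result

inner = inner_join(customers, orders, "customer_id")
left = left_join(customers, orders, "customer_id", ["order_id"])

print(inner)  # Ben has no orders, so he is dropped
print(left)   # Ben is kept, with order_id set to None
```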

18
Q

Simultaneously filtering for multiple dimensions is called

A

data slicing

19
Q

An action request made to a database is called a(n)

A

query

20
Q

Which is the best tool when the desired result is known, but not the input value for a single variable that will achieve that result?

A

Goal Seek
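
The idea behind Goal Seek can be sketched as a one-variable search: fix the target output and adjust a single input until the model hits it. The sketch below uses bisection, which is a stand-in technique and not a description of how Excel implements Goal Seek; the `goal_seek` helper and the profit model with its prices and costs are hypothetical.

```python
def goal_seek(func, target, low, high, tol=1e-6, max_iter=200):
    """Find x in [low, high] where func(x) is within tol of target.

    Uses bisection, so it assumes func is continuous and that
    func(low) and func(high) lie on opposite sides of target.
    """
    f_low = func(low) - target
    f_high = func(high) - target
    if f_low * f_high > 0:
        raise ValueError("target is not bracketed by low and high")
    for _ in range(max_iter):
        mid = (low + high) / 2
        f_mid = func(mid) - target
        if abs(f_mid) < tol:
            return mid
        if f_low * f_mid <= 0:
            high = mid                # target lies in the lower half
        else:
            low, f_low = mid, f_mid   # target lies in the upper half
    return (low + high) / 2

def profit(units):
    """Hypothetical model: price 25, variable cost 15, fixed cost 5000."""
    return units * (25.0 - 15.0) - 5000.0

# How many units are needed to reach a profit of 10,000?
units_needed = goal_seek(profit, target=10000.0, low=0.0, high=10000.0)
print(round(units_needed, 2))
```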

21
Q

An analysis prepared to support a predetermined belief is an example of

A

confirmation bias

22
Q

an anomaly is

A

an observation that deviates from what is normal/expected

23
Q

When examining the relationship between two variables, if one variable increases as the other variable decreases the relationship is

A

a negative correlation
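
A minimal sketch of measuring this with the Pearson correlation coefficient, which is negative when one variable tends to fall as the other rises. The price and unit figures are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: as price rises, units sold fall.
price = [10, 12, 14, 16, 18]
units_sold = [200, 180, 150, 130, 100]

r = pearson_r(price, units_sold)
print(round(r, 4))  # a value close to -1 indicates a strong negative correlation
```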

24
Q

In a regression model prepared to predict revenue, which of the following is the correct interpretation of an adjusted R-squared of 0.85?

A

the independent variables in the model can explain 85% of the variation in revenue
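
For reference, adjusted R-squared is the ordinary R-squared penalized for the number of predictors. A minimal sketch of the standard formula follows; the sample size, predictor count, and R-squared value are assumptions chosen for illustration.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n_obs - 1) / dof

# Hypothetical model: R^2 of 0.88 from 25 observations and 4 predictors.
adj = adjusted_r_squared(0.88, n_obs=25, n_predictors=4)
print(round(adj, 3))
```

Because of the penalty, the adjusted value is never higher than the unadjusted R-squared when there is at least one predictor.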

25
A spreadsheet model that allows evaluating how changes to values and assumptions affect an outcome is called a
what-if analysis
26
Determining if the analysis makes sense is associated with
data analysis interpretation
27
An appropriate analysis to use to determine how many times an event has occurred would be
a frequency distribution
28
which of the following analyses can predict a future outcome
linear regression
29
if the objective is to use historical data to identify patterns, which is the best analysis to use?
Trend analysis
30
which of the following describes part of the goal of the ETL process
Identify and obtain the data needed for solving the problem
31
the purpose of transforming data is
to validate the data for completeness and integrity
32
mastering the data can also be described via the ETL process. ETL process stands for:
Extract, Transform, Load
33
the advantages of storing data in a relational database include
help in enforcing business rules and integrating business processes
34
why is supplier ID considered to be a primary key for a supplier table
it contains a unique identifier for each supplier
35
Which of the following questions is not suggested by the Institute of Business Ethics to allow a business to create value from data use and analysis, and still protect the privacy of stakeholders?
Does the data used by the company include personally identifiable information?
36
which of the following is not a common way that data will need to be cleaned after extraction and validation
Clean up trailing zeroes
37
which attribute is required to exist in each table of a relational database and serves as the "unique identifier" for each record in a table?
Primary key
38
what are attributes that exist in a relational database that are neither primary nor foreign keys?
Descriptive attributes
39
the metadata that describes each attribute in a database is
data dictionary
40
which of the following best describes an unsupervised approach to the evaluation of data?
data exploration looking for potential patterns of interest
41
These data are organized and reside in a fixed field within a record or file. Such data are generally contained in a relational database or spreadsheet and are readily searchable by search algorithms.
structured data
42
which approach to data analytics attempts to assign each unit in a population into a small set of classes where the unit belongs
classification
43
an observation about the frequency of leading digits in many real-life sets of numerical data
Benford's law
44
which approach to data analytics attempts to predict a relationship between two data items
link prediction
45
models associated with regression and classification data approaches have all these important parts except:
test data
46
auditing financial statements, and its desire to look for errors, anomalies, and possible fraud, is most consistent with which type of analytics?
Diagnostic analytics
47
in general, the simpler the model, the greater the chance of
underfitting the data
48
test data
set of data used to assess the degree and strength of a predicted relationship
49
in general, the more complex the model, the greater the chance of
overfitting the data
50
ratio data
considered the most sophisticated type of data
51
In the late 1960s, Ed Altman developed a model to predict if a company was at severe risk of going bankrupt. He called his statistic Altman's Z-score, now a widely used score in finance. Based on the name of the statistic, which statistical distribution would you guess this came from?
standardized normal distribution
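The name alludes to z-scores, which express a value as a number of standard deviations from the mean. The sketch below shows that general standardization step only; it is not Altman's bankruptcy formula, and the observations are made up for illustration.

```python
import statistics

def z_scores(values):
    """Standardize values: subtract the mean, divide by the population std dev."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mean) / sd for v in values]

# Hypothetical observations with mean 5 and population standard deviation 2.
observations = [2, 4, 4, 4, 5, 5, 7, 9]
zs = z_scores(observations)
print(zs)
```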
52
the Fahrenheit scale of temperature measurement would best be described as an example of
interval data
53
Conceptual (Qualitative)
Comparison: bar chart, pie chart, stacked bar chart, tree map, heat map; Geographic data: symbol map; Text data: word cloud
54
Data-driven (quantitative)
Outlier detection: box and whisker plot; Relationship between two variables: scatter plot; Trend over time: line chart; Geographic data: filled map
55
least sophisticated type of data
nominal
56
not a typical example of nominal data
SAT scores
57
Anscombe's quartet suggests that
visualizations should be used in tandem with statistics
58
line charts are not recommended for
qualitative data
59
letter grades would be best described as
ordinal data
60
which testing approach would be used to predict whether certain cases should be evaluated as having fraud or no fraud
classification
61
describes finding correspondences between at least two types of text or entries that may not match perfectly
fuzzy matching
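A minimal sketch of fuzzy matching using Python's standard `difflib` module, which scores string similarity between 0 and 1. The vendor names and the `best_match` helper are hypothetical, and lowercasing before comparison is an assumed normalization step.

```python
from difflib import SequenceMatcher

def best_match(name, candidates):
    """Return (candidate, score) for the candidate most similar to name."""
    def score(candidate):
        # Compare case-insensitively; ratio() is 1.0 for identical strings.
        return SequenceMatcher(None, name.lower(), candidate.lower()).ratio()
    winner = max(candidates, key=score)
    return winner, score(winner)

# Hypothetical vendor master list and an invoice entry that does not match exactly.
vendor_master = ["ACME Supply Company", "Apex Supplies", "Beta Corp"]
match, similarity = best_match("Acme Supply Co.", vendor_master)
print(match, round(similarity, 2))
```

In practice a similarity threshold would decide which pairs are close enough to review; the right cutoff depends on the data.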
62
the determinants for sample size include all of the following except:
potential risk of account
63
Benford's law suggests that the first digit of naturally occurring numerical datasets follow an expected distribution where
the leading digit of 8 is more common than 9
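The expected distribution under Benford's law is P(d) = log10(1 + 1/d) for leading digits d = 1 to 9, so each digit is more common than the next. A minimal sketch follows; the `leading_digit` helper is a hypothetical convenience for extracting first digits from real data.

```python
import math

def benford_expected(digit):
    """Expected share of numbers whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)

def leading_digit(value):
    """Return the first nonzero digit of a nonzero number."""
    digits = str(abs(value)).lstrip("0.")
    return int(digits[0])

expected = {d: benford_expected(d) for d in range(1, 10)}
for d, share in expected.items():
    print(d, round(share, 3))
```

Comparing these expected shares against the observed leading-digit frequencies of a data set is one way to flag amounts that deserve a closer look.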
64
What type of analysis would help auditors find missing checks?
sequence check
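A minimal sketch of a sequence check: given the check numbers on record, list every number missing between the lowest and highest. The check numbers and the function name are made up for illustration.

```python
def missing_in_sequence(numbers):
    """Return the integers missing between the smallest and largest number."""
    present = set(numbers)
    return [n for n in range(min(present), max(present) + 1) if n not in present]

# Hypothetical check register with gaps.
recorded_checks = [1001, 1002, 1004, 1005, 1008]
gaps = missing_in_sequence(recorded_checks)
print(gaps)
```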
65
CAAT (Computer assisted audit techniques)
Automated scripts that can be used to validate data, test controls, and enable substantive testing of transaction details or account balances and generate supporting evidence for the audit
66
which testing approach would be useful in assessing the value of inventory shrinkage given multiple environmental factors
regression
67
which items would be currently out of the scope of data analytics
direct observation of processes
68
which type of audit analytics might be used to find hidden patterns/variables linked to abnormal behavior
diagnostic analytics
69
which type of audit analytics might be used to find hidden patterns/variables linked to abnormal behavior
diagnostic analytics
70
what allows tax departments to view multiple years, periods, jurisdictions (state/federal/international) and differing scenarios of data, typically through use of a dashboard
tax data visualizations
71
the task of tax accountants and tax departments to minimize the amount of taxes paid in the future
tax planning
72
an example of a tax risk KPI would be
levels of late filing or error penalties
73
an example of a tax cost KPI would be
ETR (Effective tax rate)
74
an example of a tax efficiency and effectiveness KPI would be
amount of time spent on compliance vs strategic activities
75
tax departments interested in maintaining their own data are likely to have their own
tax data mart
76
in which stage of the IMPACT model would the use of tax cockpits fit?
track outcomes
77
predictive analysis of potential tax liability and the formulation of a plan to reduce the amount of taxes paid is
tax planning
78
the evaluation of the impact of different tax scenarios/alternatives on various outcome measures including the amount of taxable income or tax paid
what-if scenario analysis
79
an example of a tax sustainability KPI would be
number of audits closed and significance of assessment over time
80
dependent variable is
Y
81
To remove null values
In Power Query, right-click and select Remove Empty
82
binary values
either 0 or 1
83
IMPACT
I - Identify the questions; M - Master the data; P - Perform the test; A - Address and refine results; C - Communicate insights; T - Track outcomes
84
A data approach that attempts to discover associations between individuals based on transactions involving them.
co-occurrence grouping
85
A data approach that attempts to characterize the “typical” behavior of an individual, group, or population by generating summary statistics about the data (including mean, standard deviations, etc.).
profiling
86
A data approach that attempts to estimate or predict, for each unit, the numerical value of some variable using some type of statistical model.
regression
87
Data that do not adhere to a predefined data model in a tabular format.
unstructured data
88
An information system for managing all interactions between the company and its current and potential customers.
Customer Relationship Management (CRM) system
89
Centralized repository of descriptions for all of the data attributes of the dataset.
data dictionary
90
A means of storing data in one place, such as in an Excel spreadsheet, as opposed to storing the data in multiple tables, such as in a relational database.
flat file
91
An information system that helps manage all the company’s interactions with suppliers.
Supply chain management (SCM) system
92
A data approach that attempts to divide individuals (like customers) into groups (or clusters) in a useful or meaningful way.
clustering
93
Procedures that summarize existing data to determine what has happened in the past. Some examples include summary statistics (e.g., Count, Min, Max, Average, Median), distributions, and proportions.
descriptive analytics
94
A numerical value (0 or 1) to represent categorical data in statistical analysis; values assigned a 1 indicate the presence of something and 0 represents the absence.
dummy variable
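A minimal sketch of creating a dummy variable from a categorical column: 1 marks the presence of the category and 0 its absence. The payment-method data and the function name are hypothetical.

```python
def dummy_variable(values, category):
    """Return 1 where the value equals the category, otherwise 0."""
    return [1 if v == category else 0 for v in values]

# Hypothetical categorical data.
payment_method = ["cash", "credit", "credit", "cash", "check"]
is_credit = dummy_variable(payment_method, "credit")
print(is_credit)
```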
95
One way to categorize quantitative data, as opposed to discrete data. Continuous data can take on any value within a range. An example of continuous data is height.
continuous data
96
One way to categorize quantitative data, as opposed to continuous data. Discrete data are represented by whole numbers. An example of discrete data is points in a basketball game.
discrete data
97
The second level of data sophistication on the ascending scale of nominal, ordinal, interval, and ratio; a type of qualitative data. Ordinal data can be counted and categorized like nominal data, and the categories can also be ranked. Examples of ordinal data include gold, silver, and bronze medals.
ordinal data
98
The least sophisticated type of data on the scale of nominal, ordinal, interval, and ratio; a type of qualitative data. The only thing you can do with nominal data is count, group, and take a proportion. Examples of nominal data are hair color, gender, and ethnic groups.
nominal data
99
interval data
The third level of data sophistication on the ascending scale of nominal, ordinal, interval, and ratio; a type of quantitative data. Interval data can be counted and grouped like qualitative data, and the differences between each data point are meaningful. However, interval data do not have a meaningful 0. In interval data, 0 does not mean “the absence of” but is simply another number. An example of interval data is the Fahrenheit scale of temperature measurement.
100
Procedures used to generate a model that can be used to determine what is likely to happen in the future. Examples include regression analysis, forecasting, classification, and other predictive modeling.
predictive analytics
101
Procedures that summarize existing data to determine what has happened in the past. Some examples include summary statistics (e.g., Count, Min, Max, Average, Median), distributions, and proportions.
descriptive analytics
102
Procedures that work to identify the best possible options given constraints or changing conditions. These typically include developing more advanced machine learning and artificial intelligence models to recommend a course of action, or optimizing, based on constraints and/or changing conditions.
prescriptive analytics
103
Analysis technique of business processes used to diagnose problems and suggest improvements where greater efficiency may be applied.
process mining
104
tax legislation offering major change to existing tax code
Tax Cuts and Jobs Act (2018 tax reform)
105
A subset of the data warehouse focused on a specific function or department to assist and support its needed data requirements.
data mart
106
A repository of data accumulated from internal and external data sources, including financial data, to help management decision making.
data warehouse
107
A subset of a company-owned data warehouse focused on the specific needs of the tax department.
tax data mart