07 Data Analysis and Reporting Tools Flashcards

Question

Why would a fraud examiner perform duplicate testing on data?
A. To determine the relationship among different variables in raw data
B. To identify transactions with matching values in the same field
C. To determine whether company policies are met by employee transactions
D. To identify missing items in a sequence or series

Answer 1

B. To identify transactions with matching values in the same field ## Footnote Duplicate testing is used to identify transactions with duplicate values in specified fields. This technique can quickly review the file, or several files joined together, to highlight duplicate values of key fields. In many systems, the key fields should contain only unique values (no duplicate records). For example, a fraud examiner would expect fields such as check numbers, invoice numbers, and government identification numbers to contain only unique values within a data set; searching for duplicates within these fields can help the fraud examiner find anomalies that merit further examination.
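
A minimal sketch of a duplicate test, assuming payment records held as Python dictionaries. The records, the `check_no` field name, and the `find_duplicates` helper are hypothetical and only illustrate the idea of flagging repeated values in a field that should be unique.

```python
from collections import defaultdict

# Hypothetical payment records; the field names are illustrative only.
payments = [
    {"check_no": "1001", "payee": "Acme Supply", "amount": 250.00},
    {"check_no": "1002", "payee": "Birch Ltd", "amount": 980.50},
    {"check_no": "1001", "payee": "Acme Supply", "amount": 250.00},
    {"check_no": "1003", "payee": "Cole & Co", "amount": 75.25},
]


def find_duplicates(records, key_field):
    """Group records by key_field and keep only groups with more than one record."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_field]].append(record)
    return {value: rows for value, rows in groups.items() if len(rows) > 1}


duplicates = find_duplicates(payments, "check_no")
for value, rows in duplicates.items():
    print(f"check_no {value} appears {len(rows)} times")
```

A hit here is only an anomaly that merits a closer look, not proof of fraud; a reissued check, for instance, could produce the same pattern.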

Answer 2

D. All of the above ## Footnote Data mining is the science of searching large volumes of data for patterns. It combines several different techniques essential to detecting fraud, including the streamlining of raw data into understandable patterns. Data mining can also help prevent fraud. Additionally, it is an effective way for fraud examiners to develop fraud targets for further investigation.

Answer 3

A. Correlation analysis ## Footnote Using the correlation analysis function, fraud examiners can determine the relationships among different variables in the raw data. Fraud examiners can learn a lot about data files by learning the relationship between two variables. For example, a strong correlation should be expected between the following pairs of independent and dependent variables because a direct relationship exists between them:
* Hotel costs should increase as the number of days traveled increases.
* Gallons of paint used should increase as the number of houses painted increases.
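
One common way to quantify such a relationship is the Pearson correlation coefficient. The sketch below assumes that measure (the card does not name a specific statistic) and uses made-up travel figures; `pearson`, `days_traveled`, and `hotel_cost` are hypothetical names.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical expense-report data: one entry per trip.
days_traveled = [1, 2, 3, 4, 5]
hotel_cost = [120, 250, 360, 490, 600]

r = pearson(days_traveled, hotel_cost)
print(f"correlation between days traveled and hotel cost: {r:.3f}")
```

A coefficient close to 1 matches the expected direct relationship. A traveler whose hotel costs do not track days traveled would pull the coefficient down and might merit a closer look.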

Answer 4

A. True ## Footnote The results of a data analysis test will only be as good as the data used for the analysis. Thus, before running tests on the data, the fraud examiner must make certain the data being analyzed are relevant and reliable for the objective of the engagement. The phrase “garbage in, garbage out” is applicable in the preparation phase. Depending on how the data was collected and processed, as well as the results of the data verification process, the fraud examiner might need to cleanse and convert the data to a format suitable for analysis before executing any data analysis tests. For example, certain field formats (e.g., date, time, or currency) might need to be modified to make the information consistent and ready for testing. The data must also be normalized so that all data being imported for analysis can be analyzed consistently. Common data fields from multiple systems must be identified, and data must be standardized. In normalizing the data for analysis, table layout, fields/records, data length, data format, and table relationships are all important considerations. Additionally, the following inconsistencies in the data must be addressed:
* Known errors
* Special/unreadable characters in the data
* Other unusable entries

When possible, such situations should be addressed by fixing, isolating, or eliminating them. Any issues that cannot be cleaned up will require special consideration during the testing and interpretation phase.
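
The cleansing step can be sketched as follows, assuming raw rows whose date and currency fields arrive as inconsistently formatted strings. The accepted date formats, the sample rows, and the helper names are assumptions made for illustration; rows that cannot be fixed are isolated rather than silently dropped.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Assumed source formats; a real engagement would confirm these with the data owner.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %b %Y")


def normalize_date(text):
    """Return the date as an ISO 8601 string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_amount(text):
    """Strip currency symbols and thousands separators; None if not a number."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


# Hypothetical raw rows from two source systems.
raw_rows = [
    {"date": "2023-01-15", "amount": "$1,250.00"},
    {"date": "01/16/2023", "amount": "300"},
    {"date": "17 Jan 2023", "amount": "N/A"},
    {"date": "unknown", "amount": "42.10"},
]

clean_rows, isolated_rows = [], []
for row in raw_rows:
    date = normalize_date(row["date"])
    amount = normalize_amount(row["amount"])
    if date is None or amount is None:
        isolated_rows.append(row)  # unusable entry: set aside for review
    else:
        clean_rows.append({"date": date, "amount": amount})

print(len(clean_rows), "clean rows;", len(isolated_rows), "isolated rows")
```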

Answer 5

B. Pressure ## Footnote In conducting a textual analytics examination, the fraud examiner should come up with a list of fraud keywords that are likely to point to suspicious activity. This list will depend on the industry, the suspected fraud schemes or types of fraud risk present, and the data set the fraud examiner has available. In other words, a search through journal entry details will likely call for different fraud keywords than a search of emails. The factors identified in the Fraud Triangle are helpful when coming up with a fraud keyword list. One of these factors is pressure; consequently, the fraud examiner should consider how someone in the entity might be under pressure to commit fraud. For example, many people commit fraud because something has happened in their lives that motivates them to steal. Maybe they find themselves in debt, or perhaps they must meet a certain goal to qualify for a performance-based bonus. Keywords that might indicate pressure include deadline, quota, trouble, short, problem, and concern.
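
A minimal sketch of a keyword search using the pressure keywords listed above. The sample messages and the `keyword_hits` helper are hypothetical; matching is case-insensitive and limited to whole words, which is an assumption of this sketch.

```python
import re

# Pressure keywords taken from the card above.
PRESSURE_KEYWORDS = ["deadline", "quota", "trouble", "short", "problem", "concern"]

# Whole-word, case-insensitive match on any keyword.
pattern = re.compile(r"\b(" + "|".join(PRESSURE_KEYWORDS) + r")\b", re.IGNORECASE)


def keyword_hits(text):
    """Return the distinct keywords found in text, lowercased and sorted."""
    return sorted({match.lower() for match in pattern.findall(text)})


# Hypothetical email bodies.
messages = [
    "We will miss the quota unless the deadline moves.",
    "Lunch at noon?",
    "I'm in real TROUBLE, we are short again this month.",
]

flagged = {}
for index, message in enumerate(messages):
    hits = keyword_hits(message)
    if hits:
        flagged[index] = hits

print(flagged)
```

Whole-word matching avoids false hits such as "shortly" but also misses variants such as "concerns"; a real keyword list would need to account for word forms.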

Answer 6

D. All of the above ## Footnote Although the purpose of data analysis involves running targeted tests on data to identify anomalies, the ability of such tests to help detect fraud depends greatly on what the fraud examiner does before and after actually performing the data analysis techniques. Without sufficient time and attention devoted to planning early on, the fraud examiner risks analyzing the data inefficiently, lacking focus or direction for the engagement, running into avoidable technical difficulties, and possibly overlooking key areas for exploration. As a first step in the planning process—long before determining which tests to run—the fraud examiner must know what data is available to be analyzed and how that data is structured. Understanding the structure of the existing data will not only help ensure that the fraud examiner builds workable tests to be run on the data, but might also help identify additional areas for exploration that might otherwise have been overlooked.

Answer 7

D. All of the above ## Footnote Computers can scan database information for several specific types of information, creating a red flag system. To perform this, most software packages use a combination of different functions. These functions are:
* Sorting
* Record selection
* Joining files
* Multi-file processing
* Correlation analysis
* Verifying multiples of a number
* Compliance verification
* Duplicate searches
* Expressions and equations
* Filter and display criteria
* Fuzzy logic matching
* Gap tests
* Pivot tables
* Regression analysis
* Sort and index
* Statistical analysis
* Stratification
* Date functions
* Benford's Law analysis
* Graphing
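
Two of the listed functions, gap tests and Benford's Law analysis, can be sketched briefly. The sample check numbers and amounts are made up and the helper names are hypothetical. The expected first-digit frequency uses the standard Benford formula, log10(1 + 1/d).

```python
from collections import Counter
from math import log10


def find_gaps(numbers):
    """Gap test: return the values missing from an integer sequence."""
    present = set(numbers)
    return [n for n in range(min(present), max(present) + 1) if n not in present]


def benford_expected(digit):
    """Expected share of values whose first digit is `digit` under Benford's Law."""
    return log10(1 + 1 / digit)


def first_digit_share(amounts):
    """Observed share of each leading digit 1-9 among the non-zero amounts."""
    digits = [int(str(abs(a)).lstrip("0.")[0]) for a in amounts if a]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


# Hypothetical data.
check_numbers = [1001, 1002, 1004, 1005, 1008]
amounts = [120, 135, 190, 210, 980, 15, 1100, 47]

print("missing check numbers:", find_gaps(check_numbers))

observed = first_digit_share(amounts)
for digit in range(1, 10):
    print(digit, f"observed={observed[digit]:.3f}", f"expected={benford_expected(digit):.3f}")
```

Eight amounts are far too few for a meaningful Benford comparison; the sample only shows the mechanics of comparing observed and expected frequencies.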

Answer 8

D. All of the above ## Footnote Link analysis software is used by fraud examiners to create visual representations (e.g., charts with lines showing connections) of data from multiple data sources to track the movement of money; demonstrate complex networks; and discover communications, patterns, trends, and relationships. Link analysis is very effective for identifying indirect relationships and relationships with several degrees of separation. For this reason, link analysis is particularly useful when conducting a money laundering investigation, since it can track the placement, layering, and integration of money as it moves around unexpected sources. It could also be used to detect a fictitious vendor (shell company) scheme. For instance, the investigator could map visual connections between a variety of entities that share an address and bank account number to reveal a fictitious vendor created to embezzle funds from a company.
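
The shared address and bank account example can be sketched without dedicated link analysis software. The entities below are fictional, and `shared_attributes` is a hypothetical helper that only surfaces direct links; it does not draw a chart or follow several degrees of separation.

```python
from collections import defaultdict

# Fictional vendor and employee records.
entities = [
    {"name": "Vendor: Northside Consulting", "address": "12 Elm St", "bank_account": "111-222"},
    {"name": "Employee: J. Doe", "address": "12 Elm St", "bank_account": "111-222"},
    {"name": "Vendor: Harbor Supply Co", "address": "9 Dock Rd", "bank_account": "333-444"},
]


def shared_attributes(records, fields):
    """Map each (field, value) pair to the entities sharing it, keeping only shared pairs."""
    index = defaultdict(set)
    for record in records:
        for field in fields:
            index[(field, record[field])].add(record["name"])
    return {key: sorted(names) for key, names in index.items() if len(names) > 1}


links = shared_attributes(entities, ["address", "bank_account"])
for (field, value), names in links.items():
    print(f"{field} {value!r} links: {', '.join(names)}")
```

Here a vendor and an employee share both an address and a bank account, the kind of connection that could point to a fictitious vendor and merits further examination.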

Answer 9

A. True ## Footnote Although the purpose of data analysis involves running targeted tests on data to identify anomalies, the ability of such tests to help detect fraud depends greatly on what the fraud examiner does before and after actually performing the data analysis techniques. Without sufficient time and attention devoted to planning early on, the fraud examiner risks analyzing the data inefficiently, lacking focus or direction for the engagement, running into avoidable technical difficulties, and possibly overlooking key areas for exploration. As a first step in the planning process—long before determining which tests to run—the fraud examiner must know what data is available to be analyzed and how that data is structured. Understanding the structure of the existing data will not only help ensure that the fraud examiner builds workable tests to be run on the data, but might also help identify additional areas for exploration that might otherwise have been overlooked.

Answer 10

D. The join function ## Footnote The join function gathers the specified parts of different data files. Joining files combines fields from two sorted input files into a third file. Join is used to match data in a transaction file with records in a master file, such as matching invoice data in an accounts receivable file to a master cluster. For example, you might need to compare two different files to find differing records between them.
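
A minimal sketch of joining a transaction file to a master file, assuming both have already been loaded into memory. The customer master, the invoice rows, and the `customer_id` key are hypothetical. Invoices with no matching master record are collected separately, since those are often the records of interest.

```python
# Hypothetical master file keyed by customer ID.
customer_master = {"C001": "Alder Corp", "C002": "Beech LLC"}

# Hypothetical accounts receivable transactions.
invoices = [
    {"invoice_no": "INV-1", "customer_id": "C001", "amount": 500},
    {"invoice_no": "INV-2", "customer_id": "C003", "amount": 125},
    {"invoice_no": "INV-3", "customer_id": "C002", "amount": 60},
]

matched, unmatched = [], []
for invoice in invoices:
    customer_name = customer_master.get(invoice["customer_id"])
    if customer_name is None:
        unmatched.append(invoice)  # no master record for this customer ID
    else:
        # Combine fields from both files into a third, joined record.
        matched.append({**invoice, "customer_name": customer_name})

print("joined:", [row["invoice_no"] for row in matched])
print("no master record:", [row["invoice_no"] for row in unmatched])
```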

Answer 11

D. All of the above ## Footnote The results of a data analysis test will only be as good as the data used for the analysis. Thus, before running tests on the data, the fraud examiner must make certain the data being analyzed are relevant and reliable for the objective of the engagement. The phrase “garbage in, garbage out” is applicable in the preparation phase. Depending on how the data was collected and processed, as well as the results of the data verification process, the fraud examiner might need to cleanse and convert the data to a format suitable for analysis before executing any data analysis tests. For example, certain field formats (e.g., date, time, or currency) might need to be modified to make the information consistent and ready for testing. The data must also be normalized so that all data being imported for analysis can be analyzed consistently. Common data fields from multiple systems must be identified, and data must be standardized. In normalizing the data for analysis, table layout, fields/records, data length, data format, and table relationships are all important considerations. Additionally, the following inconsistencies in the data must be addressed:
* Known errors
* Special/unreadable characters in the data
* Other unusable entries

When possible, such situations should be addressed by fixing, isolating, or eliminating them. Any issues that cannot be cleaned up will require special consideration during the testing and interpretation phase.

Answer 12

C. Cleansing and normalizing the data ## Footnote Defining the examination objectives, determining whether predication exists, and building a profile of potential frauds are all steps of the planning phase of the data analysis process, which is the first phase that should be undertaken. The second phase of the data analysis process is the preparation phase. The results of a data analysis test will only be as good as the data used for the analysis. Thus, before running tests on the data, the fraud examiner must make certain the data being analyzed are relevant and reliable for the objective of the engagement. During the preparation phase of the data analysis process, the fraud examiner must complete several important steps, including:
* Identifying the relevant data
* Obtaining the requested data
* Verifying the data
* Cleansing and normalizing the data

Answer 13

D. Textual analytics ## Footnote Textual analytics is a method of using software to extract usable information from unstructured text data. Through the application of linguistic technologies and statistical techniques—including weighted fraud indicators (e.g., fraud keywords) and scoring algorithms—textual analytics software can categorize data to reveal patterns, sentiments, and relationships indicative of fraud. For example, an analysis of email communications might help fraud examiners to gauge the pressures/incentives, opportunities, and rationalizations to commit fraud that exist in an organization.
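
The idea of weighted fraud indicators can be sketched as a simple scoring function grouped by Fraud Triangle factor. Only "deadline" and "quota" come from this deck; the other terms, all of the weights, and the sample email are hypothetical, and real textual analytics software relies on far richer linguistic techniques than word counts.

```python
import re

# Hypothetical weighted terms per Fraud Triangle factor.
WEIGHTED_TERMS = {
    "pressure": {"deadline": 1.0, "quota": 1.0},
    "opportunity": {"override": 2.0},
    "rationalization": {"deserve": 2.0},
}


def score_text(text):
    """Sum weight * occurrences of each term, per category."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        category: sum(weight * words.count(term) for term, weight in terms.items())
        for category, terms in WEIGHTED_TERMS.items()
    }


# Hypothetical email text.
email = "I deserve this after hitting quota; I can override the approval before the deadline."
scores = score_text(email)
print(scores)
```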

Answer 14

A. True ## Footnote To maximize the potential success of detecting fraud through data analysis, the analysis performed should be based on an understanding of the entity’s existing fraud risks. To do so, the fraud examiner must first build a profile of potential frauds by identifying the organization’s risk areas, the types of frauds possible in those risk areas, and the resulting exposure to those frauds. Using the profile of potential frauds as a guide, the fraud examiner must identify the target data for analysis. Specifically, for each specific fraud scenario assessed to be a high risk to the organization, the fraud examiner should determine which data fields and records would be affected by such a scheme. The fraud examiner must then identify the logistics involved with obtaining this information, including:
* What specific data (i.e., fields, records) is available
* Who generates and maintains the data
* Where the data is stored
* Timing of the data extraction (e.g., date range, cutoff dates/times)
* How the fraud examiner will receive and store the data (i.e., data format and storage/transfer mechanism)
* Control totals needed for verification
* How to validate the sources of data
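
Control totals are typically a record count and a sum of a key amount field supplied by the data owner. The sketch below assumes that interpretation; the extract, the expected figures, and `verify_control_totals` are hypothetical.

```python
from decimal import Decimal


def verify_control_totals(rows, expected_count, expected_total):
    """Compare the extract's record count and amount total with the supplied control totals."""
    actual_count = len(rows)
    actual_total = sum((Decimal(row["amount"]) for row in rows), Decimal("0"))
    return {
        "count_ok": actual_count == expected_count,
        "total_ok": actual_total == expected_total,
        "actual_count": actual_count,
        "actual_total": actual_total,
    }


# Hypothetical extract and control totals from the data owner.
extract = [{"amount": "100.10"}, {"amount": "250.00"}, {"amount": "49.90"}]
result = verify_control_totals(extract, expected_count=3, expected_total=Decimal("400.00"))
print(result)
```

`Decimal` is used instead of floating point so that currency totals compare exactly. A mismatch would suggest that the extract is incomplete or was altered in transfer.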

Answer 15

C. Multi-file processing ## Footnote Multi-file processing allows the user to relate several files by defining relationships between multiple files, without the use of the join command. A common data relationship would be to relate an outstanding invoice master file to an accounts receivable file based on the customer number. The relationship can be further extended to include an invoice detail file based on invoice number. This relationship will allow the user to see which customers have outstanding invoices sorted by date.

Answer 16

D. Sort asset values by asset type or monetary amount. ## Footnote The following are typical examples of data analysis queries that can be performed by data analysis software on accounts payable:
* Audit paid invoices for manual comparison with actual invoices.
* Summarize large invoices by amount, vendor, etc.
* Identify debits to expense accounts outside of set default accounts.
* Reconcile check registers to disbursements by vendor invoice.
* Verify vendor tax forms (e.g., U.S. Form 1099 or Value Added Tax (VAT) forms).
* Create vendor detail and summary analysis reports.
* Review recurring monthly expenses and compare to posted/paid invoices.
* Generate a report on specified vouchers for manual audit or investigation.
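
One of the accounts payable queries above, summarizing large invoices by vendor, can be sketched as follows. The invoice rows, the threshold, and `summarize_large_invoices` are hypothetical; what counts as a "large" invoice depends on the engagement.

```python
from collections import defaultdict

# Hypothetical accounts payable invoices.
invoices = [
    {"vendor": "Alder Corp", "amount": 12000},
    {"vendor": "Beech LLC", "amount": 800},
    {"vendor": "Alder Corp", "amount": 15500},
    {"vendor": "Cedar Inc", "amount": 10250},
]

LARGE_INVOICE_THRESHOLD = 10000  # assumed cutoff for illustration


def summarize_large_invoices(rows, threshold):
    """Count and total the invoices at or above threshold, per vendor."""
    summary = defaultdict(lambda: {"count": 0, "total": 0})
    for row in rows:
        if row["amount"] >= threshold:
            summary[row["vendor"]]["count"] += 1
            summary[row["vendor"]]["total"] += row["amount"]
    return dict(summary)


summary = summarize_large_invoices(invoices, LARGE_INVOICE_THRESHOLD)
for vendor, stats in summary.items():
    print(vendor, stats)
```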

(40 cards)