Exam 1 Flashcards

1
Q

Currently, most data analysis is performed by ______
A. Data analysts
B. Data scientists
C. Business users
D. All of the above

A

C. Business users

2
Q

Which of the following is NOT part of the convergence of Data Analytics?
A. Domain Knowledge
B. Mathematics/Statistics
C. Engineering
D. Computer Science

A

C. Engineering

3
Q

Analytics takes us from Data to Decision - What is the order for the middle steps?
Wisdom
Knowledge
Data
Information

A

Data
Information
Knowledge
Wisdom

4
Q

Which of the following is NOT one of the benefits of data analytics?
A. Performance
B. Longevity
C. Value
D. Training

A

D. Training

5
Q

Data analytics and data science are different words for the same thing.
A. True
B. False

A

B. False

6
Q

Place the data analytics steps in the correct order.
A. Making decisions based on the information
B. Gathering data that are sometimes not in a usable form
C. Loading the data into storage models
D. Identifying the problem

A

D. Identifying the problem
B. Gathering data that are sometimes not in a usable form
C. Loading the data into storage models
A. Making decisions based on the information

7
Q

Which of the following is an enabler of data analytics?
A. People
B. Performance
C. Infrastructure
D. Training

A

C. Infrastructure

8
Q

Which one of the following is NOT one of the enablers of data analytics?
A. Tools
B. People
C. Infrastructure
D. Technology

A

B. People

9
Q

Digital transformation is part of which industrial revolution?
A. 1
B. 2
C. 3
D. 4

A

D. 4

10
Q

The 4th industrial revolution…
A. Uses water and steam to mechanize production
B. Uses disruptive technologies and trends such as AI, IoT, robotics
C. Uses electronics and information technology to automate production
D. Uses electric energy to create mass production

A

B. Uses disruptive technologies and trends such as AI, IoT, robotics

11
Q

Is the data described below structured, semi-structured, or unstructured or a mix of each?

A university tracks all of the classes that students sign up for each semester. The university records the course number, class description, and course credit hours for each student.

A. Structured
B. Semi-structured
C. Unstructured
D. Mix of each

A

A. Structured

12
Q

What is a flat file?
A. A single file linked to other single files
B. Multiple tables with no hierarchy
C. Multiple tables with hierarchy
D. Single file with no hierarchy

A

D. Single file with no hierarchy

13
Q

Why is a primary key needed?
A. To uniquely identify a record
B. To uniquely identify a table
C. To uniquely identify an attribute
D. To uniquely identify an entity

A

A. To uniquely identify a record

14
Q

Why is a foreign key needed?
A. To uniquely identify a record
B. To link two tables
C. To uniquely identify an entity
D. It is just an extra piece of information

A

B. To link two tables
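
Note: a minimal sketch of how a primary key and a foreign key work together, using Python's built-in sqlite3 module. The table and column names (customers, orders, customer_id) are hypothetical and not taken from the course material.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER REFERENCES customers(customer_id),"
    " amount REAL)"
)

conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")

# The foreign key links each order back to exactly one customer record.
rows = conn.execute(
    "SELECT o.order_id, c.name FROM orders o"
    " JOIN customers c ON c.customer_id = o.customer_id"
).fetchall()

# An order pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (101, 99, 10.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The primary key identifies one record per table; the foreign key in orders is what ties the two tables together.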

15
Q

Natural language processing (NLP) is the ability of a computer program to understand human language.
A. True
B. False

A

A. True

16
Q

What is metadata?
A. A metro system
B. Provides information about other data
C. Graphically shows data
D. Shows stored information

A

B. Provides information about other data

17
Q

Is the data described below structured, semi-structured, or unstructured or a mix of each?

A company owns a football stadium and takes high-definition photos of all fans. The company stores these images and plans eventually to use advanced technologies to see which fans are most likely to wear the team’s colors so they can market clothing to them.

A. Structured
B. Semi-structured
C. Unstructured
D. Mix of each

A

C. Unstructured

18
Q

In online transactional processing (OLTP), data is stored one transaction at a time.
A. True
B. False

A

A. True

19
Q

Three-tier architecture includes which of the following?
A. User interface level
B. Data level
C. Application level
D. Analysis level

A

A. User interface level
B. Data level
C. Application level

20
Q

What is data concurrency?
A. Users are allowed access to the same data simultaneously
B. Provides access to all authorized users
C. No unnecessary replication of data
D. Separation of data from the programs that use the data

A

A. Users are allowed access to the same data simultaneously

21
Q

A typical Enterprise Resource Planning (ERP) system will NOT support?
A. Customer Relationship Management
B. Human Resource Management
C. Supply Chain Management
D. Unique requirement of a specific business sector

A

D. Unique requirement of a specific business sector

22
Q

What does OLAP stand for?
A. Online Analytical Processing
B. Old Angry Person
C. Online Literate Apes
D. Old Learning Algorithms Program

A

A. Online Analytical Processing

23
Q

Online Analytical Processing (OLAP) is best defined as ______.
A. Technology for the very rapid analysis and processing of large datasets
B. Activities for detecting and correcting data in a database
C. Capability for manipulating and analyzing large datasets from many sources
D. Open-source software framework that enables distributed parallel processing

A

C. Capability for manipulating and analyzing large datasets from many sources

24
Q

A web crawler…
A. Lists pages on the internet
B. Is used by search engines
C. Indexes pages to make searching easier
D. Uses key information to return results

A

C. Indexes pages to make searching easier

25
Q

Clickstream is…
A. The fingerprint that web visitors leave
B. Sequence of hyperlinks to follow web visitor action in order
C. The links of a web page
D. The first and last page viewed by visitors

A

B. Sequence of hyperlinks to follow web visitor action in order

26
Q

How do organizations gather data through sentiment mining?
A. Evaluate customer comments from social media (Facebook and Twitter)
B. Examine purchases through video camera
C. Uncover unknown patterns of databases and variables
D. Obtain data from UPC Scanner codes.

A

A. Evaluate customer comments from social media (Facebook and Twitter)

27
Q

Data warehouses are informational systems.
A. True
B. False

A

A. True

28
Q

Which of the following are true about a data warehouse (DW) structure?
A. Makes reporting and accessing data difficult
B. “Read only” and therefore modification anomalies are irrelevant
C. Relational database that has been denormalized
D. Can only hold numerical data

A

B. “Read only” and therefore modification anomalies are irrelevant
C. Relational database that has been denormalized

29
Q

What does “denormalized” mean?
A. Breaking large database tables into many smaller tables to aid performance
B. Using sophisticated techniques to discover new relationships in a data set
C. Using techniques to investigate hypothesized relationships in data set
D. Some redundant data is added back to the database to reduce the # of tables

A

D. Some redundant data is added back to the database to reduce the # of tables

30
Q

A multidimensional model is also referred to as a data cube or data mart.
A. True
B. False

A

A. True

31
Q

What is data staging?
A. Area where data analytics and visuals are produced
B. Front end user interface (UI)
C. Area where data is stored indefinitely
D. Area where data are cleaned up and prepared (transformation)

A

D. Area where data are cleaned up and prepared (transformation)

32
Q

A star schema typically has what type of relationship between a dimension and fact table?
A. Many to many
B. One to one
C. One to many
D. All of the above

A

C. One to many

33
Q

A star schema is…
A. 4-step data warehouse design process
B. Oracle construct where users, tables, and indexes are stored
C. Efficient way to organize facts and dimensions in a data mart
D. Collection of data marts within a data warehouse

A

C. Efficient way to organize facts and dimensions in a data mart

34
Q

When creating a star schema for Expenditures, what items would be a measure?
A. Amount
B. Vendor
C. Product
D. Quantity

A

A. Amount
D. Quantity

35
Q

If a database needs to contain both county and school district data for the same address, what hierarchy is needed?
A. Time-dependent hierarchy
B. Version-dependent hierarchy
C. Time-independent hierarchy
D. Interval dependent hierarchy

A

B. Version-dependent hierarchy

36
Q

It is possible to have a time-dependent language-dependent text attribute.
A. True
B. False

A

A. True

37
Q

What is the difference between a star schema and a snowflake (SF) schema?
A. Star-uses surrogate keys; SF uses business keys
B. Star-all dimensions are normalized; SF-all dimensions are denormalized
C. Star-all dimensions are denormalized; SF-some dimensions are normalized
D. Star has one fact table; SF has many fact tables

A

C. Star-all dimensions are denormalized; SF-some dimensions are normalized

38
Q

Under which condition should the snowflake schema be used?
A. Difficult data migration
B. When star schema is too slow
C. Star schema is unavailable
D. Different grains (granularity) and different source systems

A

D. Different grains (granularity) and different source systems

39
Q

What is the name for storing how a data item has changed over time?
A. Historization
B. Data redundancy
C. Multi-dimensional
D. Normalization

A

A. Historization

40
Q

What does ETL stand for?
A. Extract, test, load
B. Extend, transition, load
C. Extract, transform, load
D. Extract, trust, load

A

C. Extract, transform, load

41
Q

A data source and source system are the same thing.
A. True
B. False

A

B. False

42
Q

The job of a data wrangler is to:
A. Realign mismatched data and harmonize keys and records
B. Slice and dice the data
C. Develop complex code for data mining
D. Test our data models

A

A. Realign mismatched data and harmonize keys and records

43
Q

Which statement best describes extraction?
A. Manually parsing data
B. Slicing and dicing to get only the data we are interested in
C. Identifying data sources & source fields and acquiring or sourcing the data
D. Migrating each data point to a new system

A

C. Identifying data sources & source fields and acquiring or sourcing the data

44
Q

What is the name for programs that pull data from a source system and bring them into the data warehousing system?
A. Transformation
B. Extractors
C. Mappers
D. Harmonizers

A

B. Extractors

45
Q

Transformation includes a data harmonization step.
A. True
B. False

A

A. True

46
Q

What does data harmonization mean?
A. Data from multiple sources is made consistent
B. Unnecessary data is deleted and the system is optimized
C. Outliers are removed to make sure our trendline is correct
D. The data transformation is peer reviewed

A

A. Data from multiple sources is made consistent

47
Q

Before we harmonize data, we must create a data map.
A. True
B. False

A

A. True

48
Q

Data harmonization includes which tasks?
A. Slicing data
B. Consolidating data
C. Cleaning data
D. Reformatting data

A

B. Consolidating data
C. Cleaning data
D. Reformatting data

49
Q

Where in the ETL process would moving the currency symbol from a revenue field to a new separate field occur?
A. Combining data
B. Data cleansing
C. Data smoothing
D. Splitting data

A

D. Splitting data
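
Note: a small sketch of what splitting data can look like for the revenue example above. The function name split_currency, the field names, and the sample values are assumptions for illustration; a real ETL tool would have its own rule syntax.

```python
import re

def split_currency(raw):
    """Split a value such as '$1,250.50' into (symbol, amount)."""
    match = re.match(r"^\s*([^\d\s.,-]+)\s*([\d,]*\.?\d+)\s*$", raw)
    if match is None:
        raise ValueError(f"unrecognized revenue value: {raw!r}")
    symbol, number = match.groups()
    # Thousands separators are dropped before converting to a number.
    return symbol, float(number.replace(",", ""))

# One source field becomes two target fields.
record = {"revenue": "$1,250.50"}
symbol, amount = split_currency(record["revenue"])
split_record = {"currency": symbol, "revenue": amount}
```

After the split, the revenue field is purely numeric and can be aggregated, while the currency symbol lives in its own field.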

50
Q

Which of the following is true regarding outliers?
A. Outliers should ALWAYS be excluded from the data set
B. Outliers can result from data entry errors in the source system
C. Outliers can be valid data points outside the normal range
D. Outliers can skew the results of data analytics

A

B. Outliers can result from data entry errors in the source system
C. Outliers can be valid data points outside the normal range
D. Outliers can skew the results of data analytics

51
Q

How do you handle missing or corrupted data in a dataset?
A. Drop missing rows or columns
B. Assign a unique category to missing values
C. Replace missing values with mean/median/mode
D. All of the above

A

D. All of the above
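
Note: the three options can be sketched in plain Python as follows. The sample rows, the "Unknown" category label, and the choice of the mean (rather than median or mode) are assumptions made for the example.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
rows = [
    {"region": "East", "sales": 100.0},
    {"region": None, "sales": 200.0},
    {"region": "West", "sales": None},
]

# Option A: drop any row that has a missing value.
dropped = [r for r in rows if None not in r.values()]

# Option B: assign a unique category to missing categorical values.
categorized = [{**r, "region": r["region"] or "Unknown"} for r in rows]

# Option C: replace missing numeric values with the mean of the known ones.
known_sales = [r["sales"] for r in rows if r["sales"] is not None]
fill = mean(known_sales)
imputed = [{**r, "sales": fill if r["sales"] is None else r["sales"]} for r in rows]
```

Which option fits depends on how much data is missing and whether the gaps are random; dropping rows is simplest but loses information.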

52
Q

Fuzzy inference (logic) operates similar to humans in the decision-making process.
A. True
B. False

A

A. True

53
Q

What is data cleansing?
A. How close measurements of the same item are to each other
B. When the sampled data doesn’t represent the population
C. Removing errors and inconsistencies from data
D. Splitting one data field into two or more fields

A

C. Removing errors and inconsistencies from data

54
Q

In data cleansing, what is “signal”?
A. Relevant meaningful data
B. Irrelevant meaningless data
C. Unstructured data
D. Mathematical method to reduce noise

A

A. Relevant meaningful data

55
Q

What type of transformation rule would be applied to convert a field from decimal to a percentage?
A. String rule
B. Date and time rule
C. Algebraic rule
D. Programmatic rule

A

C. Algebraic rule

56
Q

In a data warehouse, what is dynamic data?
A. Data that must be split into multiple fields
B. Data that requires updating over time after data loading
C. Multiple data fields that must be combined into one field
D. Null values that need to be addressed before data loading

A

B. Data that requires updating over time after data loading

57
Q

Which loading method is used when only records added/modified since the previous load are added to the data warehouse?
A. Historical load
B. Delta load
C. Repeating load
D. Full load

A

B. Delta load
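
Note: a rough sketch of a delta load, assuming each source record carries a last-modified timestamp. The field names (id, modified), the delta_load helper, and the dictionary standing in for the warehouse are all hypothetical.

```python
from datetime import datetime

# Hypothetical source records with a last-modified timestamp.
source = [
    {"id": 1, "value": "a", "modified": datetime(2024, 1, 1)},
    {"id": 2, "value": "b", "modified": datetime(2024, 1, 5)},
    {"id": 3, "value": "c", "modified": datetime(2024, 1, 9)},
]

def delta_load(source_rows, warehouse, last_load):
    """Upsert only rows changed after last_load; return the new watermark."""
    changed = [r for r in source_rows if r["modified"] > last_load]
    for row in changed:
        warehouse[row["id"]] = row  # insert new or overwrite modified record
    return max((r["modified"] for r in changed), default=last_load)

warehouse = {1: source[0]}          # state after a previous load
last_load = datetime(2024, 1, 1)    # timestamp of that previous load
last_load = delta_load(source, warehouse, last_load)
```

A full load would copy every source row each time; the delta load only touches records 2 and 3 here.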

58
Q

What is a series of rule-based schedules of data extractions and loading?
A. Transformational programming
B. Roll back
C. Process chain
D. Programmatic rule

A

C. Process chain

59
Q

Slicing is a way to filter a large dataset to smaller data sets. Dicing then creates an even more granular data set.
A. True
B. False

A

A. True
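
Note: a minimal illustration of slicing followed by dicing, using plain Python lists. The dimensions (year, region, product) and the sample rows are made up for the example.

```python
# Hypothetical sales facts with three dimensions: year, region, product.
sales = [
    {"year": 2023, "region": "East", "product": "A", "qty": 5},
    {"year": 2023, "region": "West", "product": "B", "qty": 3},
    {"year": 2024, "region": "East", "product": "A", "qty": 7},
    {"year": 2024, "region": "East", "product": "B", "qty": 2},
]

# Slice: fix one dimension (year = 2024) to get a smaller data set.
slice_2024 = [r for r in sales if r["year"] == 2024]

# Dice: restrict further on additional dimensions for a more granular subset.
dice = [r for r in slice_2024 if r["region"] == "East" and r["product"] == "A"]
```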

60
Q

What is a common way for an OLAP tool to connect to a data warehouse?
A. Multidimensional expressions
B. Directly through queries
C. Star schema
D. Crosstab tabulation

A

A. Multidimensional expressions

61
Q

Multidimensional analysis involves applying slicing and dicing techniques to star schemas instead of to pivot tables.
A. True
B. False

A

A. True

62
Q

Crosstabs are useful for summarizing data by category or group.
A. True
B. False

A

A. True
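
Note: a small sketch of a crosstab built by hand, summarizing quantity by two categories. The row data and the choice of region as rows and product as columns are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (region, product, quantity sold).
rows = [("East", "A", 5), ("East", "B", 2), ("West", "A", 3), ("East", "A", 4)]

# Nested mapping: outer key = crosstab row, inner key = crosstab column.
crosstab = defaultdict(lambda: defaultdict(int))
for region, product, qty in rows:
    crosstab[region][product] += qty

# Row totals summarize each group across all columns.
row_totals = {region: sum(cols.values()) for region, cols in crosstab.items()}
```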

63
Q

Which of the following are slicing and dicing techniques?
A. Sort
B. Filter
C. Rank
D. Aggregations
E. Calculations
F. Cubing

A

A. Sort
B. Filter
C. Rank
D. Aggregations
E. Calculations

64
Q

To add emphasis to a crosstab, a creator can add…
A. Calculated fields
B. Conditional formatting
C. Pivot tables

A

B. Conditional formatting

65
Q

What are appropriate aggregations for quantity on hand?
A. Minimum
B. Sum
C. Maximum
D. Average

A

A. Minimum
C. Maximum
D. Average

66
Q

In currency conversion, what is the currency of the original transaction?
A. Target currency
B. Source currency
C. Selling currency
D. Buying currency

A

B. Source currency

67
Q

What is background filtering?
A. Filtering on a dependent dimension
B. Filtering on an independent dimension
C. Filtering a characteristic not displayed in the crosstab
D. Filtering a characteristic displayed in the crosstab

A

C. Filtering a characteristic not displayed in the crosstab

68
Q

What is true about key figures that are “cumulative” in nature?
A. Require a time marker - as of xxxx
B. Not aggregated from period to period on a crosstab
C. Restart at 0 on a regular basis
D. Balance sheet accounts are an example

A

C. Restart at 0 on a regular basis

69
Q

What are appropriate aggregations for quantity sold?
A. Minimum
B. Maximum
C. Sum
D. Average

A

A. Minimum
B. Maximum
C. Sum
D. Average

70
Q

What data characteristics allow for roll up and drill down techniques?
A. Language-related characteristics
B. Time-related characteristics
C. Geospatial characteristics
D. Hierarchies

A

D. Hierarchies

71
Q

Which is an example of an inaccurate aggregation method?
A. Grand total on average profit by month
B. Grand total on total profit
C. Average on quarterly profit
D. Average on total profit

A

A. Grand total on average profit by month
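
Note: a worked example of why totaling (or re-averaging) monthly averages is misleading. The profit figures are invented for the illustration.

```python
from statistics import mean

# Hypothetical profit per transaction, grouped by month.
profits = {
    "Jan": [100.0, 300.0],              # 2 transactions
    "Feb": [50.0, 50.0, 50.0, 50.0],    # 4 transactions
}

monthly_avg = {m: mean(v) for m, v in profits.items()}

# Misleading: adding averages produces a number with no business meaning.
total_of_averages = sum(monthly_avg.values())

# Also misleading: averaging the averages ignores how many rows each month has.
average_of_averages = mean(monthly_avg.values())

# Accurate: recompute the average from the underlying transactions.
all_rows = [p for v in profits.values() for p in v]
overall_avg = mean(all_rows)
```

Here the monthly averages are 200 and 50. Their total (250) and their average (125) both differ from the true overall average of 100.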

72
Q

An area chart is an enhancement of the line chart.
A. True
B. False

A

A. True

73
Q

Is the number of ducks on a pond continuous or discrete?
A. Continuous
B. Discrete

A

B. Discrete

74
Q

Text on a chart is usually a/an
A. Label
B. Marker
C. Attachment
D. All of the above

A

A. Label

75
Q

How many numerical variables can be included in a pie chart?
A. 1
B. Up to 10
C. As many as there are slices
D. All of the above

A

A. 1

76
Q

Is the volume of water in Lake Conroe continuous or discrete?
A. Continuous
B. Discrete

A

A. Continuous

77
Q

Gender is an example of a nominal variable.
A. True
B. False

A

A. True

78
Q

Which one of these is not part of the IBCS SUCCESS acronym?
A. Sample
B. Unify
C. Check
D. Express

A

A. Sample

79
Q

Is distance a discrete or continuous variable?
A. Continuous
B. Discrete

A

A. Continuous

80
Q

A line chart can only handle one variable.
A. True
B. False

A

B. False

81
Q

Focused analysis refers to the user’s ability to display data that meets specified criteria.
A. True
B. False

A

A. True