Chapter 6: Data - Business Intelligence Flashcards

1
Q

What is data granularity?

A

The extent of detail within a set of data (fine and detailed or coarse and abstract)

2
Q

What are the different levels of data?

A

Individual, department, enterprise

3
Q

What are the different formats data could come in?

A

Document, presentation, spreadsheet, database

4
Q

What are the different granularities of data?

A

Detailed (fine), summary, aggregate (coarse)
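The three granularities can be illustrated with a minimal sketch. The sales records, field names, and amounts below are invented purely for illustration.

```python
from collections import defaultdict

# Detailed (fine) granularity: one row per individual sale.
# All records here are made-up illustration data.
detailed_sales = [
    {"day": "Mon", "product": "widget", "amount": 20.0},
    {"day": "Mon", "product": "gadget", "amount": 35.0},
    {"day": "Tue", "product": "widget", "amount": 20.0},
    {"day": "Tue", "product": "widget", "amount": 20.0},
]

# Summary granularity: totals per product.
summary_by_product = defaultdict(float)
for sale in detailed_sales:
    summary_by_product[sale["product"]] += sale["amount"]

# Aggregate (coarse) granularity: a single total for everything.
aggregate_total = sum(sale["amount"] for sale in detailed_sales)

print(dict(summary_by_product))
print(aggregate_total)
```

The same underlying sales appear at every level; only the amount of detail that remains visible changes.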

5
Q

What are the 4 primary traits that help determine the value of data?

A
  1. Data type
  2. Data timeliness
  3. Data quality
  4. Data governance
6
Q

What are the two primary types of data?

A

Transactional and analytical

7
Q

What is transactional data?

A

All data contained within a single business process or unit of work. Its primary purpose is to support daily operational tasks, such as determining how much inventory to carry or analyzing daily sales reports.

8
Q

What is analytical data?

A

All organizational data. Its primary purpose is to support managerial analysis tasks, such as identifying trends and making long-term strategic decisions.

9
Q

What is a real-time system?

A

Provides real-time data in response to requests. Many organizations use real-time systems to uncover key corporate transactional data.

10
Q

What is data inconsistency?

A

Occurs when the same data element has different values
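A minimal sketch of how an inconsistency might be detected, assuming customer records copied into two departments. The function name `find_inconsistencies`, the field names, and the records are all hypothetical.

```python
from collections import defaultdict

def find_inconsistencies(records, key_field, value_field):
    """Return keys whose value_field differs across records."""
    seen = defaultdict(set)
    for record in records:
        seen[record[key_field]].add(record[value_field])
    return {key: values for key, values in seen.items() if len(values) > 1}

# Hypothetical copies of customer data held by two departments.
records = [
    {"customer_id": 101, "email": "pat@example.com", "source": "sales"},
    {"customer_id": 101, "email": "pat.lee@example.com", "source": "billing"},
    {"customer_id": 102, "email": "sam@example.com", "source": "sales"},
    {"customer_id": 102, "email": "sam@example.com", "source": "billing"},
]

inconsistent = find_inconsistencies(records, "customer_id", "email")
print(inconsistent)
```

Customer 101 is flagged because the same data element (email) holds two different values; customer 102 is consistent.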

11
Q

What are data integrity issues?

A

Occur when a system produces incorrect, inconsistent, or duplicate data. Can cause managers to consider system reports invalid and make decisions based on other sources.

12
Q

What are the 5 characteristics of high-quality data?

A
  1. Accurate
  2. Complete
  3. Consistent
  4. Timely
  5. Unique
13
Q

What is a data gap analysis?

A

Occurs when a company examines its data to determine if it can meet business expectations, while identifying possible data gaps or where missing data may exist.

14
Q

What is data stewardship?

A

Management and oversight of an organization’s data assets to help provide business users with high-quality data that is easily accessible in a consistent manner.

15
Q

What is a data steward?

A

Responsible for ensuring policies and procedures are implemented across the organization and acts as a liaison between the MIS department and the business

16
Q

What is data governance?

A

Overall management of the availability, usability, integrity, and security of company data.

17
Q

What is Master Data Management?

A

Practice of gathering data and ensuring that it is uniform, accurate, consistent, and complete.

18
Q

What does a company that supports a data governance program have?

A

A defined policy that specifies who is accountable for various portions or aspects of data.

19
Q

How do data governance and stewardship differ?

A

Governance focuses on enterprise-wide policies and procedures; stewardship focuses on the strategic implementation of those policies and procedures.

20
Q

What is data validation?

A

Tests and evaluations used to determine compliance with data governance policies to ensure correctness of data.
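As a rough sketch, validation tests can be written as small checks run over each record. The two policies checked here (required fields must be filled in, IDs must be unique) are assumed for illustration, as are the function name and the sample data.

```python
def validate(records, required_fields, unique_field):
    """Return a list of human-readable problems found in records."""
    problems = []
    seen = set()
    for index, record in enumerate(records):
        # Completeness check: every required field must have a value.
        for field in required_fields:
            if not record.get(field):
                problems.append(f"row {index}: missing {field}")
        # Uniqueness check: the identifying field must not repeat.
        value = record.get(unique_field)
        if value in seen:
            problems.append(f"row {index}: duplicate {unique_field} {value}")
        seen.add(value)
    return problems

# Made-up customer records: row 1 lacks a name, row 2 repeats id 1.
customers = [
    {"id": 1, "name": "Ana", "email": "ana@example.com"},
    {"id": 2, "name": "", "email": "bo@example.com"},
    {"id": 1, "name": "Ana", "email": "ana@example.com"},
]
problems = validate(customers, ["id", "name", "email"], "id")
print(problems)
```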

21
Q

What is a database?

A

Maintains data about various types of objects (inventory), events (transactions), people (employees/customers), and places (warehouses).

22
Q

What is a database management system (DBMS)?

A

Creates, reads, updates, and deletes data in a database while controlling access and security.

23
Q

What are the 2 primary tools for retrieving data from a DBMS?

A

Query-by-example (QBE) tool - helps users graphically design the answer to a question against a database.
Structured query language (SQL) tool - asks users to write lines of code to answer questions against a database.
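To make the SQL approach concrete, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The `customers` table and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ana", "Denver"), (2, "Bo", "Boston"), (3, "Cy", "Denver")],
)

# The business question "Which customers are in Denver?" written as SQL.
rows = conn.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name", ("Denver",)
).fetchall()
print(rows)
conn.close()
```

A QBE tool would let the user build the same question by filling in a grid or form instead of typing the `SELECT` statement.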

24
Q

Do managers prefer QBE or SQL tools?

A

QBE (query-by-example) tools

25
Q

What is a data element?

A

Also known as a data field, it is the smallest or basic unit of data. Examples include a customer’s name, address, and email.

26
Q

What are data models?

A

Logical data structures that detail the relationships among data elements by using graphics or pictures

27
Q

What is metadata?

A

Details about data. For example, metadata of an image could be its size, resolution, date created, etc.

28
Q

What is a data dictionary?

A

Compiles all of the metadata about the data elements in the data model, for reference when working with a database.
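One way to picture a data dictionary is as a listing of each column's metadata. The sketch below builds such a listing from SQLite's `PRAGMA table_info`; the `products` table is hypothetical, and a real data dictionary would usually carry more (descriptions, owners, allowed values).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT NOT NULL, price REAL)"
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
data_dictionary = [
    {
        "column": name,
        "type": col_type,
        "required": bool(notnull),
        "primary_key": bool(pk),
    }
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(products)"
    )
]
conn.close()

for entry in data_dictionary:
    print(entry)
```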

29
Q

What is a relational database model?

A

Stores data in the form of logically related two-dimensional tables.

30
Q

What is another term for an “entity”?

A

A table

31
Q

What are “attributes”?

A

The data elements associated with an entity; they appear as the columns (headers) of the table.

32
Q

What are “records”?

A

Collections of related data elements. Each record in an entity (table) occupies one row.

33
Q

What is the “primary key”?

A

The field (or group of fields) in a table that uniquely identifies a given record. For example, the name “Steve Smith” might bring up 20 results but the client ID 12345678 will only bring up the correct record.
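The "Steve Smith" example can be sketched with Python's built-in `sqlite3` module. The `clients` table and the extra client IDs are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (client_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO clients VALUES (?, ?)",
    [(12345678, "Steve Smith"), (22222222, "Steve Smith"), (33333333, "Steve Smith")],
)

# Searching by name is ambiguous; searching by primary key is not.
by_name = conn.execute(
    "SELECT * FROM clients WHERE name = ?", ("Steve Smith",)
).fetchall()
by_key = conn.execute(
    "SELECT * FROM clients WHERE client_id = ?", (12345678,)
).fetchall()

# The primary key also rejects a second record with the same ID.
try:
    conn.execute("INSERT INTO clients VALUES (?, ?)", (12345678, "Someone Else"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
conn.close()

print(len(by_name), by_key, duplicate_rejected)
```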

34
Q

What is a “foreign key”?

A

Primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables.
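A minimal sketch of a foreign key in use, again with `sqlite3`. The `customers` and `orders` tables and their contents are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers (customer_id),  -- foreign key
        total REAL
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Bo');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
    """
)

# The shared customer_id column lets the two tables be joined.
rows = conn.execute(
    """
    SELECT c.name, o.order_id, o.total
    FROM orders AS o JOIN customers AS c ON o.customer_id = c.customer_id
    ORDER BY o.order_id
    """
).fetchall()
conn.close()
print(rows)
```

Here `customer_id` is the primary key of `customers` and appears as an ordinary attribute of `orders`, which is what links each order to its customer.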

35
Q

What is data integrity?

A

Measure of the quality of data.

36
Q

What are integrity constraints?

A

Rules that help ensure the quality of data. Database design needs to consider integrity constraints.

37
Q

What are the two types of integrity constraints?

A
  1. Relational
  2. Business-critical

38
Q

What are relational integrity constraints?

A

Rules that enforce basic and fundamental information-based constraints. (For example, not able to create an order for a nonexistent customer)
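The nonexistent-customer example can be sketched with `sqlite3`. Table names and rows are hypothetical; note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, which is a SQLite-specific detail rather than part of the definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers (customer_id))"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ana')")
conn.execute("INSERT INTO orders VALUES (10, 1)")  # customer 1 exists: accepted

try:
    conn.execute("INSERT INTO orders VALUES (11, 99)")  # customer 99 does not exist
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
conn.close()
print(rejected, order_count)
```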

39
Q

What are business-critical integrity constraints?

A

Enforce business rules vital to an organization’s success and often require more insight and knowledge than relational integrity constraints.

40
Q

What are business rules?

A

Define how a company performs certain aspects of its business and typically result in either a yes/no or true/false answer. (Example: merchandise returns are allowed within 10 days)
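Because a business rule yields a true/false answer, it maps naturally onto a function that returns a boolean. This sketch encodes the 10-day return example; treating day 10 as still allowed is an assumption, and the function and constant names are hypothetical.

```python
from datetime import date, timedelta

RETURN_WINDOW_DAYS = 10  # taken from the example rule on this card

def return_allowed(purchase_date: date, return_date: date) -> bool:
    """Business rule: merchandise returns are allowed within 10 days.

    Assumes the 10th day after purchase is still inside the window.
    """
    elapsed = return_date - purchase_date
    return timedelta(0) <= elapsed <= timedelta(days=RETURN_WINDOW_DAYS)

print(return_allowed(date(2024, 3, 1), date(2024, 3, 11)))
print(return_allowed(date(2024, 3, 1), date(2024, 3, 12)))
```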

41
Q

What is identity management?

A

A broad administrative area that deals with identifying individuals in a system and controlling their access to resources within that system by associating user rights and restrictions with the established identity.

42
Q

What is a data point?

A

An individual item on a graph or chart.

43
Q

What are the 4 steps in the data analysis cycle?

A

Collect data, analyze data, communicate data, and visualize data

44
Q

What is a dataset?

A

An organized collection of data.

45
Q

What can a comparative analysis do?

A

Compare 2 or more datasets to identify patterns and trends.

46
Q

What is a data map?

A

Technique for establishing a match or balance between the source data and the target data warehouse.

47
Q

What can data maps do?

A

Identify data shortfalls and recognize data issues. Can alert managers to inconsistencies or help determine the cause and effects of enterprise-wide business decisions.

48
Q

What is data-driven decision management?

A

Approach to business governance that values decisions that can be backed up with verifiable data. It relies on the quality of the data gathered and the effectiveness of its analysis and interpretation.

49
Q

Why is data-driven decision management usually undertaken?

A

As a way to gain a competitive advantage.

50
Q

What is source data?

A

Identifies the primary location where data is collected (spreadsheets, invoices, time sheets)

51
Q

What is raw data?

A

Data that has not been processed for use. Raw data that has undergone processing is sometimes referred to as “cooked data”.

52
Q

What is data aggregation?

A

The collection of data from various sources for the purpose of data processing.

53
Q

What is a data warehouse?

A

A logical collection of data - gathered from many different operational databases - that supports business analysis activities and decision-making tasks.

54
Q

What are the 3 layers to a data warehouse?

A
  1. ETL or integration layer (extraction, transformation, and loading) - extracts data from internal and external databases, transforms it using a common set of enterprise definitions, and loads it into a data warehouse.
  2. Data warehouse layer - stores data from every source system over time.
  3. Data mart layer - contains a subset of data warehouse data. Data marts have a functional focus, whereas data warehouses have an organizational focus.
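The three layers can be sketched in a few lines. Everything here is a toy under stated assumptions: the two source systems, their field names, and the `sales` and `web_sales_mart` names are invented, and a SQLite view stands in for a data mart.

```python
import sqlite3

# Extract: rows as they might arrive from two hypothetical source systems,
# each with its own field names and formats.
store_sales = [{"cust": "ANA", "amt": "25.00"}, {"cust": "bo", "amt": "15.50"}]
web_sales = [{"customer_name": "Ana", "total_cents": 4000}]

# Transform: map both sources onto one common definition (source, name, amount).
def transform():
    for row in store_sales:
        yield ("store", row["cust"].title(), float(row["amt"]))
    for row in web_sales:
        yield ("web", row["customer_name"].title(), row["total_cents"] / 100)

# Load: write the standardized rows into the warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (source TEXT, customer TEXT, amount REAL)")
warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?)", transform())

# Data mart: a subset of the warehouse, here only web sales.
warehouse.execute(
    "CREATE VIEW web_sales_mart AS SELECT * FROM sales WHERE source = 'web'"
)

totals = warehouse.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()
mart_rows = warehouse.execute("SELECT * FROM web_sales_mart").fetchall()
warehouse.close()
print(totals, mart_rows)
```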
55
Q

How is big data different than a relational database?

A

A relational database stores data in a series of two-dimensional tables. Big data is multidimensional: it contains layers of columns and rows.

56
Q

What is a data lake?

A

Storage repository that holds a vast amount of raw data in its original format until the business needs it. A data lake uses a flat architecture to store data, unlike a traditional data warehouse, which stores data in files or folders.

57
Q

What happens with a data lake when a business question arises?

A

The data lake can be queried for all the relevant data, providing a smaller dataset that can then be analyzed to help answer the question.

58
Q

What is data cleansing or scrubbing?

A

Process that weeds out and fixes or discards inconsistent, incorrect, or incomplete data.
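A minimal cleansing sketch, assuming made-up customer records. Which problems get fixed (stray whitespace, inconsistent capitalization) and which records get discarded (missing name, malformed email, repeated email) are illustrative choices, not a standard.

```python
def cleanse(records):
    """Fix what can be fixed; discard records that stay incomplete."""
    cleaned = []
    seen_emails = set()
    for record in records:
        # Fix inconsistent formatting.
        name = (record.get("name") or "").strip().title()
        email = (record.get("email") or "").strip().lower()
        if not name or "@" not in email:
            continue  # incomplete or incorrect: discard
        if email in seen_emails:
            continue  # repeat of an earlier record: discard
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned

raw = [
    {"name": "  ana lopez ", "email": "ANA@Example.com"},
    {"name": "Ana Lopez", "email": "ana@example.com "},
    {"name": "Bo Chen", "email": None},
    {"name": "", "email": "nobody@example.com"},
]
clean = cleanse(raw)
print(clean)
```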

59
Q

What do many companies trade data accuracy for?

A

Completeness

60
Q

What are data quality audits?

A

Determine accuracy and completeness of a firm’s data

61
Q

What is a data artist?

A

A business analytics specialist who uses visual tools to help people understand complex data

62
Q

What is a blockchain?

A

A type of distributed ledger, consisting of blocks of data that maintain a permanent and tamper-proof record of transactional data. Allows different parties around the world to access and verify data.

63
Q

What is distributed computing?

A

Processes and manages algorithms across many machines in a computing environment. It is a key component of big data and blockchain technologies.

64
Q

What is a ledger?

A

Records classified and summarized transactional data.

65
Q

How are blockchains a form of distributed computing?

A

A decentralized database is managed by computers belonging to a network. Each of the computers in the distributed network maintains a copy of the ledger to prevent a single point of failure and all copies are updated and validated simultaneously.

66
Q

What is proof-of-work?

A

Requirement to define an expensive computer calculation, also called mining, that needs to be performed in order to create a new group of trustless transactions (blocks) on the distributed ledger of blockchain.
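A toy illustration of the "expensive calculation": search for a number (a nonce) that makes the block's hash start with a given number of zeros. SHA-256, the leading-zeros target, and the function name `mine` are assumptions for this sketch; real blockchains use their own block formats and much harder targets.

```python
import hashlib

def mine(block_data, difficulty=3):
    """Search for a nonce whose SHA-256 hash starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

nonce, digest = mine("Ana pays Bo 5", difficulty=3)
print(nonce, digest)
```

Finding the nonce takes many attempts, but anyone can check the result with a single hash, which is what makes the work easy to verify and costly to redo.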

67
Q

What are the two primary goals of proof-of-work?

A
  1. Verify the legitimacy of a transaction and avoid so-called “double-spending”
  2. Create new digital currencies by rewarding miners for performing the previous task
68
Q

What could happen without proof-of-work?

A

Anyone could edit a transaction, recalculate all the hash values, and make a new blockchain with its own valid set of hash-linked transactions.

69
Q

When was blockchain introduced?

A

In 2009 with the release of Bitcoin.

70
Q

What are blocks?

A

Data structure containing a hash, a previous hash, and data.

71
Q

What is a hash?

A

A hash is a function that converts an input of letters and numbers into an encrypted output of a fixed length. Hashes are the links in the blockchain.
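The sketch below shows both ideas: a hash function giving fixed-length output, and hashes linking blocks so that an edit to an earlier block is detectable. SHA-256 is used as an example hash function (the card names none), and the block layout and helper names are hypothetical simplifications.

```python
import hashlib

def sha256_hex(text):
    return hashlib.sha256(text.encode()).hexdigest()

def make_block(data, previous_hash):
    """A block holds data, the previous block's hash, and its own hash."""
    return {
        "data": data,
        "previous_hash": previous_hash,
        "hash": sha256_hex(previous_hash + data),
    }

def chain_is_valid(chain):
    for position, block in enumerate(chain):
        if block["hash"] != sha256_hex(block["previous_hash"] + block["data"]):
            return False  # block contents no longer match the stored hash
        if position > 0 and block["previous_hash"] != chain[position - 1]["hash"]:
            return False  # link to the previous block is broken
    return True

# Inputs of any length give an output of the same fixed length.
short_digest = sha256_hex("a")
long_digest = sha256_hex("a much longer input " * 50)

first = make_block("first block", "0" * 64)
second = make_block("Ana pays Bo 5", first["hash"])
chain = [first, second]
valid_before = chain_is_valid(chain)

first["data"] = "edited"  # tamper with an earlier block
valid_after = chain_is_valid(chain)
print(len(short_digest), len(long_digest), valid_before, valid_after)
```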

72
Q

What is the term for the first block created in the blockchain?

A

Genesis block

73
Q

What is proof-of-stake?

A

A way to validate transactions and achieve a distributed consensus. The creator of a new block is chosen in a deterministic way, depending on its wealth, also defined as stake.

74
Q

What are 3 advantages of implementing blockchain technologies?

A
  1. Immutability
  2. Digital trust
  3. IoT integration
75
Q

What is immutable/immutability?

A

Unchangeable. Immutability is the ability of a blockchain ledger to remain a permanent, indelible, unalterable history of transactions.

76
Q

What are simple and composite attributes?

A

Simple attributes cannot be broken down into smaller components (example: last name). A composite attribute can be divided into smaller components (address can be broken down into street, city, state, etc.)

77
Q

What are single-valued vs multivalued attributes?

A

A single-valued attribute can hold only one value: a person’s age is single-valued, since a person cannot have more than one age. A multivalued attribute can hold more than one value (for example, a person may have more than one phone number).

78
Q

What is a stored vs derived attribute?

A

Derived - an attribute that can be calculated from the value of another attribute (age can be derived from DOB).
Stored - all other attributes. A stored attribute may be used to derive another attribute (DOB is stored; age is derived from it).
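The DOB-to-age example as a short sketch: only the date of birth is stored, and age is computed when needed. The function name `age_on` is hypothetical.

```python
from datetime import date

def age_on(date_of_birth: date, today: date) -> int:
    """Derive age (in whole years) from the stored date of birth."""
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)

stored_dob = date(1990, 6, 15)
print(age_on(stored_dob, date(2024, 6, 14)))
print(age_on(stored_dob, date(2024, 6, 15)))
```

Storing age directly would go stale every birthday, which is one reason to derive it instead.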

79
Q

What is a null-valued attribute?

A

Assigned to an attribute when no other value applies or a value is unknown.

80
Q

When can the Entity-Relationship Diagram (ERD) be documented?

A

Once the entities, attributes, and business rules have been identified.

81
Q

What is the main reason for creating an ERD?

A

An ERD (entity-relationship diagram) is created to identify and represent the relationships between entities.

82
Q

What are the three basic types of relationships in the ERD?

A

One-to-one
One-to-many
Many-to-many

83
Q

What is a one-to-one relationship?

A

A relationship between two entities in which an instance of one entity can be related to only one instance of a related entity.

84
Q

What are the problems with many-to-many relationships?

A

The relational data model was not designed to handle many-to-many relationships, so they need to be replaced with one-to-many relationships. They also create redundancy in the stored data, which has a negative impact on accuracy and consistency.

85
Q

What are composite entities?

A

Entities that exist to represent the relationship between 2 other entities.
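A common illustration is students and courses: a student takes many courses and a course has many students. In this sketch a hypothetical `enrollments` table acts as the composite entity, turning the many-to-many relationship into two one-to-many relationships. All table names and rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses (course_id INTEGER PRIMARY KEY, title TEXT);
    -- Composite entity: one row per (student, course) pairing.
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students (student_id),
        course_id INTEGER REFERENCES courses (course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students VALUES (1, 'Ana'), (2, 'Bo');
    INSERT INTO courses VALUES (100, 'MIS'), (200, 'Stats');
    INSERT INTO enrollments VALUES (1, 100), (1, 200), (2, 100);
    """
)

rows = conn.execute(
    """
    SELECT s.name, c.title
    FROM enrollments AS e
    JOIN students AS s ON s.student_id = e.student_id
    JOIN courses AS c ON c.course_id = e.course_id
    ORDER BY s.name, c.title
    """
).fetchall()
conn.close()
print(rows)
```

Each student and each course is stored once, so names and titles are not repeated for every pairing.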

86
Q

What is relationship cardinality?

A

Expresses the specific number of instances in an entity.