Information Management and BI 1(b) Flashcards Preview

CITP > Information Management and BI 1(b) > Flashcards

Flashcards in Information Management and BI 1(b) Deck (15):

What is "Data Modeling"?

- Process of prescribing data format, must be standardized for efficient processing
- Provides means for defining and analyzing data
- Dev process incorporate business rules into data model
- Data Modeling s/b foundation for reporting, lead to effective design of data to be captured, stored and reported


What is the difference btwn "Unstructured" vs "Structured" Data?

(1) Unstructured data:
- No prescribed format
- Capture, store and process in free form
(2) Structured data:
- Prescribed format
- Data organized and stored into files, records and fields w/ criteria for data entered into that format
- Ex: ERP transaction data
- Data must be Structured for computers to process data properly


Name 3 "Data Types":

(1) Numbers - used for quantities, amts and
dollar amts
(2) Alphanumeric - Address( letters, numbers, characters used in data value)
(3) Dates - describe specific date format


What is "Data Normalization"?

- Transactional flat file
- Remove certain
factors to eliminate data anomalies
- Outcomes of normalized data is to maximize efficiency of data model, increases integrity associated w/ capturing and managing data
- Eliminate data redundancy, leave any data field only once, except for duplication of key fields to establish relationships btwn files
- Integrating data from diff systems more efficient if integration data model is normalized
- Data files susceptible to data anomalies represents significant IT risk in that system


What are 3 types of "Conceptual Data Modeling" tools?

(1) Entity-relationship (ER) model
(2) Relational model
(3) Unified Modeling Language (UML)


What is the "ER Model" and what does it depict?

- A Conceptual Data Modeling tools
- Depicts relationship btwn files and records data files in a top-down fashion
- Results = ER diagrams


What are "ER diagrams"?

- Depicts file, type of interaction w/ another file and relationship btwn records in one file to the other


What is a "Relational Model"?

- A Conceptual Data Modeling tool
- Illustrates data files as tables w/ arrows from primary to foreign keys to illustrate relationship btwn data files
- Advantage of a Relational Model is that files are depicted as a table, similar to electronic spreadsheet and fields are depicted as columns, records as rows


What is Unified Modeling Language (UML)?

- Most recent Conceptual Data Modeling tool
- Associated w/ object-oriented programming (OOP), newer programming languages like C sharp and Java
- Both static and dynamic aspects can be depicted in a UML diagram
- UML is more technical than
ER and Relational Models


Name 2 "Data File" forms:

(1) Flat file
- All transactional data in a single record (row)
(2) Databases
- Transactional data in separate files using a structured process (outcome of data normalization)


What are 2 kinds of structure for data in a DW?

1. Dimensional Data ("Star")
2. Normalized Data ("Snowflake")


What is the "Snowflake" Approach?

- Normalization of Data
- Tables grouped by subject areas (customers, products, employees, purchases)
- Dozens of tables then linked by foreign/primary keys
- Advantage: Adding data from systems to a Snowflake DW is straightforward


What are the Disadvantages of the "Snowflake" Approach?

- Disadvantages:
(1) Difficult to join data from different sources to obtain more meaningful info
(2) Difficult to access info w/out thorough knowledge of data sources and DW structure


What is the "Star" Approach?

- Dimensional Data
- Transactional data designed and defined as facts (numeric data) or dimensions (ref info that adds meaning to the facts)
- Includes various relevant data that describes the numeric data (dimensions)
- Ex: Sale transaction has quantities sold and dollar amts ("facts”)
- Includes sale date, region sold, sales rep, inv category
- Advantage: Easier for users to understand than Snowflake Approach


What are Disadvantages of the "Star" Approach?

- Disadvantages:
(1) Load data from diff systems and databases complicated to get it into dimensional form
(2) Difficult to modify DW structure if entity changes business processes or captures data
- Struggle w/ “one way” dimension definitions
- Must refer to and use city, state and zip code as CSZ, and must be 1 field, not 3