Instructor-Created Flashcards
What is the purpose of affinity grouping?
To evaluate relationships or associations between data elements that demonstrate some kind of affinity between objects
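One simple way to look for affinity is to count how often items occur together. The sketch below assumes each transaction is a list of item names; the `baskets` data and the `pair_counts` helper are hypothetical, made up for illustration.

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    """Count how often each unordered pair of items occurs in the same basket."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops duplicates and gives every pair a stable order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

# Hypothetical transactions (illustrative data only)
baskets = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["milk", "eggs"],
]
counts = pair_counts(baskets)
```

Pairs with high counts are candidates for an association worth investigating; a real analysis would also normalize by how common each item is on its own.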
How is estimation defined in data analysis?
The process of assigning a continuous numeric value (a score) to an object
What is a key benefit of the estimation process?
Results can be ranked by score
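As a rough illustration of estimating a score and then ranking by it, the sketch below uses a hypothetical weighted-sum rule (`estimate_score`) over made-up customer records. The weights are assumptions, not a fitted model.

```python
def estimate_score(customer):
    """Hypothetical scoring rule: a weighted sum of two attributes."""
    return 0.7 * customer["visits"] + 0.3 * customer["purchases"]

# Illustrative records only
customers = [
    {"name": "A", "visits": 10, "purchases": 2},
    {"name": "B", "visits": 4, "purchases": 9},
    {"name": "C", "visits": 7, "purchases": 7},
]

# Because every object gets a continuous score, the results can be sorted
ranked = sorted(customers, key=estimate_score, reverse=True)
ranked_names = [c["name"] for c in ranked]
```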
What does classification involve?
Organizing data into predefined classes
What is the goal of the classification process?
To build a model that can accurately classify new records
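A minimal sketch of classifying a new record into predefined classes, assuming numeric features and using a 1-nearest-neighbor rule as a stand-in for a real model. The `classify` helper, the class names, and the data are all hypothetical.

```python
def classify(record, labeled_examples):
    """Return the class of the labeled example closest to `record`."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(labeled_examples, key=lambda ex: squared_distance(ex[0], record))
    return label

# Labeled examples: (features, predefined class)
labeled_examples = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((8.0, 9.0), "high"),
    ((9.0, 8.5), "high"),
]

prediction = classify((7.5, 8.0), labeled_examples)
```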
What is clustering in data mining?
The task of dividing a large collection of entities into smaller groups based on similarity
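Unlike classification, clustering starts without predefined classes. The sketch below is a small one-dimensional k-means, one possible clustering technique among many; the `kmeans_1d` helper, the starting centers, and the data are assumptions chosen for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around the nearest center, then move each center to its group's mean."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ended up empty
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups, centers

values = [1, 2, 3, 20, 21, 22]
groups, centers = kmeans_1d(values, centers=[0.0, 10.0])
```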
What is the main process in data mining?
Assemble information, prepare it for mining, apply algorithms, and analyze results
What does data mining rely on?
Using one set of data for training and another for testing
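A minimal sketch of splitting records into a training set and a testing set, using only the standard library. The `train_test_split` name, the 25% test fraction, and the fixed seed are assumptions for illustration.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and cut it into (train, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

train_set, test_set = train_test_split(range(20))
```

Keeping the two sets disjoint means the model is evaluated on records it did not see during training.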
What is data type conversion?
Parsing strings representing values and transforming them into the proper form for the target machine
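A small sketch of data type conversion, assuming every source field arrives as a string and dates use the `YYYY-MM-DD` layout. The field names and the `convert_row` helper are hypothetical.

```python
from datetime import date, datetime

def convert_row(raw):
    """Parse string fields into typed values; raises ValueError on bad input."""
    return {
        "id": int(raw["id"]),
        "amount": float(raw["amount"]),
        "order_date": datetime.strptime(raw["order_date"], "%Y-%m-%d").date(),
    }

row = convert_row({"id": "42", "amount": "19.99", "order_date": "2024-03-01"})
```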
What is data cleansing?
Correcting known data errors and automating corrections
What is the purpose of integration in data processing?
To represent linkage between different tables and maintain metadata
What does referential integrity checking ensure?
That referential integrity constraints are not violated, i.e., every foreign key value refers to an existing record in the referenced table
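A minimal sketch of a referential integrity check, assuming tables are held as lists of dicts. The `find_orphans` helper and the `customers`/`orders` tables are hypothetical.

```python
def find_orphans(child_rows, parent_rows, fk, pk):
    """Return child rows whose foreign key has no matching parent key."""
    parent_keys = {row[pk] for row in parent_rows}
    return [row for row in child_rows if row[fk] not in parent_keys]

customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},  # no customer 3 exists
]

orphans = find_orphans(orders, customers, fk="customer_id", pk="customer_id")
```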
What are derivations in data processing?
Transformations based on business rules applied during data movement
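What is the difference between denormalization and renormalization?
Denormalization breaks normalized source data out into a flatter, redundant form (for example, when building dimension tables), while renormalization restores denormalized data to a normalized structure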
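The round trip can be sketched with two small in-memory tables. The table layout and the `denormalize`/`renormalize` helpers are hypothetical, and the sketch assumes each customer has exactly one name.

```python
customers = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 1},
    {"order_id": 12, "customer_id": 2},
]

def denormalize(orders, customers):
    """Copy the customer name onto every order row (flat, redundant form)."""
    return [
        {**order, "customer_name": customers[order["customer_id"]]["name"]}
        for order in orders
    ]

def renormalize(flat_rows):
    """Split flat rows back into an orders list and a customers lookup."""
    customers, orders = {}, []
    for row in flat_rows:
        customers[row["customer_id"]] = {"name": row["customer_name"]}
        orders.append({"order_id": row["order_id"], "customer_id": row["customer_id"]})
    return orders, customers

flat = denormalize(orders, customers)
orders_again, customers_again = renormalize(flat)
```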
What is aggregation in data processing?
Computing aggregate values used to populate summaries or cube dimensions, which can be done in the staging area
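A minimal sketch of computing one such summary, a total per group. The `total_by` helper and the `sales` rows are hypothetical.

```python
from collections import defaultdict

def total_by(rows, key, value):
    """Sum `value` for each distinct `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)

sales = [
    {"region": "east", "amount": 100.0},
    {"region": "west", "amount": 50.0},
    {"region": "east", "amount": 25.0},
]

summary = total_by(sales, "region", "amount")
```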
What is the purpose of audit information?
To provide a reference for integrity checking
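One way to produce such a reference is to compute simple counts on both sides of a data movement and compare them. The `audit_summary` helper and the sample rows below are assumptions for illustration.

```python
def audit_summary(rows):
    """Compute simple auxiliary counts for a list of dict rows."""
    columns = sorted({name for row in rows for name in row})
    return {"row_count": len(rows), "column_count": len(columns), "columns": columns}

source_rows = [{"id": 1, "amount": 5.0}, {"id": 2, "amount": 7.5}]
loaded_rows = [{"id": 1, "amount": 5.0}]  # one row is missing on purpose

# A mismatch signals that what was loaded is not what was extracted
counts_match = audit_summary(source_rows) == audit_summary(loaded_rows)
```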
What is null conversion?
Transforming the different forms of nulls found in disparate systems into a consistent representation
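A minimal sketch of null conversion, assuming the source systems encode missing values with the strings listed in `NULL_MARKERS`. That list and the `convert_null` helper are hypothetical; a real list would come from profiling the sources.

```python
# Assumed spellings of "no value" (compared after trimming and lowercasing)
NULL_MARKERS = {"", "null", "n/a", "na", "none", "-"}

def convert_null(value):
    """Map every recognized null spelling to None; leave real values untouched."""
    if value is None:
        return None
    if isinstance(value, str) and value.strip().lower() in NULL_MARKERS:
        return None
    return value

cleaned = [convert_null(v) for v in ["NULL", "N/A", "  ", "42", None, 0]]
```

Note that the number `0` is kept, since zero is a real value rather than a null.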
What are the two key questions in the extraction phase of ETL?
- What data should be extracted?
- How should that data be extracted?
What is a data mart?
A subject-oriented data repository for decision support and BI needs of a specific department
What does the ETL process stand for?
Extract, Transform, Load
What are the main steps in the ETL process?
- Get data from the source location
- Map data into a suitable model
- Validate and clean data
- Apply transformations
- Move data to the repository
- Load data into the warehouse
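The steps above can be sketched end to end with the standard library. The CSV text, the `sales` table, and the rule of storing amounts as cents are all assumptions for illustration, and the in-memory SQLite database stands in for a real warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; the second row is deliberately invalid
RAW_CSV = "id,amount\n1,10.50\n2,not-a-number\n3,4.25\n"

def extract(text):
    """Get data from the source and map it into dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def clean_and_transform(rows):
    """Drop rows that fail validation, then convert amounts to integer cents."""
    result = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # validation failed, skip this row
        result.append((int(row["id"]), round(amount * 100)))
    return result

def load(rows, conn):
    """Load the prepared rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(clean_and_transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT id, amount_cents FROM sales ORDER BY id").fetchall()
```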
What is financial value in a business context?
Increased profitability, derived from lowered costs or increased revenues
Financial value is crucial for evaluating business performance.
What does productivity value refer to?
Decreased workloads and high-quality outcomes
It emphasizes efficiency in processes like manufacturing.
What is trust value in a business context?
Greater customer, employee, or supplier satisfaction and confidence in forecasting
Trust value also includes better management reports and decision-making.