Integration of Data Sources Flashcards
(7 cards)
What does ETL stand for?
Extract, Transform, Load
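A minimal sketch of the three ETL stages, assuming a small in-memory CSV source and an in-memory SQLite target. The `sales` table, the column names, and the function names are hypothetical and chosen only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw source data with untidy names and untyped numbers.
RAW = "id,name,amount\n1, alice ,10.5\n2,BOB,3\n"


def extract(text):
    """Extract: read rows from the source (here an in-memory CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim and normalize names, cast ids and amounts to numbers."""
    return [
        (int(r["id"]), r["name"].strip().title(), float(r["amount"]))
        for r in rows
    ]


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW)), conn)
result = conn.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()
```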
What are some advantages of using flat files?
- No DBMS overhead for maintaining the data.
- Sorting, merging, deleting, replacing and other data migration functions are typically much faster outside the DBMS.
What are the disadvantages of using flat files?
- No concept of updating.
- Queries and random access lookups are not well supported by the system.
- Flat files cannot be indexed for fast lookups.
Name 4 purposes flat files should be used for.
- Staging source data for safekeeping and recovery
- Sorting data
- Filtering
- Replacing text strings
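The four uses above can be sketched on a small staged file. This is only an illustration: the file contents, the column names, and the helpers `process_flat_file` and `to_csv` are assumptions, and the staged text is left untouched so it can serve as the recovery copy.

```python
import csv
import io

# Hypothetical staged extract, kept unchanged for safekeeping and recovery.
STAGED = "order_id,region,status\n3,EU,N/A\n1,US,shipped\n2,EU,shipped\n"

FIELDS = ["order_id", "region", "status"]


def process_flat_file(text):
    """Sort, filter, and clean the rows of a staged flat file."""
    rows = list(csv.DictReader(io.StringIO(text)))
    rows.sort(key=lambda r: int(r["order_id"]))       # sorting
    rows = [r for r in rows if r["region"] == "EU"]   # filtering
    for r in rows:
        # replacing text strings
        r["status"] = r["status"].replace("N/A", "unknown")
    return rows


def to_csv(rows):
    """Serialize the processed rows back into flat-file form."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


cleaned = to_csv(process_flat_file(STAGED))
```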
What is Log Scraping?
Involves taking a snapshot of the database redo log at a certain time and finding the transactions that affect the tables the ETL process is interested in.
What is Log Sniffing?
Involves polling the redo log at a small time granularity and capturing the transactions on the fly.
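As a rough sketch of the idea, the redo log can be modeled as a list of entries. Real redo logs are binary and vendor-specific, so the `REDO_LOG` structure, its field names (`seq`, `table`, `op`), and the function `scrape_log` are all assumptions made for illustration.

```python
# Hypothetical, simplified redo log: one dict per logged change.
REDO_LOG = [
    {"seq": 101, "table": "orders", "op": "INSERT"},
    {"seq": 102, "table": "audit", "op": "INSERT"},
    {"seq": 103, "table": "customers", "op": "UPDATE"},
    {"seq": 104, "table": "orders", "op": "DELETE"},
]


def scrape_log(log, tables_of_interest, last_seq=0):
    """Take a snapshot of the log and keep only the relevant changes.

    last_seq is the highest sequence number handled by the previous run,
    so each scheduled run picks up only what happened since then.
    """
    snapshot = list(log)  # freeze the log contents at this point in time
    return [
        entry
        for entry in snapshot
        if entry["seq"] > last_seq and entry["table"] in tables_of_interest
    ]


changes = scrape_log(REDO_LOG, {"orders", "customers"})
later_changes = scrape_log(REDO_LOG, {"orders", "customers"}, last_seq=103)
```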
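A minimal sketch of the polling loop, again with an assumed log model: the entry fields, the `read_log` callback, and the function `sniff_log` are hypothetical, and entries are assumed to arrive in ascending `seq` order. The simulated log reveals two more entries on each poll to stand in for new transactions.

```python
import time


def sniff_log(read_log, tables_of_interest, polls, interval_s=0.0):
    """Poll the log repeatedly and yield relevant entries as they appear."""
    last_seq = 0
    for _ in range(polls):
        for entry in read_log():
            if entry["seq"] > last_seq:  # skip entries seen in earlier polls
                last_seq = entry["seq"]
                if entry["table"] in tables_of_interest:
                    yield entry
        time.sleep(interval_s)  # small polling interval


# Hypothetical log that "grows" by two entries every time it is read.
LOG = [
    {"seq": 1, "table": "orders", "op": "INSERT"},
    {"seq": 2, "table": "audit", "op": "INSERT"},
    {"seq": 3, "table": "orders", "op": "UPDATE"},
    {"seq": 4, "table": "customers", "op": "UPDATE"},
]
state = {"visible": 0}


def read_log():
    state["visible"] = min(state["visible"] + 2, len(LOG))
    return LOG[: state["visible"]]


captured = list(sniff_log(read_log, {"orders"}, polls=3))
```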
What are the 4 basic steps needed for conceptual schema integration?
- Pre-integration analysis
- Comparison of schemas
- Conformation of schemas
- Merging and restructuring
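The four steps above can be sketched on two toy schemas represented as nested dicts. This is a simplified illustration, not a full integration method: the schemas, the analyst-supplied `SYNONYMS` mapping, and the functions `compare`, `conform`, and `merge` are assumptions, and only naming conflicts are handled.

```python
# Two hypothetical source schemas: entity -> {attribute: type}.
SCHEMA_A = {"Customer": {"cust_id": "int", "name": "str"}}
SCHEMA_B = {"Client": {"client_id": "int", "name": "str", "email": "str"}}

# 1. Pre-integration analysis: choose the schemas and record known
#    correspondences (assumed here to be supplied by an analyst).
SYNONYMS = {"Client": "Customer", "client_id": "cust_id"}


def compare(a, b, synonyms):
    """2. Comparison: find names in b that denote concepts already in a."""
    known = set(a) | {attr for attrs in a.values() for attr in attrs}
    found = []
    for entity, attrs in b.items():
        for name in (entity, *attrs):
            target = synonyms.get(name)
            if target in known:
                found.append((name, target))
    return found


def conform(schema, synonyms):
    """3. Conforming: rename entities and attributes to the agreed names."""
    return {
        synonyms.get(entity, entity): {
            synonyms.get(attr, attr): typ for attr, typ in attrs.items()
        }
        for entity, attrs in schema.items()
    }


def merge(a, b):
    """4. Merging and restructuring: union the conformed schemas."""
    merged = {entity: dict(attrs) for entity, attrs in a.items()}
    for entity, attrs in b.items():
        merged.setdefault(entity, {}).update(attrs)
    return merged


conflicts = compare(SCHEMA_A, SCHEMA_B, SYNONYMS)
integrated = merge(SCHEMA_A, conform(SCHEMA_B, SYNONYMS))
```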