IICS CONCEPTS Flashcards
What is the difference between Connected and Unconnected Lookup ?
What are some common uses for them ?
Connected Lookup– Is a part of the mapping data flow. With connected lookups you can have multiple return values. That is you can pass multiple values from the same row in the lookup table out of the lookup transformation.
Common uses for them are
* Finding a name based on a number
* Finding a value based on a range of dates
* Finding a value based on multiple conditions
Unconnected Lookup: Transformation that exists separate from the data flow in the mapping.
Common uses for them are:
- Testing the results of a lookup in a expression
- Filtering records based on the lookup results
- Marking records for update based on the result of a lookup ( for example updating slowly changing dimensions table)
- Calling the same lookup multiple times in one mapping.
What is a Normalizer ?
It is an active transformation that transforms one incoming row to multiple output rows
When the Normalizer transformation receives a row that contains multiple-occurring data, it returns a row for each instance of the multiple-occurring data.
For example, a relational source includes four fields with quarterly sales data. You can configure a Normalizer transformation to generate a separate output row for each quarter.
What are the normalizer properties ?
Desribe them as well ?
- Normalized Fields tab: It define the multiple-occurring fields and specify additional fields that you want to use in the mapping.
- Field Mapping tab: It connect the incoming fields to the normalized fields.
What is GeneratedColumnID (GCID) and GeneratedKey (GK) ?
GeneratedColumnID (GCID): Generates an incremental value starting with 1 for each occurrence of a multiple-occurring data within the row. Resets back to 1 for next row.
GeneratedKey (GK): Generate a key for each incoming row. The key starts with one and is incremented by one for each processed row. It is an optional field.
What is SCD ?
In data management and data warehousing, a slowly changing dimension (SCD) is a dimension that consists of relatively static data that can change slowly but unexpectedly, rather than on a regular schedule. Some examples of specific slowly changing dimensions are entities in the form of names of geographic locations, customers or products.
What are the 6 types of SCD and how are they used ?
Type 0 – Fixed Dimension
-No changes allowed, dimension never changes
Type 1 – No History
-Update record directly, there is no record of historical values, only current state
Type 2 – Row Versioning
-Track changes as version records with current flag & active dates and other metadata
** Type 3 – Previous Value column**
-Track change to a specific attribute, add a column to show the previous value, which is updated as further changes occur
Type 4 – History Table
-Show current value in dimension table but track all changes in separate table
Type 6 – Hybrid SCD
-Utilise techniques from SCD Types 1, 2 and 3 to track change.
SCD 0,1,2 are widely used. There is no SCD 5
WHAT IS
WHAT IS SCD2 ?
What is a Dimension Table ?
SCD type 2 will store the entire history in the dimension table.
In SCD type 2 effective date, the dimension table will have Start_Date (Begin_Date) and End_Date as the fields.
If the End_Date is Null, then it indicates the current row.
SCD type 2 retains the full history of values.
When the value of a chosen attribute changes, the current record is closed.
A new record is created with the changed data values and this new record becomes the current record.
Dimension table having the detailed information about the product.
Dimension tables are typically small, ranging from a few to several thousand rows.
It provides the context /descriptive information for a fact table measurements and it’s structure is smaller than Fact table.
Surrogate Key is used to prevent the primary key (pk) violation (store historical data).
What is a Fact table ?
Fact table having the aggregate information about the product.
Fact tables can grow very large, with millions or even billions of rows.
Fact table is a table that contains summarized numerical and historical data (facts) and a multipart index composed of foreign keys from the primary keys of related dimension tables.
It size is larger than Dimension table.
What are the transformations that is active and which transformation is passive ?
Active transformations: Filter, Aggregator, joiner, rank, router and source qualifier
Passive transformations: Expression, Sorter, Sequence Generator, Lookup
How can we make joiner as a passive transformation ?
By selecting the Full join
How can we make sorter transformation as active transformation ?
By using Distinct option
How many methods we have to achieve SCD2 and which method is mostly used ?
There are 3 methods
Method 1 ——- FLAG
Method 2 ——– Version
Method 3———- Effective date
what are the cloud services ?
Iaas?
Paas?
Saas?
- (infrastructure as a Service) third type of computing across the cloud, Iaas provides a virtualized platform.
- (Platform as a Service) Provides computers platforms for users of cloud computing
- (Software as a service) applications are hosted on cloud servers and users access them over the internet
What is data intergration used for in the cloud ? What are the different tasks that are used for batch jobs ? Define them as well
What are the rare tasks that get used as well ?
Developing Batch jobs
* Mapping task - This is similar to the PowerCenter mapping designer. This is used to define the data flow logic to process the data.
* Replication task - this can be used to replicate more than one object from a source to a target. You can either use full load, incremental load options for replicating the data.
* **Synchronization task **- This is used to synchronize the data between the source and target. It provides DML operations like insert, update, upsert and delete. Using this task, we can sync only one object data at a time.
* Taskflows - You can combine other tasks and run them together in a flow. You can run these tasks in sequence, parallel or based on a decision.
The rare tasks are
Powercenter task
Integration
Cleansing
Warehousing
What does the filter transformation do ? What happens when the filter condition is true or false ?
What exactly is a filter ? And how does it work best in mapping effiency
- The Filter transformation filters data out of the data flow based on a specified filter condition.
- To improve job performance, place the Filter transformation close to mapping sources to remove unnecessary data from the data flow.
Filter condition is an expression that returns TRUE or FALSE.
- When the filter condition returns TRUE for a row, the Filter transformation passes the row to the rest of the data flow.
- When the filter condition returns FALSE, the Filter transformation drops the row.
You can use all the operators in filter especially this one too != which
What is an expression transformation ? What else do you use the transformation for ?
The Expression transformation calculates values within a single row. Use the Expression transformation to perform non-aggregate calculations.
You might use an Expression transformation to
- concatenate first and last names,
- change FIRST_NAME to upper case or lower case,
- change LAST_NAME to lower case or upper case,
- add number of months in date column etc.
you are manipulating rows
What is a sorter transformation ? Is it active or passive ?
Tell us more about sorter transformations like the specifics ?
Sorter transformation in IICS is used to sort the data in an ascending or descending order based on single or multiple keys.
This is the type an active transformation which sorts the data either in ascending order or descending order.
- You can sort data that passes through a Lookup or an Aggregator transformation configured to use sorted incoming fields.
- When you create a Sorter transformation, specify fields as sort conditions and configure each sort field to sort in ascending or descending order.
- You can use a parameter for the sort condition and define the value of the parameter when you configure the mapping task.
It can also supress the duplicate records in the source
What is a sorter transformation(Distinct) ?
Sorter Transformation is used to sort the source data in either Ascending or Descending order, similar to SQL command ORDER BY Statement.
**If you want to Integration Service to select unique values from a source, use the Select Distinct option. **
**You might use this feature to extract unique customer/employees IDs from a table listing total customer/employees. **
**Using Select Distinct filters out unnecessary data earlier in the data flow, which might improve performance. **
What is a task ? What are the Data integration tasks ?
A task is a process that you configure to analyze, extract, transform, and load data. You can run individual tasks manually or set tasks to run on a schedule.
- Mapping: Use to process data based on the data flow logic defined in a mapping or Visio template.
- Synchronization : Use to load data and integrate applications, databases, and files. Includes add-on functionality such as mapplets.
- Replication: Use to replicate data from Salesforce or database sources to database or file targets. You might replicate data to archive the data, perform offline reporting, or consolidate and manage data.
- Mass Ingestion : Use to transfer a large number of files of any file type between on-premises and cloud repositories and to track and monitor file transfers.
- Masking : Use to replace source data in sensitive columns with realistic test data for non-production environments. Masking rules define the logic to replace the sensitive data. Assign masking rules to the columns you need to mask.
- Powercenter: Use to import a PowerCenter workflow and run it as a Data Integration PowerCenter task.
What is a Mapping task? What are the three templates ?
Mapping task is use to process data based on the data flow logic defined in a mapping or Visio template.
When you create a mapping task, you select the mapping or Visio template for the task to use.
The mapping or Visio template must already exist before you can create a mapping task for it. Alternatively, you can create a mapping task by using a template.
- Integration
- Cleansing
- Data Warehousing
Can we specify ranking on more than one port?
No you can’t.
Advanced properties define how to Rank transformation processes _________.
DATA
What is joiner transformation ?
Whats another aspect of joiner transformation ? Also the details of how
The Joiner transformation can join data from two related heterogeneous sources.
NOTE:You can use the Joiner transformation to join two or more than two tables but in this journey you will learn how to join two tables with some conditions.
A join condition matches fields between the two sources. You can create
What are the different Join types ?
- Normal Join:- Includes rows with matching join conditions. Discards rows that do not match the join conditions.
- Master Outer:- Includes all rows from the detail pipeline and the matching rows from the master pipeline. It discards the unmatched rows from the master pipeline.
- Detail Outer:- Includes all rows from the master pipeline and the matching rows from the detail pipeline. It discards the unmatched rows from the detail pipeline.
- Full Outer:- Includes rows with matching join conditions and all incoming data from the master pipeline and detail pipeline.