Tuesday, April 16, 2019

Informatica Interview Questions

* Direct and Indirect method ?
Direct method is used to give single file location
Indirect method is used to give multiple file location in a generic file and give that generic file name as file location.

* Dimension tables or Fact tables loaded first?
Dimension tables are loaded first and then fact tables and according to requirement document scds will be applied accordingly.

* Does update strategy work without primary keys defined on it ?
No, it does not work without primary key.

* Can you update the target table without primary key on it ?
Basically update on target is done with primary keys when we use Update Strategy transformation. 
But, we can update target without primary key also. 
We need to check the option in target table properties as 'update override

* What is Fatal and Non-Fatal error?
Fatal error force session to fail.
Non-fatal error does not force session to fail. 
Following are the few database error messages which will be logged in the Error Log Tables / Flat files.

* What are the files that are created by informatica at the time of session run ?
- Badfile
- Session log
- Error log
- Workflow log

* What are the different transformations use Cache files in Informatica?
There are 5 different transformations use Cache files in Informatica.
Lookup Cache : Index Cache Stores condition values(data) and  Data Cache stores the data that isn't stored in index cache.
Joiner Cache : Index cache stores all the port values which are participated in the join condition and data cache have stored all ports which are not participated in the join condition.
Aggregator Cache : Index Cache stores group values configured in group by ports and Data Cache stores calculations based on group by ports.
Rank Cache : Index Cache stores group values configured in group by ports and Data Cache stores ranking information based on group by ports.
Sorter Transformation: 
It does not create .idx (index) and .dat (data) cache files like those used by Lookup, Aggregator, Rank, or Joiner.
The Sorter transformation uses a sort cache, but it’s different. It uses a single cache (not split into index and data.

* Output Old Value On Update (Dynamic lookup) ?
When the Integration Service updates a row in the cache, it outputs the value that existed in the lookup cache before it updated the row based on the input data. When the Integration Service inserts a row in the cache, it outputs null values.

* Write backward compatible session log file.
If ‘Write Backward Compatible Session Log File’ option is checked under Informatica Session Property Settings then it will write the workflow or session logs into a text file 
If the ‘Write Backward Compatible Log File’ option is not checked then the Integration Service creates the binary log only, session log files are in binary format by default.


* Difference between static and dynamic lookup
A static cache is created by default if a dynamic lookup is not selected. Once a static cache is created, it will not insert or update, but it will look into the cache file and get the realted data into the target. In the case of a dynamic lookup, once a cache is created, it will insert or update dynamically as the dynamic cache is synced with the lookup table.

* How to load the source file name of the current processing flat file along with the data into the target ?
Go to source file and select the option 'Add Currently Processed Flat File Name Port'. As a result source file name along with the path is loaded into the target table.This works for the both direct and indirect methods

* Dynamically generate filenames with respective to dates?
If we have flat file we would select option available on the top right corner "add file name column to this table" for the target file. In expression transformation we add a column called filename or systemdate and write expression to display to current date i,e "to_char(sessstarttime,'yyyymmddhh') and connect the ports to target definition as a result we will get target filenames dynamically when ever we run workflow.

* Difference between connected and unconnected lookup ?
- Connected lkp received connection directly from pipleline where as unconnected lkp receives connection from :lkp expression.
- In Connected lkp Dynamic & Static cache can be used where as in Unconnected lkp only Static cache can be used.
- Connected lkp returns multiple columns from same row where as UnConnected lkp returns only one column from each row.
- Connected lkp passes multiple output values to another transformation where as unConnected lkp can pass only one output value.
- Connected lkp returns default values when there is no match from lkp condition where as UnConnected lkp returns Null.

* What are the different error tables? 
There are 4 types of error tables.
pmerr_data : It Stores data and metadata about a transformation row error and its corresponding source row.
pmerr_msg : It Stores metadata about an error and the error message.
pmerr_sess : It Stores metadata about the session.
pmerr_trans : It Stores metadata about the source and transformation ports, such as name and datatype, when a transformation error occurs.

* What is target load plan?`
Its used to specify the order in which integration service loads target.
We implement it when we want to give priority to a particular target table when we have multiple target tables. 
We can set in mappings tab of target load plan option. By using arrow key we can set target load plan.
Mainly when we have parent and child tables we use target load plan.

* What is incremental aggregation?
when you use aggregator transformation and when you want to add new data to previous data or rows then the incremental aggregation option from properties tab is used.

* What is version control checkin and checkout ?
When we want to edit informatica objects we need to checkin version inorder to edit it. Inorder to save what is edited and also to be viewed by others users we need to checkout versioning. Inorder to apply versioning we need to right click on the object on which you want to use versioning.

* What is scheduler?
Informatica scheduler
Control M : It would be processed by seperate team.

* What are mapping variables?
They are used to assign max and min variable. We can track change data capture mapping logic with mapping variables.
They are used for incremental load in the mapping.

* What are worflow variables? 
Workflow variables provides ability to store data somewhere inorder to use it in the conditions and actions within workflow.
They are used to run multiple sessions on conditions.

* Assignment task ?
Its used to assign value to workflow variable. 

* Control table or Control file or Audit table ?
Control table stores information about workflow and session status, session statistics, wkf status, session start time, end time, src success rows, tgt success rows etc

* what is pushdown optimization?
It converts the mapping logic into sql query.
Option is in properties tab of session task.
Limitations : It will not support the Rank and transaction control transformation.

* Global variables ?
All the file paths are mentioned in global variables.

* What is partitioning?
It's used for parallel processing inorder to decrease the time to load data into target. Different partitions are pass through, round robin, hash user key, auto user key, key range and finally database partition. Each type of partition works according to its own logic.
Default Partitions are SourceQualifier, Aggregator, Target.
SQ         : Pass though
Sorter     : Hash Auto key
Router     : 
Lookup     : Any partition
Expression : Pass through
Aggregator : Hash Auto key
Sorter     : Pass through
Target     : Round robin or Pass through

* Filter and router which gives better performance
- The Router transformation has better Performance because Informatica server processes the input data only once, instead of as many times as you have conditions in Filter transformation.
- If a Filter transformation doesn't meet the condition it simply drops the rows it only writes the message to the log file.
If a Router transformation doesn't meet the condition it simply routes to the default group and if you do not connect the default group to any target or to other transformation it simply drops the records.

* Tough scenarios u faced 
If it is so Yes we can do so..  Instead of a lookup -Take the target table as a source( say S2) and the incremental data (coming from your source database/file say S1). Now Use joiner transformation (IF S1 is your Master table) then use Detail outer join - By this you will get all the records coming from S1 source and only matching records coming from S2 .e. ur target table). Now you can flag your records on the basis of S2 tables columns, i.e. If C1 is a primary key of S2 and against a particular record it is coming as NULL then that is a new record and needs to be inserted into target (can use flag for this Insert updates)

* What is the difference between a shortcut of an object and copy of an object?
If we change or modify in Shortcut, it will reflect in the source. Whereas in case of copying, the source will not change if we change something in the copied object.

* Source Qualifier is an active t/r but there is no change of row count then y should we called it active t/r & Router,sorter sometimes no change in row count then Y we called its active t/r?
A transformation is active does not mean that it always change number of rows. An active transformation can have possibility to change the number of rows.
In Source Qualifier we can do Sql-override to limit the number of rows in where condition, Sorter can remove the duplicates when select the property check box Distinct, In Router you can only extract the records from selected (filtered) group. So the number of rows can be impacted.
Expression is an example for passive transformation. Never change number of rows.

* When do you use joiner and lookup transformation ?
- When sources are from different databases/files use joiner.
- When you only need one row or set of columns from another table use Lookup.
- If  the table size is not too large then preferable to use lookup.
- If result from a single matching records needs to be returned then use a lookup. If there is a query that needs to be used in a lookup to find the result for lookup then good to use a lookup.
- If you are doing lookup on a table that is updated in the session then use a lookup. Joiners are active so not preferable.
- If look up table data does not change then the table can be made persistent and used in the lookup which gives even better performance.
- If data from two different source qualifiers need to be joined then use a joiner.
- If data from two different databases are read and need to be joined with a outer or inner join then use joiner transformation.

* Difference between joiner and Lookup Transformation.
Below are the differences between lookup and joiner transformation:
- Joiner joins data from two heterogeneous sources but Lookup retrieves data from a lookup table.
- Joiner is Active and Lookup is Passive transformation.
- Joiner uses data cache & index cache whereas Lookup uses static, dynamic, or persistent caches.
- In lookup we can override the query but in joiner we cannot.
- In lookup we can provide different types of operators like – “>,<,>=,<=,!=” but, in joiner only “= “ (equal to )operator is available.
- In lookup we can restrict the number of rows while reading the relational table using lookup override but, in joiner we cannot restrict the number of rows while reading.
- In joiner we can join the tables based on- Normal Join, Master Outer, Detail Outer and Full Outer Join but, in lookup this facility is not available .Lookup behaves like Left Outer Join.

* How can you use primary key in dynamic lookup cache ?
In dynamic lookup we have default port as Associate port as Sequence-ID used for primary key generation . This is useful if the target table have a column with primary key exp: Event_id. At the time of dynamic lookup we have to configure the as primary key column of the lookup table with Sequence-ID in the Associate port i.e. Event_id with Sequence-ID in the Associate port
Functionality of this Sequence-ID:
----------------------------------------
1) When the Integration services creates the dynamic lookup cache, it tracks the range of values in the cache associated with any port using Sequence-ID
Exp:Event_id :
1
2
3
4
5
6
2) When a integration services insert a new record into the cache, it will generates a Sequence-ID for the port incrementing the max Sequence-ID existing in the cache
i.e if the new record comes to lookup the NewLookupRow will be 1 at that time the next Event_id = max (value)+1 ==> max(Event_id)+1 will be generated and inserted in to cache for Event_id and the same vaule is sent to target table. So next Event_id will be 7 and this will be inserted into target tables Event_id PK column along with the new row 

* How does Sorter handle NULL values?
We can configure the way the Sorter transformation treats null values. Enable the property Null Treated Low if we want to treat null values as lower than any other value when it performs the sort operation. Disable this option if we want the Integration Service to treat null values as higher than any other value. 

* How to remove Null values using Filter Transformation?
In the Filter Transformation you can filter out rows having null values and spaces by using ISNULL and IS_SPACES functions.
For example , to filter out rows that contain null in EMP_NAME column us the below condition
IIF(ISNULL(EMP_NAME),FALSE,TRUE)
IIF(IS_SPACE(EMP_NAME),FALSE,TRUE)
This condition says that if the Employee name is null then discard the row else pass it to the next transfomation

* How to remove records with ''(empty string) not null?
Provide the filter condition as (salary != null AND empid!= null) and use a rank transformation to get Top or Bottom records.
or
LTRIM(RTRIM(salary ))!='' and  LTRIM(RTRIM(empid ))!=''

* How to remove duplicate records?
- Select distinct option in source qualifier or sorter or check group by all port in aggregator transformation.
- Use dynamic lookup

* Mapping for pulling matching columns from flat files ?
Suppose we have emp table as the source Inorder to pull the matching columns from flat files we would take lookup transformation and do lookup it on target table and give lookup condition on common column empno from both the source and target tables. And connect lookup transformation to target table so that all the new rows will be updated in the target.   

* Use of mapping & session parameters and mapping and workflow variables with examples ?
Mapping variables are the values which change between session run.
Mapping variables are use to keep track of incremental loads to target and also to capture the changed data. 
We create mapping variables from mapping tab and create variable as $$custid. Then in expression transformation create variable port as var_custid assign setmaxvariable function i,e..Assign setmaxvariable($$custid,custid) and connect transformation output to the target.
   
Mapping parameters are values which remain constant through out the entire session. We will give the mapping parameters values in parameter file.
   
Workflow variables also change at the time of session run. 
They are used to store the values so that we can use those values in the conditions.

Workflow parameter defined in the parameter file.
It stores all parameters and their values i,e source,target,cache and so on.

* Similarities between joins of joiner transformation and sql joins?
Detail Outer Join is similar to Right Outer Join in SQL 
Master Outer join is similar to Left Outer Join in SQL
Full Outer is similar to Full Outer Join in SQL

* Lookup is similar to which join in Informatica ?
Lookup is similar to Left outer join 

* What if joiner has 2 join condition ? how does it work ?
Join conditions in joiner transformation works as AND. Integration first checks the first join condition and then checks second join condition. If both conditions are satisfied only then output will be displayed.

* Delete duplicate records from a table with and without pk in sql ?
Primarykey:
Delete from customers where id IN (select id, count(id) from customers group by id having count(id) >1);

NoN-Primarykey:
select * from
(select empno, ename, row_number() over (partition by empno order by empno desc) as Dup from emp2)
where Dup>1;

* What should we use for deleting empid or empname and its query ?
Delete is the command used for deleting empid or empname.
Delete from emp where empid IN (select empid, count(empid) from emp group by empid having count(empid)>1);

* Sql query vs session override ?
Session override will override the sql query. Workflow considers only the session override as query.
If you are using single mapping for different sessions and you want to override SQL then you can use it as session level. Session level SQL override takes precedence over mapping SQL override. Whatever changes, one make at the mapping level is visible at the session level, visa-versa is not true. So to do the last minute customization to the code, to be more precise temporary changes to the logic, that can be done at the session level.
                    
* sql override filter of sq and filter transformation ?
Source qualifier filters rows while extracting from the source and filter filters the rows while sending to target.
But source qualifier works only on relational tables but filter works on both flat files and relational tables. 

* When to use joiner and lookup transformation ?
Joiner supports only = operator while Lookup supports <=,>=,=,!=
Joiner has left, right,full,inner join while Lookup supports only left outer join
Joiner has No query override while Lookup can write a lookup override query
Joiner is Active while Lookup passive
Joiner used only as Source while Lookup used as Source and as well as Target.
Joiner on multiple matches it will return all matching records while Lookup Return either First Value or Last Value Or Any Value Or Error Value.
Joiner selects all the fields from source while Lookup selects only required fields from source.

* Scd2 explanation
Suppose if there is a requirement to update employee dimensions in the target, then we will take employee table as source and perform the lookup on target table. We drag common column between both sq and lookup table ..i,e the empno column to lookup and give the condition as sq_empno = lkp_emp.
Now drag all the ports from sq and output rows from lookup to router transformation. In router we create 2 groups as insert and update,
In insert tab we give condition to insert new rows as "isnull(cust_key)" and in update tab condition as "not isnull(cust_key)" and sq_empno =! lkp_emp.
Then connect insert port to update strategy transformation and write conditon as "dd_insert". create sequence generator and connect it to
cust_key column of target so that it will generate sequence numbers and create expression transformation create a port and name it version and
give 1 to it to know the version number in the target and connect it to version port of target.
From the update group of router transformation connect it to update strategy and write condition as "dd_update". Now connect all the ports to the
target definition. 

* How to create mapplets and worklets with examples and its uses ?
If you want to create same logic in multiple mapping we use mapplets.
Suppose if we have requirement to load only the employees belonging to department no 30. To apply logic in mapplet, we would go to mapplet designer and take
mapplet  input and we drag all the ports from src_emp table to mapplet input but sq will not be connected to  mapplet input. After that we would take filter      transformation and give filter condition as deptno = 30 and we would connect this filter transformation to  mapplet output and save the mapplet. Now this mapplet can be used in mappings multiple times if there is a same requirement.

If you want to create dependencies between the each task we use worklets.
If you have many sessions in a mapping its very difficult to manage with dependencies, instead of adding dependencies among various session we use mapplets to make     our work easy by creating dependencies among different worklets.


* How to disable 4th session from a sequence of 5 sessions ?
In 4th session select the option as disable task  
   
* Tivoli jobs deployed ?
Yes i have deployed jobs. i used to create job steams and create jobs and monitor them. 

* Envirnments purpose and its abbrevations ?
Different envirnments are 
SIT : System integrated testing
UAT : User acceptance testing
OAT OR PREPROD : 

* How many overrides are there in informatica ?
- Sql override
- Update override
- Lookup override

* How index and data cache works in Joiner transformation?
When you are joining two sources A and B .You have chosen Master table as B with columns Deptno and Deptname.
So Informatica will have Deptno from Table B in Index Cache and Deptname from Table B in Data Cache. Because your condition is only on Deptno.
So every record from table A should be compared with Index cache value of table B to get the deptname from data cache so its like full table scan of table B
If you use sorted input what happens is all data in table A is sorted and all the data in table B is sorted.

* Can you join same source to joiner transformation. 
Yes, you can join same source to joiner transformation.

* Default row type of informatica 
Default row type in informatica is 'Insert'.

* Does update strategy change row type 
Yes, it changes default row type to data driven.

* What if I select treat source rows as insert for update strategy ?
All the rows will be inserted into the target table.

* What is a slowly changing dimension?
Dimensions that change slowly over a period of time, rather than changing on regular basis are called as slowly changing dimensions.
* Primary key is also shared across many dimensions, is it a conformed dimension?
No, primary keys are not considered as conformed dimensions. Primary key is not a seperate dimension table as other dimensions.  

* Fact is normalized or denormalized in star schema?
The dimension tables of descriptive information are highly denormalized. Meanwhile, the fact tables with performance metrics are typically 'Normalized'. 
or 
The fact table is in normalized structure as the redundant information is maintained in the dimension tables

* What are the Facts and dimension in banking project?
Some of the dimensions are :- 

TIME
- Time key
- Day
- Month
- Year

LOCATION
- Location id
- Street
- City
- State
- Country

PRODUCT
- Product key
- Product name
- Product brand

CUSTOMER
- Customer id
- Customer name

BRANCH
- Branch id
- Branch name
- Branch type

Some of the Facts are :-
SALES
- Time key
- Location id
- Branch id
- Product key
- Customer id
- Dollars sold
- Units sold

* Surrogate keys are assigned to fact table or dimension tables?
Surrogate keys are assigned to dimension tables

* Difference between Clustered and Non clustered index ?
Clustered indexes sort and store the data rows in the table based on their key values. There can only be one clustered index per table, because the data rows themselves can only be sorted in one order.
The data rows of the underlying table are not sorted and stored in order based on their non-clustered keys.

* What are active and passive transformations ?
-Active transformation changes no of rows pass through it where as passive transformation does not.
- Active transformation changes row type and passive does not change. 
  update strategy's insert, update and delete is example of it
- Active transformation changes transaction boundary and passive does not change it.
  Commit or rollback is example for it 

* Why do you use connected and unconnected lookup ?
Connected transformation is used when we want to get multiple columns as a result, where as unconnected is used to get single column as a result from each row

* Why lookup is active transformation ?
From the properties tab you can see the 'Lookup Policy on Multiple Match' property is set as 'Use All Values' and it is a read only property and cannot be changed after the transformation is created.

* What are surrogate and natural keys ?
Natural keys are the columns used as your key that are already defined within your table. 
Name is called a natural key.
It comes from the domain itself.
Auto generated primary key is called a surrogate key. 
It takes place name of identity
Surrogate keys are the columns added into the table such as user_id, customer_id. 

* How get a particular string from invalid data like 'css-tyson- abcxyz'
Select Ltrim ('css-tyson-abcdxyz','css') from dual;
Select Rtrim ('css-tyson-abcdxyz','abcdxyz') from dual;

* What types of parameters and variables can go in a parameter file?
The following variables and parameters can be referenced in a parameter file
- Worklet variables
- Session parameters (Built in):
  $PMSessionLogFile
- Session parameters (User-defined):
  Database connections
  Source file names
  Target file names
  Reject file names
  Lookup file names
-Mapping parameters
-Mapping variables

* Difference between normal load and bulk load ?
- In Normal load, database logs are created whereas in Bulkload, database logs are not created.
- In Normal load, Indexes are created whereas in Bulkload, Indexes should not be created. It fails the session when indexes are created.
-  Use Normal load, if there are constraints and if you want to rollback. Use Bulkload, if there are no constraints and you cannot rollback.

* What are the limitations of Bulk load ?
- Bulk load does not support constraints in the target table. If there are any primary keys, foreign keys, indexed defined on table, bulk load does not support these constraints
- Bulk load does not support update strategy transformation. If you specify bulk load with it, integration service ignore bulk load and reverts it to normal load

* Difference between Sequence generator and Sorter transformation ?
- Sequence generator transformation is used to generate sequence numbers, where as Sorter transformation is used to sort the data either in ascending or descending order.

* How do you debug ?
We use Debugger to debug in informatica power center

* How do you perform error handling or Error handling techniques you follow ?
Error are of two types. They are fatal errors and non-fatal errors.
Fatal errors:- These errors force session to fail.
This happens when its unable to access source, target or repository. This can include loss of connection or target database error, such as lack of database space to load.
It happens while reading data from source to target. Rejected records will come in this category.
Non-fatal errors:- These errors does not force session to fail

* How do you capture transformation errors? 
To capture the transformation level errors, we need to enable the property in the configure tab in session properties 'Error Log Type'. And we choose DB connection along with the connection information, as a result 4 error tables are created in database

- PMERR_DATA
- PMERR_MSG
- PMERR_SESS
- PMERR_TRANS

* How do you tune aggregator ?
- Group by on simple columns, use numbers for instead of string and date columns for Group By port.

* Difference between Persistent and shared cache
Both Persistent and Shared can be named as well as can be shared in multiple sessions. Lookup uses shared cache it deletes the data once the session is completed, Where as when we use persistent cache, the data in the cache can be stored permanently in cache directory though the session is completed.

* What is a associated port in lkp
When you configure dynamic lookup, you must associate each lookup port with input port or sequence id. Integration service updates the cache with the result of associate expression and expression contains input values from lookup cache.

* Lookup by default returns which value on multiple match
By default 'lookup policy on multiple match' selects 'use any value' option, as a result it returns first value.

* Why lookup is active transformation?
When you configure the Lookup transformation to return all matching rows, the Lookup transformation returns all rows that match the lookup condition. The Lookup transformation becomes an active transformation. The Lookup Policy on Multiple Match property is Use All Values. The property becomes read-only. You cannot change the property after you create the transformation as becomes read only.

* Why do use sql override and lookup override ?
SQL Override is to limit the number of incoming rows entering the mapping pipeline, whereas Lookup Override is used to limit the number of lookup rows to avoid the whole table scan by saving the lookup time and the cache it uses.

* Difference between user defined join in source qualifier and Joiner transformation ?
User defined join in sq will join only relational sources but joiner will join both relational as well as flat files.
User defined join in sq will join only single source but joiner will join multiple or heterogeneous relational sources.

* What are Stored procedures and its Syntax? 
If a group of PLSQL commands or program stored as a Database objects permanently in the Database and precomplied is called Stored procedure  

Create or Replace Procedure Procedure_Name (In_Parameter Int)
Is
O_parameter int;
Begin
< Procedure Body >
End Procedure_Name;

* Do we use surrogate keys in date dimension? 
The date dimension is exempt from the surrogate key rule

* Why do we use surrogate keys? 
- It's impossible for the system to create and store duplicate values
- It applies uniform rules to all records
- It lacks context or business meaning, so there will be no change in the future

* What is the star and snow flake schema and uses of it ?
Star schema is contain fact table in the middle and surrounded by denormalized dimension tables
Where as Snow flake schema contains contains fact table in the middle and surrounded by Normalized dimension tables

* What is a factless fact and use of it ?
A table which does not contain measures or facts are called as factless fact. It’s just an intersection of dimensions. Product key and Date key is the perfect example of factless fact.

* What is a degenerated dimension ?
This dimension is derived from the fact table and does not have it own dimension table. Transaction code is the perfect example of it.

* Why do we do partitioning in oracle? 
Partitioning allows a table, index, or index-organized table to be subdivided into smaller pieces, where each piece of such a database object is called a partition.

* What is 2 identified as in error table? 
'Writer error' in error log table

* Alternative of joiner transformation to combine two tables ?
Source qualifier is the alternative of joiner transformation. When u drag n drop the source tables you will get the source qualifier for each source table. Delete all the source qualifiers, add a common source qualifier for all source tables. Right click on the source qualifier you will find edit click on it, Click on the properties tab, you can write sql query or give filter condition. In session, just specify the connections for source and target database.

* What type of data is stored in Parameter file 
Parameter file stores all the information related to mapping, session and workflow locations. 

* Can you use mapping variable of one mapping in another mapping ?
Yes, we can use mapping variable of one mapping in another mapping. Refer in real time scenarios tab 7th one 

* Can we use bulk load with Update strategy ?
No, we cannot use bulk load for update strategy. Even if we try to use, integration service reverts bulk load type to normal and it will be seen in session log.

 

* What is the difference between update strategy and session level update  option?
Session level update or insert is used to perform single operation (like insert, update or delete) while we have single target.
where as update strategy is used when we have multiple operations (like insert, update or delete) while we have multiple targets

* Session Bottlenecks
If we don’t find any source, target, or mapping bottleneck, then we may have a session bottleneck. Small cache size, low buffer memory, and small commit intervals can cause session bottlenecks.

* How do you load into target if you have 5 different files ?
- Source Filetype as 'Indirect' is the best approach for it  

* What are the different reusable tasks ?
Session, Email and Command are Reusable tasks.

* How do you configure email task?
- Go to task and create email task
- Double click it, enter email user name, email subject, email text in properties tab
- Go to session components tab, select on success email with email task
- Now automatically, email will be sent successfully after the workflow execution.

* Why do you use 'source filter' is source qualifier and 'filter' transformation ?
Source Qualifier transformation filters rows when read from a source, while the Filter transformation filters the row while sending to a target.
Source Qualifier transformation only filter rows from relational sources, while the Filter transformation filters rows from any type of source.
Source Qualifier transformation provides an alternate way to filter rows and it is better than Filter transformation. 

* Why do we use connected and unconnected lkp ?
If we want the output from a multiple ports at that time we have to use connected lookup Transformation.
We can use the unconnected lookup transformation when we need to return the output from a single port.

* How to update without update strategy transformation ?
Select in session level properties, treat source rows as ‘Update’ and in target level properties select ‘update else insert’.

* Can we update target without update strategy ?
Yes, we can update target without update strategy. Configure the option in target properties ‘target override’, to update target without update strategy.

* Use of data driven (with and without) in update strategy transformation ?
If you use ‘data driven’ with update strategy, you must use update strategy transformation at mapping level, while configuring a mapping.
You cannot use ‘data driven’ without update strategy.

* What if data driven is not selected, while using update strategy transformation?
If data driven is not selected, integration service will go for insert or update, which is selected for treat source rows.

* Does a workflow fails if it does not have workflow variables configured on it?
 No, workflow will not fail

* Use of Normalizer transformation ?
Normalizer transformation is an active transformation, used to convert single row into multiple rows and vice versa.

* Does normalizer converts rows into column or columns into rows ?
Normalizer converts columns into rows

* What is Lookup order by clause?
By default, for every lookup transformation, an order by clause is generated for every column. Change default order by clause to lookup condition ports.

* What is the use of Shell Scripts with in Informatica ?
- To execute workflow
- To schedule the workflow
- To create dependencies among various sessions

* Which transformation is used to perform SCD other than Lookup ?
Joiner can be used to perform SCD other than lookup

* How to remove comma from a flat file data 3,000 ?
To_integer(replacechr(FALSE,'3,000',',',''))
* How to remove comma from city 'san, fancisco' ?
REPLACESTR(1,col_name,',','')
OR
(REPLACECHR(1,NetEntered,',',NULL))
After col name give comma in single quotes and again give two single quotes without space this means a null is replaced in place of comma.

* By default lkp policy on multiple match returns which row ?
By default 'lookup policy on multiple match' selects 'use any value' option, as a result it returns first value.

* Do you use static cache or dynamic cache for SCD 2?
Static cache is used for SCD type 2 because, we can follow the scd2 logic with static but not dynamic cache. Why because, basically dynamic cache updates old records but, we don’t want to update old records. We want to insert old record with current date, so its possible only with static cache.
(Or)
SCD2 Concept is that if the source record found match in target,then that will Inserted into the target as a new record.So in SCD2 there are multiple record will be in target redundantly i,e let say for a single Empno there is more than one record that's why it maintain history.

The points are as below
- when the matched record is inserted into the Target Dimension table,it remain keep the source key value(Let say Empno),but the target table primary key is the surogate key which is not created upon the Employee Key.I suppose to say that it means in target table,the employee no may be redundant but the surogate key will be unique.
- The Lookup transformation always make condition upon Src.empno=Tgt.empno.Here the situation is when for a single source record if it match more than one record in target,then you can choose First,Last or any when you use Static Lookup.So by static lookup it is easier.But in case of Dynamic Lookup if for a single source record found more than one corrosponding record in target,that will fail.So you make sql override in Dynamic lookup that will extract only one match record (Last record)from target.then it will work,other wise not.
- In dynamic Lookup,if the newlookup row is 0-You can drop it.
1-Insert it by using the sequence genarator for Surogate key.
2-It will be inserted but use a new sequence generator having the start value somthing more than the previous sequence generator's last value,then the both are not contradict.
- Override the sql query in dynamic lookup,that's why in every run for each source record if it found match in target then always return the last matched record.
Then it work fine.But the better solution is Static Lookup for SCD2.

* How do you configure SCD with joiner transformation?
The entire implementation will be same as that using a lookup. The only thing we need to replace the Lookup transformation with a Joiner transformation. In the Joiner transformation the Source table will be used as Master and the Target table as Detail. The join condition will be same as that of lookup condition and the join type being Detail Outer Join
SRC-->SQ--> JNR-->EXP-->RTR-->TGT(Insert for both new and updated records)
SRC(TGT)-->SQ--> -->TGT(Update)

* What type of join used in joiner for SCD ?
Detail outer join is used in joiner for SCD.

* If your mapping is taking long time to load 15 million records what would you do ?
I will check session log for the busy percentage. If any of the reader or writer or transformation threads, busy percentage is near to 100, then I would tune that part.

* Difference between pass through and round robin and draw picture of both with example?
Pass through partition will process the records sequentially among the partitions whereas, round robin distributes rows evenly among the partitions.



* Do you use surrogate key or primary key for lookup condition
Primary key is used for lookup condition

* Why lookup is passive transformation ?
Lookup does not change no of rows pass thought it, if the condition is met or not met. If condition is met it passes values, if not met it passes null values. So, that is why lookup is called as passive transformation. Below 9.1 version of informatica power center, lookup is said to be passive.

* Can we use persistent cache in another workflow ?
Yes, I think we can use persistent cache in another work.

* Why workflow concurrency is used?
If source data is from multiple systems or the number of data sources increases, the ETL load time also increases
Concurrent Workflows are used to reduce Warehouse ETL Load Time in parallel.

* Why the Event Wait and Event Raise task is used?
To create dependencies among the tasks within the workflow.
'Even Wait' waits for a particular event to occur and it can be predefined events or user-defined events.
'Event Raise' is opposite to Event Wait, the Event Raise signifies that i am done, whoever is waiting for me go ahead and You can define only user-defined events.

- To configure event task, we create an event in edit workflow, in events tab create events
- Go to Event tab of Event task, select user defined or system defined events and select the event that we have created. Same event is used for both the wait and raise event tasks.


* What is the used of Email task ?
Email task is used in the case of session Success or Failure. There are two types of Email task such as 'On Success E-mail' and 'On Failure E-mail'.
In order to apply this Email task, go to components tab in the session and define the email details like Email username, subject and text.

* How to remove the hash symbol from the flat file header in Power Center ?
If you are using Header Options to create header columns in a flat file, you may end up with a hash symbol in the beginning of the header record. 
To remove that follow the below steps.
Edit the session corresponding to the desired flat file & Select the Config Object tab
Create a new custom property with the attribute RemoveOutputHeaderHash and the value Yes.

* What is SDLC life cycle?
The 7 Stages of the Software Development Life Cycle (SDLC)
Requirements & Analysis.
Project Planning.
Design.
Coding & Implementation.
Testing.
Deployment.
Maintenance.


* Which gives better performance Star schema or Snow flake schema ? why ?
The Star schema is in a more de-normalized form and hence tends to be better for performance. 
Star schema uses less foreign keys so the query execution time is limited. 
In almost all cases the data retrieval speed of a Star schema has the Snowflake beat.


* How to swap genders in database ?

UPDATE GENDER
SET GENDER = CASE GENDER WHEN 'MALE' THEN 'FEMALE'
ELSE 'MALE'
END

* How do you derive values into the target with informatica ?
 
col1 col2 col3 Target
A a1 200 500
B a2 300 600
A a3 300 500
B a4 300 600
C a5            200 200

* How do you improve performance of Aggregator transformation ? 
- Group by simple columns.
You can optimize Aggregator transformations when you group by simple columns. When possible, use numbers instead of string and dates in the columns used for the GROUP BY. Avoid complex expressions in the Aggregator expressions.

- Use sorted input.
To increase mapping performance, sort data for the Aggregator transformation. Use the Sorted Input option to sort data.
The Sorted Input option decreases the use of aggregate caches. When you use the Sorted Input option, the Data Integration Service assumes all data is sorted by group. As the Data Integration Service reads rows for a group, it performs aggregate calculations. When necessary, it stores group information in memory.
The Sorted Input option reduces the amount of data cached during the mapping and improves performance. Use the Sorted Input option or a Sorter transformation to pass sorted data to the Aggregator transformation.
You can increase performance when you use the Sorted Input option in mappings with multiple partitions.

- Filter data before you aggregate it.
If you use a Filter transformation in the mapping, place the transformation before the Aggregator transformation to reduce unnecessary aggregation.

- Limit port connections.
Limit the number of connected input/output or output ports to reduce the amount of data the Aggregator transformation stores in the data cache.

* How to remove Special characters completely from a table or a file ?
REGEXP_REPLACE Function :
It will allow you to replace a sequence of characters in a string with another set of characters using regular expression pattern matching.

SELECT regexp_replace('Removing_Special - Characters .,_-$%&^& ',
'[^0-9 A-Za-z]', '');
            OR
Update emp2 set Ename = Regexp_Replace (Ename,'[^A-Z0-9 ]', '');


* How sorted input improves performance ?

* How to use 100 Flat files in informtica without using indirect meathod ?

* How to load unique records into target1, duplicate records in target2 & all instances in target3 ?

* How to get no of occurances of a word in unix file ?
  grep -o '5AB' Journal | wc -l

* Which all transformations have sorted input option
  Aggregator, Joiner and Lookup 
  
* Which all transformations have distinct option
  Source qualifier and Sorter  


=================================================================
 

* How to remove invisible or control m characters from a file 
* How to get Last modified file in unix 
* Scenario of persistent cache and Mapplet
* command to delete a string from a file ?
* How do you do Partitioning ?
* Static or dynamic partitioning? 
* Configuring url
* Unique and non unique record 
* Remove comma from a string 
* Filename of flat file name 
* how to read error tables 
* How to deal with sql query option when you use pass through partition ?

* How do you use web services in informatica ?


* How do we handle duplicates in scd's ?

* Transaction control transformation

* Stored procedure transformation.

* SQL transformation.


* Source side how do you tune performance ?

* Maintain history in Scd1

* If there is a change in empno then cust_key does'nt update the record, then how do you do it ?

* What is your project flow?

* Why do you load dimensions and facts

* It's new loads or old loads

* Associate lkp, new lkp

* How selecting table with less port as master improves performance in Joiner?

* Nth max salary

* If I select two output port in unconnected lkp what would be my output?

* How can i insert, update and delete existing records in the table simultaneously?


* Source side how do you tune performance ?

* How do you understand that you session is taking long time ?

How do you configure email task ?

* Why static lookup is not inserting new rows into the target ?

* Any question for me ?

*  All duplicates in one target1 and non duplicates in target2

* Difference between group by and partition by

* What is a group by expression in oracle ?

* What is a single-group group function ?
Single group function returns single output value…………………….

* How do you Rollback on Error in informatica ?

* If for some reason,the database stored procedure fails,we want this to reflect in the session and result in failure of the session.

* The requirement is when the stored procedure fails, the session should fail and a failure email should be sent.

* How to remove Null values using Filter Transformation?
























No comments:

Post a Comment

Different types of assets in IICS

 The different types of assets in IICS are - Tasks - Data ingestion and Replication - Mappings - Mapplets - Taskflows - Components