Basic Informatica Tutorial

Wednesday, May 22, 2019

Stored Procedure Transformation

Stored Procedure Transformation is Passive transformation.
It is both connected and unconnected.
Procedure are stored in database and we call the stored procedure through stored procedure transformation in Informatica.
To perform the calculations like Salary hike, Min and Max salary, we use stored procedure Transfomation.

Examples :-
------------
If an application updates the customer table in ten different places, there can be a single stored procedure and a standard procedure call from the application for this functionality.
If a stored procedure has four SQL statements in the code, then there only needs to be a single call to the database instead of four calls for each individual SQL statement.

Two types of ports :-
----------------------
Input : This port is used to get Input parameter.
Output: This port will display the output values as per the procedure defined in database.
Return : This port is used to return values.

Stored Procedure Type :-
--------------------------
There are 5 different Options available in Stored Procedure Transformation Properties.

- Normal : It will run row by row basis.
- Source Pre Load : It works like property in Source Qualifier as 'Pre Sql'.
Before reading the data from the source Source Pre Load would act on database.
- Source Post Load : It works like property in Source Qualifier as 'Post Sql'.
After the load, Source Post Load would act on database.
- Target Pre Load : It works like property in Source Qualifier as 'Pre Sql'.
Before reading the data to the target Target Pre Load would act on database.
- Target Post Load : It works like property in Source Qualifier as 'Post Sql'.
After the load, Source Post Load would act on database

To create a stored procedure transformation :-

- Go to the transformations tab and click import stored procedure option.
- Connect to the database where the stored procedures is stored and select it.
- Now connect the appropriate ports to input and out put ports.

- Select the connection information

- Now execute the workflow, we get the expected results.

Friday, May 10, 2019

Loops in Stored Procedures

Whenever we execute a procedure each query or statement is executing once.
Loop is a programming concept which is used to execute a group of statements repeatedly for the specific no of times.

3 Types of Loops :-
- Simple Loop
- For Loop
- While Loop

Thursday, May 9, 2019

Aggregate functions

Aggregate functions :-
The functions SUM, COUNT, AVG, MIN, MAX are the common aggregate functions the result of which does not depend on the order of the records.

Aggregate functions Return a single output by calculating values from the table or column. They return the group value multiple times with each record.

SELECT DEPTNO, COUNT(*) DEPT_COUNT FROM EMP
WHERE DEPTNO IN (20, 30)
GROUP BY DEPTNO;

DEPTNO DEPT_COUNT
---------------------- ----------------------
20 5
30 6

SUM() function :-
It returns the sum of a column. The column should be numeric.

SELECT SUM(SAL) FROM EMP;
SELECT EMPNO, SUM(SAL) FROM EMP GROUP BY EMPNO;

COUNT() function :-
It returns/counts the number of rows in a query. But, it will not count any null values/column in a table.

SELECT COUNT(*) FROM EMP;
OR
SELECT COUNT(EMPNO) FROM EMP;
SELECT EMPNO, COUNT(EMPNO) FROM EMP GROUP BY EMPNO;

AVG() function :-
It returns the average value of a column. The column should be numeric.

SELECT AVG(SAL) FROM EMP;
SELECT EMPNO, MIN(SAL) FROM EMP GROUP BY EMPNO;

MIN() function :-
It returns the minimum or smallest value of a column. The column should be numeric.

SELECT MIN(SAL) FROM EMP;
SELECT EMPNO, AVG(SAL) FROM EMP GROUP BY EMPNO;

MAX() function :-
It returns the maximum or biggest value of a column. The column should be numeric.

SELECT MAX(SAL) FROM EMP;
SELECT EMPNO, MAX(SAL) FROM EMP GROUP BY EMPNO;

NOTE:

Select sum, sum(sal) from emp group by empno => This query will not work as empno is not in group by clause.

Select empno, sum(sal) from emp group by empno => This query will work as empno is present in group by clause.

Select sum(sal) from emp group by empno => This query will work as we are not selecting any column in select statement and group by clause has not issues.

Tuesday, May 7, 2019

Analytical Functions

The functions SUM, COUNT, AVG, MIN, MAX are the common aggregate functions the result of which does not depend on the order of the records.
Analytical functions like LEAD, LAG, RANK, DENSE_RANK, ROW_NUMBER, FIRST, FIRST VALUE, LAST, LAST VALUE depends on order of records.

Analytical functions :-
----------------------
Analytic functions Return the same number of rows as the input. Analytic functions give aggregate result they do not group the result set.
In order to arrange employees in a particular order like ascending or descending order with the help of analytical functional like rank, dense_rank

SELECT empno, deptno,
COUNT(*) OVER (PARTITION BY
deptno) DEPT_COUNT
FROM emp
WHERE deptno IN (20, 30);

EMPNO DEPTNO DEPT_COUNT
---------- ---------- ----------
7369 20 5
7566 20 5
7788 20 5
7902 20 5
7876 20 5
7499 30 6
7900 30 6
7844 30 6
7698 30 6
7654 30 6
7521 30 6

Types of Analytical Functions
- Rank
- Dense_rank
- Row_number
- Count
- Lag
- Lead
- First_value
- Last_value
- First
- Last

RANK function :-
It is used to rank a record within a group of rows
RANK() will return {1, 1, 3, 4, 4, 6} (note that the values 2 and 5 are skipped)

SELECT EMPNO, ENAME, JOB, RANK() OVER (ORDER BY JOB) AS RANK FROM EMP;

(OR)

SELECT E.*, RANK() OVER (ORDER BY JOB DESC) AS RANK FROM EMP E;

DENSE_RANK function :-
Dense_rank function acts like the RANK function except that it assigns consecutive ranks.
DENSE_RANK() will return {1,1,2,3,3,4}

SELECT EMPNO, ENAME, JOB, DENSE_RANK() OVER (ORDER BY JOB) AS RANK FROM EMP;

(OR)

SELECT E.*, DENSE_RANK() OVER (ORDER BY JOB DESC) AS DRANK FROM EMP E;

+---+------------+------+------------+
| V | ROW_NUMBER | RANK | DENSE_RANK |
+---+------------+------+------------+
| a | 1 | 1 | 1 |
| a | 2 | 1 | 1 |
| a | 3 | 1 | 1 |
| b | 4 | 4 | 2 |
| c | 5 | 5 | 3 |
| c | 6 | 5 | 3 |
| d | 7 | 7 | 4 |
| e | 8 | 8 | 5 |
+---+------------+------+------------+

ROWNUM function :-
Rownum is temporary, It is the sequential number.

SELECT ROW_NUMBER() OVER (PARTITION BY JOB ORDER BY JOB) AS ROWNUMBER,
EMPNO, ENAME, JOB FROM EMP;

(OR)

SELECT E.*, ROW_NUMBER() OVER (ORDER BY JOB DESC) AS RN FROM EMP E;

ROWID function :-
Rowid is permanent, It is physical address of the rows or permanent unique identifier.

SELECT EMPNO, ENAME, JOB, ROWID FROM EMP;

LEAD function :-
It is used to display Subsequent row data along with current row data
For example, by using the LEAD() function, from the current row, you can access data of the next row, or the second row that follows the current row, or the third row that follows the current row, and so on.

Syntax :
LEAD ( scalar_expression [ ,offset ] , [ default ] )
OVER ( [ partition_by_clause ] order_by_clause )

SELECT EMPNO, ENAME, JOB, SAL, LEAD(SAL) OVER(PARTITION BY JOB ORDER BY SAL) AS HIGHEST_SAL FROM EMP;

(OR)

SELECT E.*, LEAD(SAL) OVER (ORDER BY SAL DESC) AS LEAD
FROM EMP E;

LAG function :-
It is used to display previous row data along with current row data. Here 1 indicated beside lag column skips 1 row and by default its 1 there is no use of writing it.
2 is indicated to skip two rows in the result.

Syntax :
LAG ( scalar_expression [ ,offset ] , [ default ] )
OVER ( [ partition_by_clause ] order_by_clause )

SELECT EMPNO, ENAME, SAL, LAG(SAL) OVER (ORDER BY SAL DESC) AS PREV_ROW FROM EMP;
(OR)
SELECT EMPNO, ENAME, SAL, LAG(SAL,1) OVER (ORDER BY SAL DESC) AS PREV_ROW FROM EMP;

SELECT EMPNO, ENAME, SAL, LAG(SAL,1,0) OVER (ORDER BY SAL DESC) AS PREV_ROW FROM EMP;

(OR)

SELECT E.*, LAG(SAL) OVER (ORDER BY SAL DESC) AS LAG
FROM EMP E;

FIRST_VALUE function :-
Its used to display first value from the specified column based on the order by clause.

SELECT EMPNO, ENAME, SAL, FIRST_VALUE(SAL) OVER (ORDER BY SAL) AS LOWEST_SAL FROM EMP;

SELECT EMPNO, ENAME, SAL, FIRST_VALUE(SAL) OVER (PARTITION BY JOB ORDER BY SAL) AS LOWEST_SAL FROM EMP;

LAST_VALUE function :-
Its used to display last value from the specified column based on the order by clause.

SELECT EMPNO, ENAME, SAL, LAST_VALUE(SAL) OVER (ORDER BY SAL) AS LOWEST_SAL FROM EMP;
As a result of this syntax it does not return expected results but it returns same rows.

SELECT EMPNO, ENAME, SAL, LAST_VALUE(SAL) OVER (ORDER BY SAL ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS HIGHEST_SAL FROM EMP;
This syntax gives you correct results.

PARTITION BY clause :-
This clause is used only with "Over" clause. It is used to breakup or divide table data into many partitions, we can apply aggregate functions and analytical functions.

SELECT DEPTNO, ENAME, SAL, AVG(SAL) OVER (PARTITION BY DEPTNO) AS AVG_SAL FROM EMP;

Monday, May 6, 2019

Correlated Subquery

Subquery which is depending on outer query output is called correlated subquery. First outer query is executed, based on the output, subquery is executed again based on the output, outer query is executed 2nd time.

In other words, a correlated subquery is a subquery that references a column from a table in the outer query, and uses that value as a condition in the subquery. The subquery is executed for each row returned by the outer query, so it is dependent on the data in the outer query.

1) Find the employees whose salary is higher than the average salary of their department.

select * from emp where sal >

(select avg(sal) from emp

where deptno = emp.deptno)

2) Find the customers who have made more than one purchase in the past month.

SELECT *

FROM customers c

WHERE (

SELECT COUNT(*)

FROM orders o

WHERE o.customer_id = c.customer_id

AND o.order_date >= DATEADD(month, -1, GETDATE())

) > 1

Friday, May 3, 2019

Subquery

What is Subquery? Explain the Properties of a Subquery?
A query which is nested within another query is called subquery.
If you want to fetch data from one table based on the input value of other table we write subqueries.
Here inner query is executed first, output will be passed to outer query, based on that outer query will be executed. To write a subquery we need a common column.
Table from which you expect output, write an outer query on that.
Table from which you expect input, write an inner query on that.

Two types of subqueries :
- Single row subquery: Single-row subqueries are used with Comparison Operators.
- Multi row subquery: Multiple-row subqueries are used with Logical Operators.

Single row subquery :
If the subquery generates single output value then its called Single row subquery.
equal to operator is used in the single row subquery.

Syntax :
Select dname from dept where deptno = (select deptno from emp where empno = 7654);

Multi row subquery :
If the subquery generates more than one output value then its called Multi row subquery.
In operator is used in Multi row subquery.

Syntax :
Select * from emp where deptno IN (Select deptno from dept where dname IN ('Accounting', 'Research'));

SELECT * FROM EMP WHERE SAL < (SELECT MAX(SAL) FROM EMP);

Cursors

A cursor is a pointer to this temporary work area (context area) created in the system memory, when a SQL statement is executed. Temporary work area is used to store the data retrieved from the database and manupulate this data. It can hold more than one row but can process one row at a time.

There are two different Types of cursors
- Implicit Cursors
- Explicit Cursors

Implicit Cursors : Implicit cursors are automatically created by oracle whenever SQL statement is executed. At the time of DML operations like insert, update and delete these cursors hold the data that needs to be inserted.
Here when you write a query completely table data will be processed in the cursor area.

Explicit Cursors : Explicit Cursors are created by users. They are used to display multiple records using program or procedure.
Here when you write a query only the requred columns data will be processed in the cursor area.
It should be defined in the declaration section of the PLSQL block and created on select statement.
Fetch operation can get only 1 record because pointer is on the first record inorder to fetch all the records 'Loop' is used.

4 steps to write a cursor
- Declare Cursor
- Open Cursor
- Fetch data from Cursor
- Close Cursor

Cursor Properties
- <cursor name> %ISOPEN
- <cursor name> %FOUND
- <cursor name> %NOT FOUND
- <cursor name> %ROWCOUNT