Hashing in Informatica Using MD5 (Avoid Too Many Column Joins)

           

We often come across situations where we have to join on many columns: to find lookup values, to calculate delta values, or in SCD2 processing.

Informatica provides functions that combine all the column values into a single, practically unique value, which we call a hash value. (In Oracle, besides ORA_HASH, hash values can be generated using the following functions:)

dbms_utility.get_hash_value()
dbms_crypto.hash()

The hash values that Informatica generates can be used for comparison purposes, so a join on many columns reduces to a join on a single column.

The following are some of the hashing functions available in Informatica 9.1:

MD5() : Calculates the checksum of the input value. The function uses Message-Digest algorithm 5 (MD5). MD5 is a one-way cryptographic hash function with a 128-bit hash value. You can conclude that input values are different when the checksums of the input values are different. Use MD5 to verify data integrity.
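To make the idea concrete outside of Informatica, here is a minimal Python sketch of hashing several column values into one MD5 value. The helper name `row_hash`, the `|` delimiter, and the treatment of NULL as an empty string are assumptions chosen for illustration, not something Informatica prescribes.

```python
import hashlib

def row_hash(*columns, delimiter="|"):
    """Return a 32-character hex MD5 digest for a set of column values."""
    # Assumption: None (NULL) is treated as an empty string.
    # The delimiter keeps ("ab", "c") and ("a", "bc") from producing the same input text.
    text = delimiter.join("" if c is None else str(c) for c in columns)
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Hypothetical three-column rows
h1 = row_hash("John", "Smith", "NY")
h2 = row_hash("John", "Smith", "NJ")
print(h1)
print(h2)
print("rows differ:", h1 != h2)
```

In an Informatica expression the same idea would look roughly like `MD5(FIRST_NAME || '|' || LAST_NAME || '|' || STATE)`, where the port names are hypothetical.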


CRC32(): Returns a 32-bit Cyclic Redundancy Check (CRC32) value. For example, suppose you want to read data from a source across a wide area network and make sure the data has not been modified during transmission. You can compute the checksum for the data in the file and store it along with the file. When you read the source data, the PowerCenter Integration Service can use CRC32 to compute the checksum and compare it to the stored value. If the two values are the same, the data has not been modified.
This function can also be used to generate a hash, but Informatica does not recommend it for that purpose, because it can generate the same hash for different strings. A 32-bit value leaves far less room than the 128-bit MD5 value, so collisions are much more likely.
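The checksum workflow described for CRC32 can be sketched in Python with the standard `zlib` module. This is only an illustration of the store-and-compare idea; the sample data and the helper name `checksum` are made up.

```python
import zlib

def checksum(data: bytes) -> int:
    """Return the CRC32 checksum of the data as an unsigned 32-bit integer."""
    return zlib.crc32(data) & 0xFFFFFFFF

# Hypothetical file content; the checksum is stored alongside the file.
original = b"id,name\n1,Alice\n2,Bob\n"
stored = checksum(original)

# After transmission: one copy arrives intact, one with a single corrupted byte.
received_ok = b"id,name\n1,Alice\n2,Bob\n"
received_bad = b"id,name\n1,Alice\n2,Bub\n"

print("intact copy matches:   ", checksum(received_ok) == stored)
print("corrupted copy matches:", checksum(received_bad) == stored)
```

A matching checksum here indicates the data arrived unmodified. For use as a join key, the 32-bit result is the limiting factor, which is why MD5 is the better fit for the hashing use case.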

A sample example showing hashing:



The generated values look like this:



These generated values can then be used in joins instead of all three columns in the example above.
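As a rough illustration of joining on the hash instead of on every column, the Python sketch below detects new or changed rows (a delta) by comparing a single MD5 value per row. The three-column rows, the `row_hash` helper, and the `|` delimiter are hypothetical and only stand in for the mapping described above.

```python
import hashlib

def row_hash(*columns, delimiter="|"):
    """Return a 32-character hex MD5 digest for a set of column values."""
    # Assumption: None (NULL) is treated as an empty string.
    text = delimiter.join("" if c is None else str(c) for c in columns)
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Hypothetical target (already loaded) and source (incoming) rows
target_rows = [("John", "Smith", "NY"), ("Ann", "Lee", "CA")]
source_rows = [("John", "Smith", "NY"), ("Ann", "Lee", "TX"), ("Raj", "Patel", "WA")]

# One hash per target row replaces a three-column comparison.
target_hashes = {row_hash(*row) for row in target_rows}

# A source row is new or changed when its hash is not present in the target.
new_or_changed = [row for row in source_rows if row_hash(*row) not in target_hashes]

for row in new_or_changed:
    print(row)
```

The comparison is on one value per row regardless of how many columns feed the hash, which is the point of the technique.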