Compute MD5 hash value using script component

Hi ,

I am looking to convert all the source column to md5 hash value,

I never worked on this scenario.below is my code , please suggest me where and what i need to change in code to get md5 value in derived column hashvalue of all incoming rows.

My script component code is :-

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Pipeline.ScriptComponentHost
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.Data.SqlClient
Imports System.Data.SqlClient.SqlConnection
Imports Microsoft.SqlServer.Dts.Pipeline
Imports System.Text
Imports System.Security.Cryptography

<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute()> _
<CLSCompliant(False)> _
Public Class ScriptMain
    Inherits UserComponent
    Dim connMgr As IDTSConnectionManager100
    Dim sqlConn As SqlConnection
    Dim sqlCmd As SqlCommand
    Dim sqlParam As SqlParameter
    Private inputBuffer As PipelineBuffer
    Public Overrides Sub AcquireConnections(ByVal Transaction As Object)
        connMgr = Me.Connections.Connection
        sqlConn = CType(connMgr.AcquireConnection(Nothing), SqlConnection)
    End Sub

    Public Overrides Sub PreExecute()
        sqlCmd = New SqlCommand("SELECT * FROM TestEmpS ", sqlConn)
        'sqlParam = New SqlParameter("@filem", SqlDbType.VarChar)
        'sqlCmd.Parameters.Add(sqlParam)
    End Sub

    Public Overrides Sub CustomerRecordsInput_ProcessInputRow(ByVal Row As CustomerRecordsInputBuffer)

        Dim reader As SqlDataReader
        ' sqlCmd.Parameters("@KeyCustomer").Value = Row.CUNO
        reader = sqlCmd.ExecuteReader()
        If reader.Read() Then
            'do all field comparisons here to determine if
            '    the record changed since the last ETL.
            If (reader("Name").ToString() <> Row.Name) Then
                ' Row.DirectRowToUpdateRecordsOutput()
                Row.DirectRowToInsertRecordsOutput()
            ElseIf (reader("Name").ToString() = Row.Name) Then
                Row.DirectRowToIgnoreRecordsOutput()
            End If
        Else
            Row.DirectRowToInsertRecordsOutput()
        End If
        reader.Close()
        ' If sqlConn.State = ConnectionState.Open Then sqlConn.Close()

    End Sub
    Public Overrides Sub ReleaseConnections()
        connMgr.ReleaseConnection(sqlConn)
    End Sub
End Class

Thanks in advance.

Regards,

Vipin jha

February 8th, 2015 10:25am

Hi Vipin,

http://blogs.msdn.com/b/csharpfaq/archive/2006/10/09/how-do-i-calculate-a-md5-hash-from-a-string_3f00_.aspx explains how to calculate MD5 hash from input.

I suggest you make it a function in your script task.

PS: Do not do SELECT * in ETL ever

Free Windows Admin Tool Kit Click here and download it now
February 8th, 2015 5:41pm

Hi Arthur,

Actually I never worked on this scenario where I have to use s single Data flow task for loading 100 of different table from source to destination and every table have different number of column.

By searching on internet I found that we can achieve this by using script component and we can get new ,updated and same records on the basis of hash value by comparing from source to destination.

in my below code below points I want to do

1-I want to take the sql query frokm  variable like @sqlquery

2-I want to compute all the column to hasvalue

3-I want to split the result on the basis of below 3 point

New records:- ISNULL(Archive_BBxKey)
Updated   :- BBxKey == Archive_BBxKey && RowChecksum != Archive_RowChecksum
Unchsnaged :- BBxKey == Archive_BBxKey && RowChecksum == Archive_RowChecksum

Imports System

Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Pipeline.ScriptComponentHost
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.Data.SqlClient
Imports System.Data.SqlClient.SqlConnection
Imports Microsoft.SqlServer.Dts.Pipeline
Imports System.Text
Imports System.Security.Cryptography

<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute()> _
<CLSCompliant(False)> _
Public Class ScriptMain
    Inherits UserComponent
    Dim connMgr As IDTSConnectionManager100
    Dim sqlConn As SqlConnection
    Dim sqlCmd As SqlCommand
    Dim sqlParam As SqlParameter
    Private inputBuffer As PipelineBuffer
    Public Overrides Sub AcquireConnections(ByVal Transaction As Object)
        connMgr = Me.Connections.Connection
        sqlConn = CType(connMgr.AcquireConnection(Nothing), SqlConnection)
    End Sub

    Public Overrides Sub PreExecute()
        sqlCmd = New SqlCommand("SELECT * FROM TestEmpS ", sqlConn)   'want to take from variable 
        'sqlParam = New SqlParameter("@filem", SqlDbType.VarChar)
        'sqlCmd.Parameters.Add(sqlParam)
    End Sub

    Public Overrides Sub CustomerRecordsInput_ProcessInputRow(ByVal Row As CustomerRecordsInputBuffer)

        Dim reader As SqlDataReader
        ' sqlCmd.Parameters("@KeyCustomer").Value = Row.CUNO
        reader = sqlCmd.ExecuteReader()
        If reader.Read() Then
            'do all field comparisons here to determine if
            '    the record changed since the last ETL.
            If (reader("Name").ToString() <> Row.Name) Then
                ' Row.DirectRowToUpdateRecordsOutput()
                Row.DirectRowToInsertRecordsOutput()
            ElseIf (reader("Name").ToString() = Row.Name) Then
                Row.DirectRowToIgnoreRecordsOutput()
            End If
        Else
            Row.DirectRowToInsertRecordsOutput()
        End If
        reader.Close()
        ' If sqlConn.State = ConnectionState.Open Then sqlConn.Close()
    End Sub
    Public Overrides Sub ReleaseConnections()
        connMgr.ReleaseConnection(sqlConn)
    End Sub
End Class

If you change my above code and provide wit the changes, i will be grateful for you.

regards,

Vipinn jha

February 9th, 2015 12:18am

I understood you need to generate MD5 to make sure the data in the source matched to what is loaded to the destination. This can be done as the last step.

T-SQL offers MD5 calculation in form of HASBYTES function https://msdn.microsoft.com/en-us/library/ms174415.aspx

But loading different tables dynamically in ETL will require you to generate the package in code that would take care of all the tables.

Perhaps instead you can leverage the canned SSIS components as Transfer SQL Object instead which can move the tables and the data.

Free Windows Admin Tool Kit Click here and download it now
February 15th, 2015 12:19pm

Hi Vipin, 

besides what Arthur suggested (HASHBYTES is a very useful function), you may want to check-out this Codeplex site (http://ssismhash.codeplex.com/) to use this component. It generates row-by-row hash codes and you can choose many algorithms (just like using HASHBYTES, but at data-flow level)

Regards.

Pau

February 16th, 2015 4:22am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics