SSIS Fuzzy lookup
I am new to SSIS. I am using fuzzy lookup transromation in SSIS to find approx matching of data in my table. Now after fuzzy lookup transformation I have put conditional split. What I want my output is something like If I cannot get any matches then it should go to one table who does not have any records then if one match then go to another table who has only one match and more than one should go to another table having more than one match. So can any one suggest me what type of condtion should I put in conditonal split based on similarity and confidence level? and Shoud I use only similarity or similarity of particaular column? Please reply as soon as possible Thank you in advacne.
June 24th, 2011 9:59am

If you are on SSIS 2008 you probably should look into the Fuzzy Grouping Transformation http://msdn.microsoft.com/en-us/library/ms141764.aspx Example on how to solve with just a Look up: http://www.sqlteam.com/article/using-fuzzy-lookup-transformations-in-sql-server-integration-services Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
June 24th, 2011 10:11am

you should set joining fields in the columns tab of fuzzy lookup transform editor, and also you SHOULD set a Lookup columns on same tab, set one or more columns to be fetched from lookup table. for example let's suppose you selected a column and named alias as ColumnFromLookup1 . then in advanced tab, set Similarity threshold to a value appropriate to your fuzzy data logic. then in the conditional split you can check the _similarity column value with this expression: _similarity>=0.5 Note that you should put your threshold value instead of 0.5 or you can check the lookup columns like this: !ISNULL(ColumnFromLookup1) in each of cases above which are equivalent together you can redirect conditional split result to destination as you want. Just note that fuzzy lookup like other fuzzy operations need to check time by time to find best threshold for your data. http://www.rad.pasfu.com
June 24th, 2011 10:32am

Thank you Reza for your quick reply But you know what I am not understanding like what should be my condition? I am putting threshold 0 so now default output of conditional spit will have no matches. Then if i want only 1 matches and mroe then one mathes ?you mean to say that i have to put condtion based on _similarity>= 0.50? Thats what should I do?
Free Windows Admin Tool Kit Click here and download it now
June 24th, 2011 10:42am

why you set the threshold as 0 ? you can set it to 0.50 for first time. then write this expression in conditional split: _similarity >=0.50 then set this new output name as MatchedOutput then connect this matchedoutput to the destination you want. also put data viewer after fuzzy lookup transform, and see what is results. are those results appropriate in your opinion? or you think you should raise the threshold? or you should reduce it?http://www.rad.pasfu.com
June 24th, 2011 10:48am

Okey I guess I should try to do it as you say if it will work or not. But thank you very much.
Free Windows Admin Tool Kit Click here and download it now
June 24th, 2011 10:51am

When I am running my package by putting OLEDB destination in output I am getting error like "[SSIS.Pipeline] Warning: The output column "PIC" (17) on output "OLE DB Source Output" (11) and component "REP_DBT_ACTPART_NEW" (1) is not subsequently used in the Data Flow task. Removing this unused output column can increase Data Flow task performance." For all column name. Do not know why? and what does this error means?
June 24th, 2011 11:20am

and one more thing to notify is that column names which all gives this error are in my lookup table but I am not performaing any matches on them so just leaving as it is but then its throwing error like above. and there are no such column name there in OLEDB Destination table As I am creating that table.
Free Windows Admin Tool Kit Click here and download it now
June 24th, 2011 11:25am

seems that you had metadata changes in source or destination. you need to re-map columns or drop and re-create source/destination/transforms.http://www.rad.pasfu.com
June 24th, 2011 11:30am

and one more thing what if I want to use catch transformation before usinf fuzzy lookup transformation? Because I saw that My package is running but very slowly. SO where should I use it if i want to use it for reference table? and how?
Free Windows Admin Tool Kit Click here and download it now
June 24th, 2011 11:42am

did you created an index for your lookup table in the fuzzy lookup transform?this should raise performance you can not use cache transform in the fuzzy lookup, that option is only available at lookup transform.http://www.rad.pasfu.com
June 24th, 2011 1:37pm

Yes I am creating index. but stil its throwing error and error comes from fuzzy look up only "[Fuzzy Lookup [424]] Error: The table "[FuzzyLookupMatchIndex]" does not appear to be a valid pre-built match index. This error occurs if the metadata record cannot be loaded from the specified pre-built index." "[Fuzzy Lookup [424]] Error: An error was encountered when trying to delete the existing fuzzy match index named "[FuzzyLookupMatchIndex]". It is possible that this table was not created by Fuzzy Lookup (or this version of Fuzzy Lookup), it has been damaged, or there is another problem. Try manually deleting the table named "[FuzzyLookupMatchIndex]" or specify a different name for the MatchIndexName property." "[SSIS.Pipeline] Error: component "Fuzzy Lookup" (424) failed the pre-execute phase and returned error code 0xC0208371." All these errors causes fuzzy lookup transformation fail
Free Windows Admin Tool Kit Click here and download it now
June 24th, 2011 2:01pm

this error says that a problem happened to the index FuzzyLookupMatchIndex. try to set another new name for index in the fuzzy lookup editor and let me know the resulthttp://www.rad.pasfu.com
June 24th, 2011 4:17pm

I tried by changing fuzzylookupMatchIndex Name and package processed fully but at the end it again gave error. Error: [SSIS.Pipeline] Error: component "Fuzzy Lookup" (424) failed the pre-execute phase and returned error code 0x8007000E. Warning: [Fuzzy Lookup [424]] Warning: Not enough storage is available to complete this operation. Warning: Warning: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED. The Execution method succeeded, but the number of errors raised (1) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors. Do not understand what does these mean? and why its giving error?
Free Windows Admin Tool Kit Click here and download it now
June 27th, 2011 5:04pm

Does this error mean that where I am stroing my results in table that does not have enough memory or from where I am taking my both lookup and refernce table that does not have enough memory?
June 27th, 2011 5:51pm

I want to follow the process of the following link http://www.sql-server-performance.com/2007/data-cleaning-ssis/ but its a very big process and it is for data cleansing not sure if its applicable to my problem or not? PLEASE LET ME KNOW AS SOON AS POSSIBLE. Thank you in advance
Free Windows Admin Tool Kit Click here and download it now
June 27th, 2011 9:55pm

Whatever method you use Fuzzy lookup always have memory problem. After using method of the link it again shows warning " [Fuzzy Lookup [199]] Warning: Not enough storage is available to complete this operation." and error "[SSIS.Pipeline] Error: component "Fuzzy Lookup" (199) failed the pre-execute phase and returned error code 0x8007000E." at the end one warning " Warning: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED. The Execution method succeeded, but the number of errors raised (1) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors." I can not understand how can I get out from this memory space problem?
June 27th, 2011 11:19pm

You apparently have hit the resources limit try using a more robust machine or reduce the number of columns involved.Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
June 27th, 2011 11:23pm

But the problem is that I have been given this task to do. I have come across these tables for the firsr time. if I reduce the number of records in referecne tbale then How can I come to know if its perfectly working or not? and I know my refernce table has almost 16 to 18 millions of records. How can I reduce number of rows in that.
June 27th, 2011 11:38pm

I want to reduce number of columns of my reference table but unforunately in fuzzy lookup transformation there is no such option so that you can select specific columns.
Free Windows Admin Tool Kit Click here and download it now
June 28th, 2011 11:25am

Hey I have done it. I reduced numbers of columns as well as rows also in my reference table and done fuzzy look up Handed to my manager the result and he was very satisfied about the results So thank you all thank you very much for being with me.
June 30th, 2011 3:59pm

Told you to remove the number of columnsArthur My Blog
Free Windows Admin Tool Kit Click here and download it now
June 30th, 2011 4:05pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics