Removing Duplicates in SSIS
Hi, I am having problem while removing duplicate values in ssis. I have tried using SORT Transform but found that it is not efficient as this doesn't give any control on which duplicate record is deleted. So please tell me an efficient way to do this. Thanks in advance.
June 17th, 2011 4:32am

Then another alternative is take data into temp table then apply distinct on columns you wants .... but its not right option on very large table..... by the way how big is your table.... how many columns are you looking for uniqueness?http://uk.linkedin.com/in/ramjaddu
Free Windows Admin Tool Kit Click here and download it now
June 17th, 2011 4:55am

i have Just 10 columns can u please tell me the query sintax Query : select distinct a, is null (x,y) as b,c,d from z Please rewrite the above quey with correct syntax so that i can use distinct here
June 17th, 2011 5:09am

Hi, I am having problem while removing duplicate values in ssis. I have tried using SORT Transform but found that it is not efficient as this doesn't give any control on which duplicate record is deleted. So please tell me an efficient way to do this. Thanks in advance. I covered this issue on my blog a few months ago, it may prove useful to you: Sort transform arbitration (http://sqlblog.com/blogs/jamie_thomson/archive/2009/11/12/sort-transform-arbitration-ssis.aspx) Don't miss the comments - they may provide an answer to your question. @jamiet http://sqlblog.com/blogs/jamie_thomson/ | @jamiet | About me
Free Windows Admin Tool Kit Click here and download it now
June 17th, 2011 5:14am

Hey Jamiet, just need a quick clarification on what you have said. I was been advised not to use this sort transform (Remove Rows with Duplicate Sort Values) option Saying that sort doesn't have control on deletion of the duplicate rows. Is it True. Please let me know whether the above statement is true or False. Please let me know the reason so that i can explain it to my client. Thanks in advance..
June 17th, 2011 5:30am

Hi, I am having problem while removing duplicate values in ssis. I have tried using SORT Transform but found that it is not efficient as this doesn't give any control on which duplicate record is deleted. So please tell me an efficient way to do this. Thanks in advance. What kind of control are you looking for? (The first, last, max value?) That can be solved by an aggregation/group by: a) First load data in staging table and then use a group by query as source (instead of just the whole table) b) Or use the aggregate transformation (but max, min, etc. don't work on string columns) Or an alternative: c) Sort the data (via source query or sort transformation) and use a Script Component to to more complex comparisonsPlease mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com
Free Windows Admin Tool Kit Click here and download it now
June 17th, 2011 5:45am

can u be more specific on how we can achieve it using Script Component. I have 7000 records in my database with null values and duplicate values. I want to remove all those rows. I think that was because of the duplicated rows of other columns. I tried loading using aggregate transform using group by function it returned me duplicate rows. I tried using sort (Using Remove Rows with Duplicate Sort Values), it retrieved unique rows for a particular column on which i applied sort. But i was advised not to use sort. Now my point is how to remove all the duplicate records with out using sort and Aggregate transform as i have already used it. Please advice if i am in right way or not. Thanks in advance.
June 17th, 2011 6:22am

I have 7000 records in my database with null values and duplicate values. I want to remove all those rows. Can you give us an example of a couple of rows and tell us which one you want to keep? example (in this case only keep 1st and 4th row or also the last row): keycolumn attributecolumn 1 test 1 test 1 NULL 2 test 3 NULL Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com
Free Windows Admin Tool Kit Click here and download it now
June 17th, 2011 6:45am

can u be more specific on how we can achieve it using Script Component. About the Script Component... You can compare row values with a Script Component. Here is an example of that: http://microsoft-ssis.blogspot.com/2011/04/compare-values-of-two-rows.html You could also create more than one output column (one for the good records and one for the duplicate records). Here is an example of that: http://microsoft-ssis.blogspot.com/2010/12/conditional-multicast-script.html But... Do you have any experience with VB.net or C#? Otherwise these example aren't very useful for you.Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com
June 17th, 2011 6:51am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics