Slowly Changing Dimension task really really slow
I'm trying to use the Slowly Changing Dimension task to update/insert into a dimension table that has over 500,000 rows and over 40 fields. During last nights ETL test run it had ran for over 4 hours before I stopped the job. Couple of questions: 1. Is this the right task for the job? 2. Are there property settings that can be set to speed this up? Any help would be appreciated. Thanks, Jeff
June 26th, 2008 5:42pm

The SCD transformation performs row-by-row database calls for each row that passes through the data flow. This often makes it too slow for the data volumes you're describing here. Although creating and/or tuning indexes on the dimension table can speed things up, generally it is better to "roll your own" SCD functionality with a Merge Join transformation if you have very large data volumes.
Free Windows Admin Tool Kit Click here and download it now
June 26th, 2008 6:13pm

Hi Jeff The SCD transformation can be made to run much faster than what you describe with specific indexing on: - your Business Key column(s) - your Start Date, End Dateand Current Record columns (if used) If you have a high volume of new Business Keys, you can divert them with an upstream Lookup, bypass the SCD for those rowsand slap them straight in with a OLE DB Destination on fast load. It can be worth persevering with the SCD to avoidthe pile of specific components needed to achieve the same result. There is no comparison when considering package maintainability:addingsome new columns?just re-run the wizard... Good luck Mike
July 10th, 2008 10:04am

There is another alternative that gets you the benefits of the SCD Wizard (easy config) with higher performance - the Kimball Method SCD component on CodePlex.
Free Windows Admin Tool Kit Click here and download it now
May 19th, 2009 7:38pm

Using Merge Join and Conditional Split to create your own "Slowly Changing Dimension" will be very fast on processing large amount of records. I tested on 3 million records it takes about 1 minute.
July 26th, 2009 5:33pm

If your dimension is a type 1 dimension and your source is SQL 2008 you can retrieve only changes from the source by using change tracking rather than loading all data from the source and pushing it through SCD. If that is the case see how I do this in my blog: http://luisefigueroa.blogspot.com/2010/09/super-fast-data-warehouse-dimensions.html Regards, Luis
Free Windows Admin Tool Kit Click here and download it now
November 12th, 2010 5:06pm

You could use the standard lookups in SSIS to accomplish a faster SCD: http://microsoft-ssis.blogspot.com/2011/01/slowly-changing-dimension-alternatives.html If you have a lot of updates you can insert the updates to a second table and do a batch update with the Execute SQL Task.
January 7th, 2011 3:08am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics