SSIS package very slow when loading records
Hello, I have a very large table that I am loading data daily (over 5 million records in table with over 200 K daily inserts). I am using SSIS package to load a single table. In my ssis package I have the following: 1. Foreach Loop to load flat files 2. Bulk Insert Task to load into a staging table which is truncated every time loading a new file 3. A dataflow task with Andy Leonard’s method: a. OLE DB Data Source with a query b. Lookup task c. Conditional Split d. Loading new records into the destination table via a OLE DB Destination Task. The process was very fast until the records grow substantially. There are approximately 5 million records in the destination table and now it takes forever process. I have filtered the query in the Lookup task to only look at the data from the previous 7 days – in order to avoid bringing all 5 million records into the Lookup task cache…now only about 300 K records are cached in the lookup task. My destination table is a single flat table with 275 columns, the same as the files. The primary key is a multi-valued key with a non-clustered index. Initially, it was a clustered index, because it was very slow, I changed it to a non-clustered index but it became slower. Does anyone know what the best solution is to optimize the loading? Kajo
November 12th, 2010 6:10pm

Couple of questions: Your lookup task, how many rows does it cache each time, and what table is it reading? What kind of lookup are you doing, and is the result of the lookup being used in the Conditional Split? Next: What is the access mode used in the Destination? "Table or view" or "Table or view fast load"? Your Destination has a multi-field Primary Key? Any foreign keys it needs to worry about? Maybe the best thing would be to load the data right into a staging table (truncate it before each load), then use T-SQL logic to INSERT only the rows you need from the staging table to the main table. If your Database is SQL 2008 Compatibility Level, then look into the MERGE statement.Todd C - MSCTS SQL Server 2005 - Please mark posts as answered where appropriate.
Free Windows Admin Tool Kit Click here and download it now
November 12th, 2010 9:11pm

Hi Kajo, Just add to Todd, the first important thing you should do is identifying the bottlenecks. You can investigate each part of your data-flow/control-flow in isolation. Once you've identified the bottleneck, I think you will receive more useful responses and suggestions. http://technet.microsoft.com/en-us/library/cc966529.aspx (A link you may want to check, you can simply start from the ‘Measuring Performance’) Hope this helps, Raymond Raymond Li - MSFT
November 15th, 2010 7:44am

What cache mode are you using on your lookup task? if you are using no cache or if you are using partial cash, is the field that you are doing the lookup on indexed? when you run the package from BIDS do you see the lookup task been active (yellow) during most of the executionof the package or does it complete long before the package finishes execution? This should give you a clue as to where is the bottleneck. Also, if you are using full cache mode (whiche i assume you are not due to the size of the table) keep in mind that the package will load the whole table into memory before starting doing the lookups. if you don't have enough memory, the memory buffers will overflow to the swap file really slowing down the package.
Free Windows Admin Tool Kit Click here and download it now
November 15th, 2010 10:12am

What cache mode are you using on your lookup task? if you are using no cache or if you are using partial cash, is the field that you are doing the lookup on indexed? when you run the package from BIDS do you see the lookup task been active (yellow) during most of the executionof the package or does it complete long before the package finishes execution? This should give you a clue as to where is the bottleneck. Also, if you are using full cache mode (whiche i assume you are not due to the size of the table) keep in mind that the package will load the whole table into memory before starting doing the lookups. if you don't have enough memory, the memory buffers will spill down to the swap file really slowing down the package.
November 15th, 2010 10:14am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics