SSIS with DQS Performance Issues

I recently started working on SSIS.

I have a SSIS package which loads around a Million records from various Excels files. This is pretty fast like a minute or 2 .

After this is done cleansing activity starts and this is where my problem starts. I have like 20 domains in my DQS in which some of them do basic checks like length of data in column is not more than a certain value(like Name cant be more than 50) and some validate the data based on regular expression like phone number matches a pattern etc.

I also have a composite domain on top of these which internally have 6-7 rules.

like rule 1 if Domain 1 is a particular value Domain 2 cant be empty

When i cleanse 1M records directly from DQS it takes around 40 min to cleanse. When i do the same task on same data through SSIS it takes more than 8 hours and aborts some times

I have tried to increase the DefaultBufferMaxRows to 100000, DefaultBufferSize to 104857600 for the project.

I also tried to update DCChunkSize to 100000 from 10000 in [A_CONFIGURATION] table.

I even used Balanced Data Load Distributor to split the load into 4 cleansing tasks but none or all of them couldn't make a significant improvement. These slightly slightly reduced time by an hour or so, but sometimes it fails due to memory constraints.

Is this the max performance we could get in SSIS?

Also takes uses 100% CPU every time and doesn't free most of it even after finishing or stopping Visual studio. I have to end SQLServer From task manager.

I am running this on 16GB m

July 26th, 2015 5:37pm

Hi rp,

I am trying to involve someone more familiar with this topic for a further look at this issue. Sometime delay might be expected from the job transferring. Your patience is greatly appreciated.

Thank you for your understanding and support.

Regards,
Katherine Xiong

Free Windows Admin Tool Kit Click here and download it now
July 29th, 2015 9:38pm

Hi Rango,

My suggestion is that please enable the SSIS logging to find where is the bottleneck of the performance issue. Is it on source/ destination/ transformation ? In the meantime, we could check the perfmon result of the ssis process contributes to the most CPU/memory on the server. 

Regards,

Doris Ji

August 2nd, 2015 10:17pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics