Tuning SSIS - Sort transformation vs using T-SQL Order statement
Hi Folks I have read it is better to use a T-SQL ORDER statement in SSIS Source Connection Manager rather than a Sort transformation inside a SSIS package (which is fully blocking). However, when I compare the two approaches (Figure 1), both take the same length of time to run (c. 7 seconds on my machine). Can someone explain what I am missing? Regards Steve Steve
October 27th, 2012 7:24pm

Hi Steve, the SORT transformation is a blocking transformation. Read more http://www.bidn.com/blogs/Daniel/ssas/1361/ssis-blocking-non-blocking-and-partially-blocking-transformations Yes, it is a good practice to SORT data at source because one would always want to avoid a SSIS pipeline chokeup by loading the entire dataset in to the memory.Please vote as helpful or mark as answer, if it helps Cheers, Raunak | t: @raunakjhawar | My Blog
Free Windows Admin Tool Kit Click here and download it now
October 28th, 2012 3:46am

Thanks Raunak So why does the package with the SORT transformation in it take the same length of time to execute as the the package with the sort at source (T_SQL with ORDER syntax)? In my test both packages took 7 seconds to complete. So I didn't see any quantifiable performance gain sorting at source. Thats what is perplexing me. Regards Steve
October 28th, 2012 4:56am

Sort in SSIS is done in memory (fully blocking). As long as there is plenty memory left there is probably no issue... still I would advice you to use the database sort.Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com | Twitter
Free Windows Admin Tool Kit Click here and download it now
October 28th, 2012 10:17am

Thanks Raunak So why does the package with the SORT transformation in it take the same length of time to execute as the the package with the sort at source (T_SQL with ORDER syntax)? In my test both packages took 7 seconds to complete. So I didn't see any quantifiable performance gain sorting at source. Thats what is perplexing me. Regards Steve Hi Steve, You may probably have to experiment with huge volume of data, There is a definite performance gain in avoiding a Fully blocking transformation like Sort Transformation.Rajkumar
October 28th, 2012 10:23am

I would say that in both cases it is "fully blocking", i.e. you have to wait until all data are sorted in SSIS or in the database and then start processing. As long as there is enough RAM, there should be no much difference. 7 seconds points to small set of data, so I really wouldn't worry at all. If you are loading data from OLTP DB and there is a need to minimize load on the database server, sort in SSIS. If you can add index to the data in the database then you will not have to sort at all when using ORDER BY.
Free Windows Admin Tool Kit Click here and download it now
October 28th, 2012 8:33pm

Thanks for the reply. The 7 second example was just a small scale test (20 MB file). In the production environment the target table is >50 GB in size. The aim of creating the small scale test environment was to understand the performance difference of the two approaches (sort at source vs sort in the package). But I was curious why there was no difference in performance. I will increase the size of the source data file (currently 20MB) in the test to put more pressure on the RAM and see what happens. Regards Steve
October 28th, 2012 10:00pm

7 second is not good time to compare and on the other hand SSDT is not good comparison tool for performance as well, because it will take some time to load GUI based on execution of tasks and components. so my suggestion is : 1- try with huge number of records to feel the difference 2- deploy your package to SSIS Catalog and run it from there and final word is that: yes database order by will be more efficient (if you try with bigger number of rows you will feel the difference)http://www.rad.pasfu.com
Free Windows Admin Tool Kit Click here and download it now
October 28th, 2012 10:01pm

Also, for the bigger data sets be sure to use 64-bit execution environment (if you wish to sort in SSIS) BIDS always execute in 32-bit mode, so use SQL Agent job or 64-bit DTEXEC.
October 28th, 2012 10:28pm

Steve, what you must understand here is that no package design will just include a source and a SORT transformation, they will be complex with more complex scenarios. Hence it is adviced that avoid full blocking transformation such as SORT in SSIS. Please vote as helpful or mark as answer, if it helps Cheers, Raunak | t: @raunakjhawar | My Blog
Free Windows Admin Tool Kit Click here and download it now
October 29th, 2012 1:12am

Hi Reza Thanks for the suggestion. I tested a package that connected to a data source that was approximately 10 times of the original test (14 millions vs. 1.4 million rows). The difference was significant- the ORDER clause was 2.5 X faster. Regards, Steve
October 29th, 2012 4:02am

Hi Steve, Thanks for putting result of your testing here, this will help lots of questions in this area :) Regards, Rezahttp://www.rad.pasfu.com
Free Windows Admin Tool Kit Click here and download it now
October 29th, 2012 8:49pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics