How to compare multiple rows in an SSIS package
Hi all,I'm new to SSIS, but have been studying/evaluating it for a new project I am about to start on. Pure SQL Server environment with a Cognos 8 front end.One of the transformations I need to do is to calculate the difference between blocks of different records for each week. Consider the following example of hockey teams scores week over week.Key Week Team Score1 Week 1 Red 152 Week 1 Blue 123 Week 1 Orange 54 Week 1 Green 05 Week 2 Orange 86 Week 2 Green 37 Week 2 Red 28 Week 2 Blue 1Transformation Rule 1Calculate the point spread between the first and second place teams. So in my example above, I would want to calculate a new column called WonBy which would calculate as followsKey Week Team Score WonBy1 Week 1 Red 15 3 (1st place = 15 - 2ns place=12)2 Week 1 Blue 12 03 Week 1 Orange 5 04 Week 1 Green 0 05 Week 2 Orange 8 5 6 Week 2 Green 3 07 Week 2 Red 2 08 Week 2 Blue 1 0The 2nd tranformation rule I need to apply is similar, but compare each team against the winner. Lets call this DiffToWinnerKey Week Team Score Diff to Winner1 Week 1 Red 15 02 Week 1 Blue 12 -3 (12 - 15)3 Week 1 Orange 5 -10 (5 - 15)4 Week 1 Green 0 -15 (0 - 15)5 Week 2 Orange 8 0 6 Week 2 Green 3 -5 (3 - 8)7 Week 2 Red 2 -6 (2 - 8)8 Week 2 Blue 1 -7 (1 - 8)Can SSIS do something like this for me? Basically compare rows within a sort group against each other?I've written this in C#, and needed to buffer the rows until a sort break on week, then loop through the rows in the week 4 times to calculate the various metrics.Hopefully, this example makes sense the way I've laid it out. Any help would be appreciated...Thanks for your help...Garry
January 28th, 2010 5:47am

SSIS can help you here. there are 2 ways to doing this. 1> Use the Script component in DFT Task and need to write the code. What you need to do here is a) Make a group on week.... you can take help from the following link http://quest4gen.blogspot.com/2009/10/allocating-set-id-extracting-data-in.html b) for a group you need to find out the top 2 scorer and save them in local variable. 2> Second approach is Just dump your data to the SQL table and then you need to write a Stored procedure . This stored procedure will return you the following columns Key Week Team Score WonBy Diff to Winner Call this SP in OLEDB Source as SQL command. The second approach is more suitable and easy to implement. Thanks-Let us TRY this | Dont forget to mark the post(s) that answered your question http://quest4gen.blogspot.com/
Free Windows Admin Tool Kit Click here and download it now
January 28th, 2010 9:19am

You have two other options inside SSIS that I can think of.First, if you know the "team names" and know they will never change (probably unlikely) then you can use the Pivot Transformation to change your rowset from one row per team to one row per week, with the team names as column headings, and the scores as values in those columns. Then you can use a Derived Column component to do fairly easy intra-row math to find the winner and differential. If you then needed the rows back in the one row per team, use the Unpivot component to change them back.Second, after sorting the rows by week and score, you could use a Script or the Row Number (not Row Count) component to number the rows within the week, such that you get a "first place", "second place" type of ranking for each week. Multicast that flow into two flows (A + B). In flow B, add a Derived Column component to decrement the "rank" by one. Merge Join A + B back together by week and rank - you now have "adjacent" ranked teams for each week in the same row. Simple intra-row math with a Derived Column component (no Pivoting or Unpivoting required) to get your differentials.
January 28th, 2010 10:41am

Check this approach if it helps you:This is the table and the data as you have given in original post: CREATE TABLE Hockey_Stats( KeyId INT PRIMARY KEY, WeekName VARCHAR(10), TeamName VARCHAR(10), Score INT) Go INSERT INTO Hockey_Stats SELECT 1,'Week1','Red',15 UNION SELECT 2,'Week1','Blue',12 UNION SELECT 3,'Week1','Orange',5 UNION SELECT 4,'Week1','Green',0 UNION SELECT 5,'Week2','Red',2 UNION SELECT 6,'Week2','Blue',1 UNION SELECT 7,'Week2','Orange',8 UNION SELECT 8,'Week2','Green',3 GO ALTER TABLE Hockey_Stats ADD DiffToWinner INT ALTER TABLE Hockey_Stats ADD WonBy INT Create a package with 4 variables:MaxScore and SecondMaxScore as IntMaxScorePerWeek as object variableWeekName as StringSTEP1:Execute SQL Task (to capture the max score and week name):Execute SQL Task Editor:General: ResultSet as Full Result Set SLQ Statement as SELECT MAX(Score),WeekName FROM Hockey_Stats GROUP BY WeekNameResult Set: Result Name Variable Name ---------------------------------------------------- 0 User::MaxScorePerWeekSTEP2:Take a Foreachloop container and open the foreach loop editor:Collection: Enumerator as Foreach ADO Enumerator ADO Object Source Variable as User::MaxScorePerWeekVariable Mappings: Variable Index -------------------------------- User::MaxScore 0 User::WeekName 1STEP2a):Take another Execute SQL Task inside for each loop (to capture second highest score for a particular week)Execute SQL Task Editor:General: ResultSet as Single Row SLQ Statement as :SELECT TOP 1 Score FROM(SELECT Score FROM Hockey_Stats WHERE Score NOT IN ( SELECT MAX(Score) FROM Hockey_Stats WHERE WeekName = ?) AND WeekName = ?) A ORDER BY Score DESC Parameter Mapping: VarName Direction datatype Paramname ParamSize --------------------------------------------------------------------------- WeekName Input Varchar 0 -1 WeekName Input Varchar 1 -1Result Set: Result Name Variable Name ---------------------------------------------------- 0 User::SecondMaxScoreSTEP2b):Take a data flow task after execute sql task inside for each loop. Inside DFT take a OLEDB Source,Derived Column and OLEDB Command component.OLEDB Souce : Data Access Mode as SQL Command with command as:SELECT * FROM Hockey_Stats WHERE WeekName = ?. Hit the parameter button and use WeekName variable to do the parameter mapping.In Derived Columns component create 2 columns as:Column Name Expression-----------------------------------------------------------------------------------------Won_By (Score > @[User::SecondMaxScore]) ? Score - @[User::SecondMaxScore] : 0Diff Score - @[User::MaxScore].Click on Component Properties inside OLEDB Command Component editor and write the SQL Command asUpdate Hockey_Stats Set WonBy=?,DiffToWinner=? Where WeekName = ? And KeyId = ?Then click on Column Mappings and do the parameter mapping as:Input Column Destinaiton Column-------------------------------------------------------Won_By Param_0Diff Param_1WeekName Param_2KeyId Param_3 Nitesh Rai- Please mark the post as answered if it answers your question
Free Windows Admin Tool Kit Click here and download it now
January 28th, 2010 2:08pm

I guess you have the data in a DB table.I would go for the approach advised by ETL thats simple and easy to manage.Make a SP that has the logic you need and call it from the OLEDB Source and you have the outputHope this helps !! Sudeep| My Blog
January 28th, 2010 2:32pm

Hi All, Nitesh ur too GOOD....Man....writing the solution with describing each step....AWESOME......really ur too GOOD....
Free Windows Admin Tool Kit Click here and download it now
January 28th, 2010 5:41pm

Thank you all for your thoughtful replies. I will review and see if I can get these ideas working in a test package...There are 750,000 rows in this table, so it will be interesting to see where any performance bottlenecks will show..Garry
January 28th, 2010 7:22pm

Here is an alternative with a Script Component: http://microsoft-ssis.blogspot.com/2011/04/compare-values-of-two-rows.html This example calculates the sales per week if you only have cumulative sales: Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com
Free Windows Admin Tool Kit Click here and download it now
April 16th, 2011 11:43am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics