Unable to Merge after Data Conversion on sort key
I have a situation where I have to perform a data conversion on my sort key and then merge it with another set of data. When I run the data through the Data Conversion Transormation, my new data typed column does not retain the sort key property, and I see no way to set it. The data is already sorted and I've already experienced resource problems with this package, so I'd REALLY like to avoid doing yet another sort on data that I already know is sorted. Any suggestions? For those that are curious as to the details of why I'm doing this... I have an XML data source that contains what is supposed to be a DBTIMESTAMP field. A small amount of items (about 20 out of 100,000) contain incorrectly formatted data in this field. My ultimate goal here is to pull out those 20 items, correct the formatting (it's a matter of truncating extraneous trailing characters), and then merge it back with the original data to continue on through the data flow. I'd like to have the XML Source redirect those error rows to a Derived Column, do my truncating, convert it to the appropriate data type, then merge it back with the original data. The problem is, when I redirect that XML Source error row, the data type for the sort key (lets just call it SortKey1) can't be DT_I4, so I have to output it as something else (DT_WSTR) then run it through a data conversion. When I do that, the sort key positioning is lost and I'm unable to merge the data without doing a sort on data that I already know is sorted.
September 9th, 2011 11:49am

You'll have to pull a fast one on SSIS to do this... and you will have to do some manual setup. Use a Script transformation component after the Data Conversion. Set the Script to be Asynchronous: Go to the Inputs and Outputs tab. Change the SynchronousInputId of the output to -1. Add columns to the Output that are exactly the same as the columns on the input. Yes, this is the manual part. No, there is no way to "copy" or "pass through" the input column definitions. Select the Output node, and change it's IsSorted property to True. Select the column(s) that dictate the sort order, and set their sort order. If I read you right, that's just one column - so set it's SortOrderKey to 1. In your script, remove all the method stubs. Add an override for ProcessInputRow. Call Output0Buffer.AddRow, then set all of the Output0Buffer columns to the same value(s) as the input row that's passed in. (You're copying the input row to the ouput row.) All that script does is copy input rows to the output buffer. But the key is that SSIS thinks you're doing something magical in there (aggregating, etc), because you have it set up as asynchronous. That lets you tell SSIS what the sort order is. Even though you're doing nothing really. Talk to me now on
Free Windows Admin Tool Kit Click here and download it now
September 9th, 2011 1:48pm

Ahhhh that's an awesome idea! Basically an empty transformation that allows me to set IsSorted. Thank you very much!
September 9th, 2011 9:19pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics