SSIS - Derived Column Error Rows will not redirect to table
I am new to SSIS and I have created a package that generally successfully takes a Flat File Source, runs it through a Derived Column transformation and then pushes the data into a SQL Server Database. I have a Derived Column Error Output set up, but it is not working. If the data has an error the entire package fails, not just the specific row. I have tested the package to load successfully and then manually gone into the flat file and changed the data to purposely create an error in one row. Having cleared the load table I rerun the package and it fails entirely rather than redirecting the one bad row. In the Derived Column Transformation Editor I set the Error and Truncation values to Redirect Row for all the columns. In SQL Destination Editor for the bad rows I am only mapping the ErrorCode abnd ErrorColumn values to the destination table fields of the same name. Could this be the problem? Should I map all the fields? I would like to map all the source (derived) fields into a single text field in the destination table if possible. The Advanced settings have only the Table Lock and Check constraints boxes checked. Timeout is 30. I am not really sure how to set the Advanced Editor properties. I have left them at the default values. Any suggestions here? Thanks. Steven I should add that when the package fails, the Flat File Source and Derived Column Task are GREEN and the SQL Server Destination (for success) is RED. The SQL Server Destination (for errors) remains WHITE. Also the row count shown on the Data Flow Task shows the total number of rows, even though the actual number of rows loaded into the SQL Server Destination is the Total - 1. From this I am deducing that the Derived Column task is executing and determining the row that is bad, but for some reason is not passing it down the error output path. One more thing... I tried sending the errors to a flat file destination. The package does create the file with the appropriate header row showing all the columns including the ErrorCode and ErrorColumn. But it does not send the erroneous row into the file. All successful rows were however sent to the database. Seems my bad row is just disappearing into thin air...
October 19th, 2010 11:09pm

The problem is that whatever transforms you're doing in the Derived Column component aren't "failing". They aren't "illegal" enough to fail, apparently. Can you post some sample "input" information, then post the expressions in your Derived Column transform that you expect to fail? Talk to me now on
Free Windows Admin Tool Kit Click here and download it now
October 20th, 2010 2:10am

Sure. The data coming in is a delimited text file with records that look something like this: System, ID, Product, ProcessDate, EmpName ...(and another 10 fields or so) ALS, 012345A, ABC, 1/1/2010, JonesCorp... ALS, 012345A, CDE, 1/1/2010, JonesCorp... ALS, 9876543A, ABC, 1/1/2010, SmithCorp... The first four fields constitute the key for the record in the SQL Server table being loaded. Some examples of "errors" I created were to change the second record so the Product field was ABC and thus make it a duplicate key of the first record. I also changed the ProcessDate field so that it was a text field such as "baddate" to create an invalid date. Both of these changes resulted in the failure of the package, but not the redirection of the row. For all of the first four fields all I do in the Derived Column Transfomation Editor is put the Name of the field in the Derived Column Name column, Replace 'FieldName' in the Derived Column column, and the FieldName in the Expression column. The only real transformations I do are to non-key fields. So I am wondering if you are suggesting the error is not from the Derivied Column Task, but somewhere else? As I noted, the Derived Column Task and the Flat File Source show GREEN even when the package fails, so I assumed they were ok. But as I said, I am new to SSIS so I may be missing something basic. Thanks for your assistance. Steven
October 20th, 2010 2:48am

Making a field duplicate will not be treated as error at derived column because it is a valid data as far as derived column is concerned. Changing the ProcessDate field as some string value will be caught at derived column if your derived column is using the ProcessDate column and applying some datetime funciton on it. Now tell us what is the expression you are using in derived column component for ProcessDate column and what is the datatype of that column (as shown in derived column transformation editor). Also, try to use "add as new column" option instead of "Replace" option. Nitesh Rai- Please mark the post as answered if it answers your question
Free Windows Admin Tool Kit Click here and download it now
October 20th, 2010 3:52am

As I noted in the earlier post, I am not doing anything with the ProcessDate except passing it through as noted. So now I understand that a duplicate field and an invalid date are not considered errors at the derived column level. This begs the question: Where do I detect the duplicate field, bad datatype, etc. errors and pass them to my error file (or error table)? Thanks. Steven
October 20th, 2010 4:44am

Steve, If you want to capture bad date inside derived column then you have to cast the ProcessDate column in the derived column component using DT_DATE. Now derived column component will redirect all the bad dates. Inside derived column editor you have to do something like this: DerivedColName: NewProcessDate DerivedColumn: Add as new column Expression: (DT_DATE)ProcessDate As far as duplicate data related error is concerned you have to configure the error output of the destination component. Nitesh Rai- Please mark the post as answered if it answers your question
Free Windows Admin Tool Kit Click here and download it now
October 20th, 2010 5:03am

You said: As far as duplicate data related error is concerned you have to configure the error output of the destination component. I don't understand what you mean by this statement. The SQL Server Destiination is the destination component to which you refer. Where do I configure the error output? In fact, this is the final destination for the data. I don't see any outputs from it -- success or error. I don't see anything relating to errors when I right click on the component and select Edit or Advanced Edit. Do you know where the "error output" configuration is to be found for the SQL Server Destination? Thanks.
October 20th, 2010 5:45am

Okay...first question: why are you using SQL Server destination? Try to use OLEDB Destination. When you open the OLEDB Destinaiton editor, you will see an "Error Output" option in left pane after "Mappings" option. You need to configure that. SQL Server destination has no such option.Nitesh Rai- Please mark the post as answered if it answers your question
Free Windows Admin Tool Kit Click here and download it now
October 20th, 2010 5:56am

Nitesh's suggestion to use the OLE DB Destination is a good one. It doesn't bear on your original problem, but it's considered a best practice to avoid the SQL Destination - it typically doesn't get you any measurable benefit, and is more restrictive. Back to your original issue. The problem is that you're reading data that SSIS doesn't know is "bad". Your flat file source is (apparently) reading all (or most) of these columns as text (DT_STR or DT_WSTR) - and as such, just about any data is valid according to the source component. Since you aren't actually transforming data in your Derived Column transform - you're just copying the data to other columns - there's no invalid operation to "catch" in the Derived Column transform either. The reason the Destination component is failing is because your SQL table that you're inserting to has: Specific non-text types, like datetime, and Has a primary key defined Those two things cause SSIS to report an error that originates with SQL Server. For item #1, your problem is an implicit type conversion. You're sending a string value to SQL Server and instructing it to store it in a datetime column. If you change the value in what's supposed to be a "date" typed column to the string "bad date", then when SSIS attempts to insert that value into your table, SQL Server rejects that row because it can't implicitly convert that string to a meaningful date. For item #2, SSIS has no idea that you have any referential integrity constraints on your destination table. As such, it blindly inserts rows. SQL Server objects to duplicate keys, and therefore reports an error to the Destination component. There are several techniques to handle each problem. For the type conversion issue, you should probably do what Nitesh has suggested and explicitly convert your datetime columns to DT_DBTIMESTAMP (or another date type) inside SSIS before you send it to the Destination component. If you do this - using a Derived Column or Data Conversion component - then you can trap errors in that conversion and route them elsewhere. You may also elect to not do that, and configure your Destination component to route errors - but you'll have to change the config of your destination to do row-by-row inserts so that only one row at a time is rejected if it's in error. If you leave the batch size as it is, you'll get a whole batch of records rejected if just one is in error. For the duplicate key issue, you can again choose to detect and handle the issue inside SSIS using one of these techniques. Or, you can leave it up to SQL Server to report those errors, and configure the Destination to route errors - again, requiring row-by-row inserts to avoid rejection of entire batches. Talk to me now on
October 20th, 2010 8:06am

Todd, Sorry for the delayed response, but thank you very much for your comprehensive discussion of the issues. Once I understood how these things worked I was able to set things up to send the data row by row to SQL Server (using the OLE Db Destination as suggested). Bad rows are diverted to a different table from good rows and a report allows the user to see the bad rows. BTW, I did this by setting the Rows per Batch and the Maximum insert commit size to "1" in the OLE Db Destination Editor. Was this correct? Since the Rows per Batch was set to "1" I was thinking I could set the Maximum Commit Size to zero (or any other value for that matter). Please let me know if I am understanding these options. However, I was not able to get the Derived Column Transformation to generate an error on a specific row with a bad date. I set the values of the field "ProcessDate" as follows: Derived Column Name : ProcessDate Derived Column: Replace 'ProcessDate' Expression: (DT_DBTIMESTAMP) ProcessDate Data Type: database timestamp [DT_DBTIMESTAMP] As noted in the earlier messages, I set the ProcessData field to a string "baddate" for a particular row, but when I ran the package the Derived Column Transform did not error out. Any ideas why? Nevertheless, SQL Server did reject the row and the OLE Db Destination passed the row into the error table. If I do get the Derived Column Transform to check for bad dates, I assume I can send the failed row(s) to the same error table in SQL Server, right? Again, thanks for all your help getting me this far. Steven
Free Windows Admin Tool Kit Click here and download it now
October 27th, 2010 7:51pm

Derived Column Name : ProcessDate Derived Column: Replace 'ProcessDate' Expression: (DT_DBTIMESTAMP) ProcessDate Data Type: database timestamp [DT_DBTIMESTAMP] Configure Error OutPut of derived column component with "Redirect Row" for Error and "Ignore Failure" for Truncation. You can also select "<add as new column>" in Derived Column instead of Replace ProcessDate.Nitesh Rai- Please mark the post as answered if it answers your question
October 28th, 2010 2:32am

Nitesh, I noticed earlier you also suggested that I configure this with <add as new column>. Why does this make a difference for error detection? It is my understanding that by doing this I will create new column IN ADDITION to the existing ProcessDate column as the output. I do not want a new column -- just the same number of columns as are coming from the flat file. I don't understand why it doesn't produce an error when I set it to REPLACE. Please explain. As for the Error Output configuration, the Error was already set to "Redirect Row" as was the Truncation. Is there a reason why setting Trucation to "Redirect Row" would prevent the Derived Column Transform from detectiing the error in the date? Thanks. Steven
Free Windows Admin Tool Kit Click here and download it now
October 28th, 2010 8:47am

How is the derived column component's error output configured? Nitesh Rai- Please mark the post as answered if it answers your question
October 28th, 2010 8:51am

Every column is configured the same. Each has the name of the field in the Column, Error is set to Redirect Row, Truncations is set to Redirect Row and Descriptions says Computation.
Free Windows Admin Tool Kit Click here and download it now
October 28th, 2010 9:57am

Can you try setting "Ignore Failure" on truncation?Nitesh Rai- Please mark the post as answered if it answers your question
October 28th, 2010 7:18pm

If you don't "add as new column" then your cast to a different data type isn't really getting you anything but a delay in your package execution. In SSIS, you can't change the data type of a column. Period. You can mess with the contents of a column, but not the type. So if your column is a string type, and your expression casts it to a datetime, then the contents will be cast to a datetime... and then cast right back to a string to stuff into the column. If you "add new" then you'll get a date typed column. The Derived Column expression might not return an error because it's casting rules may not be violated. When you insert into SQL Server, it has different casting rules, so its rules may be violated. What data does SQL fail with, that SSIS does not? It may have something to do with your data types as well. A smalldatetime in SQL has a smaller range of acceptable values than a DT_DBTIMESTAMP. It may also be that the region settings on your computer (where SSIS executes) means it can convert the string to a date, but the region settings on the server cause it to interpret the date differently, resulting in an illegal interpretation. (You should reformat date strings to yyyy-MM-dd format before any cast to prevent this.) Using a batch size of 1 is correct - but will make your package slower on inserts, because it's now one round-trip to the DB per row. You can optimize your package a little bit more (and make it more complicated) by having two destinations. Make your first destination have a batch > 1, and send failures to a second destination with a batch of 1. When one row fails the insert in the first destination, ALL the rows in the batch will fail and get redirected to the second destination. The second destination will be able to insert all but the one row. This gets you batch performance for most of the time, but resolves individual row errors too. Talk to me now on
Free Windows Admin Tool Kit Click here and download it now
October 29th, 2010 9:03am

Nitesh and Todd, First to Nitesh: I tried making the date field (and later all the fields) set to Ignore Failure for the Truncation error but it made no difference. Still not getting an error in the Derived Column task. However I did notice something that may provide a clue. Recall that I set the value of the date to the string "baddate". The error table that I sent errors to has all the fields set to varchar(50) so sending the string to that table should not be a problem. When the error does occur (at the OLE Db Destination, not Derived Column) the bad row arrives in the error table with ProcessDate set to NULL. I am guessing that the Derived Column task did this, passed it to SQL Server and SQL Server generated an error because ProcessDate is part of the table key (for the non-error table) and as such cannot be NULL. Furthermore, I tried setting another NON-KEY DATE field, EffectiveDate, in the same record to "baddate" after resetting ProcessDate back to its correct value. When I did, the OLE Db task loaded all the records successfully and set the EffectiveDate to NULL. Also recall that on EffectiveDate there is nothing in the Expression column of the Derived Column task except the name of the field, EffectiveDate. This seems to confirm Todd's analysis that Derived Column not generating an error, but just creating NULLs as part of the casting process. Assuming that is the explanation, I have a rather basic question to ask: If I must use the <add as a new column> option to generate the error, is there a way to prevent the original value of ProcessDate from also being in the output? I want the same number of columns coming from the Flat File to leave the Derived Column task. Finally to Todd:Thanks again for your clear explanation of what is happening under the covers and for your suggestion on how to handle the batch v. row-by-row option. I like the idea of the double OLE Db Destinations and will give it a try. Thanks again to both of you. Steven
October 29th, 2010 1:06pm

I figured out I can just not map some of the output to the OLE Db Destination. I still am not getting the Derived Output task to generate an error -- even when I <add as a new column>. Frankly I am just baffled by the fact that a simple obvious error like a non-date string ("bad date") being set to a date value does not generate an error. But I am going to punt and let SQL Server do the error generation. Have to admit I am not finding SSIS top be what I hoped for. In any case, I appreciate the effort the two of you made in helping me. I am going to mark this one as answered. Thanks.
Free Windows Admin Tool Kit Click here and download it now
October 30th, 2010 9:01pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics