Validating Excel data in SSIS data flow using script component
Hi
I have a set of packages using excel source to extract data from excel files and I need to validate the data in the data flow (Not control Flow. The data in Excel looks like this
Timestamp
Value
1/6/11 0:01 2257309.00
1/6/11 0:06 2257328.00
1/6/11 0:11 2257348.00
1/6/11 0:16 2257369.00
.....
What I have now in data flow is : Excel Source->Sort->Script Component (To add row numbers) ->Row Count
I need to take the first 3 rows and check the timestamp values to verify they are in the same time interval i.e every 5 minutes. If not, then generate an exception code and
How can I validate the data in the Data flow using the Script Component task?
Thanks,
Mcr1322MC
May 27th, 2011 3:19pm
You may use a Split perhaps (at least makes logical sense to me) or you may want to use Script Transformation Component - http://www.sqlservercentral.com/articles/Integration+Services+%28SSIS%29/64766/
Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
May 27th, 2011 3:53pm
Using a transformation script (you can use the same one as you've already got to add row numbers) you'd have a saved "last seen date/time" variable (much like the row count variable you already have). Initialize it to DateTime.MinValue in
the PreExecute. In ProcessInputRow, if it's value isn't MinValue, then use code like this:
TimeSpan difference = Row.timestamp - this.lastSeenTimeStamp;
if (difference.TotalMinutes > 5)
{
ComponentMetaData.FireError("", 0, "Your error message", 0, "");
}
(The arguments to FireError may be off - that's all from memory.) The above is in C#, but VB.Net is quite similar.
Talk to me now on
May 27th, 2011 5:51pm
Hi Tood,
How can I access the time stamp value of the previous row? It seems that I am missing something with the declaration/use of the lastseentimestamp variable an I am unable to compare.
Could you please clarify?
thxs
mcr1322MC
Free Windows Admin Tool Kit Click here and download it now
May 31st, 2011 6:37pm
Add a variable to your script in the same place as you've defined your row count variable. (You are maintaining a "row count" variable in order to place the row count on the next row, right?)
You don't "access" the timestamp on the previous row. You "remember" it for the next row.
If you can't figure it out, post your code.
Talk to me now on
May 31st, 2011 9:20pm