SSIS variable reset
hi iam using a rowcount task and after that a derived column task and the variable which is used in the rowcount task is assigned to a new column in the derived column task.the variable is getting reset to 0 when assigned as a new column. any reason why ? Thanks
November 18th, 2010 7:26am

Because the Row Count task is not done counting rows until ALL the rows have passed through it. It's like this: you want to paint on the side of each sheep passing through your barn door the total numer of sheep in your pasture. As the first sheep passes through the door, what number do you paint? You won't know until ALL the sheep are in, and by then, you have lost the opportunity to do the marking.Todd C - MSCTS SQL Server 2005 - Please mark posts as answered where appropriate.
Free Windows Admin Tool Kit Click here and download it now
November 18th, 2010 9:38am

what is the scope of your variable. make it at package level and get the row count , it should retain the count till the packahe is running .Ashwani Roy Blog Twitter Please click the Mark as Answer button if a post solves your problem!
November 18th, 2010 9:40am

Sorry, Ashwani Roy, but I must disagree. You can't KNOW how many rows have been counted until they have all passed through the Row Count object. The solution would be this: Use the Row Count transform to count the rows, then use an Execute SQL Task AFTER the Data Flow task, and have a command of : UPDATE <MyTable> SET <MyColumng> = ? and map the Paramter to the User::RowCount variable.Todd C - MSCTS SQL Server 2005 - Please mark posts as answered where appropriate.
Free Windows Admin Tool Kit Click here and download it now
November 18th, 2010 10:24am

hi iam using a rowcount task and after that a derived column task and the variable which is used in the rowcount task is assigned to a new column in the derived column task.the variable is getting reset to 0 when assigned as a new column. any reason why ? Thanks Hi all, the Rowcount should be before that the Derived column using the variable to get the number. Link the Rowcount output pipeline with derived column to create a column based on variable previously set. Vctor M. Snchez Garca (ES) (BI) Hope this help. Please vote if you find this posting was helpful. if this is an answer to your question, please mark it.
November 18th, 2010 10:50am

Guys: The Row Count is a synchronous Transform, in that rows just pass right through it without stopping. So they would pass right through and on to the Derived Column Transform downstream of it. Which, by the way, is ALSO a synchronous transform. So here's the scenario: You kick of the package and it reads the first 10,000 rows and they pass through the Row Count and the Derived column. Then another 10,000, and another. By the time the Row Count has got its final tally of all the rows, THEY HAVE PASSED THROUGH IT AND THE DERIVED COLUMN TRANSFORM AND ARE MOST LIKELY ALREADY WRITTEN TO THE DESTINATION. The Derived Column Transform can only update rows that ARE CURRENTLY PASSING THROUGH IT, it cannot touch the ones that are already past. You need a way to break the operation into two steps. TWO operations, or put in an ASYNCHRONOUS transform BETWEEN the Row Count and the Derived Column such that the rows would be counted, then bottled up/paused until they are ALL counted, then they would be released to the Derived Column transform. BUT, even still, I don't know if the Run-Time engine will release the lock on the Row Count variable while the Data Flow task that contains it is still running. Maybe it will, maybe it won't. To be sure, use my first suggestion: an Execute SQL task run AFTER the Data Flow.Todd C - MSCTS SQL Server 2005 - Please mark posts as answered where appropriate.
Free Windows Admin Tool Kit Click here and download it now
November 18th, 2010 11:02am

Todd, perfectly explained. But the question asked by BLR_ETL was about why the variable used on RowCount component returns 0. This is because the new column (created on derived column based on variable) is being set before the count. As i understand. The scenario question dont require multiple passes mantaining the rowcount value. And Derived Column is syncronous, really? how it kick out rows before accept all in? Also you indicate that the Execute SQL component will update all rows... maybe he want set the number of rows inserted in a process... I think that the asker should say something more... Regards!Vctor M. Snchez Garca (ES) (BI) Hope this help. Please vote if you find this posting was helpful. if this is an answer to your question, please mark it. http://bifase.blogspot.com | http://twitter.com/atharky
November 18th, 2010 11:17am

It's not even feasible to do this with asynchronous scripts unless you get REALLY tricky. Even though you can (theoretically) count the rows and ensure you've seen them all before you pass the next buffer on, you have to update the SSIS variable. You will have to do that update manually by locking and unlocking the variable yourself. Unfortunately, you can't do that - because the Derived Column has already locked the variable in the PreExecute phase and won't let go until PostExecute. So to get around that, you'd have to code a Script component that locked the variable for reading when it received rows - not in the PreExecute step - and unlock it again (manually). Don't start down that road. You won't like where you end up. Use two Data Flow Tasks in series. Persist your data between them to a RAW file or a SQL table. Talk to me now on
Free Windows Admin Tool Kit Click here and download it now
November 18th, 2010 11:21am

And Derived Column is syncronous, really? how it kick out rows before accept all in? Victor: "Syncronous" basically means "one row in = one row out, without passing Go or collecting $200". As fast as they come in, they go out. Now, wether they go out door number one or door number two is determined by the logic in the component. Watch a Sort transform some time. All the rows come in, they get sorted, THEN they get passed along. That's an asyncronous task. Aggregate is another. Script Tasks can have either (or both) types of outputs (I think). Todd M: Thanks for clarification on why staging rows with an asyncronous transform wouldn't work. It was a "I wonder if this would work" kind of answer. Not surprised it won't.Todd C - MSCTS SQL Server 2005 - Please mark posts as answered where appropriate.
November 18th, 2010 11:32am

Im sorry Todd C, i changed the meaning between synchronous and asynchronous, but meant asynchronous. Maybe i did not understand the question ... if you first put a RowCount and assigned to the variable and then use a derived column to add a new column to flow, the variable would contain the number of rows or not?Vctor M. Snchez Garca (ES) (BI) Hope this help. Please vote if you find this posting was helpful. if this is an answer to your question, please mark it. http://bifase.blogspot.com | http://twitter.com/atharky
Free Windows Admin Tool Kit Click here and download it now
November 18th, 2010 12:41pm

No, it would not. Talk to me now on
November 18th, 2010 12:44pm

I was totally wrong, and I appreciate the "collateral" support to learn. really. I do not want to leave without proposing another option without breaking the process into two parts, but getting the number of rows as a new column in the same flow (maybe loosing performance but...). If cant stage the data or do updates on destination (flat files, raw, etc) can be valid... i hope With this thread as source, i have wrote a post in my blog, please check it (http://bifase.blogspot.com/2010/11/utilizando-el-rowcount.html). Is in spanish, but at least review the last image. Solution in a single data flow: Flat file source -> derived column [add column JoinKey-(DT_I1)1] -> Sort [by JoinKey] -> MultiCast -> Multicast Output01 to Aggregate [Count All, Group by Join Key] -> Sort[by JoinKey] -> Merge Join, right input-> Multicast Output02 to Merge Join, left input -> Merge Join [ Inner join by JoinKey, Select data from left, select Count column from right] Too degradation in performance? Víctor M. Sánchez García (ES) (BI) Hope this help. Please vote if you find this posting was helpful. if this is an answer to your question, please mark it. http://bifase.blogspot.com | http://twitter.com/atharky
Free Windows Admin Tool Kit Click here and download it now
November 18th, 2010 2:27pm

ya I agree with you.. I cannot read everything on my mobile ... Ashwani Roy Blog Twitter Please click the Mark as Answer button if a post solves your problem!
November 18th, 2010 3:05pm

Thank you all for sharing your inputs it was worth a good learning.i used a Esql task which Todd also mentioned and that solved the problem
Free Windows Admin Tool Kit Click here and download it now
November 19th, 2010 12:10am

Yes, it would work. So would eating soup with a fork, if you have enough time. BTW, the Sort transform is also asymchronous, and your solution adds what, two of them? along with the Aggregate (also asynchronous). Basically, you STILL have to wait for ALL the rows to be counted before you can START updating each row with that number. All on one Data Flow or on two, or an Execute SQL, there's no way around it. You can't know the END number until you GET to the end, and once you get there, you have to start again at the beginning to do the update.Todd C - MSCTS SQL Server 2005 - Please mark posts as answered where appropriate.
November 19th, 2010 6:55am

Yes, yes... i understand it with your previous explain. But i just wanted propose another solution... and don´t seem too bad to some scenarios, but not is yours. Víctor M. Sánchez García (ES) (BI) Hope this help. Please vote if you find this posting was helpful. if this is an answer to your question, please mark it. http://bifase.blogspot.com | http://twitter.com/atharky
Free Windows Admin Tool Kit Click here and download it now
November 19th, 2010 7:30am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics