Debugging and data flow row counts incorrect
I've recently upgraded several packages to SSIS 2012, and I'm finding that the row counts shown in the data flow while debugging are entirely nonsensical. My data flow goes something like this at the moment: OLE DB Source - 138,292 rows RowCount - 325,974 rows Lookup - 730,972 rows Derived Column - 9,878 rows Lookup - 29,634 rows RowCount - 148,170 rows OLEDB Destination This is very different from what I saw in 2008 R2. The first row count and lookup should most certainly not exceed the number of rows that have been read from the source at any given time. Are these counts being displayed counting something different in 2012 than what they counted in 2008 R2? I can also see when it goes into post execution that in both the output windows/progress the number of rows that have been actually been written to the OLEDB Destination is an entirely different number than what is shown flowing through the data flow pipeline while debugging (251,1095). Another oddity - the data flow tasks never turn green, even after the control flow shows that the data flow task has completed successfully.
June 26th, 2012 12:43pm

I noticed after I posted this that Data Tools had eaten up 2.5 GB of memory! After re-starting it I'm seeing proper row counts in the data flow pipe. I think there must be a memory leak causing these issues - it's not the first time I have seen it since switching to SSIS 2012, not the first time it's grown to 2.5GB of memory usage.
Free Windows Admin Tool Kit Click here and download it now
June 26th, 2012 1:05pm

It must be your source, perhaps you use a wicked SQL to grab the records. You need to also make sure the memory gets inflated by SSIS, too.Arthur My Blog
June 26th, 2012 1:45pm

I don't think so. When it's between reading the data and inserting the data the memory usage in devenv is quite constant. I think the problem is that we have a package which runs other packages, and with each of these packages that is run in debug it seems to take up more and more memory. I'm currently opening them one at a time and running them, then closing on completion, but even now the memory usage is still climbing up 25-75MB with each new package that gets opened and executed, and never seems to drop when the package is closed after. Current package bumped memory usage from 903 to 937, then it settled down to 920. Executing the package brought it back up to 937 where it stayed for the duration of execution, and on closing the memory usage stays at 930. Next package it goes up to 943, calms down to 934, goes to 936 when executing (where it remains), jumps to 951 on completion, and on closing the package it stays at 945. Next package it goes up to 956, this one executed too fast to see what the memory was at during execution, but it's 962 on completion, and 970 on closing the package. 989, 976, 982 when exec, 993 on completion, 992 on close. 1004, 1022 when exec, 1016 on close. etc, etc, etc. If I close devenv, restart and just load up one of the packages - I start at 171, during exec it's 181, and after execution devenv is using 197MB of memory, and does NOT release it until devenv is restarted. I've even closed the solution, and it does not release any of the memory it has allocated. When it gets up over 2 GB of memory usage, the corruption issues in the dev environment that I outlined in my initial message start occuring. I have 8 GB of memory on this machine and it is definitely not maxing out even when it goes over that 2 GB mark.
Free Windows Admin Tool Kit Click here and download it now
June 26th, 2012 3:04pm

Here's an example screenshot of the phenomena from VMMap: I closed each package after execution, and at the end where it stays flat I closed the solution itself and left it for several minutes.
June 26th, 2012 3:50pm

Do you run packages from a master package? E.g. having a parent launching its children? If so consider using the out of process processing. This way each runs in its own space and thus cleans after itself.Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
June 26th, 2012 3:59pm

Yes, I do run from a master package normally - but the above examples were running the packages one at a time. I've also found that I can re-create the problem of strange row counts when running from a master package with just 5 packages open, and devenv using only 510 MB of memory (having grown only 30MB with the execution of the 5 packages). It's again a very simple package, reading directly from one table, doing a row count, derived column, lookup, and another row count, finally an insert. The records between the OLEDB source and the row count show 764,314, between the row count and derived column shows 592,012, between the derived column and lookup shows 627,356, between the lookup and the next row count shows 627,356, and between the row count and thd destination shows 874,764. An example screenshot that I took sometime after writing this: None of the subsequent portions of this data flow should ever have more rows than the prior portion. I could see the lookup maybe if there were duplicate values in the lookup table (but there aren't), however there's nothing I can think of to explain the row count tasks being so far off from the source/destinations. I can try executing out of process, but as Jaimie Thomson points out here that will result in child package failure NOT causing the parent package to fail. That is not desirable for our purposes (or anyone's, I would think), if that is actually the case.
June 26th, 2012 4:34pm

Here's the end result of that particular package, which is entirely different from the above pic taken while it was in the middle of executing: And the actual info written to those destinations from the progress tab: Oh, and the next package that ran in the sequence reported the row counts fine - and this particular package, if I open it and re-run it, also reports the row counts properly, but only if I close Data Tools and restart it.
Free Windows Admin Tool Kit Click here and download it now
June 26th, 2012 9:03pm

No insight on this issue?
July 3rd, 2012 11:50am

Sorry for the delay, I tried to build the same package, but then my laptop has experienced an issue. On the surface, the issue seems to be in the Lookup Component. What is in it?Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
July 3rd, 2012 12:17pm

Hi Arthur, The lookup is just for a very simple lookup table, there are no duplicate records in it. The OLEDB query contains a 'code' and the lookup table query is for the code and the corresponding ID. The numbers are actually off before the lookup is even reached though, as the rowcount directly after the OLEDB source has less rows than the derived column that comes after - and I don't think that should be possible as there is nothing in a rowcount or derived column task that would add rows. The results of the package execution are also exactly what I would expect if it had shown the proper numbers, it is only the dev env that shows the numbers all out of whack. I am 99% convinced now that this is a bug in SQL Server 2012 Data Tools - I'm just not able to re-create it, it only seems to occur randomly.
July 3rd, 2012 4:58pm

It could be a bug, how about filing a MS Connect bug?Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
July 3rd, 2012 5:03pm

I have a package that shows the same behaviour (incorrect rowcounts in DataFlow), and closing and restarting SSDT makes no difference. So this indeed seems to be some bug.
July 4th, 2012 11:34am

Let's submit a defect to MS ConnectArthur My Blog
Free Windows Admin Tool Kit Click here and download it now
July 4th, 2012 12:04pm

My suggestion is....... I think we need to compare it with SSIS2008 IF and that is IF you have the package in 2008 with the same data and etc..., check it, run it, and save the results. Then do the same thing with SSIS2012 and compare them together. Sincerely Nik -- Please kindly mark the post(s) that answered your question and/or vote for the post(s).
July 5th, 2012 11:15am

Hey Nik, My packages are actually all direct upgrades from SSIS 2008 - and they did not show this behavior there. The same package actually doesn't always show the same behavior in SSIS 2012 even. Arthur - I will file a connect issue when I am able to reproduce the behavior outside in a test case scenario that I am able to send to MS along with the issue. It's been my experience that issues on connect rarely go anywhere unless they can be easily reproduced. Unfortunately I'm not likely to have the time to do this for a month or more.
Free Windows Admin Tool Kit Click here and download it now
July 5th, 2012 3:05pm

then thats good that you have tested it so do as Arthur suggested. good luckSincerely Nik -- Please kindly mark the post(s) that answered your question and/or vote for the post(s).
July 5th, 2012 4:42pm

Arthur - I will file a connect issue when I am able to reproduce the behavior outside in a test case scenario that I am able to send to MS along with the issue. It's been my experience that issues on connect rarely go anywhere unless they can be easily reproduced. Unfortunately I'm not likely to have the time to do this for a month or more. It must be enough to have the images, let's do it, and we also have Peter.Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
July 5th, 2012 5:38pm

I've submitted this as a connect issue with a link back to this thread here: https://connect.microsoft.com/SQLServer/feedback/details/753266/data-tools-and-ssis-data-flow-row-counts-incorrect
July 11th, 2012 11:35am

Thank you, I have it my to do list to repro, but my VM with SSIS 2012 has died, besides, I have added my vote.Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
July 11th, 2012 3:41pm

In these cases, are you executing a single package in SSDT, or are multiple packages being run? (i.e. using execute package tasks) The way this works is that SSDT is listening for the OnPipelineRowsSent event. The event contains the component ID, and information about the path. When the event occurs, the SSIS designer finds the right path, and updates the UI with the new row count. Now, I've seen weird counts like this occur when a data flow task was copied/cloned from one package to another, which results in two tasks with the same ID (GUID). If both packages are open (or get opened during the run), the designer can get confused and apply the count to the wrong data flow path. I'm not sure if this is the actual issue - there could be another bug there - but it's behavior we've seen and tried to fix in the past. If you don't see this issue when only one package is being run, then this is probably the root cause. (One thing to test out is to add in a Row Count transform. If Row Count is giving you the wrong numbers, then this becomes a more serious problem)
July 12th, 2012 3:11pm

Hi Matt, We do indeed generally run these packages from an execute package task, and I often do have many packages open at the same time that were all derived from a base package. That said, when I run package 1, which executes an additional 20 packages, I sometimes see the issue, sometimes not, and it almost seems to occur randomly among these packages. I did also see the something similar happen over the weekend when I was running one of the packages without using the execute package task - it wasn't quite the same issue with rowcounts, but the package had failed and did not appear to stop executing. I only noticed it had failed when I went to the data flow and saw the dreaded red, but the spinners were still spinning away. I don't recall what other packages I had open at that time, but I'll try and keep that in mind for the future. It's a similar problem to the 2nd screenshot of my data flow above - in that the package was finished executing, but the data-flow tab still had the indicators for executing tasks going. Our row counts are definitely giving the right numbers though.
Free Windows Admin Tool Kit Click here and download it now
July 16th, 2012 10:31am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics