SSIS custom data source component - buffer overflow handling
Developing a SQL Server Integration Services Custom Data Source Data Flow Component. SSIS has a maximum buffer size and row limit to "chunk" work based on buffer size. This requires a source component to recognize when the buffer is full, and then suspend until another buffer is available and resume loading from the source to the new buffer. What I don't see covered anywhere is: - how to effectively deal with a full buffer > calculate the size, count rows and ensure closing the buffer before overflow (except that I can't see how to get the max size or max rows from the parent data flow task) > wait for a buffer overflow exception and rewind to the last row/bytes written to resume with a new buffer. - how to store the "bytes/rows to skip" information for the component to pick up the next time it's invoked by SSIS.Neil Thomson
August 14th, 2012 11:29am

Neil I do not think you need to have "a source component to recognize when the buffer is full". Why would you?Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
August 14th, 2012 12:11pm

Neil, if you think your post is was a mistake then I can delete it, but you may also choose to accept the above as an answer. In any case making it abusive does not make it good to your profile.Arthur My Blog
August 14th, 2012 7:06pm

I'd missed this in my earlier reading.... From http://msdn.microsoft.com/en-us/library/ms136088.aspx "There is no way to determine when a batch of rows has been sent to the next component because the movement of rows by the data flow task is transparent to the component developer, and the RowCount property is always zero on output buffers. When a source component is finished adding rows to its output buffer, it notifies the data flow task by calling the SetEndOfRowset method of the PipelineBuffer, and the remaining rows in the buffer are passed to the next component." While the above clarifies that the source component doesn't need to worry about overflowing the [output] buffer, there is still the question of how the SSIS system stops the source from pushing more rows into pipeline if there are no more buffers available. All I can think of is that the AddRow() delays responding to the source component until buffers are available - effectively suspending the source component. Just making sure that all the scenarios are covered.... Neil Thomson
Free Windows Admin Tool Kit Click here and download it now
August 14th, 2012 7:34pm

... While the above clarifies that the source component doesn't need to worry about overflowing the [output] buffer, there is still the question of how the SSIS system stops the source from pushing more rows into pipeline if there are no more buffers available. All I can think of is that the AddRow() delays responding to the source component until buffers are available - effectively suspending the source component. Just making sure that all the scenarios are covered.... Neil Thomson It does not stop the source, it shovels all the rows into the runnable memory - split into buffers, then the buffer(s) may spill to disk if the runnable memory becomes full. In fact, there is very little control over what SSIS does to the source if at all. This of this - you query a distant Oracle box, and you want SSIS to fiddle with the input from it?Arthur My Blog
August 14th, 2012 7:48pm

... While the above clarifies that the source component doesn't need to worry about overflowing the [output] buffer, there is still the question of how the SSIS system stops the source from pushing more rows into pipeline if there are no more buffers available. All I can think of is that the AddRow() delays responding to the source component until buffers are available - effectively suspending the source component. Just making sure that all the scenarios are covered.... Neil Thomson It does not stop the source, it shovels all the rows into the runnable memory - split into buffers, then the buffer(s) may spill to disk if the runnable memory becomes full. In fact, there is very little control over what SSIS does to the source if at all. This of this - you query a distant Oracle box, and you want SSIS to fiddle with the input from it?Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
August 14th, 2012 7:50pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics