Download HTTP request results to memory instead of a file?
I have a requirement to download a gazillion finance-related files from several HTTP sites and load them into SQL Server. I was able to use a Script Task and the WebClient object to download the requested data to files; no sweat. Then, in the data flow, I read each file in with a file source, and the rest is history. Is there a way to bypass the "download to a file and re-read it" step? I'd like to simply stream the request using WebClient.DownloadData(). My question is: how do I get the resulting byte array into the data flow so that I can transform and load it wherever I like? I'm also interested in whether this will help performance. I'm thinking it should, but will it turn into a memory hog and slow things down? Also, if I have to use the download-and-read-the-file approach, I'm concerned about babysitting the temp file downloads. Would I need to clean up after myself for every download? Thanks in advance, Bob
April 13th, 2011 11:43am

Yes, you can. Take the code you currently have in the Script Task and place it into a Script component inside the Data Flow. With a couple of modifications, you can get what you want.

First, add a Script component to the Data Flow and configure it as a Source. In the Script component editor (not the code editor), go to the Inputs and Outputs page and define all the columns that you expect to generate from the file you're downloading. (You're defining the rowset structure here.)

Second, go to the code editor. Paste your Script Task code into the CreateNewOutputRows method and remove all the "save to file" bits. Add some parsing code that reads your bytes and pulls out the "columns", using tools like Encoding.ASCII.GetString. For each row, call the AddRow method of Output0Buffer and place the column values into the Output0Buffer properties. When you reach the end of the file, call Output0Buffer.SetEndOfRowset.
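A minimal sketch of the download-and-parse part described above, written as plain C# so it can be tried outside SSIS. The class name `InMemoryDownload`, the comma delimiter, and the one-record-per-line layout are assumptions for illustration; the real files may need different parsing (quoted fields, headers, another encoding).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class; the name and file layout are illustrative only.
public static class InMemoryDownload
{
    // Downloads the response body straight into a byte array; nothing is written to disk.
    public static byte[] Fetch(string url)
    {
        using (var client = new WebClient())
        {
            return client.DownloadData(url);
        }
    }

    // Splits delimited text held in memory into rows of column values.
    // Assumes ASCII text, one record per line, and no quoted fields.
    public static List<string[]> ParseRows(byte[] data, char delimiter)
    {
        var rows = new List<string[]>();
        string text = Encoding.ASCII.GetString(data);
        using (var reader = new StringReader(text))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Length == 0) continue; // skip blank lines
                rows.Add(line.Split(delimiter));
            }
        }
        return rows;
    }
}
```

Inside the Script component, CreateNewOutputRows would loop over the parsed rows, call Output0Buffer.AddRow() once per row, and assign each value to the matching column property. Those property names come from the output columns you defined, so they will differ from package to package.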
April 13th, 2011 12:21pm

Thanks Todd, that's good news. I'll give it a shot. If by chance you have a simple example of this, I'd be forever grateful. Otherwise, I'll push through it. Thanks, Bob
April 13th, 2011 12:38pm

Todd, will I still be able to do this if I set the Script component up as a Transformation type? The reason is that I have some input (used to build the HTTP URL) in addition to my HTTP request results. When I set it up as a Transformation, I don't quite understand where to place the output logic you described above. I only see the Input0_ProcessInputRow method, not the CreateNewOutputRows method I get when I specify a Source type. Sorry to be so confusing; hopefully you can help me out. Thanks again, Bob
April 13th, 2011 3:59pm

You can set it up as a transform. The question I have for you is whether you're going to push many more rows "out" of the component than you see coming "in". I have to assume so, which means you need to use the "Transform" model but configure your script as an asynchronous one. Have fun with that :) Come back when you run into difficulties.
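To illustrate the "more rows out than in" shape of an asynchronous transform, here is a rough standalone C# sketch. The class `FanOutSketch`, the `?symbol=` URL format, and the comma-delimited response are all assumptions, and the download is passed in as a delegate so the logic can be exercised without a network call.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch of one-input-row-to-many-output-rows logic.
public static class FanOutSketch
{
    // Builds the request URL from an input column value (URL format is assumed).
    public static string BuildUrl(string baseUrl, string symbol)
    {
        return baseUrl + "?symbol=" + Uri.EscapeDataString(symbol);
    }

    // For each input value, fetches the response bytes and yields one
    // output row per non-empty line, prefixed with the input value.
    public static IEnumerable<string[]> Expand(
        IEnumerable<string> symbols, string baseUrl, Func<string, byte[]> fetch)
    {
        foreach (string symbol in symbols)
        {
            byte[] data = fetch(BuildUrl(baseUrl, symbol));
            string text = Encoding.ASCII.GetString(data);
            string[] lines = text.Split(
                new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string line in lines)
            {
                string[] fields = line.Split(',');
                string[] row = new string[fields.Length + 1];
                row[0] = symbol;        // carry the input value into the output row
                fields.CopyTo(row, 1);  // followed by the parsed columns
                yield return row;
            }
        }
    }
}
```

In the Script component itself, the body of the outer loop would live in Input0_ProcessInputRow, with one Output0Buffer.AddRow() call per parsed line. For that buffer to be available, the output's SynchronousInputID property has to be set to None on the Inputs and Outputs page, which is what makes the component asynchronous.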
April 14th, 2011 4:04pm

This topic is archived. No further replies will be accepted.
