What will be the best way to Extract the XML stored in a Table
Hello, What will be the best way to extract XML data saved in a Column of one table. The XML Data is big in size some time ~10MB and will be extracted into ~15 tables. I was thinking to use "XML Source" Data Flow component and use its "XML Data" property. So how can I have the column data (XML) assign to the XML Data here? Note: All XML in the column are of same schema and I have the schema stored in a share folder that SSIS can access. Also there may be many solutions, which I love to learn, but using the "XML Source" component (as asked above) will be good for me now, as the flow from another SSIS package is ready to reuse (which use similar XML, but form a folder/location). Thanks.
December 27th, 2010 2:09pm

use two data flow task, in first data flow, use an OLEDB Source and select the xml column from source table, then connect it to a RecordSet Destination, Note that you should set an OBJECT type package variable in recordset destination, so you should create this variable in package scope before this step, let's name this as AllDataVar , after configure it in recordset destination and map columns, back to control flow create new string type variable in package scope, let's name this as XMLDataVar add a foreach loop container, set enumerator as ado, and set AllDataVar as source variable, then in variable mapping tab, set variable as XMLDataVar and index as 0. in the foreach loop container add another data flow task , in the this data flow task, put an xml source, set source from variable, and set XMLDataVar as source variable, then set an oledb destination to any table you want to be as destination. note that you should set xsd schema in xml source appropriately. http://www.rad.pasfu.com
Free Windows Admin Tool Kit Click here and download it now
December 27th, 2010 2:21pm

Thanks Reza. I will try this today and let me see how this goes. Thanks.
December 28th, 2010 1:05pm

Hi Reza, I tried to follow your stpes for a sample table and it was failing with the For Each Loop with below errors: Error: ForEach Variable Mapping number 1 to variable "User::XMLDataVar" cannot be applied. When I set the Enumeration mode = All tables (ADO.NET dataset only) [XML Source [1]] Error: The component "XML Source" (1) was unable to read the XML data. [DTS.Pipeline] Error: component "XML Source" (1) failed the prepare phase and returned error code 0xC02090D0. When I set the Enumeration mode = Rows in all the tables 9ADO.NET dataset only "OR" rows in the first table. What could be the issue in my setup here? I can see the first dataflow task success and the for each loop fail for the XML Source component. Here is what I did: For each loop -> Edit -> Collection Tab -> Enumerator = Foreach ADO Enumerator Ado Object Source variable = AllDataVar Variable Mappings -> XMLDataVar -> Index = 0 (Is this should be set some where before?, I only have a String variable with this name) Add DataFlow inside this loop Add Xml Source -> Edit -> XML from variable : XMLDataVar I am stil wondering where we are setting the data for XMLDataVar. Note: For my OLEDB Source I have SQL Command Text as below: SELECT QueueId, TransId, XmlFile FROM dbo.TestQueue So is there something I shoudl change / or the Index of the XMLdatavar should change? Thanks, Prabhat
Free Windows Admin Tool Kit Click here and download it now
December 29th, 2010 6:33pm

did you create first data flow task? I mean this step: "in first data flow, use an OLEDB Source and select the xml column from source table, then connect it to a RecordSet Destination, Note that you should set an OBJECT type package variable in recordset destination, so you should create this variable in package scope before this step, let's name this as AllDataVar , after configure it in recordset destination and map columns," this step will fill data from TestQueue table into AllDataVar. and don't worry about how to fill XMLDataVar, what you configured in variable mapping tab of foreach loop will fill XMLDataVar in each iteration. http://www.rad.pasfu.com
December 30th, 2010 12:14am

did you create first data flow task? I mean this step: "in first data flow, use an OLEDB Source and select the xml column from source table, then connect it to a RecordSet Destination, Note that you should set an OBJECT type package variable in recordset destination, so you should create this variable in package scope before this step, let's name this as AllDataVar , after configure it in recordset destination and map columns," this step will fill data from TestQueue table into AllDataVar. and don't worry about how to fill XMLDataVar, what you configured in variable mapping tab of foreach loop will fill XMLDataVar in each iteration. http://www.rad.pasfu.com
Free Windows Admin Tool Kit Click here and download it now
December 30th, 2010 12:14am

Yes I have done the first step to have the First Data Flow task as per the steps. And this work and I can see the records assigned to the Record Set destination using a dataviewer. Thanks.
December 30th, 2010 12:50am

OK, and another thing: what you have in your tables? do you have XML data on each record of TestQueue? or just you have xml file path? ( my first assumption was that you have xml data in column directly )http://www.rad.pasfu.com
Free Windows Admin Tool Kit Click here and download it now
December 30th, 2010 1:05am

Yes I have XML Data in the column. QueueId int SubmissionId int XmlFile xml InsertTime datetime Thanks.
December 30th, 2010 1:14am

based on the column number in the object variable, you should set index in variables mapping. for example if your select query in first data flow task is: SELECT QueueId, TransId, XmlFile FROM dbo.TestQueue then you should use Index 2 with variable XMLDataVar in foreach variable mapping tab, also you can set second Data flow "DelayValidation" property to true, this is because of suppress validation errors on compile time.http://www.rad.pasfu.com
Free Windows Admin Tool Kit Click here and download it now
December 30th, 2010 1:17am

Let me try this and let you know in few min. Thanks.
December 30th, 2010 1:25am

Just tried and get below error after I change the index = 2 and delay validation for the DataFlow = true and also set true for Package level. Foreach Loop Container Error: The enumerator failed to retrieve element at index "1". Error: ForEach Variable Mapping number 1 to variable "User::XMLDataVar" cannot be applied. Shall I share the package with you so that you can have a look?
Free Windows Admin Tool Kit Click here and download it now
December 30th, 2010 1:30am

you can mail it to me, and I will take a look, my address : a dot raad dot g at gmail dot comhttp://www.rad.pasfu.com
December 30th, 2010 1:31am

Sent for your review with sample XML file and table creation script. Thanks.
Free Windows Admin Tool Kit Click here and download it now
December 30th, 2010 1:36am

Sorry for delay in reply. Actually I was in vacation and could not update here. The problem is resolved with the Help of Reza. many thanks for Reza here. Actually the solution that Reza gave should work but looks like there is some issue/bug in "xml data from variable". So we tried to create a script task inside foreach right before data flow task, to load it to file and then use this temp xml file in data flow. And this solved the issue. The script task looked like: Public Sub Main() Dim sw As New System.IO.StreamWriter("d:\temp.xml", False) sw.Write(Dts.Variables("User::XMLDataVar").Value.ToString()) sw.Flush() sw.Close() Dts.TaskResult = ScriptResults.Success End Sub And now all good and working.
January 19th, 2011 6:19am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics