Load XML files into tables via SSIS
Hi I am new to SSIS and am looking to load XML files (with a DTD definition) into tables via a SSIS package. I have created a XML task and am able to load the XML and output it to file. I have also stripped out the DTD definition and am able through a dataflowusing XML source an OLE db Destination load and map the XML to table sin my DB. But have no idea how to get the data in when it has a DTD definition included. I either want to put each file into a row in a table then query it. Or from the SSIS package input the relevant info into a set of staging tables or the real tables. Any help/pointers would be appreciated.
July 19th, 2007 4:02pm

If you use "XML Source"- a Data Flow Source, you should insert the xml data directly using a Ole DB Destination (can be found in Data Flow Destinations of the Toolbox)Gigi Ciubucwww.sqlserver.ro
Free Windows Admin Tool Kit Click here and download it now
July 19th, 2007 6:27pm

XML Source doesn't like the XML file with the DTD reference in it. When I tell it to use inline schema it errors with DTD is prohibited in this XML document. And when I use anXSD previosly generated it complains about DTD being pohibited when I run it. 2 options I have (well - that I can think of) is to have a step to strip out the DTD reference then save the file (I have tried this manually and it works - although I have thousands of XML files so I will need to automate this process) and use with OLE DB OR read the XML file into a variable and use this in the data flow and have XML source use it. I am currently testing this method at the moment but have a problem in that I can read the Nodelist into a variable using XML task then open a Data Flow and attempt to use the variable in XML Source (using XML data from variable) and then load with OLE DB, but when I run it it errors with "The component XML Sorce 1 was unable to read the XML data. ata at the root level is invalid. Line 1, position 1. This was using PutResultsInOneNode False. But I can output the XML to file using the same XML task and it looks well formed. If I set PutResultsInOneNode True then it wraps ResultRootNode round the XML (but I already have a root node) and does not insert any rows. I have run out of ideas with this method. I dont know if there is way to check the XML variable is actually populated within the Data Flow before I use XML source, if anybody knows a methed then let me know. I would assume that the variable would still be populated between the Control flow and the Data Flow. Anyway - back to the drawing board for me.
July 19th, 2007 7:43pm

When you use "Xml Task" did you choose for property "Operation Type" the value "Validate" then for "Validation Type" property the value "DTP" (default is "XSD")? What did you get?
Free Windows Admin Tool Kit Click here and download it now
July 19th, 2007 11:25pm

Since the xml source adapter doesn't accomodate DTDs, strip out the DTD, as you mentioned.Use the Xml Task with the following settings. Operation Type: XSLT Source Type: Variable Source: Variable's name containing the xml text Save Operation Result: True DestinationType: Variable OverwriteDestination: True Destination: Variable's name which is tocontain the originalxml minus the DTD. SecondOperandType: Variable SecondOperation: Variable containing the following text Since XSL doesn't know about DTDs, telling it tocopy everything strips out DTDs. Then use the Variable specified in the Xml task's SecondOperand as the Source data for the xml source. Code Snippet <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/"> <xsl:copy-of select="." /> </xsl:template> </xsl:stylesheet> A noteon how to paste a multi-line xml document into a Integration Services String variable: Integration Services String variables textboxes are not multi-line, in the Windows sense of a line (CR+LF), So, in order to paste multi-line text (which xml docs almost always are), save a temporary copy with a unix line ending. That is, create an xml file in visual studio, and paste your sample original xml in there. Go to File/Advanced Save options, and save the xml with the the settings of Encoding: Unicode (utf-8 without signature) - CodePage 65001, and most importantly, set the Line endings dropdownto "Unix (LF)". After selecting "OK", copy and paste the text from Visual Studio's xml file editor into the IS variable, and you'll note all the xml data appears.
July 20th, 2007 8:16am

ggciubuc wrote: When you use "Xml Task" did you choose for property "Operation Type" the value "Validate" then for "Validation Type" property the value "DTP" (default is "XSD")? What did you get? Do you mean what did I get when I run it, if so it passes validation having a step as you describe. My package contains 2 XML Tasks 1 to validate the file and the 2nd to load it. The validation works ok but i'm stall having the same trouble loading the file.
Free Windows Admin Tool Kit Click here and download it now
July 20th, 2007 11:05am

jaegd wrote: Since the xml source adapter doesn't accomodate DTDs, strip out the DTD, as you mentioned.Use the Xml Task with the following settings. Operation Type: XSLT Source Type: Variable Source: Variable's name containing the xml text Save Operation Result: True DestinationType: Variable OverwriteDestination: True Destination: Variable's name which is tocontain the originalxml minus the DTD. SecondOperandType: Variable SecondOperation: Variable containing the following text Since XSL doesn't know about DTDs, telling it tocopy everything strips out DTDs. Then use the Variable specified in the Xml task's SecondOperand as the Source data for the xml source. Code Snippet <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/"> <xsl:copy-of select="." /> </xsl:template> </xsl:stylesheet> A noteon how to paste a multi-line xml document into a Integration Services String variable: Integration Services String variables textboxes are not multi-line, in the Windows sense of a line (CR+LF), So, in order to paste multi-line text (which xml docs almost always are), save a temporary copy with a unix line ending. That is, create an xml file in visual studio, and paste your sample original xml in there. Go to File/Advanced Save options, and save the xml with the the settings of Encoding: Unicode (utf-8 without signature) - CodePage 65001, and most importantly, set the Line endings dropdownto "Unix (LF)". After selecting "OK", copy and paste the text from Visual Studio's xml file editor into the IS variable, and you'll note all the xml data appears. Yes this worked. The DTD details were stripped out and I can access the columns and put in the relevant tables. Thx
July 20th, 2007 11:41am

Hiya, I've been following this thread after my manager gave me a project to insert XML data into SQL2005 from an online XML Feedand thought that SSIS would be a good way to do it. Now i'm attempting to follow the steps Jaegd had wrote and when I do I get the following error when I execute the package: [XML Task] Error: An error occurred with the following error message: "Data at the root level is invalid. Line 1, position 1.". [XML Task] Error: Property "New Source" has no source Xml text; Xml Text is either invalid, null or empty string. What am I doing wrong to get this error? I am probably being really stupid but I have followed the steps given and it appears that it has no XML text to look at? Is this correct? Any help would be greatly appreciated, Thanks, Dan
Free Windows Admin Tool Kit Click here and download it now
September 29th, 2008 6:52pm

is it possible to set Source: Variable's name containing the xml file location like C:\data\xml\my_file_00001.xml ? or is it possible to load the xml data from a xml file into a variable?
November 5th, 2010 5:44am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics