XML Source - Guidance?
I have a large (4 GB) xml file. I use the SSIS XML Source, generate anxsdand conversions/destinations and this does a great job of creating the multiple outputs including cross reference tables with autogenerated id's (I have no script task or component, so they must be auto). These load nicely into the six tables. The guidance I seek is what should I do for the next xml source file as it will be an incremental data addition to the base file. I don't see any obvious way to keep the package from autogenerating id's that may conflict with the existing ids. Am I missing the obvious? It may be, as I am enjoying SSIS, but I am not an expert at all. I can post pertinent items if needed. Thank you for any guidance.
July 12th, 2007 3:42pm
SSIS doesn't auto-generated IDs, as far as I am aware. Are you sure the IDs aren't coming from identity columns or some other process?
July 12th, 2007 11:29pm
The auto generated ids added as output columns by the xml source adapter idswill not produce a name conflict with an xml attribute/element of the same name, so the answer is, "don't worry about it". For example, the following xml document has an element withthe name of an output column for which theSSIS xml adapter would normally generate an output column named ("batters_Id"). However,this attribute'sis already "batters_Id". In response,the xml source adapter will instead add "<output>_Id_1" instead, ratherthan the default output column named "<output>_Id". Code Snippet <items><item id="0001" type="donut"><name>Cake</name><ppu>0.55</ppu><batters batters_Id="10"><batter id="1001">Regular</batter></batters><topping id="5001">None</topping></item></items>
July 13th, 2007 6:45am
jaegd, I understand and thank you for the response! Using your fine example, my thought is let's run the package again using a new xml file. Won't the source adapter find a "batters_Id" and assign it a value of "10" again? This would prevent insertion of the new record if the "batters_Id" is the primary key. I am sure there is a simple answer, I am not experienced enough to find it yet. Thank you very much. This is great information!
July 13th, 2007 2:02pm
The xml source adapter's column names for auto-generated columnsthat won't produce conflicts at the column name level, whereas the auto-generated values of those columns will (since they're not guids). So, if two different xml files of the same schema have load attempts made, saythe first a baselinesource followed by an incremental source, there highly likelywould be a conflict on auto-generated ids, so you would have tohandlethat occurance. Rather than using the ids produced by the source adapter (which are very close to the concept of row numbers and intended for join purposes), treat the xml data like any other source for which new/existing entities are to be detected and routed properly.
July 13th, 2007 7:07pm
hi BDH, I'm facing exactly the same scenario you mentioned. Everyday I'd have several XML files to be imported to the same schema. Now on each day, for each XML file, SSIS give me the same set of values for those _id fields (it starts from 1 for each file). Do you have any luck on a solution? thx xxgeorge
April 29th, 2010 6:36pm
Hello, We may insert XML data through .Net application into the database via the usage of stored procedure. The whole program has been illustrated in the best way in this link. You may visit it... http://www.visiontechno.net/studymats/xmlinsertion.html
November 4th, 2010 4:09am