SSIS Bug when handling CSV Files with commas embedded in quoted strings
My group and I have been wrangling with this for a couple of days, and we're stumped.Is the csv snippet "This string", 1234, "This, another string", "thatstring" has been throwing errors that look like it is using the comma inside the quoted string "This, another string" as a delimiter, and causing the row to grow by one field.We've looked on the web, and apparently this is a common problem.I just find it hard to believe that something so major is still a bug.We're using SQL 2005 SP1 btw.Help,Doug
July 31st, 2008 11:07am
Double click Flat File Connection Manager, input "Text qualifier" as ". This can guarantee anything between "" will stay its original meaning.
July 31st, 2008 12:02pm
Already tried that, the problem persists.You haven't heard of this problem before?
August 1st, 2008 8:30am
Could you make a sample data and package and send email to me? I will take a look. And, upgrade to sp2 may be a good choice.
August 2nd, 2008 11:19am
Yes. I have. Its apparently a bug. I have seen it posted as a bug in other threads. Here: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=2384028&SiteID=1 What I would like to know now, is, where can we go to see the actual list of bugs AND something we can show our bosses/clients... I would like to point them to something to read because they're beginning to look at me as if I cant do the work.They want to verify what I am telling them about SQL 2005(SSIS) and its inability to handle simplistic tasks without me having to code several extra hours of "file pre processing" that should (was) already in the product. This is very frustrating and isnt a great way of being introduced to the new technology. I held off moving to "dot net" for a long time because what I had been using (VB6/SQL 2000) was working perfectly fine and I am going to go back to what works. I mean, Excel has a checkbox that asks "treat consecutive delimiters as one?". So then why doesnt SSIS? These are from the same company, no? Ugh.
August 6th, 2008 10:14am
Maybe you can try this one: http://www.microsoft.com/downloads/details.aspx?FamilyID=B51463E9-2907-4B82-A353-E15016486E1D&displaylang=en Overview UnDouble is an SSIS dataflow component that:- Removes quotes that bracket input strings if present.- Replaces doubled quotes with single quotes inside the string.The SSIS flat file adapter doesn't process qualified strings where double quotes are used to indicate a single embedded quote.This transform takes a text column, and, removes bracketing quotes if present, plus replaces double quotes inside the text with sinqle quotes.SSIS users that want to duplicate DTS 2000 flat file processing can call their output column unqualified, then place this transform to clear up the quotes.The conversion is performed in-place. Select the desired string column on the "Input ColumnsTab" of the advanced editor, then change its Usage Type to READWRITE.Included in this install is 4columns.txt, a file that contains a variety of quote/character combinations. Use a flat file source to feed this data to the UnDouble component. This component is part of a series of components that illustrate increasingly complex behavior, each one exercising a greater proportion of the SSIS object model. If studying in order, this component follows SeeBuffer, and precedes ValidUnDouble. And here is another workaround: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1513072&SiteID=1
August 6th, 2008 10:21am
Thanks for that! I was in the process of writing a script that would do something similar.I also looked at the other components and downloaded some that might be ofa bitof help. As is often the case, its just a matter of finding the answer somewhere. Do you knowof any downloadable component, somewhere, that would allow for the importing of a variable record length text file? I have a file which has records that vary in length (127 bytes or 323 bytes) depending.. I wrote my own script to process it but was wonder if there is a download as I like to see other ways of handling it too.. Thanks!
August 6th, 2008 10:58am
I totally agree with your frustrations as I am having the same problem. Incidentally the Microsoft Access Import wizard deals with embedded quotes just fine, so Microsoft are not being consistent.-- Patrick
September 24th, 2008 10:02am
This is still a bug. SQL2008 SP1 Sigh. Michael
April 4th, 2011 5:51am
This is still a bug. SQL2008 SP1 Sigh. Michael
April 4th, 2011 5:51am
I don't think anyone has addressed the original question. which is about the embedded comma, not the embedded text qualifier. You can remove/replace embedded text qualifiers, but I am having the problem described by the first poster. I have an SSIS job which imports a CSV file with embedded commas within qualified text strings. Works fine when started from Visual Studio. But when deployed to SQL Server 2008, the same package completely ignores the text qualifiers and treats embedded commas as field delimiters. I posted a question on this today at http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/77da8178-cd5d-46aa-8843-70882329333c. Nearly 3 years later, 2 new releases of SQL Server, and still the same problem! Actually is it 6 years since SQL Server 2005 came out? I get the impression no-one at Microsoft actually cares about these long-standing problems with SSIS having such a noddy (or shoddy) approach to processing CSV files. You cannot actually import a standard specification CSV file, out of the box. To succeed, it seems your CSV file must have: No embedded text qualifiers, no embedded field delimiters, no embedded row delimiters, no line breaks in records. Forget ragged records also. So if you have of these features in your CSV file, you have to develop a "workaround". DTS, MS Office products, even Visual Studio can make a better job of this, so it's not as if these are new methods. Faisel
May 11th, 2011 11:32am
I am encountering the exact same issue, a comma embedded in an address field is interpreted as a field delimiter, thus throwing an exta column in my output CSV file causing other programs that deal with this output file to choke. Hard to believe it is still a bug in SQL Server 2008 R2 and especially after posts on this issue dating back to 2008Jagannathan Santhanam
August 5th, 2011 11:54am
I am encountering the exact same issue, a comma embedded in an address field is interpreted as a field delimiter, thus throwing an exta column in my output CSV file causing other programs that deal with this output file to choke. Hard to believe it is still a bug in SQL Server 2008 R2 and especially after posts on this issue dating back to 2008 Jagannathan Santhanam Ditto, so irritating.
August 15th, 2011 8:41am
Same... delimiting on vertical bar, but commas in text fields are killing the import. Why?
September 3rd, 2011 9:10pm
I have not seen this reported when a delimeter used is the vertical bar (|). Can you post the picture of your FFS (Flat File Source) setup and how the record looks here?Arthur My Blog
September 3rd, 2011 9:49pm
I had the same issue with SQL 2008. For the Text qualifer I put " and not "". Everything works fine now. Hope this helps someone! Rashad
February 7th, 2012 12:04pm
i have upgraded to SQL SERVER 2012, and the problem is still there ........
April 17th, 2012 4:21am
I have a similar problem. Every field in my CSV is surrounded by double quotes. Using the double quote as a text qualifier, I get random/intermittant columns that decide to leave the trailing double-quote. The Degenerate Dimension
August 9th, 2012 2:23pm
I developed a script (to be put inside a Script Transformation component or as PoSh) to remove any fooling characters and change the delimiter effectively resolving this issue: SSIS: How to Remove Occasional Quotes and Replace The Column DelimiterArthur My Blog
August 9th, 2012 2:36pm