Flat file import behaves differently between Visual Studio 2008 and SQL Server 2008
Hi everyone,
I am having an SSIS flat file import problem. I have a file which contains a list of names and identfiers that I want to import into a table called Names
eg. My flat file looks like:
"ID","NAME", ADDRESS1, ADDRESS2
"AA","Acme, Manufacturing Company","1 Main Street","Main Square"
"AB","Abacus, Counting Instruments","20 Number Way","Counstville"
I have used Visual Studio to create a SSIS job with a Flat File connection. When I run the job from Studio, I get the following results into the Names table:
ID, NAME, ADDRESS1,
ADDRESS2
AA, Acme, Manufacturing Company,1 Main Street, Main Square
AB, Abacus, Counting Intruments,20 Number Way, Counstville
The results are as expected, and everything is fine.
But since I deployed the job to SQL Server, I started getting truncation errors. In order to discover what was going on, I set the data flow task to ignore truncation errors on all imported fields. Now I am getting the results:
ID, NAME, ADDRESS1, ADDRESS2,..
"AA", "Acme, Manufacturing Company", "1 Main Street"
"AB", "Abacus, Counting Instruments", "20 Number Way"
In other words, the server has ignored the double-quotes as text qualifiers in the import file and where it has come across a comma in a text qualified field, it has taken that to be a field delimiter, and split the data accordingly even though it should
not have.
Over time, I have had various problems importing CSVs with SSIS. But what's baffling me is why the processing difference between Visual Studio and SQL Server initiated SSIS jobs. This is not a permissions problem as both jobs run successfully,
but the results are different between them.
Is it possible this is to do with component versions? I am running Visual Studio 9.0.30729.1 SP. The SQL Server 2008 says it is version 10.0.1600.22.I also set the SQL job to run in 32-bit mode, but still the same results.
So to work round this problem, I have been looking at a script task to take entire lines from the CSV file and parse out the fields using a method described in
http://social.msdn.microsoft.com/forums/en-US/sqlintegrationservices/thread/302b1c34-0198-46b1-a748-8a5a35d5b823/. Although I haven't quite got it working successfully using C# script.
But is there an easier answer? I know SSIS doesn't exactly follow the CSV specification standard, out of the box, but why the difference between a job run on the client and the same job on the server? Using the script
task makes it more fiddly to update any future changes to the CSV file format.
If anyone knows why this is happening or has any pointers or comments, I would appreciate them. Thanks.
May 11th, 2011 1:47pm
I've never experienced this before and this question is probably going to sound silly but have you set the
Text Qualifier property of the Flat File Connection to ". Need to ask just in case :-)Jeff Wharton MSysDev (C.Sturt), MDbDsgnMgt (C.Sturt) MCT, MCPD, MCITP, MCDBA, MCSD, MCAD
Free Windows Admin Tool Kit Click here and download it now
May 11th, 2011 1:53pm
Sorry, I should have been clearer:
The Visual Studio result, looks like:
ID, NAME,
ADDRESS1, ADDRESS2
AA Acme, Manufacturing Company 1 Main Street Main Square
AB Abacus, Counting Intruments 20 Number Way Counstville
whilst the SQL Server Agent result looks like:
ID, NAME, ADDRESS1,
ADDRESS2,..
"AA" "Acme Manufacturing Company"
"1 Main Street"
"AB" "Abacus Counting Instruments"
"20 Number Way"
May 11th, 2011 1:54pm
Very strange indeed. The only time I've seen this is when a Text Qualifier has not been set. I will need to watch this post to see what the solution
turns out to be.
Jeff Wharton MSysDev (C.Sturt), MDbDsgnMgt (C.Sturt) MCT, MCPD, MCITP, MCDBA, MCSD, MCAD
Free Windows Admin Tool Kit Click here and download it now
May 11th, 2011 2:09pm
I set the text qualifier to " in the General tab. The header row delimiter is {LF}. In the Columns tab, the Row delimiter is {LF} and the Column delimiter is Comma {,}. In the Advanced tab, all the fields are set to TextQualiied=True (also
column delimiter=comma{,}). As mentioned this job works fine when run from Visual Studio, but produces different results when deployed and run from SQL Server Agent.
May 11th, 2011 2:17pm
That's what makes this so strange.Jeff Wharton MSysDev (C.Sturt), MDbDsgnMgt (C.Sturt) MCT, MCPD, MCITP, MCDBA, MCSD, MCAD
Free Windows Admin Tool Kit Click here and download it now
May 11th, 2011 2:24pm
I have just tried this again with a "cut-down" SSIS package to demonstrate the problem. But the results are the same.
I have a flat file connector, set with a UNC path that is accessible to both VS client and SQL Server, and I have a SSIS configuration file which I copied to the SQL Server computer in the same directory path as the VS client computer. The configuration
file contains just the SQL data connector information. These portions seem to be working. In the package designer, I have a Data Flow task in the Control Flow pane, and in the Data Flow pane, there is the Flat File Source component as data scource.
This goes straight to a OLE DB Destination component, which is connected to a table in my SQL Server database with appropriate field mappings. Again the results are as expected with VS, but SQL Server seems to ignore the " text qualifier and
brings the " characters into the import. Then it runs into field truncation problems because the " imported characters expands the data field lengths, and especially because it then thinks any "embedded" commas present in a text qualified string
are field delimiters. Surely this is not expected behaviour?
I appreciate the comment about the text qualifier. But SQL Server seems to ignore it in my case, whilst VS works with it properly. I do not have sysadmin access to the SQL Server or much admin access to the Windows Server it is on.
So I will try this job again later today by deploying it to a different SQL Server where I have total admin access, to eliminate if there are any server setup problems such as component versions and updates. Like Jeff,
I haven't come across this before. I have written various text file import routines before with SSIS, without using much more than the known-about workarounds for CSV file import issues. I don't really want to go down the line of preparing the CSV
file to strip embedded commas and text qualifiers characters before importing. Although I am already pre-processing the import file to remove embedded double-quotes. The only other workaround I can currently see is from the previous link at
http://social.msdn.microsoft.com/forums/en-US/sqlintegrationservices/thread/302b1c34-0198-46b1-a748-8a5a35d5b823/. But why this difference between client and server initiated SSIS jobs?
May 11th, 2011 3:45pm
I will also see if I can replicate this at work tomorrowJeff Wharton MSysDev (C.Sturt), MDbDsgnMgt (C.Sturt) MCT, MCPD, MCITP, MCDBA, MCSD, MCAD
Free Windows Admin Tool Kit Click here and download it now
May 11th, 2011 3:52pm
I have now tested this on a different SQL Server which worked correctly. I tried the example and my actual import routine. Both worked correctly when started from VS and as SQL Server Agent jobs. So I do not think this is to do with SSIS.
I can only think it's to do with the SQL Server I am doing this work on. I will have to find out if it is fully updated and patched. The only other difference is the work server is 64-bit, but I would'nt have thought that was the problem.
Thank you for looking at this. If I find out the actual cause, I will post it up.
May 11th, 2011 11:54pm
Hi Faisel,
Did you ever find a solution to this problem?
I'm now struggling with the same thing. Import works fine on my development machine, breaks on the client's server. I'd love to hear any suggestions.
Free Windows Admin Tool Kit Click here and download it now
July 30th, 2011 8:59am