CSV - missing delimiter and added End-of-line causes wrong row to be loaded
hello,
i understand that there is a fair amount written about CSV and wrong number of columns - but this is a more peculiar (by no means unique) situation.
the CSV files in question have the following content:
<quote>
"SomeBusiness and AnotherBiz, Supplier,<LF>
<LF>
Run date/time: 01-JUN-12 11.03.21"<LF>
<LF>
"Prov SomeBusiness Name:<All>","Prov Org Name:<All>","SomeBusiness Name:<All>","Comm Org Name:<All>"<LF>
<LF>
"colHeader 1","colHeader 2","colHeader 4"<LF>
"some titleA1","some otherB1",5,6<LF>
"some titleA2","some otherB2",12,-8<LF>
</quote>
these files are UNIX format (EOL = LF \n) and are encoded in UTF8.
the problems;
in line 1
there is no Delimiter to end the String - that is because it is ended on line 3.
however, we have to work around the EOL that were added on this file.
what would be a decent approach to correct these Strings that are split across multiple lines?
thanks,
Nicolas
data warehouse | data migration | ETL/ELT | UK | http://www.brainpowered.net/cms
June 6th, 2012 7:03am
Hello Nicolas,
I advocate you use the following approach:
Set your flat file conn manager to take this file as a single column file (it will have one long textual column defined in it);After you are able to load (hopefully a not too large file), you can further split the records when you read from it based on the double LFs;The records will go to separate files based on some lofic (rules you see fit best), finallyLoad/process each such file as you need.
PS: perhaps further splitting is necessary for some rows.
Please let us know if you need help with the split component, but here is a blog post I deem helpful: http://beyondrelational.com/modules/2/blogs/106/posts/11126/ssis-script-component-split-single-row-to-multiple-rows.aspxArthur My Blog
Free Windows Admin Tool Kit Click here and download it now
June 6th, 2012 9:22am
Hello Nicolas,
I advocate you use the following approach:
Set your flat file conn manager to take this file as a single column file (it will have one long textual column defined in it);After you are able to load (hopefully a not too large file), you can further split the records when you read from it based on the double LFs;The records will go to separate files based on some lofic (rules you see fit best), finallyLoad/process each such file as you need.
PS: perhaps further splitting is necessary for some rows.
Please let us know if you need help with the split component, but here is a blog post I deem helpful: http://beyondrelational.com/modules/2/blogs/106/posts/11126/ssis-script-component-split-single-row-to-multiple-rows.aspxArthur My Blog
June 6th, 2012 9:22am