Flat file source - split delimited address field to additional fields.
A client of ours can only export data in the following .csv delimited format: ID, Name, Company, Address, Town, County, Postcode, Country The Address field contains anything from 1 to 5 lines of local address data separated by Carriage Return. We need to get this data into our SQL database using SSIS on a daily basis. I have managed to import the file as Flat File, and have successfully changed the Carriage Returns in the Address field to commas. I now have data in the Address field that represents: Address "20 Test Street, Testville" "22 Test Avenue" "Flat 2, 41 Test Close, Testville" I would like the data to look like: Add1,Add2,Add3,Add4,Add5 20 Test Street,Testville,,, 22 Test Avenue,,,, Flat 2, 41 Test Close, Testville,, Flat 5, Test House, Test close, Test Road, Testville I have tried using Derived Columns and the FindString and SubString commands, but I can't seem to get it to work because of the differing number of commas in each record. Sometimes there are none (second row in above examples) and sometimes more (other rows in above example). To get the first value in the field I tried: SUBSTRING(AddressCommas,1,(FINDSTRING(AddressCommas,",",1) - 1)) but as stated it doesn't work with the differing number of commas. Are there any other ways I can go about this? Note: the client has no control over the quality of their data.
August 25th, 2012 1:54pm

A client of ours can only export data in the following .csv delimited format: ID, Name, Company, Address, Town, County, Postcode, Country The Address field contains anything from 1 to 5 lines of local address data separated by Carriage Return. We need to get this data into our SQL database using SSIS on a daily basis. I have managed to import the file as Flat File, and have successfully changed the Carriage Returns in the Address field to commas. I now have data in the Address field that represents: Address "20 Test Street, Testville" "22 Test Avenue" "Flat 2, 41 Test Close, Testville" I would like the data to look like: Add1,Add2,Add3,Add4,Add5 20 Test Street,Testville,,, 22 Test Avenue,,,, Flat 2, 41 Test Close, Testville,, Flat 5, Test House, Test close, Test Road, Testville I have tried using Derived Columns and the FindString and SubString commands, but I can't seem to get it to work because of the differing number of commas in each record. Sometimes there are none (second row in above examples) and sometimes more (other rows in above example). To get the first value in the field I tried: SUBSTRING(AddressCommas,1,(FINDSTRING(AddressCommas,",",1) - 1)) but as stated it doesn't work with the differing number of commas. Are there any other ways I can go about this? Note: the client has no control over the quality of their data. If you want some more flexibility instead of writing complex expressions, then you can use "Script Component" Transformation to use .NET code before your data is pushed to destination. some hints would be: - Add some derived Columns (Add1,Add2,Add3,Add4,Add5) - Add "Script Component" Transformation to the Data Flow Task - Modify the "Script component" using .NET code. There isn't much you can do with the data which is not normalized or is inconsistent. Hope this helps- Please mark the post as answered if it answers your question
Free Windows Admin Tool Kit Click here and download it now
August 25th, 2012 3:49pm

Hi Chris I have tried designing a package as per your req. I have uploaded it to my SkyDrive. Just change the Flat File Source Connection and run. dtsx Package: http://sdrv.ms/MN5gmc
August 25th, 2012 4:07pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics