How to clean up, deleting unwanted lines fromflat file in SSIS?
Hi All, I have to load a flat file data to database. The flat file is ; delimited, but the header information is repeating in the files, in between the data. Also there are some empty lines as well inbetween the data. What is the best way to clean the unwamted lines from the file. How do I achieve this. my data look something like this ##### 0 of 500 (below) ##### Record Number; Policy Number; Phase Code; SubPhase Code; Val Basis; Admin Plan Code; ERROR message A;B;C;D;E;F;G A;B;C;D;E;F;G A;B;C;D;E;F;G 22575 records have valued successfully in the GAAP #1 valuation 13946 records have errored in the GAAP #1 valuation ##### 1 of 500 (below) ##### Record Number; Policy Number; Phase Code; SubPhase Code; Val Basis; Admin Plan Code; ERROR message A;B;C;D;E;F;G A;B;C;D;E;F;G A;B;C;D;E;F;G 22575 records have valued successfully in the GAAP #1 valuation 13946 records have errored in the GAAP #1 valuation I want this to be cleaned up like Record Number; Policy Number; Phase Code; SubPhase Code; Val Basis; Admin Plan Code; ERROR message A;B;C;D;E;F;G A;B;C;D;E;F;G A;B;C;D;E;F;G A;B;C;D;E;F;G A;B;C;D;E;F;G A;B;C;D;E;F;G Also my files are big each one holding avg of 901000 records. Thanks all in advance. Ann
May 15th, 2012 10:20am

Couple options 1) Get whoever is providing the files to create them properly 2) Use a script source and strip the unwanted lines on the fly as you import it 3) use a flat file source and declare all of your columns as string - use a conditional split to remove the unwanted header lines 4) Build a file preprocessor in c# or VB to clean up the file before it gets touched by SSIS Chuck
Free Windows Admin Tool Kit Click here and download it now
May 15th, 2012 10:23am

I would suggest using just a single Split Component. You can use three outputs generated by the split. One is the needed data, the others would be the blanks and headers. You accumulate the data into the target, and if you do not need the headers then you just do not consume that output. You will need the proper SSIS Expression to detect the header and it seems that the correct one would be FINDSTRING(Column1,"#####",1) > 0 A SSIS Expression for blanks could be LEN(Column1)== 0 .... Arthur My Blog
May 15th, 2012 10:39am

read the txt file and read all as in one column , something like this link (this link has 3 pages) then use a script as oldJeep had mentioned (page 2) clean out the the unwanted records. and use Arthur suggetion , i would use a if statment to check if that line doesn't have a ";" semicolon, if not reject record Sincerely SH -- MCITP , MCTS -- Please kindly mark the post(s) that answered your question and/or vote for the post(s).
Free Windows Admin Tool Kit Click here and download it now
May 15th, 2012 1:53pm

thanks for your reply , i tried the same but looking for lines with ';' It worked.
May 16th, 2012 11:52am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics