Extracting value from 1st line of flat file, extracting data from all other lines, then merging
Hi there, I have a flat file with data that's fixed width; however, the 1st line of the file (the header) is a different format than all the detail lines of the file. I need data from the header line AND the detail lines. I need to extract one value from that 1st line, read in some values from all the subsequent detail lines, then apply the value from that first line (header) to all the rows I read from the detail lines. Example of file: HEADER20111301TESTDATAAA DETAIL1TESTDATA1TESTDATE2 DETAIL1TESTDATA3TESTDATE4 DETAIL1TESTDATA5TESTDATE6 So I need to pull out 20111301 from the first line (easy to do), then pull out the data from subsequent lines (TESTDATA1, TESTDATA2, etc). My destination table has three fields: all the detail data PLUS the one value from the header line I pulled out applied to each row. I considered this: - Flat File Source for the detail lines - Script Component as another source for just the header line of the same file - Merge Transformation to combine the two. Any better suggestions? I considered using the Import Column transformation too to get the data from the header line. Both options seem like overkill. Thanks
June 20th, 2011 6:53pm

... Example of file: HEADER20111301TESTDATAAA DETAIL1TESTDATA1TESTDATE2 DETAIL1TESTDATA3TESTDATE4 DETAIL1TESTDATA5TESTDATE6 ... I considered this: - Flat File Source for the detail lines - Script Component as another source for just the header line of the same file - Merge Transformation to combine the two. Any better suggestions? I considered using the Import Column transformation too to get the data from the header line. Both options seem like overkill. After the file source I would use a Split component and do the split based on the expression Substring(1stCol,1,6) = "HEADER", in one flow path I would capture that date and save to a variable which will be later used in an update T-SQL used in an Execute SQL Task to set that fields in your table. The data from the other data flow will be directed to the target staging table.Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
June 20th, 2011 7:02pm

Yes, I considered a Conditional Split as well but I thought that too was overkill - it has to scan every line of the input when I'm only concerned about the 1st - the header line. Another possible option maybe: After the Flat File Source do a Script Component transformation where I extract the header (and value I need) from the file in its PreExecute and add the value to the row when processing the row?
June 20th, 2011 7:13pm

No, if you want top efficiency you need to pre-process the file in terms of an actual split. I mean you would chop off its header in a first task using Script Task thus making two files out of it utilizing some System.IO and then use two flat file sources from that point.Arthur My Blog
Free Windows Admin Tool Kit Click here and download it now
June 20th, 2011 7:23pm

No, if you want top efficiency you need to pre-process the file in terms of an actual split. I mean you would chop off its header in a first task using Script Task thus making two files out of it utilizing some System.IO and then use two flat file sources from that point. I was hoping to keep all the necessary components in one Data Flow - in particular, handling the Header data; but I think that may be the best solution: extracting the Header data as a Script Task before the Data Flow. Thanks
June 20th, 2011 7:47pm

Alot of the files I work with have the header/detail/trailer pattern. The easiest way we have found to deal with this is to split the rows based on record type and load into thier own import table and manipulate the data however we need in T-SQL.
Free Windows Admin Tool Kit Click here and download it now
June 20th, 2011 8:02pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics