concatenating similar files from different folders
Hi, I have some folders, namely FolderA, FolderB, and so on, and each folder contains some files such as File1, File2, etc. File1 can be present in both FolderA and FolderB, and so on. I want to create an SSIS package that merges all the similar files; for example, File1 from FolderA and File1 from FolderB have to be merged. How can I proceed with this? All suggestions are most welcome. Also, which one will be more efficient: SSIS or Windows scripting? Thanks in advance, Nikesh
July 26th, 2012 4:52am

What do you mean by merging files? Do you mean appending rows, or merging columns? http://www.rad.pasfu.com
July 26th, 2012 5:01am

Hi Reza, I need to append the similar files, not merge columns. File1 in FolderA and File1 in FolderB will have the same metadata, so we can append them. Thanks, Nikesh
July 26th, 2012 5:21am

Hi Nikesh, let's say File1 is:
Name,id
anil,1
sunil,2
and File2 is:
Name,id
ram,3
shyam,4
and your merged file is:
Name,id
anil,1
sunil,2
ram,3
shyam,4
If that is what you want, use the Union All component in a Data Flow Task; it will do it. But the two source components should refer to the respective folders' files.
July 26th, 2012 5:30am

Hi Kingxxx1, Thanks for your reply. But the problem is that it has to be dynamic. Below are the challenges I am facing. 1 -> How to read all the files from the different directories? Suppose my parent directory is C:\data and it has different folders like FolderA, FolderB, and so on. All the files are distributed among these folders. 2 -> When a particular file, say wc12345.dat, is found in FolderA, I need to check for the occurrence of the same file wc12345.dat in all the other folders and append it to that file. Hope you understood the problem. Thanks, Nikesh
July 26th, 2012 5:50am

Yes, as Kingxxx1 said, you can use the Union All transformation, but detailed help needs more information from you. Do those files have the same structure? I mean, do all the File1s have the same structure? How about the File2s? Or do all the files share one structure? Union All seems good if all files have the same structure, but if they differ we need to think about other methods. Could you explain more about your case? http://www.rad.pasfu.com
July 26th, 2012 6:12am

Hi Nikesh, In this case you have two options: 1. Go for a File Watcher Task (third-party component). 2. Retrieve the full name of the file into a variable. In a script, get the file name, store it temporarily, and search for the same file in the other locations; if it is found, get the location of the second file, store it in a variable, and pass that variable to the DFT source. Check the link below for a custom component, or you can use the Konesans File Watcher as well: http://www.sqlis.com/post/file-watcher-task.aspx But if you are restricted from using custom components, go for a WMI Event Watcher Task and do some workaround with a script.
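For the "search the other locations" part, the logic of the script can look roughly like the sketch below. This is just a plain C# sketch of the idea, not a ready Script Task; the C:\data layout and the wc12345.dat name are taken from your description, and everything else is only an example.

using System;
using System.Collections.Generic;
using System.IO;

class FindMatchingFiles
{
    static void Main()
    {
        string parent = @"C:\data";
        // The file that was just found in one of the folders (example path).
        string foundFile = Path.Combine(parent, "FolderA", "wc12345.dat");
        string fileName = Path.GetFileName(foundFile);

        var matches = new List<string>();
        foreach (string folder in Directory.GetDirectories(parent))
        {
            string candidate = Path.Combine(folder, fileName);
            // Skip the folder the original file came from; collect the rest.
            if (!string.Equals(candidate, foundFile, StringComparison.OrdinalIgnoreCase)
                && File.Exists(candidate))
            {
                matches.Add(candidate);
            }
        }

        foreach (string match in matches)
            Console.WriteLine("Same file found at: " + match);
    }
}

Each path collected in matches could then be handed to the data flow (or to an append routine) one at a time.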
July 26th, 2012 6:16am

Yes, the files with the same names will be identical in structure, so we can use the Union All operator on those files. File1 and File2 will be structurally different. Thanks, Nikesh
July 26th, 2012 7:51am

If you have different structures for each pair of files, then it means that if you want to use a Data Flow Task, you will need a specific one for each pair of files, because a Data Flow Task doesn't support dynamic metadata. So I think in your case writing a .NET script to do this is the better method. For reading data from a file you can use System.IO.StreamReader, and for writing data into a file you can use System.IO.StreamWriter. Or, if you are not comfortable programming in .NET, you can use some code that can be found for appending two files together, like this: http://stackoverflow.com/questions/5096194/c-appending-contents-of-one-text-file-to-another-text-file http://www.rad.pasfu.com
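For example, a minimal sketch of appending one text file to another with StreamReader/StreamWriter could look like this; the paths are only placeholders following the folder layout in your question:

using System.IO;

class AppendFiles
{
    static void AppendTo(string targetPath, string sourcePath, bool skipHeader)
    {
        using (var reader = new StreamReader(sourcePath))
        using (var writer = new StreamWriter(targetPath, append: true))
        {
            string line;
            bool firstLine = true;
            while ((line = reader.ReadLine()) != null)
            {
                // Optionally drop the header row of the appended file so the
                // column names are not repeated in the merged output.
                if (firstLine && skipHeader) { firstLine = false; continue; }
                firstLine = false;
                writer.WriteLine(line);
            }
        }
    }

    static void Main()
    {
        // Hypothetical paths matching the folder layout from the question.
        AppendTo(@"C:\data\FolderA\wc12345.dat", @"C:\data\FolderB\wc12345.dat", skipHeader: true);
    }
}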
July 26th, 2012 8:03am

Hi Reza, take a look at this. 1 -> How to read all the files from the different directories? Suppose my parent directory is C:\data and it has different folders like FolderA, FolderB, and so on. All the files are distributed among these folders. 2 -> When a particular file, say wc12345.dat, is found in FolderA, I need to check for the occurrence of the same file wc12345.dat in all the other folders and append it to that file. What can be the better solution for this?
July 26th, 2012 8:16am

Hi Nikesh, Here is a rough sketch of what you can do: 1 - Loop through all the files in your system using a Foreach Loop Container; set it to iterate through the files in a folder (or use a similar scheme to loop through multiple folders). 2 - Add a Data Flow Task within the Foreach Loop Container, and set the source to your flat files and the destination to a Flat File Destination from the toolbox. 3 - In the data flow, use Lookup and Conditional Split to ignore the duplicate rows. Hope this helps.
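If you end up doing the append in a .NET script instead of a data flow, the same "ignore the duplicate rows" idea can be sketched with a HashSet while appending; the paths below are placeholders, not part of your package:

using System.Collections.Generic;
using System.IO;

class AppendDistinctRows
{
    static void Main()
    {
        string target = @"C:\data\merged\wc12345.dat";   // hypothetical output file
        string[] sources =
        {
            @"C:\data\FolderA\wc12345.dat",
            @"C:\data\FolderB\wc12345.dat"
        };

        Directory.CreateDirectory(Path.GetDirectoryName(target));

        var seen = new HashSet<string>();
        using (var writer = new StreamWriter(target, append: false))
        {
            foreach (string source in sources)
            {
                foreach (string line in File.ReadLines(source))
                {
                    // Write each distinct row only once across all source files.
                    if (seen.Add(line))
                        writer.WriteLine(line);
                }
            }
        }
    }
}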
July 27th, 2012 10:24am

Hi, I could implement this, and the steps are given below: 1 -> A Foreach Loop Container over the files, with the option to traverse subdirectories. 2 -> In the container, one Script Task to set the dynamic connection strings (see the sketch below). 3 -> A File System Task to create dynamic directories based on the file names. 4 -> A Data Flow Task with flat files as source and destination; all the similar files get appended to the destination file. Here we had 3 kinds of files with different metadata, and they are identifiable by their names, so we can have a separate Data Flow Task for each of these 3 types of files. One drawback is that whenever a new file is added, say a new file with different metadata, we need to alter the package to accommodate it. Thanks for all your suggestions, Nikesh
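For reference, the Script Task in step 2 can be as small as the sketch below (this goes inside the Main method of the Script Task's ScriptMain class). The variable name User::FilePath and the connection manager name SourceFlatFile are only placeholders, not the exact names used in the package:

public void Main()
{
    // The file path the Foreach Loop Container is currently processing.
    string currentFile = Dts.Variables["User::FilePath"].Value.ToString();

    // Point the flat file connection manager at that file so the
    // downstream Data Flow Task reads the right source.
    Dts.Connections["SourceFlatFile"].ConnectionString = currentFile;

    Dts.TaskResult = (int)ScriptResults.Success;
}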
August 2nd, 2012 3:26am
