Complex situation: please help.
The problem looks simple but is complex. I have to copy data from a source system which has 86 tables. These tables are imported as-is into staging. The first time the job runs it creates these 86 tables and copies the data. Then there is a daily job which incrementally tops up the data in staging. The staging data is never deleted. The issues are:
1. When any new table is added in the source, it should be automatically created in staging.
2. How to incrementally load the data from source to staging.
3. If any row is updated in the source, that row should be marked as old and the updated row from the source should be loaded as a new row in staging.
I hope there is someone who can precisely guide me. Hrid.
January 14th, 2011 2:55am

Tell us more about what you have done so far. How did you create the ETL package for 86 tables? Did you use 86 sources/destinations, or do all the tables have the same structure and you used a loop with a single source/destination? We can help if you provide some preliminary information about your ETL package first.
http://www.rad.pasfu.com
January 14th, 2011 3:36am

Thanks Reza for replying. So far I have used a loop to copy all 86 tables. It is a one-to-one transfer: every table keeps the same structure and the same name. I am stuck on the next part, which I have already mentioned. Mentioning it again below; I need your help, otherwise this is a real problem for me.
1. When any new table is added in the source, it should be automatically created in staging.
2. How to incrementally load the data from source to staging.
3. If any row is updated in the source, that row should be marked as old and the updated row from the source should be loaded as a new row in staging.
January 14th, 2011 9:31am

OK, where do you get the list of tables from? I recommend selecting all tables from sysobjects with: select name from sysobjects where xtype='U'. Use a first Data Flow Task with an OLE DB source containing that query, and store the results into an Object-type variable with a Recordset destination. In the next step, use a Foreach Loop with an ADO enumerator to loop through the object variable and store each iteration's table name into a String-type variable. Inside the Foreach Loop, put the Data Flow Task which copies the table from the source database to the destination; use "table name or view name from variable" as the data access mode in the OLE DB source/destination and map the columns. Note that you should set default values for the variables for validation purposes. This guarantees that every time the package runs it considers all tables in the source database, so any new tables are picked up and imported. As for the incremental load, this is a good article and worth reading: http://sqlblog.com/blogs/andy_leonard/archive/2007/07/09/ssis-design-pattern-incremental-loads.aspx
http://www.rad.pasfu.com
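For the "new tables" part, here is a minimal T-SQL sketch of the idea, assuming the source and staging databases are named Source and Staging and the table name SomeNewTable stands in for the value the Foreach Loop supplies; all of these names are illustrative, not from the thread:

-- Metadata query for the first Data Flow's OLE DB source (lists all user tables):
SELECT name FROM sysobjects WHERE xtype = 'U';

-- Hedged sketch: create an empty copy of the table in staging if it does not exist yet.
DECLARE @TableName sysname = N'SomeNewTable';   -- in the package this comes from the loop variable
IF OBJECT_ID(N'Staging.dbo.' + @TableName) IS NULL
BEGIN
    DECLARE @sql nvarchar(max) =
        N'SELECT * INTO Staging.dbo.' + QUOTENAME(@TableName) +
        N' FROM Source.dbo.' + QUOTENAME(@TableName) + N' WHERE 1 = 0';
    EXEC (@sql);
END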
January 14th, 2011 9:44am

This pertains to numbers 2 and 3: this will be very easy if your tables have columns that indicate the timing of the data. It helps for every table to have Load_Date, Mod_Date, and/or Valid_From and Valid_To date columns, or you can have an Active flag or a checksum. Then, when the data goes to insert or update a record, do a Lookup on the key. If the record exists, you can check whether it has changed by comparing the mod date or checksum, and then update the valid_from/valid_to columns or the checksum column, whichever you choose to do. If this is new to you or seems like it could be helpful, let us know and we can explain in greater detail. Brian
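To make that concrete, here is a minimal per-table sketch of the expire-and-insert pattern described above, assuming Valid_From/Valid_To columns, a business key, and Source/Staging database names; every table, key, and column name here is an assumption for illustration only:

-- Expire the existing current row when the incoming version differs:
UPDATE stg
SET    stg.Valid_To = GETDATE()
FROM   Staging.dbo.SomeTable AS stg
       JOIN Source.dbo.SomeTable AS src
         ON src.BusinessKey = stg.BusinessKey
WHERE  stg.Valid_To IS NULL
  AND  CHECKSUM(src.Col1, src.Col2) <> CHECKSUM(stg.Col1, stg.Col2);

-- Insert changed (or brand-new) source rows as the current version:
INSERT INTO Staging.dbo.SomeTable (BusinessKey, Col1, Col2, Valid_From, Valid_To)
SELECT src.BusinessKey, src.Col1, src.Col2, GETDATE(), NULL
FROM   Source.dbo.SomeTable AS src
       LEFT JOIN Staging.dbo.SomeTable AS stg
         ON stg.BusinessKey = src.BusinessKey
        AND stg.Valid_To IS NULL
WHERE  stg.BusinessKey IS NULL;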
January 14th, 2011 10:32am

Hi Brian, thanks for your valuable views. In the source database we have a row version and no history is maintained; a row is simply updated in place in the source. In staging I already have the original row, so the next time I copy that updated row from the source, the original version of the row should be marked as expired and the updated row becomes the current/new one. The main issue is the 86 tables: I am copying these tables as-is via a loop, since it is not feasible to have 86 Data Flow Tasks. Please guide me on how to handle these row changes in this setup. Hrid
January 14th, 2011 11:01am

@Hiradyavan: did you try my suggestion?
http://www.rad.pasfu.com
January 14th, 2011 11:53am

Hi Reza, yes, your first suggestion worked successfully; I was about to write to you. I am now stuck on the incremental load and how to mark changed rows. Since I am looping through and there is no physical DFT per table, I have no idea how to do it. Regards, Arvind Garg
January 14th, 2011 12:08pm

"Since I am looping through and there is no physical DFT per table, I have no idea how to do it." Why do you think there is no DFT? You should use a Data Flow Task inside the Foreach Loop, and you can implement the incremental load in that data flow. Don't worry about the source and destination tables for this data flow task; it gets the source and destination table names from package variables, which are reset in each loop iteration. Does that make sense to you?
http://www.rad.pasfu.com
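As a rough illustration of what one loop iteration effectively does (in the package the data flow handles this via "table name from variable"; the Source/Staging database names and the sample table name below are assumptions), the per-table copy amounts to something like the following, with the incremental filter/lookup from Andy Leonard's pattern sitting between source and destination inside the data flow:

-- Rough T-SQL equivalent of one Foreach Loop iteration's full-table copy.
DECLARE @TableName sysname = N'SomeSourceTable';  -- hypothetical; set by the loop variable
DECLARE @sql nvarchar(max) =
    N'INSERT INTO Staging.dbo.' + QUOTENAME(@TableName) +
    N' SELECT * FROM Source.dbo.' + QUOTENAME(@TableName);
EXEC (@sql);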
January 14th, 2011 1:02pm

Yes, it makes sense. So what I do is create a DFT which loops as many times as there are tables. Now the concept is clear to me; let me see how successful I am in implementing it. Thanks for giving me the concept.
January 14th, 2011 1:09pm

How to implement this: if a row changes in the source, then the row which was already copied to the destination at some earlier date should be marked as old in the target, and the updated row should be copied again as a new row. Please guide me on this too.
January 14th, 2011 1:12pm

You can find the rows which have changed using the solution in Andy's article, then use an OLE DB Command to set an IsOld field to true, and insert the new row within the data flow task.
http://www.rad.pasfu.com
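For one table, the row-level effect of that OLE DB Command plus insert is roughly the following; the IsOld column, the key/column names, and the sample values are assumptions for illustration (in the data flow the values come from the current row's columns):

-- Sample values standing in for the columns of the changed row:
DECLARE @ChangedKey int = 42, @NewCol1 varchar(50) = 'new value', @NewCol2 int = 1;

-- Mark the existing current version of the row as old:
UPDATE Staging.dbo.SomeTable
SET    IsOld = 1
WHERE  BusinessKey = @ChangedKey AND IsOld = 0;

-- Then insert the updated row from source as the new current version:
INSERT INTO Staging.dbo.SomeTable (BusinessKey, Col1, Col2, IsOld)
VALUES (@ChangedKey, @NewCol1, @NewCol2, 0);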
January 14th, 2011 1:51pm

A little change in the requirements; I hope you don't mind helping me out again.
1. On the first load, the table should be created in the target.
2. Thereafter, only append the data and create any new tables.
3. If the data structure of any table changes in the source, the load should be skipped for that table and data appended for the other tables.
4. An email should be sent reporting on the process: how many new tables were created, how many rows were appended for the existing tables, and which tables were rejected (those whose data structure changed in the source).
Please help me step by step.
January 17th, 2011 2:37am

This topic is archived. No further replies will be accepted.
