Lookup primary key of a master table which is does not contains unique key
Hi I have a strange requirement in ETL operation. My Source contains only the details table data. Out of it, I have to load the master table and refer the Master table primary key ID to load Details table. I can easily load the Master Table with aggregate transformation. But problem is how to look up the Master Table Primary Key ID to load the Details table, as the master does not contains any unique key to lookup. This may seems to be strange but this is my requirement. You can refer the Source and Destination data model as below. They may give you clear picture. Can you guys help me out on this? My Source Table -------------------------- Citye Type City Name Source New York Via City1 Pittsburg Destination City Chicago Source New York Via City1 Philadelphia Destination City Chicago My Destination ------------------------ Master Table ID 1 2 Detail Table Master Table ID Citye Type City Name 1 Source New York 1 Via City1 Pittsburg 1 Destination City Chicago 2 Source New York 2 Via City1 Philadelphia 2 Destination City Chicago Thanks
March 22nd, 2008 1:54am

I don't understand what is the issue very well. What would be logic behind:"I can easily load the Master Table with aggregate transformation."In other words, What does it make a Master ID?Cannot you just use a similar logic to derived the ID in the details rows?I would probably try using a script component to generate and assign the Master ID and then split the pipeline inot master and detail rows
Free Windows Admin Tool Kit Click here and download it now
March 22nd, 2008 3:32am

I hope you understand the data model given by me. Let me explain you in detail. I wanted toload the destination which has two Tables named Master and Detail Table. Detail tablehas reference of Master Table Primary Key. My source data will be used to load the Detail table. But to load the details table, I need the reference key which is Master table's primary key. In this case, what I do usually. First, I load the master table without duplicates row (as source may contains master and details table data in a single row). To achieve this, I use Aggregate Transformation to remove duplicates. Once Master is Loaded, I create another dataflow task to load the detail table. Again I use the same data source, and Lookup (Transformation) the (loaded) master table with the keys which I used to Aggregate for the Master table. (Note that these keysshould be unique to identify a source row.If not, lookup (transformation) results more than onerow and dont get the correct primary key ofMaster Table).Assumes that, look up results exactly single rowandable to get he Primary key of master table. Now our detail table is ready to load as we got the master table primary id. This is how I do master and details table loading. But in a certain case, I dont have data to identify master table's uniqueness. If you look at my Data model in the previous post, You may find out that there are no data to identify uniqueness. So I will not get the exact Master Table primary key id to load the detail table. I hope you understand the issue. Can you please help me out? Thanks
March 22nd, 2008 5:04am

assuming that the resulting database be use for an OLTP application, the master table appears to be incomplete. however, the master table should have unique rows, which it does. the cities can be added to the master table since they're also unique. so, the master table can be called "Cities". the city types are also unique, but they're not unique to cities. so, the city types should be in a seperate table that can be called "city types". now, there should be two tables: "cities" and "city types". i don't know what the details table should represent, but i'm assuming that they represent routes. however, you haven't provided enough information about what makes up a route.hth
Free Windows Admin Tool Kit Click here and download it now
March 22nd, 2008 10:54am

This requirement is for OLTP database. But I have to load the OLTP database using my Excel Source through SSIS. Answer to your question why I need master/details table?. The master table used as Root table for Unique cities (detail table 1) and Charges table (details table 2, I did not specify this table in my earlier post) for those unique cities or route. For Example, The route, From Source City: New York ->Via City 1: Philadelphia -> to Destination City: Chicago has different types oftravelling Feautures and each has different amount of charges. More in detail, for the above said route, there could be a Bus with basic features say which cost $ 100. as well another bus with Additional features say which cost $ 200. You can refer the below data model to reperesent the above scenario. So Master Table has two details table one has list of cities in a specific route and anotherlist of charges for travelling features. I hope I provided enough infomation that why I need master table. As you suggested, "city type" could be a another table. I agree with you this point but not "Cities" table as it requires Master. Master Table Route No/ID Route Name 1 NYCCHI Detail of Cities (1) Route No/ID City Type City Name 1 Source City New York 1 Via City 1 Philadelphia 1 Destination City Chicago Details of Amount (2) Route No/ID Features Types Amount 1 Basic Features 100 1 Additional Features 200
March 22nd, 2008 10:32pm

Well, you can't relate the tables unless you have something to actually relate them on. If it is the unique combination of cities in the route, then you might want to store that in the master table, or in a secondary table that can be used to map a set of cities to the route ID. If you can generate this unique master table from the detail data, there's no reason you can't use the same logic to look up the appropriate route.
Free Windows Admin Tool Kit Click here and download it now
March 24th, 2008 2:29am

ok. i'm assuming that all the routes are unique, otherwise, i don't think this database would make sense. the "routes" table is the master and the columns would be as follows: ID, Name, Source, Via City 1, Destination, Feature Type. the "cities" table would have the following columns: ID, Name. the "city types" table would have the following columns: ID, Type. the "features" table would have the following columns: ID, Type, Amount. the "cities" table would be related to the "routes" table on "Name". the "city types" table would be related to the "routes" table on "Source", "Via City 1", and "Destination". the "features" table would be related to the "routes" table on "Feature Type".hth
March 24th, 2008 6:33am

I don't see how you can get from here Amzu wrote: My Source Table -------------------------- Citye Type City Name Source New York Via City1 Pittsburg Destination City Chicago Source New York Via City1 Philadelphia Destination City Chicago to here: Amzu wrote: Master Table Route No/ID Route Name 1 NYCCHI Detail of Cities (1) Route No/ID City Type City Name 1 Source City New York 1 Via City 1 Philadelphia 1 Destination City Chicago Details of Amount (2) Route No/ID Features Types Amount 1 Basic Features 100 1 Additional Features 200Unless you source file has a route name (or any sort of master identification) for every row there is no way you draw a logic to tell which detail groups go together. So far, 3 of us have failed understanding the data source you are trying to use.BTW, this is not an SSIS issue but rather a data architecture one.
Free Windows Admin Tool Kit Click here and download it now
March 25th, 2008 4:52am

Rafael Salas wrote: Unless you source file has a route name (or any sort of master identification) for every row there is no way you draw a logic to tell which detail groups go together. So far, 3 of us have failed understanding the data source you are trying to use.BTW, this is not an SSIS issue but rather a data architecture one. It's not just you three, for what it's worth. I've been watching and listening to see what others suggest, because I simply can't make sense of this data model. I totally agree that the right thing here is to step back and design a proper data model instead of trying to use SSIS as a tool to make a bad database design work.
March 25th, 2008 4:57am

The badmodel aside (and it appears that Amzu has no control over it anyway), I think you can get what you need by using a Script Task. What you really need to do is identify each unique set of source to destination rows. Once this is done, mapping to the destination schema is simple. You can load the Excel data into a Data Flow, and then use a Script Task to loop through each row. Write logic to assign a sequential number (called group_id) to each group starting at the city_type of source and ending at the row with type Destination City for each set. This is assuming that City Type always starts with a Source type and ends with a Destination City type (like bookends). When you are done, your data flow should have 3 columns: group_id, city_type, and city_name. Now you should be able to use normal loading logic to give each group a unique key in the master table.
Free Windows Admin Tool Kit Click here and download it now
March 25th, 2008 11:59am

Sorry Rafael, I did not update the Source Table when I reply to Duane Douglas, Source table would be like this. Citye Type City Name Amount Feature Type Source New York 100 Basic Via City1 Pittsburg 100 Basic Destination City Chicago 100 Basic Source New York 75 Basic Via City1 Philadelphia 75 Basic Destination City Chicago 75 Basic Source New York 200 Additional Via City1 Pittsburg 200 Additional Destination City Chicago 200 Additional Source New York 150 Additional Via City1 Philadelphia 150 Additional Destination City Chicago 150 Additional Moreover if you look at my first post. I stated that it is an strange requirement. Please Refer below "I have a strange requirement in ETL operation. My Source contains ....... as the master does not contains any unique key to lookup. This may seems to be strange but this is my requirement. You .... me out on this?" I do agree that it is an Data model issue rather an SSIS. But looking at Data and Data Model, DB Designer wants to normalize the tables as much as he can, eventhough there are duplicate master table. I thing what Tod McKenna suggested will work for me. Thank you for you guys spending time on this issue.
April 1st, 2008 7:09pm

Detail and Master? Those are terms from network data bases, not SQL. My guess is that you are trying to model routes. The schgema for that is: CREATE TABLE Routes (route_nbr CHAR(3) NOT NULL, route_leg_start_city VARCHAR(25) NOT NULL, route_leg_end_city VARCHAR(25) NOT NULL, CHECK( route_leg_start_city <> route_leg_end_city), PRIMARY KEY ((route_nbr, route_leg_start_city, route_leg_end_city) ); --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL
Free Windows Admin Tool Kit Click here and download it now
March 30th, 2011 6:54pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics