Newbie to SCD Type 2
Hi I am working on SCD Type 2 Part of SSIS. I had the unique key called GRPKEY on the basis of that key I am isolating matching and Not matching Inputs. At the same time for the non matched records, I generating the new record using the Script transformation
on the basis of Max(GrpKey) and finally I put the Oledb Destination to dump the new records. This didn't work as it doesn't show any records. As in the data viewer, it just doesn't shows any of the records. As for the Matched records, I need to update the
record and make it as an expired record by updating the enddate With GETDATE() and updating the indicator column 'CurrentInd' as N which suggests the record is expired on that particular day after it is updated and then a new record is generated for that particular
record and keeping the startdate as GETDATE() and inserting the CurrentInd as 'Y'. I can theoritically think of that but I dont know how COuld I do it??? ANybody could youplease explain me how to do it? Thanks in AdvanceSri
July 3rd, 2011 3:07pm
use the SCD Kimball to implement SCD Type 2:
http://dimensionmergescd.codeplex.com/http://www.rad.pasfu.com
Free Windows Admin Tool Kit Click here and download it now
July 3rd, 2011 6:25pm
Hi Sri,
Let me explain you how Type 2 works in SSIS.
From the Source table, do a lookup based on the unique key.Give the "Lookup No match output" to a derived column. In the derived column create two new columns, one for start_date (GETDATE()) and other for record_indicator ("Y"). "Y" indicates new record.
Give the output to an OLEDB Destination and insert the records in source table into target table.
Now if an update or insert happens on the source table, lets see how we can capture that information. From the lookup that we used previously, give the "Lookup Match Records" to a sort transformation. Also take an OLEDB Source and select the target table inside
that. Do sorting on that as well. Now give the sort outputs of both the lookup match output and the target table (which you selected in the OLEDB Source) to a Merge Join transformation. Inside that do a left outer join and select the columns from the target
and the source (source records comes from the lookup match output). For not getting confused, please rename columns coming from target and source (like column_t for target and column_s for source column).
Give the output of the merge join to a multicast. One output of multicast goes to an aggregator inside which you have to do "Max(surrogate_id of target table). Sort the other output from the multicast and also the output of the aggregator and give the two outputs
to yet another Merge Join. Do a simple join inside. Give the output of that to a conditional split transformation. Identify the critical columns in your case and give the condition like this:
(GRPKEY_S==GRPKEY_T)&&(SURROGATE_ID_T==MAX_SURROGATE_ID_T)&&(CRIT_COL1_S!=CRIT_COL1_T || CRIT_COL2_S!= CRIT_COL2_T).
Give the output of that to another multicast. One output of that will go to a derived column. Inside that create two new columns, one for end_date (GETDATE()) and the other for the record_indicator ("N"). "N" indicates old record. Give the output from the derived
column to an OLEDB Command. Inside that give an update command to update the target table.(Eg: Update target table set end_date = ?, record_indicator = ? where surrogate_id = ?). Make sure to map the parameters correctly in the column mappings.
Finally give the other output from the second multicast to another derived column. Inside the derived column create two new columns, one for start_date (GETDATE()) and other for record_indicator ("Y"). "Y" indicates new record, for any new row gets which gets
inserted. Give the output of that to an OLEDB Destination and insert the new row to the target table.
This is one way of doing Type 2 transformation is SSIS. I'm sure experts can come up with much simpler ways to achieve this. Hope this helps you.
Cheers....Sharath
July 4th, 2011 7:17am
Hi
Before creating a package first you have analysis the target dimension table and source table or file ...
And then you find out which is the unique key..
i give you one example :
source - customer ,sales,product
customer -columns - custid,custnm,phone no
target - Customer dim ,sales dim ,product dim
customer dim is scd2
columns :custsk,custid,custnm,phone no,inserted date,modified date,statusflag
then find out the critical column ...phone no
in packages
ole db source
cust dim
target Ole db table
look up transformation in cust id into customer dim
sort by cust id and
filter by status flag = "Y"
if it is not present
if it is not matching
put one derived column
put one short transformation
then you join
add two column
sort by cust Id
1)inserted date getdate()
2)statusFlag "Y"
merge join with cust_id
insert into table customer dim
using ole db command
to update modified date= getdate
status = "N"
put one derived column
add two
column 1)inserted date getdate() 2)statusFlag "Y"
insert into table customer dim
thanks
nathan
Free Windows Admin Tool Kit Click here and download it now
July 4th, 2011 8:52am