Using Lookup Transformation when joining BETWEEN two values
Hi
I have hit a new scenario that I haven't encountered before with SSIS. I am working on SSIS 2008.
I have a fact table which has a monetary value on it (lets call it Amount), and a lookup table which has user defined bands for the amounts (0-100, 100-500,etc...). The lookup table has two columns MinValue and MaxValue which define the boundaries.
So basically, in my data flow I need to lookup the primary key of the lookup table where fact.Amount between lookup.MinValue and lookup.MaxValue.
Now I know that this will work if I use "no cache" on the lookup transformation and edit the query, but is there any way to do this while still using the cache? The lookup data set is tiny, but I would get a huge performance gain if I can use the cache.
Thanks in advance
Craig Bryden - Please mark correct answers
November 17th, 2010 7:39pm
If you are thinking of doing the "Lookup" operation inside a query by using the BETWEEN operator (inside a ExecuteSQL task), you won't have an option of using a cache (sounds pretty basic, I know). However, if you are planning to do the lookup via the Lookup
transform, you can set up cache for the lookup table so that the performance of your package will enhance.
Can you explain a bit more on what exactly is stopping you from setting up cahce on the lookup table resultset??
Hope this helps.
Cheers!!
Muqadder.
Free Windows Admin Tool Kit Click here and download it now
November 17th, 2010 9:11pm
Also, if your fact table and the reference table are in the same server, you could use a SQL command data access mode in the OLE DB source and have a join between the fact and the reference table in your source query itself.
You wouldn't need a Lookup transformation if you go by this approach. (Again, this is possible only if both the tables are in the same server).
Hope that helps.
Cheers,
ShalinShalin P. Kapadia
November 17th, 2010 9:27pm
Muqadder. The lookup transform (to my knowledge) only allows a lookup based on equals(=), not
between.
I am hoping someone can tell me how to change that behaviour
ThanksCraig Bryden - Please mark correct answers
Free Windows Admin Tool Kit Click here and download it now
November 17th, 2010 9:33pm
Shalin
They are currently on the same server, but the architecture will allow for them to be on separate servers later on. I have considered using a linked server as a work-around, but that will be a last resort. Craig Bryden - Please mark correct answers
November 17th, 2010 9:34pm
Craig,
You can use the Lookup task even with the "Partial Cache" mode and then change the query for the Lookup in the Advanced tab.
This link would definitely help you -
http://www.bidn.com/spaces/2/jacksonville-business-intelligence-sig/articles/integration-services/98/ssis-2008-lookup-altering-the-sql-statement
It is impossible to do it with the Lookup task with the "Full Cache" cache mode selection. Consider doing it using a sql stored procedure in that case or as I mentioned above, change the OLE DB source data SQL query (in case your current server
architecture remains the same).
Just 1 doubt - You have a fact table and a reference table. You join these 2 to get the FactRangeID. Where are you inserting this ID? If you need to insert this ID in the same fact table, you will have to use and UPDATE statement, for which you will have
to use an
OLE DB Command transformation.
The best practices says that the usage of OLE DB Command transformation must be avoided as far as possible since it is a row-based operation (for each row coming in, the task would run) and not a set-based (single execution for a set of records).
This would impact the performance of your package. However, if there is no other go, you will have to use it.
Please let us know in case you need any further information.
Cheers,
ShalinShalin P. Kapadia
Free Windows Admin Tool Kit Click here and download it now
November 18th, 2010 1:05am
I'm thinking you'd be better off coding a Script to do a cached range lookup - but that can be complicated. (Your situation is called "range lookup" BTW.)
The SSIS dev team has a few
Lookup patterns that are very useful on their blog, one of them is about doing range lookups.
Talk to me now on
November 18th, 2010 11:33am
Simplest I can think of is doing it via a SQL proc/in-line sql statement that uses a BETWEEN operator to check which range the incoming record belongs to. You can monitor the performance and see if it is below your acceptable threshold. Otherwise,
you can go the Scripting way as suggested by Todd already.
Hope this helps.
Cheers!!
Muqadder.
Free Windows Admin Tool Kit Click here and download it now
November 18th, 2010 12:56pm
Thanks Shalin
I knew about the partial cache solution, but performance was just too slow
Thanks for taking the time to answerCraig Bryden - Please mark correct answers
November 18th, 2010 4:57pm
Thanks Todd
That's what I neededCraig Bryden - Please mark correct answers
Free Windows Admin Tool Kit Click here and download it now
November 18th, 2010 4:58pm