Merging two tables using effectivestart and effectiveend date fields

Hi All,

I have two tables. Status and Fourhistory tables.

Status table contains a status column with effectivestart and end dates as history. This column is having history at month level.

Fourhistory table maintains 4 columns as part of history with the use of effectivestart and end dates. Here history capturing is at day level.

Desired Result: I want to merge the status column into FourHistory table.

Below i have given some possible sample scenarios which i face and the third table contains the expected ouput.

Please help me, how to achieve this in T-SQL query.

Thanks in advance.

create table dbo.#Status(
	ID varchar(50),
	Status varchar(50),
	EffectiveStartDate datetime,
	EffectiveEndDate datetime,
	Is_Current bit
)

Insert into #status values('1','Same','2009-07-01 00:00:00.000',	'2013-07-31 23:59:59.000',	0	)
Insert into #status values('1','Lost 13','2013-08-01 00:00:00.000', NULL	,	1	)
Insert into #status values('2','New 13','2013-02-01 00:00:00.000',	'2014-12-31 23:59:59.000',	0	)
Insert into #status values('2','Same','2015-01-01 00:00:00.000',		NULL	,	1	)
Insert into #status values('3','Same','2009-07-01 00:00:00.000',	'2009-09-30 23:59:59.000',	0	)
Insert into #status values('3','Lost 09','2009-10-01 00:00:00.000',	'2013-12-31 23:59:59.000',	0	)
Insert into #status values('3','Inter','2014-01-01 00:00:00.000',	NULL	,	1	)

CREATE TABLE dbo.#FourHistory(
	ID varchar(50),
	Fee varchar(100),
	Bill varchar(50),
	A_Date date,
	B_Date date,
	EffectiveStartDate datetime,
	EffectiveEndDate datetime
)

Insert into #FourHistory values('1','Variable','Not Applicable','2010-09-02','2010-09-01','2009-07-18 00:34:15.000',NULL)
Insert into #FourHistory values('2','Activity','Not Applicable','1900-01-01','1900-01-01','2009-07-18 01:38:34.000','2011-08-22 15:35:24.000')
Insert into #FourHistory values('2','Monthly','Not Applicable','2012-05-02','2012-04-27','2011-08-22 15:35:25.000','2014-10-22 13:24:57.000')
Insert into #FourHistory values('2','Monthly','Wholesale','2014-10-01','2014-10-01','2014-10-22 13:24:58.000',NULL)
Insert into #FourHistory values('3','Variable','Not Applicable','2006-05-08','2006-05-19','2009-07-17 23:22:07.000','2009-07-18 01:11:26.000')
Insert into #FourHistory values('3','Activity','Not Applicable','2001-01-01','2001-01-01','2009-07-18 01:11:27.000','2010-02-11 12:48:51.000')
Insert into #FourHistory values('3','Variable','Not Applicable','2009-09-18','2009-10-22','2010-02-11 12:48:52.000','2014-01-17 00:05:29.000')
Insert into #FourHistory values('3','Daily','Wholesale','2014-01-15','2014-01-20','2014-01-17 00:05:30.000',NULL)

CREATE TABLE dbo.#Result(
	ID varchar(50),
	Fee varchar(100),
	Bill varchar(50),
	A_Date date,
	B_Date date,
	Status VARCHAR(50),
	EffectiveStartDate datetime,
	EffectiveEndDate datetime
)


Insert into #Result values('1','Variable','Not Applicable','2010-09-02','2010-09-01','Same','2009-07-18 00:34:15.000','2013-07-31 23:59:59.000')
Insert into #Result values('1','Variable','Not Applicable','2010-09-02','2010-09-01','Lost 13','2013-08-01 00:00:00.000',NULL)
Insert into #Result values('2','Activity','Not Applicable','1900-01-01','1900-01-01',NULL,'2009-07-18 01:38:34.000','2011-08-22 15:35:24.000')
Insert into #Result values('2','Monthly','Not Applicable','2012-05-02','2012-04-27','New 13','2011-08-22 15:35:25.000','2014-10-22 13:24:57.000')
Insert into #Result values('2','Monthly','Wholesale','2014-10-01','2014-10-01','New 13','2014-10-22 13:24:58.000','2014-12-31 23:59:59.000')
Insert into #Result values('2','Monthly','Wholesale','2014-10-01','2014-10-01','Same','2015-01-01 00:00:00.000',NULL)
Insert into #Result values('3','Variable','Not Applicable','2006-05-08','2006-05-19','Same','2009-07-17 23:22:07.000','2009-07-18 01:11:26.000')
Insert into #Result values('3','Activity','Not Applicable','2001-01-01','2001-01-01','Same','2009-07-18 01:11:27.000','2009-09-30 23:59:59.000')
Insert into #Result values('3','Activity','Not Applicable','2001-01-01','2001-01-01','Lost','2009-10-01 00:00:00.000','2010-02-11 12:48:51.000')
Insert into #Result values('3','Variable','Not Applicable','2009-09-18','2009-10-22','Lost','2010-02-11 12:48:52.000','2013-12-31 23:59:59.000')
Insert into #Result values('3','Variable','Not Applicable','2009-09-18','2009-10-22','Inter','2014-01-01 00:00:00.000','2014-01-17 00:05:29.000')
Insert into #Result values('3','Daily','Wholesale','2014-01-15','2014-01-20','Inter','2014-01-17 00:05:30.000',NULL)


SELECT * FROM #status

select * From #FourHistory

select * From #result



September 10th, 2015 10:37am

You should explain the logic behind #result. I've tried to find it but maybe I miss something.
Free Windows Admin Tool Kit Click here and download it now
September 10th, 2015 11:07am

Ok Sorry for that.

For ex: If you take ID: 1, we have 2 records with two different status in status table and 1 record in fourhistory table. So, i need to maintain history for all 4 cols (fourhistory) + 1 col(status table) total 5 columns in fourhistory table.

Status - same is having effective start and end dates as: 2009-07-01 00:00:00.000 & 2013-07-31 23:59:59.000

so, while merging the status "same" in the fourhistory table, since my fourhistory table's effectivestartdate is greater than status table's effectivestartdate, i will maintain 2009-07-18 00:34:15.000 as my effectivestartdate (This is what available in that result table) and my effectiveenddate will be taken from the status table for status "same".

The same will be followed for next status "Lost 13".

Hope i explained the context.

September 10th, 2015 11:20am

>> have two tables. Status and Fourhistory tables. <<

No. Read any book on data modeling. A Four is not an entity; it is a number. 

A <something in particular>_status is an attribute, not an entity or relationship. There is no generic status in the universe. Is this  marriage? Employment? Graduation? There is no generic id in the universe; it has to identify something in particular. We do not use bit flags in RDBMS; that was 1950's assembly language. We use predicates instead. 

Why do you live in a magic world of VARCHAR(50)? It is used for encodings, identifiers, etc. Why do you mix DATE and DATETIME2(0)? This is absurd. 

Why are there no keys in this posting? No constraints? No standard data elements.

This design flaw is called attribute splitting and you are trying to repair a crappy design by putting the something_status in the table where it belongs. You need a new schema, not a query. And then you need to fire the guy who made this mess. 

>> Status table contains a status [sic: status of what?] column with effective_start_date and effective_end_date as history. This column is having history at month level. <<

A useful idiom is a report period calendar. It gives a name to a range of dates. The worst way would be to use temporal math; it tells the world you do not think in sets or understand declarative programming yet. Here is a skeleton: 

CREATE TABLE Month_Periods
(month_name CHAR(10) NOT NULL PRIMARY KEY, 
month_start_date DATE NOT NULL, 
month_end_date DATE NOT NULL, 
CONSTRAINT date_ordering
CHECK (month_start_date < month_end_date), 
ordinal_period INTEGER NOT NULL UNIQUE 
 CHECK(ordinal_period > 0)
etc);

These report periods can overlap; a fiscal quarter will be contained in the range of its fiscal year. There can be gaps between them; we have to wait a year between each Annual Going out Of Business Sale! and there might be long stretches of time without any special sales. But we want to know their ranges so that the table is fairly constant once it is created.

I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantage is that it will sort with the ISO-8601 data format required by Standard SQL. 

Here is a skeleton to record a history that has no gaps in its timeline:

CREATE TABLE Events
(event_id CHAR(10) NOT NULL,
previous_event_end_date DATE NOT NULL
CONSTRAINT Chained_Dates
REFERENCES Events (event_end_date),
event_start_date DATE NOT NULL,
event_end_date DATE UNIQUE, -- null means event in progress
PRIMARY KEY (event_id, event_start_date),
CONSTRAINT Event_Order_Valid
CHECK (event_start_date <= event_end_date),
CONSTRAINT Chained_Dates 
CHECK (DATEADD(DAY, 1, previous_event_end_date) = event_start_date).
<< other stuff for this event >>
);

-- disable the Chained_Dates constraint
ALTER TABLE Events NOCHECK CONSTRAINT Chained_Dates

-- insert a starter row
INSERT INTO Events(event_id, previous_event_end_date, event_start_date, event_end_date)
VALUES ('Foo Fest', '2010-01-01', '2010-01-02', '2010-01-05');

-- enable the constraint in the table
ALTER TABLE Events CHECK CONSTRAINT Chained_Dates

-- this works
INSERT INTO Events(event_id, previous_event_end_date, event_start_date, event_end_date)
VALUES ('Glob Week', '2010-01-05', '2010-01-06', '2010-01-10');

-- this fails
INSERT INTO Events(event_id, previous_event_end_date, event_start_date, event_end_date)
VALUES ('Snoob', '2010-01-09', '2010-01-11', '2010-01-15'); 

>> Foobar_History table maintains 4 columns as part of history with the use of effective_start_date and effective_end_date pairs. Here history capturing is at day level. <<

Looking at the sample data, I will guess that we can start with a repair job before we throw this out. 

CREATE TABLE Foobar_Status
(foobar_gtin CHAR(15) NOT NULL,
 foobar_status CHAR(10) NOT NULL
  CHECK(foobar_status 
     IN ('Inter', 'Lost 09', 'Lost 13', 'New 13', 'Same')), 
 effective_start_date DATE NOT NULL, 
 effective_end_date DATE, 
  CHECK(effective_start_date < effective_end_date)
);

I grabbed the GTIN as an industry standard product identifier, but I really have no idea what to use. 

Are there also 'Lost 01' thru 'Lost 99'? Did I guess the constraints you left off? 

Here is the syntax for table insertion; you were still using the old Sybase syntax! 

INSERT INTO Foobar_Status 
VALUES
('1', 'Same', '2009-07-01', '2013-07-31'), 
('1', 'Lost 13', '2013-08-01', NULL),
('2', 'New 13', '2013-02-01', '2014-12-31'),
('2', 'Same', '2015-01-01', NULL),
('3', 'Same', '2009-07-01', '2009-09-30'),
('3', 'Lost 09', '2009-10-01', '2013-12-31'),
('3', 'Inter', '2014-01-01', NULL);

Let's do a correct posting for the next table before we throw it out: 

CREATE TABLE Foobar_History
(foobar_gtin CHAR(15) NOT NULL, 
 fee_type VARCHAR(10) NOT NULL
 CHECK (fee_type IN ('N/A', 'Wholesale', 'Retail')), 
 bill_type VARCHAR(10) NOT NULL
 CHECK (bill_type IN ('Activity', 'Daily', 'Monthly', 'Variable')), 
 a_date DATE NOT NULL, 
 b_date DATE NOT NULL,
 effective_start_date DATE NOT NULL, 
 effective_end_date DATE,
  CHECK(effective_start_date < effective_end_date)
 
);

INSERT INTO Foobar_History 
VALUES('1', 'Variable', 'N/A', '2010-09-02', '2010-09-01', '2009-07-18', NULL),
('2', 'Activity', 'N/A', '1900-01-01', '1900-01-01', '2009-07-18', '2011-08-22'),
('2', 'Monthly', 'N/A', '2012-05-02', '2012-04-27', '2011-08-22', '2014-10-22'),
('2', 'Monthly', 'Wholesale', '2014-10-01', '2014-10-01', '2014-10-22', NULL),
('3', 'Variable', 'N/A', '2006-05-08', '2006-05-19', '2009-07-17', '2009-07-18'),
('3', 'Activity', 'N/A', '2001-01-01', '2001-01-01', '2009-07-18', '2010-02-11'),
('3', 'Variable', 'N/A', '2009-09-18', '2009-10-22', '2010-02-11', '2014-01-17'),
('3', 'Daily', 'Wholesale', '2014-01-15', '2014-01-20', '2014-01-17', NULL);

I hope this helps you clean up the code; if I guessed wrong then fix it and we can try again with usable DDL and sample data. Frankly, you might have to do some of this by hand. 
Free Windows Admin Tool Kit Click here and download it now
September 10th, 2015 2:43pm

First, Thanks for spending your time for this detailed explanation.

Whatever you mentioned as part of DataModel, all those so-called Sophisticated and meaningful column or table naming/designs are there with us.

Thanks for taking care of that part. Since its a public forum, I cant tell you the exact Table/Column name. Moreover I have to present the table structure for the Forum readers as a hassle free handling, unless it doesn't affect my actual architecture. So, that they can play around easily to come up with the solution.

Coming to your Old Sybase Syntax comment. Nothing in this world is useless just because they are old.

I already tried a way to get the solution. But its not working, close to the expected result. Instead of wasting the time, I did a hardcoded - SELECT 'INSERT INTO #TABLENAME ', bla bla ,etc from SOURCETable.

So, I guess whatever provided is sufficient in terms of working for the solution. The actual Data Model, design, naming convention, Constraints, Partitioning, etc all those stuffs are there in offline for the actual work.

I just expect the solution to incorporate with my Orignial DataModel.

Thanks for writing. Am sure your article will be really helpful for our new forum readers.

Back to the business, do you have any idea/solution for this to arrive the result?

September 10th, 2015 11:04pm

The following does what you want (I think, there may be some special case I missed).  It produces the result you show in #result except that for ID=3 my code produces "Lost 09" rather that just "Lost".  I'm guessing that what you have in #result for that case is a type and you really wanted "Lost 09".  If it is not and you really wanted "Lost", let us know when you would want to drop the trailing digits and when you would keep them (since you kept them for "New 13").

;With cteStatus As
(Select s.ID, s.Status, s.EffectiveStartDate, s.EffectiveEndDate, s.Is_Current,
  Row_Number() Over(Partition By s.ID Order By s.EffectiveStartDate) As rn
From #Status s)
Select s.ID, h.Fee, h.Bill, h.A_Date, h.B_Date,
  Case When h.EffectiveEndDate <= s.EffectiveStartDate And s.rn = 1 Then Null 
       Else s.Status End As Status, 
  Case When h.EffectiveStartDate <= s.EffectiveStartDate And s.rn = 1 Then h.EffectiveStartDate
       When s.EffectiveStartDate < h.EffectiveStartDate Then h.EffectiveStartDate 
	   Else s.EffectiveStartDate End As EffectiveStartDate,
  Case When s.EffectiveEndDate Is Null Then h.EffectiveEndDate
       When h.EffectiveEndDate Is Null Then s.EffectiveEndDate
       When s.EffectiveEndDate > h.EffectiveEndDate Then h.EffectiveEndDate 
	   Else s.EffectiveEndDate End As EffectiveEndDate
From cteStatus s
Inner Join #FourHistory h On s.ID = h.ID And 
           (IsNull(h.EffectiveEndDate, '99991231') >= s.EffectiveStartDate And h.EffectiveStartDate <= IsNull(s.EffectiveEndDate, '99991231')
		   Or (h.EffectiveEndDate <= s.EffectiveStartDate And s.rn = 1))
Order By ID, EffectiveStartDate;

Tom

Free Windows Admin Tool Kit Click here and download it now
September 11th, 2015 1:00am

Here's another option... It should be a little faster since it eliminates the sort operators in the execution plan.

SELECT 
	fh.ID,
	fh.Fee,
	fh.Bill,
	fh.A_Date,
	fh.B_Date,
	s.Status,
	CASE WHEN COALESCE(fh.EffectiveEndDate, '19000101') > COALESCE(s.EffectiveStartDate, '19000101') THEN fh.EffectiveStartDate ELSE s.EffectiveStartDate END AS StartDate,
	CASE WHEN COALESCE(fh.EffectiveEndDate, '99991231') < COALESCE(s.EffectiveEndDate, '99991231') THEN fh.EffectiveEndDate ELSE s.EffectiveEndDate END AS EndDate
FROM 
	#FourHistory fh
	FULL JOIN #Status s
		ON fh.ID = s.ID
		AND (fh.EffectiveStartDate BETWEEN s.EffectiveStartDate AND COALESCE(s.EffectiveEndDate, '99991231') OR s.EffectiveStartDate BETWEEN fh.EffectiveStartDate AND COALESCE(fh.EffectiveEndDate, '99991231'));

September 11th, 2015 2:00am

Here's another option... It should be a little faster since it eliminates the sort operators in the execution plan.

SELECT 
	fh.ID,
	fh.Fee,
	fh.Bill,
	fh.A_Date,
	fh.B_Date,
	s.Status,
	CASE WHEN COALESCE(fh.EffectiveEndDate, '19000101') > COALESCE(s.EffectiveStartDate, '19000101') THEN fh.EffectiveStartDate ELSE s.EffectiveStartDate END AS StartDate,
	CASE WHEN COALESCE(fh.EffectiveEndDate, '99991231') < COALESCE(s.EffectiveEndDate, '99991231') THEN fh.EffectiveEndDate ELSE s.EffectiveEndDate END AS EndDate
FROM 
	#FourHistory fh
	FULL JOIN #Status s
		ON fh.ID = s.ID
		AND (fh.EffectiveStartDate BETWEEN s.EffectiveStartDate AND COALESCE(s.EffectiveEndDate, '99991231') OR s.EffectiveStartDate BETWEEN fh.EffectiveStartDate AND COALESCE(fh.EffectiveEndDate, '99991231'));
Free Windows Admin Tool Kit Click here and download it now
September 11th, 2015 2:23am

The following does what you want (I think, there may be some special case I missed).  It produces the result you show in #result except that for ID=3 my code produces "Lost 09" rather that just "Lost".  I'm guessing that what you have in #result for that case is a type and you really wanted "Lost 09".  If it is not and you really wanted "Lost", let us know when you would want to drop the trailing digits and when you would keep them (since you kept them for "New 13").

;With cteStatus As
(Select s.ID, s.Status, s.EffectiveStartDate, s.EffectiveEndDate, s.Is_Current,
  Row_Number() Over(Partition By s.ID Order By s.EffectiveStartDate) As rn
From #Status s)
Select s.ID, h.Fee, h.Bill, h.A_Date, h.B_Date,
  Case When h.EffectiveEndDate <= s.EffectiveStartDate And s.rn = 1 Then Null 
       Else s.Status End As Status, 
  Case When h.EffectiveStartDate <= s.EffectiveStartDate And s.rn = 1 Then h.EffectiveStartDate
       When s.EffectiveStartDate < h.EffectiveStartDate Then h.EffectiveStartDate 
	   Else s.EffectiveStartDate End As EffectiveStartDate,
  Case When s.EffectiveEndDate Is Null Then h.EffectiveEndDate
       When h.EffectiveEndDate Is Null Then s.EffectiveEndDate
       When s.EffectiveEndDate > h.EffectiveEndDate Then h.EffectiveEndDate 
	   Else s.EffectiveEndDate End As EffectiveEndDate
From cteStatus s
Inner Join #FourHistory h On s.ID = h.ID And 
           (IsNull(h.EffectiveEndDate, '99991231') >= s.EffectiveStartDate And h.EffectiveStartDate <= IsNull(s.EffectiveEndDate, '99991231')
		   Or (h.EffectiveEndDate <= s.EffectiveStartDate And s.rn = 1))
Order By ID, EffectiveStartDate;

Tom

September 11th, 2015 4:59am


That gets the wrong EffectiveStartDate for some rows.  See, for example, the 3rd, 5th and 6th rows for ID=2.

Tom

Doh! Good catch Tom. Can't believe I missed that. Guess that what happens when insomnia makes you answer forum questions after 2:00 AM...

Anyway, here is an updated version...

WITH CTE AS (
	SELECT 
		fh.ID,
		fh.Fee,
		fh.Bill,
		fh.A_Date,
		fh.B_Date,
		s.Status,
		CASE WHEN COALESCE(fh.EffectiveStartDate, '19000101') > COALESCE(s.EffectiveStartDate, '19000101') THEN fh.EffectiveStartDate ELSE s.EffectiveStartDate END AS StartDate,
		CASE WHEN COALESCE(fh.EffectiveEndDate, '99991231') < COALESCE(s.EffectiveEndDate, '99991231') THEN fh.EffectiveEndDate ELSE s.EffectiveEndDate END AS EndDate
	FROM 
		#FourHistory fh
		FULL JOIN #Status s
			ON fh.ID = s.ID
			AND (fh.EffectiveStartDate BETWEEN s.EffectiveStartDate AND COALESCE(s.EffectiveEndDate, '99991231') OR s.EffectiveStartDate BETWEEN fh.EffectiveStartDate AND COALESCE(fh.EffectiveEndDate, '99991231'))
)
	SELECT 
		c.ID,
		c.Fee,
		c.Bill,
		c.A_Date,
		c.B_Date,
		c.Status,
		COALESCE(LAG(DATEADD(ss, 1, c.EndDate), 1) OVER (PARTITION BY c.ID ORDER BY COALESCE(c.EndDate, '99991231')), c.StartDate) AS StartDate,
		c.EndDate
	FROM 
		CTE c;
Of course the LAG function introduces a sort operator... Thus negating the performance gain... Either way, still an interesting exercise. :)
Free Windows Admin Tool Kit Click here and download it now
September 11th, 2015 9:51am

You're right, this is an interesting problem.  And I really like your approach better than mine if we can make it work (yours is simpler and easier to understand).  But this one still gets the wrong value for the startdate of the first row for ID = 1.

Tom

September 11th, 2015 11:09am

That would be the result of me not paying attention to Intellisense... 

Bad code:

CASE WHEN COALESCE(fh.EffectiveEndDate, '19000101') > COALESCE(s.EffectiveStartDate, '19000101') THEN fh.EffectiveStartDate ELSE s.EffectiveStartDate END AS StartDate,

Corrected Code:

CASE WHEN COALESCE(fh.EffectiveStartDate, '19000101') > COALESCE(s.EffectiveStartDate, '19000101') THEN fh.EffectiveStartDate ELSE s.EffectiveStartDate END AS StartDate,

Free Windows Admin Tool Kit Click here and download it now
September 11th, 2015 12:39pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics