Finding gaps and filled with the last validate data

Dear colleagues, currently I am facing a complex escenario related with gaps and sequences, but I was trying with diferent cases but I did not get the correct results, I am sure about the use of windows functions, but I would like a little help.  I have a table with the information grouped by PublicationId, Provider, MetricId and Amount by Date, one row by each month, but in some cases these data don't have a sequencial values, for example I have the data for the next sequence:


I need to get the sequence by each month, in this case I need to project the month from February to May (with the last previous value, for this case of January) , this is:


The data for testing are:

DECLARE @PublicationsByUser AS TABLE
(
  Id   INT,
  PublicationId  INT,
  MetricId       INT,
  ProviderId     INT,
  DateCreated    DATE,
  Amount         FLOAT
);


INSERT INTO @PublicationsByUser
VALUES (1,1,1,1,'20150101',10),
  (2,1,1,1,'20150201',15),
  (3,1,1,1,'20150401',20),
  (4,1,1,1,'20150501',20),
  (5,1,1,1,'20150801',25),
  (6,1,1,1,'20150101',10),
  (7,2,1,1,'20150101',1),
  (8,2,1,1,'20150601',6);

SELECT * FROM @PublicationsByUser



Thank you for your help.

August 26th, 2015 4:21pm

Try

;with cte as (select *, COALESCE(LEAD(DateCreated, 1) OVER(partition by PublicationId, MetricId, ProviderId order by DateCreated), dateadd(day,1, DateCreated))  as NextDate from @PublicationsByUser)
select cte.PublicationId, cte.MetricId, cte.ProviderId, d.CalendarDate as DateCreated, cte.Amount, cte.NextDate  from cte

CROSS APPLY (select D.CalendarDate
from dbo.Dates D where D.[DayOfMonth] = 1 and D.CalendarDate >= cte.DateCreated and D.CalendarDate < cte.NextDate) D
I have a Permanent Calendar table in our database called Dates. In your case you can create that table on the fly if you don't have it.

Free Windows Admin Tool Kit Click here and download it now
August 26th, 2015 4:59pm

Thanks Naomi, what is the structure of your dbo.Dates for trying to run this script. 
August 27th, 2015 12:34am

I am on a new laptop right now and don't yet installed our databases on it, but for your purpose the 2 columns of interest are CalendarDate and DayOfMonth.
Free Windows Admin Tool Kit Click here and download it now
August 27th, 2015 12:52am

Hello Naomi, I tested the solution but I don't get any valid results, this was the script:

DECLARE @PublicationsByUser AS TABLE
(
  Id   INT,
  PublicationId  INT,
  MetricId       INT,
  ProviderId     INT,
  DateCreated    DATE,
  Amount         FLOAT
);


INSERT INTO @PublicationsByUser
VALUES (1,1,1,1,'20150101',10),
  (2,1,1,1,'20150201',15),
  (3,1,1,1,'20150401',20),
  (4,1,1,1,'20150501',20),
  (5,1,1,1,'20150801',25),
  (6,1,1,1,'20150101',10),
  (7,2,1,1,'20150101',1),
  (8,2,1,1,'20150601',6);

SELECT * FROM @PublicationsByUser


DECLARE  @Dates AS TABLE
([DayOfMonth] tinyint,
 CalendarDate date
 );




 INSERT INTO @Dates
 values(1,'20150101'),(2,'20150201'),(3,'20150301'),(4,'20150401'),(5,'20150501'),(6,'20150601'),(7,'20150701');

;with cte as (select *, COALESCE(LEAD(DateCreated, 1) OVER(partition by PublicationId, MetricId, ProviderId order by DateCreated), dateadd(day,1, DateCreated))  as NextDate from @PublicationsByUser)
select cte.PublicationId, cte.MetricId, cte.ProviderId, d.CalendarDate as DateCreated, cte.Amount, cte.NextDate  from cte

CROSS APPLY (select D.CalendarDate
from @Dates D where D.[DayOfMonth] = 1 and D.CalendarDate >= cte.DateCreated and D.CalendarDate < cte.NextDate) D

Could you help me with some detail that I forgot

August 27th, 2015 2:55am

small modification on Naomi's response 
;with cte as (select *, COALESCE(LEAD(DateCreated, 1) OVER(partition by PublicationId, MetricId, ProviderId order by DateCreated), dateadd(day,1, DateCreated))  as NextDate from @PublicationsByUser)

SELECT PublicationId, MetricId, ProviderId, CalendarDate as DateCreated, Amount
FROM CTE 

UNION ALL

select cte.PublicationId, cte.MetricId, cte.ProviderId, d.CalendarDate as DateCreated, cte.Amount  from cte

CROSS APPLY (select D.CalendarDate
from dbo.Dates D where D.[DayOfMonth] = 1 and D.CalendarDate > cte.DateCreated and D.CalendarDate < cte.NextDate) D
ORDER BY PublicationID , DateCreated		
Free Windows Admin Tool Kit Click here and download it now
August 27th, 2015 3:22am

DayOfTheMonth is what its name applies, e.g. the day of the month, in this particular case it's 1 for all rows.

E.g.

DECLARE @PublicationsByUser AS TABLE
(
  Id   INT,
  PublicationId  INT,
  MetricId       INT,
  ProviderId     INT,
  DateCreated    DATE,
  Amount         FLOAT
);


INSERT INTO @PublicationsByUser
VALUES (1,1,1,1,'20150101',10),
  (2,1,1,1,'20150201',15),
  (3,1,1,1,'20150401',20),
  (4,1,1,1,'20150501',20),
  (5,1,1,1,'20150801',25),
  (6,3,1,1,'20150101',10),
  (7,2,1,1,'20150101',1),
  (8,2,1,1,'20150601',6);

--SELECT * FROM @PublicationsByUser


DECLARE  @Dates AS TABLE
([DayOfMonth] tinyint,
 CalendarDate date
 );


 INSERT INTO @Dates
 values(1,'20150101'),(1,'20150201'),(1,'20150301'),(1,'20150401'),(1,'20150501'),(1,'20150601'),
 (1,'20150701'), (1, '20150801');

;with cte as (select *, COALESCE(LEAD(DateCreated, 1) OVER(partition by PublicationId, MetricId, ProviderId order by DateCreated), dateadd(day,1, DateCreated))  as NextDate from @PublicationsByUser)
select cte.ID, cte.PublicationId, cte.MetricId, cte.ProviderId, d.CalendarDate as DateCreated, cte.Amount, cte.NextDate  from cte

CROSS APPLY (select D.CalendarDate
from @Dates D where D.[DayOfMonth] = 1 and D.CalendarDate >= cte.DateCreated and D.CalendarDate < cte.NextDate)
 D
 order by cte.Id, D.CalendarDate
August 27th, 2015 8:39am

Hi Geovanny,

A tricky way to get the expected output, though I would still suggest you follow Naomi's advice to use a calendar table.

DECLARE @PublicationsByUser AS TABLE
(
  Id   INT,
  PublicationId  INT,
  MetricId       INT,
  ProviderId     INT,
  DateCreated    DATE,
  Amount         FLOAT
);

INSERT INTO @PublicationsByUser
VALUES (1,1,1,1,'20150101',10),
  (2,1,1,1,'20150201',15),
  (3,1,1,1,'20150401',20),
  (4,1,1,1,'20150501',20),
  (5,1,1,1,'20150801',25),
  --(6,1,1,1,'20150101',10),
  (7,2,1,1,'20150101',1),
  (8,2,1,1,'20150601',6); 
   
;WITH Cte AS
(
SELECT *,LEAD(DateCreated) OVER(PARTITION BY PublicationId ORDER BY DateCreated) endDt FROM @PublicationsByUser
),
Cte2 AS
(
SELECT PublicationId,MetricId,ProviderId,DATEADD(MONTH,1,DateCreated) DateCreated,DATEADD(MONTH,-1,endDt) endDt,Amount FROM Cte WHERE DATEDIFF(MONTH,DateCreated,endDT)>1
UNION ALL
SELECT PublicationId,MetricId,ProviderId,DATEADD(MONTH,1,DateCreated),endDt,Amount FROM Cte2 
WHERE DateCreated<endDT  
)
SELECT PublicationId,MetricId,ProviderId, DateCreated,Amount FROM CTE2
UNION ALL
SELECT PublicationId,MetricId,ProviderId, DateCreated,Amount FROM @PublicationsByUser
ORDER BY PublicationId,DateCreated

If you have any question, feel free to let me know.


Free Windows Admin Tool Kit Click here and download it now
August 28th, 2015 1:52am

Hi Geovanny,

If you are using SS 2012 o greater then you could use an off-set function like LEAD to get the next value of [DateCreated] and calculate the difference in months between current row and next one. The rest will be to use the apply operator to get as many rows as needed to replicate current row from an auxiliary table of numbers.

http://sqlmag.com/sql-server/virtual-auxiliary-table-numbers

Example: (I can't create the function so I am going to use the CTE with a static TOP value instead)

DECLARE @PublicationsByUser AS TABLE
(
  Id   INT,
  PublicationId  INT,
  MetricId       INT,
  ProviderId     INT,
  DateCreated    DATE,
  Amount         FLOAT
);

INSERT INTO @PublicationsByUser
VALUES 
  (1,1,1,1,'20150101',10),
  (2,1,1,1,'20150201',15),
  (3,1,1,1,'20150401',20),
  (4,1,1,1,'20150501',20),
  (5,1,1,1,'20150801',25),
  (7,2,1,1,'20150101',1),
  (8,2,1,1,'20150601',6);

WITH   
L0   AS(SELECT 1 AS c UNION ALL SELECT 1),
L1   AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
L2   AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
L3   AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
L4   AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
L5   AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5), 
R AS (
SELECT 
    *, 
    DATEDIFF(MONTH, DateCreated, LEAD(DateCreated) OVER(
    PARTITION BY PublicationId, ProviderId, MetricId 
    ORDER BY DateCreated
    )) AS diff
FROM @PublicationsByUser
)
SELECT
    R.PublicationId,
    R.MetricId,
    R.ProviderId,
    DATEADD(MONTH, ISNULL(T.n - 1, 0), R.DateCreated) AS DateCreated,
    R.Amount,
    R.diff,
    T.n
FROM
	R OUTER APPLY (SELECT TOP (100) n FROM Nums WHERE n <= R.diff ORDER BY n) AS T
ORDER BY
    R.PublicationId,
    R.MetricId,
    R.ProviderId,
    DateCreated;
GO

This is how it will look like with the function in place:

WITH R AS (
SELECT 
    *, 
    DATEDIFF(MONTH, DateCreated, LEAD(DateCreated) OVER(
    PARTITION BY PublicationId, ProviderId, MetricId 
    ORDER BY DateCreated
    )) AS diff
FROM @PublicationsByUser
)
SELECT
    R.PublicationId,
    R.MetricId,
    R.ProviderId,
    DATEADD(MONTH, ISNULL(T.n - 1, 0), R.DateCreated) AS DateCreated,
    R.Amount,
    R.diff,
    T.n
FROM
    R 
    OUTER APPLY 
    dbo.GetNums(R.diff) AS T
ORDER BY
    R.PublicationId,
    R.MetricId,
    R.ProviderId,
    DateCreated;
GO


August 28th, 2015 10:51am

Hunchback, your answer has been amazing, the result and the optimization of the code is awesome.  I found a previous solution from a reading of Itzik Ben-Gan book, even I read the use of the function GetNums (the name that Itzik gave this function), but at the end I got the range of gaps, but I used a cursor for creating the new values, but your answer obviously is better, I will review each little detailed for trying to understand in depp.  Again, thank you very much.

Gracias !!!!

Free Windows Admin Tool Kit Click here and download it now
August 28th, 2015 4:24pm

Did you see my original response that provided that solution using the existing Calendar table (or you could have used Numbers table with the similar result)?
August 28th, 2015 4:45pm

Yes Naomi, even I tested with a temporary Dates Table, but I took the decision of chose the option of Hunback because I have in my production enviroment the same GetNum function, I don't make a comparison at performance, but I will testing for the future and I will request the creation of a Dates Table, Did you have an standar script for generating.  
Free Windows Admin Tool Kit Click here and download it now
August 28th, 2015 5:17pm

In fact, I am writing a little article with my solution (my not very good solution) and I would like to take the code that Naomi and Hunback posted for showing better options, If both are agrement, because I think that this type of solution and problems are very common.

Thanks.

August 28th, 2015 5:21pm

Sure, go ahead and use that code.

I have code that produces Calendar table and Numbers table in this article

Passing multiple ranges to stored procedure

somewhere close to the beginning of the article. In our database we have different version of that table called Dates and with a different structure, so I picked up that existing table when writing my solution.

Free Windows Admin Tool Kit Click here and download it now
August 28th, 2015 5:26pm

Thank you very much Naomi, when I finished the article I am going to notify in this thread, my mother tongue is spanish, but I will try to make my best effort in English.
August 28th, 2015 5:48pm

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics