Implementing CTE for filling gaps inside a sequence

This is the second article of a series dedicated to implementing Recursive Common Table Expression in different kind of scenarios, and this time I will address another use case; it basically consists in filling gaps inside a sequence of an incomplete range of dates.  I would like to comment that the next example has been taken from MSDN forum, I have chosen because I think it is an excellent way to prove how versatile the recursive CTE is, the most creative solution was proposed by Peter Larsson (swePeso) and he brought us  a valid and optimal implementation which is useful since SQL Server 2005 and up versions.

The original dataset is presented below:

Expected output:

The request consists in fill the gaps with the consecutive dates and the previous valid weights, for instance, we should add the values: 2018-11-03, 2018-11-04, 2018-11-05 to complete the range between 2018-11-01 and 2018-11-06, for the previous range the weights must be equal to 8.2 which was the last value before the starting gap.

As part of the user requirement, the solution must be compatible with SQL Server 2008, it means, we cannot use LEAD-LEG Windows functions, however, the optimal solution of Peter can allow us to overpassed the version limitation.

Proposed solution:

DECLARE @Sample TABLE
(
DATECOL DATETIME ,
WEIGHTS FLOAT
);

INSERT @Sample
VALUES ( '11/02/2018', 8.2 ) ,
( '11/01/2018', 9.4 ) ,
( '11/06/2018', 10 ) ,
( '11/09/2018', 9.3 ) ,
( '11/10/2018', 9.7 );

DECLARE @WantedDate DATETIME = '20181113';

WITH cteData ( minDate, Weight, maxDate )
AS ( SELECT TOP ( 1 ) DATECOL ,
WEIGHTS ,
@WantedDate
FROM @Sample
ORDER BY DATECOL
UNION ALL
SELECT DATEADD(DAY, 1, d.minDate) ,
ISNULL(
( SELECT x.WEIGHTS
FROM @Sample AS x
WHERE x.DATECOL = DATEADD(DAY, 1, d.minDate)
) ,
d.Weight
) ,
d.maxDate
FROM cteData AS d
WHERE d.minDate d.maxDate
)
SELECT minDate AS DATECOL ,
Weight AS WEIGHTS
FROM cteData
OPTION ( MAXRECURSION 3 );

In the first section of the CTE (anchor member), it set up the lower date and assign the maxdate value which corresponds to a value chosen by the author where the only point here is that the date would be higher than max value in the dataset (11/10/2018) of dates.  The first rows generated will have the next values:

DateColWeight MaxDate
2018-11-019.42018-11-13

The recursive section corresponding to the part after UNION ALL inside the CTE allow us to start generating an incremental date (day by day) on the original minDate loaded from the anchor member, in the other hand, the Weight column can be obtained from two different ways, if the corresponding DateCol is part of the original Dataset (Akka @Sample values) it is going to take it from the predicate which is part of the subquery, I mean: WHERE x.DATECOL = DATEADD(DAY, 1, d.minDate). In this case, the row would be a gap filled row, for instance: 11/03/2018, it will take the Weight value from the immediate previous result set, which in our example, it is 8.2.

An interesting way of seeing the iterative cycle and how recursive CTE work is using the MAXRECURSION option, in short words, the MAXRECURSION is a hint to specify how many recursions it is allowed inside the CTE.  If we limit it to 3, you can observe the number of iteration and the results accumulated in every case, we should change the HINT to OPTION (MAXRECURSION 3).

It will return an error because the HINT would cause that the CTE will not be able to continue working

However, the interesting part that I will like to show you is the next:

The first stage corresponding to the anchor member can be considered as a pre-stage before to start the iteration, otherwise, with the initial values set up, the process starting to working iteratively and it has stopped once time it has reached the limit settled down.

I would like to encourage you to continue reviewing more interesting aspects related to recursive CTE, recently, Hugo Kornelis has written a very useful article about the Execution Plan and his common operator in a classic recursive CTE approach, more details in this link:

Summary

This article is the second of a trilogy dedicated to recursive CTE, here I have tried to address a very creative use of this sort of approach, fortunately MSDN and StackOverflow are a great source of  use cases, because many users can leave very amazing scenarios that probably we could not invent or have faced in the past.

The main credit of this article is for Peter Larsson who has implemented very creative solutions which potentiate the set theory in his TSQL code, my work has limited to bring and complement some details about the solution.  The use of recursive CTE represents a valuable tool that it is worth to include as part of our toolbox and utilizing as a valid option and leaving the use of cursors or similar as a last resource.

Happy Querying!!!!

geohernandez

View Comments

  • This page certainly has all of the information and facts I wanted concerning this subject and didn’t know who to ask.

Share
Published by
geohernandez
Tags: CTE

Recent Posts

Formatting our Postgres scripts with pgformatter in DBeaver

Are you a PostgreSQL enthusiast using DBeaver on a Windows Platform? If you find yourself…

4 months ago

Looking back to Kimball’s approach to Data Warehousing

Over time, it's fascinating to witness how certain concepts, approaches, or visions age. Time, in…

5 months ago

List Comprehension and Walrus operator in Python

When we are working with lists, dictionaries, and sets in Python, we have a special…

9 months ago

Playing with some Pandas functions and Airflow operators

Recently, I was dealing with a task where I had to import raw information into…

11 months ago

Using interpolated format strings in Python

The release of Python 3.6 came with exciting functionalities. I want to speak about the…

1 year ago

Getting the last modified directory with Python

Working with os paths is a prevalent task, especially when working on a console application…

1 year ago