Implementing CTE for filling gaps inside a sequence

This is the second article of a series dedicated to implementing Recursive Common Table Expression in different kind of scenarios, and this time I will address another use case; it basically consists in filling gaps inside a sequence of an incomplete range of dates.  I would like to comment that the next example has been taken from MSDN forum, I have chosen because I think it is an excellent way to prove how versatile the recursive CTE is, the most creative solution was proposed by Peter Larsson (swePeso) and he brought us  a valid and optimal implementation which is useful since SQL Server 2005 and up versions.

The original dataset is presented below:

Expected output:

The request consists in fill the gaps with the consecutive dates and the previous valid weights, for instance, we should add the values: 2018-11-03, 2018-11-04, 2018-11-05 to complete the range between 2018-11-01 and 2018-11-06, for the previous range the weights must be equal to 8.2 which was the last value before the starting gap.

As part of the user requirement, the solution must be compatible with SQL Server 2008, it means, we cannot use LEAD-LEG Windows functions, however, the optimal solution of Peter can allow us to overpassed the version limitation.

Proposed solution:

DECLARE @Sample TABLE
(
DATECOL DATETIME ,
WEIGHTS FLOAT
);

INSERT @Sample
VALUES ( '11/02/2018', 8.2 ) ,
( '11/01/2018', 9.4 ) ,
( '11/06/2018', 10 ) ,
( '11/09/2018', 9.3 ) ,
( '11/10/2018', 9.7 );

DECLARE @WantedDate DATETIME = '20181113';

WITH cteData ( minDate, Weight, maxDate )
AS ( SELECT TOP ( 1 ) DATECOL ,
WEIGHTS ,
@WantedDate
FROM @Sample
ORDER BY DATECOL
UNION ALL
SELECT DATEADD(DAY, 1, d.minDate) ,
ISNULL(
( SELECT x.WEIGHTS
FROM @Sample AS x
WHERE x.DATECOL = DATEADD(DAY, 1, d.minDate)
) ,
d.Weight
) ,
d.maxDate
FROM cteData AS d
WHERE d.minDate d.maxDate
)
SELECT minDate AS DATECOL ,
Weight AS WEIGHTS
FROM cteData
OPTION ( MAXRECURSION 3 );

In the first section of the CTE (anchor member), it set up the lower date and assign the maxdate value which corresponds to a value chosen by the author where the only point here is that the date would be higher than max value in the dataset (11/10/2018) of dates.  The first rows generated will have the next values:

DateColWeight MaxDate
2018-11-019.42018-11-13

The recursive section corresponding to the part after UNION ALL inside the CTE allow us to start generating an incremental date (day by day) on the original minDate loaded from the anchor member, in the other hand, the Weight column can be obtained from two different ways, if the corresponding DateCol is part of the original Dataset (Akka @Sample values) it is going to take it from the predicate which is part of the subquery, I mean: WHERE x.DATECOL = DATEADD(DAY, 1, d.minDate). In this case, the row would be a gap filled row, for instance: 11/03/2018, it will take the Weight value from the immediate previous result set, which in our example, it is 8.2.

An interesting way of seeing the iterative cycle and how recursive CTE work is using the MAXRECURSION option, in short words, the MAXRECURSION is a hint to specify how many recursions it is allowed inside the CTE.  If we limit it to 3, you can observe the number of iteration and the results accumulated in every case, we should change the HINT to OPTION (MAXRECURSION 3).

It will return an error because the HINT would cause that the CTE will not be able to continue working

However, the interesting part that I will like to show you is the next:

The first stage corresponding to the anchor member can be considered as a pre-stage before to start the iteration, otherwise, with the initial values set up, the process starting to working iteratively and it has stopped once time it has reached the limit settled down.

I would like to encourage you to continue reviewing more interesting aspects related to recursive CTE, recently, Hugo Kornelis has written a very useful article about the Execution Plan and his common operator in a classic recursive CTE approach, more details in this link:

Summary

This article is the second of a trilogy dedicated to recursive CTE, here I have tried to address a very creative use of this sort of approach, fortunately MSDN and StackOverflow are a great source of  use cases, because many users can leave very amazing scenarios that probably we could not invent or have faced in the past.

The main credit of this article is for Peter Larsson who has implemented very creative solutions which potentiate the set theory in his TSQL code, my work has limited to bring and complement some details about the solution.  The use of recursive CTE represents a valuable tool that it is worth to include as part of our toolbox and utilizing as a valid option and leaving the use of cursors or similar as a last resource.

Happy Querying!!!!

geohernandez

View Comments

  • This page certainly has all of the information and facts I wanted concerning this subject and didn’t know who to ask.

Share
Published by
geohernandez
Tags: CTE

Recent Posts

Getting Started with Snowflake’s Snowpipe for Data Ingestion on Azure

In this post, I'll explain how to take full advantage of Snowflake's Snowpipe for reliable…

2 weeks ago

Integrating Azure Blob Storage with Snowflake: A Guide to Storage Integration and Staging Data

This article offers a comprehensive, step-by-step guide for integrating Azure Blob Storage with a Snowflake…

2 weeks ago

Getting Started with SnowSQL: Connecting to Your Snowflake Account

In this quick guide, we’ll walk through the essential steps to connect to Snowflake using…

3 months ago

A new step in my career as a Senior Data Architect

I am thrilled to share that I have embarked on a new professional journey as…

7 months ago

Data Modeling and its relevance in the Cloud Era

Since 2005, I've immersed myself in the dynamic world of data and its modeling. It's…

9 months ago

Formatting our Postgres scripts with pgformatter in DBeaver

Are you a PostgreSQL enthusiast using DBeaver on a Windows Platform? If you find yourself…

1 year ago