geohernandez
Menu
  • HOME
  • ABOUT
  • CONTACT ME
  • WORK WITH GEO
    • Data Specialist
    • Speaker Events
    • Resume
  • English
    • English
    • Español
Menu

Implementing CTE for filling gaps inside a sequence

Posted on November 15, 2018March 13, 2019 by geohernandez

This is the second article of a series dedicated to implementing Recursive Common Table Expression in different kind of scenarios, and this time I will address another use case; it basically consists in filling gaps inside a sequence of an incomplete range of dates.  I would like to comment that the next example has been taken from MSDN forum, I have chosen because I think it is an excellent way to prove how versatile the recursive CTE is, the most creative solution was proposed by Peter Larsson (swePeso) and he brought us  a valid and optimal implementation which is useful since SQL Server 2005 and up versions.

The original dataset is presented below:

Expected output:

The request consists in fill the gaps with the consecutive dates and the previous valid weights, for instance, we should add the values: 2018-11-03, 2018-11-04, 2018-11-05 to complete the range between 2018-11-01 and 2018-11-06, for the previous range the weights must be equal to 8.2 which was the last value before the starting gap.

As part of the user requirement, the solution must be compatible with SQL Server 2008, it means, we cannot use LEAD-LEG Windows functions, however, the optimal solution of Peter can allow us to overpassed the version limitation.

Proposed solution:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
[sql]
DECLARE @Sample TABLE
(
DATECOL DATETIME ,
WEIGHTS FLOAT
);
 
INSERT @Sample
VALUES ( '11/02/2018', 8.2 ) ,
( '11/01/2018', 9.4 ) ,
( '11/06/2018', 10 ) ,
( '11/09/2018', 9.3 ) ,
( '11/10/2018', 9.7 );
 
DECLARE @WantedDate DATETIME = '20181113';
 
WITH cteData ( minDate, Weight, maxDate )
AS ( SELECT TOP ( 1 ) DATECOL ,
WEIGHTS ,
@WantedDate
FROM @Sample
ORDER BY DATECOL
UNION ALL
SELECT DATEADD(DAY, 1, d.minDate) ,
ISNULL(
( SELECT x.WEIGHTS
FROM @Sample AS x
WHERE x.DATECOL = DATEADD(DAY, 1, d.minDate)
) ,
d.Weight
) ,
d.maxDate
FROM cteData AS d
WHERE d.minDate d.maxDate
)
SELECT minDate AS DATECOL ,
Weight AS WEIGHTS
FROM cteData
OPTION ( MAXRECURSION 3 );
[/sql]

In the first section of the CTE (anchor member), it set up the lower date and assign the maxdate value which corresponds to a value chosen by the author where the only point here is that the date would be higher than max value in the dataset (11/10/2018) of dates.  The first rows generated will have the next values:

DateColWeight MaxDate
2018-11-019.42018-11-13

The recursive section corresponding to the part after UNION ALL inside the CTE allow us to start generating an incremental date (day by day) on the original minDate loaded from the anchor member, in the other hand, the Weight column can be obtained from two different ways, if the corresponding DateCol is part of the original Dataset (Akka @Sample values) it is going to take it from the predicate which is part of the subquery, I mean: WHERE x.DATECOL = DATEADD(DAY, 1, d.minDate). In this case, the row would be a gap filled row, for instance: 11/03/2018, it will take the Weight value from the immediate previous result set, which in our example, it is 8.2.

An interesting way of seeing the iterative cycle and how recursive CTE work is using the MAXRECURSION option, in short words, the MAXRECURSION is a hint to specify how many recursions it is allowed inside the CTE.  If we limit it to 3, you can observe the number of iteration and the results accumulated in every case, we should change the HINT to OPTION (MAXRECURSION 3).

It will return an error because the HINT would cause that the CTE will not be able to continue working

However, the interesting part that I will like to show you is the next:

The first stage corresponding to the anchor member can be considered as a pre-stage before to start the iteration, otherwise, with the initial values set up, the process starting to working iteratively and it has stopped once time it has reached the limit settled down.

I would like to encourage you to continue reviewing more interesting aspects related to recursive CTE, recently, Hugo Kornelis has written a very useful article about the Execution Plan and his common operator in a classic recursive CTE approach, more details in this link:

Plansplaining, part 9. Recursive CTEs

Summary

This article is the second of a trilogy dedicated to recursive CTE, here I have tried to address a very creative use of this sort of approach, fortunately MSDN and StackOverflow are a great source of  use cases, because many users can leave very amazing scenarios that probably we could not invent or have faced in the past.

The main credit of this article is for Peter Larsson who has implemented very creative solutions which potentiate the set theory in his TSQL code, my work has limited to bring and complement some details about the solution.  The use of recursive CTE represents a valuable tool that it is worth to include as part of our toolbox and utilizing as a valid option and leaving the use of cursors or similar as a last resource.

Happy Querying!!!!

Category: Chronicles from the trenches, T-SQL

1 thought on “Implementing CTE for filling gaps inside a sequence”

  1. Franklin Quinoes says:
    May 17, 2021 at 7:00 pm

    This page certainly has all of the information and facts I wanted concerning this subject and didn’t know who to ask.

    Reply

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Search for articles

Recent Posts

  • Quick Guide: BigQuery Service Account Setup Using gcloud
  • The Art of Data Modeling in AI times
  • Getting Started with Snowflake’s Snowpipe for Data Ingestion on Azure

Categories

  • Airflow (1)
  • Azure (6)
  • Azure DevOps (2)
  • Bash script (1)
  • Blog (1)
  • Cassandra (3)
  • Chronicles from the trenches (26)
  • Data Architecture (3)
  • Data Engineering (11)
  • DB optimization (2)
  • Events (2)
  • GIT (1)
  • MySQL (1)
  • Python (7)
  • Snowflake (3)
  • SQL Saturday (1)
  • SSIS (2)
  • T-SQL (5)
  • Uncategorized (2)

Archives

  • May 2025 (1)
  • March 2025 (1)
  • January 2025 (2)
  • October 2024 (1)
  • July 2024 (1)
  • May 2024 (1)
  • December 2023 (1)
  • November 2023 (1)
  • August 2023 (1)
  • June 2023 (1)
  • December 2022 (1)
  • November 2022 (1)
  • July 2022 (1)
  • March 2022 (1)
  • September 2021 (1)
  • May 2021 (1)
  • March 2021 (1)
  • February 2021 (3)
  • December 2020 (1)
  • October 2020 (3)
  • September 2020 (1)
  • August 2020 (1)
  • January 2020 (1)
  • August 2019 (1)
  • July 2019 (1)
  • June 2019 (1)
  • May 2019 (1)
  • April 2019 (1)
  • March 2019 (1)
  • November 2018 (3)
  • October 2018 (1)
  • September 2018 (1)
  • August 2018 (2)
© 2025 geohernandez | Powered by Minimalist Blog WordPress Theme