I have to confess; I have had traumatic experiences with variant data type in my beginning as a software developer, in fact, I always have believed that it is an anti-pattern, at least in the OOP world, on the other hand in the Database world it gains much relevance, Why? Because of every column inside a table is supposed to be a specific datatype, otherwise, it could create a great number of conflicts and uncomfortable situations.
Do you think that is good having a kind of all-in-one data type? Frankly, my experience tells me: No. I don’t come to my any positive memories derived from variant datatype, on the other hand, the DB world by definition must be more restrictive and the idea of having this magic or “versatile” datatype is very contradictory. However, we have to accept that we are not living in a perfect world and probably my strong position in some aspect of programming is not sharing by others, and we have in MS SQL Server a magic datatype, please, a great applause for the brilliant and amazing Mr.Sql_variant.
I don’t expect to spend much time describing sql_variant datatype, I would like to mention the official explanation from Microsoft, in simple words: “A data type that stores values of various SQL Server supported data types”. It looks nice in the beginning, but the aim of this short blog is to show you a specific scenario where this implementation can bring us many headaches.
In this case, suppose that we decide to query the system table: sys.partition_range_values, and for some tasks we want to filter some information based on [value] column. In the context of this article, I would like to use a very nice view designed by Kalen Delaney, this is the link:
http://sqlblog.com/blogs/kalen_delaney/archive/2010/07/26/my-big-fat-partitioning-query.aspx
Basically, this view can allow us to get different information related to partitions, indexes, ranges,
a very complete suite to query and work in the partitioning of huge tables. One time I said that, suppose that you want to do a query like the next:
DECLARE @oldestDate AS DATETIME2 = '2018-09-10 15:46:47.1270709'; SELECT MAX(partition_number) FROM ptnmnt.vw_PartitionInfo WHERE index_id = 2 AND partitionFunctionName = 'SomeFunctionName' AND ( boundary_value_on_right = 1 AND CAST(value AS DATETIME2) = @oldestDate OR boundary_value_on_right = 0 AND CAST(value AS DATETIME2) = @oldestDate );
The worst part is the awful message that you are going to receive:
A simple way of “fixing” consist in replace the datatype, and you can think that it would be enough,
but let me do it
DECLARE @oldestDate AS DATETIME2 = '2018-09-10 15:46:47.1270709'; SELECT MAX(partition_number) FROM ptnmnt.vw_PartitionInfo WHERE index_id = 2 AND partitionFunctionName = 'SomeFunctionName' AND ( boundary_value_on_right = 1 AND CAST(value AS DATETIME) = @oldestDate OR boundary_value_on_right = 0 AND CAST(value AS DATETIME) = @oldestDate );
We will receive another ugly message like this:
It is on this point when you probably want to take the justice in your own hands (sorry, I did not find a good gif for the previous comment), Why are we trying to fix through try-error-try approach? In this kind of situation is when we should stop and start to review in deep, what kind of datatype is expected to have in value column, and magically appear the famous Mr.SqlVariant
Reviewing inside the sys.sysobjvalues we can find this:
Indeed, the column [value] is a sql_variant, what exactly does mean? Basically, it can store a variety of datatypes, a good option to know exactly what kind of datatype it is using can be the function << SQL_VARIANT_PROPERTY>> this functions returns the base datatype of sql_variant value, but how this function can help us to avoid the previous errors, the next script is a sample about how
we can use it:
SELECT TOP 1 @DataTypeSQLVariant = LOWER(CONVERT( VARCHAR(25) , SQL_VARIANT_PROPERTY (value ,'BaseType') ) ) FROM ptnmnt.vw_PartitionInfo WHERE index_id = 2 AND value IS NOT NULL AND partitionFunctionName = 'SomeFunctionName';
This previous query supplied me the datatype that we can use to verify and avoid error in the next query:
IF @DataTypeSQLVariant = 'datetime' BEGIN SELECT @LastEmptyPartitionNumber = MAX(partition_number) FROM ptnmnt.vw_PartitionInfo WHERE index_id < 2 AND partitionFunctionName = 'SomeFunctionName' AND ( boundary_value_on_right = 1 AND TRY_CAST(value AS DATETIME) <= @oldestDate OR boundary_value_on_right = 0 AND TRY_CAST(value AS DATETIME) < @oldestDate ); END;
In case that you are able to work with different partition function name as a parameter, you can check the different base datatypes which belong to partition function name and create a group of IF statement or even try to implement SQL Dynamic for each feasible datatype.
Summary
As you can see, the sql_variant datatype shall represent us a challenge and we need to be careful in the ways that we want to use, in this scenario we had to adapt our query to different possibility and returned base datatype. Fortunately, we count with a valuable support function, are these situations where we thankful for adopting a defensive strategy and avoid awful conversion errors.
Happy Querying!!!!
In this quick guide, we’ll walk through the essential steps to connect to Snowflake using…
I am thrilled to share that I have embarked on a new professional journey as…
Since 2005, I've immersed myself in the dynamic world of data and its modeling. It's…
Are you a PostgreSQL enthusiast using DBeaver on a Windows Platform? If you find yourself…
Over time, it's fascinating to witness how certain concepts, approaches, or visions age. Time, in…
When we are working with lists, dictionaries, and sets in Python, we have a special…
View Comments
Very nice work Geo !! Keep up the good work.
Awesome article Geooooooooo! very well detailed and easy to read....