This is a short post that I wanted to write on the heels of doing a bunch of work in sp_QuickieStore.
Many times, pulling data out of odd structures like XML or JSON can lead to difficulty in correctly typing each output element. I run into this commonly with query plan XML, of course. You may run into it elsewhere.
The main issue is that I often need to compare what comes out of those odd data structures to data stored more properly in other system views. For example:
Query Hash: Binary 8
Query Plan Hash: Binary 8
SQL Handle: Varbinary 64
Plan Handle: Varbinary 64
There are some shenanigans you can pull with bigint conversions, but I’ve run into a lot of bugs with that approach. I don’t want to talk about it.
Nutty
As an example, this won’t match:
SELECT
c =
CASE
WHEN '0x1AB614B461F4D769' = 0x1AB614B461F4D769
THEN 1
ELSE 0
END;
The string does not implicitly convert to the binary 8 value. The same is true when you use varbinary values.
You might think that just converting the string to binary 8 would be enough, but no! This will still return a zero.
SELECT
c =
CASE
WHEN CONVERT(binary(8), '0x1AB614B461F4D769') = 0x1AB614B461F4D769
THEN 1
ELSE 0
END;
SELECT
no =
CONVERT(binary(8), '0x1AB614B461F4D769'),
yes = CONVERT(binary(8), '0x1AB614B461F4D769', 1);
no yes
0x3078314142363134 0x1AB614B461F4D769
The same is true with varbinary, too:
SELECT
no =
CONVERT(varbinary(64), '0x0900F46AC89E66DF744C8A0AD4FD3D3306B90000000000000000000000000000000000000000000000000000'),
yes = CONVERT(varbinary(64), '0x0900F46AC89E66DF744C8A0AD4FD3D3306B90000000000000000000000000000000000000000000000000000', 1);
no
0x30783039303046343641433839453636444637343443384130414434464433443333303642393030303030303030303030303030303030303030303030303030
yes
0x0900F46AC89E66DF744C8A0AD4FD3D3306B90000000000000000000000000000000000000000000000000000
The real answer here is to not rely on conversions, implicit or otherwise, when comparing data.
But, if you ever find yourself having to deal with some wonky binary data, this is one way to get yourself out of a scrape.
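As a sketch of what that looks like in practice (the variable name here is made up, and you’d swap in whatever view you’re matching against), passing style 1 to CONVERT lets a string hash pulled out of XML or JSON compare correctly against sys.dm_exec_query_stats:

```sql
/*A hedged sketch: @stored_hash is a hypothetical string pulled out of XML or JSON*/
DECLARE
    @stored_hash varchar(18) = '0x1AB614B461F4D769';

SELECT
    deqs.query_hash,
    deqs.plan_handle
FROM sys.dm_exec_query_stats AS deqs
WHERE deqs.query_hash = CONVERT(binary(8), @stored_hash, 1); /*style 1 treats the string as hex*/
```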
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
I get this question a lot while working with clients, largely in a couple specific contexts:
Me telling someone they need to use dynamic SQL in a stored procedure
Applications sending over parameterized SQL statements that are executed with sp_executesql
Often, the dynamic SQL recommendation comes from needing to deal with:
IF branching
Parameter sensitivity
Optional parameters
Local variables
Even in the context of a stored procedure, these things can really suck performance down to a sad nub.
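As a rough sketch of what the dynamic SQL recommendation looks like for optional parameters (procedure, table, and parameter names here are illustrative), you only append predicates when values are actually supplied, so each shape gets its own plan:

```sql
/*A hedged sketch of the optional-parameter pattern; names are made up*/
CREATE OR ALTER PROCEDURE
    dbo.OptionalSearch
(
    @OwnerUserId int = NULL,
    @Score int = NULL
)
AS
SET NOCOUNT, XACT_ABORT ON;
BEGIN
    DECLARE
        @sql nvarchar(MAX) = N'
    SELECT
        p.*
    FROM dbo.Posts AS p
    WHERE 1 = 1';

    /*Only add predicates for parameters that were passed in*/
    IF @OwnerUserId IS NOT NULL
    BEGIN
        SET @sql += N'
    AND p.OwnerUserId = @OwnerUserId';
    END;

    IF @Score IS NOT NULL
    BEGIN
        SET @sql += N'
    AND p.Score = @Score';
    END;

    EXEC sys.sp_executesql
        @sql,
        N'@OwnerUserId int, @Score int',
        @OwnerUserId,
        @Score;
END;
```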
But The Code
Now, much of the SQL generated by ORMs terrifies me.
Even when it’s developer misuse, and not the fault of the ORM, it can be difficult to convince those perfect angels that the query their code generated is suboptimal.
Now, look, simple queries do fine with an ORM (usually). Provided you’re:
Paying attention to indexes
Not using long IN clauses
Strongly typing parameters
Avoiding AddWithValues
You can skate by with your basic CRUD stuffs. I get worried as soon as someone looks at an ORM query and says “oh, that’s a report…” because there’s no way you’re generating reasonable reporting queries with an ORM.
Procedural Drama
The real upside of stored procedures isn’t stuff like plan reuse or caching or 1:1 better performance. A single parameterized query run in either context will perform the same, all things considered.
Where they shine is with additional flexibility in tuning things. Rather than one huge query that the optimizer has to deal with, you can split things up into more manageable chunks.
You also have quite a bit more freedom with various hints, trace flags, query rewrites, isolation levels, etc.
In other words: eventually your query needs will outgrow your ORM’s ability to generate optimal queries.
Until then, use whatever you’re able to get your job done with.
Thanks for reading!
USE StackOverflow2013;
EXEC dbo.DropIndexes;
SET NOCOUNT ON;
DBCC FREEPROCCACHE;
GO
CREATE INDEX
chunk
ON dbo.Posts
(OwnerUserId, Score DESC)
INCLUDE
(CreationDate, LastActivityDate)
WITH
(MAXDOP = 8, SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);
GO
CREATE OR ALTER VIEW
dbo.PushyPaul
WITH SCHEMABINDING
AS
SELECT
p.OwnerUserId,
p.Score,
p.CreationDate,
p.LastActivityDate,
PostRank =
DENSE_RANK() OVER
(
PARTITION BY
p.OwnerUserId
ORDER BY
p.Score DESC
)
FROM dbo.Posts AS p;
GO
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656;
GO
CREATE OR ALTER PROCEDURE
dbo.StinkyPete
(
@UserId int
)
AS
SET NOCOUNT, XACT_ABORT ON;
BEGIN
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = @UserId;
END;
GO
EXEC dbo.StinkyPete
@UserId = 22656;
/*Start Here*/
ALTER DATABASE
StackOverflow2013
SET PARAMETERIZATION SIMPLE;
DBCC TRACEOFF
(
4199,
-1
);
ALTER DATABASE SCOPED CONFIGURATION
SET QUERY_OPTIMIZER_HOTFIXES = OFF;
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656
AND 1 = (SELECT 1); /*Avoid trivial plan/simple parameterization*/
/*Let's cause a problem!*/
ALTER DATABASE
StackOverflow2013
SET PARAMETERIZATION FORCED;
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656
AND 1 = (SELECT 1); /*Avoid trivial plan/simple parameterization*/
/*Can we fix the problem?*/
DBCC TRACEON
(
4199,
-1
);
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656
AND 1 = (SELECT 1); /*Avoid trivial plan/simple parameterization*/
/*That's kinda weird...*/
DBCC FREEPROCCACHE;
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656
AND 1 = (SELECT 1); /*Avoid trivial plan/simple parameterization*/
/*Turn Down Service*/
DBCC TRACEOFF
(
4199,
-1
);
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656
AND 1 = (SELECT 1); /*Avoid trivial plan/simple parameterization*/
/*Okay then.*/
/*I'm different.*/
ALTER DATABASE SCOPED CONFIGURATION
SET QUERY_OPTIMIZER_HOTFIXES = ON;
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656
AND 1 = (SELECT 1); /*Avoid trivial plan/simple parameterization*/
/*Cleanup*/
ALTER DATABASE
StackOverflow2013
SET PARAMETERIZATION SIMPLE;
ALTER DATABASE SCOPED CONFIGURATION
SET QUERY_OPTIMIZER_HOTFIXES = OFF;
DBCC TRACEOFF
(
4199,
-1
);
In the release notes for SQL Server 2017 CU30, there’s a note that it fixes a problem where parameters can’t be pushed past Sequence Project operators:
“In Microsoft SQL Server 2017, running parameterized queries skips the SelOnSeqPrj rule. Therefore, pushdown does not occur.” But it doesn’t actually do that.
Here are the good and bad plans, comparing using a literal value vs. a parameterized value:
dunksville
In the plan with a literal value, the predicate is applied at the index seek, and the filtering is really quick.
In the plan with a parameterized value, the index is scanned, and the predicate is applied at a Filter operator much later in the query plan.
This is where the SelOnSeqPrj rule comes in: The parameter can’t be pushed past the Sequence Project operator like the literal value can.
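To see the shape of the repro for yourself, here’s a sketch (the view definition is an assumption on my part, not taken from the CU notes): filter a ranking function’s partitioning column from outside a view, once with a literal and once with a parameter:

```sql
/*A hedged sketch: the view shape is assumed, not copied from the release notes*/
CREATE OR ALTER VIEW
    dbo.RankedBadges
AS
SELECT
    b.UserId,
    b.Name,
    n =
        ROW_NUMBER() OVER
        (
            PARTITION BY
                b.UserId
            ORDER BY
                b.Date DESC
        )
FROM dbo.Badges AS b;
GO

/*Literal: the predicate can be pushed below the Sequence Project*/
SELECT
    rb.*
FROM dbo.RankedBadges AS rb
WHERE rb.UserId = 22656;

/*Parameterized: without SelOnSeqPrj firing, the filter happens late in the plan*/
DECLARE
    @UserId int = 22656;

SELECT
    rb.*
FROM dbo.RankedBadges AS rb
WHERE rb.UserId = @UserId;
```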
Thanks for reading!
First, I want to advise you to avoid doing things like this in T-SQL as much as possible. Use the CLR, if you can. Or do it outside of your really expensive database server.
You know, rational choices.
Recently, I had an interesting client problem where they had a variety of string formats, and different potential start and end points to get a substring from.
Normally, I love computed columns for this, but there was no way to get exactly what they wanted without using a whole bunch of them. It was sort of annoying.
Because it was easier to get across in T-SQL, we ended up using a trigger.
Don’t look at me like that.
CharindexOf
First, let’s load up a small temp table with some different string values.
CREATE TABLE
#strings
(
string varchar(4)
);
INSERT
#strings
(
string
)
SELECT '1234'
UNION ALL
SELECT '2341'
UNION ALL
SELECT '3412'
UNION ALL
SELECT '4123';
Now, let’s say that there was a need to figure out which number came first in the string. You can use the CHARINDEX function to do that, but it’s a little difficult to make it “dynamic”.
To make things easy, let’s create a helper table of search patterns.
CREATE TABLE
#searches
(
search varchar(1)
);
INSERT
#searches
(
search
)
SELECT '1'
UNION ALL
SELECT '2'
UNION ALL
SELECT '3'
UNION ALL
SELECT '4';
ApplyOf
Now we can use a little bit of apply and derived table magic to figure out which of our search values appear first in our strings.
SELECT
s.string,
x1.search,
x1.search_position
FROM #strings AS s
CROSS APPLY
(
SELECT TOP (1)
x0.search,
x0.search_position
FROM
(
SELECT
s2.search,
search_position =
CHARINDEX(s2.search, s.string)
FROM #searches AS s2
) AS x0
ORDER BY x0.search_position
) AS x1;
The results look like this:
medium
SubstringOf
Now we can add in the LEN and SUBSTRING functions in order to parse out the part of the column we’re interested in.
SELECT
s.string,
x1.search,
x1.search_position,
x1.search_length,
sub =
SUBSTRING
(
s.string,
x1.search_position,
x1.search_length
)
FROM #strings AS s
CROSS APPLY
(
SELECT TOP (1)
x0.search,
x0.search_position,
x0.search_length
FROM
(
SELECT
s2.search,
search_position =
CHARINDEX(s2.search, s.string),
search_length =
LEN(s2.search)
FROM #searches AS s2
) AS x0
ORDER BY x0.search_position
) AS x1;
Now the results look like this:
Smart E. Pants
This is an admittedly pretty rudimentary example, but it’s tough to make this complicated without making it overly so.
You can normally eyeball a query to find things that generally don’t agree with performance out of the box, like:
Functions (inline ones aside)
Table variables
Stacked Common Table Expressions
Non-SARGable predicates
Overly complicated queries
Insert a million other things here
But of course, the more complicated queries are, or the more layers of abstraction exist in a query, the harder this stuff is to spot quickly. Particularly with views, and nested views, bad ideas can be buried many layers deep in the sediment.
I call it sediment because code often looks like geologic layers, where you can tell who wrote what and when based on style and techniques that got used.
And to vendors who encrypt their god-awful code: you know who you are.
The Great Untangling
Getting through that untangling can be a costly and time consuming process, depending on the level of damage done over the years, and the desired outcome. Sometimes it’s easier to rewrite everything from scratch than to do in-place rewrites of existing objects.
It’s obviously worth exploring enhancements in newer versions of SQL Server that may power things across the finish line:
Perhaps the new cardinality estimator does more good than harm
Batch Mode On Row Store does a lot of good with bad code
Scalar UDF Inlining can solve a lot of function problems
There are many other general and targeted improvements that might help your workload without code changes. Hopefully that continues with SQL Server 2022.
On top of the workload improvements, new versions also provide improved insights into problems via dynamic management views, Query Store, logging, and more.
If you’re not on at least SQL Server 2016 right now, you’re leaving a whole lot on the table as far as this goes.
Hiring Issues
It’s tough for smaller companies to attract full time talent to fix huge backlogs of issues across SQL Server stored procedures, functions, views, index and table design, and all that.
Or even harder, convert ORM queries into sensible stored procedures, etc. when you start hitting performance limitations in the single-query model.
First, I need to acknowledge that not everyone wants to work for a huge company. Second, I need to acknowledge that salary isn’t everything to everyone.
But let’s assume that a smaller company wants to hire someone in competition with a larger company. What can they offer when they run out of salary runway and can’t match equity?
Clear career paths/Upward mobility
Flexible schedules
Paid time off for training
Covering the costs of training and certifications
Focusing on employee growth (not just sticking them in a corner to monkey with the same thing for years)
Quality of company culture (meeting overload was something I got a lot of DMs about)
Conference travel budgets
Meaningful company mission
Introducing tech savvy folks to the business side of things
Recognizing that not every employee wants to be an On-callogist
There were more, but these were the things I got the most hits from folks on. Having these doesn’t mean you can expect someone to take 20-30% less on the salary front, of course, but if you’re close to another offer these things might sway folks to your side.
Far and away, what I took from responses is that folks want to feel effective; like they can make a difference without a lot of bureaucracy and red tape. Get the hell out of my way, to coin a phrase.
Finder’s Fee
When it comes to attracting people to your company — think of it as your employer SEO — the SQL Server community is a great place to start.
If you want to try something for free, keep an eye out for when Brent posts to find out Who’s Hiring In The Database Community. It doesn’t cost you anything, but you have to keep on top of the posts and replies, and make sure you have a good job description that sticks out.
If you have any location-based requirements for your candidates, try sponsoring a local SQL Server user group’s meetings for a few months. There may be a small, nominal fee if it’s entirely virtual. If it’s in-person, you’ll foot the bill for a dozen or so pizza pies for attendees. That usually gets you an announcement before and after whatever speaker is presenting. It’s totally fair to ask for attendance numbers.
Keeping on with that, consider sponsoring a SQL Saturday event. These typically have a deeper reach than a monthly user group, since there are more attendees in a concentrated area. You may get a booth, or your logo on slides, and whatever else you can negotiate with the event planners.
If you’re okay with spending more money for a lot of eyeballs, larger events like PASS Summit and SQLBits are annual conferences with thousands of attendees. As an FYI, these are the types of conferences whoever you hire is probably going to want to attend, too.
Imagine that.
Askance
I have clients ask me to help them find quality employees for roles from time to time, or to help them interview folks they’ve farmed themselves.
Normally I’m happy to help on either front, and leave sealing the deal to them. I think from now on I’m gonna point them to this post, so they have some better ideas about how to put a stamp on things.
Not every company can offer everything, but as large companies continue to gobble up smaller ones, and Microsoft in particular keeps fishing folks out of the MVP pool, it’s going to be harder for those who remain to stay competitive. At least I think so: I haven’t quite been persuaded that there will be a kumbaya moment where everyone gets sick of the MegaCorp grind and goes back to mom-and-pop shops to reclaim their lost souls.
After all, a lot of folks do have their sights set on retirement. High salaries and generous equity (well, maybe not equity as the market is currently behaving) certainly help get them there faster.
That’s part of the picture that you can’t easily ignore, along with the oft-proffered wisdom that the only way to stay on a competitive salary track is to change jobs every 2-3 years.
Retention is going to get more difficult for everyone across the board, but the revolving door will largely let out with the bigger players who can afford to keep it spinning.
Thanks for reading!
I’m a really big fan of using operator properties for a lot of things, at least visually. Where things sort of fall down for that is copying and pasting things out.
For some stuff, you still need to head down to the XML.
Let’s say you have a stored procedure that accepts a bunch of parameters. The rest of this one isn’t important, but here you go:
CREATE OR ALTER PROCEDURE
dbo.AwesomeSearchProcedure
(
@OwnerUserId int = NULL,
@CreationDate datetime = NULL,
@LastActivityDate datetime = NULL,
@PostTypeId int = NULL,
@Score int = NULL,
@Title nvarchar(250) = NULL,
@Body nvarchar(MAX) = NULL
)
A Plan Appears
Let’s say we grab a query plan for this thing from the plan cache or query store. We can get the properties of the select operator and see compile time values:
Again — nice visually — but it doesn’t do much for us if we want to recreate executing the stored procedure to get an actual execution plan.
It’s also not terribly helpful if we want to simulate a parameter sniffing situation, because we only have the compile time values, not the runtime values.
Bummer. But whatever.
XML Time!
If we right click and select “show execution plan XML”, we can scroll way down to the bottom to find the XML fragment that holds what the properties display:
This still isn’t awesome, because we have to do some surgery on the XML itself to get values out.
It’s even worse if we have a parameterized application query, because not only do we need to write a DECLARE to assign values to these variables, but we also need to turn the query itself into dynamic SQL.
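That surgery ends up looking something like this (the query, parameter names, and values here are all illustrative, copied by hand out of a ParameterList fragment):

```sql
/*A hedged sketch: values transcribed by hand from the ParameterList XML*/
DECLARE
    @OwnerUserId int = 22656,
    @PostTypeId int = 1;

/*The parameterized application query, rewrapped as dynamic SQL*/
EXEC sys.sp_executesql
    N'
SELECT
    p.*
FROM dbo.Posts AS p
WHERE p.OwnerUserId = @OwnerUserId
AND   p.PostTypeId = @PostTypeId;',
    N'@OwnerUserId int, @PostTypeId int',
    @OwnerUserId,
    @PostTypeId;
```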
For most things, I absolutely adore using operator properties. For some things, you still need the XML.
It’d be nice if there were some fancy copy and paste magic that would do that for you, but so far it doesn’t exist.
Thanks for reading!
There’s a lot of stuff flying around in a query plan. Data and what not.
Sure, you can hover over operators and arrows and see some stuff, but if you really wanna see stuff — I mean REALLY wanna see stuff — you gotta get into the properties.
You can access those in two ways:
Hit F4 (not F5 again; you already suffered through that)
Right click on any operator in the query plan and hit Properties
And that, my dear friend, will unlock many mysteries in your query plans.
Start With Select
I think one of the most interesting places to start is with the root operator (select, insert, update, delete), because there’s so much in there.
Here’s an abridged list of things you can see from the properties of the root node of an Actual Execution Plan:
Compile metrics: CPU, duration, memory
Degree Of Parallelism
Detailed Memory Grant information
Stats Usage
Query Time Stats in CPU and duration (including UDF times)
Parameter compile and runtime values
Nonparallel Plan reasons
Set Options
Warnings
CPU thread usage
Wait Stats
There’s more in there too, but holy cow! All the stuff you can learn here is fantastic. You might not be able to solve all your problems looking here, but it’s as good a place to start as any.
Plus, this is where you can get a sense of just how long your query ran for, and start tracking down the most troublesome operators.
Follow The Time
I’ve said before that operator costs are basically useless, and you should be following the operator times to figure out where things get wonky.
For some operators, just looking at the tool tip is enough. For example, if you have an operator that piles up a bunch of execution time because of a spill, the spill details are right in front of you.
contagious
But other times, operator properties expose things that aren’t surfaced at the tool tip.
Skew Manchu
Take skewed parallelism, for example. There are no visual indicators that it happened (maybe there should be, but given the warnings we get now, I’m not sure I trust that summer intern).
year of the spider
But you know, it might be nice to know about stuff like this. Each thread is supposed to get an equal portion of the query memory grant, and if work is distributed unevenly, you can end up with weird, random performance issues.
This is something I almost always spot check in parallel plans. In a perfect world, duration would be CPU➗DOP. Life rarely ends up perfect, which is why it’s worth a look.
I To The O
You can also see I/O stats at the operator level, logical and physical. This is why I kinda laugh at folks who still use SET STATISTICS TIME, IO ON; you can get all of that in one place: your query plan.
ding!
You can interpret things in the same way; it’s just a little easier to chew.
Thanks for reading!
The good news is that SQL Server’s query plans will attempt to warn you about problems. The bad news is that most of the warnings only show up in Actual Execution Plans. The worse news is that a lot of the warnings that try to be helpful in Estimated Execution plans can be pretty misleading.
Some of these are considered runtime issues, and are only available in Actual Execution Plans, like:
Spills to tempdb
Memory Grants
I’ve never seen the “Spatial Guess” warning in the wild, which probably speaks to the fact that you can measure Spatial data/index adoption in numbers that are very close to zero. I’ve also never seen the Full Update For Online Index Build warning.
Then there are some others like Columns With No Statistics, Plan Affecting Converts, No Join Predicate, and Unmatched Indexes.
Let’s talk about those a little.
Columns With No Statistics
I almost never look at these, unless they’re from queries hitting indexed views.
The only time SQL Server will generate statistics on columns in an indexed view is when you use the NOEXPAND hint in your query. That might be very helpful to know about, especially if you don’t have useful secondary indexes on your indexed view.
If you see this in plans that aren’t hitting an indexed view, it’s likely that SQL Server is complaining that multi-column statistics are missing. If your query has a small number of predicates, it might be possible to figure out which combination and order will satisfy the optimizer, but it’s often not worth the time involved.
Like I said, I rarely look at these. Though one time it did clue me in to the fact that a database had auto create stats disabled.
So I guess it’s nice once every 15 years or so.
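If you ever do decide to chase one of these warnings down, creating the multi-column statistics object yourself is straightforward. A sketch (the statistics name and column list here are illustrative, and you’d tailor them to the warning):

```sql
/*A hedged sketch: name and column choice are assumptions you'd tailor to the warning*/
CREATE STATISTICS
    s_Users_Reputation_Views
ON dbo.Users
    (Reputation, Views);
```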
Plan Affecting Converts
There are two of these:
Ones that might affect cardinality estimates
Ones that might affect your ability to seek into an index
Cardinality Affecting
The problem I have with the cardinality estimation warning is that it shows up when it’s totally useless.
SELECT TOP (1)
Id = CONVERT(varchar(1), u.Id)
FROM dbo.Users AS u;
fine2me
Like I said, misleading.
Seek Affecting
These can be misleading, but I often pay a bit more attention to them. They can be a good indicator of data type issues in comparison operations.
Where they’re misleading is when they tell you they mighta-coulda done a seek, when you don’t have an index that would support a seek.
SELECT TOP (1)
u.Id
FROM dbo.Users AS u
WHERE u.Reputation = CONVERT(sql_variant, N'138');
knot4you
Of course, without an index on Reputation, what am I going to seek to?
Nothing. Nothing at all.
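For completeness, a sketch of what clears the warning: an index on Reputation, plus a predicate typed to match the column (the index name is made up):

```sql
/*A hedged sketch: index name is made up*/
CREATE INDEX
    rep
ON dbo.Users
    (Reputation);

SELECT TOP (1)
    u.Id
FROM dbo.Users AS u
WHERE u.Reputation = 138; /*an int literal, so no convert gets in the way*/
```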
No Join Predicate
This one is almost a joke, I think.
Back when people wrote “old style joins”, they could have missed a predicate, or something. Like so:
SELECT
c = COUNT_BIG(*)
FROM dbo.Users AS u,
dbo.Badges AS b,
dbo.Comments AS c
WHERE u.Id = b.UserId;
/*Oops no join on comments!*/
Except there’s no warning in this query plan for a missing join predicate.
well okay
But if we change the query to this, it’ll show up:
SELECT
u.Id
FROM dbo.Users AS u,
dbo.Badges AS b,
dbo.Comments AS c
WHERE u.Id = b.UserId;
/*Oops no join on comments!*/
greatly
But let’s take a query that has a join predicate:
SELECT TOP (1)
b.*
FROM dbo.Comments AS c
JOIN dbo.Badges AS b
ON c.UserId = b.UserId
WHERE b.UserId = 22656;
We still get that warning:
tough chickens
We still get a missing join predicate warning, even though we have a join predicate. The predicate is implied here, because of the where clause.
But apparently the check for this happens only at the Nested Loops Join. No attempt is made to track pushed-down predicates any further.
run for your life
If there were, the warning would not appear.
Unmatched Indexes
If you create filtered indexes, you should know a couple things:
It’s always a good idea to have the column(s) you’re filter(ing) on somewhere in the index definition (key or include, whatever)
If query predicate(s) are parameterized on the column(s) you’re filter(ing) on, the optimizer probably won’t choose your filtered index
I say probably because recompile hints and unsafe dynamic SQL may prompt it to use your filtered index. But the bottom line here is parameters and filtered indexes are not friends in some circumstances.
Here’s a filtered index:
CREATE INDEX
cigarettes
ON dbo.Users
(Reputation)
WHERE
(Reputation >= 1000000)
WITH
(SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);
And here’s a query that should use it:
SELECT
u.Reputation
FROM dbo.Users AS u
WHERE u.Reputation >= 1000000;
BUUUUUUUUUUUUUUUT!
combine
SQL Server warns us we didn’t. This is an artifact of Simple Parameterization, which happens early on in the Trivial Plan optimization phase.
It’s very misleading, that.
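If you’re stuck with a parameterized value, a recompile hint is one way to let the optimizer see the runtime value and match the filtered index. A sketch:

```sql
/*A hedged sketch: OPTION (RECOMPILE) lets the optimizer see the runtime value,
  so the filtered index can be safely matched*/
DECLARE
    @Reputation int = 1000000;

SELECT
    u.Reputation
FROM dbo.Users AS u
WHERE u.Reputation >= @Reputation
OPTION (RECOMPILE);
```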
Warnings And Other Drugs
In this post we covered common scenarios where plan warnings just don’t add up to much. Does that mean you should always ignore them? No, but don’t be surprised if your investigation turns up zilch.
If you’re interested in learning more about spills, check out the Spills category of my blog. I’ve got a ton of posts about them.
At this point, you’re probably wondering why people bother with execution plans. I’m sort of with you; everything short of an actual execution plan feels futile and useless, and seems to lie to you.
Hopefully Microsoft invests more in making the types of feedback mechanisms behind gathering plans and runtime metrics easier for casual users in future versions of SQL Server.
Thanks for reading!
The Users table looks like this. It’s important for me to show you this up front, because column ordinal position in the table is important for understanding missing index requests.
what-whatcha need?
Keep this in mind: the columns aren’t in alphabetical order, ordered by how selective they are, or by data type.
They’re in the order they were in when the table was created, followed by any that were added later on.
That’s all.
Long Time
Let’s take this query:
SELECT TOP (10000)
u.Id,
u.AccountId,
u.DisplayName,
u.Reputation,
u.Views,
u.CreationDate
FROM dbo.Users AS u
WHERE u.Views = 0
ORDER BY u.CreationDate DESC;
Is it very useful? No. But it’ll help us paint the right picture. The query plan doesn’t matter, because it’s just a clustered index scan, and it’ll be a clustered index scan for every other permutation, too.
The missing index for this query is like so:
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Users] ([Views])
INCLUDE ([CreationDate],[DisplayName],[Reputation],[AccountId])
Only the Views column is in the key of the index, even though CreationDate is an order by column.
In this case, it would be beneficial to have it as the second key column, because it would be sorted for free for us after an equality predicate.
You may also notice that Id is not part of the definition, either. That’s because it’s the clustered index key, so it will be inherited by any nonclustered indexes we create.
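If you were writing this index by hand, putting CreationDate as the second key column (descending, to match the order by) would look something like this (the index name is made up):

```sql
/*A hedged sketch: a hand-tuned version of the request, index name made up*/
CREATE INDEX
    views_creationdate
ON dbo.Users
    (Views, CreationDate DESC)
INCLUDE
    (DisplayName, Reputation, AccountId);
```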
Normal
Okay, now let’s look at this query, with a slightly different where clause:
SELECT TOP (10000)
u.Id,
u.AccountId,
u.DisplayName,
u.Reputation,
u.Views,
u.CreationDate
FROM dbo.Users AS u
WHERE u.Views = 0
AND u.Reputation = 1
ORDER BY u.CreationDate DESC;
We’re adding another predicate on Reputation = 1 here. The missing index request now looks like this:
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Users] ([Reputation],[Views])
INCLUDE ([CreationDate],[DisplayName],[AccountId])
Neither one of these predicates is particularly selective (7,954,119 and 6,197,417 rows, respectively), but Reputation ends up first in the key column list because its ordinal position in the table is first.
Frequency
How about if we add a really selective predicate to our query?
SELECT TOP (10000)
u.Id,
u.AccountId,
u.DisplayName,
u.Reputation,
u.Views,
u.CreationDate
FROM dbo.Users AS u
WHERE u.AccountId = 12462842
AND u.Views = 0
ORDER BY u.CreationDate DESC;
Now our missing index request looks like this:
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Users] ([Views],[AccountId])
Two things happened:
We don’t have any included columns anymore
AccountId is second in the key columns
This is amusing because the missing index request machine seems to understand that this will only ever match one row via the equality predicate on AccountId, but it still gets enumerated as the second index key column.
In other words, it doesn’t put the most selective column first. It gives you an index designed, like in the other examples, based on the column’s ordinal position in the table.
Nothing else, at least not so far.
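A hand-tuned version would lead with the selective equality column instead (the index name is made up):

```sql
/*A hedged sketch: AccountId first, since only one row qualifies via that predicate*/
CREATE INDEX
    accountid_views
ON dbo.Users
    (AccountId, Views);
```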
Inequality
Where missing index requests will change column order is when it comes to inequality predicates. That doesn’t just mean not equal to, oh no no no.
That covers any “range” predicate: <, <=, >, >=, <> or !=, and IS NOT NULL.
Take this query for example:
SELECT TOP (10000)
u.Id,
u.AccountId,
u.DisplayName,
u.Reputation,
u.Views,
u.CreationDate
FROM dbo.Users AS u
WHERE u.AccountId = 0
AND u.Reputation < 0
AND u.Views < 0
ORDER BY u.CreationDate DESC;
The missing index request looks like this:
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Users] ([AccountId],[Reputation],[Views])
Now AccountId is the leading key column, but Reputation and Views are still in ordinal position order as inequality predicates.
Wink Wink
Now, look, I started off by saying that missing index requests aren’t perfect, and that’s okay. They’re not meant to replace a well-trained person. They’re meant to help the hapless when it comes to fixing slow queries.
As you get more comfortable with indexes and how to create them to make queries go faster, you’ll start to see deficiencies in missing index requests.
But you don’t want the optimizer spending a long time in the index matching/missing index request phases. That’s a bad use of its time.
As you progress, you’ll start treating missing index requests like a crying baby: something might need attention, but it’s up to you as an adult DBA or developer to figure out what that is.
Thanks for reading!