This is a short post that I wanted to write on the heels of doing a bunch of work in sp_QuickieStore.
Many times, pulling data out of odd structures like XML or JSON can lead to difficulty in correctly typing each output element. I run into this commonly with query plan XML, of course. You may run into it elsewhere.
The main issue is that I often need to compare what comes out of those odd data structures to data stored more properly in other system views. For example:
Query Hash: Binary 8
Query Plan Hash: Binary 8
SQL Handle: Varbinary 64
Plan Handle: Varbinary 64
There are some shenanigans you can pull with bigint conversions, but I’ve run into a lot of bugs with that approach. I don’t want to talk about it.
Nutty
As an example, this won’t match:
SELECT
c =
CASE
WHEN '0x1AB614B461F4D769' = 0x1AB614B461F4D769
THEN 1
ELSE 0
END;
The string does not implicitly convert to the binary 8 value. The same is true when you use varbinary values.
You might think that just converting the string to binary 8 would be enough, but no! This will still return a zero.
SELECT
c =
CASE
WHEN CONVERT(binary(8), '0x1AB614B461F4D769') = 0x1AB614B461F4D769
THEN 1
ELSE 0
END;
SELECT
no =
CONVERT(binary(8), '0x1AB614B461F4D769'),
yes = CONVERT(binary(8), '0x1AB614B461F4D769', 1);
no yes
0x3078314142363134 0x1AB614B461F4D769
The same is true with varbinary, too:
SELECT
no =
CONVERT(varbinary(64), '0x0900F46AC89E66DF744C8A0AD4FD3D3306B90000000000000000000000000000000000000000000000000000'),
yes = CONVERT(varbinary(64), '0x0900F46AC89E66DF744C8A0AD4FD3D3306B90000000000000000000000000000000000000000000000000000', 1);
no
0x30783039303046343641433839453636444637343443384130414434464433443333303642393030303030303030303030303030303030303030303030303030
yes
0x0900F46AC89E66DF744C8A0AD4FD3D3306B90000000000000000000000000000000000000000000000000000
The real answer here is to not rely on conversions, implicit or otherwise, when comparing data.
But, if you ever find yourself having to deal with some wonky binary data, this is one way to get yourself out of a scrape.
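As a sketch of what that looks like in practice (the variable name here is made up, and you’d swap in whatever view you’re matching against), passing style 1 to CONVERT lets a string hash pulled out of XML or JSON compare correctly against sys.dm_exec_query_stats:

```sql
/*A hedged sketch: @stored_hash is a hypothetical string pulled out of XML or JSON*/
DECLARE
    @stored_hash varchar(18) = '0x1AB614B461F4D769';

SELECT
    deqs.query_hash,
    deqs.plan_handle
FROM sys.dm_exec_query_stats AS deqs
WHERE deqs.query_hash = CONVERT(binary(8), @stored_hash, 1); /*style 1 treats the string as hex*/
```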
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
I get this question a lot while working with clients, largely in a couple specific contexts:
Me telling someone they need to use dynamic SQL in a stored procedure
Applications sending over parameterized SQL statements that are executed with sp_executesql
Often, the dynamic SQL recommendation comes from needing to deal with:
IF branching
Parameter sensitivity
Optional parameters
Local variables
Even in the context of a stored procedure, these things can really suck performance down to a sad nub.
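As a rough sketch of what the dynamic SQL recommendation looks like for optional parameters (procedure, table, and parameter names here are illustrative), you only append predicates when values are actually supplied, so each shape gets its own plan:

```sql
/*A hedged sketch of the optional-parameter pattern; names are made up*/
CREATE OR ALTER PROCEDURE
    dbo.OptionalSearch
(
    @OwnerUserId int = NULL,
    @Score int = NULL
)
AS
SET NOCOUNT, XACT_ABORT ON;
BEGIN
    DECLARE
        @sql nvarchar(MAX) = N'
    SELECT
        p.*
    FROM dbo.Posts AS p
    WHERE 1 = 1';

    /*Only add predicates for parameters that were passed in*/
    IF @OwnerUserId IS NOT NULL
    BEGIN
        SET @sql += N'
    AND p.OwnerUserId = @OwnerUserId';
    END;

    IF @Score IS NOT NULL
    BEGIN
        SET @sql += N'
    AND p.Score = @Score';
    END;

    EXEC sys.sp_executesql
        @sql,
        N'@OwnerUserId int, @Score int',
        @OwnerUserId,
        @Score;
END;
```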
But The Code
Now, much of the SQL generated by ORMs terrifies me.
Even when it’s developer misuse, and not the fault of the ORM, it can be difficult to convince those perfect angels that the query their code generated is suboptimal.
Now, look, simple queries do fine with an ORM (usually). Provided you’re:
Paying attention to indexes
Not using long IN clauses
Strongly typing parameters
Avoiding AddWithValues
You can skate by with your basic CRUD stuffs. I get worried as soon as someone looks at an ORM query and says “oh, that’s a report…” because there’s no way you’re generating reasonable reporting queries with an ORM.
Procedural Drama
The real upside of stored procedures isn’t stuff like plan reuse or caching or 1:1 better performance. A single parameterized query run in either context will perform the same, all things considered.
Where they shine is with additional flexibility in tuning things. Rather than one huge query that the optimizer has to deal with, you can split things up into more manageable chunks.
You also have quite a bit more freedom with various hints, trace flags, query rewrites, isolation levels, etc.
In other words: eventually your query needs will outgrow your ORM’s ability to generate optimal queries.
Until then, use whatever you’re able to get your job done with.
Thanks for reading!
USE StackOverflow2013;
EXEC dbo.DropIndexes;
SET NOCOUNT ON;
DBCC FREEPROCCACHE;
GO
CREATE INDEX
chunk
ON dbo.Posts
(OwnerUserId, Score DESC)
INCLUDE
(CreationDate, LastActivityDate)
WITH
(MAXDOP = 8, SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);
GO
CREATE OR ALTER VIEW
dbo.PushyPaul
WITH SCHEMABINDING
AS
SELECT
p.OwnerUserId,
p.Score,
p.CreationDate,
p.LastActivityDate,
PostRank =
DENSE_RANK() OVER
(
PARTITION BY
p.OwnerUserId
ORDER BY
p.Score DESC
)
FROM dbo.Posts AS p;
GO
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656;
GO
CREATE OR ALTER PROCEDURE
dbo.StinkyPete
(
@UserId int
)
AS
SET NOCOUNT, XACT_ABORT ON;
BEGIN
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = @UserId;
END;
GO
EXEC dbo.StinkyPete
@UserId = 22656;
/*Start Here*/
ALTER DATABASE
StackOverflow2013
SET PARAMETERIZATION SIMPLE;
DBCC TRACEOFF
(
4199,
-1
);
ALTER DATABASE SCOPED CONFIGURATION
SET QUERY_OPTIMIZER_HOTFIXES = OFF;
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656
AND 1 = (SELECT 1); /*Avoid trivial plan/simple parameterization*/
/*Let's cause a problem!*/
ALTER DATABASE
StackOverflow2013
SET PARAMETERIZATION FORCED;
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656
AND 1 = (SELECT 1); /*Avoid trivial plan/simple parameterization*/
/*Can we fix the problem?*/
DBCC TRACEON
(
4199,
-1
);
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656
AND 1 = (SELECT 1); /*Avoid trivial plan/simple parameterization*/
/*That's kinda weird...*/
DBCC FREEPROCCACHE;
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656
AND 1 = (SELECT 1); /*Avoid trivial plan/simple parameterization*/
/*Turn Down Service*/
DBCC TRACEOFF
(
4199,
-1
);
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656
AND 1 = (SELECT 1); /*Avoid trivial plan/simple parameterization*/
/*Okay then.*/
/*I'm different.*/
ALTER DATABASE SCOPED CONFIGURATION
SET QUERY_OPTIMIZER_HOTFIXES = ON;
SELECT
p.*
FROM dbo.PushyPaul AS p
WHERE p.OwnerUserId = 22656
AND 1 = (SELECT 1); /*Avoid trivial plan/simple parameterization*/
/*Cleanup*/
ALTER DATABASE
StackOverflow2013
SET PARAMETERIZATION SIMPLE;
ALTER DATABASE SCOPED CONFIGURATION
SET QUERY_OPTIMIZER_HOTFIXES = OFF;
DBCC TRACEOFF
(
4199,
-1
);
In the release notes for SQL Server 2017 CU30, there’s a note that it fixes a problem where parameters can’t be pushed past Sequence Project operators:
“In Microsoft SQL Server 2017, running parameterized queries skips the SelOnSeqPrj rule. Therefore, pushdown does not occur.” But it doesn’t actually do that.
Here are the good and bad plans, comparing using a literal value vs. a parameterized value:
dunksville
In the plan with a literal value, the predicate is applied at the index seek, and the filtering is really quick.
In the plan with a parameterized value, the index is scanned, and the predicate is applied at a Filter operator much later in the query plan.
This is where the SelOnSeqPrj rule comes in: The parameter can’t be pushed past the Sequence Project operator like the literal value can.
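To see the shape of the repro for yourself, here’s a sketch (the view definition is an assumption on my part, not taken from the CU notes): filter a ranking function’s partitioning column from outside a view, once with a literal and once with a parameter:

```sql
/*A hedged sketch: the view shape is assumed, not copied from the release notes*/
CREATE OR ALTER VIEW
    dbo.RankedBadges
AS
SELECT
    b.UserId,
    b.Name,
    n =
        ROW_NUMBER() OVER
        (
            PARTITION BY
                b.UserId
            ORDER BY
                b.Date DESC
        )
FROM dbo.Badges AS b;
GO

/*Literal: the predicate can be pushed below the Sequence Project*/
SELECT
    rb.*
FROM dbo.RankedBadges AS rb
WHERE rb.UserId = 22656;

/*Parameterized: without SelOnSeqPrj firing, the filter happens late in the plan*/
DECLARE
    @UserId int = 22656;

SELECT
    rb.*
FROM dbo.RankedBadges AS rb
WHERE rb.UserId = @UserId;
```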
Thanks for reading!
First, I want to advise you to avoid doing things like this in T-SQL as much as possible. Use the CLR, if you can. Or do it outside of your really expensive database server.
You know, rational choices.
Recently, I had an interesting client problem where they had a variety of string formats, and different potential start and end points to get a substring from.
Normally, I love computed columns for this, but there was no way to get exactly what they wanted without using a whole bunch of them. It was sort of annoying.
Because it was easier to get across in T-SQL, we ended up using a trigger.
Don’t look at me like that.
CharindexOf
First, let’s load up a small temp table with some different string values.
CREATE TABLE
#strings
(
string varchar(4)
);
INSERT
#strings
(
string
)
SELECT '1234'
UNION ALL
SELECT '2341'
UNION ALL
SELECT '3412'
UNION ALL
SELECT '4123';
Now, let’s say that there was a need to figure out which number came first in the string. You can use the CHARINDEX function to do that, but it’s a little difficult to make it “dynamic”.
To make things easy, let’s create a helper table of search patterns.
CREATE TABLE
#searches
(
search varchar(1)
);
INSERT
#searches
(
search
)
SELECT '1'
UNION ALL
SELECT '2'
UNION ALL
SELECT '3'
UNION ALL
SELECT '4';
ApplyOf
Now we can use a little bit of apply and derived table magic to figure out which of our search values appear first in our strings.
SELECT
s.string,
x1.search,
x1.search_position
FROM #strings AS s
CROSS APPLY
(
SELECT TOP (1)
x0.search,
x0.search_position
FROM
(
SELECT
s2.search,
search_position =
CHARINDEX(s2.search, s.string)
FROM #searches AS s2
) AS x0
ORDER BY x0.search_position
) AS x1;
The results look like this:
medium
SubstringOf
Now we can add in the LEN and SUBSTRING functions in order to parse out the part of the column we’re interested in.
SELECT
s.string,
x1.search,
x1.search_position,
x1.search_length,
sub =
SUBSTRING
(
s.string,
x1.search_position,
x1.search_length
)
FROM #strings AS s
CROSS APPLY
(
SELECT TOP (1)
x0.search,
x0.search_position,
x0.search_length
FROM
(
SELECT
s2.search,
search_position =
CHARINDEX(s2.search, s.string),
search_length =
LEN(s2.search)
FROM #searches AS s2
) AS x0
ORDER BY x0.search_position
) AS x1;
Now the results look like this:
Smart E. Pants
This is an admittedly pretty rudimentary example, but it’s tough to make this complicated without making it overly so.
You can normally eyeball a query to find things that generally don’t agree with performance out of the box, like:
Functions (inline ones aside)
Table variables
Stacked Common Table Expressions
Non-SARGable predicates
Overly complicated queries
Insert a million other things here
But of course, the more complicated queries are, or the more layers of abstraction exist in a query, the harder this stuff is to spot quickly. Particularly with views, and nested views, bad ideas can be buried many layers deep in the sediment.
I call it sediment because code often looks like geologic layers, where you can tell who wrote what and when based on style and techniques that got used.
And to vendors who encrypt their god-awful code: you know who you are.
The Great Untangling
Getting through that untangling can be a costly and time consuming process, depending on the level of damage done over the years, and the desired outcome. Sometimes it’s easier to rewrite everything from scratch than to do in-place rewrites of existing objects.
It’s obviously worth exploring enhancements in newer versions of SQL Server that may power things across the finish line:
Perhaps the new cardinality estimator does more good than harm
Batch Mode On Row Store does a lot of good with bad code
Scalar UDF Inlining can solve a lot of function problems
There are many other general and targeted improvements that might help your workload without code changes. Hopefully that continues with SQL Server 2022.
On top of the workload improvements, new versions also provide improved insights into problems via dynamic management views, Query Store, logging, and more.
If you’re not on at least SQL Server 2016 right now, you’re leaving a whole lot on the table as far as this goes.
Hiring Issues
It’s tough for smaller companies to attract full time talent to fix huge backlogs of issues across SQL Server stored procedures, functions, views, index and table design, and all that.
Or even harder, convert ORM queries into sensible stored procedures, etc. when you start hitting performance limitations in the single-query model.
First, I need to acknowledge that not everyone wants to work for a huge company. Second, I need to acknowledge that salary isn’t everything to everyone.
But let’s assume that a smaller company wants to hire someone in competition with a larger company. What can they offer when they run out of salary runway and can’t match equity?
Clear career paths/Upward mobility
Flexible schedules
Paid time off for training
Covering the costs of training and certifications
Focusing on employee growth (not just sticking them in a corner to monkey with the same thing for years)
Quality of company culture (meeting overload was something I got a lot of DMs about)
Conference travel budgets
Meaningful company mission
Introducing tech savvy folks to the business side of things
Recognizing that not every employee wants to be an On-callogist
There were more, but these were the things I got the most hits from folks on. Having these doesn’t mean you can expect someone to take 20-30% less on the salary front, of course, but if you’re close to another offer these things might sway folks to your side.
Far and away, what I took from responses is that folks want to feel effective; like they can make a difference without a lot of bureaucracy and red tape. Get the hell out of my way, to coin a phrase.
Finder’s Fee
When it comes to attracting people to your company — think of it as your employer SEO — the SQL Server community is a great place to start.
If you want to try something for free, keep an eye out for when Brent posts to find out Who’s Hiring In The Database Community. It doesn’t cost you anything, but you have to keep on top of the posts and replies, and make sure you have a good job description that sticks out.
If you have any location-based requirements for your candidates, try sponsoring a local SQL Server user group’s meetings for a few months. There may be a small, nominal fee if it’s entirely virtual. If it’s in-person, you’ll foot the bill for a dozen or so pizza pies for attendees. That usually gets you an announcement before and after whatever speaker is presenting. It’s totally fair to ask for attendance numbers.
Keeping on with that, consider sponsoring a SQL Saturday event. These typically have a deeper reach than a monthly user group, since there are more attendees in a concentrated area. You may get a booth, or your logo on slides, and whatever else you can negotiate with the event planners.
If you’re okay with spending more money for a lot of eyeballs, larger events like PASS Summit and SQLBits are annual conferences with thousands of attendees. As an FYI, these are the types of conferences whoever you hire is probably going to want to attend, too.
Imagine that.
Askance
I have clients ask me to help them find quality employees for roles from time to time, or to help them interview folks they’ve farmed themselves.
Normally I’m happy to help on either front, and leave sealing the deal to them. I think from now on I’m gonna point them to this post, so they have some better ideas about how to put a stamp on things.
Not every company can offer everything, but as large companies continue to gobble up smaller ones, and Microsoft in particular keeps fishing folks out of the MVP pool, it’s going to be harder for those who remain to stay competitive. At least I think so: I haven’t quite been persuaded that there will be a kumbaya moment where everyone gets sick of the MegaCorp grind and goes back to mom-and-pop shops to reclaim their lost souls.
After all, a lot of folks do have their sights set on retirement. High salaries and generous equity (well, maybe not equity as the market is currently behaving) certainly help get them there faster.
That’s part of the picture that you can’t easily ignore, along with the oft-proffered wisdom that the only way to stay on a competitive salary track is to change jobs every 2-3 years.
Retention is going to get more difficult for everyone across the board, but the revolving door will largely let out with the bigger players who can afford to keep it spinning.
Thanks for reading!
I’m a really big fan of using operator properties for a lot of things, at least visually. Where things sort of fall down for that is copying and pasting things out.
For some stuff, you still need to head down to the XML.
Let’s say you have a stored procedure that accepts a bunch of parameters. The rest of this one isn’t important, but here you go:
CREATE OR ALTER PROCEDURE
dbo.AwesomeSearchProcedure
(
@OwnerUserId int = NULL,
@CreationDate datetime = NULL,
@LastActivityDate datetime = NULL,
@PostTypeId int = NULL,
@Score int = NULL,
@Title nvarchar(250) = NULL,
@Body nvarchar(MAX) = NULL
)
A Plan Appears
Let’s say we grab a query plan for this thing from the plan cache or query store. We can get the properties of the select operator and see compile time values:
Again — nice visually — but it doesn’t do much for us if we want to recreate executing the stored procedure to get an actual execution plan.
It’s also not terribly helpful if we want to simulate a parameter sniffing situation, because we only have the compile time values, not the runtime values.
Bummer. But whatever.
XML Time!
If we right click and select “show execution plan XML”, we can scroll way down to the bottom to find the XML fragment that holds what the properties display:
This still isn’t awesome, because we have to do some surgery on the XML itself to get values out.
It’s even worse if we have a parameterized application query, because not only do we need to write a DECLARE to assign values to these variables, but we also need to turn the query itself into dynamic SQL.
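That surgery ends up looking something like this (the query, parameter names, and values here are all illustrative, copied by hand out of a ParameterList fragment):

```sql
/*A hedged sketch: values transcribed by hand from the ParameterList XML*/
DECLARE
    @OwnerUserId int = 22656,
    @PostTypeId int = 1;

/*The parameterized application query, rewrapped as dynamic SQL*/
EXEC sys.sp_executesql
    N'
SELECT
    p.*
FROM dbo.Posts AS p
WHERE p.OwnerUserId = @OwnerUserId
AND   p.PostTypeId = @PostTypeId;',
    N'@OwnerUserId int, @PostTypeId int',
    @OwnerUserId,
    @PostTypeId;
```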
For most things, I absolutely adore using operator properties. For some things, you still need the XML.
It’d be nice if there were some fancy copy and paste magic that would do that for you, but so far it doesn’t exist.
Thanks for reading!
There’s a lot of stuff flying around in a query plan. Data and what not.
Sure, you can hover over operators and arrows and see some stuff, but if you really wanna see stuff — I mean REALLY wanna see stuff — you gotta get into the properties.
You can access those in two ways:
Hit F4 (not F5 again; you already suffered through that)
Right click on any operator in the query plan and hit Properties
And that, my dear friend, will unlock many mysteries in your query plans.
Start With Select
I think one of the most interesting places to start is with the root operator (select, insert, update, delete), because there’s so much in there.
Here’s an abridged list of things you can see from the properties of the root node of an Actual Execution Plan:
Compile metrics: CPU, duration, memory
Degree Of Parallelism
Detailed Memory Grant information
Stats Usage
Query Time Stats in CPU and duration (including UDF times)
Parameter compile and runtime values
Nonparallel Plan reasons
Set Options
Warnings
CPU thread usage
Wait Stats
There’s more in there too, but holy cow! All the stuff you can learn here is fantastic. You might not be able to solve all your problems looking here, but it’s as good a place to start as any.
Plus, this is where you can get a sense of just how long your query ran for, and start tracking down the most troublesome operators.
Follow The Time
I’ve said before that operator costs are basically useless, and you should be following the operator times to figure out where things get wonky.
For some operators, just looking at the tool tip is enough. For example, if you have an operator that piles up a bunch of execution time because of a spill, the spill details are right in front of you.
contagious
But other times, operator properties expose things that aren’t surfaced at the tool tip.
Skew Manchu
Take skewed parallelism, for example. There are no visual indicators that it happened (maybe there should be, but given the warnings we get now, I’m not sure I trust that summer intern).
year of the spider
But you know, it might be nice to know about stuff like this. Each thread is supposed to get an equal portion of the query memory grant, and if work is distributed unevenly, you can end up with weird, random performance issues.
This is something I almost always spot check in parallel plans. In a perfect world, duration would be CPU➗DOP. Life rarely ends up perfect, which is why it’s worth a look.
I To The O
You can also see I/O stats at the operator level, logical and physical. This is why I kinda laugh at folks who still use SET STATISTICS TIME, IO ON; you can get all of that in one place: your query plan.
ding!
You can interpret things in the same way; it’s just a little easier to chew.
Thanks for reading!
The good news is that SQL Server’s query plans will attempt to warn you about problems. The bad news is that most of the warnings only show up in Actual Execution Plans. The worse news is that a lot of the warnings that try to be helpful in Estimated Execution plans can be pretty misleading.
Some of these are considered runtime issues, and are only available in Actual Execution Plans, like:
Spills to tempdb
Memory Grants
I’ve never seen the “Spatial Guess” warning in the wild, which probably speaks to the fact that you can measure Spatial data/index adoption in numbers that are very close to zero. I’ve also never seen the Full Update For Online Index Build warning.
Then there are some others like Columns With No Statistics, Plan Affecting Converts, No Join Predicate, and Unmatched Indexes.
Let’s talk about those a little.
Columns With No Statistics
I almost never look at these, unless they’re from queries hitting indexed views.
The only time SQL Server will generate statistics on columns in an indexed view is when you use the NOEXPAND hint in your query. That might be very helpful to know about, especially if you don’t have useful secondary indexes on your indexed view.
If you see this in plans that aren’t hitting an indexed view, it’s likely that SQL Server is complaining that multi-column statistics are missing. If your query has a small number of predicates, it might be possible to figure out which combination and order will satisfy the optimizer, but it’s often not worth the time involved.
Like I said, I rarely look at these. Though one time it did clue me in to the fact that a database had auto create stats disabled.
So I guess it’s nice once every 15 years or so.
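If you ever do decide to chase one of these warnings down, creating the multi-column statistics object yourself is straightforward. A sketch (the statistics name and column list here are illustrative, and you’d tailor them to the warning):

```sql
/*A hedged sketch: name and column choice are assumptions you'd tailor to the warning*/
CREATE STATISTICS
    s_Users_Reputation_Views
ON dbo.Users
    (Reputation, Views);
```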
Plan Affecting Converts
There are two of these:
Ones that might affect cardinality estimates
Ones that might affect your ability to seek into an index
Cardinality Affecting
The problem I have with the cardinality estimation warning is that it shows up when it’s totally useless.
SELECT TOP (1)
Id = CONVERT(varchar(1), u.Id)
FROM dbo.Users AS u;
fine2me
Like I said, misleading.
Seek Affecting
These can be misleading, but I often pay a bit more attention to them. They can be a good indicator of data type issues in comparison operations.
Where they’re misleading is when they tell you they mighta-coulda done a seek, when you don’t have an index that would support a seek.
SELECT TOP (1)
u.Id
FROM dbo.Users AS u
WHERE u.Reputation = CONVERT(sql_variant, N'138');
knot4you
Of course, without an index on Reputation, what am I going to seek to?
Nothing. Nothing at all.
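For completeness, a sketch of what clears the warning: an index on Reputation, plus a predicate typed to match the column (the index name is made up):

```sql
/*A hedged sketch: index name is made up*/
CREATE INDEX
    rep
ON dbo.Users
    (Reputation);

SELECT TOP (1)
    u.Id
FROM dbo.Users AS u
WHERE u.Reputation = 138; /*an int literal, so no convert gets in the way*/
```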
No Join Predicate
This one is almost a joke, I think.
Back when people wrote “old style joins”, they could have missed a predicate, or something. Like so:
SELECT
c = COUNT_BIG(*)
FROM dbo.Users AS u,
dbo.Badges AS b,
dbo.Comments AS c
WHERE u.Id = b.UserId;
/*Oops no join on comments!*/
Except there’s no warning in this query plan for a missing join predicate.
well okay
But if we change the query to this, it’ll show up:
SELECT
u.Id
FROM dbo.Users AS u,
dbo.Badges AS b,
dbo.Comments AS c
WHERE u.Id = b.UserId;
/*Oops no join on comments!*/
greatly
But let’s take a query that has a join predicate:
SELECT TOP (1)
b.*
FROM dbo.Comments AS c
JOIN dbo.Badges AS b
ON c.UserId = b.UserId
WHERE b.UserId = 22656;
We still get that warning:
tough chickens
We still get a missing join predicate warning, even though we have a join predicate. The predicate is implied here, because of the where clause.
But apparently the check for this happens only at the Nested Loops Join. No attempt is made to track pushed-down predicates any further.
run for your life
If there were, the warning would not appear.
Unmatched Indexes
If you create filtered indexes, you should know a couple things:
It’s always a good idea to have the column(s) you’re filter(ing) on somewhere in the index definition (key or include, whatever)
If query predicate(s) are parameterized on the column(s) you’re filter(ing) on, the optimizer probably won’t choose your filtered index
I say probably because recompile hints and unsafe dynamic SQL may prompt it to use your filtered index. But the bottom line here is parameters and filtered indexes are not friends in some circumstances.
Here’s a filtered index:
CREATE INDEX
cigarettes
ON dbo.Users
(Reputation)
WHERE
(Reputation >= 1000000)
WITH
(SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);
And here’s a query that should use it:
SELECT
u.Reputation
FROM dbo.Users AS u
WHERE u.Reputation >= 1000000;
BUUUUUUUUUUUUUUUT!
combine
SQL Server warns us we didn’t. This is an artifact of Simple Parameterization, which happens early on in the Trivial Plan optimization phase.
It’s very misleading, that.
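If you’re stuck with a parameterized value, a recompile hint is one way to let the optimizer see the runtime value and match the filtered index. A sketch:

```sql
/*A hedged sketch: OPTION (RECOMPILE) lets the optimizer see the runtime value,
  so the filtered index can be safely matched*/
DECLARE
    @Reputation int = 1000000;

SELECT
    u.Reputation
FROM dbo.Users AS u
WHERE u.Reputation >= @Reputation
OPTION (RECOMPILE);
```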
Warnings And Other Drugs
In this post we covered common scenarios where plan warnings just don’t add up to much. Does that mean you should always ignore them? No, but don’t be surprised if your investigation turns up zilch.
If you’re interested in learning more about spills, check out the Spills category of my blog. I’ve got a ton of posts about them.
At this point, you’re probably wondering why people bother with execution plans. I’m sort of with you; everything short of an actual execution plan feels futile and useless, and seems to lie to you.
Hopefully Microsoft invests more in making the types of feedback mechanisms behind gathering plans and runtime metrics easier for casual users in future versions of SQL Server.
Thanks for reading!
The Users table looks like this. It’s important for me to show you this up front, because column ordinal position in the table is important for understanding missing index requests.
what-whatcha need?
Keep this in mind: the columns aren’t in alphabetical order, ordered by how selective they are, or by data type.
They’re in the order they were in when the table was created, followed by any that were added later on.
That’s all.
Long Time
Let’s take this query:
SELECT TOP (10000)
u.Id,
u.AccountId,
u.DisplayName,
u.Reputation,
u.Views,
u.CreationDate
FROM dbo.Users AS u
WHERE u.Views = 0
ORDER BY u.CreationDate DESC;
Is it very useful? No. But it’ll help us paint the right picture. The query plan doesn’t matter, because it’s just a clustered index scan, and it’ll be a clustered index scan for every other permutation, too.
The missing index for this query is like so:
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Users] ([Views])
INCLUDE ([CreationDate],[DisplayName],[Reputation],[AccountId])
Only the Views column is in the key of the index, even though CreationDate is an order by column.
In this case, it would be beneficial to have it as the second key column, because it would be sorted for free for us after an equality predicate.
You may also notice that Id is not part of the definition, either. That’s because it’s the clustered index key, so it will be inherited by any nonclustered indexes we create.
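If you were writing this index by hand, putting CreationDate as the second key column (descending, to match the order by) would look something like this (the index name is made up):

```sql
/*A hedged sketch: a hand-tuned version of the request, index name made up*/
CREATE INDEX
    views_creationdate
ON dbo.Users
    (Views, CreationDate DESC)
INCLUDE
    (DisplayName, Reputation, AccountId);
```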
Normal
Okay, now let’s look at this query, with a slightly different where clause:
SELECT TOP (10000)
u.Id,
u.AccountId,
u.DisplayName,
u.Reputation,
u.Views,
u.CreationDate
FROM dbo.Users AS u
WHERE u.Views = 0
AND u.Reputation = 1
ORDER BY u.CreationDate DESC;
We’re adding another predicate on Reputation = 1 here. The missing index request now looks like this:
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Users] ([Reputation],[Views])
INCLUDE ([CreationDate],[DisplayName],[AccountId])
Neither one of these predicates is particularly selective (7,954,119 and 6,197,417 rows, respectively), but Reputation ends up first in the key column list because its ordinal position in the table is first.
Frequency
How about if we add a really selective predicate to our query?
SELECT TOP (10000)
u.Id,
u.AccountId,
u.DisplayName,
u.Reputation,
u.Views,
u.CreationDate
FROM dbo.Users AS u
WHERE u.AccountId = 12462842
AND u.Views = 0
ORDER BY u.CreationDate DESC;
Now our missing index request looks like this:
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Users] ([Views],[AccountId])
Two things happened:
We don’t have any included columns anymore
AccountId is second in the key columns
This is amusing because the missing index request machine seems to understand that this will only ever match one row via the equality predicate on AccountId, but it still gets enumerated as the second index key column.
In other words, it doesn’t put the most selective column first. It gives you an index designed, like in the other examples, based on the column’s ordinal position in the table.
Nothing else, at least not so far.
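A hand-tuned version would lead with the selective equality column instead (the index name is made up):

```sql
/*A hedged sketch: AccountId first, since only one row qualifies via that predicate*/
CREATE INDEX
    accountid_views
ON dbo.Users
    (AccountId, Views);
```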
Inequality
Where missing index requests will change column order is when it comes to inequality predicates. That doesn’t just mean not equal to, oh no no no.
That covers any “range” predicate: <, <=, >, >=, <> or !=, and IS NOT NULL.
Take this query for example:
SELECT TOP (10000)
u.Id,
u.AccountId,
u.DisplayName,
u.Reputation,
u.Views,
u.CreationDate
FROM dbo.Users AS u
WHERE u.AccountId = 0
AND u.Reputation < 0
AND u.Views < 0
ORDER BY u.CreationDate DESC;
The missing index request looks like this:
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Users] ([AccountId],[Reputation],[Views])
Now AccountId is the leading key column, but Reputation and Views are still in ordinal position order as inequality predicates.
Wink Wink
Now, look, I started off by saying that missing index requests aren’t perfect, and that’s okay. They’re not meant to replace a well-trained person. They’re meant to help the hapless when it comes to fixing slow queries.
As you get more comfortable with indexes and how to create them to make queries go faster, you’ll start to see deficiencies in missing index requests.
But you don’t want the optimizer spending a long time in the index matching/missing index request phases. That’s a bad use of its time.
As you progress, you’ll start treating missing index requests like a crying baby: something might need attention, but it’s up to you as an adult DBA or developer to figure out what that is.
Thanks for reading!