Isolation Levels – Darling Data

This OVERPOWERED Hint Will Break NOLOCK Hints In SQL Server

Posted on July 26, 2022 (last modified July 8, 2022) by Erik Darling

G4M3R

CREATE TABLE
    dbo.view_me
(
    id int NOT NULL
);
GO 

CREATE VIEW 
    dbo.viewed
AS 
SELECT
    vm.*
FROM dbo.view_me AS vm WITH(READCOMMITTED);
GO 

SELECT
    v.*
FROM dbo.viewed AS v WITH(NOLOCK);
GO

Msg 4138, Level 16, State 1, Line 22

Conflicting locking hints are specified for table “dbo.view_me”. This may be caused by a conflicting hint specified for a view.
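A minimal sketch of one way around the conflict, assuming the same dbo.viewed view from above. As far as I know, error 4138 is only raised for conflicting table hints, so setting the isolation level for the session does not trip it. The READCOMMITTED hint inside the view still wins for dbo.view_me, though, so this runs without an error but does not get you dirty reads from that table.

```sql
/* Session-level setting instead of a NOLOCK hint on the view. */
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

/* No hint on the view, so there is nothing to conflict with
   the READCOMMITTED hint inside it. */
SELECT
    v.*
FROM dbo.viewed AS v;

/* Put the session back to the default when done. */
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```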

Thanks for reading!

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Common SQL Server Consulting Advice: Enabling An Optimistic Isolation Level

Posted on April 25, 2022 (last modified April 21, 2022) by Erik Darling

Shame Shame Shame

One of the very first performance problems that developers will run into when using SQL Server is locking, and often deadlocks.

Though deadlocks have their root in (b)locking behavior, not all blocking leads to deadlocks.

Some (b)locking will just go on forever, ha ha ha.

The reason there are so many SQL Server blog posts about NOLOCK hints, and so much confusion about what it does, is because of defaults.

The worst part is that it’s a default that shouldn’t be, and… somehow people have this sunny view of what NOLOCK does, but they all have a very negative view of better solutions to the problem.

What NOLOCK Really Does

I have this conversation at least twice a week, despite having written about it plenty of times.

And other people writing about it many times.

For years.

It does not mean your query doesn’t take locks, it means your query doesn’t respect locks taken by other queries. It’s not that read committed is so great, it’s that read uncommitted is so bad.

Think of modifications sort of like an Infinite Improbability Drive.

As soon as the drive reaches infinite Improbability, it passes through every conceivable point in every conceivable universe simultaneously. An incredible range of highly improbable things can happen due to these effects.

Perhaps not quite that eccentric, but you get the idea. While a modification is in flight, a query with a NOLOCK (or READ UNCOMMITTED) hint may read those changes while they’re happening.

Incomplete inserts
Incomplete deletes
Incomplete updates

Inserts and deletes are a bit more straightforward. Say you’re inserting or deleting 10 rows, and either one is halfway done when your select query that is running with ~~flaming knives and scissors~~ a NOLOCK hint comes along.

You would read:

The first five inserted rows
The remaining five rows to be deleted

For updates, things are a little trickier because you might end up with an in-place update or per-index update.

You can read:

Partially changed rows
From an index that hasn’t been modified yet
Something in between

This is not what you want to happen.
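Here is a minimal two-session sketch of the simplest case, an update that never commits. The table dbo.dirty_demo and its data are made up for illustration; run the marked pieces in separate sessions.

```sql
/* Hypothetical demo table; the name and data are assumptions. */
CREATE TABLE dbo.dirty_demo
(
    id int NOT NULL PRIMARY KEY,
    amount int NOT NULL
);

INSERT dbo.dirty_demo (id, amount)
VALUES (1, 100);

/* Session 1: start a modification and leave the transaction open. */
BEGIN TRANSACTION;

UPDATE dd
    SET dd.amount = 0
FROM dbo.dirty_demo AS dd
WHERE dd.id = 1;

/* Session 2: reads the uncommitted change instead of waiting. */
SELECT
    dd.id,
    dd.amount
FROM dbo.dirty_demo AS dd WITH (NOLOCK);

/* Session 1: undo the change. Session 2 already
   saw a value that was never committed. */
ROLLBACK TRANSACTION;
```

Without the hint, under the default read committed isolation level, the select in session 2 would wait for session 1 to finish.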

Even if you have a million excuses as to why it works okay for you (it’s just a mobile app, they can refresh; we only need close-enough reports; users make changes and then read them later), I promise you that it’s not something you want to happen, because you can’t fully predict the ramifications of many concurrent modifications and selects running all together.

What You Really Want To Happen Instead

The utter beauty of these solutions is that they give you reliable results. They may not be perfect for every situation, but for probably like 99% of cases where you’re using NOLOCK hints everywhere anyway, they are.

Rather than futz about with the Infinite Improbabilities that could be read from modifications that are neither here nor there but are certainly not completed, you read the last known good version of a row or set of rows that are currently locked.

There you have it! No more uncertainty, puzzled users, additional database requests to refresh wonky-looking data, or anything like that.

It’s just you and your crisp, clean data.

If you’re on SQL Server 2019 and using Accelerated Database Recovery, the known-good versions of your precious data will be stored locally, per-database.

In all other scenarios, the row versioning goes off to tempdb.
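If you want to check or change where versions go, a sketch along these lines should work on SQL Server 2019, reusing the YourDatabase placeholder name. Treat it as a starting point rather than a deployment script.

```sql
/* Turn on Accelerated Database Recovery (SQL Server 2019 and up),
   so row versions are kept in the database rather than tempdb. */
ALTER DATABASE YourDatabase
    SET ACCELERATED_DATABASE_RECOVERY = ON;

/* Confirm the setting. */
SELECT
    d.name,
    d.is_accelerated_database_recovery_on
FROM sys.databases AS d
WHERE d.name = N'YourDatabase';
```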

Your select queries can read that data without being impeded by locks, and without all of the incorrectness.

What options do you have to take advantage of these miraculous functionalities?

Read Committed Snapshot Isolation (RCSI)
Snapshot Isolation (SI)

Those sound pretty close, but let’s talk a little bit more about them.

Isolation Levels, Not In Depth

It’s difficult to cover every potential reservation or concern someone may have about isolation levels. If you have an application that depends on certain locking guarantees to correctly process certain events, you may need read committed, or something beyond read committed (like repeatable read or serializable) to hold the correct locks.

If that’s your app design, then your job becomes query and index tuning to make sure that your queries run as quickly as possible to reduce the locking surface area of each one. This post is not for you.

This post is largely geared towards folks who have NOLOCK everywhere like some sort of incantation against performance problems, who hopefully aren’t totally stuck in their ways.

Here are some of the potential downsides of optimistic isolation levels:

Prior to SQL Server 2019, you’re going to add some load to tempdb
You’re going to add an up to 14-byte pointer to each row for tracking in the version store (but that happens on any table where there’s a trigger anyway)
You need to remove local locking hints on queries you want to use row versioning
You need to set the isolation level to allow queries to use Snapshot Isolation
Read query performance may slow down if the version store gets particularly large
You need to more closely monitor long running modifications to make sure they don’t fill tempdb

But this stuff is all worth it, because you can get around lots of weird, transient application issues:

Incorrect results
Incorrectly processed events
Readers blocking writers
Readers deadlocking with writers

If you’re currently using NOLOCK everywhere, or if someone starts suggesting you use it everywhere for better performance, know that you have better options out there.

Thanks for reading!

Is Read Uncommitted (NOLOCK) An Optimistic Or Pessimistic Isolation Level?

Posted on December 9, 2021 (last modified May 16, 2022) by Erik Darling

Poll Worker

I ran a Twitter poll recently, because as I was writing another blog post, I wasn’t quite sure how to group uncommitted/nolock between pessimistic or optimistic isolation levels.

read uncommitted is what kind of isolation level:

— Erik Darling Data (@erikdarlingdata) December 1, 2021

On the one hand, locking and blocking do still come into play with them. Anyone who has seen locking on schema stability can attest to that. On the other hand, they will read just about anything you want. That’s sort of optimistic in spirit, but maybe not in letter.

Most of you kind folks out there called it optimistic, though some others had more colorful language to describe it:

indeterminate

— Kendra Little (@Kendra_Little) December 1, 2021

opportunistic

— L_ N___ (@sqL_handLe) December 1, 2021

Filthy.

— CujoDeSoque (@cujodesoque) December 2, 2021

Problematic

— Martin Madsen (@mraaskov) December 1, 2021

Probabilistic. Maybe

— Dèjì Akọ́mọláfẹ́ – Disambiguate This (@Dejify) December 1, 2021

Sadistic

— 🅹🅰🅼🅴🆂 🅲🅾🅾🅺 (@Cookie_JAC) December 1, 2021

Option 3: terrible

— Randolph “taking time off” West (they/them) (@_randolph_west) December 1, 2021

At the end of the day, it’s probably not the right isolation to use, no matter what you choose to call it.

Thanks for reading!

How To Tell You Need An Optimistic Isolation Level In SQL Server

Posted on December 7, 2021 (last modified May 16, 2022) by Erik Darling

Into The Yonder

When you create a database in SQL Server (everything except Azure SQL DB), you get this garbage isolation level called Read Committed.

That isolation level, along with others like Repeatable Read and Serializable, is considered pessimistic. Though Repeatable Read and Serializable are less garbage, they come with a strictness of locking that most applications don’t need across the board. They may need it for certain niche activities, but you know…

Then there are isolation levels that are quite useful for most applications, and they’re called optimistic isolation levels. In SQL Server, they’re Snapshot Isolation (SI), and Read Committed Snapshot Isolation (RCSI).

I think they are very much not-garbage, and so do other major database platforms that use specific implementations of MVCC (Multi Version Concurrency Control) by default. There may be some historical reason for SQL Server not doing it by default, which is also garbage.

Differences

There are some differences between the two optimistic isolation levels, which makes them useful in different situations. Most people don’t need both turned on, which is something I see quite a bit, but there just might be someone out there who turns on and actually uses both.

To generalize a little bit:

SI is good when you only want certain queries to read versioned data
RCSI is good when you want every query to read versioned data

What’s versioned data? You can think of it like the “last known good” version of a row before a modification started.

When an update or a delete starts to change data, SQL Server will send those last known good versions up to tempdb for read queries to grab what they need rather than getting blocked. Inserts are a little different, because they are the only known good version of a row.

There are some other differences, too.

SI:

Can be turned on without exclusive access to the database
Queries all read data as it looked at the beginning of a transaction

RCSI:

Does need exclusive access to the database, but it’s not as bad as it sounds
Reads data as it looked when each query in a transaction starts

Getting exclusive access to the database can be done without the single-user/multi-user dance:

ALTER DATABASE YourDatabase
    SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;
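For comparison, here is a sketch of the SI side. It is enabled with a different database option, and each session has to ask for it. dbo.SomeTable is a hypothetical table name used only for illustration.

```sql
/* Allow Snapshot Isolation; this does not need exclusive access. */
ALTER DATABASE YourDatabase
    SET ALLOW_SNAPSHOT_ISOLATION ON;

/* Only sessions that opt in read versioned data under SI. */
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRANSACTION;

    /* Every query in here sees data as of the start of the transaction. */
    SELECT
        records = COUNT_BIG(*)
    FROM dbo.SomeTable AS st; /* hypothetical table */

COMMIT TRANSACTION;

/* Check which of the two options are on for the database. */
SELECT
    d.name,
    d.snapshot_isolation_state_desc,
    d.is_read_committed_snapshot_on
FROM sys.databases AS d
WHERE d.name = N'YourDatabase';
```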

If You Know You Know

You might want to turn these on when your application performance suffers because of locking or deadlocking.

If you want some quick and dirty queries to figure out if you’ve got those happening, you can run these queries.

/*Lock waits?*/
SELECT
    dows.wait_type,
    dows.waiting_tasks_count,
    dows.wait_time_ms,
    dows.max_wait_time_ms,
    dows.signal_wait_time_ms
FROM sys.dm_os_wait_stats AS dows
WHERE dows.wait_type LIKE 'LCK%'
AND   dows.waiting_tasks_count > 0
ORDER BY dows.wait_time_ms DESC;

/*Deadlocks?*/
SELECT 
    p.object_name,
    p.counter_name,
    p.cntr_value
FROM sys.dm_os_performance_counters p
WHERE TRIM(p.counter_name) = 'Number of Deadlocks/sec'
AND   TRIM(p.instance_name) = '_Total';

If you need deeper analysis of waits or deadlocks, I’d suggest you use sp_BlitzFirst or sp_BlitzLock.

What you want to look for in general are when readers and writers are interfering with each other. If your blocking or deadlocking problems are between any kind of exclusive locks, optimistic isolation levels won’t help you.

Wait stats from readers will generally have an “S” in them, like LCK_M_S. The same goes for deadlocks, where the lock mode will have an S in either the owner or the waiter.
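To catch readers getting blocked while it is happening, rather than in cumulative wait stats, a quick sketch against sys.dm_exec_requests can help. The two wait types in the filter are my pick of the shared-lock waits readers usually show; widen the filter if you want to see everything.

```sql
/* Requests currently blocked while waiting on shared (reader) locks. */
SELECT
    der.session_id,
    der.blocking_session_id,
    der.wait_type,
    der.wait_time,
    der.wait_resource,
    der.command
FROM sys.dm_exec_requests AS der
WHERE der.blocking_session_id <> 0
AND   der.wait_type IN (N'LCK_M_S', N'LCK_M_IS')
ORDER BY der.wait_time DESC;
```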

Reader Writer Fighter

It’s important to keep in mind that it’s not just writers that block readers, or writers that can deadlock.

This is where the “Shared” lock and lock mode stuff comes into play. Again, if all your locks and deadlocks are between modification queries — locks and lock modes with X (exclusive) or U (update) — they’ll still block each other.

There are a lot more details in the linked posts above, but that’s the general pattern. Another pattern to look for is if your developers keep adding more and more NOLOCK hints to “fix performance issues”.

A lot of times they’re just covering up other issues with indexing or the way queries are written, or they’re totally misunderstood. I’ve said it before, but it doesn’t mean your query doesn’t take any locks, it means that your query doesn’t respect locks taken by other queries.

That often comes as a surprise to people when I tell them, so I say it whenever I write about it. But that’s where the bad reputation comes from — it can read all sorts of in-flight data that may not reflect reality.

Thanks for reading!

Blocking on Columnstore Indexes that RCSI and NOLOCK Don’t Resolve

Posted on December 2, 2021 (last modified May 14, 2022) by Joe Obbish

I recently ran into a production issue where a SELECT query that referenced a NOLOCK-hinted table was hitting a 30 second query timeout. Query store wait stats suggested that the issue was blocking on a table with a nonclustered columnstore index (NCCI). This was quite unexpected to me and I was eventually able to produce a reproduction of the issue. I believe this to be a bug in SQL Server that’s present in both RTM and the current CU as of this blog post (CU14). The issue impacts CCIs as well, but I did significantly less testing with that index type.

The Setup

First I’ll create a heap with 500k rows. All of the rows have the same value. I’ll also create an NCCI on the single column of the table.

DROP TABLE IF EXISTS dbo.TEST_NCCI_1;
CREATE TABLE dbo.TEST_NCCI_1 (
ID VARCHAR(10) NOT NULL
);

INSERT INTO dbo.TEST_NCCI_1
SELECT TOP (500000) '1'
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI ON dbo.TEST_NCCI_1 (ID) WITH (MAXDOP = 1);

The NCCI has a single compressed rowgroup with no rows in the delta store or delete buffer. It’s an honest table and I’m not trying to trick you:

We also need a table to join to. A heap with a single row will serve that need:

DROP TABLE IF EXISTS dbo.JOIN_TO_ME;
CREATE TABLE dbo.JOIN_TO_ME (
ID VARCHAR(10) NOT NULL
);

INSERT INTO dbo.JOIN_TO_ME VALUES ('1');

Suppose a different session has a long held exclusive table lock on TEST_NCCI_1. For example, the code below could be running in a different session with a yet to be committed transaction:

BEGIN TRANSACTION;

SELECT TOP (1) *
FROM dbo.TEST_NCCI_1 WITH (TABLOCKX);

Would you expect the following query to be blocked by the open transaction?

SELECT COUNT_BIG(*)
FROM dbo.TEST_NCCI_1 a WITH (NOLOCK)
LEFT OUTER JOIN dbo.JOIN_TO_ME b WITH (NOLOCK) ON a.ID = b.ID;

Test Results

I ran 45 tests in total by varying the isolation level (SET TRANSACTION ISOLATION LEVEL) and the locking hint at the table level. You may be wondering why I bothered to do so. According to the documentation, a table level locking hint overrides the isolation level for read operations:

Only one of the isolation level options can be set at a time, and it remains set for that connection until it is explicitly changed. All read operations performed within the transaction operate under the rules for the specified isolation level unless a table hint in the FROM clause of a statement specifies different locking or versioning behavior for a table.

This is not what I observed which is part of why I believe that the current behavior in SQL Server should be considered a bug. In the table below, “YES” means that the query was blocked and “NO” means the query was able to execute. I used red font for the behavior which I believe to be in error based on my understanding of expected locking behavior in SQL Server.

To summarize the results:

The query is always blocked under the read committed, repeatable read, and serializable isolation levels. The table locking hint does not matter.
Blocking behavior works as expected under the read uncommitted and snapshot isolation levels.
RCSI does not help here. With RCSI, to avoid getting blocked you need to set the transaction isolation level to READ UNCOMMITTED.

I was able to use the lock_acquired extended event to see the problematic requested lock. In the screenshot below, the results are filtered to the object level. The first set of locks that occurred around the 11 second mark were under the read uncommitted isolation level. The second set of locks at around the 21 second mark were under the read committed isolation level. A table level NOLOCK hint was included for both queries.

To be clear, it is unexpected to see an IS object lock for a SELECT query with a table level NOLOCK hint. As far as I can tell, there are no interesting locks taken after the IS lock is acquired:

I attempted to investigate further by getting callstacks before the IS lock is acquired:

sqlmin.dll!XeSqlPkg::lock_acquired::Publish+0x228
sqlmin.dll!lck_lockInternal+0x1139
sqlmin.dll!LockAndCheckState+0x2c5
sqlmin.dll!GetHoBtLockInternal+0x445
sqlmin.dll!ColumnDataSetSession::WakeUp+0xa86
sqlmin.dll!NormalColumnDataSet::WakeUp+0xe
sqlmin.dll!ColumnDataSetSession::Create+0x183
sqlmin.dll!ColumnsetSS::WakeUpInternal+0x1ec
sqlmin.dll!ColumnsetSS::WakeUp+0x15e
sqlmin.dll!CreateDictionaryRowsetHelper+0x250
sqlmin.dll!CBpLocalDictionaryManager::GetSEStringDictionaryRowset+0x416
sqlmin.dll!CBpLocalDictionaryManager::FGetData+0xd1
sqlmin.dll!CBpDeepDataContextForBatch::GetStringValue+0x1d
sqlTsEs.dll!CTEsHashMultiData<167,1>::MdEsIntrinFn<CMDPureInput,CMDIteratorAllPureInputs>+0x13d
sqlTsEs.dll!CEsMDIntrinsicWrapper::UnaryImpl<&CTEsHashMultiData<167,1>::MdEsIntrinFn<CMDPureInput,CMDIteratorAllPureInputs>,&CTEsHashMultiData<167,1>::MdEsIntrinFn<CMDImpureInput,CMDIterator> >+0xd1
sqlTsEs.dll!CEsExecMultiData::GeneralEval+0x188
sqlTsEs.dll!CMultiDataEsRuntime::Eval+0x247
sqlmin.dll!CBpComputeMulticolumnHashRunTime::Eval+0x5b7
sqlmin.dll!CBpPartialJoin::ProcessProbeSide+0x445
sqlmin.dll!CBpQScanHashJoin::Main+0x148
sqlmin.dll!CBpQScanHashJoin::BpGetNextBatch+0x28
sqlmin.dll!CBpQScan::GetNextBatch+0x6f
sqlmin.dll!CBpChildIteratorScanner::BpGetNextBatch+0x12
sqlmin.dll!CBpQScanHashAggNew::ProcessInput+0x8b

I’m not terribly surprised to see a reference to a dictionary in the callstacks because I’m not able to reproduce the issue with an INT or BIGINT column. However, I have no idea where to go from here.

Workarounds

Adding an IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX query hint was an effective workaround for all of my test cases. I experienced the expected locking behavior for all test cases with that hint, so it is apparent to me that the issue is caused by the query plan reading from the NCCI.

Disabling batch mode using the DISALLOW_BATCH_MODE use hint seems to be an effective workaround as well, but I did not test this as thoroughly.

Changing the isolation level may also be an acceptable way to avoid the unexpected blocking for some scenarios. For example, if you already have a NOLOCK table hint in place then changing the transaction isolation level to read uncommitted may not introduce any new data correctness issues.
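The three workarounds can be sketched against the test query from earlier in this post. These are illustrations of the hint and setting syntax rather than the exact statements used in testing.

```sql
/* Workaround 1: keep the plan from reading the NCCI. */
SELECT COUNT_BIG(*)
FROM dbo.TEST_NCCI_1 a WITH (NOLOCK)
LEFT OUTER JOIN dbo.JOIN_TO_ME b WITH (NOLOCK) ON a.ID = b.ID
OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX);

/* Workaround 2: disable batch mode for the query. */
SELECT COUNT_BIG(*)
FROM dbo.TEST_NCCI_1 a WITH (NOLOCK)
LEFT OUTER JOIN dbo.JOIN_TO_ME b WITH (NOLOCK) ON a.ID = b.ID
OPTION (USE HINT('DISALLOW_BATCH_MODE'));

/* Workaround 3: change the isolation level for the session. */
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT COUNT_BIG(*)
FROM dbo.TEST_NCCI_1 a WITH (NOLOCK)
LEFT OUTER JOIN dbo.JOIN_TO_ME b WITH (NOLOCK) ON a.ID = b.ID;
```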

Final Thoughts

For some queries, a simple join on a VARCHAR column between a table with a columnstore index and another table can lead to unexpected blocking, even with a table level NOLOCK hint or with read committed snapshot isolation enabled for the database. This is frustrating and disappointing behavior from a concurrency perspective. It is an odd scenario where enabling snapshot isolation offers a significant benefit over enabling read committed snapshot isolation. I hope that this locking behavior changes in a future version of SQL Server. Thanks for reading!

SQL Server Features I Still Need To Talk People Into Using

Posted on October 15, 2021 (last modified May 16, 2022) by Erik Darling

Booze Schmooze

As a counterpart to yesterday’s post, I have a list of Great Ideas™ that sometimes it’s hard to get people on board with, for some reason.

Don’t get me wrong, some people can’t jump on this stuff fast enough — I’ve had people take “unscheduled maintenance” during engagements to flip the right switches — but other times there’s a hem and a haw and a whole lot of’em to go along with the plan.

Somehow people who have auto close and Priority Boost turned on and haven’t checked for corruption since 30 Rock went off the air want a full-bore fisking of every change and every assurance that no edge case exists that could ever cross their path.

Okay pal. You go on with your bad 2012 RTM self.

Lock Pages In Memory

“Please don’t pop my balloon animals.”

To say that this setting only lets SQL Server hang onto memory is a grand injustice. It also lets SQL Server use completely different APIs to access memory through Windows, including skipping over virtual memory space. That can be an awesome benefit for servers with gobs of memory.

What are people worried about, here? Usually some article they read about balloon drivers in 2011, or something.

But the same people aren’t afraid to set min server memory to max server memory, and then wonder why they have no plan cache.

I love this setting though, and if you can also get away with turning on trace flag 834, there are some nice additional benefits.
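If you’re not sure whether the setting is actually in effect, recent versions of SQL Server expose the memory model in a DMV. A minimal check, assuming your build has this column:

```sql
/* CONVENTIONAL means Lock Pages In Memory is not in use;
   LOCK_PAGES or LARGE_PAGES means it is. */
SELECT
    osi.sql_memory_model_desc
FROM sys.dm_os_sys_info AS osi;
```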

DBCC CHECKDB

“Well, our index maintenance job runs for 9 hours, so we don’t have time for this. Besides, won’t it cause a lot of blocking?”

Lord have mercy, the foot dragging on this. Part of your job as a DBA is to keep data safe and backed up. Running CHECKDB is part of that.

No DBA got fired over fragmented indexes. More than a few have for data going corrupt.

Granted, this can get a little more complicated for really big databases. Some people break it up into different steps, and other people offload the process.

Some third party backup tools from vendors like Quest and Red Gate allow you to automate processes like that, too. Full backup, restore to new server, run CHECKDB, tell us what happened.

How nice, you get a tested restore, too.
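For reference, the basic commands look something like this, reusing the YourDatabase placeholder. Running the lighter PHYSICAL_ONLY check more often and the full check less often is one common way to split the work, but that schedule is my assumption, not a rule.

```sql
/* Full check: logical and physical consistency. */
DBCC CHECKDB (N'YourDatabase')
    WITH NO_INFOMSGS, ALL_ERRORMSGS;

/* Lighter check for very large databases: physical structures only. */
DBCC CHECKDB (N'YourDatabase')
    WITH NO_INFOMSGS, PHYSICAL_ONLY;
```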

Query Store

“Won’t this catch my server on fire and leak PII to hackers?”

If you’re too cheap to spring for a proper monitoring tool, Query Store makes a pretty okay pseudo replacement. Especially in 2017 and up where you can track wait stats too, you can get some pretty good insights out of it.

Microsoft has also gotten pretty smart about better default settings for this thing, and in 2019 you have more knobs to set smarter standards for which plans get tracked in there.

It’d be really nice if you could choose to ignore queries, too, but you know. Can’t always get what you want, there.

I’d much rather look at Query Store than that unreliable old plan cache, too.
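Turning it on is not much work. A minimal sketch for SQL Server 2017 and up, with wait stats capture included; the option values are reasonable starting points I picked, not settings taken from this post.

```sql
/* Enable Query Store for the database. */
ALTER DATABASE YourDatabase
    SET QUERY_STORE = ON;

/* Collect data, skip trivial queries, and track wait stats. */
ALTER DATABASE YourDatabase
    SET QUERY_STORE
    (
        OPERATION_MODE = READ_WRITE,
        QUERY_CAPTURE_MODE = AUTO,
        WAIT_STATS_CAPTURE_MODE = ON
    );
```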

Read Committed Snapshot Isolation

“Why do I want tempdb to be full of old data?”

Remember yesterday? Me either. Nothing good happened, anyway. Do you remember what that row looked like before the update started? No?

Read Committed Snapshot Isolation does. And it wants you to, too. This setting solves so many dumb problems that people run headlong into because Microsoft committed to a garbage isolation level.

One complaint I hear all the time is that a particular application runs a lot better on Oracle with no other changes. This setting is usually the reason why: It’s not turned on.

Once you turn it on, reader/writer blocking and deadlocking go away, and you don’t need to add an SSMS keyboard shortcut that inserts WITH NOLOCK.

Changing Indexes

“They’re fine the way they are, trust me. That burning smell is another server.”

Index tuning needs to be a process that starts with cleaning up indexes, and ends with adding in better ones.

What makes an index bad? When it’s unused, and/or duplicative.

What makes an index good? When it gets read from more than it’s written to, and it’s a usefully different way for queries to access data.

There are other index anti-patterns that are good to look for too, like lots of single key column indexes, but they usually get cleaned up when you start merging duplicates.

There’s a near fully eclipsed Venn Diagram of people who are worried about having too many indexes and people who have never dropped an index in their career.
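A rough way to start looking for the unused ones is to compare reads to writes per index. This sketch only covers nonclustered indexes in the current database, and the usage numbers reset when SQL Server restarts, so don’t trust them on a freshly rebooted server.

```sql
/* Nonclustered indexes with the fewest reads first. */
SELECT
    table_name = OBJECT_NAME(i.object_id),
    index_name = i.name,
    reads = ISNULL(ius.user_seeks + ius.user_scans + ius.user_lookups, 0),
    writes = ISNULL(ius.user_updates, 0)
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS ius
    ON  ius.object_id = i.object_id
    AND ius.index_id = i.index_id
    AND ius.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
AND   i.index_id > 1
ORDER BY
    reads,
    writes DESC;
```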

Talk, Talk

These are the kinds of changes and processes people should be comfortable with making when they work with SQL Server.

Sure, there are a ton of others, but some of them have become part of the installer and get a little more leeway — parallelism settings, instant file initialization, tempdb etc. — I only wish that more of this stuff would follow suit.

One wonders quite loudly why setting MAXDOP made it into the installer, but setting Cost Threshold For Parallelism did not.
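Since the installer won’t do it for you, here is a sketch of setting it by hand. The value of 50 is a commonly suggested starting point and purely an assumption here; test what suits your workload.

```sql
/* Cost Threshold For Parallelism is an advanced option. */
EXEC sys.sp_configure
    @configname = N'show advanced options',
    @configvalue = 1;
RECONFIGURE;

/* 50 is an illustrative starting value, not a recommendation from this post. */
EXEC sys.sp_configure
    @configname = N'cost threshold for parallelism',
    @configvalue = 50;
RECONFIGURE;
```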

Thanks for reading!

SQL Server Features I Still Need To Talk People Out Of

Posted on October 14, 2021 (last modified May 16, 2022) by Erik Darling

We Rust

Consulting gives you a lot of opportunities to talk to a lot of people and deal with interesting issues.

Recently it occurred to me that a lot of people seem to confer magic button status to a lot of things that always seem to be If-I-Could-Only-Do-This features that would solve all their problems, and similarly a Thing-That-Solved-One-Problem-Once turned into something that got used everywhere.

Go figure, right?

Let’s talk about some of them, so maybe I won’t have to talk this over with you someday, dear reader.

Partitioning

How this ended up being everyone’s top unexplored performance feature is beyond me. I always seem to hear that someone really wants to partition dbo.tblSomeBigTable because queries would be faster if they could eliminate partitions.

Maybe if you’re using clustered column store indexes it would, but for the rest of us, you’re no better off with a partitioned table than you are with a table that has decent indexing. In a lot of cases, partitioning can make things worse, or just more confusing.

Few people seem to consider the amount of work that goes into partitioning a really big table, either. It doesn’t matter if you want to do it in place, or use a batch process to copy data over.

Even fewer people talk about Partitioning for what it’s good for, which is managing partitions. Just make sure all those indexes are aligned.

Fill Factor

At this point, I’d expect everyone to understand why Flash and SSD storage is better than old spinning disks. Lack of moving parts, less pushing random I/O patterns, etc.

And yet, without a single page split being measured or compared, fill factor gets dropped down to 80 (or lower) just in case.

I call this Spinning Disk Mentality, and it hurts to see it out in the wild, especially when:

You’re on Standard Edition
You already have way more data than memory
You’re intentionally making data less compact
Your entire workload is stalled out on PAGEIOLATCH_XX waits

I truly appreciate the problem that lowering fill factor used to solve, but let’s join the CURRENT_CENTURY on this one.

Unless you have a good reason to add physical fragmentation to your indexes, how about we skip that?
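To see where fill factor has been lowered in the current database, a quick sketch like this will do. A fill factor of 0 is the same as 100, so both are filtered out.

```sql
/* Indexes with a non-default fill factor, lowest first. */
SELECT
    schema_name = OBJECT_SCHEMA_NAME(i.object_id),
    table_name = OBJECT_NAME(i.object_id),
    index_name = i.name,
    i.fill_factor
FROM sys.indexes AS i
WHERE i.fill_factor NOT IN (0, 100)
AND   OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY i.fill_factor;
```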

In-Memory OLTP (Hekaton)

This is a hugely misunderstood feature. Everyone thinks it’s gonna make queries faster because tables will be in memory without reading the fine print.

If you have problems with throughput on very hot data, this might be a good solution for you.
If you’ve got a bunch of run-of-the-mill queries that get blocked sometimes and performance generally stinks, this isn’t really what you need to focus on.

I think the most common useful pattern I’ve seen for this feature is for “shock absorber” tables, where things like event betting, ticket sales, and online ordering all need to happen very quickly for a very small amount of data, and after the initial rush can be shuffled to regular disk-based tables.

If your table is already cached in memory when queries are hitting it, using this feature isn’t gonna make it any more in memory.

You’ve got other problems to solve.

Dirty Reads

Getting blocked sucks. It doesn’t matter if it’s in a database, at a bar, in traffic, or an artery. Everyone wants their reads instantly and they don’t wanna hear a darn word about it.

I’m not here to trample all over NOLOCK — I’ve defended people using it in the past — but I am here to ask you nicely to please reconsider dousing all your queries with it.

In many cases, READPAST is a better option, so your query can skip over locked rows rather than read a bunch of in-flight changes. This can be the wrong choice too, but it’s worth considering. It can be especially useful for modification queries that are just out looking for some work to do.
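The “looking for some work to do” pattern can be sketched like this. The dbo.work_queue table and its columns are hypothetical, made up to show the hint combination; each worker claims one unclaimed row and skips rows that other workers have locked.

```sql
/* Hypothetical queue table, for illustration only. */
CREATE TABLE dbo.work_queue
(
    id int IDENTITY NOT NULL PRIMARY KEY,
    payload nvarchar(100) NOT NULL,
    is_claimed bit NOT NULL DEFAULT 0
);

/* Claim one row, skipping any rows locked by other workers.
   TOP without ORDER BY claims an arbitrary row, not the oldest. */
UPDATE TOP (1) wq
    SET wq.is_claimed = 1
OUTPUT
    inserted.id,
    inserted.payload
FROM dbo.work_queue AS wq WITH (READPAST, UPDLOCK, ROWLOCK)
WHERE wq.is_claimed = 0;
```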

We’ll talk about my favorite option in tomorrow’s post.

Recompiling All The Things

Look, you wanna recompile a report or something, fine. I do, too. I love doing it, because then I have one less random issue to think about.

Weirdly sniffed parameter? No mas, mon ami.

Magick.

Especially in cases where bigger code changes are hard/impossible, this can be sensible, like dealing with a million local variables.

Just be really careful using it everywhere, especially in code that executes a ton. You don’t wanna spend all your time constantly coming up with query plans any more than you wanna get parameter sniffed.

Plus, since Query Store captures plans with recompile hints, you can still keep track of performance over time. This can be a great way to figure out a parameter sniffing problem, too.
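For the local variable case, a minimal sketch looks like this. The table and column are hypothetical; the point is only where the hint goes. With the hint, the optimizer can see the variable’s value when it compiles the statement.

```sql
/* Hypothetical table and column, for illustration only. */
DECLARE
    @start_date datetime = '20220101';

SELECT
    records = COUNT_BIG(*)
FROM dbo.SomeTable AS st
WHERE st.created_date >= @start_date
OPTION (RECOMPILE);
```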

Gotcha

Basic understanding is often just camouflage for complete confusion. Often, once you dig past the ~~documentation~~ marketing materials, you’ll find every feature has a whole lot of drawbacks, trade-offs, blind spots, and interoperability issues.

Databases being databases, often just getting your schema to a state where you can test new features is a heroic feat.

No wonder so many millions of hours have been spent trying to replace them.

Thanks for reading!

My SQL Server Query Ran For A Long Time But Didn’t Use A Lot Of CPU: What Happened?

Posted on October 6, 2021 (last modified May 16, 2022) by Erik Darling

Of Walls And Clocks

No one ever says a broken record is right twice a day, perhaps because DJs are far more replaceable than clock makers.

I say that only to acknowledge that I may sound like a broken record when I say that when you’re tuning a query, it’s quite important to compare CPU time and wall clock time (duration). Things you should note:

If CPU and duration were about equal in a serial plan, that’s normal
If CPU is much higher than duration in a parallel plan, that’s normal
If duration and CPU are about equal in a parallel plan, you’ve got yourself a situation
If duration is much higher than CPU in any plan, something else held your query up

In this post, I’m going to outline a non-exhaustive list of reasons why that last bullet point just might be.
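If you don’t have an actual execution plan handy, one simple way to get both numbers is SET STATISTICS TIME. A small sketch, with a hypothetical table standing in for your query:

```sql
/* Prints CPU time and elapsed time for each statement
   to the Messages tab in SSMS. */
SET STATISTICS TIME ON;

SELECT
    records = COUNT_BIG(*)
FROM dbo.SomeTable AS st; /* hypothetical table */

SET STATISTICS TIME OFF;
```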

Hammer Time

Big Data: One common reason you may run into is that you’re returning a large result set, either locally to SSMS, or to an app server that is either overloaded or underpowered. It’s also possible that something is happening on the application side that’s slowing things down. In these cases, you’ll usually see a lot of ASYNC_NETWORK_IO waits. To better test the speed of the actual query, you can dump the results into a #temp table.

Blocking: Another quite common issue is that the query you’re running is getting blocked. Before you go reaching for that NOLOCK hint, make sure you know what it does. Blocking is easy to check on with sp_WhoIsActive. If you see your query waiting on waits that start with LCK_, it’s being blocked. Some common ones are LCK_M_SCH_S, LCK_M_SCH_M, LCK_M_S, LCK_M_U, LCK_M_X, LCK_M_IS, LCK_M_IU, LCK_M_IX. While your query is being blocked, it’s just gonna rack up wall clock time, while using zero CPU.

Stats updates: Once in a while I’ll run into a query that runs slowly the first time and fast the second time because behind the scenes the query had to wait on stats to update. It’s a bit hard to figure out unless you’re on SQL Server 2019, but it can totally make your query look like you sat around doing nothing for a chunk of time, especially if you’re waiting on large tables, or a bunch of smaller updates.

Reading from disk: If the tables or indexes you’re reading are bigger than your buffer pool, your queries are gonna eat it going to disk to read data in. If this is your limitation, you’ll see a lot of PAGEIOLATCH_SH or PAGEIOLATCH_EX waits, depending on if your query is reading data, writing data, or both.
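To get a rough idea of what’s in the buffer pool right now, you can count cached pages per database and compare that to how big your data is. A minimal sketch; be aware this DMV can take a while to query on servers with a lot of memory.

```sql
/* Buffer pool usage by database; each page is 8 KB */
SELECT
    database_name = DB_NAME(dobd.database_id),
    cached_mb = COUNT_BIG(*) * 8 / 1024
FROM sys.dm_os_buffer_descriptors AS dobd
GROUP BY dobd.database_id
ORDER BY cached_mb DESC;
```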

Waiting for a worker thread: When your queries can’t get a worker thread to run on, they end up waiting on THREADPOOL. While some waits on it are to be expected, you don’t want queries to wait long periods of time on this. It’s a serious sign your server is jammed way up.

Waiting for memory: When your queries can’t get the memory grant they want, they sit around waiting on RESOURCE_SEMAPHORE. Just like above, it’s a sure sign your server is having problems if it’s a wait you see occurring a lot, or for long average periods of time.
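To see this while it’s happening, you can look at who has asked for a memory grant and who is still waiting on one. A sketch using the built-in DMV:

```sql
/* Queries that have requested memory grants.
   grant_time stays NULL while a query is still waiting. */
SELECT
    deqmg.session_id,
    deqmg.request_time,
    deqmg.grant_time,
    deqmg.requested_memory_kb,
    deqmg.granted_memory_kb,
    deqmg.wait_time_ms,
    deqmg.queue_id
FROM sys.dm_exec_query_memory_grants AS deqmg
ORDER BY deqmg.request_time;
```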

Waiting for a query plan: I know, this sounds like a long compilation time — and it sort of is — but only because the query is waiting a long time to get memory to compile an execution plan. This wait is going to be RESOURCE_SEMAPHORE_QUERY_COMPILE, instead.
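For the three server-level waits named above, a quick look at aggregate wait stats can tell you whether they’re a regular occurrence. Keep in mind these numbers are cumulative since the last restart (or since someone cleared them), so treat this as a sketch for a first look rather than a diagnosis.

```sql
/* How often, and for how long, queries have waited on
   worker threads, query memory, and compile memory */
SELECT
    dows.wait_type,
    dows.waiting_tasks_count,
    dows.wait_time_ms,
    avg_wait_ms =
        dows.wait_time_ms / NULLIF(dows.waiting_tasks_count, 0)
FROM sys.dm_os_wait_stats AS dows
WHERE dows.wait_type IN
(
    N'THREADPOOL',
    N'RESOURCE_SEMAPHORE',
    N'RESOURCE_SEMAPHORE_QUERY_COMPILE'
)
ORDER BY dows.wait_time_ms DESC;
```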

Query Plan Compilation: Sometimes the optimizer gets a bit carried away and spends a whole bunch of time in search2 trying to reorder joins and do other tricks to make your query faster. Under extreme circumstances, you might wait a really long time for that query plan. There are no wait stats to tell you that, but if you look at your query plan’s properties (F4 key on the root operator), you can see the compile time.
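If you’d rather not click through plans one at a time, and assuming Query Store is enabled on the database, it tracks compile durations too. This is a different route to the same number, not the plan properties method described above.

```sql
/* Queries with the longest average compile time in Query Store.
   Compile durations are stored in microseconds. */
SELECT TOP (10)
    qsq.query_id,
    qsq.count_compiles,
    avg_compile_ms = qsq.avg_compile_duration / 1000.,
    last_compile_ms = qsq.last_compile_duration / 1000.,
    qsqt.query_sql_text
FROM sys.query_store_query AS qsq
JOIN sys.query_store_query_text AS qsqt
    ON qsqt.query_text_id = qsq.query_text_id
ORDER BY qsq.avg_compile_duration DESC;
```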

Thanks for reading!

Steps For Getting Rid Of NOLOCK Hints In SQL Server Queries

Posted on September 24, 2021May 16, 2022 by Erik Darling

Way Out

Whenever I see people using NOLOCK hints, I try to point out that they’re not a great idea, for various reasons explained in detail all across the internet.

At minimum, I want them to understand that the hint does the same thing as setting the transaction isolation level to READ UNCOMMITTED, just scoped to the table it’s on, and that the hint name is quite misleading. It doesn’t mean your query takes no locks, it means your query ignores locks taken by other queries.
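To make the equivalence concrete, here’s a small sketch. The table dbo.view_me is just a placeholder; any table will do. Both queries read uncommitted data.

```sql
/* Hint form: applies only to the table reference it's attached to.
   READUNCOMMITTED is a synonym for NOLOCK here. */
SELECT
    vm.id
FROM dbo.view_me AS vm WITH (NOLOCK);

/* Session form: applies to every table in every query that follows */
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT
    vm.id
FROM dbo.view_me AS vm;

/* Put the session back to the default when you're done */
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```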

That’s how you can end up getting incorrect results.

That warning often comes with a lot of questions about how to fix blocking problems so you can get rid of those hints.

After all, if you get rid of them, your SQL Server queries will (most likely) go back to using the READ COMMITTED isolation level and we all know that read committed is a garbage isolation level, anyway.

Cause and Wrecked

An important thing to understand is why the hint was used in the first place. I’ve worked with some nice developers who slapped it on every query just in case.

There was no blocking or deadlocking. They just always used it, and never stopped.

Not that I blame them; the blocking that can occur under read committed, the garbage isolation level, is plain stupid, and no respectable database platform should use it as a default.

In many ways, it’s easier for a user to re-run a query and hopefully get the right result and shrug and mumble something about computers being awful, which is also true.

So, first step: ask yourself if there was ever really a blocking problem to begin with.

Bing Tutsby

Next, we need to understand where the blocking was coming from. Under read committed, the garbage isolation level, writers can block readers, and readers can block writers.

In most cases though, people have added the hint to all of their queries, even ones that never participated in blocking.

If the problem was writers blocking writers, no isolation level can help you.
If the problem was readers blocking writers, you may need to look at long running queries with Key Lookups.

If the problem was writers blocking readers, you’d have to look at a few things:

Are your modification queries biting off more than they should chew?
Do they have good indexes in place to locate rows to update or delete?
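If the answer to the first question is yes, one common fix is to break big modifications into smaller batches so each one holds fewer locks for less time. A minimal sketch, where dbo.some_table and created_date are hypothetical names, and the batch size of 1000 is an assumption you should test for yourself:

```sql
/* Delete old rows in small batches instead of one giant statement.
   dbo.some_table and created_date are made-up names. */
DECLARE
    @rows int = 1;

WHILE @rows > 0
BEGIN
    DELETE TOP (1000)
        st
    FROM dbo.some_table AS st
    WHERE st.created_date < DATEADD(YEAR, -1, SYSDATETIME());

    /* Stop once a batch finds nothing left to delete */
    SET @rows = @@ROWCOUNT;
END;
```

This only helps if there’s an index that lets each batch find its rows quickly, which is what the second question is getting at.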

If you have query store enabled, you can use sp_QuickieStore to search it for queries that do a lot of writes. If you don’t, you can use sp_BlitzCache to search the plan cache for them.
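Both are community procedures that you have to install yourself. The calls might look something like this; the parameter names are as I understand them from recent versions, so check the documentation for the version you have. YourDatabase is a placeholder.

```sql
/* Query Store: queries doing the most writes */
EXEC dbo.sp_QuickieStore
    @database_name = N'YourDatabase',
    @sort_order = 'writes',
    @top = 10;

/* Plan cache: queries doing the most writes */
EXEC dbo.sp_BlitzCache
    @SortOrder = 'writes',
    @Top = 10;
```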

Best Case

Of course, you can avoid all of these problems, except for writers blocking writers, by using an optimistic isolation level like Read Committed Snapshot Isolation or Snapshot Isolation.
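For reference, turning these on looks roughly like this. YourDatabase is a placeholder, and WITH ROLLBACK IMMEDIATE will kill in-flight transactions in that database, so treat this as a sketch to adapt and not something to paste into production at noon.

```sql
/* Read Committed Snapshot Isolation: changes how read committed
   behaves for every query in the database */
ALTER DATABASE YourDatabase
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;

/* Snapshot Isolation: allows queries to opt in with
   SET TRANSACTION ISOLATION LEVEL SNAPSHOT; */
ALTER DATABASE YourDatabase
SET ALLOW_SNAPSHOT_ISOLATION ON;

/* Check the current settings */
SELECT
    d.name,
    d.is_read_committed_snapshot_on,
    d.snapshot_isolation_state_desc
FROM sys.databases AS d
WHERE d.name = N'YourDatabase';
```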

In the past, people made a lot of fuss about turning these on, because:

You may not have tempdb configured correctly
You may have queue type code that relies on blocking for correctness

But in reasonably new versions of SQL Server, tempdb’s setup is part of the install process, and the wacky trace flags you used to have to turn on are the default behavior.

If you do have code in your application that processes queues and relies on locking to correctly process them, you’re better off using locking hints in that code, and using an optimistic isolation level for the rest of your queries. This may also be true of triggers that are used to enforce referential integrity, which would need READCOMMITTEDLOCK hints.
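As a sketch of what locking hints in queue code can look like: dbo.work_queue, dbo.parent_table, and their columns are hypothetical. The hints make the dequeue take and hold locks explicitly, so it behaves the same whether or not an optimistic isolation level is enabled.

```sql
/* Grab and remove the next queue item.
   UPDLOCK takes update locks so two workers can't claim the same row,
   READPAST skips rows other workers have locked instead of waiting,
   and ROWLOCK asks for row-level locks. */
WITH
    next_item AS
(
    SELECT TOP (1)
        wq.id,
        wq.payload
    FROM dbo.work_queue AS wq WITH (UPDLOCK, READPAST, ROWLOCK)
    ORDER BY wq.id
)
DELETE
FROM next_item
OUTPUT
    Deleted.id,
    Deleted.payload;

/* In trigger code that checks for a parent row, READCOMMITTEDLOCK
   forces a locking read even when RCSI is turned on. */
DECLARE
    @parent_id int = 1;

SELECT
    pt.id
FROM dbo.parent_table AS pt WITH (READCOMMITTEDLOCK)
WHERE pt.id = @parent_id;
```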

The reason optimistic isolation levels are a much better choice than read uncommitted is that, rather than getting a bunch of dirty reads from in-flight changes, you read the last known good version of the row before a modification started.

This may not be perfect, but it will prevent the vast majority of your blocking headaches. It will even prevent deadlocks between readers and writers.

No, Lock

If your code has a lot of either NOLOCK hints or READ UNCOMMITTED usage, you should absolutely be worried about incorrect results.

There are much better ways to deal with blocking, and I’ve outlined some of them in this post.

Thanks for reading!

Read Committed Is A Garbage Isolation Level In SQL Server

Posted on June 15, 2021May 16, 2022 by Erik Darling

Yep.

My friend Forrest ran into a similarly fun problem, too.

Thanks for watching!