The Perils Of Change: Max Degree Of Parallelism

Downtune


Like in yesterday’s post about Cost Threshold For Parallelism, changing MAXDOP settings will have a universal effect on the workload.

This is true whether you change it at the server level for all databases, or at the database level using a database scoped configuration for a single database.
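
If you need a reminder of where each of those knobs lives, here's a minimal sketch of both (the value of 8 is just a placeholder, not a recommendation):

EXEC sys.sp_configure
    N'show advanced options',
    1;
RECONFIGURE;
GO

/*Server level: applies to every database*/
EXEC sys.sp_configure
    N'max degree of parallelism',
    8;
RECONFIGURE;
GO

/*Database level: applies only to the database you run it in*/
ALTER DATABASE SCOPED CONFIGURATION
    SET MAXDOP = 8;
GO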

It's a guardrail for the workload as a whole, there to prevent unwanted conditions like excessive concurrent parallel queries causing worker thread starvation (THREADPOOL waits), or just pushing CPU to 100% for extended periods of time.

You may also find times when parallel threads within a single query are more prone to involvement in blocking or deadlocking.

The problem is similar to yesterday’s post, and requires some level of attention to detail at the query level after you change MAXDOP.

Don’t just look at all the various CX waits, pronounce that there are fewer of them, and call the job done.

You should examine parallel query metrics to make sure that runtimes are still adequate.

Real World


If we take a single query and run it at reduced DOPs, you can see the pattern I’m talking about.

Yesterday, we looked at DOP 8 vs. DOP 1, and I’m including a DOP 1 run here too. It’s just not as important, unless you’re the kind of wackadoo SharePoint admin who changes MAXDOP to 1.

(I’m going to be honest with you here, it’s been YEARS since I’ve seen a SharePoint database, or looked at the SharePoint documentation to see if the MAXDOP 1 requirement still exists).

[image: quality sleep: the same query's runtimes at DOP 8, 4, 2, and 1]

When you change DOP, it’s up to you to monitor (you can use Query Store for this) execution and CPU time to look for big regressions.

Keeping the above picture in mind:

  • Is it okay if this query runs for ~300ms longer at DOP 4?
  • Is it okay if this query runs for ~1.3 seconds longer at DOP 2?

Taking a full three seconds longer at DOP 1 is probably a non-starter for queries that users care about, but that outcome is more likely to come from changing Cost Threshold For Parallelism than from lowering MAXDOP.

A diligent caretaker for SQL Server may even look for critical queries that currently use parallel execution plans, and time them with a lower DOP to make sure there’s no substantial regression before making the change.
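
If you'd rather not eyeball plans one at a time, here's a rough sketch against the built-in Query Store views to find your longest-running parallel plans (times in these views are in microseconds, hence the division):

SELECT TOP (20)
    qsqt.query_sql_text,
    qsrs.count_executions,
    avg_duration_ms = qsrs.avg_duration / 1000.,
    avg_cpu_time_ms = qsrs.avg_cpu_time / 1000.
FROM sys.query_store_plan AS qsp
JOIN sys.query_store_runtime_stats AS qsrs
  ON qsrs.plan_id = qsp.plan_id
JOIN sys.query_store_query AS qsq
  ON qsq.query_id = qsp.query_id
JOIN sys.query_store_query_text AS qsqt
  ON qsqt.query_text_id = qsq.query_text_id
WHERE qsp.is_parallel_plan = 1 /*only plans that currently go parallel*/
ORDER BY
    qsrs.avg_duration DESC;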

If there’s a big increase in duration, add a hint for the old, higher DOP to the queries where it makes sense.
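
Something like this, where the table and predicate are made up for illustration, and the OPTION (MAXDOP 8) hint only applies to the statement it's attached to:

SELECT
    COUNT_BIG(*) AS records
FROM dbo.YourBigTable AS ybt /*hypothetical table name*/
WHERE ybt.SomeDate >= '20230101'
OPTION (MAXDOP 8); /*override the lower server/database setting for this query only*/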

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

The Perils Of Change: Cost Threshold For Parallelism

Non-Default


First, I’m not suggesting that anyone should be using the default value for Cost Threshold For Parallelism. It’s old and moldy and not a good fit for most workloads running on modern hardware.

My apologies to Azure SQLDB users who can’t change this setting and leave it up to Microsoft to maybe manage it for them based on ???

Some people out there really like fiddling with settings in a usually ill-informed reaction to Some Script They Found On The Internet, without reading the fine print.

Constantly pushing Cost Threshold For Parallelism up higher because there’s “too much parallelism” is usually a bad strategy, for several reasons.

You should be looking at:

  • Which queries are going parallel
  • Why parallel plans are being chosen
  • The duration vs. CPU of parallel queries

Just to put some easy numbers out there, let’s say you have a query that runs at DOP 8 and:

  • Runs for 8 seconds
  • Uses 64 seconds of CPU

If you push Cost Threshold For Parallelism up to the point where that query no longer qualifies for a parallel plan and runs at DOP 1, it will:

  • Run for 64 seconds
  • Use 64 seconds of CPU time

See why this isn’t a bold and exciting strategy embraced by experts around the world?

Scaling


Of course, in the real world, parallelism rarely scales perfectly linearly. That’s okay, as long as you’re seeing a reasonable reduction in wall clock time.

In this picture, I’ve captured the same query running at DOP 8 and 1.

[image: scaling: the same query running at DOP 8 and DOP 1]

What I said above holds close enough to true.

  • At DOP 8, we used ~6.5 seconds of CPU time in ~800ms of wall clock time
  • At DOP 1, we used ~3.9 seconds of CPU time in ~3.9 seconds of wall clock time

While we used more CPU time at DOP 8, we only used about 2.5 seconds more of it, and we used it for about 3 fewer seconds overall.

This is the efficiency tradeoff parallelism offers, and this is especially desirable for queries that have to process a lot of data.

Sticking with DOP 8, I’d rather process 10 million rows across 8 threads than 10 million rows on a single thread (especially in Row Mode, but it would still be preferable in Batch Mode).

Anyway, if you’ve got Cost Threshold For Parallelism set to the default, you should probably bump it up. I still think 50 is a good starting place to get you to a spot where you can make sane decisions.
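
If you want the copy/paste version, this is a minimal sketch of moving it from the default of 5 up to 50:

EXEC sys.sp_configure
    N'show advanced options',
    1;
RECONFIGURE;
GO

EXEC sys.sp_configure
    N'cost threshold for parallelism',
    50;
RECONFIGURE;
GO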

Remember, global settings are the guard rails for your workload. They are the general rules you want queries to use.

After you set up the guard rails, it’s up to you to define the exceptions.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

The SQLBits 2023 Call For Speakers Is Open! @SQLBits #SQLBits

What A Great Time To Be Alive


I absolutely adore the SQLBits conference, and it was the Biggest Bummer In Bummsville last year when I had family travel plans that overlapped with the conference dates.

So this year, I’m back with a mach 5 vengeance, and ready to get back to the most highingest-flyingest conference in the world.

If you’re interested, and you wanna throw your hat in the ring, here are the links for general and full training day submissions:

To learn more about speaking at the event, click here.

To learn more about attending the event, click here.

I’ll have all my fingers and toes crossed until November 14th when the training day schedule is announced.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

A Little About The CXSYNC_PORT Wait In SQL Server 2022



Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Performance Regression With Batch Mode Sorts When Spilling To Disk

Gress and Regress


To vote for this issue, please click here.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

The Waiting Life Of A SQL Server Query, Revisited

Often Asked


Sometimes people will ask me penetrating questions like “why does SQL Server choose a bad execution plan?” or “why is this query sometimes slow?”

Like many things in databases, it’s an endless spiral of multiverses (and turtles) in which many choose-your-own-adventure games are played and, well, sometimes you get eaten by a Grue.

In this post, I’m going to talk at a high level about potential reasons for both.

I can’t possibly go in depth on both in a single blog post, but I thought it would be fun to talk about all the things that could possibly cause your Unfortunate Circumstances™️.

At any rate, I hope you enjoy it. And maybe learn something. And maybe feel bad about blaming SQL Server for things that are all your fault.

Getting There


A lot of assumptions are gonna get made here, because a lot of things can stand between you and your query getting to SQL Server.

We’re talking networks, security, Active Directory, the cloud. Who knows? Not me. I stick to what happens inside the server.

Just know that there are a lot of moving parts between your application server and your database server.

I’ve seen logins take 30+ seconds to process. Not on purpose, mind you. I stick to what happens inside the server.

But hey, maybe we can just say this all worked quickly, and now we can talk about the stuff I stick to.

Inside the server.

Upon Arrival


There are even more things that have to happen now, even after you log in.

Your connection has to pass some checks and do a bit of routing, and then at a high level, your query will either:

  • Re-use a cached plan if one exists
  • Start building a query plan

This is when SQL Server’s query processor comes along and goes through the various phases of figuring things out:

  • Parse
  • Bind
  • Expand views
  • OPTIMIZE!
  • Come up with an executable plan
  • Apply memory grant and degree of parallelism

There’s one step in that process that seems especially important: the actual optimization portion. In there, you have:

  • Simplification
  • Cardinality estimation
  • Join ordering (depends a bit on which search phase you make it to: 0, 1, or 2; see the sketch after this list)
  • Plan exploration phases
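
If you're curious how often your workload actually lands in each of those search phases, there's a DMV that keeps score. A quick sketch (numbers are cumulative since the last restart):

SELECT
    deqoi.counter,
    deqoi.occurrence,
    deqoi.value
FROM sys.dm_exec_query_optimizer_info AS deqoi
WHERE deqoi.counter IN (N'trivial plan', N'search 0', N'search 1', N'search 2');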

Rules, Smoky


During plan exploration is when the optimizer starts applying various rules to your query.

If you want to see the names of all the crazy rules that may get applied, look in here:

SELECT
    deqts.*
FROM sys.dm_exec_query_transformation_stats AS deqts
ORDER BY
    deqts.name;

In SQL Server 2022, there are about 440 of them. It’s a wonder anything gets done around here.

This process has to happen quickly. So when you throw your crazy-town queries at SQL Server, it may not figure out the single best possible plan ever.

If your queries are parameterized, this brand spankin’ new plan will be created for those parameters.

At this point, it would be good to remind you that your query hasn’t even started executing yet.

It’s just been waiting to get going.

Once it has an executable plan, that happens.

Inside the server.

Ready Set Go


Once your query gets on its horse, you’re subject to many potential holdups, both physical and logical. It may even have to wait for memory just to compile an execution plan. For example:

  • Memory to compile a query plan (RESOURCE_SEMAPHORE_QUERY_COMPILE)
  • Read pages from disk into memory (PAGEIOLATCH_SH, PAGEIOLATCH_EX)
  • Writing to the transaction log (WRITELOG)
  • Memory for a memory grant (RESOURCE_SEMAPHORE)
  • CPU threads to run on (THREADPOOL)
  • CPU attention while other queries are busy (SOS_SCHEDULER_YIELD)
  • Spilling to disk (IO_COMPLETION, SLEEP_TASK)
  • Building spools (EXECSYNC — parallel plans only)
  • Blocking (LCK_X)
  • Latching (LATCH_XX and PAGELATCH_XX)
  • Parallel threads to process (All those CX waits)

Not enough memory? Wait. Not enough CPU? Wait. Need to read data? Wait. Need to write data? Wait. Blocked? Wait.
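
If you want a raw, since-startup look at how much of each of those your server has racked up, a rough sketch against sys.dm_os_wait_stats looks like this (the list above uses shorthand like LCK_X; the actual wait type for an exclusive lock is LCK_M_X, and there are plenty of other waits in the LCK, LATCH, and CX families):

SELECT
    dows.wait_type,
    dows.waiting_tasks_count,
    wait_time_seconds = dows.wait_time_ms / 1000.,
    signal_wait_seconds = dows.signal_wait_time_ms / 1000.
FROM sys.dm_os_wait_stats AS dows
WHERE dows.wait_type IN
      (
          N'RESOURCE_SEMAPHORE_QUERY_COMPILE', N'PAGEIOLATCH_SH', N'PAGEIOLATCH_EX',
          N'WRITELOG', N'RESOURCE_SEMAPHORE', N'THREADPOOL', N'SOS_SCHEDULER_YIELD',
          N'IO_COMPLETION', N'SLEEP_TASK', N'EXECSYNC', N'LCK_M_X',
          N'CXPACKET', N'CXCONSUMER', N'CXSYNC_PORT'
      )
ORDER BY
    dows.wait_time_ms DESC;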

And that doesn’t even count all the work your query might have to do while it’s running and not waiting on anything.

The next time you have a gripe about a slow query, try to keep all this stuff in mind.

There’s a lot going on.

Inside the server.

Since You Want To Know


A great way to see all the stuff your queries are waiting on all at once is to run sp_PressureDetector.
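
Calling it with no parameters and letting the defaults do their thing is enough to get started:

EXEC dbo.sp_PressureDetector;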

Inside the server.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly.

Three Free Things You Should Do Before You Have Your Next SQL Server Performance Problem

Attribution


SQL Server doesn’t do much to help you figure out what’s wrong with it. Sure, it makes a lot of information available, but logging, decoding, and assembling all that information is hard work.

And it sure doesn’t help that a lot of it is stored in XML. Even when there are graphical options for displaying it, they end up hiding a lot of the really useful information.

It also has a lot of optional things that you have to turn on, and all of these things require you to monitor and alert on them.

In this post, I’m going to talk about why you should turn on Query Store and the Blocked Process Report, start capturing deadlocks, and then link you to free tools to analyze all of it.

This is all stuff that I help clients with as well, along with actually helping them fix the problems, but if you’re feeling keen on getting to it yourself, here’s what you should do.

Query Store


This thing is great for most SQL Servers out there. It’s like the plan cache, except it won’t keep disappearing on you.

To get it up and running, just run this command:

ALTER DATABASE 
    [YourDatabase]
SET QUERY_STORE 
(
    OPERATION_MODE = READ_WRITE,
    MAX_STORAGE_SIZE_MB = 1024,
    QUERY_CAPTURE_MODE = AUTO
);
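
If you want to make sure the change took and that Query Store is actually collecting data, there's a catalog view for that (a quick sketch):

SELECT
    dqso.actual_state_desc,
    dqso.desired_state_desc,
    dqso.current_storage_size_mb,
    dqso.max_storage_size_mb
FROM sys.database_query_store_options AS dqso;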

After it’s turned on, you can check in on it in two ways: the graphical Query Store reports built into SSMS, or my stored procedure sp_QuickieStore.

Whichever one you use, you’re going to be in much better shape than hoping what you need is in the plan cache after something went bump in the night.

Since the graphical reports hide a lot of the really useful details, here’s the simplest way to call sp_QuickieStore:

EXEC sp_QuickieStore
    @database_name = 'your awesome database';

Just running this will go back up to seven days and give you the top 10 queries by average CPU time. In other words, it’s the stuff that uses the most CPU whenever it runs.

If you find stuff you don’t care about, just copy and paste the contents of the query_id column, and do this to ignore them:

EXEC sp_QuickieStore
    @database_name = 'your awesome database',
    @ignore_query_ids = '1, 2, 3, 4, 5, 6, 7, 8';

Blocked Processes


The next thing you’ll wanna do is start logging which queries are blocking each other. You never know when this will ratchet up and become a real big problem.

The number of times I hear “we had a bad blocking problem” but no one even has a screenshot of it hanging out would boggle the very depths of your mind.

To avoid that, you need to:

  • Enable the blocked process report
  • Enable something to collect it

First, do this:

EXEC sys.sp_configure
    N'show advanced options',
    1;
RECONFIGURE;
GO
EXEC sys.sp_configure
    N'blocked process threshold',
    5; --Seconds
RECONFIGURE;
GO

This will get SQL Server capturing any blocking events that go on longer than five seconds. This is about the level where people start hitting problems, but you can set it to a higher number if you feel like it.

Next, you’ll want to set up an Extended Event to house the data from it:

CREATE EVENT SESSION 
    blocked_process_report
ON SERVER
    ADD EVENT 
        sqlserver.blocked_process_report
    ADD TARGET 
        package0.event_file
    (
        SET filename = N'bpr'
    )
WITH
(
    MAX_MEMORY = 4096KB,
    EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY = 5 SECONDS,
    MAX_EVENT_SIZE = 0KB,
    MEMORY_PARTITION_MODE = NONE,
    TRACK_CAUSALITY = OFF,
    STARTUP_STATE = ON
);
GO

ALTER EVENT SESSION
    blocked_process_report
ON SERVER 
    STATE = START;
GO

If you’re using Azure SQL DB, you’ll need to tweak this to use it successfully. It’s a lot harder to successfully set up and access event files there, and the ring buffer is far less valuable because it doesn’t hold onto data very long.

The easiest way to get at the data in there is to use my stored procedure sp_HumanEventsBlockViewer.

It’s as simple as pointing it at the event session you created, but there are a whole bunch of other parameters to filter data in or out.

EXEC dbo.sp_HumanEventsBlockViewer
    @session_name = N'blocked_process_report';


Deadlocks


The final thing you should do is set up something to collect deadlocks in the long term.

To do that, Extended Events is your friend all over again.

CREATE EVENT SESSION 
    deadlock
ON SERVER
    ADD EVENT 
        sqlserver.xml_deadlock_report
    ADD TARGET 
        package0.event_file
    (
        SET filename = N'deadlock'
    )
WITH
(
    MAX_MEMORY = 4096KB,
    EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY = 5 SECONDS,
    MAX_EVENT_SIZE = 0KB,
    MEMORY_PARTITION_MODE = NONE,
    TRACK_CAUSALITY = OFF,
    STARTUP_STATE = ON
);
GO

ALTER EVENT SESSION
    deadlock
ON SERVER 
    STATE = START;
GO

While this stored procedure isn’t in my GitHub repo, it’s one that I do a heck of a lot of work on. The easiest way to parse everything out is with sp_BlitzLock.

It’s as simple as pointing it at your extended event session.

EXEC dbo.sp_BlitzLock
    @EventSessionName = N'deadlock';

Nice To Have


Having these three things set up and running will make your life a lot easier the next time you face a performance emergency with your SQL Server.

Sure, there are a lot of other things that could go wrong, but these are the most common areas I see clients having problems in.

If you need to do a post mortem on an issue, you might want to look at:

  • sp_LogHunter: Dig through your error logs for high severity issues
  • sp_HealthParser: Dig through the system health extended event for a wide variety of issues
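
Both of those should run fine with their defaults if you just want a quick look:

EXEC dbo.sp_LogHunter;

EXEC dbo.sp_HealthParser;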

And as always, if you need help working through these kinds of problems, hit the link below to get some consulting help from me.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

A Little About Select Queries That Cause Writes In SQL Server



Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Why I Love NOLOCK Hints In SQL Server (Video Edition)

Oh, Just Do it.


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

How To Get SQL Server Triggers To Selectively Fire

Big Noise


I was helping a client with an issue recently where they wanted to allow certain admin users to override changes currently in a table, but not allow anyone else to make changes.

The thing is, users had to be allowed to make other changes to the table, so it wasn’t something that could be handled easily with security features.

The example I’m going to show here is simplified a bit to get the code across to you, so keep that in mind.

Now, at first I was thinking about using some fancier features that SQL Server has to offer:

  • CONTEXT_INFO
  • sp_getapplock

But after talking it over with the Most Valuable Kiwi, those ideas quickly lost their luster. They would both require additional handling in case of errors, and sp_getapplock could have issues under concurrency.

What turned out to be the best idea was actually the simplest: create a temp table with a very random name (in this case, running SELECT NEWID() and removing the dashes from the result) and have the trigger look for it to decide whether or not it should allow the update.
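
If you want to generate your own hard-to-guess marker name, something like this will do it:

SELECT
    random_name = N'#' + REPLACE(CONVERT(nvarchar(36), NEWID()), N'-', N'');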

See below for the demo code.

Thanks for reading!

USE Crap;
GO 

/*Out ye devil*/
DROP TABLE IF EXISTS
    dbo.the_table;

/*Okay you can come back*/
CREATE TABLE
    dbo.the_table
(
    id int PRIMARY KEY IDENTITY,
    dt datetime NOT NULL DEFAULT SYSDATETIME()
);
GO 

/*One row is fine.*/
INSERT
    dbo.the_table
(
    dt
)
VALUES
(
    DEFAULT
);
GO 

/*Check me out*/
SELECT
    tt.*
FROM dbo.the_table AS tt;
GO 

/*
The trigger, and why:

Other options like SESSION_CONTEXT and sp_getapplock
require a whole bunch of other handling and can be weird
with connection pooling and under high concurrency
 * Clearing out session context when done
 * Error handling for sp_getapplock to release locks
 * A whole list of other stuff!

*/
CREATE OR ALTER TRIGGER
    dbo.the_trigger
ON dbo.the_table
INSTEAD OF UPDATE
AS
BEGIN
    IF 
    (
           ROWCOUNT_BIG() = 0
        OR OBJECT_ID(N'tempdb..#B7E66DC66D9C4C4182ECCF583D126677') IS NULL
    )
    BEGIN
        RAISERROR('Only admins can update that column!', 11, 1) WITH NOWAIT;
        RETURN;
    END;
    ELSE
    BEGIN
        SET NOCOUNT ON;

        IF UPDATE(dt)
        BEGIN
            UPDATE t
              SET t.dt = i.dt
            FROM dbo.the_table AS t
            JOIN Inserted AS i
              ON i.id = t.id;
        END
    END;
END;
GO 

/*
Create a #temp table for the trigger to pick up on
*/
CREATE OR ALTER PROCEDURE
    dbo.the_admin_procedure
(
    @id int,
    @dt datetime
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    CREATE TABLE
        #B7E66DC66D9C4C4182ECCF583D126677
    (
        the_crumb bit
    );

    UPDATE t
        SET t.dt = @dt
    FROM dbo.the_table AS t
    WHERE t.id = @id;
END;
GO 

/*
Non-admin users don't get the temp table
*/
CREATE OR ALTER PROCEDURE
    dbo.the_normal_procedure
(
    @id int,    
    @dt datetime
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;
    
    UPDATE t
        SET t.dt = @dt
    FROM dbo.the_table AS t
    WHERE t.id = @id;
END;
GO 

/*
In real life, you might be able to figure it out based on login names
*/
CREATE OR ALTER PROCEDURE
    dbo.the_real_procedure
(
    @id int,    
    @dt datetime
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    IF SUSER_SNAME() = 'sister_minnie'
    BEGIN
        CREATE TABLE
            #B7E66DC66D9C4C4182ECCF583D126677
        (
            the_crumb bit
        );        
    END;
    
    UPDATE t
        SET t.dt = @dt
    FROM dbo.the_table AS t
    WHERE t.id = @id;
END;
GO 


/*
Show and tell
*/
EXEC dbo.the_normal_procedure 
    @id = 1,
    @dt = '19000101';

EXEC dbo.the_admin_procedure 
    @id = 1,
    @dt = '99991231';

SELECT
    tt.*
FROM dbo.the_table AS tt;


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.