Why Selects That Cause Writes Can Mean Performance Trouble In SQL Server

Answer Time


While answering a question on dba.se, I got to thinking about if there would be a good way to detect SELECT queries that cause writes.

In newer versions of SQL Server, sys.dm_exec_query_stats has columns that show you spills.

That’s a pretty good start, but what about other kinds of writes, like the ones outlined in the Q&A I linked to?

So uh, I wrote this script to find them.

Downsides


The downsides here are that it’s looking at the plan cache, so I can’t show you which operator is spilling. You’ll have to figure that out on your own.

The source of the writes may be something else, too. It could be a spool, or a stats update, etc. That’s why I tried to set the spill size (1024.) kind of high, to not detect trivial writes.

You may be able to loosely correlate spills to IO_COMPLETION or SLEEP_TASK waits.

Thanks for reading!

WITH 
XMLNAMESPACES 
    ('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS x),
writes AS
(
    SELECT TOP (100)
        deqs.statement_start_offset,
        deqs.statement_end_offset,
        deqs.plan_handle,
        deqs.creation_time,
        deqs.last_execution_time,
        deqs.total_logical_writes,
        deqs.last_logical_writes,
        deqs.min_logical_writes,
        deqs.max_logical_writes,
        deqs.query_hash,
        deqs.query_plan_hash
    FROM sys.dm_exec_query_stats AS deqs
    WHERE deqs.min_logical_writes > 1024.
    ORDER BY deqs.min_logical_writes DESC
),
plans AS
(
    SELECT DISTINCT
        w.plan_handle,
        w.statement_start_offset,
        w.statement_end_offset,
        w.creation_time,
        w.last_execution_time,
        w.total_logical_writes,
        w.last_logical_writes,
        w.min_logical_writes,
        w.max_logical_writes
    FROM writes AS w
    CROSS APPLY sys.dm_exec_query_plan(w.plan_handle) AS deqp
    CROSS APPLY deqp.query_plan.nodes('//x:StmtSimple') AS s(c)
    WHERE deqp.dbid > 4
    AND   s.c.value('@StatementType', 'VARCHAR(100)') = 'SELECT'
    AND   NOT EXISTS 
          (   
              SELECT      
                  1/0 --If nothing comes up, quote out the NOT EXISTS. 
              FROM writes AS w2
              CROSS APPLY deqp.query_plan.nodes('//x:StmtSimple') AS s2(c)
              WHERE w2.query_hash = w.query_hash
              AND   w2.query_plan_hash = w.query_plan_hash
              AND   s2.c.value('@StatementType', 'VARCHAR(100)') <> 'SELECT' 
          )
)
SELECT      
    p.creation_time,
    p.last_execution_time,
    p.total_logical_writes,
    p.last_logical_writes,
    p.min_logical_writes,
    p.max_logical_writes,
    text = 
        SUBSTRING
        (
            dest.text, 
        	( p.statement_start_offset / 2 ) + 1,
            (( 
        	    CASE p.statement_end_offset 
        		     WHEN -1 
        			 THEN DATALENGTH(dest.text) 
        	         ELSE p.statement_end_offset 
                END - p.statement_start_offset 
        	  ) / 2 ) + 1
        ),
    deqp.query_plan
FROM plans AS p
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) AS dest
CROSS APPLY sys.dm_exec_query_plan(p.plan_handle) AS deqp
ORDER BY p.min_logical_writes DESC
OPTION ( RECOMPILE );

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Do Query Plans With Multiple Spool Operators Share Data In SQL Server?

Spoolwork


I wanted to show you two situations with two different kinds of spools, and how they differ with the amount of work they do.

I’ll also show you how you can tell the difference between the two.

Two For The Price Of Two


I’ve got a couple queries. One generates a single Eager Index Spool, and the other generates two.

    SELECT TOP (1) 
            u.DisplayName,
            (SELECT COUNT_BIG(*) 
             FROM dbo.Badges AS b 
             WHERE b.UserId = u.Id 
             AND u.LastAccessDate >= b.Date) AS [Whatever],
            (SELECT COUNT_BIG(*) 
             FROM dbo.Badges AS b 
             WHERE b.UserId = u.Id) AS [Total Badges]
    FROM dbo.Users AS u
    ORDER BY [Total Badges] DESC;
    GO 

    SELECT TOP (1) 
            u.DisplayName,
            (SELECT COUNT_BIG(*) 
             FROM dbo.Badges AS b 
             WHERE b.UserId = u.Id 
             AND u.LastAccessDate >= b.Date ) AS [Whatever],
            (SELECT COUNT_BIG(*) 
             FROM dbo.Badges AS b 
             WHERE b.UserId = u.Id 
             AND u.LastAccessDate >= b.Date) AS [Whatever],
            (SELECT COUNT_BIG(*) 
             FROM dbo.Badges AS b 
             WHERE b.UserId = u.Id) AS [Total Badges]
    FROM dbo.Users AS u
    ORDER BY [Total Badges] DESC;
    GO

The important part of the plans are here:

SQL Server Query Plan
Uno
SQL Server Query Plan
Dos

The important thing to note here is that both index spools have the same definition.

The two COUNT(*) subqueries have identical logic and definitions.

Fire Sale


The other type of plan is a delete, but with a different number of indexes.

/*Add these first*/
CREATE INDEX ix_whatever1 ON dbo.Posts(OwnerUserId);
CREATE INDEX ix_whatever2 ON dbo.Posts(OwnerUserId);
/*Add these next*/
CREATE INDEX ix_whatever3 ON dbo.Posts(OwnerUserId);
CREATE INDEX ix_whatever4 ON dbo.Posts(OwnerUserId);

BEGIN TRAN
DELETE p
FROM dbo.Posts AS p
WHERE p.OwnerUserId = 22656
ROLLBACK
SQL Server Query Plan
With two indexes
SQL Server Query Plan
With four indexes

Differences?


Using Extended Events to track batch completion, we can look at how many writes each of these queries will do.

For more on that, check out the Stack Exchange Q&A.

The outcome is pretty interesting!

SQL Server Extended Events

  • The select query with two spools does twice as many reads (and generally twice as much work) as the query with one spool
  • The delete query with four spools does identical writes as the one with two spools, but more work overall (twice as many indexes need maintenance)

Looking at the details of each select query, we can surmise that the two eager index spools were populated and read from separately.

In other words, we created two indexes while this query ran.

For the delete queries, we can surmise that a single spool was populated, and read from either two or four times (depending on the number of indexes that need maintenance).

Another way to look at it, is that in the select query plans, each spool has a child operator (the clustered index scan of Badges). In the delete plans, three of the spool operators have no child operator. Only one does, which signals that it was populated and reused (for Halloween protection).

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Implicit vs Explicit Spools In SQL Server Query Plans

Spools Of A Feather


There are plenty of spools in query plans, and they’re all pretty well labeled.

  • Index
  • Table
  • Rowcount
  • Window

They can be either eager or lazy.

An eager spool will take all the rows at once and cache them, and a lazy spool will only go get rows as needed.

But what else can act like a spool?

Phases On


In general, a blocking operator, or as my upside down friend Paul calls them, “phase separators” can act as a spool.

A spool, after all, is just something that keeps track of some rows, which is exactly what a Sort or a Hash do.

They keep track of rows that arrive, and either sort them according to a need, or create a hash table of the value.

While either of these happen, any downstream work in the query have to wait for them to complete. This is why they’re called blocking, stop and go, or, more eloquently, phase separators.

Eager spools have the same basic feature: wait for all the rows from downstream to arrive, and perform an action (or just feed them to another operator).

Here’s an example of a Sort acting as a spool:

DROP TABLE IF EXISTS #waypops;

CREATE TABLE #waypops
(
  UserId INT
  , PRIMARY KEY NONCLUSTERED (UserId) WITH (IGNORE_DUP_KEY = ON)
);

INSERT #waypops WITH(TABLOCKX) 
       ( UserId)
SELECT b.UserId
FROM dbo.Badges AS b
WHERE b.Name = N'Popular Question';
A SQL Server query plan
Enjoy the silence

The Sort is in the same order as the index it’s reading from, but just reading from the index wouldn’t provide any separation.

Just Passing By


This is weird, niche stuff. That’s why I’m posting it on a Friday. That, and I wanna bully someone into writing about using a hash join to do the same thing.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.