SQL Server Extended Event Duration Filtering Can Make Troubleshooting Frustrating

Shortness, Shortness


I use sp_HumanEvents to troubleshoot all sorts of SQL Server issues.

Most frequently, I used it to troubleshoot stored procedure performance like this:

EXEC sp_HumanEvents
    @event_type = 'query',                   
    @query_duration_ms = 5000,               
    @session_id = N'60',                    
    @keep_alive = 1,
    @debug = 1;

The session it creates will have this definition:

CREATE EVENT SESSION keeper_HumanEvents_query
    ON SERVER  
      ADD EVENT sqlserver.module_end 
        (SET collect_statement = 1
         ACTION (sqlserver.database_name, sqlserver.sql_text, sqlserver.plan_handle, sqlserver.query_hash_signed, sqlserver.query_plan_hash_signed)
         WHERE ( 
            sqlserver.is_system = 0 
     AND duration >= 5000000
     AND sqlserver.session_id = 60
 )),
      ADD EVENT sqlserver.rpc_completed 
        (SET collect_statement = 1
         ACTION(sqlserver.database_name, sqlserver.sql_text, sqlserver.plan_handle, sqlserver.query_hash_signed, sqlserver.query_plan_hash_signed)
         WHERE ( 
            sqlserver.is_system = 0 
     AND duration >= 5000000
     AND sqlserver.session_id = 60
 )),
      ADD EVENT sqlserver.sp_statement_completed 
        (SET collect_object_name = 1, collect_statement = 1
         ACTION(sqlserver.database_name, sqlserver.sql_text, sqlserver.plan_handle, sqlserver.query_hash_signed, sqlserver.query_plan_hash_signed) 
         WHERE ( 
            sqlserver.is_system = 0 
     AND duration >= 5000000
     AND sqlserver.session_id = 60
 )),
      ADD EVENT sqlserver.sql_statement_completed 
        (SET collect_statement = 1
         ACTION(sqlserver.database_name, sqlserver.sql_text, sqlserver.plan_handle, sqlserver.query_hash_signed, sqlserver.query_plan_hash_signed)
         WHERE ( 
            sqlserver.is_system = 0 
     AND duration >= 5000000
     AND sqlserver.session_id = 60
 )),
      ADD EVENT sqlserver.query_post_execution_showplan
        (
         ACTION(sqlserver.database_name, sqlserver.sql_text, sqlserver.plan_handle, sqlserver.query_hash_signed, sqlserver.query_plan_hash_signed)
         WHERE ( 
            sqlserver.is_system = 0 
     AND duration >= 5000000
     AND sqlserver.session_id = 60
 ))

I collect:

  • module_end
  • rpc_completed
  • sp_statement_completed
  • sql_statement_completed
  • query_post_execution_showplan

Not all of them are relevant to stored procedure calls, but in larger contexts where I have no idea where long running queries might be coming from, it’s useful to get all these.

Weirdness, Weirdness


Where I find things getting somewhat annoying is when things start showing up in there that meet the duration filter, but don’t really give me anything further to go on.

To simulate what I mean, I’m going to use this stored procedure:

CREATE OR ALTER PROCEDURE
    dbo.Eventually
AS
BEGIN

    SET NOCOUNT, XACT_ABORT ON;
    
    WAITFOR DELAY '00:00:01.000';
    SELECT TOP (1) * FROM dbo.Badges AS b;
    WAITFOR DELAY '00:00:01.000';
    SELECT TOP (1) * FROM dbo.Comments AS c;
    WAITFOR DELAY '00:00:01.000';
    SELECT TOP (1) * FROM dbo.Posts AS p;
    WAITFOR DELAY '00:00:01.000';
    SELECT TOP (1) * FROM dbo.Users AS u;
    WAITFOR DELAY '00:00:01.000';
    SELECT TOP (1) * FROM dbo.Votes AS v;
    WAITFOR DELAY '00:00:01.000';

END;
GO 

EXEC dbo.Eventually;

There are six waitfor commands that each pause for 1 second. In between them are queries that finish in milliseconds.

If I watch the event output, eventually, I’ll see this:

SQL Server Extended Events
lied to me, cried to me

Okay, so the stored procedure took more than 5 seconds, but… no individual query took more than 5 seconds.

To troubleshoot further, I have to set the duration bar even lower, and then figure out what I can meaningfully tune.

  • Do I have one query that takes four seconds
  • Do I have ten queries that take 500 milliseconds
  • Do I have 500 queries that take ten milliseconds

I see this behavior quite a bit with queries that loop/cursor over tons of data. They might take a whole bunch of rows and do a lot of really fast queries over them, but the time adds up.

There’s not really a good solution for this, either. The closer you look, by reducing the duration filter to lower and lower values, the more you return, the more overhead you cause, and the longer you have to wait for things to finish.

Sure, you can wring someone’s neck about not “thinking in sets”, but that doesn’t help when the processing takes place by passing looped values to stored procedures, unless you have the freedom to make significant app changes by using table valued parameters or something instead.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount on to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly.

SQL Server Community Tools: The Wrap Up And Combined Link

I am a heading


Over the past month (plus or minus a couple days), I’ve shown you in a series of quick posts how I use different SQL Server Community Tools that are free and open source to troubleshoot SQL Server issues.

Here’s the full list of posts.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount on to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly.

SQL Server Community Tools: Capturing Query Wait Stats With sp_HumanEvents

Paladin


I have sort of a love/hate relationship with wait stats scripts and analysis. Sometimes they’re great to correlate with larger performance problems or trends, and other times they’re totally useless.

When you’re looking at wait stats, some important things to figure out are:

  • How much of a wait happened compared to uptime
  • If the waits lasted a long time on average

And you can do that out of the box with SQL Server. What you can’t get are two very important things:

  • When the waits happened
  • Which queries caused the waits

This stuff is vitally important for figuring out if wait stats are benign overall to the workload.

For example, let’s say your server has been up for 100 hours, and you spent 50 hours waiting on PAGEIOLATCH_SH waits. Normally I’d be pretty worried about that, and I’d be looking at if the server has enough memory, if queries are asking for big memory grants, if important queries are missing any indexes, etc.

But if we knew that all 50 of those hours were outside of normal use activity, and maybe even happened in a separate database for warehousing or archiving off data, we might be able to ignore it and focus on other portions of the workload.

Dorking


With sp_HumanEvents, you can get all of those things!

EXEC sp_HumanEvents 
    @event_type = 'waits',
    @seconds_sample = 60;

When this finishes running, you’ll get three results back:

  • Overall wait stats for the period of time
  • Wait stats broken down by database for the period of time
  • Wait stats broken down by database and query for the period of time

And because I don’t want to leave you hanging, you’ll also get details about the waits themselves, like

  • How much of a wait happened compared to sampled time
  • How long the waits lasted on average in the sampled time

If you need to figure out which queries are causing wait stats that you’re worried about, this is a great way to get started with that investigation.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount on to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly.

SQL Server Community Tools: Capturing Which Queries Are Recompiling And Why With sp_HumanEvents

Classic Espionage


Like query compilations, query recompilations can be annoying. The bigger difference is that even occasional recompiles can introduce a bad query plan.

If your monitoring tools or scripts are warning you about high compiles or recompiles, sp_HumanEvents can help you dig in further.

We talked about compilations yesterday (and, heck, maybe I should have added that point in there, but hey), so today we’ll talk about recompilations.

There are a lot of reasons why a query might recompile:

  • Schema changed
  • Statistics changed
  • Deferred compile
  • Set option change
  • Temp table changed
  • Remote rowset changed
  • For browse permissions changed
  • Query notification environment changed
  • PartitionView changed
  • Cursor options changed
  • Option (recompile) requested
  • Parameterized plan flushed
  • Test plan linearization
  • Plan affecting database version changed
  • Query Store plan forcing policy changed
  • Query Store plan forcing failed
  • Query Store missing the plan
  • Interleaved execution required recompilation
  • Not a recompile
  • Multi-plan statement required compilation of alternative query plan
  • Query Store hints changed
  • Query Store hints application failed
  • Query Store recompiling to capture cursor query
  • Recompiling to clean up the multiplan dispatcher plan

That list is from SQL Server 2022, so there are going to be some listed here that you might not see just yet.

But let’s face it, the reasons you’re gonna see most often is probably

  • Schema changed
  • Statistics changed
  • Temp table changed
  • Option (recompile) requested

Mad Dog


To capture which queries are recompiling in a certain window, I’ll usually do something like this:

EXEC sp_HumanEvents 
    @event_type = 'recompiles',
    @seconds_sample = 30;

Sometimes recompiles can be good:

  • Schema changed: use a new index that suits the query better
  • Statistics changed: use newer statistics that more accurately reflect column data
  • Temp table changed: use a new histogram for a temp table more relevant to the query
  • Option (recompile) requested: burn it flat, salt the earth

But of course, there’s always an element of danger, danger when a query starts using a new plan. What if it sucks?

To cut down on recompiles, you can use this stuff:

  • Plan Guides
  • Query Store forced plans
  • Keep Plan/KeepFixed Plan query hints
  • Stop using recompile hints?

One thing that can be a pretty big bummer about recompiles is that, if you’re relying solely on the plan cache to find problem queries, they can leave you with very little (or zero) evidence about what queries are getting up to.

Query Store and some monitoring tools will capture them, so you’re better off using those for more in-depth analysis.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount on to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly.

SQL Server Community Tools: Capturing Which Queries Are Compiling With sp_HumanEvents

Compilation Game


One thing that I have to recommend to clients on a fairly regular basis is to enable Forced Parameterization. Many vendor applications send over queries that aren’t parameterized, or without strongly typed parameters, and that can make things… awkward.

Every time SQL Server gets one of those queries, it’ll come up with a “new” execution plan, cache it, and blah blah blah. That’s usually not ideal for a lot of reasons.

There are potentially less tedious ways to figure out which queries are causing problems, by looking in the plan cache or query store.

But, you know, sometimes the plan cache isn’t reliable, and sometimes Query Store isn’t turned on.

And so we have sp_HumanEvents!

Easy Street


One way to start getting a feel for which queries are compiling the most, along with some other details about compilation metrics and parameterization is to do this:

EXEC sp_HumanEvents 
    @event_type = 'compilations',
    @seconds_sample = 30;

Newer versions of SQL Server have an event called query_parameterization_data.

Fired on compile for every query relevant for determining if forced parameterization would be useful for this database.

If you start monitoring compilations with sp_HumanEvents you’ll get details from this event back as well, as long as it’s available in your version of SQL Server.

You can find all sorts of tricky application problems with this event setup.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount on to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly.

SQL Server Community Tools: Capturing Query Performance Problems With sp_HumanEvents

Cumulative Noopdate


In yesterday’s post, I talked through how I capture blocking using sp_HumanEvents. Today I’m going to talk about a couple different ways I use it to capture query performance issues.

One thing I want to stress is that you shouldn’t use yesterday’s technique to gather query performance issues. One thing sp_HumanEvents does is capture actual execution plans, and that can really bog a server down if it’s busy.

I tend to use it for short periods of time, or for very targeted data collection against a single stored procedure or session id running things.

I’ve occasionally toyed with the idea of adding a flag to not get query plans, or to use a different event to get them.

I just don’t think there’s enough value in that to be worthwhile since the actual execution plan has so many important details that other copies do not.

So anyway, let’s do a thing.

Whole Hog


You can totally use sp_HumanEvents to grab absolutely everything going on like this:

EXEC sp_HumanEvents 
    @event_type = 'query', 
    @query_duration_ms = 5000, 
    @seconds_sample = 20;

You may need to do this in some cases when you’re first getting to know a server and need to get a feeling for what’s going on. This will show you any query that takes 5 seconds or longer in the 20 second window the session is alive for.

If you’re on a really busy server, it can help to cut down on how much you’re pulling in:

EXEC sp_HumanEvents 
    @event_type = 'query', 
    @query_duration_ms = 5000, 
    @seconds_sample = 20,
    @sample_divisor = '5';

This will only pull in data from sessions if their spid is divisible by 5. The busier your server is, the weirder you might want to make this number, like 15/17/19 or something.

Belly


Much more common for me is to be on a development server, and want to watch my spid as I execute some code:

EXEC sp_HumanEvents
    @event_type = 'query',                   
    @query_duration_ms = 10000,               
    @session_id = N'58',                    
    @keep_alive = 1;

This is especially useful if you’re running a big long stored procedure that calls a bunch of other stored procedures, and you want to find all the statements that take a long time without grabbing every single query plan.

If you’ve ever turned on Actual Execution Plans and tried to do this, you might still be waiting for SSMS to become responsive again. It’s really painful.

By only grabbing query details for things that run a long time, you cut out all the little noisy queries that you can’t really tune.

I absolutely adore this, because it lets me focus in on just the parts that take a long time.

Shoulder


One pretty common scenario is for clients to give me a list of stored procedures to fix. If they don’t have a monitoring tool, it can be a challenge to understand things like:

  • How users call the stored procedure normally
  • If the problem is parameter sniffing
  • Which queries in the stored procedure cause the biggest problems

We can do that like so:

EXEC sp_HumanEvents 
    @event_type = 'query', 
    @query_duration_ms = 5000, 
    @keep_alive = 1,
    @object_schema = 'dbo',
    @object_name = 'TheWorstStoredProcedureEverWritten';

This will only collect sessions executing a single procedure. I’ll sometimes do this and work through the list.

Hoof


There are some slight differences in how I call the procedure in different circumstances.

  • When I use the @seconds_sample parameter, sp_HumanEvents will run for that amount of time and then spit out a result
  • When I use the @keep_alive parameter, all that happens is a session gets created and you need to go watch live data like this:
SQL Server Extended Events
viewme

Just make sure you do that before you start running your query, or you might miss something important.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount on to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly.

SQL Server Community Tools: Capturing Blocking With sp_HumanEvents

Month Of Sundays


A while back, Brent blogged about using September to raise awareness about free SQL Server community tools.

I thought this was a great idea because I maintain a number of them: my own scripts, the First Responder Kit scripts, and sp_WhoIsActive.

Unlike most people who left links in the comments, I read the entire post and decided to use the whole darn month to write about scripts I maintain and how I use them in my work.

Lest I be accused of not being able to read a calendar, I know that these are dropping a little earlier than the 1st of September. I do apologize for September not starting on a Monday.

There are other great tools and utilities out there, like Andy Mallon’s DBA Utility Database, but I don’t use them enough personally to be able to write about them fluently.

My goal here is to help you use each script with more confidence and understanding. Or even any confidence and understanding, if none existed beforehand.

Oral Board


First up is (well, I think) my most ambitious script: sp_HumanEvents. If you’re wondering why I think it’s so ambitious, it’s because the goal is to make Extended Events usable by Humans.

At present, that’s around 4000 lines of T-SQL. Now, sp_HumanEvents can do a lot of stuff, including logging event data to tables for a bunch of different potential performance issues.

When I was first writing this thing, I wanted it to be able to capture data in a few different ways to fit different monitoring scenarios. In this post, I’m going to show you how I most often use it on systems that have are currently experiencing blocking.

First, you need to have the Blocked Process Report enabled, which is under advanced options:

EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure 'blocked process threshold', 5;
RECONFIGURE;

If you want to flip the advanced options setting back, you can. I usually leave it set to 1.

The second command turns on the blocked process report, and tells SQL Server to log any instances of blocking going on for 5 or more seconds. You can adjust that to meet your concerns with blocking duration, but I wouldn’t set it too low because there will be overhead, like with any type of monitoring.

Blockeroos


The way I usually set up to look at blocking that’s currently happening on a system — which is what I most often have to do — is to set up a semi-permanent session and watch what data comes in.

When I want to parse that data, I use sp_HumanEventsBlockViewer to do that. At first, I just want to see what kind of stuff starts coming in.

To set that session up, here’s what I do:

EXEC sp_HumanEvents
    @event_type = 'blocking',                   
    @keep_alive = 1;

What this will do is set up an Extended Event session to capture blocking from the Blocked Process Report. That’s it.

From there, you can either use my script (linked above), or watch data coming in via the GUI. Usually I watch the GUI until there’s some stuff in there to gawk at:

EXEC dbo.sp_HumanEventsBlockViewer
    @session_name = 'keeper_HumanEvents_blocking'

Blockerinos


Once you have the blocking collected, troubleshooting it is a whole… Fun… thing.

  • Misguided use of transactions
  • Misguided use of isolation levels
  • Misguided use of locking hints

Are common reasons behind blocking, but not always. Common solutions are things like:

  • Making sure foreign keys have supporting indexes, especially ones with cascading actions
  • Getting angry at Triggers and throwing your computer out the window
  • Making sure modification queries have adequate supporting indexes
  • Batching modifications on large chunks of data
  • Enabling an optimistic isolation level because Read Committed Is A Garbage Isolation Level

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount on to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly.

Updates To sp_HumanEvents

Touchy, Touchy


I’ve made some slight tweaks to sp_HumanEvents.

One of them fixes a bug with view creation when you use a custom schema. The rest are minor tweaks and fixes that you (hopefully won’t notice one bit), but make me feel a little bit better about the state of the code.

You can learn how to use the proc, and download the latest version over here.

I took a look through SQL Server 2022’s new Extended Events, and haven’t seen anything in there that would warrant the amount of work it takes to add something new just yet. Maybe at some point, I suppose.

Some folks had asked about adding lightweight query profiling etc. as an option, but it still doesn’t return the kind of details that make it useful for query tuning.

I mean, I guess if you believe in memes it’s okay, but… Maybe you should stop that.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount on to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Updates To sp_HumanEvents, sp_QuickieStore, And More!

For Free!


Here’s some stuff I’ve been working on lately:

sp_HumanEvents:

  • Added the recompile_cause column to the output of the recompile event
  • Added the wait_resource column to the blocking event output
  • Fixed a bug around table creation
  • MsSQLServerDBA fixed an issue where GO statements were preventing deploying with DBA Tools
  • Remove a recompile hint that was leading to high tempdb allocations (more on that tomorrow)

sp_QuickieStore:

  • Fixed a bug where date filtering was messing up query results

Enjoy, and as always, please let me know if there’s anything I can do to improve the scripts usability, or any issues you encounter while using them.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount on to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

How To Use sp_HumanEvents To Troubleshoot A Slow Stored Procedure In SQL Server

Number One Record


You’ve got a stored procedure that runs slowly, but the problem is that it’s… long. It’s really hard to tell which part is slow, sometimes.

With shorter procs you can probably just collect actual execution plans and slam F5 like a tall glass of gin at 6am.

But you don’t wanna do that with the larger procs, for a few practical reasons:

  • Lots of little queries run quickly, and we don’t care about those
  • Navigating through lots of plans in SSMS is tedious
  • There’s no differentiation when other procedures, etc. are invoked
  • You introduce a lot of overhead retrieving and rendering all those plans
  • The full query text might not be captured, which is a limitation in many places

Let’s save the day with sp_HumanEvents, my stored procedure that makes using Extended Events really easy.

Wanna Ride


There are a lot of issues you can run into with Extended Events. They’re rather unpleasant, and there’s almost zero guidance from Microsoft about usage.

Wouldn’t it be nice to just hit F5?

EXEC sp_HumanEvents
    @event_type = 'query',                   
    @query_duration_ms = 1000,               
    @session_id = N'58',                    
    @keep_alive = 1;

Let’s break down what this does:

  • The events we want to focus on are for executed queries
  • We only care about queries that run for more than one second
  • We’re going to focus in on the session id for the query window we’re tuning in
  • We want the session to stay running so we can watch and re-watch after we make changes

I know it looks a little funny that @session_id is a string, but that’s because there are some sampling capabilities if you want to look at a workload instead of a single proc without overwhelming a server by collecting everything.

Once that’s done, you’ll have a new Extended Event session like this, and you’ll wanna watch live data from it:

SQL Server Extended Events
big bux

Watch Out Now


Once you’ve got that window open and you run your proc, you’ll see any queries that meet the duration criteria, and you should see something that looks like this.

SQL Server Extended Events
good for you

 

The procedure that I’m looking at in here is sp_BlitzCache, because it’s a good example of a proc with a lot of queries in it, where only some of them (like the XML parsing) might get slowed down.

Some notes on the output:

  • The statement doesn’t get collected with the query plan (more on that in a second)
  • INSERT…EXEC shows up as two statements (lines 2 and 3 over there)

You only see these two columns at first, but you can go through and add in any other columns that you find useful from each specific session type.

Two Point Two


The output can be a little confusing at first. Generally the pattern is query plan, then statement associated with it.

If you click on each event, you can choose different elements from it to show in the main result window.

SQL Server Extended Events
like you

Here’s how I usually set things up to find what I wanna go after:

SQL Server Extended Events
get it

I like to grab CPU, duration, information about requested, granted, and used memory, and the statement.

Not every element is available in every event. Where things are NULL, that’s where they’re not available. I wish there were a general extended event that captured everything I need in one go, but that doesn’t seem to exist.

One last point on the output is that if you click on the query_post_execution_showplan lines, the query plan is available in the second result set:

SQL Server Query Plan
one time

Crossover


Before you ask, the lightweight profiling events are useless. They’re so nerfed and collect so little helpful detail that you might as well be looking at a cached plan or Query Store plan.

Lightweight, baby 🙄

To that point, I don’t recommend running something like this across a whole workload. Though Extended Events are purported to have far less observer overhead than Profiler, I’ve seen even heavily-filtered sessions like this slow workloads down quite a bit.

That goes for other traces and forms of monitoring as well. Don’t think this is a special case.

Anyway, this is how I work when I’m trying to tune things for clients. Hopefully you find it useful as well.

Grab sp_HumanEvents and lemme know how it goes on GitHub.

Thanks for reading.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount on to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.