
Perks for My SQLPASS Summit Precon: Prevent ‘Fake News’ in Your Data! Isolation Levels Demystified

I’m proud, giddy, and just plain overly excited to be giving a pre-conference session at the PASS Summit in Seattle, Washington on November 5, 2018.

This is going to be a fun, demo-packed day which will teach you skills you can use throughout your career as a developer, database administrator, or data professional.

Why attend?

While features and tricks change from year to year (or even month to month), understanding isolation levels is critical for your long-term future.

The Session: Prevent ‘Fake News’ in Your Data! Isolation Levels Demystified

Are your users seeing bad data? If you’re using the default isolation level of read committed in SQL Server, chances are that sometimes your users get incorrect results. If NOLOCK hints lurk in your code, the odds are even higher that sometimes your customers see information that just isn’t right.

In this day-long, demo-packed session, you’ll learn why a single statement may read rows twice, miss rows entirely, or return combinations of data that never existed in the database — and why that’s not a bug. You’ll learn what “read phenomena” are, which isolation levels are vulnerable to them, and the performance trade-offs which come from raising your isolation level to protect your users from bad data.

You’ll see how isolation levels work with newer technologies such as columnstore indexes, In-Memory OLTP, and Always On Availability Groups. You’ll discover why version-based isolation levels can be awesome, and what you need to look out for with these isolation levels to avoid race conditions that produce– you guessed it– incorrect results.

At the end of the day, we’ll pull together all this information into a guide. You’ll leave the seminar with the tools and knowledge to choose the right isolation levels for new and existing applications based on business and performance requirements.

What Does it Cost?

A single pre-conference session is $499. You can also bundle pre-conference sessions with registration at the full Summit.

The Perks

Attendees at my pre-conference session will not only get all the demos we show live, and a download of the PDF of slides, but also….

Perk: Watch It Again Later

Access to a recorded online session of my 8 hour seminar, “Conquer Blocking & Isolation Levels” for a full year.

Want to revisit a topic and get the nitty gritty details? You’ll be able to watch anytime, online.

Note: the live pre-conference sessions at PASS aren’t recorded for purchase anymore– however, you can buy recordings of “regular” sessions during the week from PASS in different formats.

Perk: A Month of SQLChallenges!

For four weeks following the conference, I’ll be hosting SQLChallenges on blocking and isolation level problems.

  • Attendees will get access to the problems and a chance to apply the skills they’ve learned.
  • I’ll hold a live, online session going over sample solutions for the problem. Attend, share your solution in Slack if you’d like, and ask questions.
  • Can’t make the SQLChallenge live? The recording will be posted and you’ll get online access for a year. All SQLChallenges support Q&A in the course pages.

Perk: Private Follow-Up Quizzes

Want even more learning? Following the SQLChallenges, attendees will get access to a series of four weekly quizzes about blocking and isolation levels to confirm their knowledge.

Combining the quizzes with the month of SQLChallenges will help cement your new knowledge, getting you the most value for your dollar.

Register here

I hope to see you at the PASS Summit! Sign up here to join my pre-conference session.

Postscript: SQL Symposium subscribers get access to all these perks as well. As sustaining members during the first year of SQLWorkbooks, SQL Symposium members get it all 🙂


The Case of the Rowgroup Deadlock in a Columnstore Index

I came across a fun deadlock when writing demos for my session on the Read Committed isolation level this week. (It’s OK to call it “fun” when it’s not production code, right?)

I was playing around with a nonclustered columnstore index on a disk-based table. Here’s what I was doing:

Session 1: this session repeatedly changed the value for a single row, back and forth. This put the changed row into the delta store.

Session 2: this session repeatedly counted the number of rows in the table, using the columnstore index.
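The original demo code isn’t included here, but roughly speaking, the two sessions looked something like this (the table, column, and index names are hypothetical stand-ins, not the actual demo objects):

/* Session 1: flip a single value back and forth so the changed row lands in the delta store */
UPDATE dbo.FactSale
SET Quantity = CASE Quantity WHEN 1 THEN 2 ELSE 1 END
WHERE FactSaleID = 1;
GO 5000

/* Session 2: repeatedly count the rows, using the nonclustered columnstore index */
SELECT COUNT(*)
FROM dbo.FactSale WITH (INDEX (NCCX_FactSale));
GO 5000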

With my sample data in this scenario, I found I frequently generated deadlocks.

Let’s look at the deadlock graph

I started up an Extended Events trace for the xml_deadlock_report event, and here’s what this deadlock looks like (with some annotations)…
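If you’d like to set up a similar trace yourself, here’s a minimal sketch (the session name and file path are placeholders, not the exact ones I used):

CREATE EVENT SESSION [capture_deadlocks] ON SERVER
ADD EVENT sqlserver.xml_deadlock_report
ADD TARGET package0.event_file (SET filename = N'S:\XEvents\capture_deadlocks.xel')
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS, STARTUP_STATE = OFF);
GO

ALTER EVENT SESSION [capture_deadlocks] ON SERVER STATE = START;
GO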


Breaking it down

The circle on the left is Session 2 – it was running a SELECT COUNT that used the nonclustered columnstore index. It was chosen as the deadlock victim.

Reading the arrows, the SELECT COUNT query:

  • Had a lock on the compressed rowgroup in the columnstore index
  • Wanted a shared lock on the b-tree delta store for the columnstore index to count the rows for data that had been changed

The circle on the right is Session 1 – it was running an update that changed the value for one column in one row. Reading the arrows, the UPDATE query:

  • Had an intent exclusive lock on the b-tree delta store
  • Wanted an intent exclusive lock on the compressed rowgroup (presumably to mark that the related row now lives in the delta store)

Voila– deadlock!

How do I fix it?

If I really want to churn changes into my columnstore index at the same time that I rapid-fire query the data, I may want to change my isolation level for the query counting the data.

Read committed snapshot isolation or snapshot isolation for the SELECT COUNT query can change this blocking scenario and sneak us right past those deadlocks — as long as enabling those is the right thing for my application and my database server.
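As a rough sketch, those two options look like this (the database and table names are placeholders):

/* Option 1: turn on read committed snapshot isolation for the whole database.
   This waits for other active connections in the database unless you add termination options. */
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;
GO

/* Option 2: allow snapshot isolation, then opt in for the counting query */
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;
GO

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
SELECT COUNT(*) FROM dbo.FactSale;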

The more clever we get, the more ways things can backfire

I’m not writing this post to bash columnstore indexes — far from it. Churning tons of updates into a columnstore index isn’t necessarily what you want to do.

Changes to indexes or data can cause deadlocks. They can also make deadlocks go away! That’s a fact of life, and as we add more indexing tools to our toolkits, we still gotta live with it.

Want to learn more about deadlocks?

Check out my course, Troubleshooting Blocking & Deadlocks for Beginners. It gives you example code to create a deadlock, and walks you through tracing and decoding the deadlock graph, too.


Using sp_WhoIsActive with Temp Tables to Limit Data Collected

The free sp_WhoIsActive procedure by Adam Machanic can be very useful for monitoring SQL Server: I’m a big fan.

But I’ve learned the hard way that it’s easy to collect too much information with sp_WhoIsActive, particularly if you set up jobs where you have the procedure log data to a table for later review. Collecting the text of SQL Server queries, their execution plans, and more, can take up a lot of room, and I’ve managed to fill up a server drive or two with a little over-eager collection. Whoops!

Sometimes it can be useful to store the results of sp_WhoIsActive in a temporary object, and review the contents before deciding whether or not to store the data permanently.

Or alternately, you might want to do this to programmatically review the results of sp_WhoIsActive at the current moment to check whether you want to fire off an alert based on different criteria.

Because of scoping issues, regular temp tables don’t really work well for this. But you can easily use either a global temporary table, or a short-lived table in a user database.

An example: sessions with tempdb_allocations > 0

Let’s say I want to periodically sample how many tempdb_allocations are being used by different sessions, and collect the query text and query plan if they are available. I only want to permanently keep data for sessions that have tempdb_allocations > 0.

I could do something like this…

Step 1) Create a permanent table to store the info I care about

The first step is to create a regular database table named dbo.WhoIsActive_tempdb, where we’ll store data for review:

/* This assumes I have a database named dba where I can store monitoring info. 
The code in this post doesn't delete or limit the data you put into this table, so make sure
you either clean it out, have plenty of space, or don't insert enough to cause you problems */

use dba;
GO

DECLARE @whoisactive_table VARCHAR(4000) ;
DECLARE @schema VARCHAR(4000) ;

SET @whoisactive_table = QUOTENAME ('dbo') + '.' + QUOTENAME ('WhoIsActive_tempdb')

EXEC sp_WhoIsActive
  @output_column_list = '[tempdb%][%]',
  @get_plans = 1,
  @return_schema = 1,
  @format_output = 0,
  @schema = @schema OUTPUT ;
SET @schema = REPLACE(@schema, '<table_name>', @whoisactive_table) ;

EXEC(@schema) ;

I used the @schema parameter to have sp_WhoIsActive generate the schema for the table itself. Full instructions on doing this by Adam are here.

Since I care about tempdb in the case of this example, I used @output_column_list to specify that those columns should come first, followed by the rest of the columns.

I also elected to set @get_plans to 1 to get query execution plans if they’re available. That’s not free, and they can take up a lot of room, but they can contain a lot of helpful info.

Step 2, Option 1) Sample sp_WhoIsActive into a global temp table, then insert only rows we care about into the permanent table

If we only want to store rows where tempdb_allocations > 0 in the permanent table, then one way to do this is to:

  • Create a temporary table each time we run sp_WhoIsActive
  • Insert the current sample into that table
  • Insert the rows we want to keep permanently into the ‘real’ table

Because of scoping reasons with dynamic SQL, this isn’t easy to do with a normal temporary table. One relatively easy workaround for that is to use a global temporary table. To reduce the possibility of name collisions (like if you have a job running this, and you accidentally run it yourself in a session at the same time), use a GUID in the name of the temporary table, like this:

DECLARE @whoisactive_table VARCHAR(4000) ;
DECLARE @schema VARCHAR(4000) ;
DECLARE @dsql NVARCHAR(4000) ;

SET @whoisactive_table = QUOTENAME ('##WhoIsActive_' + CAST(NEWID() as varchar(255)));

EXEC sp_WhoIsActive
  @output_column_list = '[tempdb%][%]',
  @get_plans = 1,
  @return_schema = 1,
  @format_output = 0,
  @schema = @schema OUTPUT ;
SET @schema = REPLACE(@schema, '<table_name>', @whoisactive_table) ;
PRINT @schema
EXEC(@schema) ;

EXEC sp_WhoIsActive
  @output_column_list = '[tempdb%][%]',
  @get_plans = 1,
  @format_output = 0,
  @destination_table = @whoisactive_table;

SET @dsql = N'
INSERT dbo.WhoIsActive_tempdb
SELECT *
FROM ' + @whoisactive_table + N' 
WHERE tempdb_allocations > 0
OPTION (RECOMPILE);'

EXEC sp_executesql @dsql;

SET @dsql = N'DROP TABLE ' + @whoisactive_table + N';'

EXEC sp_executesql @dsql;

In this example, I’m running sp_WhoIsActive with the same options I used before. I first use @schema to create a global temporary table with a name like [##WhoIsActive_E91F175D-C09A-44E7-98E7-10A18E038873].

I then select the rows from the temp table that have tempdb_allocations > 0, and insert them into dbo.WhoIsActive_tempdb.

But… what if we want to use less tempdb?

That’s a great question.

Let’s say that you’re firing this whole thing off at a time when tempdb is filling up fast, or maybe at a time when you might have allocation contention in tempdb. We can’t prevent our monitoring queries from using tempdb altogether, but maybe we don’t want to create temp tables.

No problem. You can use slightly different code to store sp_WhoIsActive results in a short-lived permanent table in the dba database.

Step 2, Option 2) Short-lived permanent table in a user database

The only real difference in this code from the second sample above is the “use dba” statement, and a change in the name of @whoisactive_table:

use dba;
GO

DECLARE @whoisactive_table VARCHAR(4000) ;
DECLARE @schema VARCHAR(4000) ;
DECLARE @dsql NVARCHAR(4000) ;

SET @whoisactive_table = QUOTENAME ('dbo') + '.' + QUOTENAME ('WhoIsActive_' + CAST(NEWID() as varchar(255)));

EXEC sp_WhoIsActive
  @output_column_list = '[tempdb%][%]',
  @get_plans = 1,
  @return_schema = 1,
  @format_output = 0,
  @schema = @schema OUTPUT ;
SET @schema = REPLACE(@schema, '<table_name>', @whoisactive_table) ;
PRINT @schema
EXEC(@schema) ;

EXEC sp_WhoIsActive
  @output_column_list = '[tempdb%][%]',
  @get_plans = 1,
  @format_output = 0,
  @destination_table = @whoisactive_table;

SET @dsql = N'
INSERT dbo.WhoIsActive_tempdb
SELECT *
FROM ' + @whoisactive_table + N' 
WHERE tempdb_allocations > 0
OPTION (RECOMPILE);'

EXEC sp_executesql @dsql;

SET @dsql = N'DROP TABLE ' + @whoisactive_table + N';'

EXEC sp_executesql @dsql;
GO

When run this way, sp_WhoIsActive still allocates resources in tempdb (be aware that some of that overhead is just part of using a complex query like this), but the object creation is kept in the user database.

Sometimes, you can simply use a filter built into sp_WhoIsActive!

Want to see data just for a single login, program name, database, host name, or session?

Or just want to exclude something based on login, program name, database, host name, or session?

In those cases, you might simply use the @filter_type and @filter parameters on sp_WhoIsActive and save yourself some complexity. Here’s the documentation on how to do that.
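For example (the parameter values here are just examples), filtering to a single database looks like this, and the @not_filter_type / @not_filter parameters handle exclusions:

/* Show activity only for one database */
EXEC sp_WhoIsActive
    @filter_type = 'database',
    @filter = 'WideWorldImporters';

/* Or exclude a specific login, such as the account your monitoring jobs run under */
EXEC sp_WhoIsActive
    @not_filter_type = 'login',
    @not_filter = 'monitoring_service_account';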

Happy monitoring!


Error 1204: When SQL Server Runs Out of Locks

I recently did a Dear SQL DBA episode answering a question about lock timeouts and memory in SQL Server. I really enjoyed the episode, and thought it would be fun to follow up and show what it looks like if SQL Server doesn’t have enough memory to allocate locks.

You can control how much memory SQL Server allocates for locks using the “locks” configuration option. At least for now: the configuration option is deprecated.

Microsoft recommends that you do NOT change this setting, and let SQL Server manage your locks dynamically.

I agree with them. Don’t mess around with the locks setting.

That being said…

Let’s do some bad things to my test instance!

I’m going to change the lock configuration setting so that my SQL Server runs out of memory for locks pretty easily. Here’s my formula for lock memory starvation, using the WideWorldImporters sample database:

  1. Set the ‘locks’ configuration to 5,000 (scripted below)
  2. Restart the SQL Server service to make this (very bad) setting go into effect
  3. Disable lock escalation on a table with 104K rows: ALTER TABLE Sales.OrderLines SET (LOCK_ESCALATION = DISABLE)
  4. Run a query that’s greedy about locks
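Here’s a sketch of how steps 1 through 3 can be scripted (again, please don’t do this anywhere that matters):

/* Step 1: 'locks' is an advanced option, so show advanced options first */
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'locks', 5000;
RECONFIGURE;

/* Step 2: restart the SQL Server service so the new 'locks' value takes effect */

/* Step 3: disable lock escalation on the table */
ALTER TABLE Sales.OrderLines SET (LOCK_ESCALATION = DISABLE);
GO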

Here’s my greedy lock query

BEGIN TRAN

    SELECT   OrderLineID
    FROM Sales.OrderLines WITH (HOLDLOCK, ROWLOCK)

 

The HOLDLOCK hint tells SQL Server to run in serializable mode for this table. That makes it “hold” locks for the life of the transaction — not just take out and release locks as it rolls through the table.

The ROWLOCK hint tells SQL Server to use the most granular form of locks. SQL Server would “escalate” these locks to the table level so that it wasn’t so granular, but we disallowed that as part of our lock starvation formula.

Error message 1204: cannot obtain a LOCK resource

My instance manages to stream 4837 rows back to me before it fails with this error:

Msg 1204, Level 19, State 4, Line 8
The instance of the SQL Server Database Engine cannot obtain a LOCK resource at this time. 
Rerun your statement when there are fewer active users. 
Ask the database administrator to check the lock and memory configuration for this instance, or to check for long-running transactions.

The message comes back to my session. The message is also in the SQL Server error log, with my session_id as the source. And the message is written to the Windows Event Log under Application, too, where it shows as an error.

In my case, though, the problem isn’t that I have too many active users, as the error message suggests. Or even that I don’t have enough memory.

Things to check if you hit this error

  • Has anyone changed the locks configuration on your instance? (The default setting is 0; a quick check is sketched after this list.)
  • Have you disabled lock escalation on any tables?
  • Does your code use ROWLOCK hints to force granular locks in the time periods where this is happening?
  • Does the SQL Server instance have enough memory, and is it under memory pressure? This error will be thrown if locks require 60% of engine memory, or memory is running short in Windows — both of which are pretty dire circumstances for your instance.
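Here’s a quick sketch of checking the first two items from T-SQL:

/* Is the 'locks' setting still at its default of 0? */
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'locks';

/* Which tables have had lock escalation changed from the default? */
SELECT name, lock_escalation_desc
FROM sys.tables
WHERE lock_escalation_desc <> 'TABLE';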

 

As for me, I’m going to set the ‘locks’ setting back to 0, restart my instance, and pretend this never happened.


Which Locks Count Toward Lock Escalation?

A little while back I wrote about Why Indexes Reduce Locks for Update and Delete Queries.

I got a great question on the post from Zac:

What’s not super clear is why it takes out a lock on the whole table, is this because it does a lock escalation as a result of the Full Scan?
Will this always happen, or is there a threshold of record update counts where this will occur?

This was tough to answer in just a comment, so I promised a full post on the topic.

SQL Server attempts lock escalation to simplify locks

It can be tricky to manage a lot of little fine-grained locks. If I take out 50,000 row locks on a single table, it would be easier for SQL Server to manage that by just giving me one table level lock. But this may not be possible if others are using the table.

When you take out a lot of modification locks, SQL Server will attempt to “escalate” them. If it can’t escalate and the query keeps taking out locks, it will keep trying.

Books Online has a good article about this, which explains a lot of the details about how many locks you need to take out to trigger lock escalation. Here are the (simplified) basics:

  • The ‘magic’ number to trigger escalation for the first time is 5,000 locks on a single table reference
  • Locks do NOT escalate from row level to page level. Row locks escalate to table. Page level locks also escalate to table level. In other words, forcing row level locking will not make it less likely to escalate locks to the table level, but rather it will do the opposite.
    • Note: for partitioned tables, you have the option to enable partition-level escalation (sketched below)
  • If you’re modifying data, the escalated table lock will be exclusive. That means nobody else can party with the table if lock escalation succeeds while you’re doing your work.
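For reference, enabling that partition-level option looks like this (the table name is a placeholder, and it only changes behavior on a partitioned table):

/* AUTO allows escalation to the partition level on a partitioned table */
ALTER TABLE dbo.SalesPartitioned SET (LOCK_ESCALATION = AUTO);
GO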

But which locks cause escalation? All those update locks don’t cause escalation, do they?

No, the update locks do NOT cause escalation. Just as a reminder, “update” locks are weirdly named– these are a special kind of lock (not just a type of lock associated with an “update” statement). Read more in my post on update locks.

Let’s look at a simple test I ran on a VM to show update locks not triggering the “escalator”.

I’m using the WideWorldImporters sample database again. To make sure I get the most update locks possible for my test, I dropped the index on CustomerID:

DROP INDEX [FK_Sales_Orders_CustomerID] ON [Sales].[Orders]
GO

This forces a clustered index scan on all the sample queries I’m going to run below.

I’m running two Extended Events Traces

The first trace is looking at the sqlserver.lock_escalation event. Note that I’m using NO_EVENT_LOSS and MAX_DISPATCH_LATENCY=5 in this trace— that’s because I’m running this against a totally private test instance, and it doesn’t matter if I impact performance.

CREATE EVENT SESSION [LockEscalation] ON SERVER 
ADD EVENT sqlserver.lock_escalation(SET collect_database_name=(1),collect_statement=(1)
    ACTION(sqlserver.session_id))
ADD TARGET package0.event_file(SET filename=N'S:\XEvents\Lock_Escalation.xel')
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=NO_EVENT_LOSS,MAX_DISPATCH_LATENCY=5 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)
GO

The second trace is counting the locks used by session_id=56. It’s looking for locks against a particular object (I don’t care about metadata locks), and it’s putting the output in a histogram target bucketed by the lock mode:

CREATE EVENT SESSION [locks_count_spid_56] ON SERVER 
ADD EVENT sqlserver.lock_acquired(
    WHERE ([package0].[equal_uint64]([sqlserver].[session_id],(56)) AND [associated_object_id]=(72057594047234048.)))
ADD TARGET package0.histogram(SET filtering_event_name=N'sqlserver.lock_acquired',source=N'mode',source_type=(0))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_MULTIPLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)
GO

Note: this article helps decode the integer that represents the lock mode in this trace.
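If you’d rather query the histogram target than look at it in the SSMS UI, something along these lines reads it directly (the session name matches the trace above):

SELECT
    slot.value('(value)[1]', 'int') AS lock_mode,
    slot.value('@count', 'bigint')  AS lock_count
FROM
(
    SELECT CAST(st.target_data AS xml) AS target_data
    FROM sys.dm_xe_sessions AS s
    JOIN sys.dm_xe_session_targets AS st
        ON s.address = st.event_session_address
    WHERE s.name = N'locks_count_spid_56'
      AND st.target_name = N'histogram'
) AS t
CROSS APPLY t.target_data.nodes('HistogramTarget/Slot') AS x(slot)
ORDER BY lock_count DESC;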

Looking at this, I’m embarrassed by how inconsistent I am at naming traces. But I’m consistently bad at that, so… yeah.

Test query #1 modifies only 165 rows, no lock escalation

Here’s our first contender…

BEGIN TRAN
UPDATE Sales.Orders
SET InternalComments = N'Hiya'
WHERE CustomerID = 2
OPTION (RECOMPILE);
ROLLBACK
GO

This does a clustered index scan, but modifies only 165 rows. I roll back the transaction just for further testing.

What do the traces have to say?

  • The LockEscalation trace doesn’t say anything. We had NO lock escalation for this query.
  • The locks_count_spid_56 trace shows
    • 104,184 update locks
    • 2,591 intent update locks
    • 165 exclusive locks
    • 153 intent exclusive locks

We had way more than 5K update locks on this object, but they don’t count toward lock escalation.

Test query #2 modifies 4,951 rows and DOES escalate locks

Before starting this test, I restarted my locks_count_spid_56 trace to reset it. Then I ran this query:

BEGIN TRAN
UPDATE Sales.Orders
SET InternalComments = N'Hoya'
WHERE CustomerID > 1013
OPTION (RECOMPILE);
ROLLBACK
GO

This also does a clustered index scan, and modifies just under 5,000 rows. So it would seem like this wouldn’t escalate. Let’s see!

  • The LockEscalation trace has a row! This escalated.
    • escalation_cause = Lock threshold
    • escalated_lock_count (number of locks converted) = 6,248
    • hobt_lock_count (number of locks at time of escalation) = 6,247
    • The statement collected matches this query. (The recompile hint in the query is there to prevent auto-parameterization in this simple example, so it’s very clear which query was run.)
  • The locks_count_spid_56 trace shows
    • 97,842 update locks
    • 2,340 intent update locks
    • 4,475 exclusive locks
    • 1,772 intent exclusive locks

Time for a little math

4,475 exclusive locks + 1,772 intent exclusive locks = 6,247 locks at the time of escalation.

The update and intent update locks don’t count toward escalation.

What does this all mean?

Good indexing can reduce the number of update locks that queries take out — that can reduce blocking, because update locks block one another. And besides, good indexing can make queries faster.

Lock escalation converts exclusive and intent exclusive locks. The initial threshold to trigger lock escalation is 5,000 locks used in a single table reference, but you might hit that threshold even if you’re modifying fewer than 5,000 rows.

If you have to modify a lot of rows in a table that’s being used by others where performance matters, lock escalation is one of multiple factors that makes it desirable to break the modifications up into smaller transactions.

Having lock escalation isn’t necessarily a bad thing. If escalation succeeds, it’s possible that the query with escalated locks doesn’t end up blocking anyone else. You need to monitor the SQL Server to know whether or not you’ve got a blocking problem, and who is blocking whom.


Why Indexes Reduce Locks for Update and Delete Queries

I have a cat. Well, she’s Jeremiah’s cat, really, and you’ve even seen her: she’s in the logo of this website! She supervises my work most days at home from a cat bed in my office. I’d love to hug her all the time. But kitty has claws, and she makes it clear that she’s not a fan of hugs. She’d rather be free.

If you have blocking on your SQL Server, your queries would probably rather be free, too.

Indexes can help with that for all sorts of reasons. One of those reasons is that indexes can help your update and delete statements lock fewer rows. And I’m not only talking about shared locks, either.

Good indexing can reduce the number of modification locks your update and delete queries acquire

You can see this if you create a trace on the lock_acquired event for the following two queries against a test instance. Set a filter for the session_id you’re using to run the query, and add sql_text so you can see which query required which locks.
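Here’s a rough sketch of what that trace could look like (the session name, file path, and session_id are placeholders; use your own):

CREATE EVENT SESSION [locks_for_my_session] ON SERVER
ADD EVENT sqlserver.lock_acquired (
    ACTION (sqlserver.sql_text)
    WHERE ([sqlserver].[session_id] = (56)))
ADD TARGET package0.event_file (SET filename = N'S:\XEvents\locks_for_my_session.xel')
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS);
GO

ALTER EVENT SESSION [locks_for_my_session] ON SERVER STATE = START;
GO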

USE WideWorldImporters;
GO

/* This will do an index seek on the nonclustered index [FK_Sales_Orders_CustomerID] */
BEGIN TRAN
UPDATE Sales.Orders
SET InternalComments = 'Hiya'
WHERE CustomerID = 2;
ROLLBACK
GO

/* This will do an index scan on the forced index */
BEGIN TRAN
UPDATE Sales.Orders
SET InternalComments = 'Hiya'
FROM Sales.Orders WITH (INDEX ([PK_Sales_Orders]))
WHERE CustomerID = 2;
ROLLBACK
GO

The first query can go straight to the data that it’s going to update using an existing nonclustered index on CustomerID.

The second query uses an index hint to force SQL Server to use the clustered index of the table. This simulates what would happen if we didn’t have a viable nonclustered index to find the rows quickly.

How many locks do they take out?

Well, they take out a lot of locks. Locks are very granular, and if you do trace this you’ll find you get a lot of rows. After I traced this, I grouped the results by the mode column (lock mode), and the sql_text of the query to quickly see how many locks of which type each query took out.

By “grouped”, I mean I opened the XEvents trace results in SSMS and used the Grouping function in the GUI. It was fast and easy with this amount of rows, but if it was a larger table SSMS would probably have fallen over crying.

Both queries updated the same number of rows (165). That would sure be weird if they didn’t!

  • The query that did the NC index seek took out 330 update key locks and 165 exclusive key locks.
  • The query that did the clustered index scan took out 104,184 update key locks and 165 exclusive key locks.

They also take out lots of other types of locks, but it’s the update locks I want to talk about here.

Whoa, more than 104K update key locks?

Yep. It is not a coincidence that there are 104,184 rows in this table. (If you see it take out more update key locks than rows in the table, some are probably from metadata objects. The simple grouping on just mode and sql_text isn’t perfect.)

This isn’t quite as terrible as it might sound because update locks are compatible with shared locks. However, update locks are NOT compatible with other update locks, or intent exclusive locks, or exclusive locks. (When in doubt about things like this, I like to check the official Lock Compatibility Matrix.)

This is one of the reasons that an OLTP database may work well in the early days, when it’s small and has a few users, but blocking can really shoot up when the data grows and our user counts grow.

Update locks matter for deletes, too

Oddly enough, an update lock isn’t just for updates. The name is weird.

If you run deletes for CustomerID = 2 along with a similar trace, you will see lots of update locks there, too, because deletes also use update locks.

Update locks are a mechanism that SQL Server uses in the background to keep your database from exploding with deadlocks when you have multiple sessions modifying the same table at once. Sometimes blocking is better than having your queries getting killed off!

More reading

Kalen Delaney wrote more details about how update locks help avoid deadlocks, and shows an example of measuring them using the sys.dm_tran_locks DMV here.

Thanks to Mike, who asked what was up with this

Mike was curious about the answers to one of the quiz questions in Troubleshooting Blocking and Deadlocks for Beginners, and wrote in with a great question.

I think this behavior is far from obvious, and I don’t have sample code or details in the course explaining this, so I’m going to link to this post from the course to help make this more clear for others.

Thanks, Mike!


The Top 3 Mistakes I Made Fighting Blocking

At the beginning of the “Troubleshooting Blocking and Deadlocks” course, I mention that it took me a long time to get into using the tools I show in the course. The tools are all free, and many of them are built into SQL Server.

So why did it take me so long?

1. I was using outdated built-in commands to troubleshoot live blocking

Lots of us learn to use built-in tools from the SQL 2000 days and prior when we start using SQL Server. Things like:

  • sp_who2
  • DBCC INPUTBUFFER
  • DBCC OPENTRAN

These commands are feast or famine: they either give you more than you want, or not as much information as you need.

These old commands haven’t been improved over the years by Microsoft for a very good reason: Microsoft gave us a much richer, more flexible set of tools when they introduced Dynamic Management Objects in SQL Server 2005, and they continue to improve the DMOs year after year. The old commands are still there for backwards compatibility.

In the Blocking course, I demonstrate how you can make finding live blockers fast and easy by querying SQL Server dynamic management objects with the free sp_WhoIsActive procedure.

This awesome free procedure is by Adam Machanic, and you can download it from http://whoisactive.com/.

People still learn old commands like sp_who2 first, and get confused. Just this week, I got a question for my Dear SQL DBA podcast about sp_who2 from someone who’s learning now. Here’s the ‘Dear SQL DBA’ video where I show why sp_WhoIsActive makes life much less confusing:

2. I pummeled the SQL Server looking for blocking with custom scripts

Like a good DBA, I wanted to be proactively notified about blocking. Specifically, before our customers notified me about the blocking.

To do this, I wrote a bunch of TSQL that looked for blocked queries. And I ran it against the SQL Server. I ran it A LOT.

This query alone wasn’t enough to bring down the SQL Server. But of course this wasn’t the only monitoring query I ran. There were more, and hilariously, when you looked at the most expensive overall queries against the SQL Server…

Yeah, of course it was all those custom monitoring queries.

When you’re troubleshooting a problem like “not enough CPU”, or thinking about server sizing and CPU licensing, you don’t want the dirty secret to be that the top CPU-consuming queries against the instance are your own monitoring queries.

There are better ways! For blocking, I really like using a simple performance counter alert based on the “Processes Blocked” performance counter. It’s easy to set up and very lightweight. I show how to set this up in the course, along with other tools that will help you figure out who was blocking whom if you don’t get there until after the fact.
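If you want to peek at the same counter from T-SQL, here’s a quick sketch:

/* The 'Processes blocked' counter, viewed from T-SQL */
SELECT object_name, counter_name, cntr_value AS currently_blocked_processes
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Processes blocked';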

3. I thought Trace Flags -T1204 and -T1222 should be enough to fix deadlocks

Trace flags 1204 and 1222 are well known in SQL Server. They’ve been around for a while. Each of these flags causes SQL Server to write information about a deadlock to the SQL Server Error Log when a deadlock occurs. One of them writes info in an XML format, the other just in text.

I knew enough to enable these flags. And I’d see that deadlocks had happened.

But I had no idea how to interpret the information, other than to be able to see little bits of the queries that were involved. I thought the problem was me.

Well, maybe the problem was partly me, but it’s also really crappy to try to read a large wad of text information dumped into an error log. It’s like trying to read a book in a foreign language where all the pages are out of order.

I’ve found it much easier to collect and interpret deadlock graphs. There’s still a lot of complex information to synthesize: that’s the same. However, you get a graphic display of the problem, too. This helps you navigate around as you try to understand what’s going on.
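Even if you haven’t set up your own trace, the built-in system_health Extended Events session captures the xml_deadlock_report event. Here’s a sketch of pulling recent deadlock graphs from its ring buffer:

SELECT
    xed.value('@timestamp', 'datetime2') AS deadlock_time_utc,
    xed.query('(data[@name="xml_report"]/value/deadlock)[1]') AS deadlock_graph
FROM
(
    SELECT CAST(st.target_data AS xml) AS target_data
    FROM sys.dm_xe_sessions AS s
    JOIN sys.dm_xe_session_targets AS st
        ON s.address = st.event_session_address
    WHERE s.name = N'system_health'
      AND st.target_name = N'ring_buffer'
) AS t
CROSS APPLY t.target_data.nodes('RingBufferTarget/event[@name="xml_deadlock_report"]') AS x(xed);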

It’s also much easier when someone shows you how to decode the foreign language! So in the course I:

  • Give you code to generate a deadlock (and it’s rerunnable, no need to restore the database)
  • Show you how to read the deadlock graph
  • Give you tips on how to plan out solutions to prevent the deadlock from re-occurring

Enrollment in the course is currently free (Feb 2017)

Right now you can sign up to get six months access to the videos, demos, and quizzes in the course for free.

I can’t promise that the class will always be free though, so get in there and get learning!