Posted on 3 Comments

The Case of the Blocking Online Index Create (A Shared Lock that Wouldn’t Quit)

I recently got an interesting question from a reader about running a CREATE INDEX statement with DROP_EXISTING:

I always take advantage of creating/modifying nonclustered indexes online since we are running SQL Server 2014 Enterprise Edition. This weekend I ran into an issue while modifying an existing nonclustered index (added a new column to INCLUDE): I was blocking any write operations on the table. After digging into this issue, I noticed that there was a Shared lock on the table.

From my knowledge, at the very beginning a shared lock needs to be acquired but it’s should be very fast one — https://msdn.microsoft.com/en-us/library/ms191261.aspx.

I ran the index creation for more than an hour and had to kill it because it started to significantly affect clients. I just don’t understand why the shared lock would be held for such a long time.

I can reproduce it on my test server with no activities on this database.

I thought this was a really interesting question, because our reader is correct: an online index creation statement should only need a shared lock, briefly, at the beginning of execution.

Ruling out the simple things first: is the shared lock getting blocked?

My first thought was that perhaps there is some process that runs against the production system and the test system that goes to sleep with an open transaction, holding an X or an IX lock against this table. If the index create can’t get its shared lock, then it could be part of a blocking chain.

So I asked first if the index create was the head of the blocking chain, or if it was perhaps blocked by something else. The answer came back that no, the index create was NOT blocked. It was holding the shared lock for a long time.

My new friend even sent a screenshot of the index create running against the test instance in sp_WhoIsActive with blocking_session_id null.

And it wasn’t just one shared lock. It was 120!

To deepen the mystery, he ran sp_WhoIsActive with @get_locks=1,  and the locks for the running, non-blocked index create showed:

<Lock resource_type="OBJECT" request_mode="IS" request_status="GRANT" request_count="1" />
<Lock resource_type="OBJECT" request_mode="S" request_status="GRANT" request_count="120" />
<Lock resource_type="OBJECT" request_mode="Sch-S" request_status="GRANT" request_count="1" />

Yep, that’s 120 OBJECT level shared locks, just hanging out while the online create is running.

I was curious if the number 120 was meaningful to this table somehow. Were there 120 partitions? 120 indexes? 120 triggers? (I know it sounds weird, but I’ve found weirder.)

Nope, there weren’t 120 of anything identifiable.

We did a little tracing, and a little discussion in email. And just as soon as I sent an email that said, “I can’t reproduce this on SQL Server 2016 at all, I think you have a bug!” I got an email back that said, “This is a bug!”

The bug: Rebuilding a nonclustered index to add columns by using CREATE INDEX with DROP_EXISTING=ON and ONLINE=ON causes blocking (KB 3194365)

I’d actually done a little bit of searching when I first got this question by email, and I didn’t find this bug at all. But my keywords weren’t perfect.

And if you’ve ever searched for SQL Server bugs, you know that your keywords have to be exactly perfect, and it has to be your lucky day for you to find the bug.

The bug is here. If you’re running SQL Server 2012, 2014, or 2016 Enterprise Edition, check your patch level if you ever run CREATE WITH DROP_EXISTING: because you probably would like to have the option for it to be at least mostly online.

Thanks very much to Denis for emailing about this issue! The story does have a happy ending: Denis found that the issue was resolved by patching the instance, as described in the KB.

Posted on 3 Comments

3 thoughts on “The Case of the Blocking Online Index Create (A Shared Lock that Wouldn’t Quit)

  1. Hi Kendra,

    Big thanks to you for not giving up on me and holding my hand till the end 🙂 Good luck with your new cool project! Looking forward to your new posts and training material.

    Thanks,
    Denis

  2. I am not sure even how to ask this clearly but I have a similar problem with Create Index Drop_Existing.
    I have several large Partitioned Views (SQL2008R2 SE SP3, 64bit) that are all date range based (week, month). My automated maintenance creates a sub-table for the new time period, alters the view (adding new period and removing old period) and then moves the oldest table (no longer in the view) to a different file group for cold storage. The move is implemented by Create Index Drop_Existing on the clustered primary key. Once the old table is removed from the view it is almost never referenced. Some of these tables are pretty large (>100GB) and take several hours to move.

    The problem is I often notice unusual blocking. For example, queries against system tables (sys.objects, sys.indexes, sys.partitions, ..) will block. I added “Read Uncommitted” to several of my diagnostic scripts but still see the blocking from time-to-time. Also the SSMS Object Browser will block on actions like table/index properties or even just populating the object tree.

    Any idea what I am missing ?

    1. Unfortunately, I don’t think you’re the one missing something. Adam Machanic created a bug for your issue: https://connect.microsoft.com/SQLServer/feedback/details/478568/

      The bug is still active, and is up to 101 votes. You can make it 102, and add comments to the thread. I do have some hope we’ll get traction on this before too long since the SSMS team is now able to release updates independently of major releases.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.