Posted on

Locked Pages, Working Set, and Private Bytes

One of the great things about writing presentations is that it spurs you to “clean up” your definitions. When it comes to writing a slide about something, I ask myself, “Do I really know what that is?” I check my assumptions, and clarify how I think about something.

This week I was working with SQL Server memory settings, and I “cleaned up” my understanding of the following definitions.

Locked Pages in Memory (setting)

“Locked pages” must be kept in physical memory

  • This memory can’t be “paged out”
  • In other words, the Windows page file (disk space) can’t be swapped in to back this memory
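One quick way to check whether your instance is actually using locked pages (on SQL Server 2016 SP1 and later) is the sys.dm_os_sys_info DMV — a minimal sketch:

SELECT sql_memory_model_desc
FROM sys.dm_os_sys_info;
-- CONVENTIONAL = locked pages not in use
-- LOCK_PAGES   = Locked Pages in Memory is in use
-- LARGE_PAGES  = large pages in use (trace flag 834)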

Working Set of a process

For a process in Windows, this shows…

  • How much is physically resident in memory
  • May include some memory shared with other applications
  • Does NOT include ‘locked’ pages / large pages

Private Bytes for a process

For a process in Windows, this shows…

  • How much memory is allocated
  • This includes space allocated in the page file as well as memory on the standby list
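From inside SQL Server, the sys.dm_os_process_memory DMV surfaces closely related numbers for the SQL Server process, which makes it handy for comparing against what you see in Windows tools. A sketch (the columns are documented; the comments are my shorthand interpretation):

SELECT physical_memory_in_use_kb,          -- physical memory in use by the process
       locked_page_allocations_kb,         -- memory held via Locked Pages in Memory
       virtual_address_space_committed_kb  -- committed virtual address space
FROM sys.dm_os_process_memory;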

How do I use these metrics?

My favorite way to get comfortable with these metrics is to set yourself up with a test SQL Server environment far, far away from production. Get your SQL Server using memory, download TestLimit64.exe and Process Explorer from Microsoft Sysinternals, and get to creating memory pressure and watching your metrics!

Want to know how things change when you lock pages, or change your Windows page file configuration? These tools will let you see it in action, rather than guessing.
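As an example of creating that pressure with TestLimit: the -d flag leaks and touches memory in chunks (flag behavior per the Sysinternals documentation; the chunk size here is arbitrary). Press Ctrl+C to stop it and release the memory:

Testlimit64.exe -d 256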

Sources / further reading

This classic blog post by Bob Ward from 2009 talks about Locked Pages for SQL Server.

This MSDN page defines what the Working Set for a process is.

This Stack Overflow question and answer compares Private Bytes, Working Set, and Virtual Bytes — and explains what’s tricky about these measurements.

If you want to go way deeper, the Windows Internals books and Mark Russinovich’s blog posts are for you!


I’m Giving Two Sessions at the SQL PASS Summit

PASS Summit sessions have been scheduled, and I’m excited to be giving two sessions this year AND helping judge Speaker Idol! I’m looking forward to teaching, learning a ton, and connecting with lots of people who love working with data.

When Partitioning Hurts Performance (and How to Fix It)

Wednesday, Nov 1, 10:45 AM, Room 6A

Table partitioning is a fantastic tool to help you manage tables with skyrocketing rowcounts. SQL Server 2016 SP1 made table partitioning available in Standard Edition, so you may be planning to add partitioning to your database. But some queries may get slower after you partition your tables: the SQL Server optimizer doesn’t always use indexes the same way after those indexes are partitioned into chunks.

This session teaches you to use execution plans to troubleshoot regressed queries using partitioned tables. You will learn what “non-aligned” indexes are, how to tell how many partitions a query is really using, and see options for TSQL and indexing changes to speed up your queries.

Why Did My Clever Index Change Backfire?

Friday, Nov 3, 8:00 AM, Room 6B

SQL Server is full of advanced techniques to build powerful indexes: indexed views, filtered indexes, columnstore indexes, and more. Many of these techniques have risks, however: your cool indexing idea may just backfire and leave your users frustrated.

In this session, you’ll see multiple demos of how different indexing patterns may slow down queries, have dangerous side effects, or not work at all. You’ll learn what to test, how to identify index problems in query execution plans, and how to sidestep these indexing blunders. If you have a solid knowledge of index basics and are comfortable reading query execution plans, this session will improve your index tuning skills.

Speaker Idol

Round 1: Wednesday, Nov 1, 4:45 PM, Skagit 4 (TCC Lower Level)
Round 2: Thursday, Nov 2, 3:15 PM, Skagit 4 (TCC Lower Level)
Round 3: Friday, Nov 3, 11:15 AM, Skagit 4 (TCC Lower Level)
Final: Friday, Nov 3, 3:30 PM, Skagit 4 (TCC Lower Level)

PASS Summit hosts its 4th annual Speaker Idol: twelve contestants will have the opportunity to test their speaking skills and compete for a guaranteed session at PASS Summit 2018. Judged by an expert panel, speakers will get real-time feedback, exposure to an audience from around the world, and the opportunity to improve as speakers.

Help support and cheer on your PASS Summit speakers in the making at this event open to all attendees!

Check out the full conference schedule

See the full schedule here.

Hope to see you in Seattle!


IO Patterns for Index Seeks: There May be a Lot of Read Ahead Reads

An index seek at work

I received a question recently asking about disk access patterns for index seeks in SQL Server. The question suggested that index seeks would have a random read pattern.

Is this necessarily the case?

Index seeks aren’t necessarily a small read

We tend to think of an index seek as a small, efficient retrieval of a few pages. It “seeks” straight there.

But this isn’t necessarily the case. Seeks can read quite a large amount of data — even all the data in a table if it meets the criteria for the key that I’m seeking on.

Seeks may also be more complex than they sound: they may seek on one key of the index, and then use non-seekable predicates to check every row that comes back to apply more filters.
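As an illustration (the table, columns, and index here are hypothetical, not from my demo database): with an index keyed only on ColA, a query like this seeks on ColA, then checks ColB as a residual predicate against every row the seek returns:

SELECT COUNT(*)
FROM dbo.SomeTable
WHERE ColA > 1990          -- seek predicate: uses the index key
  AND ColB = N'example';   -- residual predicate: checked row by row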

Let’s look at an example

I have a table named dbo.FirstNameByBirthDate, which has a row for each baby name reported in the United States from 1880 to 2015.

There is a nonclustered index on a column named BirthYear.
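The exact index definition from my demo isn’t shown in this post, but a minimal version might look like this (the index name is my invention):

CREATE NONCLUSTERED INDEX ix_FirstNameByBirthDate_BirthYear
ON dbo.FirstNameByBirthDate (BirthYear);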

I run the following query:

SELECT COUNT(*)
FROM dbo.FirstNameByBirthDate
WHERE BirthYear > cast(1700 as int);

It gets a seek!

Here’s the plan for the query. This gets a seek operator…

Digging into the properties of that operator, the seek predicate here is:

Seek Keys[1]: Start: [BabbyNames201711].[dbo].[FirstNameByBirthDate].BirthYear > Scalar Operator([@1])

SQL Server knows this is going to be a lot of rows. It decides to go parallel.

Having checked the data, I know that this seek operation is going to read the entire index on BirthYear. But, technically, this is still a seek operation.

What types of read operations does this do?

Looking at an actual execution plan, I dig into the index seek operator and it shows me information about the physical IO. Almost all of the requests were read-ahead reads.
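Another way to see this without digging into the plan XML is SET STATISTICS IO, which reports read-ahead reads per table in the Messages tab. For example:

SET STATISTICS IO ON;
GO
SELECT COUNT(*)
FROM dbo.FirstNameByBirthDate
WHERE BirthYear > cast(1700 as int);
GO

-- Messages output includes something along the lines of:
-- Table 'FirstNameByBirthDate'. ... physical reads ..., read-ahead reads ...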

Read-ahead is a mechanism that SQL Server can use when it’s pulling a lot of information from disk. Instead of pulling 8K pages onesy-twosy-threesy, SQL Server can suck up big chunks of pages from disk with a vacuum cleaner.

If you’re running Developer or Enterprise Edition, you may get a larger vacuum cleaner.

mmmm, read-aheads

What does that look like to Windows?

There are a couple of ways to look at the read size. One method is to fire up the SysInternals tool Process Monitor and watch ReadFile operations from sqlservr.exe on a test system.

Here’s a sample of the reads it saw when I was doing this seek (and not running anything else on the instance). The highlighted row has a length of 524,288 – that’s a 512KB read! I’m running Developer Edition, so I’ve got the big read-ahead vacuum cleaner.

Most of the reads here are 64K chunks, but there are quite a few cases where it can grab more…

A small excerpt of the read operations seen by Process Monitor

What does this mean?

This means that there’s no simple shortcut to establishing and testing IO patterns for an application.

You shouldn’t, for instance, look at a performance counter that says there’s a lot of index seeks and interpret that to mean that there’s a lot of tiny 8K random read operations going to disk. Index seeks just aren’t that simple, and that’s a good thing for performance! Even if seeks do need to go to disk, they are able to leverage read-ahead reads — and they may be ‘seeking’ a lot of data.

If you want to get into the nitty gritty and collect data to help you characterize your workload, check out the Analyzing I/O Characteristics and Sizing Storage Systems for SQL Server whitepaper.

And I guess I should admit: read-ahead doesn’t really use a vacuum cleaner.


New Free Webcasts for November and December 2017

I’ve just scheduled a whole batch o’ free webcasts.

Webcasts are held on Thursdays at 9 AM Pacific / noon Eastern / 4 pm UTC.

Here’s the upcoming slate of topics:

  • Oct 5 – RCSI and Snapshot Isolation
  • Oct 19 – Pressure Testing Memory
  • Oct 26 – What’s REALLY in That Index?
  • Nov 9 – Filtered Columnstore Indexes
  • Nov 16 – Defragging: Reorg or Rebuild?
  • Nov 30 – In-Memory Indexes & Isolation
  • Dec 7 – Indexing for Windowing Functions
  • Dec 14 – Serializable & Repeatable Read

Reserve your spot for these free events here.

For each webcast, you’ll get a custom invitation from GoToWebinar with a link to add the event to your calendar.

I can’t wait to put together these presentations, and I hope to see you on Thursdays!


The Case of the Rowgroup Deadlock in a Columnstore Index

I came across a fun deadlock when writing demos for my session on the Read Committed isolation level this week. (It’s OK to call it “fun” when it’s not production code, right?)

I was playing around with a nonclustered columnstore index on a disk-based table. Here’s what I was doing:

Session 1: this session repeatedly changed the value for a single row, back and forth. This puts it into the delta store.

Session 2: this session repeatedly counted the number of rows in the table, using the columnstore index.

With my sample data in this scenario, I found I frequently generated deadlocks.
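My demo tables aren’t reproduced here, but the two sessions looked roughly like this (table, column, and index names are placeholders; GO 1000 is the SSMS batch-repeat syntax):

-- Session 1: flip a value back and forth, landing the row in the delta store
UPDATE dbo.MyTable
SET SomeColumn = CASE SomeColumn WHEN 1 THEN 2 ELSE 1 END
WHERE MyTableId = 12345;
GO 1000

-- Session 2: count rows, hinting the nonclustered columnstore index
SELECT COUNT(*)
FROM dbo.MyTable WITH (INDEX (ccsi_MyTable));
GO 1000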

Let’s look at the deadlock graph

I started up an Extended Events trace for the xml_deadlock_report event, and here’s what this deadlock looks like (with some annotations)…

Click for a larger view
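Setting up that trace is quick — an Extended Events session for the xml_deadlock_report event writing to a file target looks like this (the session and file names are arbitrary):

CREATE EVENT SESSION [deadlocks] ON SERVER
ADD EVENT sqlserver.xml_deadlock_report
ADD TARGET package0.event_file (SET filename = N'deadlocks.xel');
GO
ALTER EVENT SESSION [deadlocks] ON SERVER STATE = START;
GO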

Breaking it down

The circle on the left is Session 2 – it was running a SELECT COUNT that used the nonclustered columnstore index. It was chosen as the deadlock victim.

Reading the arrows, the SELECT COUNT query:

  • Had a lock on the compressed rowgroup in the columnstore index
  • Wanted a shared lock on the b-tree delta store for the columnstore index to count the rows for data that had been changed

The circle on the right is Session 1 – it was running an update that changed the value for one column in one row. Reading the arrows, the UPDATE query:

  • Had an intent exclusive lock on the b-tree delta store
  • Wanted an intent exclusive lock on the compressed rowgroup (presumably to record that the related row now lives in the delta store)

Voila– deadlock!

How do I fix it?

If I really want to churn changes into my columnstore index at the same time that I rapid-fire query the data, I may want to change my isolation level for the query counting the data.

Read committed snapshot isolation or snapshot isolation for the SELECT COUNT query can change this blocking scenario and sneak us right past those deadlocks — as long as enabling those is the right thing for my application and my database server.
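For example, once snapshot isolation is allowed on the database, the counting session can opt in per-query (the database and table names are placeholders):

ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;
GO
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
SELECT COUNT(*)
FROM dbo.MyTable;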

The more clever we get, the more ways things can backfire

I’m not writing this post to bash columnstore indexes — far from it. Churning tons of updates into a columnstore index isn’t necessarily what you want to do.

Changes to indexes or data can cause deadlocks. They can also make deadlocks go away! That’s a fact of life, and as we add more indexing tools to our toolkits, we still gotta live with it.

Want to learn more about deadlocks?

Check out my course, Troubleshooting Blocking & Deadlocks for Beginners. It gives you example code to create a deadlock, and walks you through tracing and decoding the deadlock graph, too.