The SQL Server Data Row Size Question: Why Is It Bigger?

This morning, I received the following question from a user:

Hello Madam,

Could you please clarify SQLServer “Data Row” size:

If I run the script below on SQL Server 2012, then Slot(row) Size is 710 bytes.

If I run the same script against SQL Server 2016 and above, then Slot(row) Size is 724 bytes.

They then provided a script which creates and inserts a few rows into a sample table, runs the DBCC IND command to find a list of pages for the sample table, then uses the DBCC PAGE command to examine the page.
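
Their full script isn’t reproduced here, but the general pattern looks something like this sketch (the database name, table definition, and page numbers below are hypothetical placeholders, not the reader’s actual objects):

-- Send DBCC PAGE output to the client session instead of the error log
DBCC TRACEON (3604);
GO

-- Hypothetical demo table with a single row
CREATE TABLE dbo.RowSizeDemo (i INT IDENTITY PRIMARY KEY, filler CHAR(700) NOT NULL);
INSERT dbo.RowSizeDemo (filler) VALUES ('x');
GO

-- List the pages allocated to the table
DBCC IND ('TestDB', 'dbo.RowSizeDemo', -1);
GO

-- Plug the PageFID and PagePID values returned by DBCC IND into DBCC PAGE.
-- Dump style 3 shows each slot, including the record (row) size.
DBCC PAGE ('TestDB', 1, 312, 3);
GO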

The first thing I looked for: how many bytes is the difference? Is it 14 bytes?

I love it when someone sends me a repro script, but in this case I didn’t need to run it. The first thing I did was to look at the two numbers given for row size, and to subtract the smaller one from the larger one: 724 – 710 = 14 bytes of difference.

That bit of information alone gave me an immediate guess of what was going on.

Row versioning in SQL Server has a 14-byte row overhead

You can reduce blocking in SQL Server by enabling “optimistic” locking. There are trade-offs to this approach, however: row versions are stored in tempdb, and 14 bytes of overhead may be added to impacted rows. (For a quick overview of this row versioning process, check out this Simple Talk post by Kalen Delaney.)

My guess is that row versioning is enabled only on the SQL Server 2016 and above instances being tested in this case. This could be because of any of the following (a quick way to check is sketched after this list):

  • Snapshot isolation is allowed on the database
  • Read committed snapshot isolation is enabled on the database
  • The database has a readable secondary in an availability group
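
If you want to check the first two causes on an instance, a quick sketch against the catalog views looks like this (the availability group query only returns rows when AGs are configured):

-- Is snapshot isolation allowed, or read committed snapshot enabled?
SELECT name,
       snapshot_isolation_state_desc,   -- ON if snapshot isolation is allowed
       is_read_committed_snapshot_on    -- 1 if RCSI is enabled
FROM sys.databases;

-- Does any secondary replica allow read access?
SELECT ag.name AS availability_group,
       ar.replica_server_name,
       ar.secondary_role_allow_connections_desc
FROM sys.availability_groups AS ag
JOIN sys.availability_replicas AS ar ON ag.group_id = ar.group_id;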

You might wonder – why would having a readable secondary cause the row to grow?

Readable secondaries in SQL Server allow you to offload read workloads.

It’s important to minimize blocking against those readable secondaries, though: you don’t want data modifications flowing through to the readable secondary to be blocked by reporting queries that you are running against the secondary, after all. You also want reads to be as consistent as possible.

For this reason, queries that are run against readable secondaries are automatically escalated to snapshot isolation. And snapshot isolation requires row versioning — those 14 bytes of space are needed to help make it all work.

But readable secondaries are read-only, right? So how can those 14 bytes get written? The answer is that it’s complicated. Here’s what the “capacity planning” section of Microsoft’s docs has to say:

When you configure read-access for one or more secondary replicas, the primary databases add 14 bytes of overhead on deleted, modified, or inserted data rows to store pointers to row versions on the secondary databases for disk-based tables. This 14-byte overhead is carried over to the secondary databases. As the 14-byte overhead is added to data rows, page splits might occur.

The row version data is not generated by the primary databases. Instead, the secondary databases generate the row versions. However, row versioning increases data storage in both the primary and secondary databases.

Offload read-only workload to secondary replica of an Always On availability group

To dig into the details of this more, click the link — there’s even more info on the full page.

Take the SQLChallenge: Tuning a Stored Procedure

I’ve just published a new SQLChallenge course for subscribers, and I think it’s one of the best ones yet.

Your mission is to:

  1. Identify which statement is slowing down our stored procedure the most
  2. Tune the code to speed it up. You can change the slow query itself, as well as anything else in the procedure that helps you make that statement faster.

In the solution videos, I’ll step through multiple strategies for figuring out which statement in the procedure is slowing it down the most – because in real life, you need a whole bag of tricks for different situations.

Solution scripts and videos include demos of:

  • Trace flag 7412, how it impacts sp_WhoIsActive, and live query plans
  • Finding the slow query in Query Store – with TSQL as well as the graphical reports
  • Finding the slow query in the plan cache
  • Methods of rewriting the procedure to tune the query
  • Additional code patterns that make tuning procedures simpler.

The challenge script in the first lesson creates a database on your test instance – no need to restore anything — so it’s fast to get going.

Get started here.

Watch all the videos, or just what you need

The course has an hour and ten minutes of videos, each one of which is 15 minutes or less. 

Want to take SQLChallenges?

Pick the right product for you and get started:

New Courses: Extended Events, Parallelism, Indexes, and More

It’s been a fun, busy spring and early summer. I have loads of new courses to share, and great stuff is coming up soon!

A big thank you to all the SQLWorkbooks members who attended live events, or posted your comments and questions! Your questions are terrific, and you’ve made these events so much fun.

SQLChallenge Courses

SQL Symposium and SQL Learning League members get all SQLChallenges.

SQLChallenge: Create an Extended Events Trace (55 minutes)

Follow the challenge script to reproduce a problem with Query Store in your test instance of SQL Server.

Then, your challenge is to configure a trace that will capture the problem — and also learn more about when the trace will capture an event, and when it won’t.

Challenge difficulty: This 200 level challenge will be pretty quick for folks who have configured Extended Events traces before. If you haven’t done that yet, it may take a little more time — but it’s worth it! These traces can be extremely useful.

Design the Best Index For One Year Wonders – SQLChallenge (50 minutes)

Can you reduce the logical reads as much as possible for the sample query using a disk-based rowstore index?

Challenge difficulty: This challenge comes with three levels of difficulty. Even the 200 level, creating a nonclustered index without using advanced features, is interesting in this case due to the nature of the query.

Write a Query to Find “Revived” Names – SQLChallenge (45 minutes)

I love writing a good query! This challenge is all about TSQL.

Challenge difficulty: This is a 300 level TSQL challenge. If you’ve written queries to solve this kind of problem before, you may be able to blast through it – so there’s an “extra credit” option for you to strut your stuff in that case. If this is your first time thinking through approaches to the problem, it may take longer — but this is a pattern that will very likely prove useful in your query writing future.


SQL Seminar Recordings

SQL Symposium members get to attend all seminars live, plus watch recordings anytime.

How To Decipher CXPACKET Waits and Control Parallelism (4 hours)

Learn how to configure SQL Server’s degree of parallelism, how to dictate which queries go parallel, how to force parallelism, and how to avoid performance-killing anti-patterns.

You’ll also get the scoop on interpreting CXPACKET waits and the new CXCONSUMER waits introduced in recent versions of SQL Server.

Learn Indexing By Solving Problems – SQL Seminar – June 2018 (7 hours 23 minutes)

Learn to design and tune indexes by taking this set of eight indexing SQLChallenges!

This course builds your skills at architecting disk-based rowstore indexes in SQL Server, and teaches you how indexes work at the same time. The challenges in this course teach you about clustered and nonclustered indexes, filtered indexes, indexed views, and more.


Upcoming Live SQL Seminars

Join the SQL Symposium to get all online seminars, live and recorded. Or purchase seminars individually.

Conquer Blocking and Isolation Levels

Are your users getting the wrong data back? Learn how to get correct data and minimize blocking in this eight-hour seminar.

  • July 11-12, 2018

Query Tuning Jam Session

Sharpen your query tuning skills in this four-hour course of live SQLChallenges and solutions.

  • July 30, 2018

How to Decode Memory Pressure

Learn to identify and monitor both external and internal memory pressure against SQL Server in this four-hour seminar.

  • August 14, 2018

Learn Indexing by Solving Problems – September Edition

Learn to architect disk-based rowstore indexes in eight hours of indexing challenges and solutions.

  • Watch the previously recorded seminar right away here
  • September 24-25, 2018

T-SQL School: SELECT Seminars

Learn Transact-SQL by writing queries each week: Twelve one-hour sessions. Additional signups for this experimental new course are only available via the SQL Symposium.

  • Watch the recorded sessions so far here
  • 1PM – 2PM PDT, each Saturday through Aug 11, 2018



Considering Joining?

Coupon code summersql will get you a big discount on the SQL Symposium or the SQL Learning League through June 28th — and it’s a recurring discount too!

As always, you can control the length of your subscription anytime. If you just want one year, you can cancel immediately and still get access for the rest of the year.

AVG() in TSQL: Watch Your Data Types

This week’s Quizletter featured a quiz on using the AVG() function in SQL Server.

I was inspired to write this quiz because I’ve been teaching TSQL School each week. When we covered aggregate functions, I remembered just how tricksy and non-average AVG() can be.

Want to take the quiz before you read the answers?

Head over here and take the quiz first.

Or read this post and then take the quiz, if you prefer. I’m not gonna call it cheating, because it’s learning either way.


Question 1: AVG of 1 and 1

In question 1, we average a value on two rows: one and one.

How did folks do on this question?

  • Correct: 91% – These folks guessed the answer would be: 1
  • Incorrect: 9% – These folks almost all guessed the answer would be: 1.0

This seems like a subtle difference, but the key to the answer is understanding what data type SQL Server will implicitly convert these numbers to. And data types matter a lot when it comes to the results you can get out of AVG. In this case, 1 will be implicitly converted to an INT data type.
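
The quiz query itself isn’t reprinted here, but a quick sketch reproduces the behavior:

-- Both values are integer literals, so AVG works in INT and returns an INT
SELECT AVG(x) AS avg_result
FROM (VALUES (1), (1)) AS v (x);
-- Returns 1 (an INT), not 1.0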

Question 2: AVG of 1 and 2

Here’s where the wacky fun begins: if I average two ints, one and two, I get back… one!

  • 46% – YAY TEAM! Great work! This is bizarro, so good on you folks.
  • 54% – I feel your pain. Most of these folks guessed 1.5, and a few guessed 1.500000.

What does this mean? Well, let’s look at an example with negative numbers to help get a bigger picture:
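
Something like this sketch shows both the positive and the negative case (the quiz query itself may be phrased a bit differently):

-- Positive integers: the average of 1 and 2 comes back as 1
SELECT AVG(x) AS avg_positive
FROM (VALUES (1), (2)) AS v (x);

-- Negative integers: the average of -1 and -2 comes back as -1
SELECT AVG(x) AS avg_negative
FROM (VALUES (-1), (-2)) AS v (x);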

When we average the integers -1 and -2, we get -1.

How to describe this behavior

There are two different ways you can think about this behavior. Pick the one that is easier for you to remember.

Description 1: SQL Server rounds integer averages “toward zero”. (When the average was positive, we got the ‘floor’. When the average was negative, we got the ‘ceiling’.)

Description 2: It’s truncating the decimal bits.

Extra credit: Guess which one of these descriptions came from me, and which one came from my computer scientist partner.

Question 3: Mixing in a CAST()

Aha! Now we’re getting fancy. We have a CAST function outside of the AVG function.

The keys to this answer: CAST runs after the AVG function completes, and AVG still returns 1, because inside the AVG function it sees the INT data type.

  • Correct: 36% – These cunning quizzers worked out that AVG would truncate the decimal bits and return 1, and that this would then be cast to 1.0
  • Incorrect: 64% – Most folks guessed 1.5
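
A sketch of the pattern (the exact target type in the quiz may differ – numeric(3,1) is an assumption here):

-- AVG runs first over the INT values and returns 1;
-- only then does CAST convert that 1, giving 1.0
SELECT CAST(AVG(x) AS NUMERIC(3,1)) AS cast_outside_avg
FROM (VALUES (1), (2)) AS v (x);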

Question 4: Move that CAST() inside the AVG()

Here we finally have an example where I am changing the data type inside the AVG, and we are casting to a numeric type with a precision of three and a scale of one. (One digit past the decimal point!)

And, yeah, look at that, we get back a result with a scale of SIX.

In fact, if you go to the trouble to see what type we end up with here (say, by selecting the results into a temp table and inspecting the data types), it turns out to be numeric(38,6). 38 is the highest precision you can get with numeric.

  • Correct: 37% – Clever rabbits!
  • Incorrect: 63% – Most of these folks guessed 1.5, and I do not blame you at all

What’s up with this? Well, there’s a bit of info inside the documentation on AVG, but it’s about the decimal data type. Numeric and decimal are twins (but you want to use one or the other consistently, because of things like foreign keys). In the section on return types, it explains that the return type for decimal category (p, s) will be:

decimal(38, s) divided by decimal(10, 0)

Plugging that formula into the numbers we’re working with in this question: with s = 1, the division rules work out to a scale of max(6, 1 + 10 + 1) = 12, but the resulting precision would be greater than 38, so SQL Server caps the precision at 38 and shrinks the scale back down to its minimum of 6. That gives us numeric(38,6).

Yep, it checks out.
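
If you’d rather not bother with a temp table, one quick way to sanity-check the type is SQL_VARIANT_PROPERTY (the numeric(3,1) input here mirrors the question as described above):

-- Inspect the return type of AVG over a numeric(3,1) input
SELECT SQL_VARIANT_PROPERTY(AVG(CAST(x AS NUMERIC(3,1))), 'BaseType')  AS base_type,   -- numeric
       SQL_VARIANT_PROPERTY(AVG(CAST(x AS NUMERIC(3,1))), 'Precision') AS precision_p, -- 38
       SQL_VARIANT_PROPERTY(AVG(CAST(x AS NUMERIC(3,1))), 'Scale')     AS scale_s      -- 6
FROM (VALUES (1), (2)) AS v (x);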

Takeaways

Weird things like this are why some folks say not to do calculations in the SQL language.

For times when you do need to do calculations in TSQL, remember to think about your data types – both inside your functions, and outside.

Free Session! DBA vs Deadlock: How to Out-Index a Deadly Blocking Scenario

DBA vs Deadlock: How to Out-Index a Deadly Blocking Scenario

June 14, 2018 8:30 AM – 9:30 AM PST <- This is past, but no worries, it was recorded

Deadlocks strike fear into the hearts of even seasoned DBAs — but they don’t have to!

In this session, you’ll get the code to cause a sample deadlock in SQL Server. You’ll see how to interpret the deadlock graph to find out where the conflict lies, and how to design an index to make the deadlock disappear.

You’ll leave the session with the steps you need to confidently tackle future deadlocks.

Watch the recording of this free online session here.

Want the scripts? Here they are: DBA vs Deadlock: Script Download

What is Deferred Name Resolution, and How Can It Go Wrong?

I got a question recently about why a particular bit of code was failing when it was run.

My first thought: oh, that’s probably related to… whatsitcalled.

You know, whatsitcalled, one of those features you notice just often enough to recognize it, but not often enough that its name comes to mind immediately.

It’s deferred.

In this case, it was actually named “Deferred Name Resolution.”

Deferred name resolution lets you create a procedure referencing something that doesn’t exist

In this case, I’m creating a temporary stored procedure (out of laziness: it means I don’t have to clean up after a quick demo) –

CREATE OR ALTER PROCEDURE #test
AS
IF 1=0
       EXECUTE dbdoesnotexist.dbo.someproc;
GO

The database dbdoesnotexist does NOT exist, but I’m still allowed to create the procedure.

When I do so, I get an informational message:

The module '#test' depends on the missing object 'dbdoesnotexist.dbo.someproc'. The module will still be created; however, it cannot run successfully until the object exists.

This can be useful in some cases where you’ll be querying a table or procedure that may not exist all the time, but which will exist when a certain code block is run.

You can’t always count on deferred name resolution

What if our code refers to something that may exist, but isn’t accessible?

Here’s a slightly different code sample:

CREATE DATABASE offlinedb;
GO
ALTER DATABASE offlinedb SET OFFLINE;
GO

CREATE OR ALTER PROCEDURE #test
AS
IF 1=0
       execute offlinedb.dbo.someproc;
GO

This results in error 942

Creating the procedure fails in this case. The error given is:

Msg 942, Level 14, State 4, Procedure #test, Line 5 [Batch Start Line 17]
Database 'offlinedb' cannot be opened because it is offline.

If I set the empty database “offlinedb” to be online, then deferred name resolution works and I can create #test. If I drop “offlinedb”, same thing – no problems.

But while offlinedb exists in an offline state, I get error 942 at the time I attempt to create the procedure.

Takeaways

Little quirks like this are a good argument to configure test and pre-production/staging environments in ways that mirror production as much as possible.

And if you think you might run into this situation, it might be worth using a bit of Dynamic SQL to avoid it!
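
For example, here’s a sketch of the earlier demo with the cross-database call tucked into dynamic SQL. Because the three-part name only appears inside a string, it isn’t resolved until the branch actually runs, so the procedure creates successfully even while offlinedb is offline:

CREATE OR ALTER PROCEDURE #test
AS
IF 1=0
       EXEC (N'EXECUTE offlinedb.dbo.someproc;');
GO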

Register for My Free Session on Execution Plan Forcing

I am excited to be giving a free online session as part of the Idera Live Virtual Conference, 2018!

Here are the details on my session:

To Force Plans, or Not to Force Plans, That Is the Question

Wed, May 16, 2018 – 9AM Pacific / Noon Eastern

We have new ways to force execution plans in SQL Server: in SQL Server 2017, you can even let SQL Server temporarily force query plans for you and test if it works well!

In this session, you’ll learn:

  • How forcing plans with Query Store compares with the older technology of Plan Guides
  • How to tell when a plan has been forced
  • What “morally equivalent plans” are (and why they’re a good thing)
  • How to see if a query has a high variation in performance
  • How to decide, “Should I force this plan?” and “Should I let SQL Server force plans for me?”

Register for this free session here

But wait, there’s more

I’m not the only one giving cool sessions that day — check out more free sessions at the Idera Live Virtual Conference.

Hope to see you there!