Quizzes: Logical and Physical Joins in SQL Server

One highlight of my week is writing a quiz for my weekly quizletter. It’s also one of the most difficult things I do, because nothing teaches you how tricky wording can be like writing a quiz! In recent weeks I’ve done a couple of quizzes on joins.

Did you miss either of these quizzes? If so, it’s the perfect Friday to have a little quiz fun and learn about joins!

Quiz 1: Logical joins

Do you know the nitty gritty on inner joins, outer joins, and cross applies in TSQL? Or are you just interested in learning? Either way, take the eight-question quiz on logical joins.

Quiz 2: Physical joins

How well can you identify the physical join operators seen in SQL Server’s graphical execution plans? Dive into this six-question quiz.

Do Index Changes Remove Execution Plans from Cache?

A good, satisfying recompile…

When you modify the indexes on a table, SQL Server needs to reconsider how it executes queries that reference that table. But the way this appears when you’re looking at your execution plan cache is far from obvious:

  • Impacted query plans are NOT removed from cache at the time you change an index. The next time the query is executed, it will be recompiled, and the new plan and related execution information will be placed in cache.
  • This is true whether or not the index is used by the query.
  • Even if the index is on columns in the table that the query doesn’t reference at all, the query will recompile on the next run.

To see this in action, you can play along with this sample script.

A drama of a query plan and changing indexes

Here’s how our story goes…

Creating an index doesn’t remove a plan from cache

We run our Demo Query five times. Querying sys.dm_exec_query_stats and related DMVs, we see five executions, with plan_generation_num = 1.
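If you’d like to check this yourself, a query along these lines shows the execution count, plan_generation_num, and creation time for the cached plan (the LIKE filter is illustrative; match it to your own query text):

```sql
-- Inspect cached plan info for queries matching our demo text.
-- plan_generation_num increments each time the plan is recompiled.
SELECT
    st.text AS query_text,
    qs.execution_count,
    qs.plan_generation_num,
    qs.creation_time
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE '%Demo Query%'
ORDER BY qs.creation_time DESC;
```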

Then we create an index on a table referenced by the query.

Querying the DMVs, we still see five executions, plan_generation_num = 1, and the query plan in cache. It hasn’t been removed!

Running the query again causes a recompile

We run our Demo Query again, this time four times.

Note: It’s important to highlight the query text exactly the same way on every run — don’t include leading spaces on one run and highlight differently the next. Leading spaces are considered part of the query text, and inconsistent highlighting hurts plan reuse.

Querying the DMVs, we now see four executions and a fresh creation_time for the plan. plan_generation_num = 2.

If the index is relevant to the query and SQL Server decides to use it, we’ll see a new execution plan.

An index rebuild doesn’t remove the plan from cache, either

After running ALTER INDEX REBUILD, we still see four executions in the cache, and plan_generation_num = 2.
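The rebuild itself can be as simple as this (the index and table names here are illustrative, not from the original script):

```sql
-- Rebuilding the index does NOT evict the cached plan;
-- the recompile happens on the query's next execution.
ALTER INDEX ix_DemoTable_Col ON dbo.DemoTable REBUILD;
```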

But if we rerun Demo Query three times, the first run triggers a recompile. We get a fresh creation_time for the plan, and now plan_generation_num has incremented to 3.

The same thing holds for dropping an index

This plays by the exact same rules. When you drop an index, it doesn’t remove the plan from the cache immediately.

Put another way, you can have query plans in cache which refer to indexes that have been dropped, and do not currently exist. The query plan will be recompiled the next time the query is executed — if it’s still in cache. (It could be removed in the meantime due to memory pressure, a restart, an administrator’s command, etc.)

This may seem weird, but I think it’s a good thing

I like this behavior in SQL Server. This means that if you run nightly index maintenance jobs which hit a lot of indexes with rebuilds, your execution plan cache isn’t largely cleared out. If you’re looking at sys.dm_exec_query_stats the next morning, you’re more likely to see relevant information.

Running SQL Server in a Docker Container on a Mac

A user recently asked me what a good use case might be for running SQL Server on Linux in a Docker container.

I recently got started with SQL Server in Docker on my Mac, and I’m really excited about the potential. Learn why, and see how easy it is to rebuild your SQL Server on Docker once you have things configured, in this 7-minute video:

In this video:

  • 00:10 – The use case I’m most excited about
  • 02:20 – Deleting all my Docker containers
  • 03:15 – Pulling the latest Docker image for microsoft/mssql-server-linux
  • 04:15 – Using docker run to set up my container
  • 04:50 – Connecting to the Docker container from SSMS installed in a VM running Windows
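For reference, the core commands from the video look roughly like this — the container name and SA password are placeholders you’d replace with your own:

```shell
# Pull the latest SQL Server on Linux image (image name as used in the video)
docker pull microsoft/mssql-server-linux

# Create and start a container; the SA password below is only a placeholder
docker run -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=YourStrong!Passw0rd' \
    -p 1433:1433 --name sqldemo \
    -d microsoft/mssql-server-linux
```

With the container running, you can connect from SSMS (or any client) to port 1433 on the appropriate IP address for your networking setup.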

How to set up your Mac the first time

I got everything going on my Mac using Aaron Bertrand’s great article, VS Code on Mac meets SQL Server on Linux (in Docker).

Setup took me around an hour (with lots of breaks), and the only halfway tricky part was figuring out what IP address to connect to, since I have a different networking setup than Aaron does.

What if I don’t want to delete all my data?

In the video, I show deleting all your Docker containers and pulling a fresh image. It’s definitely not a refined approach. If you’d like to be more civilized and update existing Docker containers, Jeremiah Peschka wrote about how to do that.

What if I want to run SQL Server in Linux in a Docker Container on Windows?

Start with the Microsoft Quickstart guide.

The COMPRESS() TSQL Function in SQL Server 2016+

One cool little feature in SQL Server 2016 is COMPRESS(). It’s a TSQL function available in all editions that shrinks down data using the GZIP algorithm (documentation).

Things to know about COMPRESS():

  • Compressed data is in the VARBINARY(max) data type
  • You get the data “back to normal” by using the DECOMPRESS() function – which also outputs VARBINARY(max)
  • You can’t use a VARBINARY(max) column as an index key column – but it may be useful to use the column as a filter in a filtered index, in some cases

Let’s play with an example

If you’d like to play with all the code, grab it from this gist.

Let’s say that we have a table with a ‘Notes’ style field. For most rows, ‘Notes’ is null. For the rows where it has data, sometimes it’s short, but sometimes it’s a super long set of characters.

My first step is to create two tables with 10 million rows each with a ‘Notes’ column. ‘Notes’ is null for most rows — but for 10K rows, it has varchar data in it ranging from 3 bytes to 29KB.
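The compressed table looks something like this — a sketch, with names chosen to match the filtered index example later in the post (the source table name is an assumption):

```sql
-- Illustrative shape of the compressed table.
CREATE TABLE dbo.Compressed (
    CompressedId BIGINT IDENTITY NOT NULL
        CONSTRAINT pk_Compressed PRIMARY KEY,
    Notes VARBINARY(MAX) NULL   -- holds COMPRESS() output
);

-- Populate it by compressing the VARCHAR 'Notes' data on the way in.
-- (dbo.Original is a hypothetical name for the uncompressed table.)
INSERT dbo.Compressed (Notes)
SELECT COMPRESS(Notes)
FROM dbo.Original;
```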

How much compression do I get?

I generated ‘Notes’ data that compresses very well – I just used a repeating set of characters. The table has just one other column, so most of its size is in the ‘Notes’ column, and compressing it results in a much smaller table.

In the original table, the long character rows couldn’t fit on the in-row 8KB pages, so most of the ‘Notes’ column had to be stored on LOB pages (for the rows where it’s not null). My super-compressible rows are so small that they don’t have to go off-row: not only is my table much smaller, it doesn’t even have to use LOB pages.

(Note: I used the default setting where large value types may be stored in-row when I created the tables.)

Datatype matters – a lot

If you use COMPRESS(), you need to make sure you remember the data type that you used before compressing the data. You can get the data back with the DECOMPRESS() function, but to read it properly you have to convert it back into the original type.

In this case, I compressed VARCHAR data. Here’s what it looks like reading the data with just plain DECOMPRESS(), and DECOMPRESS() converted to VARCHAR and NVARCHAR.
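Here’s a minimal sketch of the round trip (the sample string is illustrative). Note that casting the decompressed bytes to the wrong type — NVARCHAR for data that was VARCHAR before compression — produces garbage:

```sql
-- COMPRESS() returns VARBINARY(MAX); DECOMPRESS() also returns VARBINARY(MAX),
-- so we must CAST back to the original data type to read it properly.
DECLARE @original VARCHAR(MAX) = REPLICATE(CONVERT(VARCHAR(MAX), 'SQL '), 1000);
DECLARE @compressed VARBINARY(MAX) = COMPRESS(@original);

SELECT
    DECOMPRESS(@compressed)                        AS still_varbinary,
    CAST(DECOMPRESS(@compressed) AS VARCHAR(MAX))  AS back_to_varchar,  -- readable
    CAST(DECOMPRESS(@compressed) AS NVARCHAR(MAX)) AS wrong_type;       -- garbled
```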

You have limited indexing options – but more than you might think

In the documentation on COMPRESS(), it says, “Compressed data cannot be indexed.” That’s not 100% true, at least not according to my understanding of the word “indexed”.

It is true that COMPRESS() outputs a VARBINARY(max) column. This data type can’t be a key column in an index.

However, columns with this data type can be used in the filter of a filtered index in a way that may be interesting for cases like our ‘Notes’ column, where most of the rows have a NULL value for ‘Notes’ and the column is highly selective.

In this case, we might want to create a covering filtered index, like this:

CREATE INDEX ix_filtertest
ON dbo.Compressed (CompressedId)
    INCLUDE (Notes)
WHERE (Notes IS NOT NULL);
GO

Using Notes as an included column does store another copy of it. But in the case of our ‘Notes’ column, the compressed column is all in-row data and is not particularly large.

To get SQL Server to use this index, we may have to write our queries carefully, though. Just using the ‘Notes’ column in a predicate doesn’t do it.

Instead, we have to explicitly add a predicate to our query that matches the IS NOT NULL filter.
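For example, a query shaped like this (using the column names from the sample table) includes the matching IS NOT NULL predicate, so SQL Server can consider the filtered index:

```sql
-- The explicit Notes IS NOT NULL predicate matches the filtered index
-- definition, so ix_filtertest is a candidate for this query.
SELECT
    CompressedId,
    CAST(DECOMPRESS(Notes) AS VARCHAR(MAX)) AS Notes
FROM dbo.Compressed
WHERE Notes IS NOT NULL;
```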

So it may be possible in some cases to use indexes with data generated with COMPRESS(), depending on what you want to do.

Downloadable SQL Server Desktop Wallpapers

I’ve set things up so you can quickly download desktop wallpapers featuring cartoons and SQL Server concepts.

Get ’em here

Grab wallpapers at sqlworkbooks.com/wall.

Want to know when I add new wallpapers?

I’m working on a new query tuning wallpaper for this week’s tempdb webcast. Subscribe to the weekly quizletter (there’s a checkbox on the wallpaper download page) to get updated as soon as a new wallpaper is added to the pack!