Posted on

Find the Scalar Function Call: What it Means If It’s Hiding in Probe Residual

User defined functions are fairly simple to create in SQL Server, but figuring out exactly how they are being used can take a little time – even in a simple execution plan. This is because the functions can be tucked away into lots of different operators in the plan, even join operators.

The good news: once you learn what to look for, it’s not terribly complicated. There are some fancy terms involved, but the definitions are pretty simple.

Here’s a quick look at a function call hidden in a hash match operator, which is being used to implement an inner join.

The estimated query execution plan

This query uses the Microsoft WideWorldImporters sample database. The query calls the built in scalar user defined function, Website.CalculateCustomerPrice:

SELECT
    il.InvoiceLineID
FROM Sales.Invoices AS i
JOIN Sales.InvoiceLines AS il
    ON i.InvoiceID=il.InvoiceID
WHERE 
    [Website].[CalculateCustomerPrice] ( i.CustomerID, il.StockItemID, i. InvoiceDate ) 
        <>
     il.UnitPrice;
GO

Here’s the query execution plan:

There’s not a lot of places for the function call to hide in this particular plan, right? No Compute Scalars to look into at all.

If we look at the properties of the Hash Match join, we can find the function call:

How does the hash join work?

The description at the top of this tool tip is quite helpful when it comes to remembering how hash matches work: SQL Server is building a table in memory from the Sales.Invoices table. It’s then “probing” into it with items from the Sales.InvoicesLines table.

The “Hash Keys Probe” part of the tool tip tells us that it is looking for matches based on the InvoiceID column (our join column).

What about the ‘Probe Residual’?

After we find matches based on the customer id, we have more work “left over” — that’s the “residual” bit.

For every row that matches, SQL Server is plugging values into the Website.CalculateCustomerPrice() function and comparing the result to the Unit price column, just like we asked for in the where clause.

In other words, this is happening for every row in Sales.InvoiceLines that has a matching row in Sales.Invoices.

Which is every single invoice & invoice line, as it turns out.

Bonus

If you’re ever looking for an example of a query where collecting an actual execution plan dramatically slows down execution (observe effect), this is a good one. On instances I’ve tested, this query takes 14-18 seconds to run when no plan is collected, and about a minute longer when you enable actual plan collection.

Takeaways

When you’re looking at execution plans, it can be very tricky to spot user defined functions in the plan:

  • You need to not only check into Compute Scalar operators if they are around, but also look carefully at the properties of other operators, including joins
  • If you spot the function call as part of a “probe residual” on a hash join, it is being applied to every row that meets the initial join criteria
Posted on

Adaptive Queries in Standard Edition: Interleaved Exec for Multi-Statement TVFs

It’s tough to keep track of which features work in each version of SQL Server, and which Editions support them.

My memory told me that the new Adaptive Joins feature in SQL Server 2017 was Enterprise Edition only…  and that’s correct, but I didn’t realize that the fancy new feature to make Multi-Statement TVFs smarter has much wider licensing.

Interleaved Execution for Multi-Statement Table Valued Functions works in Enterprise, Standard, Web, and Express Editions

I discovered this by accident when I was doing some testing on my Standard Edition instance. I was looking at an execution plan, and I checked the properties of a TVF and …

Function Beaver couldn’t be more excited about this

First, I double-checked and made sure I really was connected to my Standard Edition instance. YEP.

Then I checked the SQL Server 2017 Editions and Components chart to see if this was an accident or not

Sure enough, ‘Batch Mode Adaptive Joins’ and ‘Batch Mode Memory Grant Feedback’ are Enterprise Edition features.

But ‘Interleaved Execution for Multi-Statement Table Valued Functions’ is available in all the editions.

What does ‘Interleaved Execution for Multi-Statement TVFs’ do for you?

Multi-statement table valued functions are difficult to optimize as part of a query: they run multiple statements, so how the heck do you estimate how many rows are going to come out of that?

This new feature lets the optimizer stop and take a peek!

When the optimizer finds a multi-statement TVF that qualifies*, it pauses the optimization process (badass!), and then goes briefly into the execution phase.

(This is the interleaving part.) It can run through that little part of the plan to get a good estimate, instead of making a blind guess.

Good estimates means a much better chance that you’ll get the right joins, memory grant, and qualify for parallelism when needed.

That usually adds up to faster query execution.

*Which Multi-Statement TVFs qualify?

Not all multi-statement TVFs are created equal. If your TVF is part of a data modification, or if your TVF is inside a CROSS APPLY, it doesn’t qualify for this magic — at least not in SQL Server 2017. That may change in the future, this is just the V1.

How do I turn this on?

If you’re on SQL Server 2017, your database compatibility level controls whether or not this feature is available.

If your compat mode is 140, your multi-statement table valued functions are very likely to speed up.

I say ‘very likely’ because there can be some edge cases where better estimates lead to worse performance.

Want to learn more?

If you agree with Function Beaver that this is the best thing since CREATE FUNCTION…

Posted on

Why Is My Function Missing From sys.dm_exec_query_stats and Query Store?

It’s time to write a blog post, so I can save time when I go searching for this in a couple years!

Functions can be tricksy in SQL Server

It can be difficult to see functions in execution plans, and to figure out when they’re eating a lot of CPU time on your SQL Server.

One way that’s handy to see if functions are being executed frequently is to check out the sys.dm_exec_query_stats Dynamic Management View and look at the execution count, CPU time, and logical reads of the ‘CREATE FUNCTION’ statement for a given function, to estimate its impact.

But recently, I was working on a project and…

I couldn’t find my functions in sys.dm_exec_query_stats! I also couldn’t find them in Query Store.

Or, rather, I couldn’t find SOME of my functions

Some of my functions in the demo code were showing up just fine. I was really puzzled by that. I thought …

Maybe this is a bug with ‘CREATE OR ALTER’? A sign of some weird memory pressure? Something introduced in SQL Server 2017? A buggy side effect of implicit conversions in some of the functions? A problem with the queries I was using? A weird setting on the database? (Also: about 100 other things that didn’t turn out to be the case.)

I finally wrote up some simple demo code, tested it against a SQL Server 2008 R2 instance (omitting the Query Store components), compared it with SQL Server 2017, and found it to be consistent.

Some functions just don’t accrue information in sys.dm_exec_query_stats, and that’s been true for quite a while.

I tested six types of functions:

  • Scalar functions that do, and do not do data access
  • Multi-statement table valued functions that do, and do not do data access
  • Single statement table valued functions that do, and do not do data access

Which of these functions appear in sys.dm_exec_query_stats and sys.query_store_query?

Philosoraptor, reinvented

To be clear, the queries run which call the functions all appear in sys.dm_exec_query_stats and sys.query_store_query. There are no missing calling queries.

However, the calling query only shows things like how many times it was executed. It doesn’t show how many times the function was executed (the function may be executed many times per query). Sometimes that’s handy to see when you’re troubleshooting.

CREATE FUNCTION entries with per-function-execution data only appeared for my:

  • Scalar function that does data access
  • Multi-statement table valued function that does data access
  • Multi-statement table valued function that does NOT do data access

It makes sense to me that my single statement table valued functions are missing from the list, because those functions can be “inlined” into their outer queries. They don’t really run on their own.

However, it is a bit confusing that the scalar function that does not do data access is missing from the list. (I don’t think it’s a huge tragedy, but it makes it look like you’re missing something — hence me spending a while on this!)

Which of these functions appear in sys.dm_exec_function_stats?

We got sys.dm_exec_function_stats in SQL Server 2016, so you can’t use this on Ye Older SQL Instances. This view recorded information for my:

  • Scalar function that does data access
  • Scalar function that does NOT do data access

Why no table valued functions? This one is documented! The DMV simply doesn’t have information about Table Valued Functions.

What does this all mean?

If you have a lot of scalar functions that don’t do data access, and you’re on SQL Server 2016 and higher, check out sys.dm_exec_function_stats  for aggregate information on how they are running.

And if you’re just confused about why not all your functions are showing up in sys.dm_exec_query_stats or Query Store, hopefully this post saves you some time!