Why is UDF so much slower than a subquery?

I have a case when I need to translate (search) several values โ€‹โ€‹from one table. The first way I wrote this is to use subqueries:

SELECT (SELECT id FROM user WHERE user_pk = created_by) AS creator, (SELECT id FROM user WHERE user_pk = updated_by) AS updater, (SELECT id FROM user WHERE user_pk = owned_by) AS owner, [name] FROM asset 

Since I often use this subquery (that is, I have about 50 tables with these fields), and I might need to add some more code to the subquery (for example, "AND active = 1"), I thought that put them in a user-defined function UDF and use this. But the performance using this UDF was terrible.

 CREATE FUNCTION dbo.get_user ( @user_pk INT ) RETURNS INT AS BEGIN RETURN ( SELECT id FROM ice.dbo.[user] WHERE user_pk = @user_pk ) END SELECT dbo.get_user(created_by) as creator, [name] FROM asset 

Performance # 1 is less than 1 second. Productivity No. 2 is about 30 seconds ...

Why, or more importantly, is there a way I can code in SQL Server 2008 so I don't have to use so many subqueries?

Edit:

Just a little more explanation of when this is helpful. This simple request (i.e. get userid) becomes much more complicated when I want to get text for the user, since I need to combine with the profile to get the language, with the company to see if the language should be selected. 'from there instead and with a translation table to get the translated text. And for most of these requests, performance is a minor issue for readability and serviceability.

+12
performance sql sql-server sql-server-2008 user-defined-functions
Feb 04 '09 at 10:01
source share
4 answers

UDF is a black box for query optimizer, so it runs for each row. You make a line cursor. For each row in the asset, find the identifier three times in another table. This happens when you use scalar or multi-operator user-defined functions (built-in user-defined functions are just macros that expand into an external query)

One of the many articles on this issue is Scalar Functions, Embedding, and Performance: An entertaining title for a boring post .

Subqueries can be optimized to correlate and eliminate row operations.

What you really want is:

 SELECT uc.id AS creator, uu.id AS updater, uo.id AS owner, a.[name] FROM asset a JOIN user uc ON uc.user_pk = a.created_by JOIN user uu ON uu.user_pk = a.updated_by JOIN user uo ON uo.user_pk = a.owned_by 

February 2019 update

SQL Server 2019 is starting to fix this problem.

+32
04 Feb '09 at 10:09
source share

Like other posters, using associations will certainly give you the best overall performance.

However, since you stated that you do not need a headache to maintain similar 50-union joins or subqueries, try using the built-in table function as follows:

 CREATE FUNCTION dbo.get_user_inline (@user_pk INT) RETURNS TABLE AS RETURN ( SELECT TOP 1 id FROM ice.dbo.[user] WHERE user_pk = @user_pk -- AND active = 1 ) 

Then your original request will become something like:

 SELECT (SELECT TOP 1 id FROM dbo.get_user_inline(created_by)) AS creator, (SELECT TOP 1 id FROM dbo.get_user_inline(updated_by)) AS updater, (SELECT TOP 1 id FROM dbo.get_user_inline(owned_by)) AS owner, [name] FROM asset 

A function an inline function oriented to a table should have better performance than a scalar function or a multi-valued table function.

Performance should be roughly equivalent to the original request, but any future changes can be made to UDF, which makes it much more convenient.

+12
Feb 04 '09 at 11:54
source share

To get the same result (NULL if the user is deleted or inactive).

  select u1.id as creator, u2.id as updater, u3.id as owner, [a.name] FROM asset a LEFT JOIN user u1 ON (u1.user_pk = a.created_by AND u1.active=1) LEFT JOIN user u2 ON (u2.user_pk = a.created_by AND u2.active=1) LEFT JOIN user u3 ON (u3.user_pk = a.created_by AND u3.active=1) 
+2
Feb 04 '09 at 10:40
source share

Am I missing something? Why is this not working? You select only the identifier that you already have in the table:

 select created_by as creator, updated_by as updater, owned_by as owner, [name] from asset 

By the way, when designing, you should really avoid keywords such as name , such as field names.

0
04 Feb '09 at 14:53
source share



All Articles