Tricky Situations When Rewriting Functions In SQL Server

Walk In Line


Far and away, some of the trickiest situations I run into when helping clients is rewriting scalar functions that have WHILE loops in them.

This sort of procedural code is often difficult, but not impossible, to replace with set-based logic.

Sure, lots of IF/THEN/ELSE stuff can be tough too, though that’s often easier to  manage with CASE expressions in stacked CTEs or derived tables.

I ran across a really interesting function recently that I had to rewrite that had a couple WHILE loops in it, and I’ve simplified the example here to show my approach to fixing it.

Table Face


The original intent of the function was to do some string manipulation and return a cleaned version of it.

There were several loops that looked for “illegal” characters, add in formatting characters (like dashes), etc.

The problem the function caused wasn’t it running for a long time (we’ll talk more about that tomorrow), it was that the function was called in really critical code paths that Function Repercussions© were messing with:

  • Row by row execution
  • Inhibited parallelism

These are not the kinds of functions that are Froid Friendly© either. If they were, I could largely leave them alone. Maybe.

Depends on bugs.

Bad Way


The bad way of doing this is like so. If you write functions like this, feel bad. Let it burn a little.

Ten years ago, I’d understand. These days, there’s a billion blog posts about why this is bad.

CREATE OR ALTER FUNCTION
    dbo.CountLetters_Bad
(
    @String varchar(20)
)
RETURNS bigint
AS 
BEGIN

DECLARE
    @CountLetters bigint = 0,
    @Counter int = 0;
    
    WHILE 
        LEN(@String) >= @Counter
    BEGIN
        IF PATINDEX
           (
               '%[^0-9]%', 
               SUBSTRING
               (
                   @String, 
                   LEN(@String) - @Counter, 
                   1
               )
           ) > 0
        BEGIN
            SET @CountLetters += 1;
            SET @Counter += 1;
        END;        
        ELSE        
        BEGIN
            SET @Counter += 1;
        END;
    END;

    RETURN @CountLetters;

END;
GO 


SELECT 
    CountLetters = 
        dbo.CountLetters_Bad('1A1A1A1A1A');

Better Way


This is a better way to write this specific function. It doesn’t come with all the baggage that the other function has.

But the thing is, if you just test them with the example calls at the end, you wouldn’t nearly be able to tell the difference.

CREATE OR ALTER FUNCTION 
   dbo.CountLetters
(   
    @String AS varchar(20)  
)  
RETURNS table
AS
RETURN

WITH
    t AS 
(
    SELECT TOP(LEN(@String))
        *,
        s = 
            SUBSTRING
            (
                @String, 
                n.Number +1, 
                1
            )
    FROM dbo.Numbers AS n
)
    SELECT
        NumLetters = 
            COUNT_BIG(*)
    FROM t
    WHERE PATINDEX('%[^0-9]%', t.s) > 0;
GO

SELECT
    cl.*
FROM dbo.CountLetters('1A1A1A1A1A') AS cl;

Pop Quiz Tomorrow


This is a problem I run into a lot: developers don’t really test SQL code in ways that are realistic to how it’ll be used.

  • Look, this scalar UDF runs fine for a single value
  • Look, this view runs fine on its own
  • Look, this table variable is great when I pass a test value to it

But this is hardly the methodology you should be using, because:

  • You’re gonna stick UDFs all over huge queries
  • You’re gonna join that view to 75,000 other views
  • You’re gonna let users pass real values to table variables that match lots of data

In tomorrow’s post, I’m gonna show you an example of how to better test code that calls functions, and what to look for.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.



2 thoughts on “Tricky Situations When Rewriting Functions In SQL Server

  1. I’d use a combination from TRANSLATE(), REPLACE() and LEN(). TRANSLATE() is new in SQL 2019 and can be replaced by multiple Calls of REPLACE() in older SQL versions. This way you don’t have to blow up the number of rows, so it would work on longer strings too without running into spills to tempdb because SQL has no way to know how much memory it needs. And it would work with almost any bad_char-definition too (you could even use a second parameter for the bad chars):

    USE tempdb
    GO
    CREATE OR ALTER FUNCTION dbo.f_Count_Bad_Chars
    (@input_string VARCHAR(1000))
    RETURNS TABLE AS RETURN
    WITH conf AS (SELECT ‘1234567890’ AS bad_chars
    , CHAR(255) AS dummy_char)
    SELECT @input_string AS input_string
    , conf.bad_chars
    , LEN(@input_string) – LEN(c3.input_with_removed_bad_chars) AS number_bad_chars
    FROM conf
    CROSS APPLY (SELECT REPLICATE(conf.dummy_char, LEN(conf.bad_chars)) AS dummy_string
    ) AS c1
    CROSS APPLY (SELECT TRANSLATE(@input_string, conf.bad_chars, c1.dummy_string) AS input_with_translated_bad_chars
    ) AS c2
    CROSS APPLY (SELECT REPLACE(c2.input_with_translated_bad_chars, conf.dummy_char, ”) as input_with_removed_bad_chars
    ) AS c3

    ;
    GO
    — test with ~1.5 mio row
    DROP TABLE IF EXISTS #bigger_tbl
    DROP TABLE IF EXISTS #tbl
    SELECT ac.name + CAST(ac.object_id AS VARCHAR(10)) AS name
    INTO #bigger_tbl
    FROM sys.all_columns AS ac
    CROSS JOIN (SELECT TOP 100 * FROM sys.all_columns) AS ac2

    — takes about 3 seconds on my server
    SELECT f.*
    INTO #tbl
    FROM #bigger_tbl AS bt
    CROSS APPLY dbo.f_Count_Bad_Chars(bt.name) AS f

    SELECT *
    FROM #tbl AS t

Comments are closed.