jump to navigation

Scalar Functions Not Behaving as Scalar Functions December 8, 2009

Posted by mwidlake in performance, Uncategorized.
Tags: ,
3 comments

My thanks go to Graeme Hobbs, who I currently have the pleasure of working with, for explaining this to me.

This is all about scalar functions that do not act like scalar functions.

Where I am spending much of my time at the moment, we have an issue with a very simple sql statement. It takes on average 2 buffer gets per execution. How could that be an issue? Well, it is being executed a few thousand times an hour. By “a few thousand” read 50,000 times. Even a simple statement being executed 15 times a second is not good if it is not needed. Especially, as in this case, there is a context switch.

I got a look at the code and it is doing something like this:

HAVING SUM( col.vol_gbp ) >=
my_user.pkg_application_settings.get_number_value( ‘rp157_eric’ )

I quickly batted an email to my esteemed colleague Mr Hobbs saying “but this is a scalar. It will be executed once for the statement and the value passed into each execution of the whole query, it can’t be the source of the 50,000 executions an hour”.

Oh woe, for my ignorance has undone me. But it’s OK, Graeme not only took time to explain, but knocked up the following nice demo of WHY this is a problem. And said I could put it on my Blog. What a nice chap.

From Graeme:
******************************************************

Consider package:

CREATE OR REPLACE PACKAGE BODY ghtest
IS
l_called_count NUMBER := 0;

-- Function and procedure implementations
FUNCTION f_get_called_count
RETURN NUMBER
IS
BEGIN
RETURN( l_called_count );
END;

FUNCTION f_call
RETURN NUMBER
IS
BEGIN
l_called_count := l_called_count + 1;
RETURN 1;
END;

FUNCTION f_reset
RETURN NUMBER
IS
BEGIN
l_called_count := 0;
RETURN 1;
END;
END ghtest;

– nothing up my sleeve
SELECT ghtest.f_get_called_count
FROM DUAL;

F_GET_CALLED_COUNT
——————
0
ie no calls made yet

SQL> SELECT   id
           , COUNT( * )
        FROM ( SELECT *
                FROM MY_USR.tbl_SOME_DATA
               WHERE ROWNUM < 10 )
    GROUP BY id
      HAVING COUNT( * ) > ghtest.f_call;

ID     COUNT(*)
---------- ----------
1624361594 3

SELECT ghtest.f_get_called_count
FROM DUAL;

F_GET_CALLED_COUNT
------------------
7

NOTE the counter has gone up to 7 as there were 7 IDs found from that code -6 IDs with one record and one with 3.

SELECT ghtest.f_reset
FROM DUAL;

F_RESET
———-
1

That just blanks the counter (see the package).

-- nothing up my sleeve
SELECT ghtest.f_get_called_count
FROM DUAL;

F_GET_CALLED_COUNT
------------------
0

SELECT   id
          , COUNT( * )
       FROM ( SELECT *
               FROM my_usr.tbl_some_data
              WHERE ROWNUM < 10 )
   GROUP BY id
     HAVING COUNT( * ) > (SELECT ghtest.f_call FROM dual);

ID     COUNT(*)
---------- ----------
1624361594 3

SELECT ghtest.f_get_called_count
FROM DUAL;

F_GET_CALLED_COUNT
------------------
1

You can see that the scalar subquery
HAVING COUNT( * ) > (SELECT ghtest.f_call FROM dual);
calls f_call once.

Whereas
HAVING COUNT( * ) > ghtest.f_call;
results in it being called 7 times – the number of rows returned from the group by

That is precisely the problem.
******************************************************

Apparently our developers should know this, as it is a “known issue”. But not known to me until now (unless I knew it and forgot it)

So, hats of to Graeme.

{apologies, wordpress utterly screwed the layout/truncated text when I pasted in the example, have fixed now}

Follow

Get every new post delivered to your Inbox.

Join 159 other followers