jump to navigation

PL/SQL bug with DBMS_RANDOM? October 8, 2015

Posted by mwidlake in bug, PL/SQL, SQL.
Tags: , , ,
trackback

I think I’ve found an (admittedly obscure) bug with DBMS_RANDOM, group functions, PL/SQL and/or SQL.

Have a look and see if you also think this is odd – or have I missed the totally obvious?

(This is all on 12.1.0.2)

{Update – my conclusion is, and thanks to Joel and Sayan for their comments, that this is not a “bug”. Oracle do not promise us how PL/SQL functions are executed due to the way SQL can be re-written by the parser. I think most of us stumbling over something like this would treat it as a bug though. You have to look at the column projection, again see the comments, to see how Oracle is deciding to get the columns derived by a naked call to DBMS_RANDOM.VALUE (by naked I mean no inclusion of parameters passed in and, significantly, no reference to columns). It’s just the way it is}

Without going into the details (we would be here for hours if I did) I’m looking into the overhead of context switching between PL/SQL and SQL. It it fairly common knowledge that when you call a PL/SQL function from SQL there is a context switch when the SQL engine hands over control to the PL/SQL engine. I’ve been doing some work into how much the overhead is and that it is incurred for each distinct PL/SQL function (plus loads of other considerations around it).

In doing so I saw something unexpected (to me, anyway) which I have simplified down to this:

select /* mdw_z1 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg_dbrav2
      ,max(created) max_cre
from all_objects
where rownum <50000

AVG_DBRAV1 AVG_DBRAV2 MAX_CRE
---------- ---------- -----------------
.502234063 .502234063 11-SEP-2014 09:31
Elapsed: 00:00:00.89

Note that the two averages of DBMS_RANDOM.VALUE are exactly the same. It is so improbably as to be pretty much impossible that over all those rows, the different random values generated for each column add up to exactly the same. They are getting the same values for each row. I’m very, very dubious of any “identical seeding” issue (ie they both get the same seed and from then provide identical values) as even if DBMS_RANDOM is basing it’s output on something like initial seed, SCN of statement and number of iterations, it is still being referenced twice for each row.

Some further evidence is that when I increase the number of calls to DBMS_RANDOM the elapsed time is almost identical and the statement CPU and PLSQL_EXEC_TIME (pulled from V$SQL) do not increase in any significant way (PLSQL_EXEC_TIME actually goes down):

select /* mdw_z2 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg_dbrav2
       ,avg(dbms_random.value) avg_dbrav3,avg(dbms_random.value) avg_dbrav4
       ,avg(dbms_random.value) avg_dbrav5,avg(dbms_random.value) avg_dbrav6
      ,max(created) max_cre
from all_objects
where rownum <50000

AVG_DBRAV1 AVG_DBRAV2 AVG_DBRAV3 AVG_DBRAV4 AVG_DBRAV5 AVG_DBRAV6 MAX_CRE
---------- ---------- ---------- ---------- ---------- ---------- -----------------
.500367568 .500367568 .500367568 .500367568 .500367568 .500367568 11-SEP-2014 09:31
Elapsed: 00:00:00.84 -- (cf .89 before)

-- information about the two SQL statements from v$SQL, identified by my comments
SQL_ID        PLAN_HASHV  PRSE  EXCS     BUFFS DISCS     CPU_T PLSQL_EXEC    RWS
------------- ---------- ----- ----- --------- ----- --------- ---------- ------
SQL_TXT
--------------------------------------------------------------------------------
4y4jsj7uy12t3 1587414607     1     1     41481     0    843750     312691      1
select /* mdw_z2 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg
gpdga3qars8p2 1587414607     1     1     41481     0    828125     316648      1
select /* mdw_z1 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg

I’ve verified that adding other PL/SQL calls adds about 0.4 seconds to the execution time per extra (simple) PL/SQL function and you see the CPU Time and PLSQL_EXEC_TIME increase.

It is as if Oracle realises the call to DBMS_RANDOM is the same and so does it once per row and puts the results into each column. It does this for other deterministic PL/SQL functions but I would (a) be worried if DBMS_RANDOM was being treated as deterministic (as it’s purpose is to NOT be deterministic) and (b) I know I have used DBMS_RANDOM to generate test data many times, often for several columns in my generating SQL SELECT statement, and I never noticed this before. So, I decided to check that it was only returning 1 value per row to populate the results for each column derived from DBMS_RANDOM.VALUE:


select dbms_random.value drv1, dbms_random.value drv2,created
from all_objects where rownum < 6

      DRV1       DRV2 CREATED
---------- ---------- -----------------
.341919389 .020497964 11-SEP-2014 08:40
.569447631 .727672346 11-SEP-2014 08:40
.019986319 .709239586 11-SEP-2014 08:40
.286970852 .144263004 11-SEP-2014 08:40
 .14440676 .538196808 11-SEP-2014 08:40

Huh? That rather damages my theory and actually works the way I expected, hoped for and thought I remembered. And you can just look at the two columns and you know they are not going to add up to exactly the same! (the last digit adds up to 21 for DRV1 , 28 for DRV2) {How many of you checked that and got 27 for DRV1?}.

So I wrote some code that should be logically identical to my original SQL statement but does the data collection “manually”:

with source as
(select /*+ materialize */
        dbms_random.value  dbrav1    ,dbms_random.value dbrav2
       ,created
from all_objects
where rownum <50000
)
select/* mdw_z3 */ avg(dbrav1)  avg_dbrav1     ,avg(dbrav2)  avg_dbrav2
      ,max(created)  max_cre
from source

AVG_DBRAV1 AVG_DBRAV2 MAX_CRE
---------- ---------- -----------------
.497954489 .497633494 11-SEP-2014 09:31

Elapsed: 00:00:00.96

As you can see, all I do is force the data to be collected into an internal temporary table using the WITH clause and hint it to stop oracle merging the code together and then average the columns. And now I get two different values for the two DBMS_RANDOM.VALUE derived columns.

This version of the code also accrues more runtime and statement CPU/PLSQL_EXEC_TIME, as I mentioned above, when I add more PL/SQL calls. In the below the extended list of “columns” version takes 1.78 seconds, CPU time increases from 843,750 microseconds to 1,703,125 and PLSQL_EXEC_TIME increases from 316,589 microseconds to 917,208

with source /*2 */ as
(select /*+ materialize */
        dbms_random.value  dbrav1    ,dbms_random.value dbrav2
       ,dbms_random.value  dbrav3    ,dbms_random.value dbrav4
       ,dbms_random.value  dbrav5    ,dbms_random.value dbrav6
       ,created
from all_objects
where rownum <50000
)
select/* mdw_z4 */ avg(dbrav1)  avg_dbrav1     ,avg(dbrav2)  avg_dbrav2
      ,avg(dbrav3)  avg_dbrav3     ,avg(dbrav4)  avg_dbrav4
      ,avg(dbrav5)  avg_dbrav5     ,avg(dbrav6)  avg_dbrav6
      ,max(created)  max_cre
from source

AVG_DBRAV1 AVG_DBRAV2 AVG_DBRAV3 AVG_DBRAV4 AVG_DBRAV5 AVG_DBRAV6 MAX_CRE
---------- ---------- ---------- ---------- ---------- ---------- -----------------
.499007362 .501985119 .498591643  .50252316  .49939127  .49804233 11-SEP-2014 09:31

Elapsed: 00:00:01.78

--
--
SQL_ID        PLAN_HASHV  PRSE  EXCS     BUFFS DISCS     CPU_T PLSQL_EXEC    RWS
------------- ---------- ----- ----- --------- ----- --------- ---------- ------
SQL_TXT
--------------------------------------------------------------------------------
49zjaaj41dg00 3034557986     1     1     43465   962   1703125     917208      1
with source /*2 */ as (select /*+ materialize */         dbms_random.value  dbra
0rtbx97f14b0k 3034557986     1     1     42294   382    843750     316586      1
with source as (select /*+ materialize */         dbms_random.value  dbrav1    ,

I did have the execution plans in here too but the post is already quite long. They are identical though, as is shown by the same value of 3034557986 for the PLAN_HASH_VALUE for both statements

So in Summary, the below two versions of the (logically identical as far as I can see) code give different results. The difference is that one is allowing Oracle to do the averaging natively and in the other I am forcing the data to be collected into an internal temporary table and then averaged:

select /* mdw_z1 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg_dbrav2
      ,max(created) max_cre
from all_objects
where rownum <50000

AVG_DBRAV1 AVG_DBRAV2 MAX_CRE
---------- ---------- -----------------
.502234063 .502234063 11-SEP-2014 09:31

with source as
(select /*+ materialize */
        dbms_random.value  dbrav1    ,dbms_random.value dbrav2
       ,created
from all_objects
where rownum <50000
)
select/* mdw_z3 */ avg(dbrav1)  avg_dbrav1     ,avg(dbrav2)  avg_dbrav2
      ,max(created)  max_cre
from source

AVG_DBRAV1 AVG_DBRAV2 MAX_CRE
---------- ---------- -----------------
.497954489 .497633494 11-SEP-2014 09:31

If no one can explain what I am missing, I suppose I should raise a bug with Oracle. Which could be tricky seeing as my access to MyOracleSupport is a bit tenuous…

In case you want to play, this is my whole test script, which does everything but query V$SQL for the statement stats at the end. I am sure you can manage that yourselves…

-- the two columns from dbms_random get the same result - which I did not expect
select /* mdw_z1 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg_dbrav2
      ,max(created) max_cre
from all_objects
where rownum <50000
/

select /* mdw_z2 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg_dbrav2
       ,avg(dbms_random.value) avg_dbrav3,avg(dbms_random.value) avg_dbrav4
       ,avg(dbms_random.value) avg_dbrav5,avg(dbms_random.value) avg_dbrav6
      ,max(created) max_cre
from all_objects
where rownum <50000
/
--
-- The below forces oracle to gather the data into an internal temporary table first.
-- and then I average that
with source as
(select /*+ materialize */
        dbms_random.value  dbrav1    ,dbms_random.value dbrav2
       ,created
from all_objects
where rownum <50000
)
select/* mdw_z3 */ avg(dbrav1)  avg_dbrav1     ,avg(dbrav2)  avg_dbrav2
      ,max(created)  max_cre
from source
/
-- and now I make it do more executions against dbms_random, to see if makes a difference
-- which will lend support to my idea it is doing more contect switching.
with source /*2 */ as
(select /*+ materialize */
        dbms_random.value  dbrav1    ,dbms_random.value dbrav2
       ,dbms_random.value  dbrav3    ,dbms_random.value dbrav4
       ,dbms_random.value  dbrav5    ,dbms_random.value dbrav6
       ,created
from all_objects
where rownum <50000
)
select/* mdw_z4 */ avg(dbrav1)  avg_dbrav1     ,avg(dbrav2)  avg_dbrav2
      ,avg(dbrav3)  avg_dbrav3     ,avg(dbrav4)  avg_dbrav4
      ,avg(dbrav5)  avg_dbrav5     ,avg(dbrav6)  avg_dbrav6
      ,max(created)  max_cre
from source
/
Advertisements

Comments»

1. jgarry - October 8, 2015

I really need my afternoon nap, but perhaps this is really sql set processing rather than a pl/sql bug. See https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:3181424400346795479 Tom makes that distinction numerous times.

2. jgarry - October 8, 2015

D’Oh, by “that distinction” I mean SQL deciding whether to optimize the call by not doing it again, which is beyond our control in set theory.

mwidlake - October 9, 2015

Hi Joel,

Thanks for that – I had it in my mind that I would be safe from the issues of when and how often the function is fired as it is in the SELECT section. As oracle can re-write your query, it can alter when (and how often) where predicates are applied, how tables are joined, sub-query or merge statements…. But looking at the Ask Tom page and thinking more, as I am using group functions and for the DBMS_RANDOM call there is no link back to the column rows, could oracle be doing some sort of optimization. I had relied on a quick scan of plans earlier in the work (as I indicate in the text, I’m looking at a lot of things with the use of PL/SQL functions) and the same PHV in this case that they are the same…

But Sayan is showing me in the below comments I was being sloppy.

Bad Martin, BAD Martin.

🙂

3. Sayan Malakshinov - October 9, 2015

The plan shows that this is a CBO bug.
Have a look at projection sections

SQL> @tests/random.sql
SQL> with v1000 as (select level id from dual connect by level<=1000)
  2  select
  3      avg(dbms_random.value) avg_dbrav1, avg(dbms_random.value) avg_dbrav2
  4     ,avg(dbms_random.value) avg_dbrav3, avg(dbms_random.value) avg_dbrav4
  5     ,avg(dbms_random.value) avg_dbrav5, avg(dbms_random.value) avg_dbrav6
  6  from v1000;

AVG_DBRAV1 AVG_DBRAV2 AVG_DBRAV3 AVG_DBRAV4 AVG_DBRAV5 AVG_DBRAV6
---------- ---------- ---------- ---------- ---------- ----------
.505652247 .505652247 .505652247 .505652247 .505652247 .505652247

SQL> select * from table(dbms_xplan.display_cursor('','','typical +projection'));

Plan hash value: 2919102380

-------------------------------------------------------------------------------
| Id  | Operation                      | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |      |       |     2 (100)|          |
|   1 |  SORT AGGREGATE                |      |     1 |            |          |
|   2 |   VIEW                         |      |     1 |     2   (0)| 00:00:01 |
|   3 |    CONNECT BY WITHOUT FILTERING|      |       |            |          |
|   4 |     FAST DUAL                  |      |     1 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=0) COUNT("DBMS_RANDOM"."VALUE"())[22],
       SUM("DBMS_RANDOM"."VALUE"())[22]
   3 - LEVEL[4]

Compare with it:

SQL> with v1000 as (select level id from dual connect by level<=1000)
  2  select
  3      avg(dbms_random.value-0    ) avg_dbrav1, avg(dbms_random.value-0    ) avg_dbrav2
  4     ,avg(dbms_random.value-0-0  ) avg_dbrav3, avg(dbms_random.value-0-0  ) avg_dbrav4
  5     ,avg(dbms_random.value-0-0-0) avg_dbrav5, avg(dbms_random.value-0-0-0) avg_dbrav6
  6  from v1000;

AVG_DBRAV1 AVG_DBRAV2 AVG_DBRAV3 AVG_DBRAV4 AVG_DBRAV5 AVG_DBRAV6
---------- ---------- ---------- ---------- ---------- ----------
.503317965 .503317965 .498451869 .498451869 .490114791 .490114791

SQL> select * from table(dbms_xplan.display_cursor('','','typical +projection'));

Plan hash value: 2919102380

-------------------------------------------------------------------------------
| Id  | Operation                      | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |      |       |     2 (100)|          |
|   1 |  SORT AGGREGATE                |      |     1 |            |          |
|   2 |   VIEW                         |      |     1 |     2   (0)| 00:00:01 |
|   3 |    CONNECT BY WITHOUT FILTERING|      |       |            |          |
|   4 |     FAST DUAL                  |      |     1 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=0) COUNT("DBMS_RANDOM"."VALUE"()-0-0-0)[22],
       COUNT("DBMS_RANDOM"."VALUE"()-0-0)[22],
       COUNT("DBMS_RANDOM"."VALUE"()-0)[22],
       SUM("DBMS_RANDOM"."VALUE"()-0-0-0)[22],
       SUM("DBMS_RANDOM"."VALUE"()-0-0)[22], SUM("DBMS_RANDOM"."VALUE"()-0)[22]
   3 - LEVEL[4]

First plan shows in the projection sections, that others avg(dbms_random.value) were eliminated. BTW, interesting that we can see COUNT(DBMS_RANDOM.VALUE) and SUM(DBMS_RANDOM.VALUE) – I think it can force calculation error – maybe SUM counts different results of “dbms_random.value” from COUNT values.

mwidlake - October 9, 2015

Thank you very much for that – I think you have highlighted that I was not being rigorous enough when I was relying on seeing the Plan Hash Value the same. I’m sure I’ve seen from either Tom Kyte or Jonathan Lewis that this PHV does not include all the projections and filters, I should have held that in mind.

I’ll double check with my test case a little later today and confirm.

mwidlake - October 10, 2015

I’ve checked my original test case now (it’s not the one on the blog, I “simplified” the code for people to try by using ALL_OBJECTS but it makes the EXPLAIN_PLAN massive.
I see the same thing – the relevant parts of the actual plan are the same between version that get the same values for all calls to DBMS_RANDOM compared to that which gets different values, but the column projection is different. I need to find out more about what the column projection sections are saying to me (I might be badgering some friends at OOW15!) but I can see the difference (shown below).

My conclusion is this is NOT a bug. It is a side effect of using functions that are not deterministic in the context I am using them – and Oracle is very specific about how they have never in the past and never will ensure that the number of times a function is referenced will match your expectations, due to the ability of the SQL Parser to re-write your code.

-- gives same value for all columns derived from DBMS_STATS.VALUE
-- There is only 1 Count and 1 SUM reference to the function in the column projection
--
select /* mdw_x1 */ avg(dbms_random.value)
avg_dbrav1,avg(dbms_random.value) avg_dbrav2       ,avg(str_util.dr1)
avg_stutdr1   ,avg(str_util.dr2) avg_stutdr2       ,avg(addr_id)
avg_addr_id from person where pers_id >100100 and rownum < 100000

Plan hash value: 3553163410

-------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |         |       |       |  7790 (100)|          |
|   1 |  SORT AGGREGATE                       |         |     1 |    10 |            |          |
|*  2 |   COUNT STOPKEY                       |         |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| PERSON  |   100K|   976K|  7790   (1)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                  | PERS_PK |       |       |   216   (1)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(ROWNUM<100000)
   4 - access("PERS_ID">100100)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=0) COUNT("ADDR_ID")[22], COUNT("STR_UTIL"."DR2"())[22],
       COUNT("STR_UTIL"."DR1"())[22], COUNT("DBMS_RANDOM"."VALUE"())[22], SUM("ADDR_ID")[22],
       SUM("STR_UTIL"."DR2"())[22], SUM("STR_UTIL"."DR1"())[22],
       SUM("DBMS_RANDOM"."VALUE"())[22]
   2 - "ADDR_ID"[NUMBER,22]
   3 - "ADDR_ID"[NUMBER,22]
   4 - "PERSON".ROWID[ROWID,10]


-- Gives different values for columns derived from DBMS_RANDOM.VALUE
-- There are SUM and COUNT references to all of the column aliases derived from DBMS_RANDOM
-- (It is a shame the call to DBMS_RANDOM is not clearly shown but I think it is because it is at
-- "a lower level" in the SQL)
with source as (select /*+ materialize */         dbms_random.value
dbrav1    ,dbms_random.value dbrav2        ,str_util.dr1       stutdr1
 ,str_util.dr2      stutdr2        ,addr_id  from person  where pers_id
>100100 and rownum < 100000 ) select/* mdw_x3 */ avg(dbrav1)
avg_dbrav1     ,avg(dbrav2)  avg_dbrav2       ,avg(stutdr1) avg_stutdr1
   ,avg(stutdr2) avg_stutdr2       ,avg(addr_id)  avg_addr_id from
source

Plan hash value: 3667624103

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                           |       |       |  7828 (100)|          |
|   1 |  TEMP TABLE TRANSFORMATION             |                           |       |       |         |     |
|   2 |   LOAD AS SELECT                       |                           |       |       |         |     |
|*  3 |    COUNT STOPKEY                       |                           |       |       |         |     |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| PERSON                    |   100K|   976K|  7790   (1)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN                  | PERS_PK                   |       |       |   216   (1)| 00:00:01 |
|   6 |   SORT AGGREGATE                       |                           |     1 |    65 |         |     |
|   7 |    VIEW                                |                           | 99999 |  6347K|    39   (3)| 00:00:01 |
|   8 |     TABLE ACCESS FULL                  | SYS_TEMP_0FD9D663A_5C9E6B | 99999 |   976K|    39   (3)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(ROWNUM<100000)
   5 - access("PERS_ID">100100)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - COUNT("ADDR_ID")[22], COUNT("STUTDR2")[22], COUNT("STUTDR1")[22], COUNT("DBRAV2")[22],
       COUNT("DBRAV1")[22], SUM("ADDR_ID")[22], SUM("STUTDR2")[22], SUM("STUTDR1")[22], SUM("DBRAV2")[22],
       SUM("DBRAV1")[22]
   2 - SYSDEF[4], SYSDEF[0], SYSDEF[1], SYSDEF[112], SYSDEF[0]
   3 - "PERSON".ROWID[ROWID,10], "PERS_ID"[NUMBER,22], "ADDR_ID"[NUMBER,22], ROWNUM[8]
   4 - "PERSON".ROWID[ROWID,10], "PERS_ID"[NUMBER,22], "ADDR_ID"[NUMBER,22]
   5 - "PERSON".ROWID[ROWID,10], "PERS_ID"[NUMBER,22]
   6 - (#keys=0) COUNT("ADDR_ID")[22], COUNT("STUTDR2")[22], COUNT("STUTDR1")[22], COUNT("DBRAV2")[22],
       COUNT("DBRAV1")[22], SUM("ADDR_ID")[22], SUM("STUTDR2")[22], SUM("STUTDR1")[22], SUM("DBRAV2")[22],
       SUM("DBRAV1")[22]
   7 - "DBRAV1"[NUMBER,22], "DBRAV2"[NUMBER,22], "STUTDR1"[NUMBER,22], "STUTDR2"[NUMBER,22],
       "ADDR_ID"[NUMBER,22]
   8 - "C0"[NUMBER,22], "C1"[NUMBER,22], "C2"[NUMBER,22], "C3"[NUMBER,22], "C4"[NUMBER,22]
4. Sayan Malakshinov - October 9, 2015

btw, different executions of DBMS_RANDOM may return same results.
From documentation:
> It will automatically initialize with the date, user ID, and
> process ID if no explicit initialization is performed.

So parallel slaves may return same results:
http://orasql.org/2012/06/13/dbms_random-in-parallel/


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: