jump to navigation

My first book is now physically in my hands. August 22, 2016

Posted by mwidlake in Instrumentation, PL/SQL, publications, SQL, writing.
Tags: , ,
4 comments
Proud "parent" of a bouncing baby book

Proud “parent” of a bouncing baby book

Today a box arrived from Oracle Press. In it were a few copies of “Real-World SQL and PL/SQL” which I co-authored with Arup Nanda, Brendan Tierney, Heli Helskyaho and Alex Nuitjen. I know I only blogged about the book a couple of weeks back, how I became involved and the impact it had on my life for several months. But as I can now physically handle and read the final article, I could not resist putting up a quick post on it. Honestly, I’ll stop being a book bore soon.

My contribution to the book was three chapters in the section “Essential Everyday Advanced PL/SQL”. The idea was to covers some core, standard ways of using PL/SQL which are often overlooked or implemented without considering the impact they can have. There are a few things I cover that are often talked about, generally regarded as a good thing to do – but so often are not done! So just to quickly summarise my chapters:

Chapter 6 is about running PL/SQL from SQL, ie calling both built-in and user defined functions from SQL. It’s a great way to compartmentalise your business logic and extend the capabilities of Oracle’s SQL implementation in an easy and seamless manner. Only people are often unaware of the potential performance and read consistency impact it can have, or how Oracle 11g and 12c help reduce these issues.

Chapter 7, “Instrumenting and Profiling PL/SQL”, covers something that I feel is a major oversight in many PL/SQL development projects. Instrumenting your code, any code (not just PL/SQL), is vital to producing an application that is professional and will continue to work correctly for many, many years. However, it’s a bit like washing your hands after going to the loo – we all know it is the correct thing to do but so many people just don’t! Without instrumentation it is almost impossible to see how your code is performing, where time is spent and where problems are when they occur. I’m sick of having to guess where the problem is when people report slow performance when some basic and light-weight instrumentation will tell you exactly where the problem is. And as for profiling PL/SQL, it’s one of the rarest things to be done but it is so helpful.

It physically exists

It physically exists

Chapter 9 is on using PL/SQL for Automation and Administration. Like many people, I have automated many tasks with a PL/SQL harness – backups, partitions maintenance, metric gathering, data life-cycle management, regular data loads. You end up writing the same sort of process over and over again and usually there are several versions of such controlling frameworks across a company, written by different people (and sometimes the same people!). A large part of this chapter takes the code for creating the examples from chapter 6 and the instrumentation from chapter 7 and builds up a simple but comprehensive framework which can be used to control almost any data load or administrative task you need to do with an Oracle database. The key thing is it can be used for many, many processes so you need only the one framework. So you don’t have to keep writing them as it’s boring to keep writing them. And because the framework demonstrated includes instrumentation, the framework you implement will be easy to monitor and debug in years to come. I have to confess, I kind of wrote that chapter “just for me”. It is my standard automation framework that I now always use, so I can concentrate on the actual task being done and not the nuts-and-bolts of controlling it, and I’ve wanted to properly document it for years.

Something all the authors agreed on is that whilst most technical books describe how elements of a language or feature work, they do not really talk about the “how and why” you do things. The stuff you learn by using the language for a long time and doing a good deal of things wrong. In this book we attempt to put in some of that “how and why”. In my chapters there are a few anecdotes about when things have gone wrong and, as a result, you learn some of the “how not”:-)

I’m at a lot of conferences over the next few months, including OOW16, DOAG16 and UKOUG Tech16. If you get a copy of the book and want it signed, you’ll find me very happy to do so. Many of my co-authors are at these events too, so you could get us all to scribble all over the book. NB this will not work for electronic versions:-)

BTW do you like the t-shirt?

The Book. August 4, 2016

Posted by mwidlake in PL/SQL, SQL, writing.
Tags: , , ,
6 comments

I’ve just added a picture to the right side of this site. It is for a book about SQL and PL/SQL. If you look at the image of the front cover, at the bottom is a list of authors and, near the end, is my name. It’s all finished and at the printers, but it is not out yet – It should be published in the next few weeks.

The British part of me wants to mumble and say “oh, yes, hmmm, I did contribute to a book… but I think you should concentrate on the chapters by the other chaps, they are proper experts, very clever gentleman and lady… I was just involved in a couple of lesser chapters…”

The part of me that spent weeks and months of late nights and long weekends writing it wants to scream “Look! LOOK! I damn well got it done! And it was way more painful than any of my author friends told me it would be as, despite their best efforts, I did not get How Hard Writing A Book Is!
I BLED FOR THAT BOOK!”

And the final part of me wants to say “If you buy this book, would you mind awfully sending it to me and I’ll cut out my chapters and paste in new ones with a few more things covered and a bit more clarity and I really should have mentioned… and I’ll send it back”. Apparently this is exactly how Douglas Adams and Terry Pratchet felt about all their books, so I feel in good company in that respect. I re-wrote one chapter 3 times and I still think I could do it better. Think? I know I could do it better!!!! Next year I’ll do it better than the current better…

How did I get involved in this… nightmare? It was Brendan’s fault. I hate Brendan. My wife hates Brendan. My cat, before she passed on, hated Brendan. When I am drinking beers with him in September, around the fun-fair that is OOW16, I’m going to suddenly turn around and hit him Really Very Hard somewhere soft. Somewhere reproductive I think…

It was, I believe, March 2015 that Brendan Tierney asked me if I had ever thought of writing a book. I said “yes” and explained how I’d had some ideas back in my Teens about what “intelligent aliens” would really be like and the oddities of people – but then found Mr Adams had covered that way better than I ever could. And then I had thought about a spoof on Dungeons and Dragons but then found Pratchett had that totally covered and I now had only one idea left… “No…” he said “I mean a technical book – about Oracle”. Oh! After all, he said, I blogged, presented and wrote articles. What was the difference?

Brendan and Heli Helskyaho had come up with the idea for a book about SQL and PL/SQL which was not an intro book and not a huge tome about all aspect of either – but more about using both languages to solve real-world issues, based on real experience. It would be aimed at those who could write reasonable SQL and who could throw together a quick PL/SQL program/package but wanted to know more about how experts used the languages based on experience. They had Arup Nanda on board already as well as Alex Nuijten and Chet Justice. I knew these people! Arup is a brilliant DBA and teacher, Alex is one of the best presenters on the circuit and Chet is Oraclenerd! All are ACE Directors. So I said no – looking at the 5 of them, I was not an expert. I’m just a skilled-but-still-learning journeyman.

At this point Brendan got tetchy at me (‘being tetchy’, for non-UK people, means ‘easily annoyed but doing a very poor job of hiding you are annoyed’). “how long have you programmed in SQL and PL/SQL?” about 25 years – before PL/SQL was really ‘out there’…
“When did you last develop a production solution in PL/SQL?” About 2 months ago – it was cool, it was fully instrumented, restartable and used plain SQL for the heavy lifting…and bulk processed the rest…
“Who’s better at this than you”. Well, Adrian Billington, Boneist Dawn, Andy Clarke… for SQL Stew Ashton, Chris Saxon is sh1t hot… “so you can name your peers?!?”.
“what is the most challenging thing you have done with PL/SQL?” – I listed a few things…

The point he was making was, I’ve used both languages for two and a half decades to solve problems others had struggled with. OK, I am not the “Best”, but I’m not bad and I’ve done things wrong often enough to learn some lessons! I know I can deliver a solid solution that will either still be working properly in 10 years or, in my eyes more importantly, telling you why it is not. And the key thing was, as Brendan pointed out, I was happy to share.

So I agreed to contribute in a minor way.

And then Chet had to pull out for personal reasons – and guess who inherited half of what we was covering?:-). I thus became an equal player. (Just a quick note, Chet stayed as our tech editor and he kept me “honest”. Everyone on the book helped me, the new guy, keep up.)

Writing a book is a lot, lot, lot harder than writing a blog or an article. I’d been told about this – I was a non-technical reviewer(*) for Jonathan Lewis’s “Oracle Core” and we talked a little about it the whole process – and there was a long, long discussion between the Oaktable members about the effort and financial reward of book authorship (“an awful lot” and “sod all” respectively). I still did not get it. If you are writing a chapter that is 20 times longer than an article it is not simply 20 times harder or takes 20 times as long. It is both, plus a dash more. Part of the reason is the need to get a flow through such a large body of text and I wanted to do that across my 3 chapters. The other is, somehow a book feels more important and you want to makes sure your mistakes are kept to a minimum – both for your own pride and so as not to mislead the reader. Also, as a shared book (and I was the only new author involved) I was very conscious of letting the side down.

So the reality was that for 6 months I worked on those 3 chapters and, during that time, I struggled to maintain my duties as a house husband, the garden went to hell and my regular exercise ceased. Occasional days were so bad that the cat went unfed and my wife had to cook her own dinner. The hard stares were difficult to take, as was my wife being annoyed with me. And I was only doing a few chapters!

But it is done and I am now looking forward to seeing a copy “in the flesh”. I think that will feel weird. One of my regrets in life is that I did not stay in science long enough to be published. I feel this makes up for that.

Would I do it again? No. I’d rather go back to commuting into London every day and I hated that.

Will I change my mind in a year or two? Maybe. I still have that one idea for a Sci-Fi book.

(*) I represented the “knows some stuff but is still learning” intended reader of Jonathan’s book – I was not correcting mistakes or advising him on technical content. I was saying “please tell me more about X as I’m still confused”. I rather enjoyed it.

Getting Your Transaction SCN – USERENV(COMMITSCN) January 19, 2016

Posted by mwidlake in development, performance, SQL.
Tags: , , ,
2 comments

A few days ago I was introduced (or re-introduced) to USERENV(‘COMMITSCN’) by Jonathan Lewis. This is an internal function that allows limited access to the SCN of your transaction.

I was trying to find a way to get the actual commit SCN easily as it struck me that Oracle would have it to hand somewhere and it would be unique to the change and generated very efficiently. I could not find anything to do it so I asked Jonathan and he pointed me straight to this post he did about it a while back. What a nice chap. However, the post is from 1999 (last CENTURY!) so I thought I should just check it out first…

I’ll say up front that this is an undocumented feature of a now-deprecated function. Well, almost undocumented – older SQL reference manuals mention that ‘commitscn’ returns a NUMBER as opposed to the VARCHAR2 returned by most parameters you can use with USERENV, but it does not list it as a valid parameter for that function.
USERENV has been deprecated since Oracle 10g (see the old 10g documentation link here about USERENV) and you have been instructed to use SYS_CONTEXT(‘userenv’,’parameter‘) as described in the 12c database SQL reference manual here. However, there is no way to get the commit SCN from SYS_CONTEXT that I can find, so I thought I’d check out if USERENV(‘COMMITSCN’) still works. It does, on my version of Oracle 12.1.0.2!

There are some strict limits to this function. To begin with, you can’t select it, you can only use it on insert/update:


-- attempt a simple select of the SCN
mdw> select userenv('commitscn') from dual;
select userenv('commitscn') from dual
               *
ERROR at line 1:
ORA-01725: USERENV('COMMITSCN')  not allowed here

--But I can use in an insert,
mdw> insert into test_scn (seq_no,vc1,scn_no)
  2  values (100,'abcd',userenv('commitscn'))

1 row created.

mdw> select * from test_scn where seq_no=100

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
       100 abcd         11144739

-- Now commit my new record
mdw> commit;
Commit complete.

mdw> select * from test_scn where seq_no=100
  2  /

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
       100 abcd         11144753

--LOOK at the value for SCN_NO now! Compare to before the commit!

If you look at the above you will see a couple of things. The first is that, as I said, you cannot SELECT the function USERENV(‘COMMITSCN’).
The other is, though a value is put into the column when I insert a row using it, and I see that when I query the information back, it changes when I commit. This is because Oracle is recording something at the point of commit, not at the point of the SQL statement running – and the new SCN is only generated when you commit. A lot could have happened since I did the INSERT, I might have gone for a cup of tea and a batch job kicked off doing 1 million transactions, each with it’s own SCN. So though a placeholder of the current SCN is put into your view of the table row, the value put in the actual table is generated at the time of the commit.

Another limiting rule is that you can only reference USERENV(‘COMMITSCN’) once in a transaction, for one row. If I try and create 2 rows using the function in the same transaction I get an error, if I try to update more than 1 row I get an error:

mdw> insert into test_scn (seq_no,vc1,scn_no)
  2  values (101,'abcd',userenv('commitscn'))

1 row created.

mdw> insert into test_scn (seq_no,vc1,scn_no)
  2  values (102,'abcd',userenv('commitscn'))

insert into test_scn (seq_no,vc1,scn_no)
            *
ERROR at line 1:
ORA-01721: USERENV(COMMITSCN) invoked more than once in a transaction

-- now test updating several records
mdw> commit;
Commit complete.

mdw> select * from test_scn;

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
         1 AAAAAAA      11143743
         2 AAAAAAA      11143746
         3 AAAAAAA      11143749
         4 AAAAAAA      11143774
         5 AAAAAAA      11143777
       100 abcd         11144753
       101 abcd         11145543

mdw> update test_scn set scn_no = userenv('commitscn');
update test_scn set scn_no = userenv('commitscn')
       *
ERROR at line 1:
ORA-01721: USERENV(COMMITSCN) invoked more than once in a transaction

-- but one row works
mdw> update test_scn set scn_no = userenv('commitscn') where rownum =1;
1 row updated.

USERENV(‘COMMITSCN’) is old, undocumented and limited in use. So why am I looking at it, let alone even telling you all about it? {Apart from the fact that a lot of you *love* this sort of obscure, tid-bitty stuff}. Well, as there is no replacement for it that I am aware of. You can get the current SCN in a couple of ways, the easiest probably being to get it from V$DATABASE:

mdw> select current_scn from v$database;
any key>

CURRENT_SCN
-----------
   11146718

However, that is the last SCN used at the time you check it and is not the SCN when you commit. ie it is a different thing. I always find it irksome on those odd occasions when something is deprecated in Oracle with nothing really to replace it.

I just demonstrate again that USERENV(‘COMMITSCN’) is a little special below, and not the same as just selecting SCN from V$DATABASE. Before I go any further, I think the value USERENV(‘COMMITSCN’) puts into the table is the actual COMMIT SCN minus 1. I mostly think this as Jonathan said so:-). I do see each time I run this test that the first select from V$DATABASE and then my insert and a commit straight away results in a value in the table 2 higher than the select.

Further iterations (2nd and 3rd in this case) show the value selected from V$DATABASE and the value inserted into TEST_SCN immediately after are the same, and are 3 higher than the previous iteration. I anticipated an increase of two, once for the change to the UNDO tablespace for the insert and once for the insert. I am not sure where the third one comes in.

However, in the fourth iteration I have a PAUSE in my SQL*Plus script between checking V$DATABASE and doing my insert and, in a second session, I do some simple inserts and commits {it does not matter what, so I don’t show it}. Thus the difference between the SCN I collected from V$DATABASE and the value inserted into the table.
Finally, in the fifth iteration, I check the value in V$DATABASE, do the insert, query it back and see the two are the same. And THEN I pause so I can do some changes and commit them in my second session. After I’ve done that I continue my first session which commits my latest insert into TEST_SCN. I check the values actually stored in the table and, just as at the top of this post, you see that the value actually preserved in the table is a later SCN than the placeholder one. It is doing something special.

(the below has the noddy code to create my test table and sequence as well as the test)

-- test_scn1
--create table test_scn (seq_no number,vc1 varchar2(10),scn_no number)
-- create sequence scn_seq;
select current_scn from v$database;
insert into test_scn values (scn_seq.nextval,'AAAAAAA',userenv('commitscn'));
commit;
select * from test_scn;
select current_scn from v$database;
insert into test_scn values (scn_seq.nextval,'AAAAAAA',userenv('commitscn'));
commit;
select * from test_scn;
select current_scn from v$database;
insert into test_scn values (scn_seq.nextval,'AAAAAAA',userenv('commitscn'));
commit;
select * from test_scn;
select current_scn from v$database;
pause 'commit some stuff in second session and then press any key'
insert into test_scn values (scn_seq.nextval,'AAAAAAA',userenv('commitscn'));
commit;
select * from test_scn;
select current_scn from v$database;
insert into test_scn values (scn_seq.nextval,'AAAAAAA',userenv('commitscn'));
pause 'commit some stuff in second session again and then press any key'
select * from test_scn;
commit;
select * from test_scn;
select current_scn from v$database;

-- the output of the test
mdw> @test_scn1

--check V$DATABASE SCN
CURRENT_SCN
-----------
   11147809

-- Create and commit 1 row
1 row created.
Commit complete.

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
        11 AAAAAAA      11147811
-- note that the inserted SCN is 2 higher than the current SCN.

--Same steps, 2nd iteration
CURRENT_SCN
-----------
   11147814

1 row created.
Commit complete.

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
        11 AAAAAAA      11147811
        12 AAAAAAA      11147814
-- select SCN and inserted SCN are the same, 3 higher than first iteration

-- same steps, 3rd iteration
CURRENT_SCN
-----------
   11147817

1 row created.
Commit complete.

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
        11 AAAAAAA      11147811
        12 AAAAAAA      11147814
        13 AAAAAAA      11147817
-- select SCN and inserted SCN are the same, 3 higher than first iteration

-- 4th iteration, a pause
CURRENT_SCN
-----------
   11147820

'commit some stuff in second session and then press any key'
-- I did indeed change and commit some stuff in second session, before I create my record in test_scn

1 row created.
Commit complete.

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
        11 AAAAAAA      11147811
        12 AAAAAAA      11147814
        13 AAAAAAA      11147817
        14 AAAAAAA      11147831
-- larger gap in SCN (11147817 to 11147831

-- 5th iteration, pause now after insert and before commit
CURRENT_SCN
-----------
   11147834

1 row created.
'commit some stuff in second session again and then press any key'
-- I did indeed change and commit some stuff in second session 

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
        11 AAAAAAA      11147811
        12 AAAAAAA      11147814
        13 AAAAAAA      11147817
        14 AAAAAAA      11147831
        15 AAAAAAA      11147834

-- Notice the current_scn from V$DATABASE and the last row in the table match, 11147834..

Commit complete.

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
        11 AAAAAAA      11147811
        12 AAAAAAA      11147814
        13 AAAAAAA      11147817
        14 AAAAAAA      11147831
        15 AAAAAAA      11147842

-- But after the commit the SCN in row "15" has increased to 11147842.

CURRENT_SCN
-----------
   11147851

-- and the next check of V$DATABASE SCN shows the usual increase of 3 by the commit.

As you can see from the above, USERENV(‘COMMITSCN’) is doing something a little special and, despite all the limitations, I might actually have a use for it…

Pragma UDF – Some Current Limitations November 11, 2015

Posted by mwidlake in performance, PL/SQL, SQL, Testing.
Tags: , , , ,
6 comments

There are currently some limitations to when pragma UDF will speed up your calls to PL/SQL functions from SQL.

In my post introducing the new pragma UDF feature of Oracle 12c I explained how it can be used to reduce the impact of context switching when you call a PL/SQL function from SQL.

In my example I showed how running a SQL-only SELECT statement that formatted a name for display over 100,000 records took 0.03 seconds went up to 0.33 seconds when the formatting SQL was put in a user defined PL/SQL function. This impact on performance is a shame as it is so beneficial to encapsulate business logic in one single place with PL/SQL. Stating that the PL/SQL function is a user defined one with the pragma UDF option reduced the run time to 0.08 seconds – which is removing most of the context switching overhead. Check out the prior post for full details.

This improvement in performance is great news and is as good, and sometimes better, than using the other new capability of 12c – allowing you to state a function as part of a SQL statement using the WITH clause, if you know about that (I plan to do a further post on that). As a quick example, here is my display name function code expressed within a WITH clause:

with 
  function l_disp_name(p_sn      in varchar2
                      ,p_fn1     in varchar2
                      ,p_fn2     in varchar2 :=null  
                      ,p_title   in varchar2 :=null )
return varchar2
is
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end l_disp_name;
select  max(l_disp_name(p_sn =>surname           ,p_fn1  =>first_forename
                       ,p_fn2=>second_forename   ,p_title=>pers_title)        ) text_output
       ,count(*)
from pers
/

The above runs in 0.10 seconds, just slightly slower than 0.08 for my pragma UDF function

However, I need to warn you of some current limitations to pragma UDF. Nearly all the examples on the web so far are

  • using very, very simple functions that take in a number and return a number
  • Use a stand-alone stored function

And they work fine. However, I had real trouble getting a performance gain when I was working with my function that took in four varchar2 inputs and returned a varchar2 value. No error was given when I marked the function with pragma UDF but there was no performance gain (or loss).

I eventually worked out some limitations to pragma UDF on my version of Oracle – 12.1.0.2.0

  1. It gives a performance boost when the inputs and return values are NUMBER, VARCHAR2, multiple VARCHAR2 IN parameters
  2. There is no performance boost when either or both the IN parameter or RETURN value is a DATE
  3. There is no performance boost if there are any default values for VARCHAR2 IN parameters
  4. If the function gains a performance benefit from pragma UDF as a standalone stored function, it appears to also gain an advantage if it is a function defined as pragma UDF within a package – so you can still keep all your functions in packages.

You might notice that my example of using the WITH clause states a function that has default values. The WITH option gains the performance advantage of that feature just fine with IN parameter defaults.

The take-home message is that, at present, pragma UDF only seems to help functions with certain types of IN or RETURN values and is nullified by default values – so if you see no performance gain for your functions, this might be why. I need to stress that my tests were not exhaustive, I have not investigated many other combinations.

I’ve discussed the issue with a couple of people within Oracle and the relevant Product Manager is looking to investigate further for me, which is jolly decent of the fellow.

My investigation is of course only by empirical testing, it does not reveal how pragma UDF works. But, as I said in my first post, it seems to aid how information is passed between the PL/SQL and SQL engines as it is variation in those that seem to nullify the benefit of pragma UDF. If you want to duplicate my tests, you can do with the below scripts. I show my test output first, with comments produced with PROMPT commands. I then give you the SQL to create the test table, the functions and package I used and the test script. Feel free to take, expand and let me know of anything different or further you may find. I say nothing of interest after the scripts, so this is in effect the end of the post:-)

The output of my test, with prompts:

running udf_tests
investigating why pragam udf helps some simple functions and not others
---------------------------------------------------------------------- --

simple number in-number out function
NUMBER_OUTPUT   COUNT(*)                                                                            
------------- ----------                                                                            
  10000000000     100000                                                                            
Elapsed: 00:00:00.12

NUMBER_OUTPUT   COUNT(*)                                                                            
------------- ----------                                                                            
  10000000000     100000                                                                            
Elapsed: 00:00:00.03
** udf helps

simple varchar in-varchar out function
TEXT_OUTPUT                      COUNT(*)                                                           
------------------------------ ----------                                                           
XYYYY                              100000                                                           
Elapsed: 00:00:00.12

TEXT_OUTPUT                      COUNT(*)                                                           
------------------------------ ----------                                                           
XYYYY                              100000                                                           
Elapsed: 00:00:00.04
** udf helps

two varchar in-varchar out function, is the issue with more than one in parameter?
TEXT_OUTPUT                      COUNT(*)                                                           
------------------------------ ----------                                                           
XYYYYYYYYY                         100000                                                           
Elapsed: 00:00:00.14

TEXT_OUTPUT                      COUNT(*)                                                           
------------------------------ ----------                                                           
XYYYYYYYYY                         100000                                                           
Elapsed: 00:00:00.04
** udf helps

simple date in-date out function
DATE_OUTPUT            COUNT(*)                                                                     
-------------------- ----------                                                                     
14-MAY-2010 13:11        100000                                                                     
Elapsed: 00:00:00.15

DATE_OUTPUT            COUNT(*)                                                                     
-------------------- ----------                                                                     
21-NOV-2004 13:11        100000                                                                     
Elapsed: 00:00:00.15
***************************************************************SIMILAR TIME!!!
udf does not help

is date out the issue,  date in-num out function
NUMBER_OUTPUT   COUNT(*)                                                                            
------------- ----------                                                                            
      2454431     100000                                                                            
Elapsed: 00:00:00.17

NUMBER_OUTPUT   COUNT(*)                                                                            
------------- ----------                                                                            
      2454231     100000                                                                            
Elapsed: 00:00:00.18
***************************************************************SIMILAR TIME!!!
udf does not help

is date in the issue,  num in-date out function
DATE_OUTPUT            COUNT(*)                                                                     
-------------------- ----------                                                                     
07-AUG-2018 18:11        100000                                                                     
Elapsed: 00:00:00.21

DATE_OUTPUT            COUNT(*)                                                                     
-------------------- ----------                                                                     
11-NOV-2015 17:57        100000                                                                     
Elapsed: 00:00:00.21
***************************************************************SIMILAR TIME!!!
udf does not help

so back to my original function I had issues with
a difference with the multiple vcs in func and my orig func is my orig had defaults
thus I will try a version with no defaults
TEXT_OUTPUT                      COUNT(*)                                                           
------------------------------ ----------                                                           
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000                                                           
Elapsed: 00:00:00.19

TEXT_OUTPUT                      COUNT(*)                                                           
------------------------------ ----------                                                           
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000                                                           
Elapsed: 00:00:00.08
****************************************************************UDF has an IMPACT

now with one of the parameters set to a default
TEXT_OUTPUT                      COUNT(*)                                                           
------------------------------ ----------                                                           
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000                                                           
Elapsed: 00:00:00.32

TEXT_OUTPUT                      COUNT(*)                                                           
------------------------------ ----------                                                           
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000                                                           
Elapsed: 00:00:00.32
****************************************************************UDF has NO IMPACT
****************************************************************ALSO defaults cause both versions to be slower

now call the simple disp_name_udf function that benefits standalone from within a package

standalone
TEXT_OUTPUT                      COUNT(*)                                                           
------------------------------ ----------                                                           
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000                                                           
Elapsed: 00:00:00.08

within package
TEXT_OUTPUT                      COUNT(*)                                                           
------------------------------ ----------                                                           
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000                                                           
Elapsed: 00:00:00.08
*********************************************** WORKS - so long as neither spec of body have prm defaults

and just to round of, using a subquery factored function which my prior tests showed reduced overhead
TEXT_OUTPUT                      COUNT(*)                                                           
------------------------------ ----------                                                           
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000                                                           
Elapsed: 00:00:00.10

the WITH function benefits even with defaults
TEXT_OUTPUT                      COUNT(*)                                                           
------------------------------ ----------                                                           
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000                                                           
Elapsed: 00:00:00.10

Creating the test table

drop table pers;
create table pers
(pers_id             number(8)    not null
,surname             varchar2(30) not null
,first_forename      varchar2(30) not null
,second_forename     varchar2(30)
,pers_title          varchar2(10)
,sex_ind             char(1)      not null
,dob                 date
,addr_id             number(8)
,pers_comment        varchar2(2000)
)
/
insert into pers
select rownum 
      ,rpad(chr(65+mod(rownum,24)),10,chr(65+mod(rownum,25))) 
      ,rpad(chr(65+mod(rownum,24)),10,chr(65+mod(rownum,25))) @cre
      ,rpad(chr(65+mod(rownum,24)),10,chr(65+mod(rownum,25))) 
      ,decode(mod(rownum,4),0,'MR',1,'MRS',2,'Ms',3,'MR','DR')
      ,decode(mod(rownum,2),0,'M',1,'F')
      ,sysdate - (3000+mod(rownum,30000))
      ,rownum +1001
      ,rpad(chr(65+mod(rownum,24)),200,chr(65+mod(rownum,25))) 
from dual
connect by level < 100001
/

Creating the functions and a small package

--num_num
CREATE OR REPLACE FUNCTION normal_num_num(p_id IN NUMBER) RETURN NUMBER IS
v_num number;
BEGIN
  v_num:=p_id*p_id;
  RETURN v_num;
END;
/
CREATE OR REPLACE FUNCTION udf_num_num(p_id IN NUMBER) RETURN NUMBER IS
PRAGMA UDF;
v_num number;
BEGIN
  v_num:=p_id*p_id;
  RETURN v_num;
END;
/
--
-- vc_vc
CREATE OR REPLACE FUNCTION normal_vc_vc(p_id IN varchar2) RETURN varchar2 IS
v_vc varchar2(100);
BEGIN
  v_vc:=substr(p_id,1,5);
  RETURN v_vc;
END;
/
CREATE OR REPLACE FUNCTION udf_vc_vc(p_id IN varchar2) RETURN varchar2 IS
PRAGMA UDF;
v_vc varchar2(100);
BEGIN
  v_vc:=substr(p_id,1,5);
  RETURN v_vc;
END;
/
-- vc_vc_2
CREATE OR REPLACE FUNCTION normal_vc_vc_2(p_id1 IN varchar2,p_id2 IN varchar2) RETURN varchar2 IS
v_vc varchar2(100);
BEGIN
  v_vc:=substr(p_id1,1,5)||substr(p_id2,2,5);
  RETURN v_vc;
END;
/
CREATE OR REPLACE FUNCTION udf_vc_vc_2(p_id1 IN varchar2,p_id2 IN varchar2) RETURN varchar2 IS
PRAGMA UDF;
v_vc varchar2(100);
BEGIN
  v_vc:=substr(p_id1,1,5)||substr(p_id2,2,5);
  RETURN v_vc;
END;
/
--
-- dt_dt
CREATE OR REPLACE FUNCTION normal_dt_dt(p_id IN date) RETURN date IS
v_dt date;
BEGIN
  v_dt:=p_id+1000;
  RETURN v_dt;
END;
/
CREATE OR REPLACE FUNCTION udf_dt_dt(p_id IN date) RETURN date IS
PRAGMA UDF;
v_dt date;
BEGIN
  v_dt:=p_id-1000;
  RETURN v_dt;
END;
/
-- dt_num
CREATE OR REPLACE FUNCTION normal_dt_num(p_id IN date) RETURN number IS
v_num number;
BEGIN
  v_num:=to_char(p_id,'J')+100;
  RETURN v_num;
END;
/
CREATE OR REPLACE FUNCTION udf_dt_num(p_id IN date) RETURN number IS
PRAGMA UDF;
v_num number;
BEGIN
  v_num:=to_char(p_id,'J')-100;
  RETURN v_num;
END;
/
-- num_dt
CREATE OR REPLACE FUNCTION normal_num_dt(p_id IN number) RETURN DATE IS
v_dt date;
BEGIN
  v_dt:=sysdate+(p_id/100);
  RETURN v_dt;
END;
/
CREATE OR REPLACE FUNCTION udf_num_dt(p_id IN number) RETURN DATE IS
PRAGMA UDF;
v_dt date;
BEGIN
  v_dt:=sysdate-(p_id/100);
  RETURN v_dt;
END;
/
create or replace function normal_disp_name (p_sn      in varchar2
                       ,p_fn1     in varchar2
                       ,p_fn2     in varchar2  
                       ,p_title   in varchar2  ) return varchar2 is
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end;
/
create or replace function udf_disp_name (p_sn      in varchar2
                       ,p_fn1     in varchar2
                       ,p_fn2     in varchar2  
                       ,p_title   in varchar2  ) return varchar2 is
PRAGMA UDF;
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end;
/
create or replace function normal_disp_name_defaults (p_sn      in varchar2
                       ,p_fn1     in varchar2
                       ,p_fn2     in varchar2  
                       ,p_title   in varchar2 :=null  ) return varchar2 is
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end;
/
create or replace function udf_disp_name_defaults (p_sn      in varchar2
                       ,p_fn1     in varchar2
                       ,p_fn2     in varchar2  
                       ,p_title   in varchar2 :=null ) return varchar2 is
PRAGMA UDF;
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end;
/
create or replace package t_pkg as
function udf_disp_name (p_sn      in varchar2
                       ,p_fn1     in varchar2
                       ,p_fn2     in varchar2  
                       ,p_title   in varchar2 ) return varchar2;
end t_pkg;
/
create or replace package body t_pkg as
function udf_disp_name (p_sn      in varchar2
                       ,p_fn1     in varchar2
                       ,p_fn2     in varchar2  
                       ,p_title   in varchar2 ) return varchar2 is
PRAGMA UDF;
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end;
end t_pkg;
/

The test script

-- udf_tests
set lines 100 pages 50
set feed off
col text_output     form a30
col number_output   form 99999999999
col date_output     form a20
spool udf_tests.lst
prompt  running udf_tests
prompt
set pause off
set autotrace off
set timi on
prompt investigating why pragam udf helps some simple functions and not others
prompt ----------------------------------------------------------------------- 
--
prompt
--
prompt simple number in-number out function
select /* mdw_16a */       max(normal_num_num(pers_id)) number_output
       ,count(*) from pers
/
select /* mdw_16b */        max(udf_num_num(pers_id)) number_output
       ,count(*)  from pers
/
prompt
prompt ** udf helps
prompt
--
prompt simple varchar in-varchar out function
select /* mdw_16c */       max(normal_vc_vc(surname)) text_output
       ,count(*) from pers
/
select /* mdw_16d */       max(udf_vc_vc(surname)) text_output
       ,count(*)  from pers
/
prompt
prompt ** udf helps
prompt
--
--
prompt  two varchar in-varchar out function, is the issue with more than one in parameter?
select /* mdw_16e */       max(normal_vc_vc_2(surname,first_forename)) text_output
       ,count(*) from pers
/
select /* mdw_16f */       max(udf_vc_vc_2(surname,first_forename)) text_output
       ,count(*)  from pers
/
prompt
prompt ** udf helps
prompt
--
prompt simple date in-date out function
select /* mdw_16g */       max(normal_dt_dt(DOB)) date_output
       ,count(*) from pers
/
select /* mdw_16h */       max(udf_dt_dt(DOB)) date_output
       ,count(*)  from pers
/
prompt ***************************************************************SIMILAR TIME!!!
prompt udf does not help
prompt

--
prompt is date out the issue,  date in-num out function
select /* mdw_16i */       max(normal_dt_num(DOB)) number_output
       ,count(*) from pers
/
select /* mdw_16j */       max(udf_dt_num(DOB)) number_output
       ,count(*)  from pers
/
prompt ***************************************************************SIMILAR TIME!!!
prompt udf does not help

--
prompt is date in the issue,  num in-date out function
select /* mdw_16k */       max(normal_num_dt(pers_id)) date_output
       ,count(*) from pers
/
select /* mdw_16l */       max(udf_num_dt(pers_id)) date_output
       ,count(*)  from pers
/
prompt ***************************************************************SIMILAR TIME!!!
prompt udf does not help
--
--
prompt
prompt so back to my original function I had issues with
prompt a difference with the multiple vcs in func and my orig func is my orig had defaults
prompt thus I will try a version with no defaults
prompt 
select /* mdw_16m */
        max(normal_disp_name(surname,first_forename,second_forename,pers_title)) text_output
       ,count(*) from pers
/
select /* mdw_16n */
        max(udf_disp_name(surname,first_forename,second_forename,pers_title)) text_output
       ,count(*)  from pers
/
prompt ****************************************************************UDF has an IMPACT
prompt
prompt
prompt now with one of the parameters set to a default 
select /* mdw_16o */
        max(normal_disp_name_defaults(surname,first_forename,second_forename,pers_title)) text_output
       ,count(*) from pers
/
select /* mdw_16p */
        max(udf_disp_name_defaults(surname,first_forename,second_forename,pers_title)) text_output
       ,count(*)  from pers
/
prompt ****************************************************************UDF has NO IMPACT
prompt ****************************************************************ALSO defaults cause both versions to be slower
prompt 
prompt now call the simple disp_name_udf function that benefits standalone from within a package
prompt
prompt standalone
select /* mdw_16q */
        max(udf_disp_name(surname,first_forename,second_forename,pers_title)) text_output
       ,count(*)  from pers
/
prompt
prompt within package
select /* mdw_16r */
        max(t_pkg.udf_disp_name(surname,first_forename,second_forename,pers_title)) text_output
       ,count(*)  from pers
/
prompt *********************************************** WORKS - so long as neither spec of body have prm defaults
prompt
prompt and just to round of, using a subquery factored function which my prior tests showed reduced overhead
with 
  function l_disp_name(p_sn      in varchar2
                      ,p_fn1     in varchar2
                      ,p_fn2     in varchar2  
                      ,p_title   in varchar2 )
return varchar2
is
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end l_disp_name;
select /*mdw_16s */
        max(l_disp_name(p_sn =>surname           ,p_fn1  =>first_forename
                       ,p_fn2=>second_forename   ,p_title=>pers_title)        ) text_output
       ,count(*)
from pers
/

prompt the WITH function benefits even with defaults
with 
  function l_disp_name(p_sn      in varchar2
                      ,p_fn1     in varchar2
                      ,p_fn2     in varchar2 :=null  
                      ,p_title   in varchar2 :=null )
return varchar2
is
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end l_disp_name;
select /*mdw_16t */
        max(l_disp_name(p_sn =>surname           ,p_fn1  =>first_forename
                       ,p_fn2=>second_forename   ,p_title=>pers_title)        ) text_output
       ,count(*)
from pers
/
--
spool off

Pragma UDF – Speeding Up your PL/SQL Functions Called From SQL November 4, 2015

Posted by mwidlake in performance, PL/SQL, SQL.
Tags: , , ,
6 comments

A new feature for PL/SQL was introduced in V12, pragma UDF. UDF stands for User Defined Functions. It can speed up any SQL you have that uses PL/SQL functions you created yourself.

{please see this second post on some limitations of pragma UDF in respect of IN & RETURN data types and parameter defaults}.

We can create our own functions in PL/SQL and they can be called from both PL/SQL and SQL. This has been possible since V7.3 and is used extensively by some sites to extend the capabilities of the database and encapsulate business logic.

A problem with this, though, is that every time you swap from SQL to PL/SQL (or the other way around) you have to do a context switch each time, which can be quite cpu and memory intensive. If you are using your own PL/SQL function in the SELECT list of a SQL statement and you are selecting a lot of rows (say as part of a business report) then the overhead can be quite considerable as you could be doing a context switch per row. I won’t go into too much detail here (partly as I go in to considerable detail on the subject in a book I am working on for 2016) on how you can investigate the context switching and when exactly it occurs, but I will show you one of the two new ways in Oracle 12 to reduce the overhead, namely PRAGMA UDF. At present this seems to be a little used and rarely-mentioned feature on the blogsphere, with articles just covering simple examples of almost no-business-function, numeric functions.

I’ll give you a slightly less simple example but my next post will give you details on some limitations of pragma UDF. Here I am just setting the scene. I have the below PERSON table which has the parts of the names in distinct columns, with the contents forced to upper case (as is standard practice). We will create a function to provide a nicely init-capped and spaced display name and a second function which is identical but uses PRAGMA UDF.

PERSON
Name                                     Null?    Type
---------------------------------------- -------- ---------------
PERS_ID                                  NOT NULL NUMBER(8)
SURNAME                                  NOT NULL VARCHAR2(30)
FIRST_FORENAME                           NOT NULL VARCHAR2(30)
SECOND_FORENAME                                   VARCHAR2(30)
PERS_TITLE                                        VARCHAR2(10)
SEX_IND                                  NOT NULL CHAR(1)
DOB                                               DATE
ADDR_ID                                           NUMBER(8)
STAFF_IND                                         CHAR(1)
LAST_CONTACT_ID                                   NUMBER(8)
PERS_COMMENT                                      VARCHAR2(2000)


create or replace function normal_disp_name (p_sn      in varchar2
                                            ,p_fn1     in varchar2
                                            ,p_fn2     in varchar2  
                                            ,p_title   in varchar2  ) return varchar2 is
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end;
/
create or replace function udf_disp_name (p_sn      in varchar2
                                         ,p_fn1     in varchar2
                                         ,p_fn2     in varchar2  
                                         ,p_title   in varchar2  ) return varchar2 is
-- The Below is the KEY bit
PRAGMA UDF;
v_return     varchar2(1000);
-- {Identical from here}

-- select some data with one of the functions, it does not matter which
select pers_title title,    first_forename    ,second_forename    , surname
      ,normal_disp_name(p_sn =>surname           ,p_fn1  =>first_forename
                       ,p_fn2=>second_forename   ,p_title=>pers_title) display_name
from person
...


TITLE   first_fn   secon_fn   SURNAME         DISPLAY_NAME
------- ---------- ---------- --------------- ----------------------------
MR      HARRISON   RICHARD    HARRIS          Mr Harrison R Harris
MRS     ANNEKA     RACHAEL    HARRIS          Mrs Anneka R Harris
MRS     NICKIE     ALISON     ELWIG           Mrs Nickie A Elwig
MASTER  JAMES      DENZIL     ELWIG           Master James D Elwig
MR      JEFF                  GARCIA          Mr Jeff Garcia
...
MRS     AMELIA     MARIA      ORPINGTON-SMYTH Mrs Amelia M Orpington-Smyth

So we have our test table, you can see my normal_disp_name function and that the *only* difference with the second version is the inclusion of PRAGMA_UDF in the declaration section. That is partly why it is such a nice feature, you can just add this one line to existing code and you should get the benefit. Should….. {see second post when I do it}

Finally, I show some code using the function and the output.

To demonstrate the impact of context switching I will select 100,000 records from my test table in 3 ways: using only native SQL functions and thus no context switching; using my traditional PL/SQL function which suffers from context switching; with my new “pragma UDF” function to reduce the overhead of the context switching.

select avg(length(
      case when pers_title is null then ''
                   else initcap(pers_title)||' '
              end
            ||initcap(first_forename)||' '
            ||case when second_forename is null then ''
                   else substr(second_forename,1,1)||' '
              end
            ||initcap(surname)
          )      )  avg_name_length
       ,count(*)
from person
where pers_id > 100100
and rownum < 100000

select  avg(length(normal_disp_name(p_sn =>surname           ,p_fn1  =>first_forename
                                   ,p_fn2=>second_forename   ,p_title=>pers_title)        ) ) disp_name_len
       ,avg(addr_id)
       ,count(*)
from person
where pers_id > 100100 and rownum < 100000

select  avg(length(udf_disp_name(p_sn =>surname           ,p_fn1  =>first_forename
                                ,p_fn2=>second_forename   ,p_title=>pers_title)        ) ) disp_name_len
       ,avg(addr_id)
       ,count(*)
from person
where pers_id > 100100 and rownum < 100000

One thing to mention in passing is that the code using either function is much easier to read and self-documenting. This is one of the benefits of proceduralising your code, as well as creating just one place to maintain it. If I had to re-use that native-SQL section in a half-dozen reports I would probably mess up at least one of the times I cut-and-paste it and I would now have several places to maintain that code.

I ran the test statements several times and took the average of the 2nd to 6th runs, to remove the initial parsing & caching overhead that comes with the first execution and to get more reliable figures than one further run would give me.

Version                      Run Time average (secs)
Native SQL                   0.03
Traditional PL/SQL           0.33
PRAGMA UDF PL/SQL            0.08

As you can see, just including PRAGMA UDF removed most of the overhead caused by context switching.

How does PRAGMA UDF work? I’m not sure, the official Oracle documentation is pretty light on it and just says:

“The UDF pragma tells the compiler that the PL/SQL unit is a user defined function that is used primarily in SQL statements, which might improve its performance

Note the italics (which are mine). “Might improve its performance” but no detail as to what it does. As I understand it, it alters the internal representation of data as it is passed between the SQL and PL/SQL engines via the IN and RETURN values (Note it does not change the data types!) – but treat that as a bit of wild speculation for now. I have some evidence for it that you will see in, yes, the next post.

Pragma UDF can slow down slightly functions being called directly from PL/SQL. So use it only for functions you know will be called from SQL.

I’ll make one other observation. Using PL/SQL functions increased the run time to process 100,000 records on my modest test system by all of 0.3 seconds. But that is 10 times the time taken for the native SQL statement. Pragma UDF removes around 80% of this overhead. It’s a nice saving but is probably inconsequential if your code is actually doing any physical IO at all (my example is processing already cached blocks). And if you are only processing a few records or one record in a GUI screen, the context switching is moot {meaning, of no significance}.

But if you have code that processes a huge set of data and uses a lot of user defined PL/SQL functions (and again I go into a lot more detail about this in the book) using pragma UDF in 12C could gain you quite a bit of extra performance. If you have code where even 0.00001 seconds is important (think trading systems) then again there may be a worthwhile benefit.

PL/SQL bug with DBMS_RANDOM? October 8, 2015

Posted by mwidlake in bug, PL/SQL, SQL.
Tags: , , ,
7 comments

I think I’ve found an (admittedly obscure) bug with DBMS_RANDOM, group functions, PL/SQL and/or SQL.

Have a look and see if you also think this is odd – or have I missed the totally obvious?

(This is all on 12.1.0.2)

{Update – my conclusion is, and thanks to Joel and Sayan for their comments, that this is not a “bug”. Oracle do not promise us how PL/SQL functions are executed due to the way SQL can be re-written by the parser. I think most of us stumbling over something like this would treat it as a bug though. You have to look at the column projection, again see the comments, to see how Oracle is deciding to get the columns derived by a naked call to DBMS_RANDOM.VALUE (by naked I mean no inclusion of parameters passed in and, significantly, no reference to columns). It’s just the way it is}

Without going into the details (we would be here for hours if I did) I’m looking into the overhead of context switching between PL/SQL and SQL. It it fairly common knowledge that when you call a PL/SQL function from SQL there is a context switch when the SQL engine hands over control to the PL/SQL engine. I’ve been doing some work into how much the overhead is and that it is incurred for each distinct PL/SQL function (plus loads of other considerations around it).

In doing so I saw something unexpected (to me, anyway) which I have simplified down to this:

select /* mdw_z1 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg_dbrav2
      ,max(created) max_cre
from all_objects
where rownum <50000

AVG_DBRAV1 AVG_DBRAV2 MAX_CRE
---------- ---------- -----------------
.502234063 .502234063 11-SEP-2014 09:31
Elapsed: 00:00:00.89

Note that the two averages of DBMS_RANDOM.VALUE are exactly the same. It is so improbably as to be pretty much impossible that over all those rows, the different random values generated for each column add up to exactly the same. They are getting the same values for each row. I’m very, very dubious of any “identical seeding” issue (ie they both get the same seed and from then provide identical values) as even if DBMS_RANDOM is basing it’s output on something like initial seed, SCN of statement and number of iterations, it is still being referenced twice for each row.

Some further evidence is that when I increase the number of calls to DBMS_RANDOM the elapsed time is almost identical and the statement CPU and PLSQL_EXEC_TIME (pulled from V$SQL) do not increase in any significant way (PLSQL_EXEC_TIME actually goes down):

select /* mdw_z2 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg_dbrav2
       ,avg(dbms_random.value) avg_dbrav3,avg(dbms_random.value) avg_dbrav4
       ,avg(dbms_random.value) avg_dbrav5,avg(dbms_random.value) avg_dbrav6
      ,max(created) max_cre
from all_objects
where rownum <50000

AVG_DBRAV1 AVG_DBRAV2 AVG_DBRAV3 AVG_DBRAV4 AVG_DBRAV5 AVG_DBRAV6 MAX_CRE
---------- ---------- ---------- ---------- ---------- ---------- -----------------
.500367568 .500367568 .500367568 .500367568 .500367568 .500367568 11-SEP-2014 09:31
Elapsed: 00:00:00.84 -- (cf .89 before)

-- information about the two SQL statements from v$SQL, identified by my comments
SQL_ID        PLAN_HASHV  PRSE  EXCS     BUFFS DISCS     CPU_T PLSQL_EXEC    RWS
------------- ---------- ----- ----- --------- ----- --------- ---------- ------
SQL_TXT
--------------------------------------------------------------------------------
4y4jsj7uy12t3 1587414607     1     1     41481     0    843750     312691      1
select /* mdw_z2 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg
gpdga3qars8p2 1587414607     1     1     41481     0    828125     316648      1
select /* mdw_z1 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg

I’ve verified that adding other PL/SQL calls adds about 0.4 seconds to the execution time per extra (simple) PL/SQL function and you see the CPU Time and PLSQL_EXEC_TIME increase.

It is as if Oracle realises the call to DBMS_RANDOM is the same and so does it once per row and puts the results into each column. It does this for other deterministic PL/SQL functions but I would (a) be worried if DBMS_RANDOM was being treated as deterministic (as it’s purpose is to NOT be deterministic) and (b) I know I have used DBMS_RANDOM to generate test data many times, often for several columns in my generating SQL SELECT statement, and I never noticed this before. So, I decided to check that it was only returning 1 value per row to populate the results for each column derived from DBMS_RANDOM.VALUE:


select dbms_random.value drv1, dbms_random.value drv2,created
from all_objects where rownum < 6

      DRV1       DRV2 CREATED
---------- ---------- -----------------
.341919389 .020497964 11-SEP-2014 08:40
.569447631 .727672346 11-SEP-2014 08:40
.019986319 .709239586 11-SEP-2014 08:40
.286970852 .144263004 11-SEP-2014 08:40
 .14440676 .538196808 11-SEP-2014 08:40

Huh? That rather damages my theory and actually works the way I expected, hoped for and thought I remembered. And you can just look at the two columns and you know they are not going to add up to exactly the same! (the last digit adds up to 21 for DRV1 , 28 for DRV2) {How many of you checked that and got 27 for DRV1?}.

So I wrote some code that should be logically identical to my original SQL statement but does the data collection “manually”:

with source as
(select /*+ materialize */
        dbms_random.value  dbrav1    ,dbms_random.value dbrav2
       ,created
from all_objects
where rownum <50000
)
select/* mdw_z3 */ avg(dbrav1)  avg_dbrav1     ,avg(dbrav2)  avg_dbrav2
      ,max(created)  max_cre
from source

AVG_DBRAV1 AVG_DBRAV2 MAX_CRE
---------- ---------- -----------------
.497954489 .497633494 11-SEP-2014 09:31

Elapsed: 00:00:00.96

As you can see, all I do is force the data to be collected into an internal temporary table using the WITH clause and hint it to stop oracle merging the code together and then average the columns. And now I get two different values for the two DBMS_RANDOM.VALUE derived columns.

This version of the code also accrues more runtime and statement CPU/PLSQL_EXEC_TIME, as I mentioned above, when I add more PL/SQL calls. In the below the extended list of “columns” version takes 1.78 seconds, CPU time increases from 843,750 microseconds to 1,703,125 and PLSQL_EXEC_TIME increases from 316,589 microseconds to 917,208

with source /*2 */ as
(select /*+ materialize */
        dbms_random.value  dbrav1    ,dbms_random.value dbrav2
       ,dbms_random.value  dbrav3    ,dbms_random.value dbrav4
       ,dbms_random.value  dbrav5    ,dbms_random.value dbrav6
       ,created
from all_objects
where rownum <50000
)
select/* mdw_z4 */ avg(dbrav1)  avg_dbrav1     ,avg(dbrav2)  avg_dbrav2
      ,avg(dbrav3)  avg_dbrav3     ,avg(dbrav4)  avg_dbrav4
      ,avg(dbrav5)  avg_dbrav5     ,avg(dbrav6)  avg_dbrav6
      ,max(created)  max_cre
from source

AVG_DBRAV1 AVG_DBRAV2 AVG_DBRAV3 AVG_DBRAV4 AVG_DBRAV5 AVG_DBRAV6 MAX_CRE
---------- ---------- ---------- ---------- ---------- ---------- -----------------
.499007362 .501985119 .498591643  .50252316  .49939127  .49804233 11-SEP-2014 09:31

Elapsed: 00:00:01.78

--
--
SQL_ID        PLAN_HASHV  PRSE  EXCS     BUFFS DISCS     CPU_T PLSQL_EXEC    RWS
------------- ---------- ----- ----- --------- ----- --------- ---------- ------
SQL_TXT
--------------------------------------------------------------------------------
49zjaaj41dg00 3034557986     1     1     43465   962   1703125     917208      1
with source /*2 */ as (select /*+ materialize */         dbms_random.value  dbra
0rtbx97f14b0k 3034557986     1     1     42294   382    843750     316586      1
with source as (select /*+ materialize */         dbms_random.value  dbrav1    ,

I did have the execution plans in here too but the post is already quite long. They are identical though, as is shown by the same value of 3034557986 for the PLAN_HASH_VALUE for both statements

So in Summary, the below two versions of the (logically identical as far as I can see) code give different results. The difference is that one is allowing Oracle to do the averaging natively and in the other I am forcing the data to be collected into an internal temporary table and then averaged:

select /* mdw_z1 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg_dbrav2
      ,max(created) max_cre
from all_objects
where rownum <50000

AVG_DBRAV1 AVG_DBRAV2 MAX_CRE
---------- ---------- -----------------
.502234063 .502234063 11-SEP-2014 09:31

with source as
(select /*+ materialize */
        dbms_random.value  dbrav1    ,dbms_random.value dbrav2
       ,created
from all_objects
where rownum <50000
)
select/* mdw_z3 */ avg(dbrav1)  avg_dbrav1     ,avg(dbrav2)  avg_dbrav2
      ,max(created)  max_cre
from source

AVG_DBRAV1 AVG_DBRAV2 MAX_CRE
---------- ---------- -----------------
.497954489 .497633494 11-SEP-2014 09:31

If no one can explain what I am missing, I suppose I should raise a bug with Oracle. Which could be tricky seeing as my access to MyOracleSupport is a bit tenuous…

In case you want to play, this is my whole test script, which does everything but query V$SQL for the statement stats at the end. I am sure you can manage that yourselves…

-- the two columns from dbms_random get the same result - which I did not expect
select /* mdw_z1 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg_dbrav2
      ,max(created) max_cre
from all_objects
where rownum <50000
/

select /* mdw_z2 */ avg(dbms_random.value) avg_dbrav1,avg(dbms_random.value) avg_dbrav2
       ,avg(dbms_random.value) avg_dbrav3,avg(dbms_random.value) avg_dbrav4
       ,avg(dbms_random.value) avg_dbrav5,avg(dbms_random.value) avg_dbrav6
      ,max(created) max_cre
from all_objects
where rownum <50000
/
--
-- The below forces oracle to gather the data into an internal temporary table first.
-- and then I average that
with source as
(select /*+ materialize */
        dbms_random.value  dbrav1    ,dbms_random.value dbrav2
       ,created
from all_objects
where rownum <50000
)
select/* mdw_z3 */ avg(dbrav1)  avg_dbrav1     ,avg(dbrav2)  avg_dbrav2
      ,max(created)  max_cre
from source
/
-- and now I make it do more executions against dbms_random, to see if makes a difference
-- which will lend support to my idea it is doing more contect switching.
with source /*2 */ as
(select /*+ materialize */
        dbms_random.value  dbrav1    ,dbms_random.value dbrav2
       ,dbms_random.value  dbrav3    ,dbms_random.value dbrav4
       ,dbms_random.value  dbrav5    ,dbms_random.value dbrav6
       ,created
from all_objects
where rownum <50000
)
select/* mdw_z4 */ avg(dbrav1)  avg_dbrav1     ,avg(dbrav2)  avg_dbrav2
      ,avg(dbrav3)  avg_dbrav3     ,avg(dbrav4)  avg_dbrav4
      ,avg(dbrav5)  avg_dbrav5     ,avg(dbrav6)  avg_dbrav6
      ,max(created)  max_cre
from source
/

STANDARD date considerations in Oracle SQL and PL/SQL July 29, 2015

Posted by mwidlake in internals, PL/SQL.
Tags: , ,
7 comments

Most of us know that the Oracle DATE datatype has upper and lower limits. From the Oracle 11g Database Concepts manual:

Oracle Database can store dates in the Julian era, ranging from January 1, 4712 BCE through December 31, 9999 CE (Common Era, or ‘AD’). Unless BCE (‘BC’ in the format mask) is specifically used, CE date entries are the default.

I never believe 100% anything I read, so I’ll try that. I’ll set my session to show dates with the AD/BC indicator and step back in time:

ora122> ALTER SESSION SET NLS_DATE_FORMAT = 'DD-MON-YYYYAD';
Session altered.

-- select today
ora122> select sysdate from dual;

SYSDATE
-------------
29-JUL-2015AD

--now let us go back to "the edge of time"
ora122> select sysdate -2457232 from dual;

SYSDATE-24572
-------------
01-JAN-4712BC

ora122> select sysdate -2457233 from dual;
select sysdate -2457233 from dual
               *
ERROR at line 1:
ORA-01841: (full) year must be between -4713 and +9999, and not be 0

-- Now to do similar in PL/SQL

declare
out_text varchar2(100);
begin
  select to_char(sysdate) into out_text from dual;
  dbms_output.put_line (out_text);
end;

ora122> @std1

31-DEC-4713BC

PL/SQL procedure successfully completed.

How did I do that? We can see from the SQL that the documentation is correct and SQL refuses to accept a date before the lower limit. How did I get a date before 01-JAN-4712BC in my PL/SQL? Especially as my default SYSDATE?

I’ll let you think about that for 30 seconds, you can look at a picture of my recently gone and much missed cat (NB she is happily snoozing in this shot!).

I miss this fleabag

I *really* miss this fleabag

So how did I do it? I cheated.

But I cheated in a very, very interesting way. I did not show you all of my PL/SQL code, which I now reveal below…:

declare
sysdate varchar2(20) := '31-DEC-4713BC';
begin
declare
out_text varchar2(100);
begin
  select to_char(sysdate) into out_text from dual;
  dbms_output.put_line (out_text);
end;
end;
/

So, showing you my whole code (see, don’t believe everything you read – sometimes things are simply wrong and sometimes people deceive you) you can see the critical part at the start. My anonymous PL/SQL block is in fact a child block to another. And in that parent block, right at the start, I over-ride the definition of SYSDATE. in the declaration section

sysdate varchar2(20) := ’31-DEC-4713BC’;

I not only set it to a specific value, I set it to be a Varchar2 datatype. The TO_CHAR of it later on in the logic, which I included in the code I originally showed you, was just more subterfuge on my part. The PL/SQL engine does not care if you TO_CHAR an already CHAR-type field, but it hid the fact that I’d played this trick.

You could define a local SYSDATE variable, as a date, set to a specific date and time if you wish. Even one in the future. And anyone seeing odd behaviour and checking to see if the initialization paramater FIXED_DATE had been set would find that it had not and might soon be questioning their own sanity.

How many of you knew you could do that? You can over-ride what most of us would consider a Reserved Word in PL/SQL. I suspect it is something that people new to PL/SQL might find out by accident (because no one told them you could not use reserved words for variable names) but experienced people would not as it is simply a daft thing to do. I’d go further, it is a stupid thing to do. Think how much trouble it could cause in introducing bugs and making the code hard to understand. {And thinking further about this, I might see if I can get re-employed at a couple of places and starting doing things like this, just for the evil pleasure, as it could be an utter nightmare to spot}.

The reason this trick works is that SYSDATE, along with many interesting things, are not “Built In” to the PL/SQL language but are defined in two key packages – SYS.STANDARD and SYS.DBMS_STANDARD. These are always there and define many core things. You can DESC both of them in SQL*Plus or pull the package specification out of DBA_SOURCE and, unlike many of the other Built In packages, the code is not wrapped for STANDARD, so you can look at it. You can do this with a “lowly” DBA-type user, you do not need to be on as SYS or SYSTEM.

I am not sure of the exact rules but I think that when you use a locally qualified variable (ie you do not state the code block, package or stored function/procedure it comes from) it looks at the current variables as defined in the current and parent PL/SQL blocks first and then looks at STANDARD and then DBMS_STANDARD. I am not going to mess with STANDARD or DBMS_STANDARD, even on my play box, to find out the exact order of the two. If I spent 10 minutes looking at the specifications I might be able to see that one references the others I suppose…

This is part of the specification from DBMS_STANDARD:

package dbms_standard is
  -- types
   type ora_name_list_t is table of varchar2(64);

  -- DBMS_ID and DBMS_QUOTED_ID define the length of identifiers
  -- in objects for SQL, PL/SQL and users.
   subtype dbms_id is varchar2(30);
   subtype dbms_quoted_id is varchar2(32);

   subtype dbms_id_30 is varchar2(30);
   subtype dbms_quoted_id_30 is varchar2(32);
   subtype dbms_id_128 is varchar2(128);
   subtype dbms_quoted_id_128 is varchar2(130);

  -- Trigger Operations
  procedure raise_application_error(num binary_integer, msg varchar2,
      keeperrorstack boolean default FALSE);
    pragma interface (C, raise_application_error);         -- 1 (see psdicd.c)
    pragma restrict_references (raise_application_error, WNPS, RNPS, WNDS, RNDS);
  function inserting return boolean;
    pragma interface (C, inserting);                       -- 2
    pragma restrict_references (inserting, WNPS, RNPS, WNDS);
  function deleting  return boolean;
    pragma interface (C, deleting);                        -- 3
    pragma restrict_references (deleting, WNPS, RNPS, WNDS);
  function updating  return boolean;
    pragma interface (C, updating);                        -- 4
    pragma restrict_references (updating, WNPS, RNPS, WNDS);

You won’t find a package body of DBMS_STANDARD – that is because, I believe, all entries in the package specification are types or functions/procedures that lead to C functions, via the ADA-like {If you did not know, PL/SQL is based on the ADA language} pragma directives of “pragma interface (C, {something}), which says this function/procedure is coded in another language (C in this case) and is called {something}. Don’t ask me more, I don’t know.

eg:
procedure commit;
pragma interface (C, commit);

Even the base data types are defined in STANDARD:

package STANDARD AUTHID CURRENT_USER is              -- careful on this line; SED edit occurs!

  /********** Types and subtypes, do not reorder **********/
  type BOOLEAN is (FALSE, TRUE);

  type DATE is DATE_BASE;

  type NUMBER is NUMBER_BASE;
  subtype FLOAT is NUMBER; -- NUMBER(126)
  subtype REAL is FLOAT; -- FLOAT(63)
  subtype "DOUBLE PRECISION" is FLOAT;
  subtype INTEGER is NUMBER(38,0);
  subtype INT is INTEGER;
  subtype SMALLINT is NUMBER(38,0);
  subtype DECIMAL is NUMBER(38,0);
  subtype NUMERIC is DECIMAL;
  subtype DEC is DECIMAL;


  subtype BINARY_INTEGER is INTEGER range '-2147483647'..2147483647;
  subtype NATURAL is BINARY_INTEGER range 0..2147483647;
  subtype NATURALN is NATURAL not null;
  subtype POSITIVE is BINARY_INTEGER range 1..2147483647;
  subtype POSITIVEN is POSITIVE not null;
  subtype SIGNTYPE is BINARY_INTEGER range '-1'..1;  -- for SIGN functions

  type VARCHAR2 is NEW CHAR_BASE;

  subtype VARCHAR is VARCHAR2;
  subtype STRING is VARCHAR2;

  subtype LONG is VARCHAR2(32760);
...

Anyway, I leave the reader to go and look at the package specifications and the STANDARD package body {some of which I show at the end} but I leave you with a repeat of the above warnings: Don’t go replacing the core variables and functions in your PL/SQL code just because you can and do not, repeat, do NOT mess with those two packages. I am sure Oracle Corp will throw your support contract out the window if you do.

As promised above, here is the code for SYSDATE, in SYS.STANDARD, and it is very interesting – in the manner as mentioned above it calls a function (pessdt) that only calls a C program (presumably to get the datetime from the server clock) and failing that, reverts to the SQL method of selecting the pseudocolumn from dual. SYSTIMESTAMP below it is the same:

  function pessdt return DATE;
    pragma interface (c,pessdt);

  -- Bug 1287775: back to calling ICD.
  -- Special: if the ICD raises ICD_UNABLE_TO_COMPUTE, that means we should do
  -- the old 'SELECT SYSDATE FROM DUAL;' thing.  This allows us to do the
  -- SELECT from PL/SQL rather than having to do it from C (within the ICD.)
  function sysdate return date is
    d date;
  begin
    d := pessdt;
    return d;
  exception
    when ICD_UNABLE_TO_COMPUTE then
      select sysdate into d from sys.dual;
      return d;
  end;
--
--
-- 
  function pessts return timestamp_tz_unconstrained;
    pragma interface (c,pessts);

  -- Special: if the ICD raises ICD_UNABLE_TO_COMPUTE, that means we should do
  -- the old 'SELECT systimestamp FROM dual;' thing.  This allows us to do the
  -- SELECT from PL/SQL rather than having to do it from C (within the ICD.)
  FUNCTION systimestamp RETURN timestamp_tz_unconstrained
  IS  t timestamp_tz_unconstrained;
  BEGIN
    t := pessts;
    RETURN t;
  EXCEPTION
    WHEN ICD_UNABLE_TO_COMPUTE THEN
      SELECT systimestamp INTO t FROM sys.dual;
      RETURN t;
  END;

Friday Philosophy – Do good DBAs need PL/SQL Skills? March 1, 2013

Posted by mwidlake in Friday Philosophy, PL/SQL.
Tags: ,
16 comments

This Friday Philosophy was prompted by a discusion between some OakTable people about did we think “good” DBAs should know PL/SQL? Not all the tricks, bulk processing, using all the built-ins, but able to write PL/SQL with cursor loops and some exception handling that could eg cycle thorough tables and archive off data or implement some logon trigger functionality.

My response was “that depends on the age of the DBA”.

If you had asked me that question 15 years ago I would have said Yes, a good DBA would and should know PL/SQL – and most good DBAs did.
If you had asked me 10 years ago I would have said I’d hope they would and most DBAs I respected has some PL/SQL skills.
If you had asked me 5 years ago I would have sighed and had a little rant about how they should but the younger ones don’t and that is wrong.

But now, I would say that no, a good DBA does not need PL/SQL skills as so often they have so many other things they have to do and the tools to manage the database are somewhat better than they were. But inside I would still be thinking any DBA beyond their first year or two in the job would benefit from knowing the basics of PL/SQL.

It seems to me that a DBA now is generally expected to look after a very large number of instances, application servers, agents etc and all their time is taken doing the bread-and-butter tasks of backups and recoveries, patching, duplicating data, raising SRs (and that seems to take more and more time each SR every year), unlocking accounts, sorting out permissions…

Not only that but there is more and more to Oracle that a good DBA needs to understand as the technolgy gets more complex. Oracle has tried to make Oracle look after itself more but the result seems to be that for larger systems there are more moving parts to go wrong – and when they do it is often the DBA who has to sort it out. As an example, you no longer need to set several instance parameters to allocate memory to the components of the PGA and SGA. Just set the Memory Target. But if the system starts to throw odd errors about components being out of memory, the DBA needs to sort that out. They need to know about the dynamic memory adjust ments and check them out. They need to understand that certain components are now calculated in a different way, like the log buffer size. And probably they will have to revert back to the old parameters so they still need to know all about that!

So unless a DBA is an old hand and “grew up” with all this, they have no time to develop PL/SQL skills. Thus my response was “that depends on the age of the DBA”.

Should a good DBA know SQL?

Yes. I still see that as a given. Buttons, widgets and assistants will only get you so far.

DBMS_APPLICATION_INFO for Instrumentation September 3, 2012

Posted by mwidlake in development, Instrumentation, performance.
Tags: , , , ,
24 comments

I just wanted to put up a post about DBMS_APPLICATION_INFO. This is a fantastic little built-in PL/SQL package that Oracle has provided since Oracle 8 to allow you to instrument your code. i.e record what it is doing. I’m a big fan of DBMS_APPLICATION_INFO and have used it several times to help identify where in an application time is being spent and how that pattern of time has altered.

Some PL/SQL developers use it and some don’t. It seems to me that it’s use comes down to where you work, as most PL/SQL developers are aware of it – but not everyone uses it (a friend of mine made the comment recently that “all good PL/SQL developers use it“. I can understand his point but don’t 100% agree).

It is incredibly easy to use. You use the procedures SET_MODULE(module_name,action_name), SET_ACTION(action_name) and SET_CLIENT_INFO(client_info) to set the values of the corresponding columns in V$SESSION for your current session. This is done in memory, there is no inserting or updating of any table rows, so it is incredibly quick and light-weight. The below shows a subset of V$SESSION including the columns that get set:

desc v$session
Name Null? Type
----------------------------------------- -------- ----------------
SADDR RAW(8)
SID NUMBER
SERIAL# NUMBER
AUDSID NUMBER
PADDR RAW(8)
USER# NUMBER
USERNAME VARCHAR2(30)
COMMAND NUMBER
...
MODULE VARCHAR2(64)
MODULE_HASH NUMBER
ACTION VARCHAR2(64)
ACTION_HASH NUMBER
CLIENT_INFO VARCHAR2(64)
...

Just something odd to note. MODULE, ACTION and CLIENT_INFO are shown at VC(64) but if you check the documentation you will see that:

MODULE is limited to 48 bytes
ACTION is limited to 32 bytes
CLIENT_INFO can be set to the whole 64 bytes

I have no idea why MODULE and ACTION are limited in this way but it might be historic, backward compatibility with prior versions.

As a quick example of their use:

test_11_2> --demo_dai.sql
test_11_2> -- quick demo of dbms_application_info
test_11_2> COL sid form 9999
test_11_2> col username form a12
test_11_2> col module form a18
test_11_2> col action form a22
test_11_2> col client_info form a15
test_11_2> -- set module and action
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_MODULE(MODULE_NAME=>'SALES_LOAD',ACTION_NAME=>'VALIDATE_ALL_PRICES')
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC SALES_LOAD VALIDATE_ALL_PRICES
test_11_2> --
test_11_2> -- Update the action within a module
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_ACTION(ACTION_NAME=>'Update all Prices')
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC SALES_LOAD Update all Prices

test_11_2> -- clear them
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_MODULE(null,null)
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC

test_11_2> -- you can set just an action
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_action(ACTION_NAME=>'orphan process')
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC orphan process

test_11_2> -- set them to something sensible
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_MODULE(MODULE_NAME=>'RETURNS_LOAD',ACTION_NAME=>'MATCH_TO_SALE')
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC RETURNS_LOAD MATCH_TO_SALE

test_11_2> -- client_info can be longer
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_client_info(client_info=>'entered this module at '||sysdate)
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ --------------- --------------- ----------------------------------------
650 ERIC RETURNS_LOAD MATCH_TO_SALE entered this module at 03-SEP-2012 13:07

The intention is that you set the MODULE as you enter a functional chunk of your application (be it a PL/SQL package or a chunk of JAVA, whatever {but be aware of the stateless nature of systems using middle tiers and pooled connections}) and then update the ACTION as you progress. You can set an action with no module, as demonstrated above, but I would recommend against it. On that topic I would make the following recommendations:

  • Set the MODULE at the start, eg as you enter a PL/SQL package’s main module
  • Set the MODULE before you do all the validation – you may wish to see if that validation is as quick as you think.
  • Use SET_ACTION to update the action prudently. ie when the activity of the code alters but not all the time. It will come clearer with use, but what you are after is to be able to judge the relative run durations of parts of the application. Too much detail can mask the overall picture. If you can alter code you can always add more instrumentation.
  • Clear the MODULE and ACTION explicitly as you exit a package. The activity is lightweight and you do not want any code that lacks this instrumentation to inherit values from calling code.
  • Do not set an ACTION without having set the MODULE earlier in the code.
  • Keep the text of these values short and specific.

You do not have to look at V$SESSION to see these values, which is good as you may lack the privs needed to do so. Since oracle 9 there have also been procedures to check what the values are set to. e.g:

test_11_2> BEGIN
2 DECLARE
3 V_MODULE VARCHAR2 (64);
4 V_ACTION VARCHAR2 (64);
5 begin
6 dbms_application_info.READ_MODULE(V_MODULE,V_ACTION);
7 DBMS_OUTPUT.PUT_LINE('Module IS '||V_MODULE||' Action IS '||V_ACTION);
8 END;
9 END;
10 /

Module IS RETURNS_LOAD Action IS MATCH_TO_SALE
test_11_2>

So as you can see, these values can be set, they can be accessed via the data dictionary or the package itself and, if you believe my wild claims, they are very lightweight to use. However, these values really come into themselves with eg OEM’s performance screens. I’ll leave examples of that to a second post. Here I just wanted to cover how easy it is to set and get the information.

That does lead me onto what I think is the main reason that DBMS_APPLICATION_INFO is not as widely used as it could {and should} be. Developers often lack access to eg OEM and so do not see the full benefit of using it. Also, developers tend to be {note the get-out-clause) more concerned with making the code work and less with monitoring it after it is released. DBA-types tend to be more concerned with monitoring it. Thus the DBA types would want it in the code and developers would not want to spend the time coding it in. We are all one big team though, so some negotiation and beer-buying (or management enforcement) might well resolve that one.

Pickler Fetch – What is it? August 11, 2011

Posted by mwidlake in performance, PL/SQL.
Tags: , ,
12 comments

How many of you have seen the step COLLECTION ITERATOR PICKLER FETCH appearing in Explain Plans and wondered “what is a Pickler Fetch”?

{Note – make sure you check out the comments on this post if you are interested in Pickler Fetches. Timur Akhmadeev and Dom Brooks correct and extend my original efforts}.

----------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |            |       |       |    31 (100)|          |
|*  1 |  FILTER                               |            |       |       |            |          |
|   2 |   NESTED LOOPS                        |            |       |       |            |          |
|   3 |    NESTED LOOPS                       |            |     1 |    71 |    31   (4)| 00:00:01 |
|   4 |     SORT UNIQUE                       |            |     4 |     8 |    26   (0)| 00:00:01 |
|   5 |      COLLECTION ITERATOR PICKLER FETCH|            |     4 |     8 |    26   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN                  | CHHE_PK    |     1 |       |     2   (0)| 00:00:01 |
|   7 |    TABLE ACCESS BY INDEX ROWID        | CHILD_HEAP |     1 |    69 |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

It is where Oracle works through a SQL array. I’ve usually seen this within PL/SQL where an array variable is used within a procedure or package, being passed in containing a set of accounts or customers that are of interest and is used with a cursor. But you might also see it as the step processing the output of a pipelined function (one that returns a set of “rows”) being cast into a table. See this example by Tom Kyte. Tom also suggests that it is called a “Pickler” fetch as the data is pickled – packed and formatted. I never knew that, I just thought someone in Oracle development was having a giggle and it was “Pickled” as it was preserved from the PL/SQL side of the SQL engine. It seems that I was a little off-target with that.

{My thanks to Timur (see comments) who corrected me when I said it was a PL/SQL array. It is not, it is a SQL object}.

The above plan was generated for the below SQL statement:

cursor c_collect is 
select  PARE_ID
       ,CRE_DATE
       ,VC_1
from child_heap
where pare_id in (select  id  from table(p_parents) p)
and   cre_date >=trunc(p_date) and cre_date < trunc(p_date+1);

Anyway, I just wanted to make a few comments about Pickler Fetches:

  • The CBO is not very good at looking “inside” such arrays to determine the cardinality of that step. This can cause performance issues.
  • Because using them involves a few steps and potentially involves grants, it is possible for them to be “temporarily removed” during testing and so their impact is not seen.
  • They can leak memory. I think.

I’m now going to cover each of those points in turn. If you just wanted to know what a pickler fetch is and I’ve answered that for you, I suggest you go back to whatever you were doing before you started reading this:-)

By “not very good at looking inside pickler fetches” I mean that the CBO seems to make a poor “base it on the first seen array” (V11 up) or utterly fails to identify how many records are inside the SQL array (V10 down), depending on the Oracle version. From an Oracle 10.2.0.4 system I’ve got the two following examples:

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name              | Rows| Bytes | Cost (%CPU)| Time     | Pstart
--------------------------------------------------------------------------------------------------------------------
|   9 |      NESTED LOOPS                       |                   |   1 |   139 |   114   (1)| 00:00:01 |      
|* 10 |       HASH JOIN SEMI                    |                   |   1 |    50 |    11  (10)| 00:00:01 |      
|* 11 |        TABLE ACCESS BY INDEX ROWID      | ABCDE_ABCDEFGHIJ  |   3 |   144 |     4   (0)| 00:00:01 |      
|* 12 |         INDEX RANGE SCAN                | XX_PK             |   3 |       |     3   (0)| 00:00:01 |      
|  13 |        COLLECTION ITERATOR PICKLER FETCH|                   |     |       |            |          |      
|  14 |       PARTITION RANGE ITERATOR          |                   |   6 |   534 |   103   (0)| 00:00:01 |   KEY
|* 15 |        INDEX RANGE SCAN                 | AAXX_PK           |   6 |   534 |   103   (0)| 00:00:01 |   KEY

Note the cost and expected rows for the Pickler Fetch step. Or rather, the lack of them.

This would manifest itself in the following way in OEM screens:

Note the cardinality of the Pickler Fetch step is 0 {sorry, I cut off the column headings}. This resulted in this step having a cost of 0 and all the subsequent steps of having an expected cardinality of one and having very low costs – between 1 and 3 {Again, not shown, sorry}.

The end result of this was that the CBO struggled to accuratley cost any given plan and came up with several, usually quite terrible, plans that it swapped between as other table stats varied. The CBO was picking between very complex plans with total costs of 100 or 200 or so! Any difference was “significant”.

Please note, OPTIMIZER_DYNAMIC_SAMPLING was set to 4 on this system and I tried hints and session settings to higher levels and they did not prompt the CBO to look into the array, on 10.2.0.4 at least.

In 11.1 things seem to be better, as is shown in the explain plan at the top of this post. The step has a cost. I have to confess, I have not tested this very much on 11 {and if anyone has, feel free to correct me/add enlightenment via comments or pointers to other sources}, but it seems to set the cardinality to the number of elements the Pickler Fetch finds in the first itteration. Unless it uses the same sort of trick Oracle 11 now uses for bind variables (detecting when the value supplied is out of range and generates a new plan) this is going to lead to the old and much ‘loved’ issue of the plan being fixed by the first execution, irrespective of how suitable that plan is.

How do you fix this issue? Well, I resort to the cardinality hint. Usually the number of records being passed into the array is not too variable and any half-decent value is better than nothing in Oracle 10 and before. As for in 11, I like stating the value rather than risking a variable ‘first seen at parsing time’ setting. It is a judgement call. The below is from 11.1 but I’ve used it extensively in 10.2, where the impact is much more significant:

select  PARE_ID
       ,CRE_DATE
       ,VC_1
from child_heap
where pare_id in (select  id  from table(p_parents) )
and   cre_date >=trunc(p_date) and cre_date < trunc(p_date+1)
--
----------------------------------------------------------------------------
| Id  | Operation                             | Name       | Rows  | Bytes |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |            |       |       |
|*  1 |  FILTER                               |            |       |       |
|   2 |   NESTED LOOPS                        |            |       |       |
|   3 |    NESTED LOOPS                       |            |     1 |    71 |
|   4 |     SORT UNIQUE                       |            |     4 |     8 |
|   5 |      COLLECTION ITERATOR PICKLER FETCH|            |     4 |     8 |
|*  6 |     INDEX RANGE SCAN                  | CHHE_PK    |     1 |       |
|   7 |    TABLE ACCESS BY INDEX ROWID        | CHILD_HEAP |     1 |    69 |
----------------------------------------------------------------------------
--
-- and now with hint
--
select  PARE_ID
       ,CRE_DATE
       ,VC_1
from child_heap
where pare_id in (select /*+ CARDINALITY(p,11) */ id  from table(p_parents) p)
and   cre_date >=trunc(p_date) and cre_date < trunc(p_date+1)
--
-----------------------------------------------------------------------------
| Id  | Operation                             | Name       | Rows  | Bytes |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |            |       |       |
|*  1 |  FILTER                               |            |       |       |
|   2 |   NESTED LOOPS                        |            |       |       |
|   3 |    NESTED LOOPS                       |            |     1 |    71 |
|   4 |     SORT UNIQUE                       |            |    11 |    22 |
|   5 |      COLLECTION ITERATOR PICKLER FETCH|            |    11 |    22 |
|*  6 |     INDEX RANGE SCAN                  | CHHE_PK    |     1 |       |
|   7 |    TABLE ACCESS BY INDEX ROWID        | CHILD_HEAP |     1 |    69 |
-----------------------------------------------------------------------------

Note the change of ROWS to 11 in step 5. In V10 this is a change from blank to 11 and in real situations, do not be at all suprised if the plan changes dramatically – away from nested loop access and more to hash joins. {I should note, the cardinality hint is not documented in Oracle 10 or 11 and any use you make of it in live code is your responsibility. Sorry about that}.

What about my second point, about testing them? Well, as an example of testing Pickler processing of SQL arrays, which are defined SQL types, this is what I had to do to run my basic test:

-- pare_types_demo.sql
--
-- create and use some record and table types.
--
create or replace type pare_rec as object
(id number(10)
,name varchar2(100))
/
create or replace type pare_rec_t
as table of pare_rec
/
--
set serveroutput on size unli
--
create or replace procedure get_children
(p_parents in pare_rec_t
,p_date    in date)
is
cursor c_collect is 
select  PARE_ID
       ,CRE_DATE
       ,VC_1
from child_heap
where pare_id in (select id from table(p_parents))
and   cre_date >=trunc(p_date) and cre_date < trunc(p_date+1);
begin
  for v_rec in c_collect
  loop
    dbms_output.put_line('parent '||to_char(v_rec.pare_id)||' on '||to_char(v_rec.cre_date)
    ||' value '||v_rec.vc_1);
  end loop;
end;
/
--
declare
l_date        date           :=sysdate-100;
l_parents     pare_rec_t;
begin
  l_parents :=pare_rec_T(PARE_REC(1,'ERIC')
                        ,PARE_REC(2,'BARRY')
                        ,PARE_REC(3,'KERMIT')
                        ,PARE_REC(4,'GONZO')
                        );
  GET_CHILDREN (L_PARENTS,L_DATE);
END;
/

I had to create a table type, which is the SQL array, and this was based on an object type which I had to create first {you can have table types based on standard SQL types but very often they are based on a “row” object}. After creating the stored procedure, I had to define and populate the array with a set of records which I then passed in to my procedure call . {If you want to repeat this yourself, check out my postings on IOTs to get the table creation statement for table CHILD_HEAP}.
Now, I created those types so I had access to them. If those types do not belong to you you have to be granted execute on the types to reference them. Not select, execute. Some sites have a pretty strict attuitude to granting execute on anything and types seem to get forgotten when the execute priviledges against packages and procedures are set up. In a recent situation I had, I was forced to do some testing work on Live and it had taken people with big sticks to get me select access on data. Execute privileges were totally refused. Calmly explaining why it was needed and how it was acceptable fell on not so much deaf as bricked-up ears.

So, for testing, the reference to an array passed in is often replaced by a little sub-select. After all, quite often what is being passed in for a pickler fetch is actually a set of records {but a subset of the rows} from a table that has been collected by a previous processing step.
As an example of such a change:

where  ev.SYS_ID                       = pag.SYS_ID
and    ev.EXT_ID                       = pag.EXT_ID
and    P_EXT_ID                        = pag.EXT_ID
and    P_SYS_ID                        = pag.SYS_ID
and    ev.CREATED_DT                  >= pag.START_DT
and (ev.ID, ev.TX) in (select ID,TX from table(P_ARRAY_IN))
--
--BECOMES
--
where  ev.SYS_ID                       = pag.SYS_ID
and    ev.EXT_ID                       = pag.EXT_ID
and    P_EXT_ID                        = pag.EXT_ID
and    P_SYS_ID                        = pag.SYS_ID
and    ev.CREATED_DT                  >= pag.START_DT
--and (ev.ID, ev.TX) in (select ID,TX from table(P_ARRAY_IN)) --NB swap back in before release!
and (ev.ID, ev.TX) in (select ID,TX from CUSTOMERS WHERE SURNAME='WIDLAKE' and FFNAME='MARTIN')

All that has changed is that we now have a little sub-select rather than the casting of the SQL array into a table and, heck, as the developer might say, those were the records that would have been passed in, the code still works as expected and the same data comes back. No identifying which object types you need, no getting the execute permissions, no populating it yourself in the test harness, we can just swap back in the array later.

{I apologise to good developers, who even now are throwing imaginary darts at me. I know Dawn, you would certainly not do this. But I’ve seen it a couple of times. Developers have enough on their plate to go worrying about esoteric aspects of the CBO}

But the thing is, Oracle can look at that select and evaluate it’s cost and get an expected cardinality. The pickler fetch version has the issues I’ve just covered. I had to deal with a situation just like this last year, it does happen. In dev it was fine, in pre-live testing it was not.

What about memory leaks? Well, I had lots of issues with SQL arrays and memory leaks with oracle 10.1. and 10.2 at one client site and there is a documented bug in oracle 8 with pickler fetch and memory leaks but I have to confess, a quick metalink search did not find any hits for Oracle 10 and 11. So maybe you should not trust me on that one. In the situation I saw the arrays were massive, several MB at times, and so if you are using SQL arrays to pass in a modest list of eg accounts or customers, it is not going to be an issue anyway.

You know, this was just going to be a quick post on something I’ve been meaning to mention for months, not a small essay:-).

Follow

Get every new post delivered to your Inbox.

Join 234 other followers