
Friday Philosophy – Solving Simple Problems By Asking The One Key Question. June 17, 2022

Posted by mwidlake in Friday Philosophy, performance.

For many of us solving problems is a large part of our working life in IT. This is especially true when it comes to getting features of a product to do what it is supposed to do or even making a bit of hardware work the way it should. And, if it is a printer, work in any way at all! Often the solution is simple but we ask ourselves the wrong question. I’m going to use this pair of garden secateurs as an example of this. Trust me, go with me on this.

My trusty secateurs as a metaphor for almost all IT problems

I’ve had these secateurs for many years. For the first few years they worked wonderfully, but in the last 2 years they have not been working so well. They just don’t cut all the way through stems & small branches; you almost have to tear off the piece you wanted to cut away cleanly. I was having to squeeze the handles harder and harder to complete the cut. And now it seemed to make no difference how hard I squeezed, they would not cut cleanly. This is like steadily increasing the size of the SGA to improve query performance: you get some initial gains and then nothing more.

The trick with any such cutting device is to keep the blade sharp, and I do. A few strokes with a needle file every few months keeps the blade sharp. The other issue with an anvil-type pair of secateurs like this is to keep the anvil, the flat area the blade comes down onto, clean and the little central gap clear. The blade closes down on the anvil and the thin gap allows the blade to *just* drop below the main face of the anvil and cleanly cut the stem. This is like the regular maintenance you do on a computer system, like cycling out redundant data or ensuring there are no accounts for staff who have left or rebuilding all your indexes ( 😀 ).

Only, no matter how much I sharpened the blade or cleaned out the anvil the secateurs would not make a clean cut. Ahh, I know. If you look at the above pictures you will see on the blade that the area closest in to the pivot is silver, it is where most of the wear is. Had the blade worn back into a curve so that the key part of the blade was not cutting all the way through? I used a flat piece of metal and checked the blade. Well, the far end was a little proud so I worked on it with the file and got the whole blade totally flat and sharp along the whole length. It was sharp enough so that when I pulled garden twine over the blade it cut straight through with ease.

But they still would not cleanly cut through a stem. I oiled the pivot, for no reason other than blind hope (just like changing configuration parameters for something when you really know those parameters have nothing to do with the problem). I sharpened the blade again (I rebuilt my indexes again). The blade was wickedly sharp, it was flat, the anvil was clean, these secateurs had always worked so well. So I asked the question. Or, more accurately, shouted it at the offending thing:

Why wasn’t *IT* working?!?!!

And that is the wrong question. But it’s usually our attitude to any piece of software or hardware that is not doing what it should. We beat the desk and swear and shout “why won’t you work you piece of crap!!!”

At this point you need to stop and ask a subtly different question:

What is it I don’t understand?!???

The two questions are similar but the key thing is they come from different directions. When you are screaming at the unresponsive printer or swearing about the SQL hint having no effect, your mind-set is that the thing is broken or has a bug or is in some other way not doing what it should. It’s the printer’s fault. The manual is wrong on setting up the feature. This thing is not doing what it should.

Well, yes, it is not doing the required thing – but given other people can get the feature to work, or the printer was working last week, the problem is unlikely to be there. The problem is actually that I am not understanding something. And it could be very fundamental, so ask the most basic things first. For the printer to print something it needs to be on and I need to send the file to it. Is it on? OK, maybe not that basic, but are you sending the file to the right printer?

Getting back to my example hardware, the blade is sharp enough to cut, the face it is cutting onto is clean, there is no issue with the pivot. What does it need to *do* to cut? The blade needs to come down on the anvil. I closed the secateurs and held them up to the light.

Note, I slightly emphasised the gap for the sake of the photograph

Ahhh, there is a gap! If I squeezed really hard the gap would shrink, but not completely close. The problem is not the blade, it is not the anvil, it is something else that is stopping the two coming together. The problem is elsewhere. And it is simple. And this is often how it is with software & computers. Yes, sometimes the problems are very complex or have several confounding factors, but when you are trying to get something to just work, the problem is usually one of missing something obvious, which is usually being caused by a component you are not currently swearing at.

This is just like last time my printer would not respond to requests to print from my laptop. I’d checked “everything”, swore at the printer, turned it off and on a few times, kicked the cat, sacrificed a small child to the printer Gods. I stopped and I did the “what do I not understand” thing. It turned out that due to a brief power outage a few days earlier my laptop was connecting to the internet via my neighbour’s router not mine (we have a mutual agreement to share due to the connection issues of living in a field and that we use different service providers). The printer and the laptop were not on the same network.

If you look at the secateurs in full you will notice that between the handles there are two plastic lugs that come together. I’m guessing they stop you squeezing the blade deep into the anvil, potentially damaging both blade and anvil. Over the years of sharpening I had worn away the blade enough that these plastic lugs were now preventing the blade and anvil coming together. Especially as, due to my frustration, I had given the blade a really thorough sharpening. That’s why I had been having to squeeze the handles harder & harder until eventually no effort would get a clean cut. I got that needle file out and reduced those lugs by a millimetre or so. The blade now closes on the anvil and, being about as sharp as it has ever been, they cut wonderfully. The solution was wonderfully simple and it took me all of 2 minutes to implement once I had worked it out.

With my first “increasing SGA” example, I was squeezing hard/giving the database a larger buffer cache, and that helped, but the real issue was the sorts were spilling to disc. I needed to increase the PGA (or process less data).
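(For what it’s worth, a quick way to check for that sort of thing – a sketch only, the statistic and column names are those in V$SYSSTAT and V$SQL_WORKAREA on recent versions:)

-- Are sorts spilling to disc? That points at PGA, not SGA.
select name, value
from   v$sysstat
where  name in ('sorts (memory)', 'sorts (disk)'
               ,'workarea executions - onepass'
               ,'workarea executions - multipass');

-- Or per statement: anything with a TEMP segment recorded against its workarea
select sql_id, operation_type
      ,round(last_memory_used/1024/1024)  mem_mb
      ,round(last_tempseg_size/1024/1024) temp_mb
from   v$sql_workarea
where  last_tempseg_size is not null;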

beautifully closed secateurs

You see, it was not that the secateurs were maliciously working against me or that the components that usually need some maintenance needed it. It was that I was not understanding something – something very simple & fundamental it turns out. Rather than repeatedly doing the maintenance steps that had worked before but now were not, I needed to stop doing what was no longer working and ask what had changed.

When you are solving problems and it’s got to the swearing stage, it’s time to walk away for 5 minutes, have a cup of tea, and then come back and ask yourself “what do I not understand?”. Better still, do this right at the start. Though to be fair I’m not very good at that; I always get to the swearing stage before I decide to question my understanding.

Especially with printers. I bloody hate printers. Printers are sentient and out to get you.


Friday Philosophy – When Not To Tune! Or Maybe You Should? May 20, 2022

Posted by mwidlake in Friday Philosophy, performance.

This week there was an interesting discussion on Twitter between myself, Andy Sayer, Peter Nosko, and several others, on when you do performance tuning. It was started by Andy tweeting the below.

As a general rule, I am absolutely in agreement with this statement and it is something I have been saying in presentations and courses for years. You tune when there is a business need to do so. As Andy says, if a batch process is running to completion in the timeframe available to it, then spending your effort on tuning it up is of no real benefit to the company. The company pays your salary and you have other things to do, so you are in fact wasting their money doing so.

Please note – I said “as a general rule”. There are caveats, exceptions to this rule as I alluded to in my reply-in-agreement tweet:

I cite “Compulsive Tuning Disorder” in my tweet. This is a phrase that has been kicking about for many years, I think it was coined by Gaja Krishna Vaidyanatha. It means someone who, once they have sped up some SQL or code, cannot let it lie and has to keep trying and trying and trying to make it faster. I think anyone who has done a fair bit of performance tuning has suffered from CTD at some point, I know I have. I remember being asked to get a management report that took over an hour to run down to a few minutes, so they could get the output when asked. I got it down to a couple of minutes. But I *knew* I could make it faster, so I worked on it and worked on it and I got it down to under 5 seconds.

The business did not need it to run in 5 seconds, they just needed to be able to send the report to the requester within a few minutes of the request (they were not hanging on the phone desperate for this, the manager just wanted it before they forgot they had asked for it). All that the extra effort resulted in was a personal sense of achievement and my ability to humble-brag about how much I had sped it up. And my boss being really, very annoyed I had wasted time on it (which is why it sticks out in my mind).

However, there are some very valid reasons for tuning beyond what I’ll call the First Order requirement of satisfying an immediate business need of getting whatever needs to run to run in the time window required. And a couple of these I had not really thought about until the twitter discussion. So here are some of the Second Order reasons for tuning.

1 Freeing Up Resource

All the time that batch job is running for hours it is almost certainly using up resources. If parallel processing is being used (and on any Exadata or similar system it almost certainly is, unless you took steps to prevent it) that could be a lot of CPU, Memory, IO bandwidth, even network. The job might be holding locks or latches. The chances are very high that the 4 hour batch process you are looking at is not the only thing running on the system.

If there are other things running at the same time where there would be a business benefit to them running quicker, or they are being impacted by those locks, it could be that tuning those other things would take more effort (or even not be possible) compared to the 4 hour job. So tuning the four hour job releases resource or removes contention.

I had an extreme example of this when some id… IT manager decided to shift another database application on to our server, but the other application was so badly designed it took up all available resource. The quick win was to add some critical indexes to the other application whilst we “negotiated” about its removal.

2 Future Proofing

Sometimes something needs tuning to complete in line with a Service Level Agreement, say complete in 10 seconds. You get it done, only to have to repeat the exercise in a few months as the piece of code in question has slowed down again. And you realise that as (for example) data volumes increase this is likely to be a recurring task.

It may well be worth putting in the extra effort to not only get the code to sub-10-seconds but significantly less than that. Better still, spend extra effort to solve the issue causing the code to slow down as more data is present. After all, if you get the code down to 1 second and it goes up to 5, even though it is meeting the SLA the user may well still be unhappy – “Why does this screen get slow, then speeds up for a few months, only to slow down again! At the start of the year it was appearing immediately, now it takes forevvveeeeer!”. End users MUCH prefer stable, average performance to toggling between very fast and simply fast.

3 Impacting Other Time Zones

This is sort-of the first point about resource, but it can be missed as the people being impacted by your overnight batch are in a different part of the globe. When you and the rest of your office are asleep – in America let’s say – other users are in Asia, having a rubbish time with the system being dog slow. Depending on the culture or structure of your organisation, you may not hear the complaints from people in other countries or parts of the wider company.

4 Pre-emptive Tuning

I’m not so convinced by this one. The idea is that you look for the “most demanding” code (be it long-running, resource usage, or execution rate) and tune it up, so that you prevent any issues occurring in the future. This is not the earlier Freeing Up Resources argument, nothing is really slow at the time you do this. I think it is more pandering to someone’s CTD. I think any such effort would be far better spent trying to design in performance to a new part of the application.

5 Restart/Recovery

I’d not really thought clearly about this one until Peter Nosko highlighted it. If that four hour batch that was causing no issues fails and has to be re-run, then the extra four hours could have a serious business impact.

Is this a performance issue or is it a Disaster Recovery consideration? Actually it does not matter what you label it, it is a potential threat to the business. The solution to it is far more complex and depends on several things, one of which is whether the batch can be partially recovered or not.

When I design long running processes I always consider recovery/restart. If it is a one-off job such as initial data take-on I don’t put much effort into it. It works or it fails and your recovery is to go back to the original system or something. If it is a regular data load of hundreds of GB of critical data, that process has to be gold plated.

I design such systems so that they load data in batches in loops. The data is marked with the loop ID and ascending primary key values are logged, and the instrumentation records which stage it has got to. If someone kicks the plug out of the server or the batch hits a critical error, you can restart it, it will clean up any partially completed loops and then start from after the last completed batch. It takes a lot more effort to develop but the pay-back can be huge. I worked on one project that was so out of control that the structure of the data was not really known and would change over time, so the batch would fail all the bloody time, and I would have to re-run it once someone had fixed the latest issue. The ability to restart with little lost time was a god-send to me – and the business, not that they really noticed.
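(The pattern is roughly the below – a much simplified sketch, with hypothetical table names (BATCH_CTL, SRC_DATA, TGT_DATA) rather than anything from a real system:)

declare
  v_last_id   number;
  v_max_id    number;
  v_batch_sz  constant pls_integer := 10000;
begin
  -- where did the last successful loop get to?
  select nvl(max(last_id),0) into v_last_id
  from   batch_ctl
  where  status = 'COMPLETE';

  select max(id) into v_max_id from src_data;

  while v_last_id < v_max_id loop
    insert into tgt_data (id, payload)
    select id, payload
    from   src_data
    where  id >  v_last_id
    and    id <= v_last_id + v_batch_sz;

    v_last_id := v_last_id + v_batch_sz;

    -- record the completed loop so a re-run knows where to start from
    insert into batch_ctl (last_id, status, completed)
    values (v_last_id, 'COMPLETE', systimestamp);

    commit;  -- each loop is its own transaction; a mid-loop failure rolls
             -- back only that loop and a re-run picks up after the last commit
  end loop;
end;
/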

Simply tuning up the batch run to complete more quickly, as Peter suggests, so it can be blown away and repeated several times is a pragmatic alternative. NB Peter also said he does something similar to what I describe above.

6 Freeing Up Cloud Resource

Again, I had not thought of this before but Thomas Teske made the point that if your system is living in the cloud and you are charged by resource usage, you may save money by tuning up anything that uses a lot of resource. You are not satisfying a First Order business need of getting things done in time, but you are satisfying a First Order business need of reducing cost.

7 Learning How To Tune

I think this is greatly under-rated as a reason to tune something, and even indulge in a little CTD even when the First Order business need has been met. It does not matter how much you read stuff by tuning experts, you are only going to properly learn how to performance tune, and especially tune in your specific environment, by practice. By doing it. The philosophy of performance tuning is of course a massive topic, but by looking at why things take time to run, coming up with potential solutions, and testing them, you will become really pretty good at it. And thus better at your job and more useful to your employer.

Of course, this increase in your skills may not be a First, Second, or even Third Order business need of your employer, but it could well help you change employer 😀 .

8 Integration Testing

Again highlighted by Peter Nosko, we should all be doing integrated, continuous testing, or at least something as close to that as possible. Testing a four hour batch run takes, err, more time than a 15 minute batch run. For the sake of integration testing you want to get that time down.

But, as I point out..

{joke – don’t take this tweet seriously}

As I think you can see, there are many reasons why it might be beneficial to tune up something where the First Order business need is being met, but the Second Order benefits are significant. I am sure there are some other benefits to speeding up that batch job that finishes at 4am.

I’ll finish with one observation, made by a couple of other people on the Twitter discussion (such as Nenad Noveljic). If you fix the First Order issue, when management is screaming and crying and offering a bonus to whoever does it (like that ever happens), you get noticed. Fixing Second Order issues they did not see, especially as they now may not even occur, is going to get you no reward at all.

Friday Philosophy – Explaining How Performance Tuning Is Not Magic? March 9, 2018

Posted by mwidlake in Friday Philosophy, performance, SQL.

Solving performance issues is not magic. Oh, I’m not saying it is always easy and I am not saying that you do not need both a lot of knowledge and also creativity. But it is not a dark art, at least not on Oracle systems where we have a wealth of tools and instrumentation to help us. But it can feel like that, especially when you lack experience of systematically solving performance issues.

SQL statement be fast!

I recently spent a couple of days with a client discussing how you solve performance issues and I am preparing a talk on exactly that for the up-coming OUG Ireland conference on the 22nd-23rd March. The guys I was talking to at the client are very, very capable production DBAs, but traditionally the developers at their site initially look at any performance issues. If they think it is the database but not something they can fix, they throw it over the fence at the DBAs. It’s not always the database of course and, if it was a simple fix (missing index, obviously inefficient query), then the developers fixed it. So these two guys are expected to only solve the more difficult issues. That’s not really fair as, if you are not practising on the simple problems how are you supposed to gain the experience and confidence to solve the harder ones?

Anyway, a part of the discussion was about Explain Plans. What does the COST mean in the plan, they asked? They saw it as some number that in an undefined way gave an indication of how expensive the step in the plan was, but they could not link it back to anything solid that made sense to them. It looked like a magically produced number that was sometimes very, very wrong. Like most (good) technical people, they want to know the details of things and how they work, they don’t want to simply accept something as working.

So I took them through some simple examples of plans and of how the COST is just a result of simple maths estimating the number of IOs needed to satisfy the step.

I won’t go into the full details here but have a look at the below, this is how I started:

I explained how you read “down” the slope of the plan to the end (so step 3) and then worked back up the slope. So the first thing Oracle does is the index range scan. I showed them the BLEVEL of the index, the number of blocks per indexed value and why Oracle knew it would, on average, need 3 IOs to get the leaf block entries for the provided “DOB=to_date(’08-OCT-1934′,’DD-MON-YYYY’)”. Each DOB matched, on average, 20 rows. So the cost of step 3 was passed up to the step 2 of accessing the table rows. This would be done 20 times, so the cost was 20+3, i.e. 23.
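(If you want to follow the same walk-through on your own system, something like the below will show you the COST and ROWS figures and the index statistics they are derived from. The PERSON table and PERS_DOB index names here are illustrative, and of course your numbers will vary with your stats and version:)

explain plan for
select * from person
where  dob = to_date('08-OCT-1934','DD-MON-YYYY');

select * from table(dbms_xplan.display(format=>'BASIC +COST +ROWS'));

-- the raw material the COST is calculated from
select blevel, leaf_blocks, distinct_keys, clustering_factor
from   user_indexes
where  index_name = 'PERS_DOB';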

OK, they could accept that, it made sense. So let’s extend it…

I took the original query against PERSON for a given DOB and now joined it to a second table PERSON_NAME. Why is not important, it’s just a demonstration of a simple table join:

Now I explained that as you work “back up the slope of the plan” from the first, most indented step (so from step 5 to 4 to 3) at 3 there is a nested loop joining the rows passed to step 4 to the step in line below it, i.e. step 6. They had already seen steps 5 and 4 in our first example, Oracle is expecting to get 20 rows for a cost of 23. Look at line 4. And for each of those 20 rows, it will do a range scan of the index in step 6 and for each row it finds in the index, collect rows from the joined table in step 7.

So for each of the 20 original rows it does a scan of an index on the joined table for a cost of 2 (I showed them the stats from which this cost is calculated) and expects on average to find 5 matching rows so it needs to do 5 IOs to the PERSON_NAME to get those rows. Add that together and that cost of 7 is done 20 times. 7*20 is 140, plus the 23 from the original scan of the PERSON table, the whole COST is… 163.

Light bulbs came on and they got it! My job here is done.

But it was not. I then went on to explain how it is now hard to get such a simple example. This one is, I think, from an early version of Oracle 11. I told them how histograms on a column will make the estimated cardinality (number of records per given value for a column) more accurate, but harder to work out. I showed them how the cost of a unique index scan is reduced by 1. I explained how Oracle was blind to the correlation of column values unless you took steps to tell the optimiser about it (you know, how for a given value of car manufacturer there will be only a few values of car model, the two values are related)…
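(As an aside, the usual way to tell the optimiser about such correlated columns is a column group created via extended statistics, available from 11g onwards. A sketch only, with illustrative CARS/MANUFACTURER/MODEL names:)

select dbms_stats.create_extended_stats(
         ownname   => user
        ,tabname   => 'CARS'
        ,extension => '(MANUFACTURER,MODEL)')
from   dual;

-- re-gather stats so the column group gets its own number of distinct values
exec dbms_stats.gather_table_stats(user,'CARS',method_opt=>'for all columns size auto')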

Worry was creeping back into their faces. “so it is simple mathematics – but the rules are complex? It’s complex simple mathematics?!?” Err, yes. And with 11 and 12 Oracle will use various methods to spot when the simple, complex mathematics does not match reality and will swap plans as a result…

I think I lost them at that point. Especially when they asked about the SQL Profiles and how they modified Costs… Baselines controlling how plans are used… Bind variables…

That is a real problem with Oracle Performance tuning now. Even something as simple as COST is based on a lot of rules, factors and clever tricks. And they are not the only things controlling which plan is executed anymore.

So I took a step back.

I told them to accept that the COST is really, honestly based on maths, and the expected number of ROWS is too. But the actual, specific values could be hard to totally justify. And it is when the estimated COST and (possibly more importantly) the estimated ROWS goes wrong you have problems. So look out for ROWS of 1 (or at least very low) in a plan for a statement that takes more than a few milliseconds. And for very, very large COSTS/ROWS in the millions or more. And what really helps is if you get the ACTUAL ROWS as opposed to the ESTIMATED ROWS. Where there is a significant difference, concentrate your focus there. Of course, getting the ACTUAL ROWS is not always easy and is for a later (and properly technical) post.
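(For reference, one common way to get the ACTUAL ROWS – a sketch assuming a reasonably recent version, the query names are illustrative:)

-- run with serveroutput off so display_cursor picks up the right statement
select /*+ GATHER_PLAN_STATISTICS */ count(*)
from   person
where  dob = to_date('08-OCT-1934','DD-MON-YYYY');

-- plan of the last statement run in this session, with estimated (E-Rows)
-- and actual (A-Rows) row counts side by side
select * from table(dbms_xplan.display_cursor(format => 'ALLSTATS LAST'));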

So, they asked, if they could not get the actual ROWS and there were no 1’s or millions’s in the plan ROWS/COSTS? How did they know where to concentrate? “Well, you just get a feel for it… do the costs feel reasonable?…”

Damn – I’m not sure I really delivered on my aim of proving Performance Tuning is science and not magic.

Any pointers anyone?

(Update – Vbarun made a comment that made me realise I had messed up the end of this post, I was talking about estimated ROWS and still had the words COST there. Now fixed. And the other thing I was hoping someone was going to suggest as a pointer was – to split the query down to individual tables & two-table joins and *check* how many rows you get back with the where predicates being used. It takes some time but it shows you where the estimates are going wrong.)

Overloaded Indexes (for ODC Appreciation Day) October 10, 2017

Posted by mwidlake in database design, development, performance, SQL.

ODC Appreciation Day is an idea that Tim Hall (aka Oracle-Base) came up with, to show our appreciation for the Oracle Technology Network (OTN)/Oracle Developer Community.

Fig 1 This is an efficient range scan

I want to show my support but rather than mention an Oracle “feature” I particularly like I’d like to mention a “trick” that I (and many others) often employ to help performance. That trick is Overloaded Indexes.

We can all appreciate that accessing data in a table via an index is a very good thing to do when we want just the few records for a given index key from the thousands or millions of rows in the table. As an example, we are looking up all the records for someone and we know their first name, last name and date of birth – a very common occurrence in things like hospital or billing systems. So our PERSON table has an index on those three columns. Oracle will read the root block of the index, read the one relevant block in each of the branch levels for the key, find all of the leaf block entries for that key – and then collect the relevant rows from the table. Fig 1 shows how we think of this working. i.e. most of the records we want to find will be close together in the table.

Actually, a lot of people who just “use” the Oracle database as a store of information sort-of think this is how an index always works. It efficiently identifies the rows that you want and that is the end of it. If the index is on the value(s) you are looking up rows for (say LAST_NAME, FIRST_NAME, DATE_OF_BIRTH) the index is ideal and that is as good as it gets.

Fig 2 – a more usual, less efficient index range scan

But in reality, the index lookup is often far less efficient than this and is more like fig 2. Working down the index structure to get all of the required index entries is exactly the same, but the rows you want are scattered all over the table. Oracle has to fetch many table blocks to get your data, maybe as many blocks as there are records to be found. This is far from efficient.

So what can you do about this? You already have the “perfect” index, on LAST_NAME, FIRST_NAME, DATE_OF_BIRTH, the values you are looking up. Maybe you could add another column to the index to avoid those situations where there are many people with the same name and date of birth. But you may not have that extra information or it is simply not possible to identify the values in the table any more accurately, you really do need all the rows scattered though that table for the given search key.

There are “architectural” things you can do such as create the table as an Index Organised Table (see my little set of blogs about them starting here). You can also use various methods to group the relevant rows together in the table. But all of those methods are Big Impact. You need to recreate the table or plan for this eventuality up-front when you design the system.

But there is a very specific, easy thing you can do to address this particular problem, for the SQL statement you need to speed up. You can add all the columns your query needs into the index. This is an Overloaded Index.

An Overloaded Index holds not only the table columns in the WHERE clause but all the columns needed from that table for the query.

Why does this work? Because when Oracle identifies the range of keys for the key (LAST_NAME, FIRST_NAME, DATE_OF_BIRTH) all the other columns it needs are also in those index leaf entries and there is no need to get the rows from the table. All those lookups to the table are avoided. Fig 3 at the end of this article demonstrates this.

However, I’ll give you a real world example I had recently. A client had a SELECT statement something like the below, with the execution plan shown, and it was running too slowly for the business requirements:

SELECT (SUM(NVL(T.TRAN_VALUE_CR,0))-SUM(NVL(T.TRAN_VALUE_DB,0))) ,
        COUNT(*)
FROM    ACCOUNTS     A ,
        TRANSACTIONS T
WHERE   A.ACC_XYZ_IND      =:3
AND     A.ACC_ACCOUNT_NO   =:1             
AND     A.ACC_SUBACC_NO    =:2
AND     T.TRAN_XYZ_IND     =A.ACC_XYZ_IND
AND     T.TRAN_ACCOUNT_NO  =A.ACC_ACCOUNT_NO
AND     T.TRAN_SUBACC_NO   =A.ACC_SUBACC_NO
AND     T.TRAN_P_IND       =:4 
AND     T.TRAN_DLM_DATE    >=TO_DATE(:5,'YYYYMMDD')

------------------------------------------------------------
| Id  | Operation                           | Name         |
------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              |
|   1 |  SORT AGGREGATE                     |              |
|   2 |   NESTED LOOPS                      |              |
|   3 |    NESTED LOOPS                     |              |
|*  4 |     INDEX RANGE SCAN                | ACC_PRIME    |
|*  5 |      INDEX RANGE SCAN               | TRAN2_3      |
|*  6 |    TABLE ACCESS BY LOCAL INDEX ROWID| TRANSACTIONS |
------------------------------------------------------------
Statistics
----------------------------------------------------------
       4740  consistent gets
       3317  physical reads

The index used on TRANSACTIONS is:

INDEX_NAME                   TABLE_NAME       PSN COL_NAME
---------------------------- ---------------- --- --------------
TRAN2_3                      TRANSACTIONS     1   TRAN_ACCOUNT_NO
TRAN2_3                      TRANSACTIONS     2   TRAN_SUBACC_NO
TRAN2_3                      TRANSACTIONS     3   TRAN_DLM_DATE

The index TRAN2_3 on the TRANSACTION table that you can see being used in the plan was for all the columns being used in the WHERE clause that actually helped identify the TRANSACTION records required – TRAN_ACCOUNT_NO, TRAN_SUBACC_NO and TRAN_DLM_DATE (the TRAN_XYZ_IND and TRAN_P_IND were always the same so “pointless” to index).

I added a new index to the TRANSACTION table. I added a new index rather than change the existing index as we did not want to impact other code and we wanted to be able to drop this new index if there were any unexpected problems. I added all the columns on the TRANSACTION table that were in the SELECT list or in the WHERE clause, even though they did not help better identify the rows needed. If there had been TRANSACTION columns in an ORDER BY or windowing clause, I would have added them too. So my index looked like this:

create index TRAN2_FQ on TRANSACTIONS
(TRAN_ACCOUNT_NO   ,TRAN_SUBACC_NO   ,TRAN_DLM_DATE
,TRAN_P_IND        ,TRAN_XYZ_IND
,TRAN_VALUE_CR     ,TRAN_VALUE_DB)

It is very, very important that the new index holds every column from the TRANSACTION table that the query needs. To prevent accessing the table, all the data the query needs for that table must be in the index.

Oracle could now satisfy the query by just using the new index, as the below explain plan shows.

----------------------------------------------------
| Id  | Operation                      | Name      |
----------------------------------------------------
|   0 | SELECT STATEMENT               |           |
|   1 |  SORT AGGREGATE                |           |
|   2 |   NESTED LOOPS                 |           |
|*  3 |    INDEX RANGE SCAN            | ACC_PRIME |
|*  4 |     INDEX RANGE SCAN           | TRAN2_FQ  |
----------------------------------------------------
Statistics
----------------------------------------------------
         56  consistent gets
         52  physical reads

There is now no line in the plan for visiting the table TRANSACTIONS and we are using the new TRAN2_FQ index. The consistent gets and physical reads to satisfy the query have gone down from 4740 and 3317 respectively to 56 and 52. I think that is good enough.

Fig 3 shows what is happening. The new index is effectively a “mini IOT” designed to support the given query.

Fig 3 – Tables? Who needs tables!

There are of course a few caveats. The new index needs to be maintained, which is an overhead on all INSERT/UPDATE/MERGE/DELETE activity on the table. The index will only remove the need to visit the table for queries that are very, very similar to the one it is designed for – ones that use the same rows from the TRANSACTIONS table or a subset of them. If you alter the query, e.g. select another column from the TRANSACTION table, you would need to revisit this overloaded index.

Finally, be careful of modifying existing indexes to overload them to support specific queries. If the index is there to support referential integrity you need to think carefully about this, and the modified index may be less efficient for other queries that used the original index (as adding columns to an index makes it “wider”).

BTW if you think you recognise this from a recent Oracle Scene article then you would be right. I was pushed for time, I used something I had written before 🙂

Why oh Why Do We Still Not Have a Fast Bulk “SQL*Unloader” Facility? December 1, 2016

Posted by mwidlake in Architecture, database design, performance.

Way back in 2004 I was working at the UK side of the Human Genome project. We were creating a massive store of DNA sequences in an Oracle database (this was one of two world-wide available stores of this information, for free & open use by anyone {* see note!}). The database was, for back then, enormous at 5-6TB. And we knew it would approximately double every 12 months (and it did, it was 28TB when I had to migrate it to Oracle 10 in 2006, over 40TB 6 months later and grew to half a petabyte before it was moved to another organisation). And we were contemplating storing similar massive volumes in Oracle – Protein, RNA and other sequence stores, huge numbers of cytological images (sorry, microscope slides).

I did my little bit to work out how we all tick


Significant chunks of this data were sometimes processed in specialist ways using Massively Parallel Processing farms (hundreds or thousands of compute nodes so we could do “2 years of compute” in a day or even a few hours). Oracle was never going to be able to support this within the database.

But Oracle had no fast, bulk data unloading offering, a “SQL*Unloader”. Other RDBMSs did. And that gave me an issue: I had to argue why we should hold this data in Oracle, where it was hard to yank out chunks of it, rather than keep it in, say, a simple file store or MySQL.

That year I was invited to Oracle Open World and I was given a slot with a senior VP in the RDBMS/OCFS area (as that is what we were using – I know! But we were). I had been warned by someone that he could be a little “robust”. In fact my friend called him “Larry’s Rottweiler”. However my chat with the SVP went fine. Until he asked me for the one thing I would really want and I said “well, it’s been promised for the last 2 or 3 versions of Oracle, but it still does not exist in V10 – I’d like a SQL*Unloader so we can rapidly extract bulk data from Oracle into a structured file format”. He went very cold. I continued: “We need to do really intensive processing of the data with ‘C’ and if it is locked into the database we can’t do that. It’s hard for me to justify using Oracle as our primary data store if the data is now stuck there…”

I honestly thought he would bite!


He went ballistic. He was furious, he was raising his voice, leaning forward, I honestly thought he might hit me. I can’t remember the whole rant but one line stuck: “We will NEVER help you get your data out of Oracle!”. I ended the discussion as soon as I could – which obviously suited him fine also.

And here we are over 10 years later and no SQL*Unloader has appeared. And, as far as I can see, there is no longer even a suggestion that “one day” one will exist. You can obviously get data out of Oracle using SQL*Plus, SQLcl or PL/SQL’s UTL_FILE. But you have to write your own code and to make it parallel or fast is not that simple. And there are some commercial tools out there to do it. But fundamentally we do not have a simple, robust & fast data unloader tool within the Oracle toolset.
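(To be clear what “write your own code” means – the serial version is trivial, something like the below in SQLcl, or SET MARKUP CSV ON in a recent SQL*Plus. It is the parallel, restartable, fast bit you are left to build yourself. The table name is illustrative:)

set sqlformat csv        -- SQLcl formatting of query output as CSV
set feedback off
spool dna_sequences.csv
select * from dna_sequences;
spool off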

But I remain mystified as to why Oracle Corp will not provide a tool to do this. I can see the argument that doing so would help you migrate your Oracle Data set to a different platform and thus move away from Oracle, but I think that is a dumb argument. If you are willing to dump your Oracle environment for something else you are already not happy – and making life difficult is only going to make you less happy and almost certain to never come back! It’s incredibly inconvenient to extract all your data at present but compared to the cost to your organisation of changing a central part of your infrastructure, it’s just a bloody annoyance. It’s like domestic service providers (telephone, internet, TV) that make it easy for you to sign up but a nightmare to leave. Thus guaranteeing that once you leave (cos you will) you will certainly NOT be going back to them in a hurry!

So for the sake of what I see as a misplaced protectionist stance they close the door to rapidly extracting data from Oracle databases for processing in other ways. I’ve come across other situations like this, before and since, but the Human Genome issue threw it into sharp relief for me. The end result I’ve seen a few times (and effectively what we had at the Sanger) is the data gets stored at least twice – once in Oracle and then somewhere else where it is easy to access. Now that’s a daft and bad place to be, multiple copies and one of them at least lacking RI. Thanks for forcing that one on us Oracle.

Something that is changing is Cloud. Oracle wants us all to move our data and systems into the sky. Their whole marketing message at the moment is nothing but cloud-cloud-cloud and they give the impression that cloud is in fact everything, every solution. So how do you get 10TB of data into the cloud? It’s not so hard really. You can trickle it in over time, after all networks are constantly getting faster, but for bulk data this is a poor solution. Or you can ship your data physically. You can’t beat the bandwidth of a transit van. I’ve heard Oracle people at OOW and other conferences saying this is easy-peasy to arrange (as it should be, I’ve done it for clients doing migrations a couple of times).

But how are you going to get it out again? How well do you think that a company that has never provided SQL*Unloader is going to help you get bulk data back out of your Oracle cloud-based databases into on-premises systems? And how well are they going to help you shift it from one of their systems to another cloud system? One they have not sold you? Sometimes people talk about business relationships being like a marriage. Well once you have gone cloud, you might have ensured any divorce is going to be really messy!

Update – see the comment by Noons on the fact that VLDBs and Data Warehouses are often not as static as you think – so there can be a constant need to move a lot of data into and out of such very large data stores. Cloud is not currently going to support this well. And if you have a data life-cycle management policy, say archiving off data 3+ years old, where are you archiving it off to and how will you do this once it is in the web?

* Please note, this was not human DNA sequence information gathered for medical reasons! It was mostly sequence information for other organisms (chimps, flies, potatoes, rice, viruses, bacteria all sorts) generated by the scientific and, in some cases, biopharma community. There was human DNA sequence in there but by consent and anonymised – you can’t have a human genome sequence without involving some humans! We did not hold your Dodgy Uncle’s DNA test that was taken for that court case…

Friday Philosophy – Your Experience can Keep You Ignorant November 18, 2016

Posted by mwidlake in Friday Philosophy, Knowledge, Perceptions, performance, SQL.

This week I was in an excellent presentation by Kerry Osborne about Outlines, SQL profiles, SQL patches and SQL Baselines. I’ve used three of those features in anger but when I looked at SQL Patches I just could not understand why you would use them – they looked to me like a very limited version of SQL Profiles.

There is a prize for spotting Kerry without a baseball cap


So I asked Kerry about it during his presentation (he had been encouraging us to ask questions and I was curious). The answer? They are almost identical in how they are used and the impact they have but, yes, they are more limited. You can only enter one line of hints (I think it can be more than one hint but I have not checked {update, see comment by Jonathan Lewis – it is 500 characters of hints you can add, so you can comprehensively hint most statements}) with a SQL Patch. But, and this is the crucial bit, they can be used without the tuning pack and can be used on Standard Edition. Which makes them very useful if you are limited to using a version of SE or have not paid for the extra cost tuning pack option on Enterprise Edition. Kerry told me the first part about no cost and Kamil Stawiarski the part about being available on SE.
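(For reference, a sketch of creating a SQL Patch. From 12.2 the call below is public and documented; on earlier versions an internal DBMS_SQLDIAG_INTERNAL routine did the same job, so check your version. The statement, hint and patch name are illustrative only:)

declare
  v_patch varchar2(128);
begin
  v_patch := dbms_sqldiag.create_sql_patch(
               sql_text  => q'[select * from person where dob = :1]'
              ,hint_text => 'INDEX(@"SEL$1" "PERSON"@"SEL$1")'
              ,name      => 'person_dob_patch');
  dbms_output.put_line('Created SQL Patch: '||v_patch);
end;
/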

That’s a really useful thing to know. Why did I not know it? Because nearly all my experience of performance work has been for clients who have Oracle Enterprise Edition and the tuning pack. Most companies who are willing to hire someone to do Oracle Performance work have paid for Oracle Enterprise Edition and usually for many of the options. A company who is saving money by having SE is far less likely to have the money to hire external consultants (more the pity, as I would really like to spend time working for smaller companies where you can usually get more done!)

My experience, or rather lack of it, had blinded me to the possible uses of an Oracle performance feature. I wonder how much other stuff I don’t know or appreciate about an area I claim to be knowledgeable and skilled in – because my experience is mostly for clients with a very full Oracle tool set? How true is this in all sorts of areas of my technical and personal life? Quite a lot I suspect. That can be a little dis-spiriting.

But mulling this over later that evening (with beer of course) four things occurred to me:

  • If someone claims skills in an area but does not know things you do, it could well be simply down to their personal experience to date. It does not mean you know more, it means you know different stuff.
  • How much does someone have to not know before you decide they are not the expert they claim?
  • Do we partial experts keep ourselves slightly ignorant by not asking questions – as we fear that second point?
  • I felt I did the right thing in asking a question about something I felt I should know – but did not. As now I know the answer (and two people at least got to show they know more than me).

I know I have said this before, as many others have. The more you know, the more you realise what you do not know. You just have to keep asking and remember that, we are all ignorant about something until someone tells us about it.

OTN Appreciation Day – Instrument Your Damned Code. Please! October 11, 2016

Posted by mwidlake in ACED, performance, PL/SQL.

Today is OTN Appreciation Day.

This day is the idea of Tim Hall, Mr OracleBase, and you can See his post here. The idea is that as a sign of appreciation to OTN we do a technical (or not so technical) post on a feature of Oracle we like. I’m going to visit an area I have mentioned before…

DBMS_APPLICATION_INFO.

In my opinion, one of the key differences between good code and poor code is that it is instrumented in some way. I certainly believe that an application that is instrumented is much easier to live with and, over the lifetime of that application, runs much more smoothly and significantly more efficiently as (a) problems can be resolved more easily and (b) any change in code performance is easy (and thus more likely) to be identified. It is because of this second point that I feel the overhead of instrumentation is not 10% or 20% but rather -15% – it improves overall performance.

The easiest way to instrument PL/SQL code is to use DBMS_APPLICATION_INFO. You use simple calls to set the values for the columns MODULE and ACTION in V$SESSION and you can also populate a third column, CLIENT_INFO.

ora122> desc v$session
 Name                                      Null?    Type
 ----------------------------------------- -------- ------------------------
 SADDR                                              RAW(8)
 SID                                                NUMBER
 USERNAME                                           VARCHAR2(30)
...
 MODULE                                             VARCHAR2(64)
 MODULE_HASH                                        NUMBER
 ACTION                                             VARCHAR2(64)
 ACTION_HASH                                        NUMBER
 CLIENT_INFO                                        VARCHAR2(64)

It is incredibly easy to use the DBMS_APPLICATION_INFO built-in package to set these columns via the procedures set_module(module_name,action_name), set_action(action_name), and set_client_info(client_info). The values set are held in memory, and there is no inserting or updating of any table rows – so it is incredibly quick and light-weight. V$SESSION is a view on top of memory structures.
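As an illustration, a minimal sketch of instrumenting a batch procedure – the procedure, step and label names are made up, not from any real system:

create or replace procedure load_daily_transactions is
begin
  -- label the session for anyone looking at V$SESSION, OEM or ASH
  dbms_application_info.set_module(
    module_name => 'DAILY_LOAD', action_name => 'STARTING');

  dbms_application_info.set_action('LOAD TRANSACTIONS');
  -- ... the actual load step ...

  dbms_application_info.set_action('UPDATE BALANCES');
  -- ... the next step; free-text progress can go in CLIENT_INFO ...
  dbms_application_info.set_client_info('processing day '||to_char(sysdate,'YYYYMMDD'));

  -- clear the labels on the way out so the session is not mislabelled later
  dbms_application_info.set_module(null, null);
end load_daily_transactions;
/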

Part of the reason that DBMS_APPLICATION_INFO is so powerful is that nearly all performance and development tools that are used for Oracle will show you this information. Of course this is especially true of Oracle tools such as OEM/Grid Control and SQL*Developer. Thus when you are looking at what code is taking up time or running slowly, you can use MODULE and ACTION to identify the exact code and area where the issue is – you are not having to guess. One of the reasons that performance tuning and problem resolution can be so hard is that you are guessing as to the source of the problem. Instrumentation means you know, you are not guessing. DBMS_APPLICATION_INFO is simple and thus, almost as a direct result of this, efficient at identifying where the problems are without blinding you with too much information.

The below is from OEM Express, showing which modules and actions are taking up the most time.

OEM Express showing what modules take up time


I go into more details about DBMS_APPLICATION_INFO at this post about it and I go into a lot of detail about this and other types of PL/SQL instrumentation in the book I contributed to – click on the link on the right of this web page for details on that 🙂

The OTN Program
Like many people I am doing a technical blog today in order to publicly thank the Oracle Technology Network for what they do. Why do I want to do this? Well, because they have helped me for years and they support me helping others.

As I write this blog I am rudely ignoring Ludovico Caldara explaining 12c dynamic optimizer features to a carefully listening crowd in Denmark, as part of the OTN Nordic tour. They help fund some of us on the ACE program to go on these tours and spread knowledge with people. I would struggle to do all the talks I do if, as an ACE Director, the OTN ACE program did not help cover some of my expenses. They do not pay everything but, heck, I can’t go complaining about them *only* paying for some of it!

OTN provide huge amounts of information on their web site(s), the place you go to download the software and documentation for free.

The final thing for me personally is that they help me contact people in Oracle who can help me better understand parts of the tech or even report back issues and help get them resolved.

Thank You OTN

OUG_Ireland – Bread, Beer & Playing with the Big Boys March 7, 2016

Posted by mwidlake in conference, performance, Presenting, User Groups.

The last banner turned out to be wrong



Learning, sharing & a small dose of fear – Last week was the OUG Ireland conference 2016. It is the second time I’ve been to the Irish Oracle user group conference and this year it had grown from one day to two days – and that was a great success. The expanded 2-day event, as initially sized, booked up with several weeks still to go. More places were made available – and they all filled as well. You could tell this, at least in the sessions I attended – they were all busy and many were full. The whole event had a busy feel.

I was doing two quite different sessions. The first was my overview of the core Oracle technology, a broad-brush, broad-appeal session. At the other end of the spectrum, the second was about the details of context switching between SQL & PL/SQL, the impact it can have and the improvements in 12C to help reduce that impact. People are becoming more aware of this aspect of calling PL/SQL from SQL and also the potential to lose your point-in-time view of the database, which many take for granted. I actually dropped the loss of point-in-time view out of the talk as it made the session too long – but someone asked directly about it! So I spent the last 5 mins I’d left for Q&A talking about it. If I repeat the session somewhere, perhaps I’ll add those slides back in and just talk less rubbish about me earlier on.

Oren Nakdimon actually mentioned one of the 12C improvements (PL/SQL in the WITH clause) in his excellent talk on the first day “Write less (code) with More (Oracle 12c new features)” including one aspect I had missed. So I stole his material and added a slide into my deck for Friday. I’m glad I referenced him on that slide – as he was sat in the front row in my talk 🙂

(as an aside, this is my second time at OUG Ireland and the second time I’ve met someone from Israel who I shared beers with & spent quite a bit of time with – just one of those quirky coincidences in life. Oh, and I met his wife Inbal too and we had a chat about translating Sci-Fi from English to Hebrew! It’s not easy, but Klingon is easier to handle than you’d think).


See, no beer! (it’s inside the podium…)

My other talk was on the first day and it was the one where I cover the core Oracle technology in a single presentation slot. It’s a bit of a Big Ask in 45 minutes but I tend to throw in less anecdotes or stories than normal in order to cover the material. However, this time I could not resist the temptation to include one, in order to continue the unfair myth that I can only present with a beer in hand…

I got a lot of great feedback about this talk, several people checked that they would be able to get the slides afterwards (they have been sent to the organisers and they should be put out in due course – but I can send anyone a copy), I got a few more questions during the conference, I spent about 30 minutes talking to one chap afterwards (about a very “interesting” project he is on where sense does not seem to be prevailing) and a lot of people said they enjoyed it. For a speaker, there is no better result and I was quietly happy. I might go as far as to say I was jolly chuffed.

I managed to get to a fair few presentations by others at this conference too and I did not see a bad one in the DB or Development stream. Jonathan Lewis and Joze Senegacnik were as excellent as ever (I think Jonathan was particularly on form) and I was reminded at how good Carl Dudley is at this presenting lark. I missed out seeing my Polish ORA600 counterpart, Kamil Stawiarski, due to a clash for one session and presenting at the same time as him on the other. But I got to have beer with him and his friend Adam Bolinski and a good few chats.

On the topic of Beer, as well as being forced to have one in my presentation on the first day (cough!), we had an excellent “presenters beers” on Wednesday evening before the event, organised by Brendan Tierney. Brendan had us meet up, on a foul and wet evening, at a pub with its own micro-brewery (so I was able to avoid Guinness which, I have to confess, outside of the Guinness Museum is Not My Thing) and my nephew Tim Hall’s (oracle) wife, Debra Lilley {they are not really married, but sometimes you wonder at the bickering…}, organised an excellent Oracle Ace meal on the Thursday night. I had to drop out early though as sleep deprivation was killing me (and only got worse for the final day – that was one of the noisiest hotels I have ever stayed in and I won’t be rushing back) but I did get to share the taxi back to the “Hotel Of Random Noise” with Debra (she was flying somewhere at early-O’clock the next day) and catch up.

I actually see the social side of conferences as just as important as the technical side. I have made some good friends with people at these technical meetings and the more you know people in the user community, the more you get out of it.


Well, Officer, It’s yeast… Honest… Honest!!!! Argh!

So I’ve mentioned the beer. What about the bread? Well, I make sourdough. I have a pot of “starter mixture” in my fridge, which is a live culture of yeast in a slack dough mix. It’s alive (sorry, “It’s ALIIIIIIIVEEE!”), it needs looking after. It is, basically, one of the most boring pets you can have – but at least it never needs to be taken to the vet or does its business on the carpet. There are a few of us bakers in the Speaker Circuit. Jonathan is another one and so is Joze. And Joze also has a yeast colony pet in the fridge. We talked about our sour-dough pets at a conference last year and, as a result, Joze bought me a little bag of powder – dried mixture, stuffed with yeast. I’m going to start up a new jar of dough mixture, sourced from this powder. I’ll do a post about what happens.

There is one aspect of being given a bag of dried dough when on conference. You have to take it home. In your luggage. In my case, hand luggage. And in November, for the first time ever that I can remember, they decided to check my hand luggage. What if they did that this time?
“Is this your bag, Sir?”
“Yes”
“Is this plastic bag of a pale, dried mixture yours, Sir?”
“Yes Officer, but it’s OK it’s yeast!” – Bright Smile.
“Yes, of course it is Sir. Come this way.”
“But it’s just yeast and flour, it’s nothing dodgy!!!”
“Of course it is not Sir, and now my colleague here will now examine places you don’t want examining for more such samples”.

It added an extra level of excitement into what is usually a dull experience of international air travel.

The "rose" between two "thorns". If Thorns means The Best

The “rose” between two “thorns”. If Thorns means “The Best”

So that was the dose of fear? No. I did two talks at the conference but I had one other duty. At the UKOUG Tech conference last December, Jonathan Lewis hosted a Q&A session on the Cost Based Optimizer. We had Jonathan, Chris Antognini, Maria Colgan and Nigel Bayliss, with myself and Neil Chandler as Masters of Ceremony. ie, Question Wranglers. It worked really well, with some genuine-but-known questions to start until the audience warmed up, and they warmed up.

Jonathan decided to repeat this format in Ireland and he asked me to reprise my role, which I was delighted to do. I’m more than happy to fire in the questions to the experts, so long as I’m here at the side and they are there up on the stage.

One of our experts for Ireland had to drop out due to only being there on the Thursday. Jonathan asked me who can fill in. No problem, we have skills across the agenda, “Dave” can help. Dave said yes…but then no, it turned out Dave could not do it. So I was asked again who could fill and we got Jessica – but Jessica had to leave early. You can see what is coming, can’t you? We were out of replacements. A panel of 2 does not work really, you need three or more. So, I was promoted to Panel. Oh cr4p!

Alan Kelly stepped into the breach to be Question Master (Personally, I don’t know whether to thank Alan or not! But for the sake of the session, thank you Alan) and I joined two masters of the art on stage. Don’t get me wrong, this is not false modesty, I’m pretty good at this stuff. But I’m “Pretty Good”. Calm, Breath, Relax…

It was fine. Jonathan and Joze handled the tricky couple of questions and let me talk when I had stuff to add and the whole session was a hum-dinger. How do I know that? Well, the session was for 45 minutes at the end of the last day, on a Friday with all the usual pressures to get away and get home. And yet, despite that, we finally closed the session down after 90 minutes. I have to confess, I’m deeply proud of the fact I was able to play an active role and not just sit there.

All in all, I had a fantastic time and I think it was a cracking conference. Can I come next year? (Only, next year, can I keep to being Question Master! Alan can have a go up on stage)

Getting Your Transaction SCN – USERENV(COMMITSCN) January 19, 2016

Posted by mwidlake in development, performance, SQL.
Tags: , , ,
3 comments

A few days ago I was introduced (or re-introduced) to USERENV(‘COMMITSCN’) by Jonathan Lewis. This is an internal function that allows limited access to the SCN of your transaction.

I was trying to find a way to get the actual commit SCN easily as it struck me that Oracle would have it to hand somewhere and it would be unique to the change and generated very efficiently. I could not find anything to do it so I asked Jonathan and he pointed me straight to this post he did about it a while back. What a nice chap. However, the post is from 1999 (last CENTURY!) so I thought I should just check it out first…

I’ll say up front that this is an undocumented feature of a now-deprecated function. Well, almost undocumented – older SQL reference manuals mention that ‘commitscn’ returns a NUMBER (as opposed to the VARCHAR2 returned by most parameters you can use with USERENV), but they do not list it as a valid parameter for that function.
USERENV has been deprecated since Oracle 10g (see the old 10g documentation link here about USERENV) and you have been instructed to use SYS_CONTEXT(‘userenv’,’parameter‘) as described in the 12c database SQL reference manual here. However, there is no way to get the commit SCN from SYS_CONTEXT that I can find, so I thought I’d check out if USERENV(‘COMMITSCN’) still works. It does, on my version of Oracle 12.1.0.2!
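
For a flavour of the documented route, here is a quick sketch (nothing more) of the sort of thing SYS_CONTEXT does give you – a couple of standard USERENV parameters. There is no ‘commitscn’ parameter to ask for:

-- documented USERENV parameters are available via SYS_CONTEXT
select sys_context('userenv','current_schema') current_schema
      ,sys_context('userenv','sessionid')      session_id
from dual;
-- but no 'commitscn' parameter is documented here, so no commit SCN this way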

There are some strict limits to this function. To begin with, you can’t select it; you can only use it in an insert or update:


-- attempt a simple select of the SCN
mdw> select userenv('commitscn') from dual;
select userenv('commitscn') from dual
               *
ERROR at line 1:
ORA-01725: USERENV('COMMITSCN')  not allowed here

-- But I can use it in an insert,
mdw> insert into test_scn (seq_no,vc1,scn_no)
  2  values (100,'abcd',userenv('commitscn'))

1 row created.

mdw> select * from test_scn where seq_no=100

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
       100 abcd         11144739

-- Now commit my new record
mdw> commit;
Commit complete.

mdw> select * from test_scn where seq_no=100
  2  /

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
       100 abcd         11144753

--LOOK at the value for SCN_NO now! Compare to before the commit!

If you look at the above you will see a couple of things. The first is that, as I said, you cannot SELECT the function USERENV(‘COMMITSCN’).
The other is that, although a value is put into the column when I insert a row using it (and I see that value when I query the row back), the value changes when I commit. This is because Oracle is recording something at the point of commit, not at the point the SQL statement runs – and the new SCN is only generated when you commit. A lot could have happened since I did the INSERT; I might have gone for a cup of tea while a batch job kicked off and did a million transactions, each with its own SCN. So although a placeholder of the current SCN is put into your view of the table row, the value put in the actual table is generated at the time of the commit.

Another limiting rule is that you can only reference USERENV(‘COMMITSCN’) once in a transaction, for one row. If I try to create two rows using the function in the same transaction I get an error, and if I try to update more than one row I also get an error:

mdw> insert into test_scn (seq_no,vc1,scn_no)
  2  values (101,'abcd',userenv('commitscn'))

1 row created.

mdw> insert into test_scn (seq_no,vc1,scn_no)
  2  values (102,'abcd',userenv('commitscn'))

insert into test_scn (seq_no,vc1,scn_no)
            *
ERROR at line 1:
ORA-01721: USERENV(COMMITSCN) invoked more than once in a transaction

-- now test updating several records
mdw> commit;
Commit complete.

mdw> select * from test_scn;

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
         1 AAAAAAA      11143743
         2 AAAAAAA      11143746
         3 AAAAAAA      11143749
         4 AAAAAAA      11143774
         5 AAAAAAA      11143777
       100 abcd         11144753
       101 abcd         11145543

mdw> update test_scn set scn_no = userenv('commitscn');
update test_scn set scn_no = userenv('commitscn')
       *
ERROR at line 1:
ORA-01721: USERENV(COMMITSCN) invoked more than once in a transaction

-- but one row works
mdw> update test_scn set scn_no = userenv('commitscn') where rownum =1;
1 row updated.

USERENV(‘COMMITSCN’) is old, undocumented and limited in use. So why am I looking at it, let alone telling you all about it? {Apart from the fact that a lot of you *love* this sort of obscure, tid-bitty stuff.} Well, because there is no replacement for it that I am aware of. You can get the current SCN in a couple of ways, the easiest probably being to get it from V$DATABASE:

mdw> select current_scn from v$database;
any key>

CURRENT_SCN
-----------
   11146718

However, that is simply the latest SCN at the time you check it and is not the SCN of your commit – i.e. it is a different thing. I always find it irksome on those odd occasions when something is deprecated in Oracle with nothing really to replace it.
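
The other common way I know of to read the current SCN is DBMS_FLASHBACK – shown here only as a sketch, and assuming you have the privileges to execute the package. Again, it gives you the SCN at the moment you ask, not the SCN of your commit:

-- current SCN via DBMS_FLASHBACK (needs execute on the package / flashback privileges)
select dbms_flashback.get_system_change_number current_scn from dual;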

Below I demonstrate again that USERENV(‘COMMITSCN’) is a little special, and not the same as just selecting the SCN from V$DATABASE. Before I go any further, I think the value USERENV(‘COMMITSCN’) puts into the table is the actual commit SCN minus 1. I mostly think this because Jonathan said so :-). I also see, each time I run this test, that selecting from V$DATABASE and then doing my insert and an immediate commit results in a value in the table 2 higher than the select.

Further iterations (the 2nd and 3rd in this case) show that the value selected from V$DATABASE and the value inserted into TEST_SCN immediately after are the same, and are 3 higher than the previous iteration. I anticipated an increase of two – one for the change to the UNDO tablespace for the insert and one for the insert itself – and I am not sure where the third one comes from.

However, in the fourth iteration I have a PAUSE in my SQL*Plus script between checking V$DATABASE and doing my insert and, during the pause, I do some simple inserts and commits in a second session {it does not matter what, so I don’t show it}. Thus there is a larger gap between the SCN I collected from V$DATABASE and the value inserted into the table.
Finally, in the fifth iteration, I check the value in V$DATABASE, do the insert, query it back and see the two are the same. And THEN I pause, so I can make some changes and commit them in my second session. After I’ve done that I continue my first session, which commits my latest insert into TEST_SCN. I check the values actually stored in the table and, just as at the top of this post, you see that the value actually preserved in the table is a later SCN than the placeholder one. It is doing something special.

(The script below includes the noddy code to create my test table and sequence, as well as the test itself.)

-- test_scn1
-- create table test_scn (seq_no number, vc1 varchar2(10), scn_no number);
-- create sequence scn_seq;
select current_scn from v$database;
insert into test_scn values (scn_seq.nextval,'AAAAAAA',userenv('commitscn'));
commit;
select * from test_scn;
select current_scn from v$database;
insert into test_scn values (scn_seq.nextval,'AAAAAAA',userenv('commitscn'));
commit;
select * from test_scn;
select current_scn from v$database;
insert into test_scn values (scn_seq.nextval,'AAAAAAA',userenv('commitscn'));
commit;
select * from test_scn;
select current_scn from v$database;
pause 'commit some stuff in second session and then press any key'
insert into test_scn values (scn_seq.nextval,'AAAAAAA',userenv('commitscn'));
commit;
select * from test_scn;
select current_scn from v$database;
insert into test_scn values (scn_seq.nextval,'AAAAAAA',userenv('commitscn'));
pause 'commit some stuff in second session again and then press any key'
select * from test_scn;
commit;
select * from test_scn;
select current_scn from v$database;

-- the output of the test
mdw> @test_scn1

--check V$DATABASE SCN
CURRENT_SCN
-----------
   11147809

-- Create and commit 1 row
1 row created.
Commit complete.

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
        11 AAAAAAA      11147811
-- note that the inserted SCN is 2 higher than the current SCN.

--Same steps, 2nd iteration
CURRENT_SCN
-----------
   11147814

1 row created.
Commit complete.

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
        11 AAAAAAA      11147811
        12 AAAAAAA      11147814
-- select SCN and inserted SCN are the same, 3 higher than first iteration

-- same steps, 3rd iteration
CURRENT_SCN
-----------
   11147817

1 row created.
Commit complete.

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
        11 AAAAAAA      11147811
        12 AAAAAAA      11147814
        13 AAAAAAA      11147817
-- select SCN and inserted SCN are the same, 3 higher than the previous iteration

-- 4th iteration, a pause
CURRENT_SCN
-----------
   11147820

'commit some stuff in second session and then press any key'
-- I did indeed change and commit some stuff in second session, before I create my record in test_scn

1 row created.
Commit complete.

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
        11 AAAAAAA      11147811
        12 AAAAAAA      11147814
        13 AAAAAAA      11147817
        14 AAAAAAA      11147831
-- larger gap in SCN (11147817 to 11147831)

-- 5th iteration, pause now after insert and before commit
CURRENT_SCN
-----------
   11147834

1 row created.
'commit some stuff in second session again and then press any key'
-- I did indeed change and commit some stuff in second session 

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
        11 AAAAAAA      11147811
        12 AAAAAAA      11147814
        13 AAAAAAA      11147817
        14 AAAAAAA      11147831
        15 AAAAAAA      11147834

-- Notice the current_scn from V$DATABASE and the last row in the table match: 11147834.

Commit complete.

    SEQ_NO VC1            SCN_NO
---------- ---------- ----------
        11 AAAAAAA      11147811
        12 AAAAAAA      11147814
        13 AAAAAAA      11147817
        14 AAAAAAA      11147831
        15 AAAAAAA      11147842

-- But after the commit the SCN in row "15" has increased to 11147842.

CURRENT_SCN
-----------
   11147851

-- and the next check of V$DATABASE shows the SCN has moved on again following the commit.

As you can see from the above, USERENV(‘COMMITSCN’) is doing something a little special and, despite all the limitations, I might actually have a use for it…

Pragma UDF – Some Current Limitations November 11, 2015

Posted by mwidlake in performance, PL/SQL, SQL, Testing.
Tags: , , , ,
12 comments

There are currently some limitations to when pragma UDF will speed up your calls to PL/SQL functions from SQL.

In my post introducing the new pragma UDF feature of Oracle 12c I explained how it can be used to reduce the impact of context switching when you call a PL/SQL function from SQL.

In my example I showed how a SQL-only SELECT statement that formatted a name for display over 100,000 records took 0.03 seconds, but this went up to 0.33 seconds when the formatting logic was moved into a user-defined PL/SQL function. This impact on performance is a shame, as it is so beneficial to encapsulate business logic in one single place with PL/SQL. Declaring the PL/SQL function as a user-defined one with the pragma UDF option reduced the run time to 0.08 seconds – removing most of the context-switching overhead. Check out the prior post for full details.
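
If you have not seen the pragma before, this is roughly all there is to the syntax – a minimal, illustrative sketch (the function name and body here are made up), with the pragma sitting in the declaration section:

create or replace function disp_initcap (p_txt in varchar2)
return varchar2 is
  pragma udf;   -- tell the compiler this function is intended to be called from SQL
begin
  return initcap(p_txt);
end disp_initcap;
/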

This improvement in performance is great news and is as good as, and sometimes better than, using the other new 12c capability of declaring a function as part of a SQL statement using the WITH clause (I plan to do a further post on that). As a quick example, here is my display name function expressed within a WITH clause:

with
  function l_disp_name(p_sn      in varchar2
                      ,p_fn1     in varchar2
                      ,p_fn2     in varchar2 :=null
                      ,p_title   in varchar2 :=null )
return varchar2
is
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end l_disp_name;
select  max(l_disp_name(p_sn =>surname           ,p_fn1  =>first_forename
                       ,p_fn2=>second_forename   ,p_title=>pers_title)        ) text_output
       ,count(*)
from pers
/

The above runs in 0.10 seconds, just slightly slower than the 0.08 seconds for my pragma UDF function.

However, I need to warn you of some current limitations to pragma UDF. Nearly all the examples on the web so far:

  • use very, very simple functions that take in a number and return a number
  • use a stand-alone stored function

And they work fine. However, I had real trouble getting a performance gain when I was working with my function that took in four varchar2 inputs and returned a varchar2 value. No error was given when I marked the function with pragma UDF but there was no performance gain (or loss).

I eventually worked out some limitations to pragma UDF on my version of Oracle, 12.1.0.2.0:

  1. It gives a performance boost when the inputs and return values are NUMBER or VARCHAR2, including multiple VARCHAR2 IN parameters
  2. There is no performance boost when either the IN parameters or the RETURN value (or both) are DATEs
  3. There is no performance boost if any of the VARCHAR2 IN parameters have default values
  4. If a function gains a performance benefit from pragma UDF as a standalone stored function, it appears also to gain it when defined with pragma UDF within a package – so you can still keep all your functions in packages

You might notice that my example of using the WITH clause declares a function that has default values. The WITH option gains its performance advantage just fine even with IN parameter defaults.

The take-home message is that, at present, pragma UDF only seems to help functions with certain types of IN and RETURN values, and the benefit is nullified by default values – so if you see no performance gain for your functions, this might be why. I need to stress that my tests were not exhaustive; I have not investigated many other combinations.

I’ve discussed the issue with a couple of people within Oracle and the relevant Product Manager is looking to investigate further for me, which is jolly decent of the fellow.

My investigation is, of course, only empirical testing; it does not reveal how pragma UDF works. But, as I said in my first post, it seems to aid how information is passed between the PL/SQL and SQL engines, as it is variations in those types that seem to nullify the benefit of pragma UDF. If you want to duplicate my tests, you can do so with the scripts below. I show my test output first, with comments produced with PROMPT commands. I then give you the SQL to create the test table, the functions and the package I used, and finally the test script itself. Feel free to take them, expand them and let me know of anything different or further you may find. I say nothing of interest after the scripts, so this is in effect the end of the post 🙂

The output of my test, with prompts:

running udf_tests
investigating why pragma udf helps some simple functions and not others
-----------------------------------------------------------------------

simple number in-number out function
NUMBER_OUTPUT   COUNT(*)
------------- ----------
  10000000000     100000
Elapsed: 00:00:00.12

NUMBER_OUTPUT   COUNT(*)
------------- ----------
  10000000000     100000
Elapsed: 00:00:00.03
** udf helps

simple varchar in-varchar out function
TEXT_OUTPUT                      COUNT(*)
------------------------------ ----------
XYYYY                              100000
Elapsed: 00:00:00.12

TEXT_OUTPUT                      COUNT(*)
------------------------------ ----------
XYYYY                              100000
Elapsed: 00:00:00.04
** udf helps

two varchar in-varchar out function, is the issue with more than one in parameter?
TEXT_OUTPUT                      COUNT(*)
------------------------------ ----------
XYYYYYYYYY                         100000
Elapsed: 00:00:00.14

TEXT_OUTPUT                      COUNT(*)
------------------------------ ----------
XYYYYYYYYY                         100000
Elapsed: 00:00:00.04
** udf helps

simple date in-date out function
DATE_OUTPUT            COUNT(*)
-------------------- ----------
14-MAY-2010 13:11        100000
Elapsed: 00:00:00.15

DATE_OUTPUT            COUNT(*)
-------------------- ----------
21-NOV-2004 13:11        100000
Elapsed: 00:00:00.15
***************************************************************SIMILAR TIME!!!
udf does not help

is date out the issue,  date in-num out function
NUMBER_OUTPUT   COUNT(*)
------------- ----------
      2454431     100000
Elapsed: 00:00:00.17

NUMBER_OUTPUT   COUNT(*)
------------- ----------
      2454231     100000
Elapsed: 00:00:00.18
***************************************************************SIMILAR TIME!!!
udf does not help

is date in the issue,  num in-date out function
DATE_OUTPUT            COUNT(*)
-------------------- ----------
07-AUG-2018 18:11        100000
Elapsed: 00:00:00.21

DATE_OUTPUT            COUNT(*)
-------------------- ----------
11-NOV-2015 17:57        100000
Elapsed: 00:00:00.21
***************************************************************SIMILAR TIME!!!
udf does not help

so back to my original function I had issues with
a difference between the multiple-vc-in func and my orig func is that my orig had defaults
thus I will try a version with no defaults
TEXT_OUTPUT                      COUNT(*)
------------------------------ ----------
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000
Elapsed: 00:00:00.19

TEXT_OUTPUT                      COUNT(*)
------------------------------ ----------
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000
Elapsed: 00:00:00.08
****************************************************************UDF has an IMPACT

now with one of the parameters set to a default
TEXT_OUTPUT                      COUNT(*)
------------------------------ ----------
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000
Elapsed: 00:00:00.32

TEXT_OUTPUT                      COUNT(*)
------------------------------ ----------
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000
Elapsed: 00:00:00.32
****************************************************************UDF has NO IMPACT
****************************************************************ALSO defaults cause both versions to be slower

now call the simple udf_disp_name function (which benefits standalone) from within a package

standalone
TEXT_OUTPUT                      COUNT(*)
------------------------------ ----------
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000
Elapsed: 00:00:00.08

within package
TEXT_OUTPUT                      COUNT(*)
------------------------------ ----------
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000
Elapsed: 00:00:00.08
*********************************************** WORKS - so long as neither spec nor body has prm defaults

and just to round off, using a subquery factored function which my prior tests showed reduced overhead
TEXT_OUTPUT                      COUNT(*)
------------------------------ ----------
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000
Elapsed: 00:00:00.10

the WITH function benefits even with defaults
TEXT_OUTPUT                      COUNT(*)
------------------------------ ----------
Ms Wyyyyyyyyy W Wyyyyyyyyy         100000
Elapsed: 00:00:00.10

Creating the test table

drop table pers;
create table pers
(pers_id             number(8)    not null
,surname             varchar2(30) not null
,first_forename      varchar2(30) not null
,second_forename     varchar2(30)
,pers_title          varchar2(10)
,sex_ind             char(1)      not null
,dob                 date
,addr_id             number(8)
,pers_comment        varchar2(2000)
)
/
insert into pers
select rownum
      ,rpad(chr(65+mod(rownum,24)),10,chr(65+mod(rownum,25)))
      ,rpad(chr(65+mod(rownum,24)),10,chr(65+mod(rownum,25)))
      ,rpad(chr(65+mod(rownum,24)),10,chr(65+mod(rownum,25)))
      ,decode(mod(rownum,4),0,'MR',1,'MRS',2,'Ms',3,'MR','DR')
      ,decode(mod(rownum,2),0,'M',1,'F')
      ,sysdate - (3000+mod(rownum,30000))
      ,rownum +1001
      ,rpad(chr(65+mod(rownum,24)),200,chr(65+mod(rownum,25)))
from dual
connect by level < 100001
/

Creating the functions and a small package

--num_num
CREATE OR REPLACE FUNCTION normal_num_num(p_id IN NUMBER) RETURN NUMBER IS
v_num number;
BEGIN
  v_num:=p_id*p_id;
  RETURN v_num;
END;
/
CREATE OR REPLACE FUNCTION udf_num_num(p_id IN NUMBER) RETURN NUMBER IS
PRAGMA UDF;
v_num number;
BEGIN
  v_num:=p_id*p_id;
  RETURN v_num;
END;
/
--
-- vc_vc
CREATE OR REPLACE FUNCTION normal_vc_vc(p_id IN varchar2) RETURN varchar2 IS
v_vc varchar2(100);
BEGIN
  v_vc:=substr(p_id,1,5);
  RETURN v_vc;
END;
/
CREATE OR REPLACE FUNCTION udf_vc_vc(p_id IN varchar2) RETURN varchar2 IS
PRAGMA UDF;
v_vc varchar2(100);
BEGIN
  v_vc:=substr(p_id,1,5);
  RETURN v_vc;
END;
/
-- vc_vc_2
CREATE OR REPLACE FUNCTION normal_vc_vc_2(p_id1 IN varchar2,p_id2 IN varchar2) RETURN varchar2 IS
v_vc varchar2(100);
BEGIN
  v_vc:=substr(p_id1,1,5)||substr(p_id2,2,5);
  RETURN v_vc;
END;
/
CREATE OR REPLACE FUNCTION udf_vc_vc_2(p_id1 IN varchar2,p_id2 IN varchar2) RETURN varchar2 IS
PRAGMA UDF;
v_vc varchar2(100);
BEGIN
  v_vc:=substr(p_id1,1,5)||substr(p_id2,2,5);
  RETURN v_vc;
END;
/
--
-- dt_dt
CREATE OR REPLACE FUNCTION normal_dt_dt(p_id IN date) RETURN date IS
v_dt date;
BEGIN
  v_dt:=p_id+1000;
  RETURN v_dt;
END;
/
CREATE OR REPLACE FUNCTION udf_dt_dt(p_id IN date) RETURN date IS
PRAGMA UDF;
v_dt date;
BEGIN
  v_dt:=p_id-1000;
  RETURN v_dt;
END;
/
-- dt_num
CREATE OR REPLACE FUNCTION normal_dt_num(p_id IN date) RETURN number IS
v_num number;
BEGIN
  v_num:=to_char(p_id,'J')+100;
  RETURN v_num;
END;
/
CREATE OR REPLACE FUNCTION udf_dt_num(p_id IN date) RETURN number IS
PRAGMA UDF;
v_num number;
BEGIN
  v_num:=to_char(p_id,'J')-100;
  RETURN v_num;
END;
/
-- num_dt
CREATE OR REPLACE FUNCTION normal_num_dt(p_id IN number) RETURN DATE IS
v_dt date;
BEGIN
  v_dt:=sysdate+(p_id/100);
  RETURN v_dt;
END;
/
CREATE OR REPLACE FUNCTION udf_num_dt(p_id IN number) RETURN DATE IS
PRAGMA UDF;
v_dt date;
BEGIN
  v_dt:=sysdate-(p_id/100);
  RETURN v_dt;
END;
/
create or replace function normal_disp_name (p_sn      in varchar2
                       ,p_fn1     in varchar2
                       ,p_fn2     in varchar2
                       ,p_title   in varchar2  ) return varchar2 is
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end;
/
create or replace function udf_disp_name (p_sn      in varchar2
                       ,p_fn1     in varchar2
                       ,p_fn2     in varchar2
                       ,p_title   in varchar2  ) return varchar2 is
PRAGMA UDF;
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end;
/
create or replace function normal_disp_name_defaults (p_sn      in varchar2
                       ,p_fn1     in varchar2
                       ,p_fn2     in varchar2
                       ,p_title   in varchar2 :=null  ) return varchar2 is
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end;
/
create or replace function udf_disp_name_defaults (p_sn      in varchar2
                       ,p_fn1     in varchar2
                       ,p_fn2     in varchar2
                       ,p_title   in varchar2 :=null ) return varchar2 is
PRAGMA UDF;
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end;
/
create or replace package t_pkg as
function udf_disp_name (p_sn      in varchar2
                       ,p_fn1     in varchar2
                       ,p_fn2     in varchar2
                       ,p_title   in varchar2 ) return varchar2;
end t_pkg;
/
create or replace package body t_pkg as
function udf_disp_name (p_sn      in varchar2
                       ,p_fn1     in varchar2
                       ,p_fn2     in varchar2
                       ,p_title   in varchar2 ) return varchar2 is
PRAGMA UDF;
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end;
end t_pkg;
/

The test script

-- udf_tests
set lines 100 pages 50
set feed off
col text_output     form a30
col number_output   form 99999999999
col date_output     form a20
spool udf_tests.lst
prompt  running udf_tests
prompt
set pause off
set autotrace off
set timi on
prompt investigating why pragma udf helps some simple functions and not others
prompt -----------------------------------------------------------------------
--
prompt
--
prompt simple number in-number out function
select /* mdw_16a */       max(normal_num_num(pers_id)) number_output
       ,count(*) from pers
/
select /* mdw_16b */        max(udf_num_num(pers_id)) number_output
       ,count(*)  from pers
/
prompt
prompt ** udf helps
prompt
--
prompt simple varchar in-varchar out function
select /* mdw_16c */       max(normal_vc_vc(surname)) text_output
       ,count(*) from pers
/
select /* mdw_16d */       max(udf_vc_vc(surname)) text_output
       ,count(*)  from pers
/
prompt
prompt ** udf helps
prompt
--
--
prompt  two varchar in-varchar out function, is the issue with more than one in parameter?
select /* mdw_16e */       max(normal_vc_vc_2(surname,first_forename)) text_output
       ,count(*) from pers
/
select /* mdw_16f */       max(udf_vc_vc_2(surname,first_forename)) text_output
       ,count(*)  from pers
/
prompt
prompt ** udf helps
prompt
--
prompt simple date in-date out function
select /* mdw_16g */       max(normal_dt_dt(DOB)) date_output
       ,count(*) from pers
/
select /* mdw_16h */       max(udf_dt_dt(DOB)) date_output
       ,count(*)  from pers
/
prompt ***************************************************************SIMILAR TIME!!!
prompt udf does not help
prompt

--
prompt is date out the issue,  date in-num out function
select /* mdw_16i */       max(normal_dt_num(DOB)) number_output
       ,count(*) from pers
/
select /* mdw_16j */       max(udf_dt_num(DOB)) number_output
       ,count(*)  from pers
/
prompt ***************************************************************SIMILAR TIME!!!
prompt udf does not help

--
prompt is date in the issue,  num in-date out function
select /* mdw_16k */       max(normal_num_dt(pers_id)) date_output
       ,count(*) from pers
/
select /* mdw_16l */       max(udf_num_dt(pers_id)) date_output
       ,count(*)  from pers
/
prompt ***************************************************************SIMILAR TIME!!!
prompt udf does not help
--
--
prompt
prompt so back to my original function I had issues with
prompt a difference between the multiple-vc-in func and my orig func is that my orig had defaults
prompt thus I will try a version with no defaults
prompt
select /* mdw_16m */
        max(normal_disp_name(surname,first_forename,second_forename,pers_title)) text_output
       ,count(*) from pers
/
select /* mdw_16n */
        max(udf_disp_name(surname,first_forename,second_forename,pers_title)) text_output
       ,count(*)  from pers
/
prompt ****************************************************************UDF has an IMPACT
prompt
prompt
prompt now with one of the parameters set to a default
select /* mdw_16o */
        max(normal_disp_name_defaults(surname,first_forename,second_forename,pers_title)) text_output
       ,count(*) from pers
/
select /* mdw_16p */
        max(udf_disp_name_defaults(surname,first_forename,second_forename,pers_title)) text_output
       ,count(*)  from pers
/
prompt ****************************************************************UDF has NO IMPACT
prompt ****************************************************************ALSO defaults cause both versions to be slower
prompt
prompt now call the simple udf_disp_name function (which benefits standalone) from within a package
prompt
prompt standalone
select /* mdw_16q */
        max(udf_disp_name(surname,first_forename,second_forename,pers_title)) text_output
       ,count(*)  from pers
/
prompt
prompt within package
select /* mdw_16r */
        max(t_pkg.udf_disp_name(surname,first_forename,second_forename,pers_title)) text_output
       ,count(*)  from pers
/
prompt *********************************************** WORKS - so long as neither spec nor body has prm defaults
prompt
prompt and just to round off, using a subquery factored function which my prior tests showed reduced overhead
with
  function l_disp_name(p_sn      in varchar2
                      ,p_fn1     in varchar2
                      ,p_fn2     in varchar2
                      ,p_title   in varchar2 )
return varchar2
is
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end l_disp_name;
select /*mdw_16s */
        max(l_disp_name(p_sn =>surname           ,p_fn1  =>first_forename
                       ,p_fn2=>second_forename   ,p_title=>pers_title)        ) text_output
       ,count(*)
from pers
/

prompt the WITH function benefits even with defaults
with
  function l_disp_name(p_sn      in varchar2
                      ,p_fn1     in varchar2
                      ,p_fn2     in varchar2 :=null
                      ,p_title   in varchar2 :=null )
return varchar2
is
v_return     varchar2(1000);
begin
  v_return := case when p_title is null then ''
                   else initcap(p_title)||' '
              end
            ||initcap(p_fn1)||' '
            ||case when p_fn2 is null then ''
                   else substr(p_fn2,1,1)||' '
              end
            ||initcap(p_sn);
return v_return;
end l_disp_name;
select /*mdw_16t */
        max(l_disp_name(p_sn =>surname           ,p_fn1  =>first_forename
                       ,p_fn2=>second_forename   ,p_title=>pers_title)        ) text_output
       ,count(*)
from pers
/
--
spool off