jump to navigation

Memory Changes Everything July 12, 2010

Posted by mwidlake in Architecture, performance.
Tags: , , ,

I’ve got this USB memory stick which I use to carry around my scripts, documents, presentations, Oracle manuals and enough music to keep me going for a few days. It is on an 8GB Gizzmo Junior and it is tiny. By tiny I mean as wide as my little finger, the length of a matchstick and about the same thickness of said matchstick. So small that I did indeed lose the damn thing for 6 months before I realised it had got trapped behind a credit card in my wallet.

It cost me ten British pounds about 15 months ago (less than most 4GB USB sticks seem to cost now, but then it is nothing more than the memory chip and connectors wrapped in plastic) and it highlights how cheap solid-state “storage” is becoming.

Connected to this, I was looking at buying a new PC this week and this machine comes with 10 USB slots, if you include the ones on the supplied monitor and stubs on the motherboard.
10 USB slots, 8GB gizzmo memory sticks… That would be 80GB of cheap and fast storage. Now get a few USB hubs and bulk-buy a few dozen cheap USB2 sticks and you could soon have a solid-state database of a few hundred GB for a thousand pounds. Then of course you can have fun seeing where the pinch-points in the system are (USB2 has a maximum speed per port and going USB3 right now is going to break that 1 grand barrier. But give it a year…).

This really started me thinking about when memory-based storage would take over from spinning disk as the best option for enterprise-level storage and my gut feeling is in about 5 years. I think it will be both technically possible and financially viable in much less than that, say as little as 2 years, but the cost of solid-state storage per MB will still be higher than disk by then but potentially much faster. A few considerations going through my mind were:-

  • Disk is getting a lot slower in relation to acreage. By this I mean that, for a single disc drive, capacity is doubling about every 18 months but seek time has hardly reduced in a decade and transfer rate (reading from the physical platters to the units buffer) is again almost stationary, at about 120MB/s for 10,000rpm disk and up towards 180 for those very expensive and noisy 15,000 rpm disks. Being a tad ridiculous to make the point, with modern 3TB disks you could build most Oracle database on one disc. Let’s make it two in a raid 10 configuration for redundancy. My point is, your 3TB database could well be being run right now, for real, across say 5 physical disks with a total sustainable physical throughput of around 500MB a second.
  • Solid state storage seems to be halving in price in more like 8-10 months.
  • IO subsystems are made faster by using RAID so that several physical discs can contribute to get towards the 300MB or so speed of the interface – but solid state is already that fast.
  • IO subsystems are made faster by building big caches into them and pre-fetching data that “might” be requested next. Oh, that is kind of solid state storage already.
  • Solid state storage, at least the cheap stuff in your USB stick, has the problem that you can only write to each bit a thousand or so times before it starts to get unreliable. But physical disk has exactly the same issue.
  • There are new methods of solid-state memory storage coming along – “New Scientist” had a nice article on it a few months ago, and these versions will be even higher density and more long-term reliable.
  • Seek time on solid-state memory is virtually zero, so random IO is going to be particularly fast compared to spinning disk.

Solid state memory needs less power, and thus less cooling, is silent, is potentially denser and is less vulnerable to temperature and humidity fluctuations. I can see it not needing to be kept in a specialist server room with the need for all that air con and ear defenders when you go in the room.
Just somewhere with normal air con and a lock on the door should suffice.
We do not need Solid State storage to match the size of current disks or even be as cheap to take over. As I have already pointed out, it is not acreage you need with physical disks but enough spindles and caches to make it fast enough in relation to the space. Further, we can afford to pay more for solid state if we do not need to keep it in such expensive clean-room like environments.

I can see that in a couple of years for a given computer system, say a mixed-workload order processing system, to support the storage needs we will have maybe a dozen solid-state chunks of storage, perhaps themselves consisting of several small units of memory in some sort of raid for resilience, all able to flood the IO channels into our processing server and the issue will be getting the network and io channels into the server to go fast enough. So don’t, stick all the storage directly into the server. You just got rid of half your SAN considerations.

I’m going to stop there. Partly because I have run out of time and partly because, in checking out what I am writing, I’ve just spotted someone did a better job of this before me. Over to James Morle who did a fantastic post on this very topic back in May. Stupid me for not checking out his blog more often. Jame also mentions that often it is not total throughput you are interested in at all but IOPS. That zero latency of solid-state memory is going to be great for supporting very high IOPS.

Friday Philosophy – The power of cooperation June 27, 2010

Posted by mwidlake in Friday Philosophy, Perceptions, performance.
Tags: , ,

Being the person responsible for the performance of an Oracle-based system can be an oddly lonely working experience. It can also be a very gregarious position as you get to meet a lot of people and discuss lots of different aspects of many systems – all systems seem to have performance issues and so people come along to you, hoping you can identify a “work faster” database setting for them.

But you are often the only person who specialises in Oracle performance. You generally need to be in an organisation that is very large or where performance is key to success for there to be justification for dedicating a person to the topic. To have more than one person dedicated to performance your organisations has to have a very strong focus on getting the best performance out of the Oracle and related systems {or really, really atrocious performance issues 🙂 }. So usually there is no one else around who is as experienced (or more so) as yourself to discuss such things over with or ask for a second opinion.

Which is why I am very lucky at the moment. I’m working in a team of oracle performance people. There are 2.5 of us (one is a manager with other responsibilities, so he only counts as half). Being able to say “Hey, Dave, What do you think of the wait times on scattered reads?” or “how in heck do I force this nested subquery on a view to use a hash join?” and get some other ideas is very valuable.

What is also interesting is how opinions and preferred techniques on tuning can be different and just as valid. As an example, last week I was working on a poorly performing statement. I was at home and it was the evening, so I was not communicating with the rest of the team. I managed to get the code down from 40 minutes to just under 20 by using a combination of a LEADING and USE_HASH hint. I sent the code back to the user. Only to see that within thirty seconds of each other my colleague Graeme had also sent the user a response, again getting the code down to around 20 minutes. Graeme had pulled a chunk of the code into a subquery factoring “WITH” clause and added cardinality hints. Totally different changes.

So Graeme and I then had a “philosophical” discussion about the different changes {“Mine is best” – “No! Mine is, yours is all bloated and complex”- “Your hint is less future-flexible!!!”}. Only joking, we actually discussed the changes and why we each chose what we did. Graeme commented that is showed that tuning was an art and not a science and I countered that it was a science, as we had both identified where we felt the execution plan could be improved but used different techniques to get there. The thing is, Oracle is so complex and has so many options to influence the optimiser that you have flexibility to chose different tools and techniques.

We had both identified the same main improvement but had each come up with different tweaks for later in the plan.

The end result was that we went with Graeme’s main plan {he is bigger than me} but we pulled in my tweak. That bought the execution time down to around 10 minutes, so about four times faster over all and twice as fast of either of us alone. That is one of the advantages of not working alone.

We also then discussed how we could get this code down to seconds with the use of either Materialized views or changing the process that generated the report to do so incrementally and store the daily results. Until one of us realised we had reached the boundary of compulsive tuning disorder. The report running in 10 minutes was enough improvement to satisfy the business, the report was only going to be run over the next couple of months, so spending a day or two re-working it further was going to be fun – but of no advantage to the business. We told each other to finish for the day. So another advantage of not working alone is that not only do you get more technical input but your help prevent each other losing sight of the overall aim.

It really does help to have two people working on the same area.

{There is a sneaky way of getting beyond being a lone performance specialist. If you are in an organisation long enough you can usually find some other idiot who is silly enough to want to learn more about performance and you can train them up. It does not take long before they know enough to start coming up with things you never thought of. Oracle is, after all, full of many ways to do the same thing and you can’t know it all}.

dbms_stats.set_table_stats “defaults” June 21, 2010

Posted by mwidlake in internals, statistics.
Tags: , ,
add a comment

What happens if you call dbms_stats.set_table_stats without passing in any of the values to set?

I know, why would you do it anyway? Well, I did so by accident. If I want to gather quick stats on a test table I execute something like:

exec dbms_stats.gather_table_stats(OWNNAME =>user,TABNAME =>’TEST_TAB1′)
If I am feeling generous I might state ESTIMATE_PERCENT too.

I was doing some testing work and was gathering stats and also setting stats manually. Then I started to see several of my test tables all had 2000 rows and were 100 blocks in size – at least according to the stats. I knew this was not possible. It turned out to be Cut ‘n’ Paste fingerf the trouble and I was issuing.

exec dbms_stats.set_table_stats(OWNNAME =>user,TABNAME =>’TEST_TAB1′)

If the table did not already have stats this set the stats on the table to default values of 2000 rows, 100 blocks. If the table already had stats then they were left as they were.

If those figures ring a bell, then that is because they are the default values used if a table has no stats and you have no dynamic sampling. See this table of defaults

Anyway, below is a little worked example of these default values being set. Oh, version is

TDB> drop table TEST1 purge
  2  /

TDB> select sysdate from dual;
21-JUN-2010 16:52


  4  /

TABLE_NAME                       NUM_ROWS     BLOCKS SAMPLE_SIZE
------------------------------ ---------- ---------- -----------

-- New table, no stats yet gathered, the columns hold null

-- Call dbms_stats.SET_TABLE_STATS, setting nothing
TDB> EXEC dbms_stats.set_table_stats(ownname => user,tabname => 'TEST1')


TABLE_NAME                       NUM_ROWS     BLOCKS SAMPLE_SIZE
------------------------------ ---------- ---------- -----------
TEST1                                2000        100        2000

-- The columns are set to defaults

-- Gather proper stats
TDB> exec dbms_stats.gather_table_stats(ownname =>user,tabname =>'TEST1',estimate_percent=>10)

TABLE_NAME                       NUM_ROWS     BLOCKS SAMPLE_SIZE
------------------------------ ---------- ---------- -----------
TEST1                              205430        956       20543

-- Now use SET_TABLE_STATS as intended, setting numrows to a value
TDB> EXEC dbms_stats.set_table_stats(ownname => user,tabname => 'TEST1',numrows=>5000)

TABLE_NAME                       NUM_ROWS     BLOCKS SAMPLE_SIZE
------------------------------ ---------- ---------- -----------
TEST1                                5000        956       20543

-- Try the naked SET_TABLE_STATS
TDB> EXEC dbms_stats.set_table_stats(ownname => user,tabname => 'TEST1')

TABLE_NAME                       NUM_ROWS     BLOCKS SAMPLE_SIZE
------------------------------ ---------- ---------- -----------
TEST1                                5000        956       20543

TDB> select sysdate from dual;
Any Key>
21-JUN-2010 16:52

-- And let us see how the stats have changed over the last few minutes.
TDB> select table_name,stats_update_time
  2  from dba_tab_stats_history
  3  where table_name = 'TEST1'
  4  /
Any Key>
TABLE_NAME                     STATS_UPDATE_TIME
------------------------------ --------------------------------
TEST1                          21-JUN-10 +00:00
TEST1                          21-JUN-10 +00:00
TEST1                          21-JUN-10 +00:00
TEST1                          21-JUN-10 +00:00

What is AUDSID June 17, 2010

Posted by mwidlake in internals.
Tags: , ,

If you look at v$session you see SID and SERIAL#, which most of us DBA-types know uniquely identify a session and allow you to kill it (*with the relevant permissions and knowledge you are not about to compromise a business process).

But what is AUDSID?

desc v$session
 Name                                      Null?    Type
 ----------------------------------------- -------- ------------
 SADDR                                              RAW(8)
 SID                                                NUMBER
 SERIAL#                                            NUMBER
 AUDSID                                             NUMBER
 PADDR                                              RAW(8)
 USER#                                              NUMBER
 USERNAME                                           VARCHAR2(30)
 COMMAND                                            NUMBER
 OWNERID                                            NUMBER
 TADDR                                              VARCHAR2(16)
 LOCKWAIT                                           VARCHAR2(16)
 STATUS                                             VARCHAR2(8)
 SERVER                                             VARCHAR2(9)
 SCHEMA#                                            NUMBER
 SCHEMANAME                                         VARCHAR2(30)
 OSUSER                                             VARCHAR2(30)

AUDSID is a unique identifier for the session and is used in sys.aud$ , as the SESSIONID column. It is the leading column of the only index on sys.aud$

IND_OWNER   IND_NAME           TAB_NAME           PSN       COL_NAME
----------- ------------------ ------------------ --------- ------------
SYS         I_AUD1             AUD$               1         SESSIONID
                                                  2         SES$TID

All audit records for a session are recorded with the same SESSIONID/AUDSID. Activity can be audited at session level and statement level. If audited at session level, only a single record for that audited thing is recorded in the log. This is updated to indicate any new audited actions that occur on that thing for the duration of the session, so it is important that Oracle can efficiently identify and update that record, as the impact of SQL AUDIT on database functionality wants to be kept to a minimum.

How does AUDSID relate to SID, SERIAL#, USERNAME etc? Let’s have a quick look.

select sid, serial# ,audsid ,username,OSUSER,schemaname
from v$session
       SID    SERIAL#     AUDSID USERNAME             OSUSER                    SCHEMANAME
---------- ---------- ---------- -------------------- ------------------------- --------------
       485       4745     165550 LEX_AAAAAEX          oracle                    LEX_AAAAAEX
       487      23712     165546 LEX_AAAAAEX          oracle                    LEX_AAAAAEX
       497      40388          0                      oracle                    SYS
       498      19269          0                      oracle                    SYS
       502      13362     165432 EAAAAA               govanii                   EAAAAA
       505        407     163821 LEXAAAAA             oracle                    LEXAAAAA
       506       6302 4294967295 SYS                  widlake                   SYS
       511      11702     165518 OAAAAA               backerella                OAAAAA
       512      17076     165490 LEX_AAAAAEX          oracle                    LEX_AAAAAEX
       518       9066     165507 OAAAAA               Manoled                   OAAAAA
       519       6956     163976 LEX_AAAAAEX          oracle                    LEX_AAAAAEX
       521      15272 4294967295 SYS                  widlake                   SYS
       523       4825     163975 LEX_AAAAAEX          oracle                    LEX_AAAAAEX
       536      63941 4294967295 SYS                  backerella                SYS
       524      19740     165548 LEX_AAAAAEX          oracle                    LEX_AAAAAEX
       525        850     165549 LEX_AAAAAEX          oracle                    LEX_AAAAAEX
       526      45112     165516 XXX                  backerella                XXX
       527      11086     163963 LEX_AAAAAEX          oracle                    LEX_AAAAAEX
       529       1662     163818 LEX_AAAAA            oracle                    LEX_AAAAA
       530      62788 4294967295 SYS                  widlake                   SYS
       537          1          0                      oracle                    SYS
       540          1          0                      oracle                    SYS

It does not. But it is interesting to note that this database does not have auditing enabled but the AUDSID is still populated.

It is also interesting that all SYS accounts have either an AUDSID of 0 or the same AUDSID, 4294967295 {which is 1 less than 4294967295…1024*1024*4}. I might come back to that shared AUDSID at a later date…

All the “normal” AUDSIDs are unique and about the same value, 163900 to 165560.

If you look at sys.aud$ the SESSIONID increments as time goes on. By this I do not mean it is in order of SESSIONID, it is not, but as time goes by the SESSIONID increments. That sounds a lot like what you would see with a sequence to me. And in fact that is because the value IS from a sequence:

select * from dba_sequences where sequence_name = 'AUDSES$'

--------------- --------------- ---------- ---------- ------------ ----- - ---------- -----------
SYS             AUDSES$                  1 2000000000            1 Y     N      10000      173695

So the AUDSID is a unique value that comes from a sequence and is used in the audit trail. All SYS accounts have the same AUDSID or have it set to zero.

Why do many sessions have the AUDSID set to zero? Because they are internal Oracle processes:

select sid, serial# ,audsid ,username,OSUSER,schemaname,program
from v$session
where AUDSID = 0
----- ---------- ---------- ------------ ------------ ------------ ----------------------------
  506       6887          0              oracle       SYS          oracle@db25.eng.ham.uk. (q001)
  526      45682          0              oracle       SYS          oracle@db25.eng.ham.uk. (q002)
  531         91          0 YYYYY        oracle       DLPP1        oracle@db25.eng.ham.uk. (J000)
  533          3          0              oracle       SYS          oracle@db25.eng.ham.uk. (CJQ0)
  537          1          0              oracle       SYS          oracle@db25.eng.ham.uk. (q000)
  540          1          0              oracle       SYS          oracle@db25.eng.ham.uk. (QMNC)
  546          1          0              oracle       SYS          oracle@db25.eng.ham.uk. (MMNL)
  547          1          0              oracle       SYS          oracle@db25.eng.ham.uk. (MMON)
  548          1          0              oracle       SYS          oracle@db25.eng.ham.uk. (RECO)
  549          1          0              oracle       SYS          oracle@db25.eng.ham.uk. (SMON)
  550          1          0              oracle       SYS          oracle@db25.eng.ham.uk. (CKPT)
  551          1          0              oracle       SYS          oracle@db25.eng.ham.uk. (LGWR)
  552          1          0              oracle       SYS          oracle@db25.eng.ham.uk. (DBW0)
  553          1          0              oracle       SYS          oracle@db25.eng.ham.uk. (MMAN)
  554          1          0              oracle       SYS          oracle@db25.eng.ham.uk. (PSP0)
  555          1          0              oracle       SYS          oracle@db25.eng.ham.uk. (PMON)

The activity of PMON, SMON, CKPT is not audited of course. And job schedulers doing nothing do not have their activity audited either.

Oh, you remember I said AUDSID was unique? Well, check back at the sequence. It wraps and has a maximum value of 2,000,000,000. It is going to start repeating values after 2 billion sessions. Make a note to ensure you have cleaned up your audit trail before you hit your 2 billion and first session…

INTERNAL_FUNCTION() Impact April 21, 2010

Posted by mwidlake in internals, performance.
Tags: , ,

You may occasionally see something like the following in the filter predicates of an Explain Plan:


What is INTERNAL_FUNCTION, what is the use of INTERNAL_FUNCTION() in Oracle explain plans and why am I putting things like ORACLE INTERNAL_FUNCTION partition exclusion and meaning of internal_function performance almost as random strings in this blog?

Well, I want to get the hit rate up for this post as when I did a quick search on INTERNAL_FUNCTION the current top hit is This unfortunately misleading posting that tells you “It is not a significant overhead in SQL execution performance”. This internal function IS potentially a significant overhead in SQL Execution performance. You can’t add comments to the above site {just book a course or order a book}, so I’m having to create my own page…

INTERNAL_FUNCTION is, as the above posting says, where oracle does an implicit data conversion. In the shown case, as well as in my example in this blog and in most cases I have seen, from timestamp to date.

Why is it an issue? Because it is a function on a column and as such it can have a disasterous impact on sql execution performance.

This code below ( which has most select columns removed for simplicity and to protect the innocent) was performing very poorly, taking over 30 minutes to complete. This is all on Enterprise edition of Oracle.

SELECT trans0_.ID as ID3_1_
, round(transfers0_.AMOUNT,2) as formula10_0_
FROM VW_PMTT trans0_
WHERE  (trans0_.MODIFIED_DT between :p0 AND :p1
        AND trans0_.PERS_ID = :p2)
and transfers0_.ACCOUNT_ID=:p3

This is the plan and the significant filter/access predicates:

Plan hash value: 3261981189
| Id  | Operation                         | Name                 | Rows  |Bytes| Cost   |Pstart|Pstop|
|   0 | SELECT STATEMENT                  |                      |       |     | 67841  |      |     |
|   1 |  SORT ORDER BY                    |                      |     8 | 920 | 67841  |      |     |
|   2 |   VIEW                            | VW_PMTP              |     8 | 920 | 67840  |      |     |
|   3 |    UNION-ALL                      |                      |       |     |        |      |     |
|*  4 |     FILTER                        |                      |       |     |        |      |     |
|   5 |      NESTED LOOPS                 |                      |     4 | 280 | 28490  |      |     |
|*  6 |       TABLE ACCESS BY INDEX ROWID | W_TBLP               |     1 |  14 |     3  |      |
|*  7 |        INDEX UNIQUE SCAN          | PK_W_TBLP            |     1 |     |     2  |      |     |
|   8 |       PARTITION RANGE ALL         |                      |     4 | 224 | 28487  |    1 | 1580|
|*  9 |        TABLE ACCESS BY LOC INd RID| W_PAYMENT            |     4 | 224 | 28487       1 | 1580|
|* 10 |         INDEX RANGE SCAN          | IDX_W_P_TRANS        | 66936 |     |  3345  |    1 | 1580|

Predicate Information (identified by operation id):

   6 - filter("PLY"."ACCOUNTID"=:P3)
   7 - access("PLY"."PERS_ID"=:P2)

As you can see, the view is translated into a simple two-table join {and unioned with a second two-table join, which I removed again for clarity, but if anyone wants the full plan and statement, email me} where:

  • one table is accessed on a unique index (via bind variable P2).
  • Row filtered for P3
  • This table is then joined to a Partitioned table in a nested loop 
  • The table being accessed via a local index range scan.

At first glance, the plan may look fine, but Look at the pstart and pstop. 1 to 1580. That is every partition in the table.
The predicate information shows that the INTERANAL_FUNCTION(“PT”.”MODIFIED”) column is being compared to P0 and P1.

The partition key on the table is:-

Name of Table : W_PAYMENT

------------------------------ ---------- --- -----------
USER.W_PAYMENT                 TABLE       1 MODIFIED

ie the column that is being flitered by. Why no partition pruning?

Partition pruning is not occuring because of the “INTERNAL_FUNCTION” being applied to that column. The CBO is not able to understand how the result of a function will match to the partition values of the column. Not even it’s own, internally used function 😛

For the above, the “date” bind variables were defined as

p0 timestamp := to_timestamp(’01-FEB-2010 00:00′,’DD-MON-YYYY HH24:MI’);
p1 timestamp := to_timestamp(’03-FEB-2010 00:00′,’DD-MON-YYYY HH24:MI’);

I then altered the code such that the bind variables were defined as dates.

p0 date := to_date(’01-FEB-2010′,’DD-MON-YYYY’);
p1 date := to_date(’03-FEB-2010′,’DD-MON-YYYY’);

The plan and significant predicates were now:

Plan hash value: 346162108
| Id  | Operation                         | Name                 | Rows  |Bytes| Cost   |Pstart|Pstop|
|   0 | SELECT STATEMENT                  |                      |       |     | 67841  |      |     |
|   1 |  SORT ORDER BY                    |                      |     4 | 460 | 67841  |      |     |
|   2 |   VIEW                            | VW_PMTP              |     4 | 460 | 67840  |      |     |
|   3 |    UNION-ALL                      |                      |       |     |        |      |     |
|*  4 |     FILTER                        |                      |       |     |        |      |     |
|   5 |      NESTED LOOPS                 |                      |     2 | 140 | 28490  |      |     |
|*  6 |       TABLE ACCESS BY INDEX ROWID | W_TBLP               |     1 |  14 |     3  |      |     |
|*  7 |        INDEX UNIQUE SCAN          | PK_W_TBLP            |     1 |     |     2  |      |     |
|   8 |       PARTITION RANGE ALL         |                      |     2 | 112 | 28487  |  KEY | KEY |
|*  9 |        TABLE ACCESS BY LOC INd RID| W_PAYMENT            |     2 | 112 | 28487  |  KEY | KEY |
|* 10 |         INDEX RANGE SCAN          | IDX_W_P_TRANS        | 66936 |     |  3345  |  KEY | KEY |

Predicate Information (identified by operation id):

   6 - filter("PLY"."ACCOUNTID"=:P3)
   7 - access("PLY"."PERS_ID"=:P2)
   9 - filter(("PT"."TYPE"=1184769 AND "PT"."MODIFIED">=:P0 AND "PT"."MODIFIED"<:P1

The expected rows and bytes drop but the key difference in the plan is Pstart and Pstop are now KEY KEY. ie the CBO will evaluate at run time the values coming back from the Table access of W_TBLP and partition prune the access of the table.

In this situation, the original query took over 30 minutes to return. The new code took 18 seconds.

I tried one final test. I set the bind variables to be varchar2 so that implicit data conversion would occur:
p0 varchar2(20) := ’01-FEB-2010 00:00′;
p1 varchar2(20) := ’03-FEB-2010 00:00′;

With VARCHAR2 the CBO was able to do the implicit conversion without the INTERNAL_FUNCTION and the plan/filter predicates were exactly as for native dates.

It would seem this issue with INTERNAL_FUNCTION blights timestamp to date implicit data conversion but not some other implicit data conversions. Without testing further, I can’t be sure which.

This is a documented feature by Oracle. Unusually (and much to their credit) Oracle even document the potential negative impact of this particular construct with timestamps and dates. See this section in chapter 5 of the Oracle 10g data warehousing guide.

I better balance the fact I linked to an inaccurate posting on the subject with a few good ones.
This is a nice one from McLaud on the OraFAQ site where he has come across it as a problem and with help shows how it is due to implicit conversion of a timestamp from Hibernate.
This blog on the issue arising with Java is a nice example of the fix (use a date to hold a date, not a timestamp).

Finally, this link is to an older but very, very valid rant by Tom Kyte about implicit data conversion and using the correct data type for your data. {Don’t go using numbers or varchars to hold dates, just don’t, and similarly you should avoid implicit data conversion}.

Performance Tipping Points April 13, 2010

Posted by mwidlake in performance, statistics.
Tags: , ,
add a comment

This week I came across a nice example of a performance tipping point. This is where Everything is OK until you reach a point where it all quickly cascades to Not OK.

The below shows the timings for a regulalry run “Alert”. Four times an hour we want to know if something quite important has happened on the system.

---- - ------------- ------------- --------- ------------- -------------------- -----------------
292 S 0410 03:06:00 0410 03:06:31 .000359 0410 03:06:00 20100410 02:51:00 20100410 03:06:00
292 S 0410 03:21:00 0410 03:21:35 .000405 0410 03:21:00 20100410 03:06:00 20100410 03:21:00
292 S 0410 03:36:00 0410 03:36:38 .000440 0410 03:36:00 20100410 03:21:00 20100410 03:36:00
292 S 0410 03:51:00 0410 03:51:33 .000382 0410 03:51:00 20100410 03:36:00 20100410 03:51:00
292 S 0410 04:06:00 0410 04:06:28 .000324 0410 04:06:00 20100410 03:51:00 20100410 04:06:00
292 S 0410 04:21:00 0410 04:21:44 .000509 0410 04:21:00 20100410 04:06:00 20100410 04:21:00
292 S 0410 04:36:00 0410 04:36:27 .000313 0410 04:36:00 20100410 04:21:00 20100410 04:36:00
292 S 0410 04:51:00 0410 04:51:34 .000394 0410 04:51:00 20100410 04:36:00 20100410 04:51:00
292 S 0410 05:06:00 0410 05:06:44 .000509 0410 05:06:00 20100410 04:51:00 20100410 05:06:00
292 S 0410 05:21:00 0410 05:21:43 .000498 0410 05:21:00 20100410 05:06:00 20100410 05:21:00
292 S 0410 05:36:00 0410 05:37:01 .000706 0410 05:36:00 20100410 05:21:00 20100410 05:36:00
292 S 0410 05:51:00 0410 05:52:21 .000938 0410 05:51:00 20100410 05:36:00 20100410 05:51:00
292 S 0410 06:06:00 0410 06:08:09 .001493 0410 06:06:00 20100410 05:51:00 20100410 06:06:00
292 S 0410 06:21:01 0410 06:24:00 .002072 0410 06:21:01 20100410 06:06:00 20100410 06:21:01
292 S 0410 06:36:00 0410 06:40:12 .002917 0410 06:36:00 20100410 06:21:01 20100410 06:36:00
292 S 0410 06:51:00 0410 06:56:54 .004097 0410 06:51:00 20100410 06:36:00 20100410 06:51:00
292 S 0410 07:06:00 0410 07:13:17 .005058 0410 07:06:00 20100410 06:51:00 20100410 07:06:00
292 S 0410 07:21:00 0410 07:29:42 .006042 0410 07:21:00 20100410 07:06:00 20100410 07:21:00
292 S 0410 07:36:00 0410 07:47:48 .008194 0410 07:36:00 20100410 07:21:00 20100410 07:36:00
292 S 0410 07:51:00 0410 08:08:07 .011887 0410 07:51:00 20100410 07:36:00 20100410 07:51:00
292 S 0410 08:08:07 0410 08:29:43 .015000 0410 08:08:07 20100410 07:51:00 20100410 08:08:07
292 S 0410 08:29:43 0410 08:50:10 .014201 0410 08:29:43 20100410 08:08:07 20100410 08:29:43
292 S 0410 08:50:10 0410 09:22:28 .022431 0410 08:50:10 20100410 08:29:43 20100410 08:50:10
292 S 0410 09:22:28 0410 10:27:11 .044942 0410 09:22:28 20100410 08:50:10 20100410 09:22:28
292 S 0410 10:27:11 0410 12:57:16 .104225 0410 10:27:11 20100410 09:22:28 20100410 10:27:11
292 F 0410 12:57:16 0410 14:50:26 .078588 0410 12:57:16 20100410 10:27:11 20100410 12:57:16
292 F 0410 14:50:26 0410 16:49:42 .082824 0410 14:50:26 20100410 10:27:11 20100410 14:50:26
292 F 0410 16:49:42 0410 19:06:10 .094769 0410 16:49:42 20100410 10:27:11 20100410 16:49:42

The first half dozen records show the program ID 292 running in half aminute or so. It varies a little, from just under 30 seconds to 45 seconds. Each run kicks off 15 minutes after the previous and, if you check the START_PARAM and END_PARAM on the right of the listing, you can see that the report runs for the previous 15 minutes. ie the period since the last run ( including the running time of the last itteration).

Then, at 05:36 something happens. The execution takes a full minute. The next run takes 1 minute and 20 seconds. The next run takes over two minutes and each following execution takes a little longer and longer.

At 07:51, 1.5 hours later, something significant happens. The execution takes more than 15 minutes to run. This report is run every 15 minutes. I’m sure you can appreciate that this is a problem. What happens now depends on your architecture.

If you have a scheduler that simply kicks off the report every 15 minutes, at the next scheduled time (08:06 in my case) a new execution will start and you will have two version of the report running for a couple of minutes. As time progresses this overlap increases. When the run time reaches 30 minutes you will now start having 3 reports running at the same time. And soon you will get three, four, five etc version running at the same time. Depending on the number of CPUs and IO bandwidth of your system, how long will it be until it will be doing nothing but running this code?
If the code is doing somthing more than simply selecting data, the chance of the concurrent versions competing for locks or resources is high and can mean that the slow-down of the code escalates rapidly and in a very, very short time, your system is on it’s knees.

In our case, we have some protection against this. The scheduler detects that the previous version has not finished and it waits until it has done so before starting the next execution. so the next execution is delayed. In my example the 08:06 execution is delayed until 08:08, 2 minutes late.

We still have two possible situations. The report could simply continue to slow down at the previous rate and just be delayed longer and longer. In my example it has been slowing down at around 1 minute 20 seconds per run for the last 5 runs.

However, The next execution is further delayed until 08:29 – which is 6 minutes beyond the 15 minute window from 08:08. This latets run of the code is 5 minutes 30 seconds slower than the previous execution, not 1 minute 20 seconds slower. We still have a tipping point. If you remember, this report runs to cover the period since the last execution. As the last execution took more than 15 minutes, the next run has to report on more than 15 minutes. 21 minutes in this case.

The slow-down now rapidly escalates due to not just whatever was slowing the report down initially but also the growing reporting period.

Just 5 runs and 5 hours later, at 12:57:16, the report fails. It now runs for so long that it cannot recreate the data as it looked at the start of the run and we get snapshot-too-old errors. In the listing you just see the second column go from S to F.

It is game over. Thankfully only one copy of the report is running at any time (A version is kicked off as soon as the previous one fails) and so, worst case, is using only one process on the box, one CPU maximum and the IO that one thread can demand.

What was the root cause? Time and statistics and data volume.
Statistics because the job that collects statistics failed earlier.
Time because at 05:50 or so, the run window of 15 minutes was far enough out of the range of known date-times for a partition that the CBO decided only a few rows would be found and swapped to a nested-loop execution.
Data volume as data was coming in quicker as the day progressed and the nested loop plan performance was highly susceptible to increases in data volume.

For an excellent posting on why code suddenly changes it’s execution plan when nothing has happened but time passing, seethis excellent post and comments on the topic on Richard Foote’s blog

{And isn’t his blog so much skinnier than mine? 🙂 }

Friday Philosophy – I killed a presentation April 1, 2010

Posted by mwidlake in Friday Philosophy, performance, statistics.
Tags: , , ,

Well, this week saw the latest Management and Infrastructure SIG. I won’t go into the SIG meeting itself just now, maybe later this weekend. If you want an opinion on it, you might like to see what Graham Oaks said.

Being Chair of the SIG I have to find presenters. Being also one of those rare people who enjoys presenting, I know I can always tap one person to present. Me. {If only I could sing, play an instrument, act, do impression, put both my ankles behind my ears(*) or anything that was of any general interest, getting an audience would be so much easier}.

I decided to tackle a topic I have tried before and which I previously did a very poor show of: “General principles on gathering database statistics”.
It is not a title to strike joy into the soul, I know, but it is really important to getting good and reliable performance out of any modern Oracle database. Even other Oracle DBA types tend to find the topic worthy but boring, but I seem to have been forced to know a lot about it and I’ve become sadly very passionate about it.

So, I tried again. I mean, how long should it take to describe the issues with database statistics and the general principles to gathering them? I took my old presentation and stripped out all code, all SQL syntax, all listing of tables and got it down to describing the process. Then I started adding the bits and pieces you need to know to get this half-right. 132 slides. Hmmmm

I tried again and took it higher level. 48 slides. I can do 48 slides in something between 45 minutes and an hour. I was ready.

I had the last presentation slot of the day. I figure if I am going to foist myself on the audience, they should have the right to leave before I start, without missing any of the good stuff. I had 3/4 of the day’s attendees still with me when I started.

I think I killed my audience. Actually, I know I killed my audience. I lost a few as the slot progressed {I kidded myself they had to get back to the office/catch a train} and I made the fatal mistake of not shutting up when I had done 45 minutes. You see, there was all this important stuff I had not mentioned yet! Thankfully, I had a friend in the second row and when I saw him lose the will to live, I stopped. I’d been at it for 70 minutes. What we really needed was the Chair to shut me up but I was the chair.

Anyway, I came away from the talk with two realisations.

  • The topic of gathering database statistics is too large to cover in one presentation session.
  • It really is very boring, even when you are passionate about it.

I think I have no choice but to make this a blog thread (as I threatened to do so about 1 month ago). But I have also promised myself to finish off SQL Audit before I start a new thread and that was 3 months ago.

So I think I am going to have to put some time into this blogging lark. It’s OK though, I seem to go on about database statistics so much that even my cat is staring to avoid me. I’ll just stop watching Star Trek and drinking wine in the evenings and switch to preparing technical blogs. And drinking wine of course.

(*) Up until 2007 I could do this. I get out more now. Despite the stats fixation

Which Hints to Use March 18, 2010

Posted by mwidlake in performance.
Tags: , ,

When tuning Oracle SQL code, which are the best Hints to use? Are there good and bad Hints? I believe so.

Firstly, I wish Oracle had called Hints something else. As many have commented, Hints are actually directives to the CBO. If the Hint is syntactically correct and can be applied, it will be applied. If you include a “USE_INDEX” Hint to tell the CBO to use an existing index then, no matter how crazy it would be to do so, the CBO will use the hint if doing so does not prevent the query from being logically possible.

That leads onto one of the reasons I do not actually like Hints {and these reasons lead to my definition of good and bad hints}. So long as the Hint remains possible, it will be followed. It has become a vital part of the logic of the code but Hints are rarely reviewed.
Another reason, often mentioned, is that if new functionality is introduced to the Oracle Optimizer, Hints may prevent it being used {eg you might have hinted the use of an index but now Oracle can convert two other, less selective B-tree indexes into bitmaps and merge them together to get a far more selective path – but the Hint forces the use of the original index}.
The above is possible but is rare compared to a far more common issue – You hinted a specific index be used, but if you now create another, more suitable index for that sql statement, the new index will not be used. The hinted one will still be used. Similarly If you drop the index that the Hint references, now the Hint is invalid and the CBO will chose a new access path. You are given no warning when you drop an index that hints reference {and it would be very tricky for Oracle to do this for you in a reliable way}.

A final problem with Hints is that the access path may need to change as data volumes and relationships change. When there are very few rows in a driving table, a nested loop access path may make sense. But if that driving table grows and has many more rows in it, then a hash join becomes more efficient. A Hint can fix this path.

I prefer Hints that give the CBO more information but allow it to still choose the path and to vary as data volumes change.

I like the DYNAMIC_SAMPLING Hint as it is generally just telling oracle to look harder {more intelligently} at the statistical information, at the cost of spending a little longer on the parse . Most systems have OPTIMIZER_DYNAMIC_SAMPLING set at 1 or 2 so by default tables with no stats will have stats gathered. Hinting at level 3 and 4 instructs the CBO to verify estimate guesses for predicates it has made and check correlation between rows. It is probably the Hint I am most happy using.

In version 10.1 I encountered lots and lots of issues with the CBO trying to unnest and merge elements of the SQL into the main body of the query. And getting it wrong. Though it is potentially stopping the CBO from examining useful access paths, I do use NO_MERGE and NO_UNNEST quite often and I “like” them as it leaves the rest of the decisions up to the optimizer. You are basically saying “leave that subquery alone and satisfy it in isolation”. I still encounter lots of such issues on 10.2, but I also use UNNEST more often, to push a subquery into the body of the code.

I am more happy using a CARDINALITY hint than USE_NL or USE_HASH as the CARDINALITY hint is self documenting (it says exactly what correction {or lie} you are giving the optimiser). A USE_NL Hint is enforcing an access path and not really saying why.

If specific hints are going to be used, I like to be very specific. USE_HASH should state both tables that are to be hashed together {It might even be that modern versions of Oracle insist on the form USE_HASH (taba tab) and not just USE_HASH (taba), as I never use the less specific Hint}.
{ NB see comment by Jonathan Lewis as to why I am utterly wrong on this one – USE_HASH basically says “Use a hash join on the tables listed” but does not force a hash between the two stated tables}.

I only Hint specific use of an index if I can’t fix the problem with better gathered stats. I don’t LIKE adding INDEX Hints, even though they are commonly used and easy to understand. For the reasons stated above, I do not like ordering the CBO to use one of the currently existing indexes.

I really do not like using the RULE hint. In Version 9 it was an uphill struggle to get people to not use it as it so often fixed the immediate problem and, of course, oracle used it so much for internal SQL (and still do in 10 and even in 11). How many current DBAs and Developers know what the Rule Based Optimizer rules are? {hands DOWN you over-40 lot}. Using the RULE hint is bordering on homeopathy for databases. It seems to work, you have no idea why and, in reality, it may well be doing nothing, as you are using a feature of Oracle that is incompatible wiht the RBO.

I am very, very uncomfortable about some hints. The bypass_ujvc hint is one of them. It basically tells Oracle it can do a MERGE INTO statement without having the unique constraint in place to support the where clause on the target table that allows it to work reliably. You are telling the optimizer “just trust me”. IE you can lie horribly to the CBO.

All in all, I try and get the stats right rather than hint. I’ll spend ten times as long trying to understand and fix (if I can) why the estimated costs and cardinalites in an Explain Plan are wrong than slapping in an INDEX Hint. I will use Hints if I can’t fix the plan via the stats, but I try and use the more generic Hints. I know from experience that fixing the stats {or at least understanding why I can’t} fixes more code than adding one hint to one SQL statement.

A good rule of thumb is, if the Cardinality is accurate for a step, the plan will be acceptable. {This is a Rule of thumb, not a cast-iron truth!}.

DBMS SIG March 10, 2010

Posted by mwidlake in internals, Meeting notes, performance.
Tags: , ,

I went to the DBMS SIG today {DBMS Special Interest Group meeting of the UK Oracle User Group}. Don’t worry, I am not going to run through the set of presentations and make comments on them – although I like reading such entries by others on their blogs, I generally like them due to the style of writing as opposed to getting information out of them. But there is the odd tidbit that can be worth disseminating to as wide an audience as possible and I like to do my bit.

Having said I won’t go through all the talks… :-). No, I just want to comment that all the talks had merit at this meeting and, quite rightly, the meeting was full. This is nice to see as last year SIG attendance fell across the board, due to the “economic climate”. Very daft, in my opinion, as I see SIGs as free training plus networking opportunities (sorry to use the “networking” word” plus different viewpoints, all of which are invaluable at any time and especially valuable when things are hard.

Go to SIGs (or whatever is available in your country) as it is always worth the day invested. Where else can you see half a dozen experts give opinions and also corner them and ask them more stuff as well?

Anyway, the tidbits. First, Jonathan Lewis demonstrated how he goes about solving issues with complex SQL statements. It was terribly feindish and extremly clever and needs incredible in-depth knowledge of the oracle RDBMS… He draws a pseudo Entity Relationship Diagram of the tables involved, adds in some figures on what filtering will achieve and what ratio of records will be involved in table joins and asks the question “what will probably happen” a couple of dozen times. Yes, I lied, it does not need vast knowledge. It needs a clear, simple approach to solving the problem. And an ERD. I love ERDs and rue their general demise. I use exactly the same method myself to investigate complex SQL performance issues {I think I could be accused of trying to ride on shirt-tails here, but honestly, the step I take if an Explain Plan does not help me is to ERD the statement and look at the indexes and ratios between tables to see how I , as a human, would solve the query. Chatting to a few other old lags, it is a relatively universal approach by those of us who have used ERDs}. If you are a member of the UKOUG I strongly recommend downloading the slides to Jonathan’s talk. If you are not a member, maybe Jonathan will present it again at another venue, or you could get him to come along and do a course on tuning. {Jonathan, if you get a gig as a result if this, I want a pint of Marston’s Pedigree, OK?}

{And thanks to Sean malloy for commenting to provide a link to a published version of Jonathan’s method – Jonathan did mention this, highlighting the fact that it is his first real foray into SQL*Server. However, the method is database agnostic. This is the article}

Second tidbit. Adrian Dodman and Owen Ireland (who both look remarkably like Hollywood hearthrobs in their pictures, but different as their in-the-flesh selves, though very decent chaps they are too.) did an excellent talk on VLDB physical standbys, a topic that has particular resonance for myself. They mentioned parallel_execution_message_size. This defaults to 2k, on 10g at least. It is a rubbish setting. No, let me not beat about the bush, it is an utterly rubbish setting. If you use parallel query, parallel recovery or parallel anything-at-all, check out this parameter and, due dilligence allowing, increase it. Try 8k as opposed to 2k and even 16k. The manual on it says the default of 2k/4k is fine. It ain’t. Increasing the value just takes some memory out of the shared pool and, these days, if you can’t afford a few extra KB out of your shared pool, you need to replace your server with something costing about twice as much as a top-end desktop PC. { Why am I so vigorous in my opinion on this parameter? Well, I had a situation a few months back of trying to migrate a database to a new geographic location for a client in Germany. We did a backup/recovery type operation to do this. Applying the redo logs was proving to be a performance issue so Oracle Corp suggested parallel redo log application. It ran a LOT slower than single thread, about 300% slower. However, increasing the parallel_execution_message_size from 2k to 8k made the parallel log application about 400% faster than single thread. ie a dozen times faster. I know from presentations by Christian Antognini and discussions with others that it is a key parameter to getting parallel query to perform well too.}

Last tidbit. Don’t drop the OUTLN user. Yes, I know, why would you? Just don’t, OK? Especially on Oracle 11. If you do, for whatever reason, DO NOT SHUT DOWN THE DATABASE. Call Oracle Support and pray. Thanks go to Peter Mahaffey for that one. Yes he did. It all went terribly wrong for him.

Fun with Filters March 9, 2010

Posted by mwidlake in performance.
Tags: , ,

{Please note – this post has been used to test changes to layout, thus the duplicated sections}

This post is about the importance of filter statements in execution plans and how they can indicate problems that are not always apparent.

I had a problem recently with some code that had gone rogue – it had been running in a few seconds, being executed every 10 minutes or so. Now it ran until either someone killed it off or it got “snapshot too old” errors. I pretty much knew it was prompted by stats being gathered on the tables in question as we had just gathered stats on the tables in that schema.

The code was something like this (it is not too important what exactly the code was). Oh, and this is on, enterprise edition with partitioning.

select  accounted, max(recorded_date),count(*)
where accountid||DATA_SRC_COUNTRY_ID in
   and 1= 1 – this is pointless, it replaces a check of “run”=”run”
   group by accountid,DATA_SRC_COUNTRY_ID

Those tables are quite large, several million rows in each. The code is basically scanning all of the rows in the table as there are no indexes to support the query and it is written in a way that, well, I would not write it.

Digging around in some monitoring tools I use I confirmed that the code has swapped plan

{original layout, having hacked it to fit}

inst SQL_ID        Plan Hash   First Load   execs   
---- ------------- ----------- ------------ ------ 
  TOT_ROWS       TOT_BG      TOT_DR        TOT_CPU    CREATED_DT
---------- ------------ ----------- --------------  ------------
   3 1t7ma4xn3rw46  3183284037 100203 09:10     125
     7,854     6,457,885   1208,149    816,476,585  091020 10:27
   3 1t7ma4xn3rw46  3127554972 100206 09:13       2
         0  1260,936,642        980  94599,678,960  100205 09:19

{newly possible layout, with wider screen}

inst SQL_ID        Plan Hash   First Load   execs    TOT_ROWS       TOT_BG      TOT_DR        TOT_CPU    CREATED_DT

---- ------------- ----------- ------------ ------ ---------- ------------ ----------- --------------  ------------
   3 1t7ma4xn3rw46  3183284037 100203 09:10    125      7,854    6,457,885    1208,149    816,476,585  091020 10:27
   3 1t7ma4xn3rw46  3127554972 100206 09:13      2          0 1260,936,642         980  94599,678,960  100205 09:19

Version 1 comes back in 5 or 6 seconds. Version 2 does not effectively come back, which is why it records 0 rows. You can see that the Disk Gets are pretty low in version 2 (allowing that it was never left to finish) but Buffer Gets and CPU are both massively up. So much so, it exceeds where I format the numbers with comas (as, if they are THAT big, they need fixing anyway).

I looked at the plans and, though they were different, nothing jumped out at me. So I trimmed down the code and built it up section by section until I saw something significant change. This is my general mode of tuning code if nothing initially occurs to me.
As usual, I started with any scalar or inline code, in this case that SELECT…MINUS…SELECT in the where clause.


This came back in 3.145 seconds, with all of 62 rows. That is fine.

Now I added back the use of the MINUS code as a WHERE clause to the outer query.

select accountid
where accountid||DATA_SRC_COUNTRY_ID in

This came back in 5.22 seconds with 4468 records. Again, I can live with that, we do not need sub-second response, we need sub-minute response.

So I now added back in the group by accountid…

select accountid,count(*)
where accountid||DATA_SRC_COUNTRY_ID in
group by accountid

This does not come back before I have fetched and consumed a cup of tea. That is a significant change.

I was not expecting this, why would sorting under 5 thousand rows stress a database that can scan multi-million row tables in seconds? There is all that CPU being burned up and huge numbers of buffer gets, but not much in the way of disc IO, so it is probably doing something that can be done in memory a very, very large number of times.

What is the difference in plan? {Oh HELL, they look rubbish – sorry, click on the images to blow them up, I’d suggest right-click and open in a new tab, and, YES, I am working on making my blog a little more “wide and friendly”.}. Good plan first, bad plan second

{Original compressed – width=”459″ height=”122″}

{Largest size I can use – width=”800″ height=”212″}

{Natural “overflow” size – width=”866″ height=”230″}

{ was width=”460″ height=”132″ now 800*242. Orig is 839* 242}

The view step has disappeared, which was an expensive step so should help not hinder.
The “SORT GROUP BY” has appeared, which it would do as we introduced the “GROUP BY” clause.
The hash join has been replaced with a filter. In a more complex plan you might miss this replacement, you would maybe notice the hash join disappearing but a filter, well, it is checking some simple “WHERE” clause isn’t it?

Well, the hash join was joining the results coming from the two steps feeding into it, the PARTITION RANGE FULL and the VIEW (which is the in-memory construct of the SELECT..MINUS…SELECT statement).

Now the Filter is filtering the results from the PARTITION RANGE ALL with the results from the MINUS. At this point I’d like to highlight that the predicted cardinality and bytes coming back for the steps within the union have reduced by a factor of 100 from the good and bad plans. And I’ll also admit I hid some key detail in the screen shot. I am not showing the access and filter predicates.

{image is 989*246, I am trying 800*246. Original 460*114}

The above is a screen shot showing that in the new code there are no filter predicates but an access predicate, for the access to the view (ie the select…minus…select). For easier reading, the full access predicate is below, rather than in the screen shot:


However, for the slow code, there are no access predicated but are filter predicates. Again, the screen shot shows that there are predicates and I show the full text below. (Screen shots are from PL/SQL developer, btw).

{image 953*238, trying 800*238, original 460*114}




Basically, the filter is to run the union query for every row from the driving query, passing in the relevant filter clause as an extra predicate to each of the individual queries of the SELECT…MINUS…SELECT

Clever, but not of much help to us as the performance is awful.

I showed this to a colleague and their immediate response was “why does that not show up as a nested loop? How am I supposed to spot that the “filter” is doing so much?” There were a couple of robust Saxon words mixed in with that statement.

So, Beware Filters replacing joins and get in the habit of checking out the filter and access predicates

If you use an old-style template for showing plans {like I do, a guilty sin of mine}, or a GUI where you have not selected that the filter and access predicates be shown, you may well not get them displayed. If you use autotrace in SQL*Plus, you will though:

db3_mw> select accountid,count(*)
  2  from W_LCG_OPENING_
  3  where accountid||DATA_SRC_COUNTRY_ID in
  4       (
  6         minus
  8        )
  9  group by accountid
 10  /

Execution Plan
Plan hash value: 397856216

| Id  | Operation                | Name             | Rows  | Bytes | Cos
t (%CPU)| Time     | Pstart| Pstop |
|   0 | SELECT STATEMENT         |                  |   911 |  7288 |
15M  (7)| 17:25:41 |       |       |
|   1 |  SORT GROUP BY           |                  |   911 |  7288 |
15M  (7)| 17:25:41 |       |       |
|*  2 |   FILTER                 |                  |       |       |
        |          |       |       |
|   3 |    PARTITION RANGE ALL   |                  |  3153K|    24M|  83
11   (4)| 00:00:34 |     1 |   840 |
|   4 |     TABLE ACCESS FULL    | W_LCG_OPENING_   |  3153K|    24M|  83
11   (4)| 00:00:34 |     1 |   840 |
|   5 |    MINUS                 |                  |       |       |
        |          |       |       |
|   6 |     SORT UNIQUE NOSORT   |                  | 31535 |   246K|  86
15   (7)| 00:00:35 |       |       |
|   7 |      PARTITION RANGE ALL |                  | 31535 |   246K|  86
11   (7)| 00:00:35 |     1 |   840 |
|*  8 |       TABLE ACCESS FULL  | W_LCG_OPENING_   | 31535 |   246K|  86
11   (7)| 00:00:35 |     1 |   840 |
|   9 |     SORT UNIQUE NOSORT   |                  |   132 |  1056 |
17  (18)| 00:00:01 |       |       |
|* 10 |      INDEX FAST FULL SCAN| W_LCG_CLIENT__PK |   132 |  1056 |
16  (13)| 00:00:01 |       |       |

Predicate Information (identified by operation id):

RC_")=TO_CHAR(:B3)||TO_CHAR(:B4) )))
   8 - filter(TO_CHAR("ACCOUNTID")||TO_CHAR("DATA_SRC_")=TO_CHAR(:B1)|
  10 - filter(TO_CHAR("ACCOUNTID")||TO_CHAR("DATA_SRC_")=TO_CHAR(:B1)|

{re-worked layout}

db3_mw> select accountid,count(*)
  where accountid||DATA_SRC_COUNTRY_ID in
  group by accountid

Execution Plan
Plan hash value: 397856216

| Id  | Operation                | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
|   0 | SELECT STATEMENT         |                  |   911 |  7288 |15M  (7)    |17:25:41  |       |       |
|   1 |  SORT GROUP BY           |                  |   911 |  7288 |15M  (7)    |17:25:41  |       |       |
|*  2 |   FILTER                 |                  |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ALL   |                  |  3153K|    24M|  8311   (4)|00:00:34  |     1 |   840 |
|   4 |     TABLE ACCESS FULL    | W_LCG_OPENING_   |  3153K|    24M|  8311   (4)| 00:00:34 |     1 |   840 |
|   5 |    MINUS                 |                  |       |       |            |          |       |       |
|   6 |     SORT UNIQUE NOSORT   |                  | 31535 |   246K|  8615   (7)| 00:00:35 |       |       |
|   7 |      PARTITION RANGE ALL |                  | 31535 |   246K|  8611   (7)| 00:00:35 |     1 |   840 |
|*  8 |       TABLE ACCESS FULL  | W_LCG_OPENING_   | 31535 |   246K|  8611   (7)| 00:00:35 |     1 |   840 |
|   9 |     SORT UNIQUE NOSORT   |                  |   132 |  1056 |17  (18)    | 00:00:01 |       |       |
|* 10 |      INDEX FAST FULL SCAN| W_LCG_CLIENT__PK |   132 |  1056 |16  (13)    | 00:00:01 |       |       |

Predicate Information (identified by operation id):

              FROM "W_LCG_CLIENT_"  "W_LCG_CLIENT_" 
              WHERE TO_CHAR ("ACCOUNTID")||TO_CHAR("DATA_SRC_")=TO_CHAR(:B3)||TO_CHAR(:B4)              )))
  10 - filter(TO_CHAR("ACCOUNTID")||TO_CHAR("DATA_SRC_")=TO_CHAR(:B1)||TO_CHAR(:B2)) 

What did I do to fix the problem? Well, even though the code originally went bad due to stats being gathered, I could not force it back to a nice plan after an hour or two playing with gathering new stats so, in this case, I used an UNNEST hint.

select accountid,count(*)
where accountid||DATA_SRC_COUNTRY_ID in
       SELECT /*+ unnest */ accountid||DATA_SRC_COUNTRY_ID FROM W_LCG_OPENING_
group by accounted

And it forced the plan back to the version using a HASH join.

The hinted plan

I’m a little unhappy about that hint, but the code needed fixing and the hint did what I wanted it to {I think it was poor of me to only hint one of the two minus statements and I do not like resorting to hints just because I can’t work out what is “wrong” with the stats – If I work out what is wrong and still need the hint, fair enough}. However, I had to be pragmatic and get the code fixed and working in Live, so it has gone in with a hint