jump to navigation

Min and Max do not Mix May 27, 2009

Posted by mwidlake in performance.
Tags: , ,
10 comments

If you want to select the min and max value for an indexed column in a table, it may not perform as well as you like, despite some nice tricks Oracle can do with min and max. I’ll show you what I mean and a little trick, which might be the fastest way to get the min and max values in Oracle.

You can use this scipt here to replicate the below. It creates a test table of 99,999 rows. It was written on oracle 10 and should work on 11 unchanged {post a comment if you try and find anything is different} and you just need to remove the “purge” part of the drop command to get it to work on 9 {I think}.

This is the basic environment and setup.

NAME                           VALUE
------------------------------ --------------------
compatible                     10.2.0.1.0
cpu_count                      1
db_block_size                  8192
db_file_multiblock_read_count  16
optimizer_mode                 ALL_ROWS
sga_target                     612368384
sort_area_size                 65536

test102>desc test_1
 Name                             Null?    Type
 ------------------------------------------------------
 ID                               NOT NULL NUMBER
 NUM_1                                     NUMBER
 ID_2                                      VARCHAR2(41)
 NUM_2                                     NUMBER
 NUM_3                                     NUMBER
 NUM_4                                     NUMBER
 VC_1                                      VARCHAR2(50)
 VC_2                                      VARCHAR2(50)
 VC_3                                      VARCHAR2(250)

select count(*) from test_1;

  COUNT(*)
----------
     99999
Elapsed: 00:00:00.01

Execution Plan
-----------------------------------------------------
| Id  | Operation             | Name | Rows  | Cost (%CPU)|
-----------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 | 49   (5)|
|   1 |  SORT AGGREGATE       |      |     1 |         |
|   2 |   INDEX FAST FULL SCAN| T_PK | 99999 | 49   (5)|
-----------------------------------------------------

Statistics
-----------------------------------------------------
          0  recursive calls
          0  db block gets
        215  consistent gets
          0  physical reads

You can see that to count the 99,999 rows the CBO expects to fast full scan index T_PK for a cost of 49 {which equates roughly to the number of expected I/O operations} and the execution results in 215 buffer gets, which is the number of times a block was read from memory.

Let’s select the MAX(ID):

select max(id) from test_1;

   MAX(ID)
----------
     99999

Execution Plan
-----------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |    4 |2(0)|
|   1 |  SORT AGGREGATE            |      |     1 |    4 |    |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T_PK | 99999 |  390K|2(0)|
-----------------------------------------------------

Statistics
-----------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets

The plan shows that Oracle will use an INDEX FULL SCAN (MIN/MAX) to get the result. I find it a little misleading that it is called a “full scan” and it shows as processing 99,999 rows when in fact what Oracle does is nip down the ‘edge’ of the index to get the result. It reads the root block and one leaf block to get the result {Oracle does not need to visit the table, the data for column ID is in the index}.
If you have any doubts, look at the consistent gets – 2.
It’s fast and efficient.
It works with MIN too:-

select min(id) from test_1;

   MIN(ID)
----------
         1

Execution Plan
--------------------------------------------------
| Id  | Operation                  | Name | Rows  |Bytes| Cost (%CPU)|
--------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |   4 |2 (0)|
|   1 |  SORT AGGREGATE            |      |     1 |   4 |     |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T_PK | 99999 |390K|2 (0)|
---------------------------------------------------

Statistics
----------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets

It does not work for the other group functions as they need to gather all the data to, for example, calculate the average value for ID:

select avg(id) from test_1;

AVG(ID)
———-
50000

Execution Plan
——————————————————————————
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
—————————————————–
| 0 | SELECT STATEMENT | | 1 | 4 | 49 (5)|
| 1 | SORT AGGREGATE | | 1 | 4 | |
| 2 | INDEX FAST FULL SCAN| T_PK | 99999 | 390K| 49 (5)|
—————————————————–

Statistics
—————————————————–
0 recursive calls
0 db block gets
215 consistent gets

Here Oracle returns to the index fast full scan and the same workload as count(*).

Now for the “gotcha” bit. What happens if you select MIN and MAX?

select max(id),min(id) from test_1;

   MAX(ID)    MIN(ID)
---------- ----------
     99999          1


Execution Plan
--------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes |Cost (%CPU)|
--------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |     4 |49   (5)|
|   1 |  SORT AGGREGATE       |      |     1 |     4 |        |
|   2 |   INDEX FAST FULL SCAN| T_PK | 99999 |   390K|49   (5)|
---------------------------------------------------


Statistics
----------------------------------------------------
          0  recursive calls
          0  db block gets
        215  consistent gets
          0  physical reads
          0  redo size
        472  bytes sent via SQL*Net to client
        381  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Damn! Oracle goes back to the index fast full scan and 215 consistent gets!

Way back when I was beta testing Oracle 10 I found this and reported it as a bug. Oracle Corp replied that it was not a bug, it was expected behaviour. It would scan the whole index even though, as a human, you could see a better way. The reason given, and I have some sympathy with this, is that spotting that only min and max supported by an index had been asked for and there was a more efficient way to do this would have to be added to the CBO.
Where we differed is they thought this would be a rare event, I thought it would be a more common event and worth their effort.

Never mind, you can do it yourself {I won’t line this up like the previous examples, they need too much editing to fit!}:

– now together
select max(id) from test_1
union
select min(id) from test_1;

MAX(ID)
———-
1
99999

Execution Plan
————————————————————–
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
————————————————————-
| 0 | SELECT STATEMENT | | 2 | 8 | | 601 (53)| 00:00:08 |
| 1 | SORT UNIQUE | | 2 | 8 | 4753K| 601 (53)| 00:00:08 |
| 2 | UNION-ALL | | | | | | |
| 3 | SORT AGGREGATE | | 1 | 4 | | 301 (5)| 00:00:04 |
| 4 | INDEX FULL SCAN (MIN/MAX)| T_PK | 99999 | 390K| | 2 (0)| 00:00:01 |
| 5 | SORT AGGREGATE | | 1 | 4 | | 301 (5)| 00:00:04 |
| 6 | INDEX FULL SCAN (MIN/MAX)| T_PK | 99999 | 390K| | 2 (0)| 00:00:01 |
————————————————————————-

Statistics
———————————————————-
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
451 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
2 rows processed

There are times when a human beats the computer. 4 consistent gets. I’ve played with this a little and on my test system the select min(id),max(id) version takes around 4ms. The union method takes 0-1ms.
I see a couple of oddities.

- Firstly, the plan has 3 sort operations, but the statistics show only 1 sort. I am not fully sure what is going on, it might be that the plan is a little duff, it might be that at execution time Oracle realises it really does have only 1 record to sort and so does not. {I’ll buy the person who tells me the cause of the remaining sort and how to get rid of it a pint at the next UKOUG – if you do so before I blog about it}

- Secondly, if you look at the plan, the CBO estimates a lot of effort for the sorts and thus the total cost of the statement is high relatively high, 601. I need to look into that.

I’ve used this trick of splitting “select min,max” into two statements unioned together several times over the years and I find it very useful. Below I go into a few more oddities, but I know I ramble on so feel free to stop now.

In previous postings I have looked at the fastest way to select the number of records in a table. Well, if your tables have a surrogate primary key consisting of an ascending number (generated from a sequence and a very common occurrence) and you know there are few or no skipped values, you can use the above trick to get the MIN(ID) and MAX(ID) and thus the difference is the number of rows. You can do it in one statement, as shown below:

-- diff between min and max in one statement 
select a.result-b.result id_number
from (select max(id)+1 result,1 eric from test_1) a
   ,(select min(id) result,1 eric from test_1) b
where a.eric=b.eric
/

ID_NUMBER
-----------------
            99999


Execution Plan
-------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------
|   0 | SELECT STATEMENT             |      |     1 |    32 |     4   (0)|
|   1 |  NESTED LOOPS                |      |     1 |    32 |     4   (0)|
|   2 |   VIEW                       |      |     1 |    16 |     2   (0)|
|   3 |    SORT AGGREGATE            |      |     1 |     4 |            |
|   4 |     INDEX FULL SCAN (MIN/MAX)| T_PK | 99999 |   390K|     2   (0)|
|*  5 |   VIEW                       |      |     1 |    16 |     2   (0)|
|   6 |    SORT AGGREGATE            |      |     1 |     4 |            |
|   7 |     INDEX FULL SCAN (MIN/MAX)| T_PK | 99999 |   390K|     2   (0)|
------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        422  bytes sent via SQL*Net to client
        381  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

With this version the CBO recognises that the statement cost is low (look, cost 4), the odd sorts are still there in the plan on a single row but the statement takes 4 consistent gets.

I used to think, wrongly, that Oracle would only use the INDEX FULL SCAN (MIN/MAX) for getting the absolute minimum and maximum values. I don’t know why I thought this but check out the next two examples:

select max(id) from test_1
where id7500;

MAX(ID)
———-
99999

Execution Plan
———————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
———————————————————-
| 0 | SELECT STATEMENT | | 1 | 4 | 2 (0)|
| 1 | SORT AGGREGATE | | 1 | 4 | |
| 2 | FIRST ROW | | 92500 | 361K| 2 (0)|
|* 3 | INDEX RANGE SCAN (MIN/MAX)| T_PK | 92500 | 361K| 2 (0)|
———————————————————-

Statistics
———————————————————-
0 recursive calls
0 db block gets
2 consistent gets
0 physical reads

As you can see, Oracle knows it can get the Max (and min) value using the INDEX RANGE SCAN MIN/MAX efficiently, even for the second case where I am asking for the max value ABOVE a stated point.
I half expected oracle to do a range scan from the value I gave it. But it has the sense to know that, logically, the maximum value below X can be found by looking for X and then it is the record before that (or where it would be if it does not exist). The maximum above X is the maximum. In restrospect, I was asking a daft question there :-)

I could play with this longer, but enough for now.

Follow

Get every new post delivered to your Inbox.

Join 161 other followers