jump to navigation

Assisting Partition Exclusion – Partitions for Performance November 23, 2009

Posted by mwidlake in performance, VLDB.
Tags: , ,
trackback

<Previous PostNext Post>
Partitions are often used to help increase the performance of SQL select activity via the concept of partition exclusion – the CBO will be able to identify which of the partitions could hold the data of interest and ignore the others. This is their main benefit from a SQL Select performance perspective.

The problem is, the partitioning key is not always included in your SQL statements.

I discussed this potential negative impact of partitioning on SQL select performance in this post and this one also. I did not give any proof then, so the first half of this post provides that.

As an example, I have created a table with 45 partitions, partitioned on the column ID, an ascending numeric primary key (traditionally sourced from a sequence). There is an index on the ID column, locally partitioned. The table also has a column CRE_DATETIME, which is the date the record was created. This column is also indexed with a locally partitioned index.

NAME                           VALUE
------------------------------ -------------
compatible                     10.2.0.3.0
cpu_count                      8
db_block_size                  8192
db_file_multiblock_read_count  16
optimizer_mode                 ALL_ROWS
sga_target                     0
sort_area_size                 65536

create table test_p3
(id           number(10) not null
,cre_datetime date not null
,status        number(1) not null
,num_1         number(4) not null -- random 20
,num_2         number(4) -- random 500
,num_3         number(5) -- cycle smoothly
,num_4         number(5) -- Random 10000
,vc_1          varchar2(10)
,vc_2          varchar2(10)
,vc_pad        varchar2(2000))
tablespace users
partition by range (id)
(partition id_01k values less than (1000)
tablespace users
,partition id_02k values less than (2000)
tablespace users
,partition id_03k values less than (3000)
tablespace users
...
,partition id_45k values less than (45000)
tablespace users
--
,partition id_max values less than (maxvalue)
tablespace users
)
/
select table_name,partitioning_type p_type
,partition_count pt_count
from dba_part_tables
where owner=user and table_name = 'TEST_P3'
TABLE_NAME                     P_TYPE    PT_COUNT
------------------------------ ------- ----------
TEST_P3                        RANGE           46

>@ind_cols
IND_NAME           TAB_NAME           PSN       COL_NAME
------------------ ------------------ --------- ------------
TP3_PK             TEST_P3            1         ID
TP_CRE_DT          TEST_P3            1         CRE_DATETIME

{This is a very typical senario, as is the opposite situation where the table is partitioned on date but has a numeric column holding eg order number or customer ID.}

The below shows a select of a single record by ID.

select id,cre_datetime,num_1,vc_1
from test_p3 where id=37123
/
        ID CRE_DATETIME           NUM_1 VC_1
---------- ----------------- ---------- ----------
     37123 12-OCT-2009 02:52         19 RMZAN
1 row selected.

Execution Plan
----------------------------------------------------------
----------------------------------------------
| Id  | Operation                          | Name    | Rows  | 
Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------
----------------------------------------------
|   0 | SELECT STATEMENT                   |         |     1 |
  22 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE            |         |     1 |
  22 |     2   (0)| 00:00:01 |    38 |    38 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| TEST_P3 |     1 |
  22 |     2   (0)| 00:00:01 |    38 |    38 |
|*  3 |    INDEX UNIQUE SCAN               | TP3_PK  |     1 |
     |     1   (0)| 00:00:01 |    38 |    38 |
----------------------------------------------------------------
----------------------------------------------
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads

Note the PARTITION RANGE SINGLE and pstart/pstop are both 38. Consistent gets are only 3. The query is very efficient as the CBO identified that records matching the WHERE clause could only exist in the one partition. The relevant local index is examined for 2 consisted gets (the BLEVEL of the local index partitions is 1) and then one consistent get against the table.

Now I’ll scan for records for a range of IDs:

select id,cre_datetime,num_1,vc_1
from test_p3 where id between 30980 and 31019
/
        ID CRE_DATETIME           NUM_1 VC_1
---------- ----------------- ---------- ----------
     30980 25-SEP-2009 01:20         15 JQLYA
     30981 25-SEP-2009 01:24          5 WYWCR
     30982 25-SEP-2009 01:28         10 QZSHD
     30983 25-SEP-2009 01:32         18 KQBSU
     30984 25-SEP-2009 01:36          2 HRPMA
....
     31018 25-SEP-2009 03:52          3 YOPJO
     31019 25-SEP-2009 03:56         10 GNGKG

40 rows selected.

Execution Plan
----------------------------------------------------------
----------------------------------------------
| Id  | Operation                          | Name    | Rows  | 
Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------
----------------------------------------------
|   0 | SELECT STATEMENT                   |         |    40 |
 880 |     9   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR          |         |    40 |
 880 |     9   (0)| 00:00:01 |    31 |    32 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| TEST_P3 |    40 |
 880 |     9   (0)| 00:00:01 |    31 |    32 |
|*  3 |    INDEX RANGE SCAN                | TP3_PK  |     2 |
     |     2   (0)| 00:00:01 |    31 |    32 |
----------------------------------------------------------------

Statistics
---------------------------------------------
          0  recursive calls
          0  db block gets
         52  consistent gets
          0  physical reads

Again, you can see partition exclusion, this time as a PARTITION RANGE ITERATOR and Pstart/Pstop of 31 and 32. Consistent gets were higher, but I did fetch several records.

Now a query is issued for a record with a specific value for CRE_DATETIME.

select id,cre_datetime,num_1,vc_1
from test_p3 where cre_datetime = 
   to_date('06-OCT-2009 12:20','DD-MON-YYYY HH24:MI')
/
        ID CRE_DATETIME           NUM_1 VC_1
---------- ----------------- ---------- ----------
     35105 06-OCT-2009 12:20         11 JPQHO
1 row selected.

Execution Plan
----------------------------------------------------------
------------------------------------------------
| Id  | Operation                          | Name      | Rows  |
 Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------
------------------------------------------------
|   0 | SELECT STATEMENT                   |           |     1 |
    22 |    48   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ALL               |           |     1 |
    22 |    48   (0)| 00:00:01 |     1 |    46 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| TEST_P3   |     1 |
    22 |    48   (0)| 00:00:01 |     1 |    46 |
|*  3 |    INDEX RANGE SCAN                | TP_CRE_DT |     1 |
       |    47   (0)| 00:00:01 |     1 |    46 |
----------------------------------------------------------------

Statistics
---------------------------------------
          1  recursive calls
          0  db block gets
         90  consistent gets
          0  physical reads

Notice that partition exclusion does not occur, you see PARTITION RANGE ALL, Pstart/Pstop of 1 – 46. The CBO cannot identify which partitions may or may not hold records with that CRE_DATETIME. Consistent gets are now a lot higher, 90 (as compared to 3 for the situation where partition exclusion can occur}, as oracle has to prob each and every local index partition to identify if any records for the given value exist.

Of course, you probably would not look for a record with a specific datetime, but in the opposite senario of having partitioned on datetime, it would not be at all unusual to look for a record with a given ID and partition exclusion not being possible.

As I mentioned earlier, this need to scan all of the local indexes because no partition exclusion is possible is something I discussed a couple of weeks back.

Now I will select for the range of dates that my range scan in ID identified:

select id,cre_datetime,num_1,vc_1
from test_p3 
where cre_datetime between to_date('25-SEP-2009 01:20','DD-MON-YYYY HH24:MI')
                   and     to_date('25-SEP-2009 03:56','DD-MON-YYYY HH24:MI')

        ID CRE_DATETIME           NUM_1 VC_1
---------- ----------------- ---------- ----------
     30980 25-SEP-2009 01:20          3 BGAFO
     30981 25-SEP-2009 01:24         15 PXGJD
     30982 25-SEP-2009 01:28         12 PGQHJ
     30983 25-SEP-2009 01:32         17 TIBZG
     30984 25-SEP-2009 01:36         11 EQQQV
...
     31018 25-SEP-2009 03:52         18 FSNVI
     31019 25-SEP-2009 03:56         16 LJWNO
40 rows selected.

Execution Plan
----------------------------------------------------------
------------------------------------------------
| Id  | Operation                          | Name      | Rows  |
 Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------
------------------------------------------------
|   0 | SELECT STATEMENT                   |           |    42 |
   924 |    54   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ALL               |           |    42 |
   924 |    54   (0)| 00:00:01 |     1 |    46 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| TEST_P3   |    42 |
   924 |    54   (0)| 00:00:01 |     1 |    46 |
|*  3 |    INDEX RANGE SCAN                | TP_CRE_DT |    42 |
       |    47   (0)| 00:00:01 |     1 |    46 |
----------------------------------------------------------------
------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        136  consistent gets
          0  physical reads

That actually selects back the same records, but now a PARTITION RANGE ALL is invoked, Pstart/Pstop os 1-46 and the number of consistent gets goes up to 136.

I discussed the impact of partitions on range scans in this second post about performance issues with partitions.

Thankfully, you can often alleviate this problem of considering all partitions.

The below shows a scan for records for a range of dates.

select count(*)
from test_p3
where cre_datetime between to_date('18-OCT-2009','DD-MON-YYYY')
                   and     to_date('20-OCT-2009','DD-MON-YYYY')

  COUNT(*)
----------
       721
1 row selected.

Execution Plan
----------------------------------------------------------
----------------------------------
| Id  | Operation            | Name      | Rows  | Bytes | Cost
(%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------
----------------------------------
|   0 | SELECT STATEMENT     |           |     1 |     8 |    49
   (0)| 00:00:01 |       |       |
|   1 |  SORT AGGREGATE      |           |     1 |     8 |
      |          |       |       |
|   2 |   PARTITION RANGE ALL|           |   731 |  5848 |    49
   (0)| 00:00:01 |     1 |    46 |
|*  3 |    INDEX RANGE SCAN  | TP_CRE_DT |   731 |  5848 |    49
   (0)| 00:00:01 |     1 |    46 |
----------------------------------------------------------------
----------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         90  consistent gets
          0  physical reads

As you can see, no partitiion exclusion is possible, every partition is considered (PARTITION RANGE ALL, Pstart/Pstop of 1-46). 90 consistent gets are required.

The trick is to introduce a second WHERE clause, limiting the query by the partitioning key as well.
As a human, you can recognise that the ID and the CRE_DATETIME columns are going to increase pretty much in synchronisation. Records with a CRE_DATETIME a month ago are going to be found in partitions with a lower ID range than those from a week ago. Oracle does not know this business logic, so you have to tell it.

Let us say you never have more than 1,000 records created a day. so if you want “today’s” data, you need only consider IDs that are between the max(ID) and the max(ID) less 1,000. To be safe, you would increase this range substantially. The idea is to exclude most partitions without risking missing data:

select count(*)
from test_p3
where cre_datetime between to_date('18-OCT-2009','DD-MON-YYYY')
                   and     to_date('20-OCT-2009','DD-MON-YYYY')
and  id between 39000 and 41000

  COUNT(*)
----------
       721
1 row selected.

Execution Plan
----------------------------------------------------------
------------------------------------------------
| Id  | Operation                   | Name             | Rows  |
 Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------
------------------------------------------------
|   0 | SELECT STATEMENT            |                  |     1 |
    13 |    14   (8)| 00:00:01 |       |       |
|   1 |  SORT AGGREGATE             |                  |     1 |
    13 |            |          |       |       |
|*  2 |   VIEW                      | index$_join$_001 |    34 |
   442 |    14   (8)| 00:00:01 |       |       |
|*  3 |    HASH JOIN                |                  |       |
       |            |          |       |       |
|   4 |     PARTITION RANGE ITERATOR|                  |    34 |
   442 |     7  (15)| 00:00:01 |    40 |    42 |
|*  5 |      INDEX RANGE SCAN       | TP_CRE_DT        |    34 |
   442 |     7  (15)| 00:00:01 |    40 |    42 |
|   6 |     PARTITION RANGE ITERATOR|                  |    34 |
   442 |     9  (23)| 00:00:01 |    40 |    42 |
|*  7 |      INDEX RANGE SCAN       | TP3_PK           |    34 |
   442 |     9  (23)| 00:00:01 |    40 |    42 |
----------------------------------------------------------------
------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         18  consistent gets
          0  physical reads

In the above I have managed to exclude most of the partitions and still get the correct data, by adding the clause and id between 39000 and 41000 . There is something a little unusal {if you have not seen it before} with the plan being scans of two indexes and then hashed together and viewed, but both are PARTITION RANGE ITERATOR scans and Pstart/Pstop values are 40-42. Consistent gets are down to 18.

You do have to take great care with this method that the extra WHERE clause you add on the partitioning key will never exclude records you want to find. But so long as you do, it is a very powerful technique.

I’ll go into that a little more in the next posting {which hopefully will not be as delayed as this one!}

Comments»

1. Bernard Polarski - November 24, 2009

Since column ‘id’ and ‘cre_dateime’ are related using an absolutely true increasing sequential transitive closure, we could rewrite your final query as :

with v1 as
(select max(id) id
from tpart_id
where
cre_datetime EQUAL_OR_INFERIOR to_date(’18-OCT-2009 12:20′,’DD-MON-YYYY HH24:MI’)
),
v2 as
(select min(id) id
from tpart_id
where
cre_datetime EQUAL_OR_SUPERIOR to_date(’20-OCT-2009 12:20′,’DD-MON-YYYY HH24:MI’)
)
select count(*) from tpart_id, v1, v2
where tpart_id.id EQUAL_OR_SUPERIOR V1.id and tpart_id.id EQUAL_OR_INFERIOR v2.id
/

mwidlake - November 24, 2009

OK Bernard, I think I’ve tidied up the comment, does it look OK now?

You comment sums up a nice chunk of what is/was going to be my next post, but more concisely than I usually manage 🙂

2. Bernard Polarski - November 24, 2009

Except that I fall in the trap you described earlier :

The 2 first view, v1 and v2, whose only objective is to translate a date into an ID fall on the trap of ‘I will scan each local index of _each partitions’:

1 select max(cre_datetime) id
2 from tpart_id
3 where
4* cre_datetime /

ID
——————-
2009-10-30 12:20:00

Execution Plan
———————————————————-
Plan hash value: 1347639117

—————————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
—————————————————————————————————-
| 0 | SELECT STATEMENT | | 1 | 8 | 32 (4)| 00:00:01 | | |
| 1 | SORT AGGREGATE | | 1 | 8 | | | | |
| 2 | PARTITION RANGE ALL | | 42501 | 332K| 32 (4)| 00:00:01 | 1 | 46 |
|* 3 | INDEX FAST FULL SCAN| TPAR_DATA | 42501 | 332K| 32 (4)| 00:00:01 | 1 | 46 |
—————————————————————————————————-
Statistics
———————————————————-
1 recursive calls
0 db block gets
318 consistent gets
25 physical reads

thanks to your previous post, I quickly understood what happening.But I have no idea how to solve but create a global index on cre_datetime which is horrible.

mwidlake - November 24, 2009

Busy with the day job, but I’ll try and answer that one in my next post. I too dislike Global indexes…

3. Bernard Polarski - November 24, 2009

Arf again eaten by wordpress html parser: trying to escape the INF:

select max(cre_datetime) id
from tpart_id
where
cre_datetime \<= to_date('30-OCT-2009 12:20','DD-MON-YYYY HH24:MI')
/

mwidlake - November 26, 2009

Bernard, enail me, I’ll try to sort out the comments and do the next post this weekend. Work life is a bit trying at the mo, so my blog is not getting the care and attention it deserves!

4. More on Assisting Partition Exclusion « Martin Widlake’s Yet Another Oracle Blog - December 6, 2009

[…] Tags: partitions, performance, VLDB trackback A couple of weeks ago I posted about how you could help the oracle CBO carry out partition exclusion and thus aid performance. In essence, if you have a table partitioned on ID (based on an ascending […]

5. Ensuring Correlated Partition Exclusion « Martin Widlake’s Yet Another Oracle Blog - December 7, 2009

[…] going to call it Correlated Partition Exclusion. In essence, you have partitioned the table on a key, in my case the primary key ID, which is an […]

6. Blogroll Report 20/11/2009-27/11/2009 « Coskan’s Approach to Oracle - December 18, 2009

[…] 9-Make most out of partitions Martin Widlake – Assisting Partition Exclusion – Partitions for Performance […]

7. Ensuring Correlated Partition Exclusion #2 « Martin Widlake’s Yet Another Oracle Blog - December 20, 2009

[…] few days ago {oh dear, it is now two weeks!} I showed how you could better control Correlated Partition Exclusion by creating a lookup table that showed, for each partition, the minimum and maximum ID (the […]


Leave a reply to mwidlake Cancel reply