jump to navigation

Partitions are Still Not for Performance – Sometimes November 5, 2009

Posted by mwidlake in performance.
Tags: ,
6 comments

This is a follow up to my first posting on why Partitions are not for Performance where I discuss how lookups against a partitioned table with a locally partitioned index and no partition pruning can lead to performance problems. Basically, the CBO ends up scanning all local indexes to find the record(s) you want, which with a hundred partitions will probably result in several hundred buffer gets instead of 4 or 5.

The first posting dealt with the simple situation where a single record is being selected but this lack of partition pruning also crops up with range scans of course.

{warning, this is a long post, but if you use partitions you need to understand this}.
{Caveat, it has been pointed out the last comment, and by association this one, appears partition-negative. I am not partition-negative, partitions are great but for reasons generally other than performance and, to improve performance, you need to understand how partitions fit in with the CBO}

Imagine you have an unpartitioned ORDERS table with 10 million rows, the primary key is the traditional ascending numeric, derived from a sequence.
About 5,000 orders a day are received, the table goes back around 10 years (as in the past, less orders per day were received).
There is also an ORDER_DATETIME column, a DATE type column holding the data and time the order was placed. This column has a traditional index on it, which has a B-level of 3.

You want to query for all ORDERS placed on a date, say 1st November 2009, so you issue a query with the following WHERE clause:
WHERE ORDER_DATETIME >= TO_DATE(’01-NOV-2009′,’DD-MON-YYYY’)
AND ORDER_DATETIME < TO_DATE(’01-NOV-2009′,’DD-MON-YYYY’)

That will select around 5,000 out of 10 million rows and so a nested loop lookup on the index is the most suitable {just accept this for now, OK?}

The CBO will work down the levels on the index on ORDER_DATETIME to the first record of interest, taking 4 I/Os. It will then scan the index for all entries in that range, so maybe 10 leaf blocks, and then read the database blocks for each one – 5000 I/Os against the table, but as the data will be clustered well, most reads will be to the same say 100 blocks, so they will be cached in the block buffer cache. {Under Oracle – to 11.1 anyway – all IO is treated as physical, so the default optimizer will calculate the cost based on this, but in reality most IO will be logical IO, not physical IO}.

So the real cost will be 4 IOs to find the start of the index to scan, 10 I/Os to scan the index and 5000 IOs to get the table data – 5014 logical I/O’s, with between 0 and (4+10+100 =114) physical IOs to get the data into the cache, depending on what was cached already.

OK, remember those figures…

Now we partition the table by the ORDER_DATETIME column into 100 partitions and the index on ORDER_DATETIME is locally partitioned of course. Your same WHERE clause on ORDER_DATETIME will now resolve to a single index partition.

If you are lucky the local index partition will have a B-level of 2, 1 less than the standard non-partitioned index. So, with the same WHERE clause, the CBO will work down the local index partition to the first record of interest, for 3 I/Os. Then it will scan the index for the same 10 leaf blocks and find 5,000 records to check in the table partition, with the same 5,000 I/Os. You have saved….Yep, 1 logical IO. Physical IOs are between 0 and (3+10+100 =113). Let’s be kind and assume you saved a physical IO.

In this case, partitioning saved 1 out of 5104 logical and 1 out of 114 physical IOs.

Partitioning, in this case, is performance agnostic. It matters very, very little.

In reality, if your application is working on the same day of data over and over again, using the one partition, then that set of data will be cached and a nested-loop access to it will find the data in memory -but that would be true with the standard table and index too :-) So still no gain….

I’ve assume a nested-loop access path is still used by the CBO when accessing your partition. This is where things can be very different. {and it is also where I should swap from theory and start creating some test data to prove this, but I’ll hang myself out first and then look like an idiot next week when I am back with my test system :-)}.

Let’s estimate that the unpartitioned table is 100,000 blocks (100 rows per block).
With 5,104 I/Os against a 10 million row table/index , a nested loop lookup is likely to be chosen as the CBO estimated the number of I/Os to be less than scanning the whole table (100,000 blocks divided by the actual multi block read count {which is not the db_file_multi_block_read_count, but I’m skipping over that detail}, let’s say 12, for say 8,500 I/Os ).

With the table split into 100 partitions, there is 1,000 blocks in each partition. The whole table can be scanned with 1,000/real-multi-block-read-count {12}. So 80 or so I/Os

So in reality, your partitioning MAY be beneficial as the CBO decided to forget the index and nested lookups and simply scans the whole partition into memory for maybe 80-100 I/Os using multi block scans and processes the data in memory, burning some CPU as well, to find the records you want.

So, instead of 5,103 IOs the CBO decides to do 80-100 I/Os and some extra memory/cpu work, which is cheap given the power of modern CPUs.

So, partitions may help performance as the CBO swaps from nested loop lookups with an index to simply full scanning the partition.

You may start to wonder if that index on ORDER_DATETIME helps much… If it is to support range scans of a significant number of records, maybe not, but if it helps single or small-number record lookups, the index still helps.

I don’t know about you, but my brain is full for the night. I’ll leave this for another day….

Follow

Get every new post delivered to your Inbox.

Join 160 other followers