jump to navigation

Partitions are Not for Performance October 29, 2009

Posted by mwidlake in Architecture, performance, VLDB.
Tags: , , ,
16 comments

There is a myth that Partitions in Oracle magically aid SQL Query performance, even a general assumption that the main role of partitioning is to aid such query performance. This is not so. There are certainly many cases where Partitioning can aid performance but in my experience they just as often hinder rather than help overall performance.

The myth is so strong that when asked at interview, most DBAs {and even some performance specialists} will site query performance as the main (and occasionally only) benefit of partitioning.

Why can partitions hinder SQL query performance? Let’s say for example that you have a 10 million row table and an index on that table. That index has a B-Level of 3, which means it has a root block, one level of branch nodes, a second layer of branch nodes and then the leaf-node level. To access a row in the table via the index Oracle needs to read the root block, two branch blocks and then a leaf block to get the rowid of the record. This allows the table block {and from that the row} to be read. This is depicted in the below diagram, the numbered orange squares represent the blocks as selected in turn from the index and then table:

index_access_global

That is 5 I/O operations to access that row.

Now we partition the table into 5 partitions, 2 million rows each. The index is locally partitioned. If you are lucky, you may, just may, end up with local index partitions with a B-level 1 less then the original table, so in our case a B-level of 2. The often suggest process is now that one partition is considered and thet the CBO will read one root node, a branch level block, a leaf block and then the block from the partition.

single_index_partition_access

4 I/Os and a saving of 20% {I’m ignoring for now caching and whether it is physical or logical IO}.

A saving of 20% IF {and only if} you have local index partitions with a lower B-Level than the equivalent non-partitioned index. And the access is to one partition alone.

However, I keep seeing situations where the index look up does not include the partition key. So you get the below situation:

index_lookup_partition

Lacking the partition key, the CBO cannot exclude any partitions – so it has to scan each one. For most partitions, maybe for all but one, no records are found, but the index has to be checked with 3 IO operations each. so in my example 16 I/Os are done to get the one record of interest.

16 I/O operations instead of 5.

The situation is often not spotted, at least initially, as the time taken to carry out the extra local index partition scans is “small”, especially for specific lookups. Usually any testing is done on a table with only a small numer of partitions.

I remember well the first time I came across this {on an Oracle 9.0 database I think}, there was well over 100 partitions and a process that checked many thousands of individual records had slowed down significantly, taking 4 or 5 times as long as before.

An indicator that the problem is occurring is when a single record lookup against the index and then table is taking perhaps several dozen to several hundred consistent gets rather than the 5 or 6 it should. Also, you will see the partition iterator in the explain plan. In that first case where I came across the issue, consistent gets of about 380 per record fetched were being seen for a select that returned 1 record 99% of the time from a single table lookup. I’ve since seen the same problem on tables with over a thousand partitions, each local index being scanned for a total of almost 5000 consistent gets per record.

You may think that this would be an unusual situation as access against very large tables is either full/partial table scans or lookups on the PK/with a WHERE clause including the partitioning key – but it is actually very common. Partitioned tables are being used more and more in large but generally OLTP applications or sets of records are identified in a datawarehouse that are then checked more specifically with generally row-specific logic. With VLDBs which have many, many partitioned tables with many, many partitions each, the problem is common and often not recognized in the fog of other issues and massive I/O levels.

I’ve only covered a general performance issue with partitions here, I’ll expand on the theme and this simplistic example in the next post.

And yes, there are many ways partitioning CAN aid performance. I aim to get to those too. I really love partitioning.

Follow

Get every new post delivered to your Inbox.

Join 161 other followers