
Partitions are Still Not for Performance – Sometimes
November 5, 2009

Posted by mwidlake in performance.

This is a follow-up to my first posting on why Partitions are not for Performance, where I discussed how lookups against a partitioned table with a locally partitioned index and no partition pruning can lead to performance problems. Basically, the CBO ends up scanning all the local index partitions to find the record(s) you want, which with a hundred partitions will probably result in several hundred buffer gets instead of 4 or 5.

The first posting dealt with the simple situation where a single record is being selected but this lack of partition pruning also crops up with range scans of course.

{warning, this is a long post, but if you use partitions you need to understand this}.
{Caveat: it has been pointed out that the last post, and by association this one, appears partition-negative. I am not partition-negative; partitions are great, but for reasons generally other than performance and, to improve performance, you need to understand how partitions fit in with the CBO}

Imagine you have an unpartitioned ORDERS table with 10 million rows, the primary key is the traditional ascending numeric, derived from a sequence.
About 5,000 orders a day are received and the table goes back around 10 years (fewer orders per day were received in the past).
There is also an ORDER_DATETIME column, a DATE-type column holding the date and time the order was placed. This column has a traditional index on it, which has a B-level of 3.
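{For anyone who wants to picture the setup, here is a minimal sketch of the sort of table and index I have in mind. Only ORDERS, the sequence-driven ORDER_ID and ORDER_DATETIME come from the description above; the filler column, datatypes and index name are just illustrative.}

-- minimal sketch only; the filler column, datatypes and index name are assumed
CREATE TABLE ORDERS
( ORDER_ID       NUMBER        NOT NULL  -- traditional ascending numeric from a sequence
, ORDER_DATETIME DATE          NOT NULL  -- date and time the order was placed
, ORDER_DETAILS  VARCHAR2(200)           -- assumed filler column
, CONSTRAINT ORDERS_PK PRIMARY KEY (ORDER_ID)
);

-- the traditional, non-partitioned index on the date column
CREATE INDEX ORDERS_DATETIME_IDX ON ORDERS (ORDER_DATETIME);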

You want to query for all ORDERS placed on a date, say 1st November 2009, so you issue a query with the following WHERE clause:
WHERE ORDER_DATETIME >= TO_DATE('01-NOV-2009','DD-MON-YYYY')
AND ORDER_DATETIME < TO_DATE('02-NOV-2009','DD-MON-YYYY')

That will select around 5,000 out of 10 million rows and so a nested loop lookup on the index is the most suitable {just accept this for now, OK?}

The CBO will work down the levels of the index on ORDER_DATETIME to the first record of interest, taking 4 I/Os. It will then scan the index for all entries in that range, maybe 10 leaf blocks, and then read the table block for each entry – 5,000 I/Os against the table. But as the data is well clustered, most of those reads will be to the same, say, 100 blocks, so they will be cached in the block buffer cache. {Under Oracle – up to 11.1 anyway – all I/O is treated as physical, so the default optimizer calculates the cost on that basis, but in reality most of the I/O will be logical, not physical}.

So the real cost will be 4 I/Os to find the start of the index range to scan, 10 I/Os to scan the index and 5,000 I/Os to get the table data – 5,014 logical I/Os, with between 0 and (4+10+100 = 114) physical I/Os to get the data into the cache, depending on what was cached already.
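{If you want to see where those figures come from on your own system, the B-level, leaf block count and clustering factor the CBO works from are all in the data dictionary. The index name below is the illustrative one from the sketch earlier.}

SELECT INDEX_NAME, BLEVEL, LEAF_BLOCKS, CLUSTERING_FACTOR
FROM USER_INDEXES
WHERE INDEX_NAME = 'ORDERS_DATETIME_IDX';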

OK, remember those figures…

Now we partition the table by the ORDER_DATETIME column into 100 partitions and the index on ORDER_DATETIME is locally partitioned of course. Your same WHERE clause on ORDER_DATETIME will now resolve to a single index partition.
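{Again, purely as a sketch of what I mean – the partition names and boundaries below are made up, and in real life you would have one partition per period, around 100 in total.}

-- sketch only: names and boundaries are illustrative, not a recommendation
CREATE TABLE ORDERS_PART
( ORDER_ID       NUMBER NOT NULL
, ORDER_DATETIME DATE   NOT NULL
)
PARTITION BY RANGE (ORDER_DATETIME)
( PARTITION P_2009_10 VALUES LESS THAN (TO_DATE('01-NOV-2009','DD-MON-YYYY'))
, PARTITION P_2009_11 VALUES LESS THAN (TO_DATE('01-DEC-2009','DD-MON-YYYY'))
  -- ...and so on, one partition per period
);

-- the locally partitioned index: one index segment per table partition
CREATE INDEX ORDERS_PART_DT_IDX ON ORDERS_PART (ORDER_DATETIME) LOCAL;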

If you are lucky the local index partition will have a B-level of 2, 1 less than the standard non-partitioned index. So, with the same WHERE clause, the CBO will work down the local index partition to the first record of interest, for 3 I/Os. Then it will scan the index for the same 10 leaf blocks and find 5,000 records to check in the table partition, with the same 5,000 I/Os. You have saved….Yep, 1 logical IO. Physical IOs are between 0 and (3+10+100 =113). Let’s be kind and assume you saved a physical IO.

In this case, partitioning saved 1 out of 5,014 logical and 1 out of 114 physical I/Os.

Partitioning, in this case, is performance agnostic. It matters very, very little.

In reality, if your application is working on the same day of data over and over again, using the one partition, then that set of data will be cached and a nested-loop access to it will find the data in memory – but that would be true with the standard table and index too :-) So still no gain….

I’ve assumed a nested-loop access path is still used by the CBO when accessing your partition. This is where things can be very different. {It is also where I should swap from theory to creating some test data to prove this, but I’ll hang myself out first and then look like an idiot next week when I am back with my test system :-)}.

Let’s estimate that the unpartitioned table is 100,000 blocks (100 rows per block).
With 5,014 I/Os against a 10 million row table/index, a nested loop lookup is likely to be chosen, as the CBO estimates that number of I/Os to be less than scanning the whole table (100,000 blocks divided by the actual multi-block read count {which is not the db_file_multiblock_read_count parameter, but I’m skipping over that detail}, let’s say 12, for around 8,300 I/Os).

With the table split into 100 partitions, there are 1,000 blocks in each partition. A whole partition can be scanned with 1,000 divided by the real multi-block read count {12} reads – so 80 or so I/Os.
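{You can sanity-check the blocks-per-partition figure from the dictionary, provided the statistics are reasonably up to date. The table name is the illustrative one from the sketch above.}

SELECT PARTITION_NAME, BLOCKS, NUM_ROWS
FROM USER_TAB_PARTITIONS
WHERE TABLE_NAME = 'ORDERS_PART'
ORDER BY PARTITION_POSITION;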

So in reality, your partitioning MAY be beneficial, as the CBO decides to forget the index and nested loop lookups and simply scans the whole partition into memory for maybe 80-100 I/Os using multi-block reads, then processes the data in memory, burning some CPU as well, to find the records you want.

So, instead of 5,013 I/Os the CBO decides to do 80-100 I/Os and some extra memory/CPU work, which is cheap given the power of modern CPUs.

So, partitions may help performance as the CBO swaps from nested loop lookups with an index to simply full scanning the partition.
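{The easiest way to see which way the CBO has jumped on your own system is to look at the plan – something like the below, using the illustrative names from earlier. The plan should show a PARTITION RANGE SINGLE step and either an index range scan or a full scan of that one partition.}

EXPLAIN PLAN FOR
SELECT *
FROM ORDERS_PART
WHERE ORDER_DATETIME >= TO_DATE('01-NOV-2009','DD-MON-YYYY')
AND ORDER_DATETIME < TO_DATE('02-NOV-2009','DD-MON-YYYY');

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);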

You may start to wonder if that index on ORDER_DATETIME helps much… If it is to support range scans of a significant number of records, maybe not, but if it helps single or small-number record lookups, the index still helps.

I don’t know about you, but my brain is full for the night. I’ll leave this for another day….


Comments»

1. Jeff Moss - November 7, 2009

…not sure if you meant me when you said it was pointed out that your previous post “appears partition negative”…but I think you mean negative about performance benefits when using partitions…nobody is saying you’re partition negative at all…I think we all agree that partitions are great, for many reasons, including performance.

I still don’t like your statement that:

“…partitions are great but for reasons generally other than performance…”

I just don’t agree.

I think most people use partitions for performance or scalability (which could be thought of as the achievement of consistent performance over time) or manageability (which, given that the main method for achieving this is Partition Exchange Loading, one can consider to be a performance aid as well, since if the table was not partitioned to enable PEL, then you would need to use either complete rebuilds or DELETE+INSERT statements on the target table, both of which would be far worse in terms of performance)…I think this is probably why many people cite Performance as a key reason for using partitioning – it’s kind of intrinsic in many of the other reasons for using partitioning.

Cheers
Jeff

mwidlake - November 7, 2009

I did mean you Jeff and I don’t see it as a personal attack, just a view/comment on the blog. As I said in my reply to your latest comment on the previous blog (only just written it), I agree with you on this, I’ll explain later

2. Bernard Polarski - November 9, 2009

Who cares if single-table range scanning of a 10 million row set is only 0.1% more efficient with partitioning? The first time you join two 10 million row sets with a nested loop, you will endure heavy pain.

I would prefer an article where you describe how the CBO abandons a nested loop join in favour of full scans of two partitions combined by a hash join (a partition-wise join). Joining tables is very popular these days. There are too few articles on partitioning by date, sub-partitioned by hash, to get peak performance from SQL that can use partition-wise joins.

mwidlake - November 9, 2009

“Who cares if single-table range scanning of a 10 million row set is only 0.1% more efficient with partitioning”

That’s kind of my point, Bernard. This and the previous posting were intended to highlight that partitioning can damage select performance and can also be select performance neutral – but as you may notice, towards the end of this posting I discuss the swap from index lookup to full partition scanning….You may get your preferred article soon. :-)

3. PB - December 1, 2009

I guess that partitioning is one of the most decisive issues for many people when choosing EE vs STD, so Oracle is earning a lot of money due to this “possible misconception”.

I’d like to know how many records are too many for STD with current machines (8 cores, as much memory as you want, FC 4Gb, 15K disks…): 10 million, 50 million, 100 million?

mwidlake - December 1, 2009

I’ve certainly come across sites where partitioning was the reason they went for EE, just as I have come across sites that desperately want partitioning but just can’t afford EE :-(

I would say that “too many rows” is the point where you:
– can no longer carry out housekeeping on the table (or its indexes) in a timely way
– can’t avoid full table scans/index range scans/nested loop lookups that do not respond in time to support the business (and partition exclusion would solve it)
– something else breaks (like you hit a bug because the table is larger than the biggest one Oracle tested with…)

The number of rows will vary depending on how wide the row is, too.

I have seen tables over 100 million rows and I’ve implemented partitioned tables with the partition key being in the ID, 10 million rows per partition. So I would hazard a guess that for a table holding rows less than 200 bytes per row, a small number of millions will be the limit.

I wonder what the biggest single table other people have been happy using?

