jump to navigation

Oracle’s greatest performance feature July 14, 2009

Posted by mwidlake in performance.
Tags: ,

With oracle 10 onwards, you get what I feel is the greatest single step forward for general oracle database performance since…Ohhhh, I give up. I think it is more significant than anything else I’ve seen and I started with Oracle 6.

What is it? It’s the automated statistics gathering job. If you don’t know what this is, I’ll give you a brief summary in a few moments.
If you don’t agree with me, I’d love to know what you would nominate as the best single performance improvement since Oracle 6. {Or even before, for those of you even greyer and more world-weary than I}. This is after all, just an opinion and I’m open to changing my mind.

Now for that promised and brief {fairly brief} description of the automated stats gathering and why I love it {but please do not interpret this as an attempt to stop you telling me you alternatives, I really would like to know}.

By default, every work-day night, an Oracle database that is 10g or higher will spend from 10pm collecting table, index and column stats for you. This stats gathering job will keep going until it has collected all that it thinks it needs or the clock hits 6am, at which point it stops. At the weekend, your busy little databases will work even harder, from midnight Saturday {ie 1 second after Friday ended} to the end of Sunday gathering stats, if need be. Again, the job will stop once it has gathered the stats it thinks your system needs.

The job gathers stats on tables/indexes in all schemas, including SYSTEM, SYS and those other odd internal ones like DBSNMP and WMSYS, as well as all your own schemas. It gathers stats only on tables that have changed by 10% or more since last stats were gathered, or that have been truncated or created since the last run. For each such table, this job will sample a proportion of said table that it thinks is needed to give reliable stats. For each table it also gathers stats on each index {I’ll skim over a slight issue in respect of sample size for indexes}. When a table has it’s stats gathered, Oracle will even make a stab at gathering the correct level a column statistics for each column, based on whether you ever use the column in joins or where clause {ie there is some sense to collecting detailed histogram stats only for those columns it would help}.

Finally in my brief description, and something that not all DBAs and developers appreciate, is that this automatic job stores the stats as they were before it gathered new ones, so you can go back to previous stats if you so wish {and it was not more than a month ago}.

This process, this nightly job, has flaws. Some of it’s decisions can be poor. It can gather stats you might not want it it and it can mess up. BUT! It does run regularly and it does gather generally beneficial stats for all tables, indexes, partitions, sub-partitions, columns. All of them. {mostly} .

Prior to this automatic job, many sites’ object stats were in very poor health.

  • Some sites did not gather table/index stats at all.
  • Many sites gathered stats only occasionally.
  • Some sites would gather stats on a few tables and none on most.

This last situation, stats on some tables none on many others, condemned the cost based optimiser to having to make decisions based on a very poor information and was/is very common under 9. I like to think of it as taking a map of the UK and removing all the roads and then trying to drive from London to Manchester. So long as you keep going generally North and west, you will get there. Eventually. 

{I have a private theory that the number of hints in code is in proportion to the three situations above, the most being on systems with “stats on only a few tables”.}

This is why I think that job is the biggest step forward for performance. It ensures there is at least some information for every table and every index and that for most of them the data is correct to within 10%. Thus it allows the CBO to work with pretty accurate information and all that clever maths has a good chance of working.

Yes, if you know more about stats and performance and your system, you can do better than the automated job alone, but for the majority of sites it is a step forward. Especially for those sites that lack strong DBA/Developer expertise. Ie, the majority :-).

I know from talking to some Oracle support people that their lives have got quieter. I’m told that they get a lot fewer calls about SQL performance from Oracle 10 and 11 systems. Mind you, the usual “fix” of getting the customer to just gather some stats has gone out the window, which is a shame as it fixed most issues. All in all, it has been a massive boon to them.

So, I think it is the greatest step forward.

I also hate it. I hate that automated stats job. It makes silly decisions, it breaks, it is poorly documented and it makes a dog’s dinner of very big, very active or very odd systems. In fact it is rubbish. But very, very, very beneficial rubbish. I love that pile of rubbish.