Richard Foote on the Impact of stats staying the same. February 16, 2010
Posted by mwidlake in performance.Tags: Blogging, performance, statistics
add a comment
I just wanted to highlight this very good posting by Richard Foote. He is talking about how SQL execution plans can change when “nothing else does”. Not the table structures, not the code, note the initialisation parameters, not even the table and index stats.
But something does change, which is what day it is (or what hour it is or what week it is). Time moves on and our data does too. If the statistics on the tables does NOT move on, then the CBO thinks that the range of data in the table does not change. So, the CBO thinks your queries are getting further and further “out of range” and so would expect to find less and less data in the tables to bring back. That will lead to plan changes.
If you have noticed my preoccupation with identifying the contents of histograms and high/low values in column stats, you may appreciate that this topic is one I have been finding is a large part of my day job.
Richard explains the point very well, as always, so go have a look.
Stats Need Stats to Gather Stats February 16, 2010
Posted by mwidlake in performance.Tags: partitions, performance, statistics
10 comments
Did you know that you sometimes need good stats so that the stats-gathering package can gather stats in an efficient way? This is a recent, quite extreme example.
I’ve been forced to learn a lot about gathering Oracle stats using DBMS_STATS over the last 4 or 5 years. But no matter how much I learn about the “challengingly bizarre” way in which it works, it seems every week or two there is a new oddity. I plan a whole series on the topic “soon”.
This particular example is from a 10.2.0.3 system.
I am gathering partition-only table stats as we are using the ability of Oracle to roll up Partition stats to Global stats under certain specific conditions. One of the conditions is that you need stats for every partition. Plus, to get global column stats, each partition must have stats for each column. Some of our partitions lacked or had very bad stats.
So I quickly knocked up a script-generating script to create DBMST_STATS.GATHER_TABLE_STATS statements that collected, for those partitions:
- ONLY partition stats.
- NO cascade down to indexes
- BLOCK sampling so it is fast {and poor, but there you go}
- ESTIMATE_PERCENT of 2, which is quite low for block sampling
- collect histograms as we decided the devil of having them was better than the devil of not having them.
the above is not ideal to get “good stats”, but it is quick and gets OK stats which is what we need right now. An example statement is:
begin -- part SD_INFO-DY07032004 rows 34554 dbms_stats.gather_table_stats(ownname => 'ERIC',tabname=> 'SD_INFO' ,partname=> 'DY07032004' ,granularity=> 'PARTITION', estimate_percent => 2,block_sample=> true ,cascade=> FALSE ,method_opt=> 'FOR ALL COLUMNS SIZE AUTO' ,no_invalidate=> false); END; /
Not a big partition (34554 rows and not very wide rows) and so did not take long to gather:
PARTITION_NAME LAST_ANALYZED ------------------------------ -------------------- DY04012004 12-FEB-2010 16:26:57 DY05012004 12-FEB-2010 16:27:00 DY06012004 12-FEB-2010 16:27:04 DY07012004 12-FEB-2010 16:27:07 DY08012004 12-FEB-2010 16:27:11 -- 4 seconds difference DY09012004 12-FEB-2010 16:27:15
I’ve collected statistics for a few thousand partitions over the last couple of days and the time taken is anything between just under half a second to 10 seconds per partition, the odd unusually large partition taking a minute or so. {I believe it takes half a second to gather stats on an empty partition, on our system at least, due to the time it takes for the internal housekeeping, including copying the old statistics information to the SYS.WRI$_OPSTAT_… tables to support restoring stats}. Sorry, I drift away from my main point.
This partition took a lot longer than 10 seconds:
begin -- part W_ACTIVITY_FAILURE-DY02092008 rows 49425 dbms_stats.gather_table_stats(ownname => 'ERIC',tabname=> 'W_ACTIVITY_FAILURE' ,partname=> 'DY02092008' ,granularity=> 'PARTITION', estimate_percent => 2,block_sample=> true ,cascade=> FALSE ,method_opt=> 'FOR ALL COLUMNS SIZE AUTO' ,no_invalidate=> false); END; /
After 10 minutes it was still running. WHY? I quickly checked the number of rows in the partition and then the size of the partition segment, incase either was much larger than I expected. Neither were.
select count(*) from eric.W_ACTIVITY_FAILURE partition (DY07092008)
COUNT(*)
———-
42182
From dba_segments.
BYTES BLOCKS
———- ———-
2621440 320
There is one advantage of a DBMS_STATS statement running for a long time – you can grab from the SGA the actual code being executed for the DBMS_STATS statement. I saw this.
select substrb(dump(val,16,0,32),1,120) ep, cnt
from (select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact
use_weak_name_resl dynamic_sampling(0) no_monitoring */
"ACCOUNT_ID" val,count(*) cnt
from "ERIC"."W_ACTIVITY_FAILURE" t
where TBL$OR$IDX$PART$NUM("ERIC"."W_ACTIVITY_FAILURE",0,4,0,"ROWID") =
bjn
and "ACCOUNT_ID" is not null
group by "ACCOUNT_ID")
order by val
DBMS_STATS is scanning the ACCOUNT_ID column on this table and it is taking a long time about it. The only index on the table is on ACCOUNT_ID. It then struck us.
The index is a global index.
INDEX_NAME PARTITIONED
—————————— ————-
API_XXXXXXXXXX_ACCOUNTS_IND NO
And that global index actually lacked stats {another feature of DBMS_STATS and rolling up partition stats had led to that}.
INDEX_NAME TYP UNQ BL L_BLKS DIST_KEYS CLUSTF LB_KEY DB_KEY LST_ANL
————— ——— — —- ———- ———– ———– ———- ———- ——–
API_XXXXXXX_ NOR NON
ACCOUNTS_IND
By this time 3 partitions for this table had been processed by my code, taking around 15 minutes each one. Incredibly slow.
I did a very quick 0.01% sample size DBMS_STATS.GATHER_INDEX_STATS on that index which took about 1 minute. As soon as the partition DBMS_STATS.GATHER_TABLE_STATS statement that was in flight finished, the following similar statements on that table’s partitions took under 3 seconds each. I’ll buy a pint for the first person to guess (within 10 minutes) WHEN I collected the global index stats {You can collect from any pub in central London if you give me a few day’s notice
}.
PARTITION_NAME NUM_ROWS BLOCKS LAST_ANALYZED -------------- ---------- ---------- --------------------- DY01092008 31461 452 16-FEB-2010 09:51:41 DY02092008 49425 686 16-FEB-2010 10:03:44 DY03092008 54472 719 16-FEB-2010 10:16:31 DY04092008 35762 491 16-FEB-2010 10:30:52 DY05092008 42182 587 16-FEB-2010 10:44:24 DY06092008 21186 312 16-FEB-2010 10:56:13 DY07092008 20898 313 16-FEB-2010 11:12:59 DY08092008 469500 1233 16-FEB-2010 11:13:02 DY09092008 480300 741 16-FEB-2010 11:13:04 DY10092008 15724 223 16-FEB-2010 11:31:01 DY11092008 55671 732 16-FEB-2010 11:31:06 DY12092008 820100 1779 16-FEB-2010 11:31:08 DY13092008 80215 1113 16-FEB-2010 11:31:16 DY14092008 10094 155 16-FEB-2010 11:31:18 DY15092008 10268 158 16-FEB-2010 11:31:20 DY16092008 24578 330 16-FEB-2010 11:31:22 DY17092008 21012 290 16-FEB-2010 11:31:24 DY18092008 22755 318 16-FEB-2010 11:31:27 DY19092008 21276 293 16-FEB-2010 11:31:29 DY20092008 20882 281 16-FEB-2010 11:31:31 DY21092008 24131 323 16-FEB-2010 11:31:34
I will investigate this oddity further if I get time this week, but the lesson to myself so far is:
Global indexes lacking stats can result in very long stats gathering times for partitions EVEN WHEN YOU DO NOT CASCADE DOWN TO INDEXES.
