Stats Need Stats to Gather Stats February 16, 2010
Posted by mwidlake in performance.Tags: partitions, performance, statistics
trackback
Did you know that you sometimes need good stats so that the stats-gathering package can gather stats in an efficient way? This is a recent, quite extreme example.
I’ve been forced to learn a lot about gathering Oracle stats using DBMS_STATS over the last 4 or 5 years. But no matter how much I learn about the “challengingly bizarre” way in which it works, it seems every week or two there is a new oddity. I plan a whole series on the topic “soon”.
This particular example is from a 10.2.0.3 system.
I am gathering partition-only table stats as we are using the ability of Oracle to roll up Partition stats to Global stats under certain specific conditions. One of the conditions is that you need stats for every partition. Plus, to get global column stats, each partition must have stats for each column. Some of our partitions lacked or had very bad stats.
So I quickly knocked up a script-generating script to create DBMST_STATS.GATHER_TABLE_STATS statements that collected, for those partitions:
- ONLY partition stats.
- NO cascade down to indexes
- BLOCK sampling so it is fast {and poor, but there you go}
- ESTIMATE_PERCENT of 2, which is quite low for block sampling
- collect histograms as we decided the devil of having them was better than the devil of not having them.
the above is not ideal to get “good stats”, but it is quick and gets OK stats which is what we need right now. An example statement is:
begin -- part SD_INFO-DY07032004 rows 34554 dbms_stats.gather_table_stats(ownname => 'ERIC',tabname=> 'SD_INFO' ,partname=> 'DY07032004' ,granularity=> 'PARTITION', estimate_percent => 2,block_sample=> true ,cascade=> FALSE ,method_opt=> 'FOR ALL COLUMNS SIZE AUTO' ,no_invalidate=> false); END; /
Not a big partition (34554 rows and not very wide rows) and so did not take long to gather:
PARTITION_NAME LAST_ANALYZED ------------------------------ -------------------- DY04012004 12-FEB-2010 16:26:57 DY05012004 12-FEB-2010 16:27:00 DY06012004 12-FEB-2010 16:27:04 DY07012004 12-FEB-2010 16:27:07 DY08012004 12-FEB-2010 16:27:11 -- 4 seconds difference DY09012004 12-FEB-2010 16:27:15
I’ve collected statistics for a few thousand partitions over the last couple of days and the time taken is anything between just under half a second to 10 seconds per partition, the odd unusually large partition taking a minute or so. {I believe it takes half a second to gather stats on an empty partition, on our system at least, due to the time it takes for the internal housekeeping, including copying the old statistics information to the SYS.WRI$_OPSTAT_… tables to support restoring stats}. Sorry, I drift away from my main point.
This partition took a lot longer than 10 seconds:
begin -- part W_ACTIVITY_FAILURE-DY02092008 rows 49425 dbms_stats.gather_table_stats(ownname => 'ERIC',tabname=> 'W_ACTIVITY_FAILURE' ,partname=> 'DY02092008' ,granularity=> 'PARTITION', estimate_percent => 2,block_sample=> true ,cascade=> FALSE ,method_opt=> 'FOR ALL COLUMNS SIZE AUTO' ,no_invalidate=> false); END; /
After 10 minutes it was still running. WHY? I quickly checked the number of rows in the partition and then the size of the partition segment, incase either was much larger than I expected. Neither were.
select count(*) from eric.W_ACTIVITY_FAILURE partition (DY07092008)
COUNT(*)
———-
42182
From dba_segments.
BYTES BLOCKS
———- ———-
2621440 320
There is one advantage of a DBMS_STATS statement running for a long time – you can grab from the SGA the actual code being executed for the DBMS_STATS statement. I saw this.
select substrb(dump(val,16,0,32),1,120) ep, cnt from (select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring */ "ACCOUNT_ID" val,count(*) cnt from "ERIC"."W_ACTIVITY_FAILURE" t where TBL$OR$IDX$PART$NUM("ERIC"."W_ACTIVITY_FAILURE",0,4,0,"ROWID") = :objn and "ACCOUNT_ID" is not null group by "ACCOUNT_ID") order by val
DBMS_STATS is scanning the ACCOUNT_ID column on this table and it is taking a long time about it. The only index on the table is on ACCOUNT_ID. It then struck us.
The index is a global index.
INDEX_NAME PARTITIONED
—————————— ————-
API_XXXXXXXXXX_ACCOUNTS_IND NO
And that global index actually lacked stats {another feature of DBMS_STATS and rolling up partition stats had led to that}.
INDEX_NAME TYP UNQ BL L_BLKS DIST_KEYS CLUSTF LB_KEY DB_KEY LST_ANL
————— ——— — —- ———- ———– ———– ———- ———- ——–
API_XXXXXXX_ NOR NON
ACCOUNTS_IND
By this time 3 partitions for this table had been processed by my code, taking around 15 minutes each one. Incredibly slow.
I did a very quick 0.01% sample size DBMS_STATS.GATHER_INDEX_STATS on that index which took about 1 minute. As soon as the partition DBMS_STATS.GATHER_TABLE_STATS statement that was in flight finished, the following similar statements on that table’s partitions took under 3 seconds each. I’ll buy a pint for the first person to guess (within 10 minutes) WHEN I collected the global index stats {You can collect from any pub in central London if you give me a few day’s notice 🙂 }.
PARTITION_NAME NUM_ROWS BLOCKS LAST_ANALYZED -------------- ---------- ---------- --------------------- DY01092008 31461 452 16-FEB-2010 09:51:41 DY02092008 49425 686 16-FEB-2010 10:03:44 DY03092008 54472 719 16-FEB-2010 10:16:31 DY04092008 35762 491 16-FEB-2010 10:30:52 DY05092008 42182 587 16-FEB-2010 10:44:24 DY06092008 21186 312 16-FEB-2010 10:56:13 DY07092008 20898 313 16-FEB-2010 11:12:59 DY08092008 469500 1233 16-FEB-2010 11:13:02 DY09092008 480300 741 16-FEB-2010 11:13:04 DY10092008 15724 223 16-FEB-2010 11:31:01 DY11092008 55671 732 16-FEB-2010 11:31:06 DY12092008 820100 1779 16-FEB-2010 11:31:08 DY13092008 80215 1113 16-FEB-2010 11:31:16 DY14092008 10094 155 16-FEB-2010 11:31:18 DY15092008 10268 158 16-FEB-2010 11:31:20 DY16092008 24578 330 16-FEB-2010 11:31:22 DY17092008 21012 290 16-FEB-2010 11:31:24 DY18092008 22755 318 16-FEB-2010 11:31:27 DY19092008 21276 293 16-FEB-2010 11:31:29 DY20092008 20882 281 16-FEB-2010 11:31:31 DY21092008 24131 323 16-FEB-2010 11:31:34
I will investigate this oddity further if I get time this week, but the lesson to myself so far is:
Global indexes lacking stats can result in very long stats gathering times for partitions EVEN WHEN YOU DO NOT CASCADE DOWN TO INDEXES.
Statistics on Partitioned Tables – Part 1…
If you’ve ever worked on large databases that use partitioned and subpartitioned tables, you’ll be aware that there are significant challenges in maintaining up-to-date/appropriate statistics. We’ve encountered a few problems at work recently and I …
[…] later – whilst digging out a link to Martin’s blog, I noticed that he’s planning a whole DBMS_STATS series soon. Sigh. Keep an eye out for that, because it will be as in-depth as always. I’ll stick to the […]
[…] 4-Slow statistic gathering on partitions caused by missing stats on global index Martin Widlake-Stats Need Stats to Gather Stats […]
There is one advantage of a DBMS_STATS statement running for a long time – you can grab from the SGA the actual code being executed for the DBMS_STATS statement.
Martin,
Can you please tell how you do this?
Hi Narendra,
Sure.
Whilst the stats gather is running, I pull out some of the columns from V$SQLAREA where upper(sql_text) like upper(‘%name_of_table%’):-
select
first_load_time
,parse_calls prse
,executions excs
,buffer_gets buffs
,disk_reads discs
,rows_processed rws
,hash_value hash_value
,sql_text
from v$sqlarea
where upper(sql_text) like upper(‘%’||nvl(‘&sql_txt’,’whoops’)||’%’)
order by (greatest(buffer_gets,1)/greatest(rows_processed,1)) desc
You can get the script {chk_sga_txt.sql} from my web site. Try http://www.ora600.org.uk/scripts.shtml
The site needs an overhaul so don’t look at it too closely!
Thanks Martin. I will check your other site.
One more question. Is there a way (query) to find out which object is being processed by DBMS_STATS.GATHER_SCHEMA_STATS call?
Hmmm, now that is a good one Narendra.
There is a little-known feature of dbms_stats.gather_schema_stats/gather_database_stats where you can call it with OPITIONS=>LIST AUTO/LIST STALE /LIST EMPTY and it will pass out a list of objects it will gather into our paramerter OBJLIST, type ObjectTab. So you could run that and dump it out and then run the actual gather.
But to see what the stats job really is collecting, you need to identify the process and then get the current SQL_ID out of v$session. With that you can then pull out the sql itself. OEM (or toad or any other DBA GUI) would make that easier to do of course.
Hope that helps,
Martin
Thanks Martin.
v$session is what first came to my mind. But then I thought there might be a better way to achieve the same.
As for OEM (or any other GUI tool), I have turned into text-based tools’ advocate ever since I started working on Oracle. 🙂
[…] השני שעסק פחות או יותר באותו עניין אך בגרסה אחרת היה בבלוג של Martin Widlake. במקרה הזה הוא מתאר מצב שאיסוף הסטטיסטיקות על […]
ביצועים גרועים באיסוף סטטיסטיקות על פרטישנים…
בהמשך לפוסט הקודם, קראתי שני פוסטים בבלוגים שמתארים מצבים שבהם איסוף סטטיסטיקות על פרטישנים לוקח הרבה זמן. הפוסט הראשון שקראתי היה בבלוג של Doug Burns שבו הוא מתאר מצב שבו העדר סטטיסטיקות גלובליות על הטבלה הביא לזמן איסוף סטטיסטיקות גרוע על פרטישנים בא…