Stats Need Stats to Gather Stats

Stats Need Stats to Gather Stats February 16, 2010

Posted by mwidlake in performance.
Tags: partitions, performance, statistics
trackback

Did you know that you sometimes need good stats so that the stats-gathering package can gather stats in an efficient way? This is a recent, quite extreme example.

I’ve been forced to learn a lot about gathering Oracle stats using DBMS_STATS over the last 4 or 5 years. But no matter how much I learn about the “challengingly bizarre” way in which it works, it seems every week or two there is a new oddity. I plan a whole series on the topic “soon”.

This particular example is from a 10.2.0.3 system.

I am gathering partition-only table stats as we are using the ability of Oracle to roll up Partition stats to Global stats under certain specific conditions. One of the conditions is that you need stats for every partition. Plus, to get global column stats, each partition must have stats for each column. Some of our partitions lacked or had very bad stats.

So I quickly knocked up a script-generating script to create DBMST_STATS.GATHER_TABLE_STATS statements that collected, for those partitions:

ONLY partition stats.
NO cascade down to indexes
BLOCK sampling so it is fast {and poor, but there you go}
ESTIMATE_PERCENT of 2, which is quite low for block sampling
collect histograms as we decided the devil of having them was better than the devil of not having them.

the above is not ideal to get “good stats”, but it is quick and gets OK stats which is what we need right now. An example statement is:

begin                                                                                              
-- part SD_INFO-DY07032004 rows 34554                                                              
dbms_stats.gather_table_stats(ownname => 'ERIC',tabname=> 'SD_INFO'
,partname=> 'DY07032004'
,granularity=> 'PARTITION', estimate_percent => 2,block_sample=> true ,cascade=> FALSE
,method_opt=> 'FOR ALL COLUMNS SIZE AUTO' ,no_invalidate=> false);
END;
/

Not a big partition (34554 rows and not very wide rows) and so did not take long to gather:

PARTITION_NAME                 LAST_ANALYZED
------------------------------ --------------------
DY04012004                     12-FEB-2010 16:26:57
DY05012004                     12-FEB-2010 16:27:00
DY06012004                     12-FEB-2010 16:27:04
DY07012004                     12-FEB-2010 16:27:07
DY08012004                     12-FEB-2010 16:27:11 -- 4 seconds difference
DY09012004                     12-FEB-2010 16:27:15

I’ve collected statistics for a few thousand partitions over the last couple of days and the time taken is anything between just under half a second to 10 seconds per partition, the odd unusually large partition taking a minute or so. {I believe it takes half a second to gather stats on an empty partition, on our system at least, due to the time it takes for the internal housekeeping, including copying the old statistics information to the SYS.WRI$_OPSTAT_… tables to support restoring stats}. Sorry, I drift away from my main point.

This partition took a lot longer than 10 seconds:

begin
-- part W_ACTIVITY_FAILURE-DY02092008 rows 49425
dbms_stats.gather_table_stats(ownname => 'ERIC',tabname=> 'W_ACTIVITY_FAILURE'
,partname=> 'DY02092008'
,granularity=> 'PARTITION', estimate_percent => 2,block_sample=> true ,cascade=> FALSE
,method_opt=> 'FOR ALL COLUMNS SIZE AUTO' ,no_invalidate=> false);
END;
/

After 10 minutes it was still running. WHY? I quickly checked the number of rows in the partition and then the size of the partition segment, incase either was much larger than I expected. Neither were.
select count(*) from eric.W_ACTIVITY_FAILURE partition (DY07092008)
COUNT(*)
———-
42182

From dba_segments.
BYTES BLOCKS
———- ———-
2621440 320

There is one advantage of a DBMS_STATS statement running for a long time – you can grab from the SGA the actual code being executed for the DBMS_STATS statement. I saw this.

select substrb(dump(val,16,0,32),1,120) ep, cnt 
from (select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact
use_weak_name_resl dynamic_sampling(0) no_monitoring */
"ACCOUNT_ID" val,count(*) cnt 
from "ERIC"."W_ACTIVITY_FAILURE" t 
where TBL$OR$IDX$PART$NUM("ERIC"."W_ACTIVITY_FAILURE",0,4,0,"ROWID") = :objn
and "ACCOUNT_ID" is not null 
group by "ACCOUNT_ID") 
order by val

DBMS_STATS is scanning the ACCOUNT_ID column on this table and it is taking a long time about it. The only index on the table is on ACCOUNT_ID. It then struck us.

The index is a global index.
INDEX_NAME PARTITIONED
—————————— ————-
API_XXXXXXXXXX_ACCOUNTS_IND NO

And that global index actually lacked stats {another feature of DBMS_STATS and rolling up partition stats had led to that}.

INDEX_NAME TYP UNQ BL L_BLKS DIST_KEYS CLUSTF LB_KEY DB_KEY LST_ANL
————— ——— — —- ———- ———– ———– ———- ———- ——–
API_XXXXXXX_ NOR NON
ACCOUNTS_IND

By this time 3 partitions for this table had been processed by my code, taking around 15 minutes each one. Incredibly slow.

I did a very quick 0.01% sample size DBMS_STATS.GATHER_INDEX_STATS on that index which took about 1 minute. As soon as the partition DBMS_STATS.GATHER_TABLE_STATS statement that was in flight finished, the following similar statements on that table’s partitions took under 3 seconds each. I’ll buy a pint for the first person to guess (within 10 minutes) WHEN I collected the global index stats {You can collect from any pub in central London if you give me a few day’s notice 🙂 }.

PARTITION_NAME   NUM_ROWS     BLOCKS LAST_ANALYZED
-------------- ---------- ---------- ---------------------
DY01092008          31461        452 16-FEB-2010 09:51:41
DY02092008          49425        686 16-FEB-2010 10:03:44
DY03092008          54472        719 16-FEB-2010 10:16:31
DY04092008          35762        491 16-FEB-2010 10:30:52
DY05092008          42182        587 16-FEB-2010 10:44:24
DY06092008          21186        312 16-FEB-2010 10:56:13
DY07092008          20898        313 16-FEB-2010 11:12:59
DY08092008         469500       1233 16-FEB-2010 11:13:02
DY09092008         480300        741 16-FEB-2010 11:13:04
DY10092008          15724        223 16-FEB-2010 11:31:01
DY11092008          55671        732 16-FEB-2010 11:31:06
DY12092008         820100       1779 16-FEB-2010 11:31:08
DY13092008          80215       1113 16-FEB-2010 11:31:16
DY14092008          10094        155 16-FEB-2010 11:31:18
DY15092008          10268        158 16-FEB-2010 11:31:20
DY16092008          24578        330 16-FEB-2010 11:31:22
DY17092008          21012        290 16-FEB-2010 11:31:24
DY18092008          22755        318 16-FEB-2010 11:31:27
DY19092008          21276        293 16-FEB-2010 11:31:29
DY20092008          20882        281 16-FEB-2010 11:31:31
DY21092008          24131        323 16-FEB-2010 11:31:34

I will investigate this oddity further if I get time this week, but the lesson to myself so far is:

Global indexes lacking stats can result in very long stats gathering times for partitions EVEN WHEN YOU DO NOT CASCADE DOWN TO INDEXES.

Comments»

1. Doug's Oracle Blog - February 17, 2010: Statistics on Partitioned Tables – Part 1…

If you’ve ever worked on large databases that use partitioned and subpartitioned tables, you’ll be aware that there are significant challenges in maintaining up-to-date/appropriate statistics. We’ve encountered a few problems at work recently and I …

Reply
2. Statistics on Partitioned Tables – Part 1 | Oracle - February 18, 2010: […] later – whilst digging out a link to Martin’s blog, I noticed that he’s planning a whole DBMS_STATS series soon. Sigh. Keep an eye out for that, because it will be as in-depth as always. I’ll stick to the […]

Reply
3. Blogroll Report 12/02/2009 – 19/02/2010 « Coskan’s Approach to Oracle - March 18, 2010: […] 4-Slow statistic gathering on partitions caused by missing stats on global index Martin Widlake-Stats Need Stats to Gather Stats […]

Reply
4. Narendra - July 12, 2010: There is one advantage of a DBMS_STATS statement running for a long time – you can grab from the SGA the actual code being executed for the DBMS_STATS statement.
Martin,

Can you please tell how you do this?

Reply
mwidlake - July 12, 2010: Hi Narendra,

Sure.

Whilst the stats gather is running, I pull out some of the columns from V$SQLAREA where upper(sql_text) like upper(‘%name_of_table%’):-

select
first_load_time
,parse_calls prse
,executions excs
,buffer_gets buffs
,disk_reads discs
,rows_processed rws
,hash_value hash_value
,sql_text
from v$sqlarea
where upper(sql_text) like upper(‘%’||nvl(‘&sql_txt’,’whoops’)||’%’)
order by (greatest(buffer_gets,1)/greatest(rows_processed,1)) desc

You can get the script {chk_sga_txt.sql} from my web site. Try http://www.ora600.org.uk/scripts.shtml
The site needs an overhaul so don’t look at it too closely!

Reply
5. Narendra - July 13, 2010: Thanks Martin. I will check your other site.
One more question. Is there a way (query) to find out which object is being processed by DBMS_STATS.GATHER_SCHEMA_STATS call?

Reply
mwidlake - July 13, 2010: Hmmm, now that is a good one Narendra.

There is a little-known feature of dbms_stats.gather_schema_stats/gather_database_stats where you can call it with OPITIONS=>LIST AUTO/LIST STALE /LIST EMPTY and it will pass out a list of objects it will gather into our paramerter OBJLIST, type ObjectTab. So you could run that and dump it out and then run the actual gather.

But to see what the stats job really is collecting, you need to identify the process and then get the current SQL_ID out of v$session. With that you can then pull out the sql itself. OEM (or toad or any other DBA GUI) would make that easier to do of course.

Hope that helps,

Martin

Reply
6. Narendra - July 13, 2010: Thanks Martin.
v$session is what first came to my mind. But then I thought there might be a better way to achieve the same.
As for OEM (or any other GUI tool), I have turned into text-based tools’ advocate ever since I started working on Oracle. 🙂

Reply
7. ביצועים גרועים באיסוף סטטיסטיקות על פרטישנים « הבלוג של זהר אלקיים - November 23, 2010: […] השני שעסק פחות או יותר באותו עניין אך בגרסה אחרת היה בבלוג של Martin Widlake. במקרה הזה הוא מתאר מצב שאיסוף הסטטיסטיקות על […]

Reply
8. הבלוג של זהר אלקיים - November 23, 2010: ביצועים גרועים באיסוף סטטיסטיקות על פרטישנים…

בהמשך לפוסט הקודם, קראתי שני פוסטים בבלוגים שמתארים מצבים שבהם איסוף סטטיסטיקות על פרטישנים לוקח הרבה זמן. הפוסט הראשון שקראתי היה בבלוג של Doug Burns שבו הוא מתאר מצב שבו העדר סטטיסטיקות גלובליות על הטבלה הביא לזמן איסוף סטטיסטיקות גרוע על פרטישנים בא…

Reply

Martin Widlake's Yet Another Oracle Blog