jump to navigation

Automated Statistics Gathering Silently Fails #2 July 23, 2009

Posted by mwidlake in performance.
Tags: ,

<previous postnext post>

My first posting about the automated stats gathering job on 10/11g failing is here. I show how it can fail for really large segments and can go on failing night after night, the little so-and-so.

Today I am going to give you some pointers in spotting what the job choked on and how to deal with it {damn, ran out of time – see next post}.

How can you tell what it is failing on? Well, you can use OEM to look at the latest run of the automated stats gathering job and what was processed. It probably won’t tell you what it failed on, it did not in the version I tried (early 10.2) and no one has pinged me to say it now does. Look back at the previous posting for a screen shot if you want help finding the page in OEM.

However, you can check what was analyzed last night and compare to the list of objects shown. Concentrate on tables or partitions you know to be big, the job is not going to choke on a 14-row EMP table, more like a half-billion row ORDERS table.

select owner,table_name,last_analyzed,sample_size
from dba_tables
where last_analyzed between sysdate-.5 and sysdate -.4
order by table_name
dw> /
OWNER          TABLE_NAME                     LAST_ANALYZED     SAMPLE_SIZE
-------------- ------------------------------ ----------------- -----------
ODS XXXX_TIM_RF5 16-JUL-2009 00:09 19
ODS            H_XXXXXXXXXXXXXXX_TYPES        16-JUL-2009 00:10      272838
ODS            H_TXXXXXXXXXXXXXXXNER          16-JUL-2009 00:10       85184
ODS            H_XXXXXXXXXXXXXXXCTIONS        16-JUL-2009 00:04      165999
ODS            STGXXXXXXXXXXXXXXXCTIONS       16-JUL-2009 00:51          10
ODS            TBL_XXXXXXXX                   16-JUL-2009 01:06     3499693
ODS WRK_XXXXXX_FLAG 16-JUL-2009 01:09 173340

If it shows as being analysed by the automated job last night and it’s last_analyzed date is prior to the run {so is not in the list you just pulled out}, it choked on that.
It is often quicker to make an educated guess and check those objects that you know are big and what their LAST_ANALYZED date is.

If you have no access to OEM, you could try looking at objects that should have been analysed. These are tables/partitions with more than 10% difference. You do this via DBA_TAB_MODIFICATIONS. I’ve already posted about this table {look towards the end of the post}, I am strangely attached to it, I think it is jolly useful.

The table looks like this

desc all_tab_modifications
 Name                                 Null?    Type
 ---------------- -------------------------------
 TABLE_OWNER                           VARCHAR2(30)
 TABLE_NAME                             VARCHAR2(30)
 PARTITION_NAME                        VARCHAR2(30)
 INSERTS                                   NUMBER
 UPDATES                                   NUMBER
 DELETES                                   NUMBER
 TRUNCATED                               VARCHAR2(3)

You can interogate it with something like the below {which only looks at tables, not partitions} You can get my script here – this script here

-- tab_count
-- mdw 11/05/03
-- mdw 17/01/08 Modified to look at dba_tab_modifications
set pause on pages 24 pause 'Any Key>'
colu anlyzd_rows form 99999,999,999
colu tot_rows form 99999,999,999
colu tab_name form a30
colu chngs form 99,999,999
colu pct_c form 999.999
set lines 110
spool tab_count.lst
select dbta.owner||'.'||dbta.table_name tab_name
,dbta.num_rows anlyzd_rows
,to_char(dbta.last_analyzed,'yymmdd hh24:mi:ss') last_anlzd
-nvl(dtm.deletes,0) tot_rows
,nvl(dtm.inserts,0)+nvl(dtm.deletes,0)+nvl(dtm.updates,0) chngs
     /greatest(nvl(dbta.num_rows,0),1)                          pct_c
,dtm.truncated trn
from dba_tables dbta
left outer join sys.dba_tab_modifications dtm
on dbta.owner = dtm.table_owner
and dbta.table_name = dtm.table_name
  and dtm.partition_name is null
where dbta.table_name like upper(nvl('&Tab_name'||'%','WHOOPS'))
and dbta.owner        not in ('SYS','SYSTEM')
spool off
clear colu

And it comes out with something like {view plain}:-

------------------------ -------------- --------------- -------------- ----------- -------- --
O.ADGHERHEHES                       384 060726 09:49:22            391           7     .018 NO
O.AUEHERHERHERATUS               15,578 060825 15:39:38         15,578       2,185     .140 NO
O.ADRTJRTJRENT                1,865,595 060823 07:32:53      1,892,301      26,706     .014 NO
I.APP_ARTHRTHRT_EXTR                  0 060726 22:10:21              0           0     .000 YES
I.APP_ARTJHRJHTJH_EXTR                0 060726 22:10:21              0           0     .000 YES
O.AD_RHRTHRTHRS                 143,435 061110 15:49:46        186,646      86,899     .606 NO
O.AD_RHRTHHORHUMNS              599,066 061110 15:49:57        885,304     286,238     .478 NO
O.AD_UNMTABRTHLES                    25 061110 15:49:41             31           7     .280 NO
O.AD_FTD_SRCORTHHOLD                 23 060726 09:49:25             23           1     .043 NO

Anything with a PCT_C {percentage change} of over 0.1 should have stats gathered next time the job runs, if the table does not have it’s statistics locked {phew, remembered to mention that caveat}.

You could of course write something yourself that only selects objects that have 10% or more change – it will give you a heads up of what is likely to be analyzed tonight and those objects who’s dependent SQL could act differently come the morning 🙂

There is a final way to spot what got changed and it involves Oracle’s internal tables for storing stats, but I’ve run out of time. I’ll have to post that next time.