Format of Date Histograms February 6, 2010

Posted by mwidlake in internals.

Oracle, annoyingly, handles dates internally in several formats. Dates are stored in tables as seven bytes, each byte representing century, year-in-century, month, day, hour, minute and second, shifted by different amounts. For the high/low values seen in DBA_TAB_COLUMNS or DBA_TAB_COL_STATISTICS, the value is stored as a RAW, where a two-digit hex string represents the century, year-in-century, month etc. as before – see this post on decoding high and low column values and check out the comments for corrections.
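{As a quick illustration of those seven bytes – DUMP shows the internal representation. The scratch table here is just for demonstration; century and year are stored in excess-100 notation and hour, minute and second are each stored plus one:}

create table date_demo as
select to_date('2010-01-28 23:51:38','YYYY-MM-DD HH24:MI:SS') placed_dt
from dual;

select dump(placed_dt) from date_demo;

DUMP(PLACED_DT)
----------------------------------------
Typ=12 Len=7: 120,110,1,28,24,52,39

That is century 20 (120-100), year 10 (110-100), month 1, day 28, then hour 23, minute 51 and second 38, each stored plus one.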

So what about histograms? You might want to know what is in the histogram for a date column in order to better understand the decisions made by the CBO. Below, I pull out the histogram for an example date column {I’ve trimmed the output a little to save space}:

select table_name
,column_name
,endpoint_value end_val
,endpoint_number rowcount
from all_tab_histograms
where table_name ='TEST_TABLE_X'
and owner ='TEST_1'
and column_name = 'PLACED_DT'
order by endpoint_number

TABLE_NAME      COLUMN_NAME   END_VAL       ROWCOUNT
--------------- ------------- ------------- ---------------
TEST_TABLE_X     PLACED_DT   2,452,258         0
TEST_TABLE_X     PLACED_DT   2,454,334         1
TEST_TABLE_X     PLACED_DT   2,454,647         2
TEST_TABLE_X     PLACED_DT   2,454,737         3
TEST_TABLE_X     PLACED_DT   2,454,820         4
TEST_TABLE_X     PLACED_DT   2,454,867         5
TEST_TABLE_X     PLACED_DT   2,454,929         6
TEST_TABLE_X     PLACED_DT   2,454,981         7
TEST_TABLE_X     PLACED_DT   2,455,006         8
TEST_TABLE_X     PLACED_DT   2,455,024         9
TEST_TABLE_X     PLACED_DT   2,455,039         10
TEST_TABLE_X     PLACED_DT   2,455,050         11
...
TEST_TABLE_X     PLACED_DT   2,455,205         42
TEST_TABLE_X     PLACED_DT   2,455,207         43
TEST_TABLE_X     PLACED_DT   2,455,209         44
TEST_TABLE_X     PLACED_DT   2,455,211         45
TEST_TABLE_X     PLACED_DT   2,455,213         46
TEST_TABLE_X     PLACED_DT   2,455,215         47
TEST_TABLE_X     PLACED_DT   2,455,216         48
TEST_TABLE_X     PLACED_DT   2,455,218         49
TEST_TABLE_X     PLACED_DT   2,455,220         50
TEST_TABLE_X     PLACED_DT   2,455,221         51
TEST_TABLE_X     PLACED_DT   2,455,222         52
TEST_TABLE_X     PLACED_DT   2,455,222         53
TEST_TABLE_X     PLACED_DT   2,455,222         54
TEST_TABLE_X     PLACED_DT   2,455,222         55
TEST_TABLE_X     PLACED_DT   2,455,223         56
TEST_TABLE_X     PLACED_DT   2,455,223         57
TEST_TABLE_X     PLACED_DT   2,455,223         58
TEST_TABLE_X     PLACED_DT   2,455,223         59
...
TEST_TABLE_X     PLACED_DT   2,455,223         69
TEST_TABLE_X     PLACED_DT   2,455,223         70
TEST_TABLE_X     PLACED_DT   2,455,223         71
TEST_TABLE_X     PLACED_DT   2,455,223         72
TEST_TABLE_X     PLACED_DT   2,455,223         73
TEST_TABLE_X     PLACED_DT   2,455,224         74
TEST_TABLE_X     PLACED_DT   2,455,226         75

Well, it looks like a seven-digit number, so it must be representing the date in a similar way to that described above, yes? No. The format is obviously something different {oh come ON Oracle, some internal standards would be nice once in a while! :-) }

I pulled out the minimum and maximum values from the DBA_TAB_COLUMNS table and translated them:-

COLUMN_NAME  NUM_DIST   LOW_V               HI_V 
------------ ---------- ------------------- ------------------- 
PLACED_DT    375,428    2001-12-13 17:31:38 2010-01-28 23:51:38 

So the above low and high values should pretty much match the first and last values in the histogram, which are 2452258 and 2455226. Any guesses? Those two histogram values are very close to each other, only 2,968 apart to cover around 9 years of values. 9 times 365…

The histogram is representing the dates in Julian format, ie the number of days since 1st Jan 4712BC. As a quick proof of this:

select to_date('2452258','J'),to_date('2455226','J')
from dual;

TO_DATE('2452258' TO_DATE('2455226'
----------------- -----------------
14-DEC-2001 00:00 29-JAN-2010 00:00

Well, what do you know. Very close to the 13th Dec 2001 and 28th Jan 2010.

This of course makes sense: storing the date as an ascending numeric means it is simple to calculate the width of the range and how many values per day there are.

Imagine how complex it would be to do this if the date was stored in the bizarre way we humans deal with it – an ascending numeric for the year, a number cycling through 12 values for the month and a varying cyclic number for the day of the month. And that is ignoring other calendars in common use.

However, I’m looking at this method of representing the date and something bothers me. There is no allowance for the time portion. Maybe column histograms can’t cope with time? I feel a couple of tests coming on…

ADDITIONAL.
If you have seen the comments, you will know that Jonathan commented to ask if I might have truncated or been selective with the data in creating my test table, as he is sure there is a fractional section representing the time portion.

Well, I did say I needed to check further so maybe I would have spotted my error on my own… :-)
This is the script I use to pull out histogram data (and yes, I trimmed the output from the script before copying it into this post, so it does not quite match the script):

-- chk_hist.sql
-- Martin Widlake 7/4/4
-- pull out the histograms on a table
set pause on pages 32
spool chk_hist.lst
col owner form A8
col table_name form A15
col column_name form a20
col rowcount form 99,999,999,999
col end_val form 99,999,999,999
select owner
,table_name
,column_name
,endpoint_value  end_val
,endpoint_number rowcount
from all_tab_histograms
where table_name like upper(nvl('&tab_name','WHOOPS')||'%')
and  owner like upper('&tab_own')||'%'
and  column_name like upper('&col_name')||'%'
order by table_name,column_name,endpoint_number
/
spool off
clear colu

Hmm, I wonder what happens if I change the “col end_val form 99,999,999,999” so that it would show fractions…

col end_val form 999,999,999.99999999

TABLE_NAME      COLUMN_NAM               END_VAL     ROWCOUNT
--------------- ---------- --------------------- ------------
TEST_TABLE_X    PLACED_DT     2,452,257.73030093            0
TEST_TABLE_X    PLACED_DT     2,454,333.76546296            1
TEST_TABLE_X    PLACED_DT     2,454,647.32561343            2
TEST_TABLE_X    PLACED_DT     2,454,737.25017361            3
TEST_TABLE_X    PLACED_DT     2,454,820.02204861            4
TEST_TABLE_X    PLACED_DT     2,454,866.98009259            5
TEST_TABLE_X    PLACED_DT     2,454,928.66848380            6
TEST_TABLE_X    PLACED_DT     2,454,980.94815972            7
TEST_TABLE_X    PLACED_DT     2,455,005.68413194            8
TEST_TABLE_X    PLACED_DT     2,455,023.67142361            9
TEST_TABLE_X    PLACED_DT     2,455,039.03236111           10
TEST_TABLE_X    PLACED_DT     2,455,050.39246528           11

Ahhhhh……

Thanks Jonathan :-)
It is still a Julian date, but with the time as a fraction as he said.
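So to turn a fractional endpoint value back into a full date and time, feed the whole-day part to TO_DATE with the ‘J’ mask and add the fraction back on as part of a day. A quick check against the first bucket above:

select to_char(to_date(to_char(trunc(2452257.73030093)),'J')
               + mod(2452257.73030093,1)
              ,'YYYY-MM-DD HH24:MI:SS') placed_dt
from dual;

PLACED_DT
-------------------
2001-12-13 17:31:38

Which matches the low value for the column pulled from DBA_TAB_COLUMNS earlier.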

Friday Philosophy – Statistically Significant November 27, 2009

Posted by mwidlake in performance, VLDB.

There are very few generalist Oracle DBAs around, and very very very few expert generalist Oracle DBAs. Tom Kyte might count, but I’m sure if you pressed him he would admit to knowing nothing about {some key aspect of being a general DBA}. That is because to be an expert on something you have to spend a lot of time on it. Oh, and learn. {I’ve spent a lot of time around my wife but she still confuses me, I just don’t seem to be able to learn}. Oracle is simply too wide a topic to be able to spend a lot of time on all of it, even if by “Oracle” you mean just the core RDBMS technology, which I do. You have to pick an area.

Pete Finnigan specialises in security, Julian Dyke on RAC and Internals, Doug Burns on the relationship between the database and cuddly toys. Oh and ASH/AWR.

So I ask myself, what is my specialty?

Well, if I go on the last 5 working years of my life, it would probably be Oracle Database Statistics. Which is quite ironic given my woeful attempts with Statistics when I tried that maths ‘A’ level all those years back {for non-UK people, an ‘A’ level is what you do at age 17-18. It’s that point in maths when most of us are convinced logic is replaced by magic}. I’ll go further and say I specialise in Oracle Database Statistics on VLDBs. Maybe even more specific: Gathering and Maintaining Oracle Database Statistics on VLDBs.

Not a very sexy-sounding specialty is it, even in the context of IT technical specialties. I am sure that if I was to tell a lady I wish to impress that I was “a specialist in gathering and maintaining Oracle database statistics on VLDBs” then I would soon be standing alone as she looked for an accountant to talk to {I refer back to my comment on my wife, I repeatedly try to impress her with this line and she never fails to walk away}. Heck, I could spend all my time at the UKOUG Conference next week and struggle to find someone who would respond positively to such a claim to greatness.

But the situation is that I have had to deal with the failures and idiosyncrasies of this area of Oracle for 4 major clients on a dozen or so systems, and have discussed it with half a dozen other people who have had challenging times with it. And even now it trips me up all the time. Because, frankly, some of it is not very well implemented (choking automated stats gathering job, anyone?), different parts work in different ways (if you state a statid, statown and stattab when you SET_TABLE_STATS, the values go into the stats table; if you state them for GATHER_TABLE_STATS, the gathered values go into the data dictionary and the OLD ones go into the stats table – yeah, I know, if you did not know that you are now going “huh?”), and some of it is wrapped up in a blanket of confusion and secrecy (what exactly DOES the automated stats job do, and when you say “GATHER AUTO” what exactly will Oracle do automatically?).
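To make that stats-table quirk concrete, here is a minimal sketch {the table T1, stats table MY_STATS and statids RUN1/RUN2 are made-up names for illustration; the stats table itself would be created with DBMS_STATS.CREATE_STAT_TABLE first}:

begin
  -- values go into MY_STATS under RUN1; the data dictionary is untouched
  dbms_stats.set_table_stats(ownname=>user,tabname=>'T1',numrows=>999999
                            ,stattab=>'MY_STATS',statid=>'RUN1',statown=>user);
  -- the gathered values go into the data dictionary; the OLD dictionary
  -- stats are the ones saved into MY_STATS, here under RUN2
  dbms_stats.gather_table_stats(ownname=>user,tabname=>'T1'
                            ,stattab=>'MY_STATS',statid=>'RUN2',statown=>user);
end;
/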

Thankfully the secrecy side is reducing as Oracle (and others) say more about how some of these things are controlled or decided “automatically”, but then the world shifts under your feet with new versions of Oracle. Stats gathering under 11g is far more open to control than under 10g, but as my current client is not on 11g, I can’t spend too long looking at that.

So currently I am an expert in 10g Gathering and Maintaining Oracle Statistics on VLDBs. Now if that is not pretty damned specialist then I don’t know what is.

I should blog technical details on all of this {sadly I know the answers to the things I mention above}, but I suspect people would go “Hmmm, interesting…..” in that kind of “who in heck needs to know that! I’m off to do block dumps of temporary segments” way. But if you think otherwise, let me know.

Besides, I am being very, very poor at getting on with the Partition stuff I want to do, even though I have half-written about 4 more posts on it. I need to stick to that for my technical blogs for now.

That and I kind of shy away from being an expert in such an unexciting area; I might get offered {oh dear lord, not more} work in it.

Thankfully I also specialise in beta testing new features of Oracle for clients in the travel and media industries, where on-site work in exotic locations is, at times, required… If anyone has any opening in that field, drop me a line :-)

Depth of Indexes on VLDBs November 18, 2009

Posted by mwidlake in performance, Uncategorized, VLDB.

In a couple of recent posts on partitions, on single row and range scan access, I have started exploring the performance of partitions. I will continue on the main theme soon, honest.

I feel a little bad that I have not given concrete examples so far {creating very large objects for the sake of demonstrations and proof-of-concepts is a bit of an issue with VLDBs, due to the space and time requirements to do it}. So, here is just some real-world information on index depth, or BLEVEL of indexes on Very Large DataBases.

The BLEVEL is the number of levels in the index. Minus one. See this post for a full description, but in brief, if an index has a BLEVEL of 3, that means there is a root node, a level of branch nodes, a second level of branch nodes and then the leaf nodes. 4 levels. 4 IOs to check if the index holds the value you are looking for. And then one or more IOs against the table to get the actual records from the table, if required (one block if there is only one row, one or more blocks if there are several rows).
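{As an aside, if you want a quick feel for the spread of index depths across your own database, a trivial sketch along these lines does it – you just need SELECT access to DBA_INDEXES:}

select blevel,count(*)
from dba_indexes
group by blevel
order by blevel
/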

The below is from a 25TB datawarehouse system. I am looking for the maximum index depth on the database:-

select max(blevel) from dba_indexes

MAX(BLEVEL)
-----------
4

OK, how many indexes are this deep?…

select owner,index_name from dba_indexes
where blevel=4

OWNER INDEX_NAME
------------------------------ ----------------------
ERIC W_IIIIIIII_URLS_PK

1 row selected.

Oh. I have a pretty massive database and the deepest index I have is BLEVEL 4 and there is one of them. So the index height is 5. And access to that table for a given referral ID is going to take 5 IOs to work down the index and one against the table.

What is more, the index is an IOT:

INDEX_NAME INDEX_TYPE
------------------------------ ---------------------------
W_REFERRAL_URLS_PK IOT – TOP

so, in fact, to get to the table record it is just the 5 IOs, as the table record is held in the IOT.

So with a 25TB data warehouse, the deepest normal indexes are BLEVEL 3.

select index_name,num_rows,leaf_blocks
from dba_indexes
where BLEVEL =3
order by num_rows desc

INDEX_NAME                             NUM_ROWS  LEAF_BLOCKS
-------------------------------- -------------- ------------
W_WL_SH_API_XIIIIIIIIIII_ID       4,585,108,192   13,406,015
W_WL_SHIIIIIIIIIIIXXXX_ID         4,564,350,002   16,892,898
W_WL_SIIIIIIIIIIIIIIIIIIIIII_ID   4,422,775,820   17,189,536
W_WL_SHIIIIIIIIIIIIIIIIIIIIII_ID  4,182,087,545   12,243,438
W_WL_GX_RESIIIIIIIIIII_LOG_ID1    3,690,374,216   14,613,388
IDX_W_WL_LIIIIIIIIIIIESS_ID         351,507,905    1,958,109
WL_LOGIN_SUIIIIIIIIIIIIE_ID         348,004,861    1,550,180
IND_IIII_LEG_ID                     312,727,000      972,690
IND_TMLR_MARIIIIIIIIIII_ID          306,988,247    1,121,592
PK_TBL_IIIIIIIIIIIUNNER             305,566,240      939,775
PK_IIIIIIIIIIIGS                    284,208,000      863,000

I asked an old friend, Tony Webb, to check out the depth of indexes on the largest databases they have, which range from a few TB to several 10s of TB. {One is 100TB plus, but a lot of the data is in LOBs, so the actual number of records is only billions, not 10s of billions :-) } No index had a BLEVEL greater than 3.

Hi Martin,

I ran the first query on DAVE, DEE(our biggest db), DOZY and MICK. The max depth results were 3,2,3 and 3. Our largest database only had 2. In fact some of the other largish dbs also only have a max depth of 2.

That enough for your blog Martin?

Now, some of you may be smelling a Rat, and you would be right. At both sites, the VLDBs have been designed with partitioning as a key part of the Physical Implementation. All the largest objects are partitioned. What impact does partitioning have on BLEVEL?

The depth of the index segment is dependent on the number of keys per block.

Let us say you have 8k blocks {this is a questionably low block size for a data warehouse, but it is what some of these systems have} and your index is on a couple of columns, totalling 20 bytes per key on average, including the rowid. Allowing for block overhead, that is approx 8000/20 = 400 entries max…

Blow that, that’s wholly theory; I’ll look at my real 25TB database for the number of rows per leaf block…

select leaf_blocks, num_rows/leaf_blocks rows_per_leaf
from dba_ind_statistics
where num_rows > 100000000
order by leaf_blocks desc

 LEAF_BLOCKS ROWS_PER_LEAF
------------ -------------
  30,211,949    325.602273
  28,717,447    262.762331
  26,308,608    382.505959
  25,429,862    385.047045
  24,054,626     309.92864
  23,660,747    382.571434
  22,458,058    411.063205
  22,316,976    346.620625
  18,875,827    379.952198
  17,451,901    338.600909
  17,189,536     257.29466

From the above let us take a low figure of 300 average entries per block. For a BLEVEL 0, a one-block index, that would be 300 rows that are referenced. For a BLEVEL 1 index, 300 entries in the root block will point to 300 leaf blocks, so that will be 300*300 rows… 90,000 entries.

Go down another BLEVEL to 2 (and a height of 3) and you have a root node referencing 300 branch nodes, each referencing 300 leaf nodes, each of which references 300 records. 300*300*300 = 27 million.

Another level (BLEVEL 3), another factor of 300 and that is 8.1 billion.

BLEVEL 0 = 300
BLEVEL 1 = 90,000
BLEVEL 2 = 27,000,000
BLEVEL 3 = 8,100,000,000
BLEVEL 4 = 2,430,000,000,000 (2,430 billion entries).
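{If you like to let the database do the arithmetic, this hypothetical query generates the same table – it simply raises the assumed 300 entries per block to the power of the number of index levels:}

select level-1 blevel
      ,power(300,level) max_entries
from dual
connect by level <= 5
/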

You can see that with an 8k block size, and being very pessimistic about the number of entries per block (I used a low average to calculate entries per block, whereas in reality most indexes will fill the root and branch blocks close to full before throwing a new level), you have to increase your data volume by almost 300 times to throw a new level of index.

On average, knowing nothing about the actual number of records in an index, you would have to reduce your number of indexed rows by 150 times (half of 300) to have a 50% chance of lowering the BLEVEL of a locally partitioned index by one. ie 150 or more partitions.

With a 32k block size, you would have to reduce your data volume by more like 620 times to drop the BLEVEL (not 600, as you have less wastage from block overhead with larger blocks).

To reduce it by two it would be 300*150 I think, with an 8k block size {can any statisticians, like Mr Lewis, help me out on this one?}. If I am right, that is 45,000 partitions. In that 25TB datawarehouse, no table has more than 4,000 partitions.

That is why I say, when you partition a table and have locally partitioned indexes, you might reduce the index level by 1. Might.

However, none of these systems has more than 4,000 partitions per table. That might sound a lot, but it is only going to reduce an index BLEVEL by 1. Almost certainly it will, but if you have read the previous postings, that is not going to make index lookups that much more efficient :-)

I’ll just add a couple of last comments.

  • If you have an index BLEVEL of 4 or 5 or more and do not have a VLDB, you might want to look at why {hints: it could be an IOT, it could be an index on several concatenated VARCHAR2 columns, it could be you are using a 2k block size, it could be that you regularly delete a large percentage of your data and then re-insert more data, it could be you have a one-table database, hehe}.
  • Point one does not mean you should rebuild said index. I almost never rebuild indexes. The need to do so is one of those Oracle myths. There are situations where index rebuilds are advantageous, but they are pretty damned rare. If you regularly rebuild all your indexes or you are thinking of doing so, can I suggest you book a “meeting” for an afternoon and sit down with a cup of tea and google/bing/yahoo or whatever and check out index rebuilding first? Ignore any sites you land on offering database health checks or training courses on cruise liners.

That’s all for now.

BLEVEL and Height of Indexes November 13, 2009

Posted by mwidlake in internals, performance.

{Update – this post describes BLEVEL, demonstrates its growth with the number of index entries and the impact on index-to-table lookups. For a pretty graphic of BLEVEL try looking at this page}

I got something wrong in a couple of postings recently, namely the relationship between BLEVEL and the number of blocks needed to read “down” an index – the true depth or HEIGHT of the index. {I used to know this but I forgot, and heck, no one pinged me on the two posts in question, so I got away with it :-) – I’ve updated the postings already.}

BLEVEL is the number of branch levels (including the root node) in a B-Tree index. Height is the actual depth of the index. Height is BLEVEL plus one. So when you see BLEVEL of 3 against an index in DBA_INDEXES/DBA_IND_STATISTICS, that means the index has a root node, a first level of Branch blocks, then a second level of Branch blocks and finally the Leaf blocks (which hold the indexed values and rowids to the table entries).

Thus to scan the index for one unique entry, Oracle will need to read the root node to locate the correct branch node in branch level one, read that to find the correct branch node in branch level two, and that will lead to the correct leaf block. That is four blocks to read. The leaf block contains the index entry and the rowid of the relevant data block, which allows Oracle to go directly to that block, for the fifth block read.

{I’m having trouble finding a nice diagram of this {{ I hate the one in the Oracle manuals}}, not even on Mr Foote’s or Mr Lewis’s pages, so if you spot one before I do, let me know and I’ll update this page with a relevant link}.
{Update 18 months later – I finally drew a nice diagram of the index-rowid-table_row path.}

Some documentation on the web mentions HEIGHT being held in the index stats table. This is SYS.INDEX_STATS, not the DBA_IND_STATISTICS table, and SYS.INDEX_STATS is only populated when you run the old “ANALYZE INDEX index_name VALIDATE STRUCTURE” command, so ignore that. You should not really be using the old ANALYZE command any more.
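{For completeness, this is what that old approach looks like; note that VALIDATE STRUCTURE (without the ONLINE option) locks the table against DML while it runs, and INDEX_STATS only ever holds one row, from the last index your session validated:}

analyze index tb_pk validate structure;

select name,height,br_blks,lf_blks
from index_stats;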

The below demonstrates the increasing BLEVEL and the number of consistent gets to select one record {it’s more complicated if you select more than one}

NAME                           VALUE
------------------------------ --------------------
compatible                     10.2.0.3.0
cpu_count                      8
db_block_size                  8192
db_file_multiblock_read_count  16
optimizer_mode                 ALL_ROWS
sga_target                     0
sort_area_size                 65536

create table test_bl
 (id    number(8) not null
 ,status number(1) not null
 ,num_1     number(3) not null -- random 20
 ,num_2     number(3) -- random 20
 ,num_3     number(5) -- cycle smoothly
 ,num_4     number(5) -- cycle smoothly
 ,vc_1      varchar2(10)
 ,vc_2      varchar2(10)
 ,vc_pad varchar2(2000))
 tablespace users
 /
Table created.

insert into test_bl(id,status,num_1,num_2,num_3,num_4
                   ,vc_1,vc_2,vc_pad)
select rownum,decode(mod(rownum,100),0,1
               ,0)
,trunc(dbms_random.value(1,20))
,trunc(dbms_random.value(1,30))
,mod(rownum,10)+1
,mod(rownum,100)+1
,dbms_random.string('U',10)
,lpad(chr(mod(rownum,6)+65),5,chr(mod(rownum,6)+65) )
,lpad('A',100,'A')
from dba_objects
where rownum < 500
/
499 rows created.

commit;
Commit complete.

-- now add a pK on the ID
alter table test_bl
add constraint tb_pk primary key (id)
using index
tablespace users
 /
Table altered.

begin
  dbms_stats.gather_table_stats(user,'TEST_BL');
END;
/
PL/SQL procedure successfully completed.

select index_name,blevel,leaf_blocks
from dba_indexes
where owner=user
and index_name like 'TB%'
/

INDEX_NAME                         BLEVEL LEAF_BLOCKS
------------------------------ ---------- -----------
TB_PK                                   0           1

So I’ve created the test table, put 499 records in it and added the index, via a primary key constraint. The index created has one leaf block in it and a BLEVEL of 0.

Now let’s select a record via it {and the reason I put 499 records in the table is so that Oracle decides to use the index and not a full table scan, which would be a likely choice by the CBO with a very small table}.

set autotrace on
select vc_1 from test_bl where id=54
 /

VC_1
----------
BRFVRHEMWP
1 row selected.

Execution Plan
----------------------------------------------------------
| Id| Operation                   | Name    | Rows  |Bytes| Cost|
----------------------------------------------------------
| 0 | SELECT STATEMENT            |         |    1 |  15 |  1  (0)|
| 1 |  TABLE ACCESS BY INDEX ROWID| TEST_BL |    1 |  15 |  1  (0)|
|*2 |   INDEX UNIQUE SCAN         | TB_PK   |    1 |     |  0  (0)|
----------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        339  bytes sent via SQL*Net to client
        338  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

select vc_1 from test_bl where id=54
 /

VC_1
----------
BRFVRHEMWP
1 row selected.

Execution Plan
----------------------------------------------------------
|Id | Operation                   | Name    | Rows |Bytes| Cost
----------------------------------------------------------
| 0 | SELECT STATEMENT            |         |    1 |  15 | 1   (0)|
| 1 |  TABLE ACCESS BY INDEX ROWID| TEST_BL |    1 |  15 | 1   (0)|
|*2 |   INDEX UNIQUE SCAN         | TB_PK   |    1 |     | 0   (0)|
-----------------------------------------------------------


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads

I generally run test selects twice, to remove any parsing and recursive SQL overhead {I’ll remove these from the rest of this post}. So, 2 consistent gets. That would be one on the index and one on the table then.
Note the cost of the index unique scan – 0. See end.

Now I’ll add more data and grow the index.

test102>set autotrace off echo off
insert into test_bl(id,status,num_1,num_2,num_3,num_4
                  ,vc_1,vc_2,vc_pad)
select rownum+500,decode(mod(rownum,100),0,1
              ,0)
,trunc(dbms_random.value(1,20)) 
,trunc(dbms_random.value(1,30)) 
,mod(rownum,10)+1
,mod(rownum,100)+1
,dbms_random.string('U',5)
,lpad(chr(mod(rownum,6)+65),5,chr(mod(rownum,6)+65) )
,lpad('A',100,'A')
from dba_objects
where rownum < 5500
/
5499 rows created.

begin
  dbms_stats.gather_table_stats(user,'TEST_BL');
END;
/
PL/SQL procedure successfully completed.

select index_name,blevel,leaf_blocks
from dba_indexes
where owner=user
and index_name like 'TB%'
/

INDEX_NAME                         BLEVEL LEAF_BLOCKS
------------------------------ ---------- -----------
TB_PK                                   1          11

So now we have a BLEVEL of 1 and 11 leaf blocks. That will be a root node and below it the leaf blocks. Let’s try a select:

select vc_1 from test_bl where id=454
/
VC_1
----------
IQGSEOCCCH

Execution Plan
----------------------------------------------------------
|Id | Operation                   | Name    | Rows  |Bytes| Cost|
----------------------------------------------------------
| 0 | SELECT STATEMENT            |         |     1 |  11 | 2   (0)|
| 1 |  TABLE ACCESS BY INDEX ROWID| TEST_BL |     1 |  11 | 2   (0)|
|*2 |   INDEX UNIQUE SCAN         | TB_PK   |     1 |     | 1   (0)|
-----------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads

3 consistent gets, one for the root node, one for the relevant leaf block and one for the data block holding the record.

Now I’ll insert about 400,000 more records to cause the index to become one level deeper. (You might be interested to know that 300,000 records was not enough to cause a layer of branch nodes to be created, though as I am indexing an ascending numerical column, each index entry is not exactly huge. This does show that the BLEVEL does not scale with data volume – but it ‘does not scale’ in a very beneficial way. You need to massively increase the volume of data between increasing BLEVELs.)

I will then select my record:

99999 rows created.
99999 rows created.
99999 rows created.
99999 rows created.

begin
  dbms_stats.gather_table_stats(ownname=>user,tabname =>'TEST_BL'
                                ,estimate_percent=> 10);
END;
 /
PL/SQL procedure successfully completed.

INDEX_NAME                         BLEVEL LEAF_BLOCKS
------------------------------ ---------- -----------
TB_PK                                   2         761

set autotrace on
select vc_1 from test_bl where id=454
 /

VC_1
----------
IQGSEOCCCH
1 row selected.

Execution Plan
----------------------------------------------------------
|Id | Operation                   | Name    | Rows  |Bytes| Cost|
--------------------------------------------------------
| 0 | SELECT STATEMENT            |         |     1 |  12 | 3   (0)|
| 1 |  TABLE ACCESS BY INDEX ROWID| TEST_BL |     1 |  12 | 3   (0)|
|*2 |   INDEX UNIQUE SCAN         | TB_PK   |     1 |     | 2   (0)|
---------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads

The index BLEVEL has gone up to 2 {index height is 3} and now 4 consistent gets are needed to fetch the record.

You may have noticed that the estimated cost of the INDEX_UNIQUE_SCAN is the same as the BLEVEL, which is not really correct. After all, in the first example the cost was 0 and there has to be a read of the index leaf block! The costing makes more sense when it is part of the calculation for scanning an index and then visiting the table for all found records:-

“basic index range scan cost = index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)”

In words this formula means “go down the index to the leaf nodes (this is the BLEVEL), scan the number of leaf nodes expected for this index value, then visit the number of table blocks this set of index entries would map to”.
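As a made-up worked example {the figures are purely illustrative}: an index with a BLEVEL of 2, 1,000 leaf blocks and a clustering factor of 50,000, hit with a predicate giving an index and table selectivity of 1%, would cost roughly:

cost = 2 + ceil(0.01 * 1,000) + ceil(0.01 * 50,000)
     = 2 + 10 + 500
     = 512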

For more information on the formula, I’d plug part of that formula into google (or bing or whatever takes your fancy, search-engine-wise). The original is this page by Richard Foote but there are some good notes by others as well.

There are a lot of references on the web about the cost of accessing an index being the BLEVEL, but remember, if it is a unique access it is the BLEVEL plus one, and Oracle seems (in my little tests anyway) to be underestimating the cost by 1. I think this reference to the BLEVEL and the costs might be leading people into mistaking the BLEVEL for the actual height of the index.

Privileges required to FLUSH_DATABASE_MONITORING_INFO October 19, 2009

Posted by mwidlake in development, performance.

I’m doing some work at the moment on gathering object statistics and it helps me a lot to have access to the number of changed records in SYS.DBA_TAB_MODIFICATIONS. To ensure you have the latest information in this table, you need to first flush any data out of memory with DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO.  For the live system, the DBAs rather understandably {and sensibly} want all users to run with the least access privileges they need, so granting DBA role to my user is out.
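{For context, this is the sort of check I want to run; the flush is the step that needs the privilege, and the schema name here is just illustrative:}

exec dbms_stats.flush_database_monitoring_info

select table_name,inserts,updates,deletes,timestamp
from sys.dba_tab_modifications
where table_owner ='DWPERF'
order by table_name
/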

I googled for the actual system privilege or privileges needed to flush_database_monitoring_info and drew a blank, so I have had to find out for myself. And being a nice guy {who am I kidding}, I am now recording the info for anyone else who is interested to find:

On 10.2.0.3, to execute DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO you need the ANALYZE ANY system privilege.

{Not ANALYZE ANY DICTIONARY, which would make more sense to me}

For those who like such things, here is the proof. I had to use two sessions, thus the constant displaying of system time.

-- current user privs
DWPERFDEV1> @usr_privs
enter user whos privs you wish to see> dwperf
GRANTEE              TYPE PRIVILEGE                           adm
----------------------------------------------------------------
DWPERF               SYSP CREATE JOB                          NO
DWPERF               SYSP CREATE PROCEDURE                    NO
DWPERF               SYSP CREATE PUBLIC SYNONYM               NO
DWPERF               SYSP CREATE SESSION                      NO
DWPERF               SYSP CREATE SYNONYM                      NO
DWPERF               SYSP CREATE TABLE                        NO
DWPERF               SYSP CREATE TRIGGER                      NO
DWPERF               SYSP DEBUG CONNECT SESSION               NO
DWPERF               SYSP DROP PUBLIC SYNONYM                 NO
DWPERF               SYSP EXECUTE ANY PROCEDURE               NO
DWPERF               SYSP SELECT ANY DICTIONARY               NO
DWPERF               SYSP SELECT ANY TABLE                    NO
DWPERF               ROLE CONNECT                             NO
DWPERF               OBJP SYS.DBMS_UTILITY-EXECUTE            NO
DWPERF_ROLE          SYSP ANALYZE ANY                         NO
DWPERFDEV1> @showtime
  Date       Time
--------------------------------------------------------
19-OCT-2009 13:29:16

DWPERFDEV1> exec dbms_stats.flush_database_monitoring_info
BEGIN dbms_stats.flush_database_monitoring_info; END;
*
ERROR at line 1:
ORA-20000: Insufficient privileges
ORA-06512: at "SYS.DBMS_STATS", line 2148
ORA-06512: at "SYS.DBMS_STATS", line 14135
ORA-06512: at line 1

DEV1> @showtime
  Date       Time
------------------------------------------------
19-OCT-2009 13:29:30

DEV1> grant analyze any dictionary to dwperf;
Grant succeeded.

DWPERFDEV1> @showtime
  Date       Time
------------------------------------------
19-OCT-2009 13:29:40

DWPERFDEV1> exec dbms_stats.flush_database_monitoring_info
BEGIN dbms_stats.flush_database_monitoring_info; END;
*
ERROR at line 1:
ORA-20000: Insufficient privileges
ORA-06512: at "SYS.DBMS_STATS", line 2148
ORA-06512: at "SYS.DBMS_STATS", line 14135
ORA-06512: at line 1

DEV1> @showtime
 Date       Time
---------------------------------------------
19-OCT-2009 13:30:46

DEV1> grant analyze any to dwperf;
Grant succeeded.

DWPERFDEV1> @showtime
  Date       Time
------------------------------------
19-OCT-2009 13:31:20

DWPERFDEV1> exec dbms_stats.flush_database_monitoring_info
PL/SQL procedure successfully completed.

-- SUCCESS!

DEV1> @showtime
  Date       Time
-------------------------------------------
19-OCT-2009 13:31:38
DEV1> revoke analyze any from dwperf
Revoke succeeded.

DWPERFDEV1> @showtime
  Date       Time
------------------------------------------------------
19-OCT-2009 13:31:57

DWPERFDEV1> exec dbms_stats.flush_database_monitoring_info
BEGIN dbms_stats.flush_database_monitoring_info; END;
*
ERROR at line 1:
ORA-20000: Insufficient privileges
ORA-06512: at "SYS.DBMS_STATS", line 2148
ORA-06512: at "SYS.DBMS_STATS", line 14135
ORA-06512: at line 1

Of course, I’ll soon find something else that breaks due to my minimum privs before the end of the day, but it’s not easy creating more secure systems {note, I don’t say Secure, just more secure, as in less open!}.

Decrypting Histogram Data #3 – is 7 or 15 Characters Significant? September 15, 2009

Posted by mwidlake in internals, performance.

At last I am going to wrap up looking at histogram data and endpoint_value. Sorry about the delay, life is a tad complex at present. This is also a very, very long post, so you may want to skip to the end to get the conclusions.

I first posted about how to translate endpoint_values.
I then posted in the second post a week ago about how endpoint_value is limited to 7 or so characters for varchar2 columns.
{Addition – this later post discusses the content of endpoint_value for dates}

Coskan contacted me to say he believed I had missed something, and of course he was right – but I’d laid big hints that I was using theory rather than practical tests and that there was more to know. {There is usually more to know; only practical tests confirm whether your theory works or not. And even after that, it seems there is still more to know!}

OK, I have a test table, created with the following {oh, this is under 10.2.0.3, Enterprise Edition, on Linux}:

create table hist_test
as select rownum  id
,lpad(chr(mod(rownum,6)+65),5,chr(mod(rownum,6)+65)) vc_1
,lpad('A',2,'A')||dbms_random.string('U',1) vc_2
,lpad('B',3,'B')||dbms_random.string('U',1) vc_3
,lpad('C',4,'C')||dbms_random.string('U',1) vc_4
,lpad('D',5,'D')||dbms_random.string('U',1) vc_5
,lpad('E',6,'E')||dbms_random.string('U',1) vc_6
,lpad('F',7,'F')||dbms_random.string('U',1) vc_7
,lpad('G',8,'G')||dbms_random.string('U',1) vc_8
,lpad('H',16,'H')||dbms_random.string('U',1) vc_16
,lpad('I',40,'I')||dbms_random.string('U',1) vc_40
from dba_objects
where rownum < 1001
/

I now need to add some skew to the data, so that different WHERE clauses should see different expected numbers of rows and you can see the histograms in action:

update hist_test set
 VC_5=lpad('D',5,'D')||'A'
,VC_7=lpad('F',7,'F')||'A'
,VC_16=lpad('H',16,'H')||'A'
,VC_40=lpad('I',40,'I')||'A'
where mod(id,10)=1
/

Gather histogram stats, 10 buckets:

begin
  dbms_stats.gather_table_stats(ownname=>user,tabname=>'HIST_TEST'
                               ,method_opt=>'FOR ALL COLUMNS SIZE 10');
end;
/

And looking at ALL_TAB_COLUMNS we can see the general structure of the data (just the VC_ columns are shown):

COL_NAME NUM_DIST LOW_V                HI_V                 N_BUCK  AVG_L
-------------------------------------------------------------
ID          1,000 1                    1000                     10      4
VC_1            6 AAAAA                FFFFF                     6      6
VC_2           26 AAA                  AAZ                      10      4
VC_3           26 BBBA                 BBBZ                     10      5
VC_4           26 CCCCA                CCCCZ                    10      6
VC_5           26 DDDDDA               DDDDDZ                   10      7
VC_6           26 EEEEEEA              EEEEEEZ                  10      8
VC_7           26 FFFFFFFA             FFFFFFFZ                 10      9
VC_8           26 GGGGGGGGA            GGGGGGGGZ                10     10
VC_16          26 HHHHHHHHHHHHHHHHA    HHHHHHHHHHHHHHHHZ        10     18
VC_40           1 IIIIIIIIIIIIIIIIIIII IIIIIIIIIIIIIIIIIIII      1     42

I have created a table with varchar2 columns set to a fixed string of characters of the length indicated by the column name, with the last character varying from A to Z. So VC_5 contains DDDDDx, where x is A to Z – 26 distinct values in each column.

If you look at the above, all columns are showing as having 26 values and 10 buckets EXCEPT VC_40, but it does have 26 distinct values:

select count(distinct(VC_40)) from hist_test;

COUNT(DISTINCT(VC_40))
----------------------
26

I have my function to unpack the endpoint_value for varchar2 columns from the number stored to the character string it represents. My function is not perfect, but it is reasonably good; go and check out post one for the function.
Looking at the histograms with my function you can see the below {I show only some records}:

colname                               END_VAL ROWCOUNT MOD_REAL
------------------------------------------------------------------
VC_5     354460798875962000000000000000000000        1 DDDDDA
VC_5     354460798875972000000000000000000000        2 DDDDDD
VC_5     354460798875986000000000000000000000        3 DDDDDG
{snip}
VC_5     354460798876071000000000000000000000        9 DDDDDY
VC_5     354460798876080000000000000000000000       10 DDDDDZ
VC_7     364886116489977000000000000000000000        1 FFFFFGK
VC_7     364886116489977000000000000000000000        2 FFFFFGK
VC_7     364886116489977000000000000000000000        3 FFFFFGK
VC_7     364886116489977000000000000000000000        4 FFFFFGK
{snip}
VC_7     364886116489977000000000000000000000        9 FFFFFGK
VC_7     364886116489977000000000000000000000       10 FFFFFGK
VC_16    375311434103976000000000000000000000        1 HHHHHI:
VC_16    375311434103976000000000000000000000        2 HHHHHI:
VC_16    375311434103976000000000000000000000        3 HHHHHI:
{snip}
VC_16    375311434103976000000000000000000000        9 HHHHHI:
VC_16    375311434103976000000000000000000000       10 HHHHHI:
VC_40    380524092910976000000000000000000000    1,000 IIIIIJM

For VC_5, there are 10 buckets, all with different endpoint_values.
The VC_7 has 10 buckets, but most have the same endpoint_value.
The VC_16 has 10 records in the HISTOGRAM table, all with the same endpoint_value.
VC_40 has only one record in the HISTOGRAM table.

Basically, if I am right, WHERE filters on a column holding values that are the same for the first seven or so characters will not be well supported by the histograms. We should expect the estimated row counts for such WHERE filters to be wrong.

I have skewed the data {with the update statement shown above} so that the histograms should show more expected values for DDDDDA than DDDDDS, for FFFFFFFA than FFFFFFFS, etc.

And now I’ll see how the CBO estimates the rows coming back for some WHERE clauses.

select count(*) from hist_test
where VC_5 = 'DDDDDA'
  COUNT(*)
----------
       139

Execution Plan
----------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |     7 |     8   (0)|
|   1 |  SORT AGGREGATE    |           |     1 |     7 |            |
|*  2 |   TABLE ACCESS FULL| HIST_TEST |   100 |   700 |     8   (0)|
----------------------------------------------------------

select count(*) from hist_test
where VC_5 = 'DDDDDS'
  COUNT(*)
----------
        35

Execution Plan
----------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |     7 |     8   (0)|
|   1 |  SORT AGGREGATE    |           |     1 |     7 |            |
|*  2 |   TABLE ACCESS FULL| HIST_TEST |    49 |   343 |     8   (0)|
---------------------------------------------------------------------

For column VC_5 it all works as we would expect: the CBO estimates 100 rows for DDDDDA and 49 for DDDDDS. I’ve shown the actual count(*)s as well, to show there is a real variation in the numbers. There are good reasons why the estimates do not match reality; I won’t go into them now, but this does highlight that histograms can help but do have flaws.

What about the other rows? We are only interested in the WHERE clause and the estimate for the number of rows from the table access, so I’ll show only those.

where VC_7 = 'FFFFFFFA'
|*  2 |   TABLE ACCESS FULL| HIST_TEST |   100 |   900 |     8   (0)|

where VC_7 = 'FFFFFFFS'
|*  2 |   TABLE ACCESS FULL| HIST_TEST |    49 |   441 |     8   (0)|

where VC_16 = 'HHHHHHHHHHHHHHHHA'
|*  2 |   TABLE ACCESS FULL| HIST_TEST |   100 |  7200 |     8   (0)|

where VC_16 = 'HHHHHHHHHHHHHHHHS'
|*  2 |   TABLE ACCESS FULL| HIST_TEST |    47 |   216 |     8   (0)|

where VC_40 = 'IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIA'
|*  2 |   TABLE ACCESS FULL| HIST_TEST |  1000 | 42000 |     8   (0)|

where VC_40 = 'IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIS'
|*  2 |   TABLE ACCESS FULL| HIST_TEST |  1000 | 42000 |     8   (0)|

{And just to show there is no typo in the string of 'I's above…}
where VC_40 = lpad('I',40,'I')||'S'
|*  2 |   TABLE ACCESS FULL| HIST_TEST |  1000 | 42000 |     8   (0)|

In the above you can see that the CBO estimates the costs for VC_7 exactly the same as it did for VC_5, despite the endpoint_values no longer distinguishing the buckets.

Even more telling is that for VC_16 the CBO is also able to predict there are more values ending in A than S.

As for VC_40, the CBO utterly fails to spot the skew. In fact it utterly fails to spot that there is more than one value in the column (look back at the section showing min and max values and number of distinct values), and there is only one histogram bucket, so I suppose it is no surprise.

How does the CBO still detect the skew for VC_7 and VC_16, even though the endpoint_values are the same?
Because it swaps to using the column ENDPOINT_ACTUAL_VALUE.

ENDPOINT_ACTUAL_VALUE is only populated for varchar2 columns when the precision of endpoint_value is exceeded:

colname  ROWCOUNT MOD_REAL          ENDPOINT_ACTUAL_VALUE
-------- -------- ----------------- ---------------------
VC_5           7 DDDDDR
VC_5           8 DDDDDV
VC_5           9 DDDDDY
VC_5          10 DDDDDZ
VC_6           0 EEEEEF8           EEEEEEA
VC_6           1 EEEEEF8           EEEEEEC
VC_6           2 EEEEEF8           EEEEEEF
VC_6           3 EEEEEF8           EEEEEEH
VC_6           4 EEEEEF8           EEEEEEK
VC_6           5 EEEEEF8           EEEEEEN
VC_6           6 EEEEEF8           EEEEEEP
VC_6           7 EEEEEFn           EEEEEES
VC_6           8 EEEEEFn           EEEEEEU
VC_6           9 EEEEEFn           EEEEEEX
VC_6          10 EEEEEFn           EEEEEEZ
VC_7           1 FFFFFGK           FFFFFFFA
VC_7           2 FFFFFGK           FFFFFFFC
VC_7           3 FFFFFGK           FFFFFFFF
VC_7           4 FFFFFGK           FFFFFFFI
VC_7           5 FFFFFGK           FFFFFFFK
VC_7           6 FFFFFGK           FFFFFFFO
VC_7           7 FFFFFGK           FFFFFFFR
VC_7           8 FFFFFGK           FFFFFFFT
VC_7           9 FFFFFGK           FFFFFFFW
VC_7          10 FFFFFGK           FFFFFFFZ
VC_16          1 HHHHHI:           HHHHHHHHHHHHHHHHA
VC_16          2 HHHHHI:           HHHHHHHHHHHHHHHHD
VC_16          3 HHHHHI:           HHHHHHHHHHHHHHHHG
VC_16          4 HHHHHI:           HHHHHHHHHHHHHHHHJ
VC_16          5 HHHHHI:           HHHHHHHHHHHHHHHHL
VC_16          6 HHHHHI:           HHHHHHHHHHHHHHHHO
VC_16          7 HHHHHI:           HHHHHHHHHHHHHHHHS
VC_16          8 HHHHHI:           HHHHHHHHHHHHHHHHU
VC_16          9 HHHHHI:           HHHHHHHHHHHHHHHHX
VC_16         10 HHHHHI:           HHHHHHHHHHHHHHHHZ
VC_40      1,000 IIIIIJM

You can see that the VC_5 column has no ENDPOINT_ACTUAL_VALUEs but VC_6, VC_7 and VC_16 do. VC_40 does not.
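{If you want to look at this on your own tables, both columns are right there in the dictionary views, no decoding function needed for the actual values – something like:}

select column_name,endpoint_number,endpoint_value,endpoint_actual_value
from user_tab_histograms
where table_name ='HIST_TEST'
order by column_name,endpoint_number
/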

So, at what point does the CBO stop storing values in ENDPOINT_ACTUAL_VALUE? 32 characters. I created another test table and these are the low/high values, number of distinct values and number of histogram buckets:

COL_NAME NUM_DIST LOW_V                HI_V                 N_BUCK  AVG_L
-------- -------- -------------------- -------------------- ------ ------
ID          1,000 1                    1000                     10      4
NUM_1          10 0                    9                        10      3
VC_1            6 AAAAA                FFFFF                     6      6
VC_30          26 AAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAA     10     32
VC_31          26 BBBBBBBBBBBBBBBBBBBB BBBBBBBBBBBBBBBBBBBB     10     33
VC_32           1 CCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCC      1     34
VC_33           1 DDDDDDDDDDDDDDDDDDDD DDDDDDDDDDDDDDDDDDDD      1     35
VC_34           1 EEEEEEEEEEEEEEEEEEEE EEEEEEEEEEEEEEEEEEEE      1     36
VC_35           1 FFFFFFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFFFFFF      1     37

The name of the column is the length of the fixed string, so the columns are actually one character longer, as can be seen by the “AVG_L” column.

And just to check {as after all this is just a nice theory until you test}, I skew the number of records ending in ‘A’ for VC_31 and VC_32:-

update hist_test2
set
 VC_31=lpad('B',31,'B')||'A'
,VC_32=lpad('C',32,'C')||'A'
where mod(id,10)=1

And see what the CBO thinks the rows to be identified are:-

where VC_31 = lpad('B',31,'B')||'A'
|* 2 | TABLE ACCESS FULL| HIST_TEST2 |  100 |  3300 |

where VC_31 = lpad('B',31,'B')||'S'
|* 2 | TABLE ACCESS FULL| HIST_TEST2 |   49 |  1617 |

where VC_32 = lpad('C',32,'C')||'A'
|* 2 | TABLE ACCESS FULL| HIST_TEST2 | 1000 | 34000 |

where VC_32 = lpad('C',32,'C')||'S'
|* 2 | TABLE ACCESS FULL| HIST_TEST2 | 1000 | 34000 |

Yes. The CBO can see the skew for the 32 characters of VC_31 and not for the 33 characters of VC_32.

So, in conclusion:

  • endpoint_value is only accurate for varchar2 columns up to around 7 characters.
  • If the varchar2 exceeds the precision of the endpoint_value column then endpoint_actual_value is used.
  • endpoint_actual_value is used for varchar2 columns up to 32 characters, and is not populated at all for values of less than about 7 characters (where endpoint_value suffices).
  • Histograms will not help with columns where many values share the same leading 32 or more characters.
  • If your column values share the same leading 32 characters or more, even the number of distinct values is not gathered by DBMS_STATS.
  • You can’t tell how SQL/the CBO will work from theory, you have to test. And even after that, consider your knowledge as open to correction :-)

Another Day, Another Obscure Oracle Error September 11, 2009

Posted by mwidlake in internals, performance.

I seem to be spending my life hitting odd bugs or errors in Oracle’s performance-related areas at the moment. I guess part of it is because I am using features not widely touched, part is that I’m not working as the SYS or SYSTEM user {in my experience a lot of people working on such things do it when connected as sysdba or at least as SYSTEM}, and part is that Larry Ellison personally hates me {he fills in for God when it comes to Oracle}.

I’m seeing this a lot today, and it seems virtually unknown:-

ERROR at line 1:
ORA-20003: Unable to gather system statistics
ORA-06512: at "SYS.DBMS_STATS", line 15882
ORA-06512: at line 1

I’m also seeing this, but this error is documented:-

ERROR at line 1:
ORA-20000: Unable to gather system statistics : insufficient privileges
ORA-06512: at "SYS.DBMS_STATS", line 15822
ORA-06512: at line 1

The rest of this post will explain my path to only partial enlightenment. If you have hit the above errors and found this page via Google, this might help you out and you might like to skip towards the end.
If you just read my blog, regard this as a very long “tweet” or whatever the hell they are. Again, feel free to skip to the end. Or just skip.

The task I’m doing which is causing me to hit these issues is that I’m trying to assure myself of the ability to role back gathering system statistics before getting the DBA team to do it on a production system. {I am not in the DBA team on this one}.

No system statistics have been gathered yet, so I have the default set; I’m sure many of you will recognise them:

SNAME           PNAME                     PVAL1 PVAL2
---------------------------------------------------
SYSSTATS_MAIN   CPUSPEED
SYSSTATS_MAIN   CPUSPEEDNW         1,567.798516
SYSSTATS_MAIN   IOSEEKTIM             10.000000
SYSSTATS_MAIN   IOTFRSPEED         4,096.000000
SYSSTATS_MAIN   MAXTHR
SYSSTATS_MAIN   MBRC
SYSSTATS_MAIN   MREADTIM
SYSSTATS_MAIN   SLAVETHR
SYSSTATS_MAIN   SREADTIM
SYSSTATS_INFO   DSTART                          07-25-2006 12:39
SYSSTATS_INFO   DSTOP                           07-25-2006 12:39
SYSSTATS_INFO   FLAGS                  1.000000
SYSSTATS_INFO   STATUS                          COMPLETED

I decided to pull the current system statistics into a statistics table:-

exec dbms_stats.create_stat_table('DWPERF','AD_BF_STATS','DW_COMMON_MG')

EXEC DBMS_STATS.EXPORT_SYSTEM_STATS -
('AD_BF_STATS','SYSSTAT'||TO_CHAR(SYSDATE,'YYMMDDHH24MI'),'DWPERF')

SELECT * FROM AD_BF_STATS;
SYSSTAT0909101309 S 4 1 COMPLETED
07-25-2006 12:39 07-25-2006 12:39 CPU_SERIO

SYSSTAT0909101309 S 4 1
PARIO
10 4096 1567.79852

 
Two records are created: the first is the header record describing what this statid is all about, and the second contains the stored info {note the 10, 4096 and 1567.79852 – the default values seen above}.

All well and good, I can save my stats. Why do I want to? Because we may gather system statistics on an infrequent but regular basis, and I want a record of them. The automatically stored history only goes back a month {by default}.
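{As an aside, you can check what the history retention is set to, and how far back restorable stats actually go, with a couple of DBMS_STATS functions:}

select dbms_stats.get_stats_history_retention from dual;
select dbms_stats.get_stats_history_availability from dual;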

Of course, there is no need for me to explicitly export the stats to the table; it can be done as part of the gathering process, by stating the statistics table, owner and an ID when I do the gather. So I did:-

exec dbms_stats.gather_system_stats(gathering_mode => 'INTERVAL',interval => 15, -
stattab=> 'AD_BF_STATS',statown=> 'DWPERF',statid=>'SYSSTAT'||TO_CHAR(SYSDATE,'YYMMDDHH24MI'))

I went off for a cup of coffee, came back, checked my system stats.

SNAME           PNAME                     PVAL1 PVAL2
---------------------------------------------------
SYSSTATS_MAIN   CPUSPEED
SYSSTATS_MAIN   CPUSPEEDNW         1,567.798516
SYSSTATS_MAIN   IOSEEKTIM             10.000000
SYSSTATS_MAIN   IOTFRSPEED         4,096.000000
SYSSTATS_MAIN   MAXTHR
SYSSTATS_MAIN   MBRC
SYSSTATS_MAIN   MREADTIM
SYSSTATS_MAIN   SLAVETHR
SYSSTATS_MAIN   SREADTIM
SYSSTATS_INFO   DSTART                          07-25-2006 12:39
SYSSTATS_INFO   DSTOP                           07-25-2006 12:39
SYSSTATS_INFO   FLAGS                  1.000000
SYSSTATS_INFO   STATUS                          COMPLETED

Oh. No change. No system stats gathered. Damn.

Well, it’s a test box; there may be too little workload to detect anything – like none! I checked the SGA for recent code and the only SQL was that for recording the stats gathering :-).
So, repeated with me running some workload in another window for 15 minutes.

Still no change, still no system stats. Double Damn!

I remembered from reading around that if MREADTIM is less than SREADTIM the stats gathered could be ignored, so I tried again.
And again.
And then decided it would not magically work, there was a problem.

I’ll gather the system stats with START and STOP, and really hammer the box for a while, checking with autotrace that work is being done:

exec dbms_stats.gather_system_stats(gathering_mode => 'START',stattab=>'AD_BF_STATS', -
statown=> 'DWPERF',statid=>'SYSSTAT'||TO_CHAR(SYSDATE,'YYMMDDHH24MI'))

work-work-work-work-work

exec dbms_stats.gather_system_stats(gathering_mode => 'STOP');
BEGIN dbms_stats.gather_system_stats(gathering_mode => 'STOP'); END;
*
ERROR at line 1:
ORA-20003: Unable to gather system statistics
ORA-06512: at "SYS.DBMS_STATS", line 15882
ORA-06512: at line 1

Oh.

Oh hell. Well, at least I now know that the chances are my interval system stats collection failed with this, but silently.

I bet if I dug around I would find a few failed jobs recorded where the INTERVAL version of the stats collection failed with this. I have not checked, I am just up to my eyes right now.

I found nothing at all via Google or Metalink about this exact error for gathering system stats. BUT maybe I lacked the correct privileges on my non-SYS account. So I granted ANALYZE ANY DICTIONARY and ANALYZE ANY to my user, as those are the privileges mentioned for many of the DBMS_STATS management procedures.

Nope, did not help.

Blow it. If DBMS_STATS won’t let me keep the stats in my table, I’ll just rely on the default store of them, and not state my stats table in the call:

DWD2_DWPERF> exec dbms_stats.gather_system_stats(gathering_mode => 'START')
BEGIN dbms_stats.gather_system_stats(gathering_mode => 'START'); END;
*
ERROR at line 1:
ORA-20000: Unable to gather system statistics : insufficient privileges
ORA-06512: at "SYS.DBMS_STATS", line 15822
ORA-06512: at line 1

Oh. Well, that’s interesting.
If I don’t state a stats table the gather fails when initiated.
If I state a stats table the gather fails when I stop.
How frustrating.

I decided to give up. Then I decided I won’t be beaten and to grant myself DBA role and see if it works.

DWDEV2> exec dbms_stats.gather_system_stats(gathering_mode => 'START')
PL/SQL procedure successfully completed.

YEEEESSSSSS!!!

But will it now error when I stop the gather…

DWDEV2> exec dbms_stats.gather_system_stats(gathering_mode => 'STOP')
PL/SQL procedure successfully completed.

Deep Joy.

And it gathered stats.

SNAME           PNAME                     PVAL1 PVAL2
------------------------------------------
SYSSTATS_MAIN   CPUSPEED           1,545.000000
SYSSTATS_MAIN   CPUSPEEDNW         1,567.798516
SYSSTATS_MAIN   IOSEEKTIM             10.000000
SYSSTATS_MAIN   IOTFRSPEED         4,096.000000
SYSSTATS_MAIN   MAXTHR
SYSSTATS_MAIN   MBRC
SYSSTATS_MAIN   MREADTIM
SYSSTATS_MAIN   SLAVETHR
SYSSTATS_MAIN   SREADTIM               5.000000
SYSSTATS_INFO   DSTART                          09-11-2009 13:02
SYSSTATS_INFO   DSTOP                           09-11-2009 13:02
SYSSTATS_INFO   FLAGS                  1.000000
SYSSTATS_INFO   STATUS                          COMPLETED

{ignore the lack of mreadtim and sreadtim, nothing happened on the box in the few seconds I was collecting for}

SO, the lack of the DBA role was a problem, but it had been hidden by my attempts to use a stats table. Now surely I can do what I wanted to and store the stats in my stats table…

exec dbms_stats.gather_system_stats(gathering_mode => 'START',stattab=>'AD_BF_STATS', -
statown=> 'DWPERF',statid=>'SYSSTAT'||TO_CHAR(SYSDATE,'YYMMDDHH24MI'))

PL/SQL procedure successfully completed.

Wehay.
work-work-work-work-work

exec dbms_stats.gather_system_stats(gathering_mode => 'STOP')
BEGIN dbms_stats.gather_system_stats(gathering_mode => 'STOP'); END;
*
ERROR at line 1:
ORA-20003: Unable to gather system statistics
ORA-06512: at "SYS.DBMS_STATS", line 15882
ORA-06512: at line 1

AAAAAGGGGHHHHHH!!!!!!!!!

Let’s check back.

exec dbms_stats.gather_system_stats(gathering_mode => 'START')
PL/SQL procedure successfully completed.

work-work-work-work-work

exec dbms_stats.gather_system_stats(gathering_mode => 'STOP')
PL/SQL procedure successfully completed.

That works, because I am not involving a stats table. If you look back at the two errors, they come from different (but very similar) line numbers. I’d missed that for a couple of hours; I blame dyslexia.

Conclusion:

Gathering and storing system stats is as buggy and random as the rest of the DBMS_STATS functionality {yes, I am peeved about this}. It seems…

  • If you gather system stats with the interval method, it can silently fail. This might only be if you involve a stats table.
  • You need the DBA role to gather system stats, otherwise you get ORA-20000, via ORA-06512 at line 15822.
  • ANALYZE ANY DICTIONARY and ANALYZE ANY are not enough to allow a non-DBA account to gather system stats.
  • If you try and store the previous system stats in your stats table as part of the gather you get ORA-20003, via ORA-06512 at line 15882 {different line}, even if you have the DBA role.
  • If you just rely on Oracle preserving the previous version of the stats, you can gather system stats fine.
  • You can work around the issue by exporting the system stats to your stats table as a single step first, then gathering system stats without involving your stats table – see the sketch after this list.
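
A minimal sketch of that workaround, reusing the stats table from my examples {dbms_stats.create_stat_table is only needed once; the statid is just a label}:

-- one-off: create the stats table if you do not already have one
exec dbms_stats.create_stat_table(ownname=>'DWPERF',stattab=>'AD_BF_STATS')

-- step 1: preserve the current system stats yourself
exec dbms_stats.export_system_stats(stattab=>'AD_BF_STATS',statown=>'DWPERF', -
     statid=>'SYSSTAT'||TO_CHAR(SYSDATE,'YYMMDDHH24MI'))

-- step 2: gather with START/STOP, no stattab involved
exec dbms_stats.gather_system_stats(gathering_mode => 'START')
-- work-work-work-work-work
exec dbms_stats.gather_system_stats(gathering_mode => 'STOP')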

There is no Metalink note for the ORA-20003 via line 15882 and no Google hit for it, so I reckon it is a damned rare event. And since the last time I blogged about a damned rare error, I see 1 or 2 hits on that blog page every day since :-).

Decrypting Histogram Data August 11, 2009

Posted by mwidlake in internals, performance.
Tags: , , ,
5 comments

{part two is here}

I’m looking at Histograms at the moment and I’ve hit a problem I always hit when looking at histograms. So I’ve decided once and for all to see if I can find a solution.

The value held in DBA_TAB_HISTOGRAMS.ENDPOINT_VALUE for VARCHAR2 columns is a number with no apparent link to reality.

{Before anyone who has read my previous posts suggests it, I’ve already looked at the view and the relevant column resolves to data in either sys.histgrm$ or sys.hist_head$ and the underlying values held are still numbers and not processed in the view, so they are numbers. I think}.
{{I have a sneaking suspicion this could turn into a set of increasingly obscure postings…}}
{{{THIS one might be my most obscure so far!}}}

Let’s have a look:

select --owner
 table_name
,column_name
,endpoint_value  end_val
,endpoint_number rowcount
from all_tab_histograms
where table_name ='TEST_P'
and  owner='WIDLAKEM'
order by table_name,column_name,endpoint_number
/
TABLE_NAME      COLUMN_NAME        END_VAL        ROWCOUNT
--------------- --------------- ---------- ---------------
TEST_P          STATUS                   0          41,523
TEST_P          STATUS                   1          42,065
TEST_P          STATUS                   2          42,099
TEST_P          VC_1            3.3882E+35               0
TEST_P          VC_1            3.4951E+35               1
TEST_P          VC_1            3.6487E+35               2
TEST_P          VC_1            3.7558E+35               3
TEST_P          VC_1            3.9095E+35               4
TEST_P          VC_1            4.0162E+35               5
TEST_P          VC_1            4.1697E+35               6
TEST_P          VC_1            4.3235E+35               7
TEST_P          VC_1            4.4305E+35               8
TEST_P          VC_1            4.5841E+35               9
TEST_P          VC_1            4.6914E+35              10

So what in heck is 4.6914E+35 {last value listed}?

Well, it is a number of course. If I set my column format appropriately I can see this:-

col end_val form 999,999,999,999,999,999,999,999,999,999,999,999
col column_name form a10
select column_name
,endpoint_value  end_val
,endpoint_number rowcount
from all_tab_histograms
where table_name ='TEST_P'
and  owner='WIDLAKEM'
and column_name='VC_1'
order by table_name,column_name,endpoint_number
/
COLUMN_NAM                                          END_VAL   ROWCOUNT
---------- ------------------------------------------------ ----------
VC_1        338,822,823,410,931,000,000,000,000,000,000,000          0
VC_1        349,512,448,932,628,000,000,000,000,000,000,000          1
VC_1        364,867,102,368,954,000,000,000,000,000,000,000          2
VC_1        375,575,265,424,979,000,000,000,000,000,000,000          3
VC_1        390,949,334,699,583,000,000,000,000,000,000,000          4
VC_1        401,618,352,253,998,000,000,000,000,000,000,000          5
VC_1        416,972,612,321,579,000,000,000,000,000,000,000          6
VC_1        432,345,727,450,272,000,000,000,000,000,000,000          7
VC_1        443,054,364,035,286,000,000,000,000,000,000,000          8
VC_1        458,409,658,691,912,000,000,000,000,000,000,000          9
VC_1        469,139,289,515,351,000,000,000,000,000,000,000         10

The ENDPOINT_VALUE is an encoded representation of the varchar2 value, based on the trick of turning each character into its ascii equivalent (0-255) and then multiplying it by 256 raised to the power of how many characters in from the end of the string it sits, and adding it all together. Let’s use "Cat" as an example.

  • Last character is ‘t’, ascii value is 116 = 116;
  • Next-to-last character is ‘a’, ascii value 97, *(256) = 24832;
  • Chr 3 in is ‘C’, ascii value 67, *(256*256) = 4390912;
  • Total is 4390912+24832+116 = 4415860 {checked below}.
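
You can sanity-check that arithmetic straight from SQL {nothing Oracle-internal here, just the sum above}:

select ascii('C')*power(256,2)
      +ascii('a')*power(256,1)
      +ascii('t')              cat_encoded
from dual;
-- returns 4415860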

What Oracle actually does is slightly different {and I could have this slightly wrong, my brain waved a white flag a couple of hours back}: it takes the first 15 characters and multiplies the first character’s ascii value by power(256,14), the second character’s ascii value by power(256,13), and so on, until it runs out of characters or gets to character 15.

How do I know this? I decoded the ‘C’ in the kernel :-). No, I didn’t, I found a link to an encode script Jonathan Lewis had written {it’s also in his latest book, or at least in the set of scripts the book tells you how to download}. I’ve lost the original link I found, and thus a nice little article on the whole topic, but this link will do until I can re-find the original. {NB this referenced article mentions using the hexstr function which may be neater but, as I said earlier, my brain has given up}.

I’ve outrageously stolen Jonathan’s script which encodes a varchar2 string into the relevant numeric and used it as a template to create a decode version too. Just a couple of notes:

- I can imagine this might not work if you have multibyte characters.
- Oracle confuses things by doing something like translating the 15 characters into a 36-digit number – and then keeping only the first 15 significant digits. This kind of messes up the untranslate.

So, I create three functions:

  • hist_chartonum, which converts a varchar2 into something very similar to what is stored in the histogram views. Jonathan will find it incredibly familiar. An optional second parameter turns off the truncation that Oracle does for the histogram data, so you can get the full glorious 36 significant digits for 15 characters if you wish.
  • hist_numtochar, which converts the numeric back to a string. It gets it slightly wrong due to the truncation of the least significant 20 or so digits. Again, an optional second parameter allows it to skip that truncation and work with all 15 characters.
  • hist_numtochar2, which is an attempt to allow for the truncation errors. I add power(256,9) to the value if it has been truncated. It seems to make eg the numeric representation of AAAAA translate back to AAAAA not AAAA@. Yes, it’s a fudge.

This is the script:-

-- cre_hist_funcs.sql
-- Heavily borrowed from Jonathan Lewis, sep 2003
-- MDW 11/8/09 - all mistakes mine.
-- hist_chartonum converts a string to a number in a very similar way
-- to how Oracle does so for storing in column ENDPOINT_VALUE in XXX_TAB_HISTOGRAMS
-- Optionally get it to not truncate the value in the same way to only 15 digits.
-- hist_numtochar converts the ENDPOINT_VALUE back to human-readable format.
-- Optionally get it to go beyond 15 numerics (7 or 8 characters)
-- hist_numtochar2 attempts to allow for truncation errors.
-- JLs version has been checked against 8.1.7.4, 9.2.0.4 and 10.1.0.2
-- I've used on 10.2.0.3
set timing off
create or replace function hist_chartonum(p_vc varchar2
                                         ,p_trunc varchar2 :='Y') return number
is
m_vc varchar2(15) := substr(rpad(p_vc,15,chr(0)),1,15);
m_n number := 0;
begin
  for i in 1..15 loop
/*  dbms_output.put(ascii(substr(m_vc,i,1)));
    dbms_output.put(chr(9));
    dbms_output.put_Line(to_char( power(256,15-i) * ascii(substr(m_vc,i,1)),
                                '999,999,999,999,999,999,999,999,999,999,999,999'
                                 )
                        ); */
    m_n := m_n + power(256,15-i) * ascii(substr(m_vc,i,1));
  end loop;
-- this converts it from a 36 digit number to the 15-digit number used
-- in the ENDPOINT_VALUE Column.
  If p_trunc = 'Y' then
    m_n := round(m_n, -21);
  end if;
-- dbms_output.put_line(to_char(m_n,'999,999,999,999,999,999,999,999,999,999,999,999'));
return m_n;
end;
/

create or replace function hist_numtochar(p_num number
                                         ,p_trunc varchar2 :='Y') return varchar2
is
  m_vc varchar2(15);
  m_n number :=0;
  m_n1 number;
  m_loop number :=7;
begin
m_n :=p_num;
if length(to_char(m_n))<36 then
--dbms_output.put_line ('input too short');
  m_vc:='num format err';
else
  if p_trunc !='Y' then
    m_loop :=15;
  end if;
--dbms_output.put_line(to_char(m_N,'999,999,999,999,999,999,999,999,999,999,999,999'));
  for i in 1..m_loop loop
    m_n1:=trunc(m_n/(power(256,15-i)));
--    dbms_output.put_line(to_char(m_n1));
    if m_n1!=0 then m_vc:=m_vc||chr(m_n1);
    end if;
    dbms_output.put_line(m_vc);
    m_n:=m_n-(m_n1*power(256,15-i));  
  end loop;
end if;
return m_vc;
end;
/
create or replace function hist_numtochar2(p_num number
                                         ,p_trunc varchar2 :='Y') return varchar2
is
  m_vc varchar2(15);
  m_n number :=0;
  m_n1 number;
  m_loop number :=7;
begin
m_n :=p_num;
if length(to_char(m_n))<36 then
--dbms_output.put_line ('input too short');
  m_vc:='num format err';
else
  if p_trunc !='Y' then
    m_loop :=15;
  else
   m_n:=m_n+power(256,9);
  end if;
--dbms_output.put_line(to_char(m_N,'999,999,999,999,999,999,999,999,999,999,999,999'));
  for i in 1..m_loop loop
    m_n1:=trunc(m_n/(power(256,15-i)));
--    dbms_output.put_line(to_char(m_n1));
    if m_n1!=0 then m_vc:=m_vc||chr(m_n1);
    end if;
    dbms_output.put_line(m_vc);
    m_n:=m_n-(m_n1*power(256,15-i));  
  end loop;
end if;
return m_vc;
end;
/
rem
rem Sample of use:
rem 

col id1 format 999,999,999,999,999,999,999,999,999,999,999,999
col id3 format 999,999,999,999,999,999,999,999,999,999,999,999
select
  'short'  text
 ,hist_chartonum('short') id1
 ,hist_numtochar(hist_chartonum('short')) id2
 ,hist_chartonum('short','N') id3
 ,hist_numtochar(hist_chartonum('short','N'),'N') id4      
from dual;
select
  'alongteststring'  text
 ,hist_chartonum('alongteststring') id1
 ,hist_numtochar(hist_chartonum('alongteststring')) id2
 ,hist_chartonum('alongteststring','N') id3
 ,hist_numtochar(hist_chartonum('alongteststring','N'),'N') id4      
from dual;
select
  'amuchlongerteststring'  text
 ,hist_chartonum('amuchlongerteststring') id1
 ,hist_numtochar(hist_chartonum('amuchlongerteststring')) id2
 ,hist_chartonum('amuchlongerteststring','N') id3
 ,hist_numtochar(hist_chartonum('amuchlongerteststring','N'),'N') id4      
from dual;
spool off

The test at the end produces the following output:-

strng  histgrm_translation
------------------------------------------------
translate_back
------------------------------------------------
untrimmed_translation
------------------------------------------------
untrimmed_translate_back
------------------------------------------------
short  599,232,339,077,851,000,000,000,000,000,000,000
short
 599,232,339,077,850,549,055,205,294,263,500,800
short

alongteststring  505,852,124,009,532,000,000,000,000,000,000,000
alongtm
 505,852,124,009,531,849,435,547,374,314,942,055
alongteststring

amuchlongerteststring  505,872,878,384,947,000,000,000,000,000,000,000
amuchly
 505,872,878,384,946,809,648,430,989,359,543,156
amuchlongertest

Final comment. Oracle trims the value stored in sys.histgrm$ and knackers up the untranslate. Why? Well, this is only me thinking out loud, but this reduces the number from 36 to 15 significant digits, and numerics are stored as one byte per two digits plus an exponent byte {am I missing a length byte?}. So 10 bytes or so. This will keep the size of the data stored down. I mean, as I posted here, the sys.histgrm$ table gets big enough as it is!
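
You can see the byte count for yourself with DUMP, using the last END_VAL from my output above {a quick check – the Len in the result is the total bytes the number occupies}:

select dump(469139289515351000000000000000000000) nmbr_dump
from dual;
-- 15 significant digits pack into 8 mantissa bytes (two digits per byte)
-- plus the exponent byte, so this should show Len=9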

Tomorrow {or whenever my brain recovers} I’ll show how I use it and one or two oddities I have found or come across from other people’s blogs.

Why is my SYSTEM Tablespace so Big? August 3, 2009

Posted by mwidlake in internals, VLDB.
Tags: , ,
25 comments

How big is your SYSTEM tablespace? 512M? 1GB? 4GB? Bigger? Is it bigger than the whole of the first critical database you ever looked after? Why has the SYSTEM tablespace got so big?

You can easily check the size of the SYSTEM tablespace with any Oracle development/admin GUI or with this script.
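
The script itself is behind the link, but a minimal equivalent would be something like this {a sketch, with the column aliases inferred from the output below}:

-- dbf_lst.sql {minimal sketch}
col file_name form a45
select file_name
,file_id                f_id
,tablespace_name        ts_name
,bytes/(1024*1024)      size_m
,blocks/1000            thou_bl
from dba_data_files
where tablespace_name like upper('&ts_name'||'%')
order by tablespace_name,file_id
/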

@dbf_lst
Enter value for ts_name: system
old   8: where tablespace_name like upper('&ts_name'||'%')
new   8: where tablespace_name like upper('system'||'%')
continue> 

FILE_NAME                                  F_ID  TS_NAME         SIZE_M   THOU_BL
----------------------------------------- ----- ---------------- -------- -------
+DISCG/sid/datafile/system.272.596722817      1 SYSTEM             24,920   3,190

That is 24GB. It is not the biggest one I have seen, but it is up there in the top 3.

Not so many versions ago, it was not unusual to have a SYSTEM tablespace of less than 128MB in size. In fact, if you did not use stored PL/SQL this was probably the case up to and through 8i and into 9i. So – it’s PL/SQL causing the bloat? No, probably not. Ahhh, what about all that AWR automatic real-time monitoring gubbins you hear so much about? Nope, that goes into SYSAUX.

No, it’s probably down to one or both of two internal tables, SYS.AUD$ and SYS.HISTGRM$.

Let’s check out what are the biggest objects in this particular SYSTEM tablespace:

select owner,segment_name,segment_type
 ,bytes/(1024*1024) size_m
 from dba_segments
 where tablespace_name = 'SYSTEM'
 and    bytes/(1024*1024) >1
 order by size_m desc

OWNER        SEGMENT_NAME         SEGMENT_TYPE         SIZE_M
------------ -------------------- ------------------ --------
SYS          C_OBJ#_INTCOL#       CLUSTER              13,313
SYS          AUD$                 TABLE                 5,688
SYS          I_H_OBJ#_COL#        INDEX                 1,863
SYS          I_AUD1               INDEX                 1,606
SYS          HIST_HEAD$           TABLE                   311
SYS          SOURCE$              TABLE                   224
SYS          IDL_UB1$             TABLE                   224
SYS          C_FILE#_BLOCK#       CLUSTER                 208
SYS          INDPART$             TABLE                   160
SYS          OBJ$                 TABLE                   144
SYS          I_HH_OBJ#_COL#       INDEX                   128
SYS          I_HH_OBJ#_INTCOL#    INDEX                   128
SYS          I_OBJ2               INDEX                    80
SYS          I_FILE#_BLOCK#       INDEX                    62
SYS          TABPART$             TABLE                    57
SYS          I_OBJ1               INDEX                    49
{snip}

To be a little obtuse, I’ll start with the second largest object, SYS.AUD$, which you can see is getting close to 6GB in size {Oh, and SYS.I_AUD1, the fourth largest object, is the index on AUD$}.

AUD$ is the underlying table that holds all of the system auditing information that is turned on and off with the SQL AUDIT command. You can use AUDIT to monitor particular users, changes to system privileges, access on key tables or particular activity like table changes. Go check the SQL manual if you want to know more.
The records for this captured activity all go into this table. On this particular system a few key privileges and logons by certain users are the only things being audited, but the data has built up over 3 or 4 years.

On version 9 and 10 the options for managing this table are…poor. Basically, you have to do it yourself. I think I am right in saying that it is the only SYS-owned table that Oracle Corp say it is OK for you to move into another tablespace yourself and that you are allowed to delete and truncate data from. If you AUDIT a lot of things or activities you probably need to delete some data. {If you AUDIT a lot of things or activities and you don’t delete any of the data, you probably did not turn on AUDIT very long ago or have not checked the size of your SYSTEM tablespace for a while…}.

Oh, and one final thing on the AUD$ table. It is lacking indexes – it has only one {ignoring LOB indexes}. This is a good thing, as indexes slow down inserts quite significantly {I use a rule of thumb that if it takes N ms to insert a record into a table, it will take 2.2*N ms if there is an index on the table, 3.4*N ms if there are two indexes, 4.6*N ms if there are three… Get the idea? However, I have not checked this for real in years. Treat this statement as a potential myth}. I would recommend you do not add further indexes to AUD$. If you need to investigate the information in there, I would strongly suggest you extract what you want into a new table with a CTAS statement {see the sketch below}, then index that copy table and do your work on that. You do not want to slow down further everything that is already slowed down a little by being audited.
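
Something along these lines {a sketch – the copy table is made up and NTIMESTAMP# is the 10g timestamp column on AUD$; column names vary between versions}:

-- pull just the audit records of interest into your own table...
create table my_aud_copy as
select * from sys.aud$
where  ntimestamp# > sysdate - 7;   -- eg last week only

-- ...then index the copy, not AUD$ itself
create index my_aud_copy_i1 on my_aud_copy (userid, ntimestamp#);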

What about the largest thing in the SYSTEM tablespace? It’s SYS.C_OBJ#_INTCOL# and it is over 13GB in size. I said it would be SYS.HISTGRM$ didn’t I?

Well, SYS.C_OBJ#_INTCOL# is a cluster. Clusters are structures, segments, that hold one or more tables. All records for the same key across all those tables are held together, in the same block or blocks. The key is either a normal index or a hash. The idea is that it speeds up selecting records from different tables with the same key – they are all in the same block or blocks. It does work too. They seem to have fallen out of favour with Oracle designers, but are worth considering.

In this particular cluster there is only one table, SYS.HISTGRM$:

select owner,table_name
from dba_tables
where cluster_name='C_OBJ#_INTCOL#'

OWNER                          TABLE_NAME
------------------------------ ------------------------
SYS                            HISTGRM$
1 row selected.

So, SYS.HISTGRM$ is the only table in the cluster. It {and its cluster index, on columns OBJ# and INTCOL#} are hiding in this cluster segment. It IS the biggest thing in this particular SYSTEM tablespace. The only index, other than the cluster index, on the table SYS.HISTGRM$ is I_H_OBJ#_COL#, and that is the third largest object in the SYSTEM tablespace.

--ind_cols
select
INDEX_OWNER                 ind_owner
,INDEX_NAME                  ind_Name
,TABLE_NAME                  tab_Name
,substr(column_position,1,3) Psn
,COLUMN_NAME                 Col_Name
from dba_ind_columns
where table_name like upper(nvl('&tab_name','WHOOPS')||'%')
order by 3,1,2,4,5
--eof
@ind_cols
Enter value for tab_name: histgrm$

IND_OWNER   IND_NAME           TAB_NAME           PSN       COL_NAME
----------- ------------------ ------------------ --------- --------------
SYS         I_H_OBJ#_COL#      HISTGRM$           1         OBJ#
                                                  2         COL#
2 rows selected.
Elapsed: 00:00:02.07

-- you find the cluster index by looking on DBA_INDEXES and
--DBA_IND_COLUMNS where the "TABLE_NAME" is the name of the cluster
@ind_cols
Enter value for tab_name: C_OBJ#_INTCOL#

IND_OWNER   IND_NAME           TAB_NAME           PSN       COL_NAME
----------- ------------------ ------------------ --------- --------------
SYS         I_OBJ#_INTCOL#     C_OBJ#_INTCOL#     1         OBJ#
                                                  2         INTCOL#
2 rows selected.
Elapsed: 00:00:00.93

What is HISTGRM$? Do you think it looks like it could be something to do with column Histograms? Well it is, it’s the underlying data dictionary table that holds column histograms. It started to become common to see a large HISTGRM$ table with Oracle 9 systems with many partitions {and of course subpartitions, if used}, where statistics were gathered on these partitioned tables aggressively and included a METHOD_OPT clause that causes column statistics to be collected. eg FOR ALL INDEXED COLUMNS SIZE 200 or FOR ALL COLUMNS SIZE 100.

Where most people have seen it balloon in size though is with Oracle 10 and the introduction of…the Automated Stats gathering job {so, there is a connection to previous recent posts :-) }.
Every night, every weekend, any partition or sub-partition that is 10% or more different to the last time stats were gathered gets stats gathered. The automatic job gathers stats with a METHOD_OPT of AUTO, which means your Oracle software decides which columns to gather histogram stats on and how many buckets it collects.

Whoever wrote that part of the stats gathering feature really liked to gather information.

If you have lots of partitions {or sub-partitions} and gather stats with the METHOD_OPT=AUTO, the two factors added together result in a massive HISTGRM$ table and thus a massive cluster.

Is space important? No, of course not, disc acreage is cheap, just buy bigger discs ;-). {That, of course, was a highly questionable suggestion}.
Being serious, I’m personally not too concerned about my SYSTEM tablespace being that big; after all, compared to the size of databases that use thousands of partitions, 32GB is usually not a major concern. But I know of people who did not expect their SYSTEM tablespace to get that big and have had issues with simply running out of the storage available to them.

I am more concerned about the impact on parse times and general data dictionary performance of such large objects. I’ve found the data dictionary can be very slow for some queries when there are large numbers of extents, objects, columns… Histograms are just in there with the mix. I’ve had to work around such slow data dictionary access several times in the past and I suspect {though have not had time to in any way support this with evidence} that parsing is slowed down by unnecessary histograms. As an example of this, which actually has nothing to do with SYS.HISTGRM$, if you look back up this post at the code listing the indexes on the cluster and table, the code took between one and two seconds to run. Normally it takes less than 1/10 of a second.

Can you do anything about the size of these objects? Well, with SYS.AUD$ Oracle allow you to delete from it and move the table to a new tablespace. If you use the SQL AUDIT command, you should include housekeeping the SYS.AUD$ table in your overall implementation.
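
The sort of housekeeping I mean {a hedged sketch – the 90-day retention, the AUDIT_TS tablespace and the 10g NTIMESTAMP# column are all my assumptions; check the relevant support notes for your version before moving AUD$}:

-- trim old audit data, here keeping 90 days
delete from sys.aud$ where ntimestamp# < sysdate - 90;
commit;
-- or, if the data is disposable: truncate table sys.aud$;

-- relocate AUD$ out of SYSTEM; its index needs a rebuild afterwards
alter table sys.aud$ move tablespace audit_ts;
alter index sys.i_aud1 rebuild tablespace audit_ts;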

SYS.C_OBJ#_INTCOL# and the SYS.HISTGRM$ within it are slightly trickier. It is not easy to reduce the space used by these objects, and thus the SYSTEM tablespace, once they have grown large, but you can stop them getting bigger by collecting fewer column histogram statistics on your partitioned tables. Obviously that is not as simple as it sounds, but many sites with large numbers of objects do turn off the automated stats gathering and run their own code, as sketched below.
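
The gist of such code {a sketch – the owner and table names are illustrative; SIZE 1 means no column histograms, which is what stops HISTGRM$ growing}:

begin
  dbms_stats.gather_table_stats(
     ownname     => 'MY_OWNER'         -- illustrative
    ,tabname     => 'BIG_PART_TABLE'   -- illustrative
    ,granularity => 'ALL'
    ,method_opt  => 'FOR ALL COLUMNS SIZE 1'
    ,cascade     => true);
end;
/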

Automatic Statistics Gathering Fails #3 July 29, 2009

Posted by mwidlake in internals, performance.
Tags: , ,
7 comments

I’m stealing a few minutes from my lunch time to finish off this thread on the automated statistics gathering failing.
<< first post  << second post

There is another way to spot what the automated stats gathering choked on, though it is not that reliable. It does, however, involve digging a little into Oracle’s internals, which is fun.

From 10.2 onwards, when you gather stats on a segment, either directly or via a schema, database or the automatic stats collection, Oracle stores the stats that are replaced for you automatically {I say 10.2 as I am not sure that segment-level gather_table/index_stats calls are automatically backed up like this. I have some old notes saying it seemed not, but I am not sure if this was 9i, 10.1 or me just getting confused at the time. If anyone knows or can check back, let me know :-) }

This means you can get the stats back using the dbms_stats.restore_xxxxxxxx_stats procedures, eg dbms_stats.restore_table_stats(ownname=>user,tabname=>'TEST_P',as_of_timestamp=>systimestamp-1);
This will restore the stats of my table TEST_P to what they were at this time yesterday. You do not need to create a stattab table and store the previous stats manually.
I’m not going to blog any more right now about this handy feature – the “PL/SQL Packages and Types” manual will tell you what you need – but I will comment that by default you can only go back 31 days.
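
You can check and alter that window if you wish {a quick sketch}:

select dbms_stats.get_stats_history_retention    from dual;  -- days kept, default 31
select dbms_stats.get_stats_history_availability from dual;  -- oldest restorable time
exec dbms_stats.alter_stats_history_retention(60)            -- keep 60 days instead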

Oracle gives you a view to see the history of stats stored, DBA_TAB_STATS_HISTORY

desc dba_tab_stats_history
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OWNER                                              VARCHAR2(30)
 TABLE_NAME                                         VARCHAR2(30)
 PARTITION_NAME                                     VARCHAR2(30)
 SUBPARTITION_NAME                                  VARCHAR2(30)
 STATS_UPDATE_TIME                                  TIMESTAMP(6) WITH TIME ZONE

select owner,table_name, partition_name,stats_update_time
from dba_tab_stats_history
where table_name='TEST_P'  and owner=user

OWNER    TABLE_NAME PARTITION_NAME  STATS_UPDATE_TIME
-------- ---------- --------------- --------------------------------
WIDLAKEM TEST_P                     28-JUL-09 13.59.43.647785 +00:00
WIDLAKEM TEST_P     ID_10K          28-JUL-09 13.59.43.647785 +00:00
WIDLAKEM TEST_P     ID_30K          28-JUL-09 13.59.43.647785 +00:00
WIDLAKEM TEST_P     ID_MAX          28-JUL-09 13.59.43.647785 +00:00
WIDLAKEM TEST_P     ID_40K          28-JUL-09 13.59.43.647785 +00:00
WIDLAKEM TEST_P     ID_20K          28-JUL-09 13.59.43.647785 +00:00
WIDLAKEM TEST_P                     28-JUL-09 13.37.42.043557 +00:00
WIDLAKEM TEST_P     ID_MAX          28-JUL-09 13.37.42.043557 +00:00
WIDLAKEM TEST_P     ID_40K          28-JUL-09 13.37.42.043557 +00:00
WIDLAKEM TEST_P     ID_30K          28-JUL-09 13.37.42.043557 +00:00
WIDLAKEM TEST_P     ID_20K          28-JUL-09 13.37.42.043557 +00:00
WIDLAKEM TEST_P     ID_10K          28-JUL-09 13.37.42.043557 +00:00
WIDLAKEM TEST_P                     28-JUL-09 09.11.40.098445 +00:00
WIDLAKEM TEST_P     ID_MAX          28-JUL-09 09.11.40.098445 +00:00
WIDLAKEM TEST_P     ID_40K          28-JUL-09 09.11.40.098445 +00:00
WIDLAKEM TEST_P     ID_30K          28-JUL-09 09.11.40.098445 +00:00
WIDLAKEM TEST_P     ID_20K          28-JUL-09 09.11.40.098445 +00:00
WIDLAKEM TEST_P     ID_10K          28-JUL-09 09.11.40.098445 +00:00
 

You can see that I probably gathered stats on my test table 3 times yesterday, each time at a granularity of ALL, ie global and partition stats. Note also, all partitions and the table get the same timestamp. I think this is because Oracle records the timestamp as the time the stats for the set of segments were swapped into the data dictionary {note, not started}.

That’s quite useful. However, there is no similar view for indexes. That’s a shame. But you can recover index stats, and this view is saying nothing about the stats as they were… Is there more to be found?… Let’s go see what that view is looking at:

@vw_txt
Enter value for vw_name: dba_tab_stats_history
OWNER    VIEW_NAME                      TEXT_LENGTH
-------- ------------------------------ -----------
TEXT
--------------------------------------------------------------------------
SYS      DBA_TAB_STATS_HISTORY                  876
select u.name, o.name, null, null, h.savtime
  from   sys.user$ u, sys.obj$ o, sys.wri$_optstat_tab_history h
  where  h.obj# = o.obj# and o.type# = 2 and o.owner# = u.user#
  union all
  -- partitions
  select u.name, o.name, o.subname, null, h.savtime
  from   sys.user$ u, sys.obj$ o, sys.wri$_optstat_tab_history h
  where  h.obj# = o.obj# and o.type# = 19 and o.owner# = u.user#
  union all
  -- sub partitions
  select u.name, osp.name, ocp.subname, osp.subname, h.savtime
  from  sys.user$ u,  sys.obj$ osp, obj$ ocp,  sys.tabsubpart$ tsp,
        sys.wri$_optstat_tab_history h
  where h.obj# = osp.obj# and osp.type# = 34 and osp.obj# = tsp.obj# and
        tsp.pobj# = ocp.obj# and osp.owner# = u.user#
  union all
  -- fixed tables
  select 'SYS', t.kqftanam, null, null, h.savtime
  from  sys.x$kqfta t, sys.wri$_optstat_tab_history h
  where
  t.kqftaobj = h.obj#

Now that is interesting. Let’s go look at that table:

desc sys.wri$_optstat_tab_history
 Name                                      Null?    Type
 ----------------------------------------- -------- ---------------------------
 OBJ#                                      NOT NULL NUMBER
 SAVTIME                                            TIMESTAMP(6) WITH TIME ZONE
 FLAGS                                              NUMBER
 ROWCNT                                             NUMBER
 BLKCNT                                             NUMBER
 AVGRLN                                             NUMBER
 SAMPLESIZE                                         NUMBER
 ANALYZETIME                                        DATE
 CACHEDBLK                                          NUMBER
 CACHEHIT                                           NUMBER
 LOGICALREAD                                        NUMBER
 SPARE1                                             NUMBER
 SPARE2                                             NUMBER
 SPARE3                                             NUMBER
 SPARE4                                             VARCHAR2(1000)
 SPARE5                                             VARCHAR2(1000)
 SPARE6                                             TIMESTAMP(6) WITH TIME ZONE

Some of those columns look interesting. Unfortunately CACHEDBLK onwards are empty, even in version 11 {but it shows that Oracle have built in the ability to use those yet-to-be-used columns about average cached blocks and cache hit ratios for segments, which you might have spotted in eg DBA_TAB_STATISTICS}.
Could there be an index version? Of course there is :-

desc sys.wri$_optstat_ind_history
 Name                                      Null?    Type
 ----------------------------------------- -------- ---------------------------
 OBJ#                                      NOT NULL NUMBER
 SAVTIME                                            TIMESTAMP(6) WITH TIME ZONE
 FLAGS                                              NUMBER
 ROWCNT                                             NUMBER
 BLEVEL                                             NUMBER
 LEAFCNT                                            NUMBER
 DISTKEY                                            NUMBER
 LBLKKEY                                            NUMBER
 DBLKKEY                                            NUMBER
 CLUFAC                                             NUMBER
 SAMPLESIZE                                         NUMBER
 ANALYZETIME                                        DATE
 GUESSQ                                             NUMBER
 CACHEDBLK                                          NUMBER
 CACHEHIT                                           NUMBER
 LOGICALREAD                                        NUMBER
 SPARE1                                             NUMBER
 SPARE2                                             NUMBER
 SPARE3                                             NUMBER
 SPARE4                                             VARCHAR2(1000)
 SPARE5                                             VARCHAR2(1000)
 SPARE6                                             TIMESTAMP(6) WITH TIME ZONE

I’ve written a couple of scripts to extract data from these underlying tables, with the extra data you can grab. Use these links to download the table script and the index script.
Here below is the index script though:-

-- show_ish_full
-- Martin Widlake 14/4/08
-- this gets the index stats history - there is no index equivalent of dba_tab_stats_history
-- that I can see in 10.2 (or 11)
--
col owner form a12
col ind_full_name form a40
col stat_upd_time form a15
col blevel form 99 head bl
select owner,ind_full_name,
to_char(stat_upd_time,'MMDD HH24:MI:SS') stat_upd_time
,numrows,numleafs,blevel,dist_kys,lf_p_ky, dbl_p_ky,clufac,samp_size,obj_hash
,greatest(nvl(samp_size,1),1)/greatest(nvl(numrows,1),1) samp_pct
 from (
select u.name owner, o.name ind_name,o.name ind_full_name
, h.savtime stat_upd_time
,h.rowcnt numrows,h.leafcnt numleafs, h.blevel blevel,h.distkey dist_kys
,h.lblkkey lf_p_ky,h.dblkkey dbl_p_ky,h.clufac clufac, h.samplesize samp_size,h.obj# obj_hash
  from   sys.user$ u, sys.obj$ o, sys.wri$_optstat_ind_history h
  where  h.obj# = o.obj# and o.type# = 1 and o.owner# = u.user#
  union all
  -- partitions
  select u.name owner, o.name ind_name, o.name||'-'||o.subname ind_full_name
, h.savtime stat_upd_time
,h.rowcnt numrows,h.leafcnt numleafs, h.blevel blevel,h.distkey dist_kys
,h.lblkkey lf_p_ky,h.dblkkey dbl_p_ky,h.clufac clufac, h.samplesize samp_size,h.obj# obj_hash
  from   sys.user$ u, sys.obj$ o, sys.wri$_optstat_ind_history h
  where  h.obj# = o.obj# and o.type# = 20 and o.owner# = u.user#
  union all
  -- sub partitions
  select u.name owner, osp.name ind_name, osp.name||'-'||ocp.subname||'='|| osp.subname ind_full_name
, h.savtime stat_upd_time
,h.rowcnt numrows,h.leafcnt numleafs, h.blevel blevel,h.distkey dist_kys
,h.lblkkey lf_p_ky,h.dblkkey dbl_p_ky,h.clufac clufac, h.samplesize samp_size,h.obj# obj_hash
  from  sys.user$ u,  sys.obj$ osp, obj$ ocp,  sys.indsubpart$ tsp,
        sys.wri$_optstat_ind_history h
  where h.obj# = osp.obj# and osp.type# = 35 and osp.obj# = tsp.obj# and
        tsp.pobj# = ocp.obj# and osp.owner# = u.user#
) where ind_name like upper(nvl('&indname','whoops')||'%')
and owner like upper('&indown')||'%'
order by owner,ind_name,stat_upd_time desc
/
clear colu

I’ll leave you to play with these as you wish, especially as I ran out of time to finish this blog an hour ago and now owe my current employer some of my weekend in return, but I’ll finish off with a use for the scripts.

One thing to bear in mind, these tables are holding the stats that were replaced, not the stats that were created. So they hold historical information.

If you read the earlier posts, you may remember that the automatic stats job can choke and fail on extremely large segments. Well, sometimes {and I have to admit, only sometimes} you can spot what segments it failed on. That is because, at the end of the window, when the job gets stopped, it attempts to clean up after itself. It appears to check for stats that have changed during the run but that it failed to process correctly, and resets them. It copies the stats back. So you sometimes see this:-

@show_tsh_full
OWNER        TAB_FULL_NAME                            STAT_UPD_TIME           NUMROWS    NUMBLKS AVG_RL  SAMP_SIZE OBJ_HASH                           SAMP_PCT
------------ ---------------------------------------- -------------------- ---------- ---------- ------ ---------- --------                           --------
TABOWNER1   HUGETABLE4                               0522 06:00:01        1473990266   27319245    128    5660375    54714                               .004
TABOWNER1   HUGETABLE4                               0522 05:48:22        1327119100   24544873    128   13271191    54714                               .010

In the above case, you can see that this massive table had stats set at 05:48am and then again at 06:00:01am. That second record is the automated stats swapping the last known good stats back in place. That is the segment it choked on.

How do you stop the automatic stats job choking on this segment? You do a manual gather of course. You will see the above job attempted a .010 (1%) sample size that almost worked at 05:48. The value it swapped back is .004 – which was the sample size of the manual gather I did a few days earlier to get over the last choke. {I also tweaked the histograms I gathered to save time}. I had not yet finished converting the manual fix to an automated one when I recorded the above.
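
That manual gather was along these lines {a sketch – the sample size matches the SAMP_PCT above; the method_opt is just illustrative of trimming the histograms}:

begin
  dbms_stats.gather_table_stats(
     ownname          => 'TABOWNER1'
    ,tabname          => 'HUGETABLE4'
    ,estimate_percent => 0.4    -- a .004 sample, as per the output above
    ,granularity      => 'ALL'
    ,method_opt       => 'FOR ALL COLUMNS SIZE 1'  -- illustrative histogram trim
    ,cascade          => true);
end;
/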

Why do I think that the automatic job sets the timestamp in the WRI$_OPTSTAT_xxx_HIST tables when it swaps in the new stats and not when it starts? Because of the time of the entries after a large object has been analysed. Why do I think it checks for stats that have changed since the start of the run and replaces them, rather than keeping track of the information as it goes? Because, when I first came across this choking issue, I was verifying the problem after 10pm at night. When I realised there was an issue at around 1am, I started manually gathering stats. It took me until 4am; I checked they were in place and went to bed. Imagine my bad temper the next day when I found out that the automatic job had gone and re-wound most of the stats I had gathered, resetting them to what they had been at 10pm the previous night. It was a seriously bad temper.

Addition – I’ve had a comment that may or may not be spam, saying they do not understand “the last bit” (If you are a real person molamola, I apologise). Reading it, it is maybe not clear, so this is what happened and why I was angry :-)

1) I had realised the auto stats job was failing overnight.
2) I logged on from home at just before 10pm and watched the system
3) at 10pm the auto stats job started and collected stats on some small tables and indexes
4) around 11 to 11:30pm I realised it was not progressing,
5) from 11:30 to 3am in the morning, I created my own gather_table_stats and gather_index_stats statements and ran them.
6) at around 4am I went to bed, having checked stats were gathered and execution plans were good.
7) Got to work around 9am, a bit tired.
8) Found lots of performance issues due to old stats.
9) Shouted randomly at people and got angry, started re-gathering stats.
10) Worked out that at 6am, when the auto stats job finished, it reset the stats on all the tables/indexes it had intended to gather stats on but had failed to, back to what they had been at 10pm the previous night, when it started.
ie very old and bad stats.
