Format of Date Histograms February 6, 2010
Posted by mwidlake in internals.Tags: data dictionary, histograms, statistics
9 comments
Oracle annoyingly handles dates internally in several formats. Dates are stored in tables as seven bytes, each byte representing century, year-in-century, month, day, hour, minute and second, shifted by different amounts. For the high/low values seen in DBA_TAB_COLUMNS or DBA_TAB_COL_STATISTICS, it is stored as a RAW value, where a two-digit hex string represents the century, year-in-century,month etc as before – see this post on decoding high and low column values and check out the comments for corrections.
So what about histograms? You might want to know what is in the histogram for a date column in order to better understand the decisions made by the CBO. Below, I pull out the histogram for an example date column {I’ve trimmed the output a little to save space}:
select table_name ,column_name ,endpoint_value end_val ,endpoint_number rowcount from all_tab_histograms where table_name ='TEST_TABLE_X' and owner ='TEST_1' and column_name = 'PLACED_DT' order by endpoint_number TABLE_NAME COLUMN_NAME END_VAL ROWCOUNT --------------- ------------- ------------- --------------- TEST_TABLE_X PLACED_DT 2,452,258 0 TEST_TABLE_X PLACED_DT 2,454,334 1 TEST_TABLE_X PLACED_DT 2,454,647 2 TEST_TABLE_X PLACED_DT 2,454,737 3 TEST_TABLE_X PLACED_DT 2,454,820 4 TEST_TABLE_X PLACED_DT 2,454,867 5 TEST_TABLE_X PLACED_DT 2,454,929 6 TEST_TABLE_X PLACED_DT 2,454,981 7 TEST_TABLE_X PLACED_DT 2,455,006 8 TEST_TABLE_X PLACED_DT 2,455,024 9 TEST_TABLE_X PLACED_DT 2,455,039 10 TEST_TABLE_X PLACED_DT 2,455,050 11 ... TEST_TABLE_X PLACED_DT 2,455,205 42 TEST_TABLE_X PLACED_DT 2,455,207 43 TEST_TABLE_X PLACED_DT 2,455,209 44 TEST_TABLE_X PLACED_DT 2,455,211 45 TEST_TABLE_X PLACED_DT 2,455,213 46 TEST_TABLE_X PLACED_DT 2,455,215 47 TEST_TABLE_X PLACED_DT 2,455,216 48 TEST_TABLE_X PLACED_DT 2,455,218 49 TEST_TABLE_X PLACED_DT 2,455,220 50 TEST_TABLE_X PLACED_DT 2,455,221 51 TEST_TABLE_X PLACED_DT 2,455,222 52 TEST_TABLE_X PLACED_DT 2,455,222 53 TEST_TABLE_X PLACED_DT 2,455,222 54 TEST_TABLE_X PLACED_DT 2,455,222 55 TEST_TABLE_X PLACED_DT 2,455,223 56 TEST_TABLE_X PLACED_DT 2,455,223 57 TEST_TABLE_X PLACED_DT 2,455,223 58 TEST_TABLE_X PLACED_DT 2,455,223 59 ... TEST_TABLE_X PLACED_DT 2,455,223 69 TEST_TABLE_X PLACED_DT 2,455,223 70 TEST_TABLE_X PLACED_DT 2,455,223 71 TEST_TABLE_X PLACED_DT 2,455,223 72 TEST_TABLE_X PLACED_DT 2,455,223 73 TEST_TABLE_X PLACED_DT 2,455,224 74 TEST_TABLE_X PLACED_DT 2,455,226 75
Well, it looks like a seven digit number so it must be representing the date in a similar way to as described above, yes? No. The format is obviously something different {Oh come ON Oracle, some internal standards would be nice once in a while!
}
I pulled out the minimum and maximum values from the DBA_TAB_COLUMNS table and translated them:-
COLUMN_NAME NUM_DIST LOW_V HI_V ------------ ---------- ------------------- ------------------- PLACED_DT 375,428 2001-12-13 17:31:38 2010-01-28 23:51:38
So the above low values and high values will pretty much match the first and last values in the histograms, which are 2452258 and 2455226. Any guesses? Those values from the histogram are very close to each other, only 2968 different to cover around 9 years of values. 9 times 365…
The histogram is representing the dates in Julian format, ie number of days since 1st Jan 4712BC. As a quick proof of this:
select to_date('2452258','J'),to_date('2455226','J')
from dual;
TO_DATE('2452258' TO_DATE('2455226'
----------------- -----------------
14-DEC-2001 00:00 29-JAN-2010 00:00
Well, what do you know. Very close to the 13th Dec 2001 and 28th Jan 2010
This of course makes sense, storing the date as an ascending numeric means it is simple to calculate the width of the range and how many values per day there are.
Imagine how complex it would be to do this if the date was stored in the bizarre way we humans deal with it – an ascending numeric for year, a cycling 12 digit number for month and a varying cyclic number for the day of the month. And that is ignoring other calendars in common use.
However, I’m looking at this method of representing the date and something bothers me. There is no allowance for the time portion. Maybe column histograms can’t cope with time? I feel a couple of tests coming on…
ADDITIONAL.
If you have seen the comments, you will know that Jonathan commented to ask if I might have truncated/been selective with the data in creating my test table as he is sure there is a fractional section representing the time portion.
Well, I did say I needed to check further so maybe I would have spotted my error on my own… ![]()
This is the script I use to pull out histogram data (and yes, I trimed the output from the script before copying it into this post, so it does not quite match the script):
-- chk_hist.sql
-- Martin Widlake 7/4/4
-- pull out the histograms on a table
set pause on pages 32
spool chk_hist.lst
col owner form A8
col table_name form A15
col column_name form a20
col rowcount form 99,999,999,999
col end_val form 99,999,999,999
select owner
,table_name
,column_name
,endpoint_value end_val
,endpoint_number rowcount
from all_tab_histograms
where table_name like upper(nvl('&tab_name','WHOOPS')||'%')
and owner like upper('&tab_own')||'%'
and column_name like upper('&col_name')||'%'
order by table_name,column_name,endpoint_number
/
spool off
clear colu
Hmm wonder what happens if I change the “col end_val form 99,999,999,999″ so that it would show fractions…
col end_val form 999,999,999.99999999
TABLE_NAME COLUMN_NAM END_VAL ROWCOUNT --------------- ---------- --------------------- ------------ TEST_TABLE_X PLACED_DT 2,452,257.73030093 0 TEST_TABLE_X PLACED_DT 2,454,333.76546296 1 TEST_TABLE_X PLACED_DT 2,454,647.32561343 2 TEST_TABLE_X PLACED_DT 2,454,737.25017361 3 TEST_TABLE_X PLACED_DT 2,454,820.02204861 4 TEST_TABLE_X PLACED_DT 2,454,866.98009259 5 TEST_TABLE_X PLACED_DT 2,454,928.66848380 6 TEST_TABLE_X PLACED_DT 2,454,980.94815972 7 TEST_TABLE_X PLACED_DT 2,455,005.68413194 8 TEST_TABLE_X PLACED_DT 2,455,023.67142361 9 TEST_TABLE_X PLACED_DT 2,455,039.03236111 10 TEST_TABLE_X PLACED_DT 2,455,050.39246528 11
Ahhhhh……
Thanks Jonathan ![]()
It is still a Julian date, but with the time as a fraction as he said.
Friday Philosophy – Statistically Significant November 27, 2009
Posted by mwidlake in performance, VLDB.Tags: Humour, statistics
3 comments
There are very few generalist Oracle DBAs around, and very very very few expert generalist Oracle DBAs. Tom Kyte might count, but I’m sure if you pressed him he would admit to knowing nothing about {some key aspect of being a general DBA}. That is because to be an expert on something you have to spend a lot of time on it. Oh, and learn. {I’ve spent a lot of time around my wife but she still confuses me, I just don’t seem to be able to learn}. Oracle is simply too wide a topic to be able to spend a lot of time on all of it, even if by “Oracle” you mean just the core RDBMS technology, which I do. You have to pick an area.
Pete Finnigan specialises in security, Julian Dyke on RAC and Internals, Doug Burns on the relationship between the database and cuddly toys. Oh and ASH/AWR.
So I ask myself, what is my specialty?
Well, if I go on the last 5 working years of my life, it would probably be Oracle Database Statistics. Which is quite ironic given my woeful attempts with Statistics when I tried that maths ‘A’ level all those years back {for non-UK people, an ‘A’ level is what you do at age 17- 18. It’s that point in maths when most of us are convinced logic is replaced by magic}. I’ll go further and say I specialise in Oracle Database Statistics on VLDBS. Maybe even more specific, Gathering and Maintaining Oracle Database Statistics on VLDBs.
Not a very sexy-sounding specialty is it, even in the context of IT technical specialties. I am sure that if I was to tell a lady I wish to impress that I was “a specialist in gathering and maintaining Oracle database statistics on VLDBs” then I would soon be standing alone as she looked for an accountant to talk to {I refer back to my comment on my wife, I repeatedly try to impress her with this line and she never fails to walk away}. Heck, I could spend all my time at the UKOUG Conference next week and struggle to find someone who would respond positively to such a claim to greatness.
But the situation is that I have had to deal with the failures and idiosyncrasies of this area of Oracle for 4 major clients on a dozen or so systems and have discussed it with half a dozen other people who have had challenging times with it. And even now it trips me up all the time. Because, frankly, some if it is not very well implemented (choking automated stats gathering job anyone?), different parts work in different ways (if you state a statid, statown and stattab when you SET_TABLE_STATS the values go into the stats table, if you state them for GATHER_TABLE_STATS, the gathered values go into the data dictionary and the OLD ones go into the stats table – yeah, I know, if you did not know that you are now going ?huh?), some of it is wrapped up in a blanket of confusion and secrecy (what exactly DOES the automated stats job do and when you say “GATHER AUTO” what exactly will Oracle do automatically?).
Thankfully the secrecy side is reducing as Oracle (and others) say more about how some of these things are controlled or decided “automatically” , but then the world shifts under your feet with new versions of Oracle. Stats gathering under 11g is far more open to control than 10, but as my current client is not on 11g then I can’t spend too long looking at that.
So currently I am a expert in 10g Gathering and Maintaining Oracle Statistics on VLDBs. Now if that is not pretty damned specialist then I don’t know what is.
I should blog technical details on all of this {sadly I know the answers to the things I mention above}, but I suspect people would go “Hmmm, interesting…..” in that kind of “who in heck needs to know that! I’m off to do block dumps of temporary segments” way. But if you think otherwise, let me know.
Besides, I am being very, very poor at getting on with the Partition stuff I want to do, even though I have half-written about 4 more posts on it. I need to stick to that for my technical blogs for now.
That and I kind of shy away from being an expert in such an unexciting area, I might get offered {oh dear lord not more} work in it.
Thankfully I also specialise in beta testing new features of Oracle for clients in the travel and media industries, where on-site work in exotic locations is, at times, required… If anyone has any opening in that field, drop me a line
Depth of Indexes on VLDBs November 18, 2009
Posted by mwidlake in performance, Uncategorized, VLDB.Tags: Architecture, partitions, statistics
3 comments
In a couple of recent posts on partitions, on single row and range scan access I have started exploring the performance of partitions. I will continue on the main theme soon, honest.
I feel a little bad that I have not given concrete examples so far {creating very large objects for the sake of demonstrations and proof-of-concepts is a bit of an issue with VLDBs, due to the space and time requirements to do it}. So, here is just some real-world information on index depth, or BLEVEL of indexes on Very Large DataBases.
The BLEVEL is the number of levels in the index. Minus one. See this post for a full description, but in brief, if an index has a BLEVEL of 3, that means there is a root node, a level of branch nodes, a second level of branch nodes and then the leaf nodes. 4 levels. 4 IOs to check if the index holds the value you are looking for. And then one or more IOs against the table to get the actual records from the table, if required (one block if there is only one row, one or more blocks if there are several rows).
The below is from a 25TB datawarehouse system. I am looking for the maximum index depth on the database:-
select max(blevel) from dba_indexes
MAX(BLEVEL)
———–
4
OK, how many indexes are this deep?…
select owner,index_name from dba_indexes
where blevel=4
OWNER INDEX_NAME
—————————— ———————-
ERIC W_IIIIIIII_URLS_PK
1 row selected.
Oh. I have a pretty massive database and the deepest index I have is BLEVEL 4 and there is one of them. So the index height is 5. And access to that table for a given referral ID is going to take 5 IOs to work down the index and one against the table.
What is more, the index is an IOT
INDEX_NAME INDEX_TYPE
—————————— —————————
W_REFERRAL_URLS_PK IOT – TOP
so, in fact, to get to the table record it is just the 5 IOs as the table record is held in the IOT.
So with a 25TB data warehouse, the deepest normal indexes are BLEVEL 3.
select index_name,num_rows/leaf_blocks from dba_indexes where BLEVEL =3 order by num_rows desc INDEX_NAME NUM_ROWS LEAF_BLOCKS ------------------------------ ---------------- ------------ W_WL_SH_API_XIIIIIIIIIII_ID 4,585,108,192 13,406,015 W_WL_SHIIIIIIIIIIIXXXX_ID 4,564,350,002 16,892,898 W_WL_SIIIIIIIIIIIIIIIIIIIIII_ID 4,422,775,820 17,189,536 W_WL_SHIIIIIIIIIIIIIIIIIIIIII_ID 4,182,087,545 12,243,438 W_WL_GX_RESIIIIIIIIIII_LOG_ID1 3,690,374,216 14,613,388 IDX_W_WL_LIIIIIIIIIIIESS_ID 351,507,905 1,958,109 WL_LOGIN_SUIIIIIIIIIIIIE_ID 348,004,861 1,550,180 IND_IIII_LEG_ID 312,727,000 972,690 IND_TMLR_MARIIIIIIIIIII_ID 306,988,247 1,121,592 PK_TBL_IIIIIIIIIIIUNNER 305,566,240 939,775 PK_IIIIIIIIIIIGS 284,208,000 863,000
I asked an old friend, Tony Webb, to check out the depth of indexes on the largest databases they have, which range from a few TB to several 10s of TB. {one is 100TB plus, but a lot of the data is in LOBs, so the actual number of records is only billions, not 10′s of billions
} No index had a BLEVEL greater than 3.
Hi Martin,
I ran the first query on DAVE, DEE(our biggest db), DOZY and MICK. The max depth results were 3,2,3 and 3. Our largest database only had 2. In fact some of the other largish dbs also only have a max depth of 2.
That enough for your blog Martin?
Now, some of you may be smelling a Rat, and you would be right. At both sites, the VLDBs have been designed with partitioning as a key part of the Physical Implementation. All the largest objects are partitioned. What impact does partitioning have on BLEVEL?
The depth of the index segment is dependent on the number of keys per block.
Let us say you have 8k blocks {this is a questionably low block size for a data warehouse, but it is what some of these systems have} and your index is on a couple of columns, totalling 20 bytes per key on average, including rowid. Allowing block overhead, that is approx 8000/20 entries = 400 max….
Blow that, that’s whooly theory, I’ll look at my real 25TB database for number of rows per leaf block…
select leaf_blocks, num_rows\leaf_blocks rows_per_leaf from dba_ind_statistics where num_rows > 100000000 order by leaf_blocks desc LEAF_BLOCKS ROWS_PER_LEAF ------------ ------------- 30,211,949 325.602273 28,717,447 262.762331 26,308,608 382.505959 25,429,862 385.047045 24,054,626 309.92864 23,660,747 382.571434 22,458,058 411.063205 22,316,976 346.620625 18,875,827 379.952198 17,451,901 338.600909 17,189,536 257.29466
From the above let us take a low figure of 300 average enteries per block. For a BLEVEL 0 , a one block index, that would be 300 rows that are referenced. For a level 1 index, 300 entries in the root block will point to 300 leaf blocks, so that will be 300*300 rows…90,000 entires.
Go down another BLEVEL to 2 (and a height of 3) and you have a root node referencing 300 Branch nodes, referenceing 300 Leaf nodes each referencing 300 records. 300*300*300 = 27 million.
Another level (BLEVEL 3), another factor of 300 and that is 8.1 billion.
BLEVEL 0= 300
BLEVEL 1 = 90,000
BLEVEL 2 = 27,000,000
BLEVEL 3 = 8,100,000,000
BLEVEL 4 = 6,480,000,000,000 (6,480 billion entries).
You can see that with an 8k block size and being very pessimistic about the number of entries per block (I used a low average to calculate enttries per block where as in reality most indexes will fill the block at the root and branch nodes to close to full before throwing a new level) your have to increase your data volume by amost 300 to throw a new level of index.
On average, knowing nothing about the actual number of records in an index, you would have to reduce your number of indexed rows by 150 times (half of 300) to have a 50% chance of lowering the BLEVEL of a locally partitioned index by one. ie 150 or more partitions.
with a 32k block size, you would have to reduce your data volume by more like 620 times to drop the BLEVEL (not 600 as you have less wastage from block overhead with larger blocks).
To reduce it by two it would be 300*150 I think, wiht 8k block size {can any statisticians, like Mr Lewis, help me out on this one?}. If I am right, that is 45,000 partitions. In that 25TB datawarehouse, no table has more than 4,000 partitions.
That is why I say, when you partition a table and have locally partitioned indexes, you might reduce the index level by 1. Might.
However, none of these systems has more than 4000 partitions per table. That might sound a lot, but it is only going to reduce an index BLEVEL by 1. Almost certainly it will, but if you have read the previous postings, that is not going to really help index lookups be that more efficient
I’ll just add a couple of last comments.
- If you have an index BLEVEL of 4 or 5 or more and do not have a VLDB, you might want to look at why {hints, it could be an IOT, it could be an index on several concatenated VARCHAR2 columns, it could be you are using 2k block size, it could be that you regularly delete a large pecentage of your data and then re-insert more data, it could be you have a one-table database, hehe.}.
- Point one does not mean you should rebuild said index. I almost never rebuild indexes. The need to do so is one of those Oracle myths. There are situations where index rebuilds are advantageous, but they are pretty damned rare. If you regularly rebuild all your indexes or you are thinking of doing so, can I suggest you book a “meeting” for an afternoon and sit down with a cup of tea and google/bing/yahoo or whatever and check out index rebuilding first? Ignore any sites you land on offering database health checks or training courses on cruise liners.
That’s all for now.
BLEVEL and Height of Indexes November 13, 2009
Posted by mwidlake in internals, performance.Tags: data dictionary, performance, statistics
6 comments
{Update – this post describes blevel, demonstrates it’s growth with the number if index entries and the impact on index-to-table lookups. for a pretty graphic of BLEVEL try looking at this page}
I got something wrong on a couple of postings recently, namely the relationship between BLEVEL and the number of blocks needed to read “down” an index, the true depth or HEIGHT of the index {I used to know this but I forgot, but heck no one pinged me on the two posts in question, so I got away with it
– I’ve updated the postings already.}
BLEVEL is the number of branch levels (including the root node) in a B-Tree index. Height is the actual depth of the index. Height is BLEVEL plus one. So when you see BLEVEL of 3 against an index in DBA_INDEXES/DBA_IND_STATISTICS, that means the index has a root node, a first level of Branch blocks, then a second level of Branch blocks and finally the Leaf blocks (which hold the indexed values and rowids to the table entries).
Thus to scan the index for one unique entry, Oracle will need to read the root node to locate the correct branch node in branch level one, read that to find the correct branch node in branch level 2 and that will lead to the correct leaf block. That is four blocks to read. The leaf block contains the index entry and the rowid of the relevant data block, which allows oracle to go directly to that block, for the fifth block read.
{I’m having trouble finding a nice diagram of this {{ I hate the one in the Oracle manuals}}, not even on Mr Foote’s or Mr Lewis’s pages, so if you spot one before I do, let me know and I’ll update this page with a relevant link}.
{Update 18 months later – I finally drew a nice diagram of the index-rowid-table_row path.}
Some documentation on the Web mentions HEIGHT being held in the index stats table. This is SYS.INDEX_STATS, not the DBA_IND_STATISTICS table, and SYS.INDEX_STATS is only populated when you run the old “ANLAYZE INDEX index_name VALIDATE STRUCTURE” command, so ignore that. You should not really be using the old ANALYZE command any more.
The below demonstrates the increasing BLEVEL and the number of consistent gets to select one record {it’s more complicated if you select more than one}
NAME VALUE
------------------------------ --------------------
compatible 10.2.0.3.0
cpu_count 8
db_block_size 8192
db_file_multiblock_read_count 16
optimizer_mode ALL_ROWS
sga_target 0
sort_area_size 65536
create table test_bl
(id number(8) not null
,status number(1) not null
,num_1 number(3) not null -- random 20
,num_2 number(3) -- random 20
,num_3 number(5) -- cycle smoothly
,num_4 number(5) -- cycle smoothly
,vc_1 varchar2(10)
,vc_2 varchar2(10)
,vc_pad varchar2(2000))
tablespace users
/
Table created.
insert into test_bl(id,status,num_1,num_2,num_3,num_4
,vc_1,vc_2,vc_pad)
select rownum,decode(mod(rownum,100),0,1
,0)
,trunc(dbms_random.value(1,20))
,trunc(dbms_random.value(1,30))
,mod(rownum,10)+1
,mod(rownum,100)+1
,dbms_random.string('U',10)
,lpad(chr(mod(rownum,6)+65),5,chr(mod(rownum,6)+65) )
,lpad('A',100,'A')
from dba_objects
where rownum < 500
/
499 rows created.
commit;
Commit complete.
-- now add a pK on the ID
alter table test_bl
add constraint tb_pk primary key (id)
using index
tablespace users
/
Table altered.
begin
dbms_stats.gather_table_stats(user,'TEST_BL');
END;
/
PL/SQL procedure successfully completed.
select index_name,blevel,leaf_blocks
from dba_indexes
where owner=user
and index_name like 'TB%'
/
INDEX_NAME BLEVEL LEAF_BLOCKS
------------------------------ ---------- -----------
TB_PK 0 1
So I’ve created the test table, put 499 records in it and added the index, via a primary key constraint. The index created has one leaf block in it and a BLEVEL of 0.
Now let’s select a record via it {and the reason I put 499 records in the table is so that oracle decides to use the index and not a full table scan, which would be a likely choice by CBO with a very small table}.
set autotrace on
select vc_1 from test_bl where id=54
/
VC_1
----------
BRFVRHEMWP
1 row selected.
Execution Plan
----------------------------------------------------------
| Id| Operation | Name | Rows |Bytes| Cost|
----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 15 | 1 (0)|
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_BL | 1 | 15 | 1 (0)|
|*2 | INDEX UNIQUE SCAN | TB_PK | 1 | | 0 (0)|
----------------------------------------------------------
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
339 bytes sent via SQL*Net to client
338 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
select vc_1 from test_bl where id=54
/
VC_1
----------
BRFVRHEMWP
1 row selected.
Execution Plan
----------------------------------------------------------
|Id | Operation | Name | Rows |Bytes| Cost
----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 15 | 1 (0)|
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_BL | 1 | 15 | 1 (0)|
|*2 | INDEX UNIQUE SCAN | TB_PK | 1 | | 0 (0)|
-----------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2 consistent gets
0 physical reads
I generally run test selects twice, to remove any parsing and recursive SQL overhead {I’ll remove these from the rest of this post}. So, 2 consistent gets. That would be one on the index and one on the table then.
Note the cost of the index index unique scan – 0. See end.
Now I’ll add more data and grow the index.
test102>set autotrace off echo off
insert into test_bl(id,status,num_1,num_2,num_3,num_4
,vc_1,vc_2,vc_pad)
select rownum+500,decode(mod(rownum,100),0,1
,0)
,trunc(dbms_random.value(1,20))
,trunc(dbms_random.value(1,30))
,mod(rownum,10)+1
,mod(rownum,100)+1
,dbms_random.string('U',5)
,lpad(chr(mod(rownum,6)+65),5,chr(mod(rownum,6)+65) )
,lpad('A',100,'A')
from dba_objects
where rownum < 5500
/
5499 rows created.
begin
dbms_stats.gather_table_stats(user,'TEST_BL');
END;
/
PL/SQL procedure successfully completed.
select index_name,blevel,leaf_blocks
from dba_indexes
where owner=user
and index_name like 'TB%'
/
INDEX_NAME BLEVEL LEAF_BLOCKS
------------------------------ ---------- -----------
TB_PK 1 11
So now we have a BLEVEL of 1 and 11 leaf blocks. That will be a root node and below it the leaf blocks. Let’s try a select:
select vc_1 from test_bl where id=454
/
VC_1
----------
IQGSEOCCCH
Execution Plan
----------------------------------------------------------
|Id | Operation | Name | Rows |Bytes| Cost|
----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 11 | 2 (0)|
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_BL | 1 | 11 | 2 (0)|
|*2 | INDEX UNIQUE SCAN | TB_PK | 1 | | 1 (0)|
-----------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
3 consistent gets, one for the root node, one for the relevant leaf block and one for the data block holding the record.
Now I’ll insert about 400,000 more records to cause the index to become one level deeper. (You might be interested to know that 300,000 records was not enough to cause a layer of branch nodes to be created, though as I am indexing an ascending numerical column, each index entry is not exactly huge. This does show that the BLEVEL does not scale with data volume – but it ‘does not scale’ in a very beneficial way. You need to massively increase the volume of data between increasing BLEVELs.)
I will then select my record:
99999 rows created.
99999 rows created.
99999 rows created.
99999 rows created.
begin
dbms_stats.gather_table_stats(ownname=>user,tabname =>'TEST_BL'
,estimate_percent=> 10);
END;
/
PL/SQL procedure successfully completed.
INDEX_NAME BLEVEL LEAF_BLOCKS
------------------------------ ---------- -----------
TB_PK 2 761
set autotrace on
select vc_1 from test_bl where id=454
/
VC_1
----------
IQGSEOCCCH
1 row selected.
Execution Plan
----------------------------------------------------------
|Id | Operation | Name | Rows |Bytes| Cost|
--------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 3 (0)|
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_BL | 1 | 12 | 3 (0)|
|*2 | INDEX UNIQUE SCAN | TB_PK | 1 | | 2 (0)|
---------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
The index BLEVEL has gone up to 2 {index height is 3} and now 4 consistent gets are needed to fetch the record.
You may have noticed that the estimated cost of the INDEX_UNIQUE_SCAN is the same as the BLEVEL, which is not really correct. After all, in the first example the cost was 0 and there has to be a read of the index leaf block! The costing makes more sense when it is part of the calculation for scanning an index and then visiting the table for all found records:-
“basic index range scan cost = index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)”
In words this formula means “go down the index to the leaf nodes (this is the BLEVEL), scan the number of leaf nodes expected for this index value, then visit the number of table blocks this set of index entries would map to”.
For more information on the formula, I’d plug part of that formula into google (or bing or whatever takes your fancy, search-engine-wise). The original is this page by Richard Foote but there are some good notes by others as well.
There are a lot of references on the web about the cost of accesing an index being the BLEVEL, but remember, if it is a unique access it is the BLEVEL plus one, and oracle seems (in my little tests anyway) to be underestimating the cost by 1. I think this reference to the BLEVEL and the costs might be leading to people into mistaking the BLEVEL as the actual height of the index.
Privileges required to FLUSH_DATABASE_MONITORING_INFO October 19, 2009
Posted by mwidlake in development, performance.Tags: data dictionary, privileges, statistics, system development
1 comment so far
I’m doing some work at the moment on gathering object statistics and it helps me a lot to have access to the number of changed records in SYS.DBA_TAB_MODIFICATIONS. To ensure you have the latest information in this table, you need to first flush any data out of memory with DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO. For the live system, the DBAs rather understandably {and sensibly} want all users to run with the least access privileges they need, so granting DBA role to my user is out.
I googled for the actual system privilege or privileges needed to flush_database_monitoring_info and drew a blank, so I have had to find out for myself. And being a nice guy {who am I kidding}, I am now recording the info for anyone else who is interested to find:
On 10.2.0.3, to execute DBMS_STATS.FLUSH_DATABASE_MONITORING INFO you need the ANALYZE ANY system privilege.
{Not ANALYZE ANY DICTIONARY, which would make more sense to me}
For those who like such things, here is the proof. I had to use two sessions, thus the constant displaying of system time.
-- current user privs DWPERFDEV1> @usr_privs enter user whos privs you wish to see> dwperf GRANTEE TYPE PRIVILEGE adm ---------------------------------------------------------------- DWPERF SYSP CREATE JOB NO DWPERF SYSP CREATE PROCEDURE NO DWPERF SYSP CREATE PUBLIC SYNONYM NO DWPERF SYSP CREATE SESSION NO DWPERF SYSP CREATE SYNONYM NO DWPERF SYSP CREATE TABLE NO DWPERF SYSP CREATE TRIGGER NO DWPERF SYSP DEBUG CONNECT SESSION NO DWPERF SYSP DROP PUBLIC SYNONYM NO DWPERF SYSP EXECUTE ANY PROCEDURE NO DWPERF SYSP SELECT ANY DICTIONARY NO DWPERF SYSP SELECT ANY TABLE NO DWPERF ROLE CONNECT NO DWPERF OBJP SYS.DBMS_UTILITY-EXECUTE NO DWPERF_ROLE SYSP ANALYZE ANY NO DWPERFDEV1> @showtime Date Time -------------------------------------------------------- 19-OCT-2009 13:29:16 DWPERFDEV1> exec dbms_stats.flush_database_monitoring_info BEGIN dbms_stats.flush_database_monitoring_info; END; * ERROR at line 1: ORA-20000: Insufficient privileges ORA-06512: at "SYS.DBMS_STATS", line 2148 ORA-06512: at "SYS.DBMS_STATS", line 14135 ORA-06512: at line 1 DEV1> @showtime Date Time ------------------------------------------------ 19-OCT-2009 13:29:30 DEV1> grant analyze any dictionary to dwperf; Grant succeeded. DWPERFDEV1> @showtime Date Time ------------------------------------------ 19-OCT-2009 13:29:40 DWPERFDEV1> exec dbms_stats.flush_database_monitoring_info BEGIN dbms_stats.flush_database_monitoring_info; END; * ERROR at line 1: ORA-20000: Insufficient privileges ORA-06512: at "SYS.DBMS_STATS", line 2148 ORA-06512: at "SYS.DBMS_STATS", line 14135 ORA-06512: at line 1 DEV1> @showtime Date Time --------------------------------------------- 19-OCT-2009 13:30:46 DEV1> grant analyze any to dwperf; Grant succeeded. DWPERFDEV1> @showtime Date Time ------------------------------------ 19-OCT-2009 13:31:20 DWPERFDEV1> exec dbms_stats.flush_database_monitoring_info PL/SQL procedure successfully completed. -- SUCCESS! DEV1> @showtime Date Time ------------------------------------------- 19-OCT-2009 13:31:38 DEV1> revoke analyze any from dwperf Revoke succeeded. DWPERFDEV1> @showtime Date Time ------------------------------------------------------ 19-OCT-2009 13:31:57 DWPERFDEV1> exec dbms_stats.flush_database_monitoring_info BEGIN dbms_stats.flush_database_monitoring_info; END; * ERROR at line 1: ORA-20000: Insufficient privileges ORA-06512: at "SYS.DBMS_STATS", line 2148 ORA-06512: at "SYS.DBMS_STATS", line 14135 ORA-06512: at line 1
Of course, I’ll soon find something else that breaks due to my minimum privs before the end of the day, but it’s not easy creating more secure systems {note, I don’t say Secure, just more secure, as in less open!}.
Decrypting Histogram Data #3 – is 7 or 15 Characters Significant? September 15, 2009
Posted by mwidlake in internals, performance.Tags: data dictionary, histograms, performance, statistics
2 comments
At last I am going to wrap up about looking at Histogram data and endpoint_value. Sorry about the delay, life is a tad complex at present. This is also a very, very long post. You may want to skip to the end to get the conclusions
I first posted about how to translate endpoint_values.
I then posted in the second post a week ago about how endpoint_value is limited to 7 or so characters for varchar2 columns.
{Addition – this later post discusses the content of endpoinit_value for dates}
Coskan contacted me to say he believed I had missed something, and of course he was right – but I’d laid big hints that I was using theory and not practical test and there was more to know. {There is usually more to know, only practical tests confirm whether your theory works or not. And even after that, it seems there is still more to know!}
OK, I have a test table, created with the following {oh, this is under 10.2.0.3, Enterprise Edition on linux}:
create table hist_test
as select rownum id
,lpad(chr(mod(rownum,6)+65),5,chr(mod(rownum,6)+65) )vc_1
,lpad('A',2,'A')||dbms_random.string('U',1) vc_2
,lpad('B',3,'B')||dbms_random.string('U',1) vc_3
,lpad('C',4,'C')||dbms_random.string('U',1) vc_4
,lpad('D',5,'D')||dbms_random.string('U',1) vc_5
,lpad('E',6,'E')||dbms_random.string('U',1) vc_6
,lpad('F',7,'F')||dbms_random.string('U',1) vc_7
,lpad('G',8,'G')||dbms_random.string('U',1) vc_8
,lpad('H',16,'H')||dbms_random.string('U',1)) vc_16
,lpad('I',40,'I')||dbms_random.string('U',1) vc_40
from dba_objects
where rownum < 1001
I need to skew the data in some columns so that different WHERE clauses should see different expected numbers of rows
I now need to add some skew to the data so you can see the histogram in action .
update hist_test set
VC_5=lpad(‘D’,5,’D')||’A’
,VC_7=lpad(‘F’,7,’F')||’A’
,VC_16=lpad(‘H’,16,’H')||’A’
,VC_40=lpad(‘I’,40,’I')||’A’
where mod(id,10)=1
/
Gather histogram stats, 10 buckets:
dbms_stats.gather_table_stats(ownname=>user,tabname=>’HIST_TEST’
,method_opt=>’FOR ALL COLUMNS SIZE 10′)
And looking at ALL_TAB_COLUMNS we can see the general stucture of the data (just the VC_ columns are shown)
COL_NAME NUM_DIST LOW_V HI_V N_BUCK AVG_L ------------------------------------------------------------- ID 1,000 1 1000 10 4 VC_1 6 AAAAA FFFFF 6 6 VC_2 26 AAA AAZ 10 4 VC_3 26 BBBA BBBZ 10 5 VC_4 26 CCCCA CCCCZ 10 6 VC_5 26 DDDDDA DDDDDZ 10 7 VC_6 26 EEEEEEA EEEEEEZ 10 8 VC_7 26 FFFFFFFA FFFFFFFZ 10 9 VC_8 26 GGGGGGGGA GGGGGGGGZ 10 10 VC_16 26 HHHHHHHHHHHHHHHHA HHHHHHHHHHHHHHHHZ 10 18 VC_40 1 IIIIIIIIIIIIIIIIIIII IIIIIIIIIIIIIIIIIIII 1 42
I have created a table with varchar2 columns set to a fixed sting of characters the length indicated by the column name with the last column varying from A to Z. So VC_5 contains EEEEEx, where x is A to Z. 26 distinct values each column.
If you look at the above, all columns are showing as having 26 values and 10 buckets EXCEPT VC_40, but it does have 26 distinct values:
select count(distinct(VC_40)) from hist_test;
COUNT(DISTINCT(VC_40))
———————-
26
I have my function to unpack the endpoint_value for varchar2 columns from the number stored to the character string it represents. My function is not perfect, but it is reasonably good. Go checkout post one for the function.
Looking at the histograms with my function you can see the below {I show only some records}
colname END_VAL ROWCOUNT MOD_REAL
------------------------------------------------------------------
VC_5 354460798875962000000000000000000000 1 DDDDDA
VC_5 354460798875972000000000000000000000 2 DDDDDD
VC_5 354460798875986000000000000000000000 3 DDDDDG
{snip}
VC_5 354460798876071000000000000000000000 9 DDDDDY
VC_5 354460798876080000000000000000000000 10 DDDDDZ
VC_7 364886116489977000000000000000000000 1 FFFFFGK
VC_7 364886116489977000000000000000000000 2 FFFFFGK
VC_7 364886116489977000000000000000000000 3 FFFFFGK
VC_7 364886116489977000000000000000000000 4 FFFFFGK
{snip}
VC_7 364886116489977000000000000000000000 9 FFFFFGK
VC_7 364886116489977000000000000000000000 10 FFFFFGK
VC_16 375311434103976000000000000000000000 1 HHHHHI:
VC_16 375311434103976000000000000000000000 2 HHHHHI:
VC_16 375311434103976000000000000000000000 3 HHHHHI:
{snip}
VC_16 375311434103976000000000000000000000 9 HHHHHI:
VC_16 375311434103976000000000000000000000 10 HHHHHI:
VC_40 380524092910976000000000000000000000 1,000 IIIIIJM
For VC_5, there are 10 buckets, all with different endpoint_values.
The VC_7 has 10 buckets, but most have the same endpoint_value.
The VC_16 has 10 records in the HISTOGRAM table, all with the same endpoint_value.
VC_40 has only one record in the HISTOGRAM table.
Basically, if I am right, WHERE filters on a column holding values that are the same for the first seven or so characters will not be well supported by the histograms. We should see the estimated cost for such a WHERE filter to be wrong.
I have skewed the data so that Histograms should show more expected values for DDDDDA than DDDDDS, for FFFFFFFA than FFFFFFFS etc
update hist_test
set
VC_5=lpad(‘D’,5,’D')||’A’
,VC_7=lpad(‘F’,7,’F')||’A’
,VC_16=lpad(‘H’,16,’H')||’A’
,VC_40=lpad(‘I’,40,’I')||’A’
where mod(id,10)=1
/
And now I’ll see how the CBO estimates the rows coming back for some WHERE clauses.
select count(*) from hist_test
where VC_5 = 'DDDDDA'
COUNT(*)
----------
139
Execution Plan
----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 8 (0)|
| 1 | SORT AGGREGATE | | 1 | 7 | |
|* 2 | TABLE ACCESS FULL| HIST_TEST | 100 | 700 | 8 (0)|
----------------------------------------------------------
select count(*) from hist_test
where VC_5 = 'DDDDDS'
COUNT(*)
----------
35
Execution Plan
----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 8 (0)|
| 1 | SORT AGGREGATE | | 1 | 7 | |
|* 2 | TABLE ACCESS FULL| HIST_TEST | 49 | 343 | 8 (0)|
---------------------------------------------------------------------
For column VC_5 it all works as we would expect. the CBO estimates 100 values for DDDDDA and 49 for DDDDDS. I’ve shown the actual count(*)’s as well to show there is a real variation in numbers. There are good reasons why the estimates do not match reality, I won’t go into them now but this does highlight that histograms can help but do have flaws.
What about the other rows? We are only interested in the WHERE clause and the estimate for the number of rows from the table access, so I’ll show only those.
where VC_7 = 'FFFFFFFA'
|* 2 | TABLE ACCESS FULL| HIST_TEST | 100 | 900 | 8 (0)|
where VC_7 = 'FFFFFFFS'
|* 2 | TABLE ACCESS FULL| HIST_TEST | 49 | 441 | 8 (0)|
where VC_16 = 'HHHHHHHHHHHHHHHHA'
|* 2 | TABLE ACCESS FULL| HIST_TEST | 100 | 7200 | 8 (0)|
where VC_16 = 'HHHHHHHHHHHHHHHHS'
|* 2 | TABLE ACCESS FULL| HIST_TEST | 47 | 216 | 8 (0)|
where VC_40 = 'IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIA'
|* 2 | TABLE ACCESS FULL| HIST_TEST | 1000 | 42000 | 8 (0)|
where VC_40 = 'IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIS'
|* 2 | TABLE ACCESS FULL| HIST_TEST | 1000 | 42000 | 8 (0)|
{and just to show there is no typo in the string of 'I's above...
where VC_40 = lpad('I',40,'I')||'S'
|* 2 | TABLE ACCESS FULL| HIST_TEST | 1000 | 42000 | 8 (0)|
In the above you can see that the CBO estimates the costs for VC_7 exactly the same as it did for VC_5. Despite the endpoint_values no longer distinguishing the buckets.
Even more telling is tha for VC_16 the CBO also is able to predict there are more values ending in A than S.
As for VC_40, the CBO utterly fails to spot the skew. In fact it utterly fails to spot that there is more than one value in the column (look back at the section showing min and max values and number of distinct values), and there is only one histogram bucket. so I supposed it is no surprise.
How does the CBO still detect the skew for VC_7 and VC_16, even though the endpoint_valuess are the same?
Because it swaps to using column ENDPOINT_ACTUAL_VALUE.
ENDPOINT_ACTUAL_VALUE is only populated for varchar2 columns when the precision of endpoint_value is exceeded:
colname ROWCOUNT MOD_REAL ENDPOINT_ACTUAL_VALUE ------- --------------------------------------------- VC_5 7 DDDDDR VC_5 8 DDDDDV VC_5 9 DDDDDY VC_5 10 DDDDDZ VC_6 0 EEEEEF8 EEEEEEA VC_6 1 EEEEEF8 EEEEEEC VC_6 2 EEEEEF8 EEEEEEF VC_6 3 EEEEEF8 EEEEEEH VC_6 4 EEEEEF8 EEEEEEK VC_6 5 EEEEEF8 EEEEEEN VC_6 6 EEEEEF8 EEEEEEP VC_6 7 EEEEEFn EEEEEES VC_6 8 EEEEEFn EEEEEEU VC_6 9 EEEEEFn EEEEEEX VC_6 10 EEEEEFn EEEEEEZ VC_7 1 FFFFFGK FFFFFFFA VC_7 2 FFFFFGK FFFFFFFC VC_7 3 FFFFFGK FFFFFFFF VC_7 4 FFFFFGK FFFFFFFI VC_7 5 FFFFFGK FFFFFFFK VC_7 6 FFFFFGK FFFFFFFO VC_7 7 FFFFFGK FFFFFFFR VC_7 8 FFFFFGK FFFFFFFT VC_7 9 FFFFFGK FFFFFFFW VC_7 10 FFFFFGK FFFFFFFZ VC_16 1 HHHHHI: HHHHHHHHHHHHHHHHA VC_16 2 HHHHHI: HHHHHHHHHHHHHHHHD VC_16 3 HHHHHI: HHHHHHHHHHHHHHHHG VC_16 4 HHHHHI: HHHHHHHHHHHHHHHHJ VC_16 5 HHHHHI: HHHHHHHHHHHHHHHHL VC_16 6 HHHHHI: HHHHHHHHHHHHHHHHO VC_16 7 HHHHHI: HHHHHHHHHHHHHHHHS VC_16 8 HHHHHI: HHHHHHHHHHHHHHHHU VC_16 9 HHHHHI: HHHHHHHHHHHHHHHHX VC_16 10 HHHHHI: HHHHHHHHHHHHHHHHZ VC_40 1,000 IIIIIJM
You can see that VC_5 columns have no ENDPOINT_ACTUAL_VALUEs but VC_6, VC_7 and VC_16 do. VC_40 does not.
So, at what point does the CBO stop storing values in ENDPOINT_ACTUAL_VALUES? 32 characters. I created another test table and these are the low/high values, number of distinct values and number of histogram buckets:
COL_NAME NUM_DIST LOW_V HI_V N_BUCK AVG_L
————————————————————–
ID 1,000 1 1000 10 4
NUM_1 10 0 9 10 3
VC_1 6 AAAAA FFFFF 6 6
VC_30 26 AAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAA 10 32
VC_31 26 BBBBBBBBBBBBBBBBBBBB BBBBBBBBBBBBBBBBBBBB 10 33
VC_32 1 CCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCC 1 34
VC_33 1 DDDDDDDDDDDDDDDDDDDD DDDDDDDDDDDDDDDDDDDD 1 35
VC_34 1 EEEEEEEEEEEEEEEEEEEE EEEEEEEEEEEEEEEEEEEE 1 36
VC_35 1 FFFFFFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFFFFFF 1 37[/sourcecode ]
The name of the column is the length of the fixed string, so the columns are actually one character longer, as can be seen by the “AVG_L” column.
And just to check {as after all this is just a nice theory until you test}, I skew the number of records ending in ‘A’ for VC_31 and VC_32:-
update hist_test2
set
VC_31=lpad(‘B’,31,’B')||’A’
,VC_32=lpad(‘C’,32,’C')||’A’
where mod(id,10)=1
And see what the CBO thinks the rows to be identified are:-
where VC_31 = lpad(‘B’,31,’B')||’A’
|* 2 | TABLE ACCESS FULL| HIST_TEST2 | 100 | 3300 |
where VC_31 = lpad(‘B’,31,’B')||’S’
|* 2 | TABLE ACCESS FULL| HIST_TEST2 | 49 | 1617 |
where VC_32 = lpad(‘C’,32,’C')||’A’
|* 2 | TABLE ACCESS FULL| HIST_TEST2 | 1000 | 34000 |
where VC_32 = lpad(‘C’,32,’C')||’S’
|* 2 | TABLE ACCESS FULL| HIST_TEST2 | 1000 | 34000 | [/sourcecode ]
Yes. The CBO can see the skew for the 32 characters of VC_31 and not for the 33 characters of VC_32
So in conclusion
- endpoint_value is only accurate for varchar2 columns up to around 7 characters.
- If the varchar2 exceeds the precision of the endpoint_value column then endpoint_actual_value is used.
- endpoint_actual_value is used for varchar2 columns up to 32 characters and not for columns less than 7 characters.
- Histograms will not help with columns having many values with the same leading 32 or more characters.
- If you column has the same leading 32 characters or more, even the number of distinct values is not gathered by DBMS_STATS.
- You can’t tell how SQL/CBO will work from theory, you have to test. And still consider you knowledge as open to correction after that
Another Day, Another Obscure Oracle Error September 11, 2009
Posted by mwidlake in internals, performance.Tags: data dictionary, rant, statistics
8 comments
I seem to be spending my life hitting odd bugs or errors in Oracle’s performance-related areas at the moment. I guess part of it is because I am using features not widely touched, part is I’m not working as the SYS or SYSTEM user {in my experience a lot of people working on such things do it when connected as sysdba or at least as SYSTEM} and part is that Larry Ellisson personally hates me {he fills in for God when it comes to Oracle}.
I’m seeing this a lot today,and it seems virtually unknown:-
ERROR at line 1:
ORA-20003: Unable to gather system statistics
ORA-06512: at “SYS.DBMS_STATS”, line 15882
ORA-06512: at line 1
I’m also seeing this, but this error is documented:-
ERROR at line 1:
ORA-20000: Unable to gather system statistics : insufficient privileges
ORA-06512: at “SYS.DBMS_STATS”, line 15822
ORA-06512: at line 1
The rest of this post will explain my path to only partial enlightenment. If you have hit the above errors and found this page via Google, this might help you out and you might like to skip towards the end.
If you just read my blog, regard this as a very long “tweet” or whatever the hell they are. Again, feel free to skip to the end. Or just skip.
The task I’m doing which is causing me to hit these issues is that I’m trying to assure myself of the ability to role back gathering system statistics before getting the DBA team to do it on a production system. {I am not in the DBA team on this one}.
No system statistics have been gathered yet so I have the default set, I’m sure many of you will recognise them:
SNAME PNAME PVAL1 PVAL2 --------------------------------------------------- SYSSTATS_MAIN CPUSPEED SYSSTATS_MAIN CPUSPEEDNW 1,567.798516 SYSSTATS_MAIN IOSEEKTIM 10.000000 SYSSTATS_MAIN IOTFRSPEED 4,096.000000 SYSSTATS_MAIN MAXTHR SYSSTATS_MAIN MBRC SYSSTATS_MAIN MREADTIM SYSSTATS_MAIN SLAVETHR SYSSTATS_MAIN SREADTIM SYSSTATS_INFO DSTART 07-25-2006 12:39 SYSSTATS_INFO DSTOP 07-25-2006 12:39 SYSSTATS_INFO FLAGS 1.000000 SYSSTATS_INFO STATUS COMPLETED
I decided to pull the current system statistics into a statistics table:-
exec dbms_stats.create_stat_table(‘DWPERF’,'AD_BF_STATS’,'DW_COMMON_MG’)
EXEC DBMS_STATS.EXPORT_SYSTEM_STATS
(‘AD_BF_STATS’,'SYSSTAT’||TO_CHAR(SYSDATE,’YYMMDDHH24MI’),’DWPERF’)
SELECT * FROM AD_BF_STATS;
SYSSTAT0909101309 S 4 1 COMPLETED
07-25-2006 12:39 07-25-2006 12:39 CPU_SERIO
SYSSTAT0909101309 S 4 1
PARIO
10 4096 1567.79852
Two records created, first is the header record about what this statid is all about, the second contains the stored info. I’ve highlighted the relevant records.
All well and good, I can save my stats. Why do I want to? Because we may gather system statistics on an infrequent but regular basis, and I want a record of them. The automtically stored history only goes back a month {by default}.
Of course, there is no need for me to explicitly export the stats to the table, it can be done as part of the gathering process, by stating the statistics table, owner and an ID when I do the gather, so I did:-
exec dbms_stats.gather_system_stats(gathering_mode => ‘INTERVAL’,interval => 15,stattab=> ‘
AD_BF_STATS’,statown=> ‘DWPERF’ ,statid=>’SYSSTAT’||TO_CHAR(SYSDATE,’YYMMDDHH24MI’))
I went off for a cup of coffee, came back, checked my system stats.
SNAME PNAME PVAL1 PVAL2 --------------------------------------------------- SYSSTATS_MAIN CPUSPEED SYSSTATS_MAIN CPUSPEEDNW 1,567.798516 SYSSTATS_MAIN IOSEEKTIM 10.000000 SYSSTATS_MAIN IOTFRSPEED 4,096.000000 SYSSTATS_MAIN MAXTHR SYSSTATS_MAIN MBRC SYSSTATS_MAIN MREADTIM SYSSTATS_MAIN SLAVETHR SYSSTATS_MAIN SREADTIM SYSSTATS_INFO DSTART 07-25-2006 12:39 SYSSTATS_INFO DSTOP 07-25-2006 12:39 SYSSTATS_INFO FLAGS 1.000000 SYSSTATS_INFO STATUS COMPLETED
Oh. No change. No system stats gathered. Damn.
Well, it’s a test box, there may be too little worload to detect anything – like none! I checked the SGA for recent code and the only SQL was that for recording the stats gathering
.
So, repeated with me running some workload in another window for 15 minutes.
Still no change, still no system stats. Double Damn!
I remembered from reading around that if MREADTIME is less than SREADTIME the stats gathered could be ignored so I tried again.
And again.
And then decided it would not magically work, there was a problem.
I’ll gather the system stats with START and STOP and really hammer the box for a while, ensuring with autotrace
exec dbms_stats.gather_system_stats(gathering_mode => ‘START’,stattab=>’AD_BF_STATS’,statown
=> ‘DWPERF’ ,statid=>’SYSSTAT’||TO_CHAR(SYSDATE,’YYMMDDHH24MI’))
work-work-work-work-work
exec dbms_stats.gather_system_stats(gathering_mode => ‘STOP’);
BEGIN dbms_stats.gather_system_stats(gathering_mode => ‘STOP’); END;
*
ERROR at line 1:
ORA-20003: Unable to gather system statistics
ORA-06512: at “SYS.DBMS_STATS”, line 15882
ORA-06512: at line 1
Oh.
Oh Hell. Well, at least I now know that the chances are my interval systenm stats collection failed with this, but silently.
I bet if I dug around I would find a few failed jobs recorded where the INTERVAL version of the stats collection failed with this. I have not checked, I am just up to my eyes right now.
I found nothing at all via Google or Metalink about this exact error for gathering system stats. BUT maybe I lacked the correct priveleges on my non-SYS account. So I granted ANALYZE AND DICTIONARY and ANALYZE ANY to my user, as those or the priveleges mentioned for many of the DBMS_STATS management procedures.
Nope, did not help.
Blow it, If DBMS_STATS won’t let me keep the stats in my table, I’ll just rely on the default store of them, and not state my stats table in the call:
DWD2_DWPERF> exec dbms_stats.gather_system_stats(gathering_mode => ‘START’)
BEGIN dbms_stats.gather_system_stats(gathering_mode => ‘START’); END;
*
ERROR at line 1:
ORA-20000: Unable to gather system statistics : insufficient privileges
ORA-06512: at “SYS.DBMS_STATS”, line 15822
ORA-06512: at line 1
Oh. Well, that’s interesting.
If I don’t state a stats table the gather fails when initiated.
If I state a stats table the gather fails when I stop.
How frustrating.
I decided to give up. Then I decided I won’t be beaten and to grant myself DBA role and see if it works.
DWDEV2> exec dbms_stats.gather_system_stats(gathering_mode => ‘START’)
PL/SQL procedure successfully completed.
YEEEESSSSSS!!!
But will it now error when I stop the gather…
DWDEV2> exec dbms_stats.gather_system_stats(gathering_mode => ‘STOP’)
PL/SQL procedure successfully completed.
Deep Joy.
And it gathered stats.
SNAME PNAME PVAL1 PVAL2 ------------------------------------------ SYSSTATS_MAIN CPUSPEED 1,545.000000 SYSSTATS_MAIN CPUSPEEDNW 1,567.798516 SYSSTATS_MAIN IOSEEKTIM 10.000000 SYSSTATS_MAIN IOTFRSPEED 4,096.000000 SYSSTATS_MAIN MAXTHR SYSSTATS_MAIN MBRC SYSSTATS_MAIN MREADTIM SYSSTATS_MAIN SLAVETHR SYSSTATS_MAIN SREADTIM 5.000000 SYSSTATS_INFO DSTART 09-11-2009 13:02 SYSSTATS_INFO DSTOP 09-11-2009 13:02 SYSSTATS_INFO FLAGS 1.000000 SYSSTATS_INFO STATUS COMPLETED
{ignore the lack of mreadtime and sreadtime, nothing happend on the box in the few seconds I was collecting for}
SO, the lack of DBA role was a problem but had been hidden by my attempts to use a stats table. Now surely I can do what I wanted to and store the stats in my stats table…
exec dbms_stats.gather_system_stats(gathering_mode => ‘START’,stattab=>’AD_BF_STATS’,statown
=> ‘DWPERF’ ,statid=>’SYSSTAT’||TO_CHAR(SYSDATE,’YYMMDDHH24MI’))
PL/SQL procedure successfully completed.
Wehay.
work-work-work-work-work
exec dbms_stats.gather_system_stats(gathering_mode => ‘STOP’)
BEGIN dbms_stats.gather_system_stats(gathering_mode => ‘STOP’); END;
*
ERROR at line 1:
ORA-20003: Unable to gather system statistics
ORA-06512: at “SYS.DBMS_STATS”, line 15882
ORA-06512: at line 1
AAAAAGGGGHHHHHH!!!!!!!!!
Let’s check back.
exec dbms_stats.gather_system_stats(gathering_mode => ‘START’)
PL/SQL procedure successfully completed.
work-work-work-work-work
exec dbms_stats.gather_system_stats(gathering_mode => ‘STOP’)
PL/SQL procedure successfully completed.
That works, because I am not involving a stats table. If you look back at the two errors, they come from different (but very similar) line numbers. I’d missed that for a couple of hours, I blame dyslexia.
Conclusion:
Gathering and storing System Stats is as buggy and random as the rest of the dbms_stats functionality. {yes, I am peeved about this}. It Seems..
- If you gather system stats with the interval method, it can silently fail. This might only be if you involve a stats table.
- You need the DBA role to gather system stats, otherwise you get ORA-20000, via ORA-06512 at line 15822
- analyze and dictionary and analyze any are not enough to allow a non DBA account to gather system stats.
- If you try and store the previous system stats in your stats table as part of the gather you get ORA-20003, via ORA-06512 at line 15882 {different line}, even if you have DBA role.
- If you just rely on Oracle preserving the previous version of the stats, you can gather system stats fine.
- You can work around the issue by exporting the system stats to your stats table as a single step first, then gathering system stats without involving your stats table.
There is no metalink for the ORA-20003 via line 15882 in metalink and no google hit for it, so I reckon it is a damned rare event. And since the last time I blogged about a damned rare error, I see 1 or 2 hits on that blog page every day since
.
Decrypting Histogram Data August 11, 2009
Posted by mwidlake in internals, performance.Tags: data dictionary, histograms, performance, statistics
3 comments
I’m looking at Histograms at the moment and I’ve hit a problem I always hit when looking at histograms. So I’ve decided once and for all to see if I can find a solution.
The value held in DBA_TAB_HISTOGRAMS.ENDPOINT_VALUE for VARCHAR2 columns is a number with no apparent link to reality.
{Before anyone who has read my previous posts suggests it, I’ve already looked at the view and the relevant column resolves to data in either sys.histgrm$ or sys.hist_head$ and the underlying values held are still numbers and not processed in the view, so they are numbers. I think}.
{{I have a sneaking suspicion this could turn into a set of increasingly obscure postings…}}
{{{THIS one might be my most obscure so far!}}}
Let’s have a look:
select --owner table_name ,column_name ,endpoint_value end_val ,endpoint_number rowcount from all_tab_histograms where table_name ='TEST_P' and owner='WIDLAKEM' order by table_name,column_name,endpoint_number / TABLE_NAME COLUMN_NAME END_VAL ROWCOUNT --------------- --------------- ---------- --------------- TEST_P STATUS 0 41,523 TEST_P STATUS 1 42,065 TEST_P STATUS 2 42,099 TEST_P VC_1 3.3882E+35 0 TEST_P VC_1 3.4951E+35 1 TEST_P VC_1 3.6487E+35 2 TEST_P VC_1 3.7558E+35 3 TEST_P VC_1 3.9095E+35 4 TEST_P VC_1 4.0162E+35 5 TEST_P VC_1 4.1697E+35 6 TEST_P VC_1 4.3235E+35 7 TEST_P VC_1 4.4305E+35 8 TEST_P VC_1 4.5841E+35 9 TEST_P VC_1 4.6914E+35 10
So what in heck is 4.6914E+35 {last value listed}?
Well, it is a number of course. If I set my column heading appropriately I can see this:-
col end_val form 999,999,999,999,999,999,999,999,999,999,999,999 col column_name form a10 select column_name ,endpoint_value end_val ,endpoint_number rowcount from all_tab_histograms where table_name ='TEST_P' and owner='WIDLAKEM' and column_name='VC_1' order by table_name,column_name,endpoint_number / COLUMN_NAM END_VAL ROWCOUNT ---------- ------------------------------------------------ ---------- VC_1 338,822,823,410,931,000,000,000,000,000,000,000 0 VC_1 349,512,448,932,628,000,000,000,000,000,000,000 1 VC_1 364,867,102,368,954,000,000,000,000,000,000,000 2 VC_1 375,575,265,424,979,000,000,000,000,000,000,000 3 VC_1 390,949,334,699,583,000,000,000,000,000,000,000 4 VC_1 401,618,352,253,998,000,000,000,000,000,000,000 5 VC_1 416,972,612,321,579,000,000,000,000,000,000,000 6 VC_1 432,345,727,450,272,000,000,000,000,000,000,000 7 VC_1 443,054,364,035,286,000,000,000,000,000,000,000 8 VC_1 458,409,658,691,912,000,000,000,000,000,000,000 9 VC_1 469,139,289,515,351,000,000,000,000,000,000,000 10
The ENDPOINT_VALUE is an encoded representation of the varchar2 value, based on the trick of turning each character into it’s ascii equivalent (0-255) and then multiplying it by 256*no-of-characters-in to the string and adding it all together. Let’s use “Cat” as an example.
- Last character is ‘t’, ascii value is 116=116;
- Next-to-last character is ‘b’, ascii value 97, *(256)= 24832;
- Chr 3 in is ‘C’, ascii value 67, *(256*256) =4390912
- Total is 4390912+24832+116=4415860
What Oracle actually does is slightly different {and I could have this slightly wrong, my brain waved a white flag a couple of hours back} it takes the first 15 characters and multiplies the first character ascii value by power(256*15), second character ascii value by power(256*14) etc until it runs out of characters or gets to character 15.
How do I know this? I decoded the ‘C’ in the kernel
. No, I didn’t, I found a link to an encode script Jonathan Lewis had written {it’s also in his latest book, or at least in the set of scripts the book tells you how to download}. I’ve lost the original link I found, and thus a nice little article on the whole topic, but this link will do until I can re-find the original. {NB this referenced article mentions using the hexstr function which may be neater but, as I said earlier, my brain has given up}.
I’ve outrageously stolen Jonathan’s script which encodes a varchar2 string into the relevant numeric and used it as a template to create a decode version too. Just a couple of notes:
- I can imagine this might not work if you have multibyte characters.
- Oracle confuse things by doing something like translating the 15 character into a 36-digit string – and then taking only the first 15 significant digits. This kind of messes up the untranslate.
So, I create three functions:
- hist_chartonum that converts a varchar2 into something very similar to what is stored in the histogram views. Jonathan will find it incredibly familiar. An optional second paramter turns off the truncation that oracle does for the histogram data, so you can get the full glorious 36 significant digits for 15 characters if you wish.
- hist_numtochar which converts the numeric back to a string. It gets it slightly wrong due to the truncation of the least significant 20 or so characters. Again, an optional second paramater allows it to not replicate the trunaction and work with all 15 characters.
- hist_numtochar2 which is an attempt to allow for the truncation errors. I add power(256,9) to the value if it has been truncated. It seems to make eg the numeric representation of AAAAA translate back to AAAAA not AAAA@. Yes, it’s a fudge.
This is the script:-
-- cre_hist_funcs.sql
-- Heavily borrowed from Jonathan Lewis, sep 2003
-- MDW 11/8/09 - all mistakes mine.
-- hist_chartonum converts a string to a number in a very similar way
-- to how Oracle does so for storing in column ENDPOINT_VALUE in XXX_TAB_HISTOGRAMS
-- Optionally get it to not truncate the value in the same way to only 15 digits.
-- hist_numtochar converts the ENDPOINT_VALUE back to human-readable format.
-- Optinally get it to go beyond 15 numerics (7 or 8 characters)
-- hist_numtochar2 attempts to allow for truncation errors.
-- JLs version has been checked against 8.1.7.4, 9.2.0.4 and 10.1.0.2
-- I've used on 10.2.0.3
set timing off
create or replace function hist_chartonum(p_vc varchar2
,p_trunc varchar2 :='Y') return number
is
m_vc varchar2(15) := substr(rpad(p_vc,15,chr(0)),1,15);
m_n number := 0;
begin
for i in 1..15 loop
/* dbms_output.put(ascii(substr(m_vc,i,1)));
dbms_output.put(chr(9));
dbms_output.put_Line(to_char( power(256,15-i) * ascii(substr(m_vc,i,1)),
'999,999,999,999,999,999,999,999,999,999,999,999'
)
); */
m_n := m_n + power(256,15-i) * ascii(substr(m_vc,i,1));
end loop;
-- this converts it from a 36 digit number to the 15-digit number used
-- in the ENDPOINT_VALUE Column.
If p_trunc = 'Y' then
m_n := round(m_n, -21);
end if;
-- dbms_output.put_line(to_char(m_n,'999,999,999,999,999,999,999,999,999,999,999,999'));
return m_n;
end;
/
create or replace function hist_numtochar(p_num number
,p_trunc varchar2 :='Y') return varchar2
is
m_vc varchar2(15);
m_n number :=0;
m_n1 number;
m_loop number :=7;
begin
m_n :=p_num;
if length(to_char(m_n))<36 then
--dbms_output.put_line ('input too short');
m_vc:='num format err';
else
if p_trunc !='Y' then
m_loop :=15;
end if;
--dbms_output.put_line(to_char(m_N,'999,999,999,999,999,999,999,999,999,999,999,999'));
for i in 1..m_loop loop
m_n1:=trunc(m_n/(power(256,15-i)));
-- dbms_output.put_line(to_char(m_n1));
if m_n1!=0 then m_vc:=m_vc||chr(m_n1);
end if;
dbms_output.put_line(m_vc);
m_n:=m_n-(m_n1*power(256,15-i));
end loop;
end if;
return m_vc;
end;
/
create or replace function hist_numtochar2(p_num number
,p_trunc varchar2 :='Y') return varchar2
is
m_vc varchar2(15);
m_n number :=0;
m_n1 number;
m_loop number :=7;
begin
m_n :=p_num;
if length(to_char(m_n))<36 then
--dbms_output.put_line ('input too short');
m_vc:='num format err';
else
if p_trunc !='Y' then
m_loop :=15;
else
m_n:=m_n+power(256,9);
end if;
--dbms_output.put_line(to_char(m_N,'999,999,999,999,999,999,999,999,999,999,999,999'));
for i in 1..m_loop loop
m_n1:=trunc(m_n/(power(256,15-i)));
-- dbms_output.put_line(to_char(m_n1));
if m_n1!=0 then m_vc:=m_vc||chr(m_n1);
end if;
dbms_output.put_line(m_vc);
m_n:=m_n-(m_n1*power(256,15-i));
end loop;
end if;
return m_vc;
end;
/
rem
rem Sample of use:
rem
col id1 format 999,999,999,999,999,999,999,999,999,999,999,999
col id3 format 999,999,999,999,999,999,999,999,999,999,999,999
select
'short' text
,hist_chartonum('short') id1
,hist_numtochar(hist_chartonum('short')) id2
,hist_chartonum('short','N') id3
,hist_numtochar(hist_chartonum('short','N'),'N') id4
from dual;
select
'alongteststring' text
,hist_chartonum('alongteststring') id1
,hist_numtochar(hist_chartonum('alongteststring')) id2
,hist_chartonum('alongteststring','N') id3
,hist_numtochar(hist_chartonum('alongteststring','N'),'N') id4
from dual;
select
'amuchlongerteststring' text
,hist_chartonum('amuchlongerteststring') id1
,hist_numtochar(hist_chartonum('amuchlongerteststring')) id2
,hist_chartonum('amuchlongerteststring','N') id3
,hist_numtochar(hist_chartonum('amuchlongerteststring','N'),'N') id4
from dual;
spool off
The test at the end produces the following output:-
strng histgrm_translation ------------------------------------------------ translate_back ------------------------------------------------ untrimmed_translation ------------------------------------------------ untrimmed_translate_back ------------------------------------------------ short 599,232,339,077,851,000,000,000,000,000,000,000 short 599,232,339,077,850,549,055,205,294,263,500,800 short alongteststring 505,852,124,009,532,000,000,000,000,000,000,000 alongtm 505,852,124,009,531,849,435,547,374,314,942,055 alongteststring amuchlongerteststring 505,872,878,384,947,000,000,000,000,000,000,000 amuchly 505,872,878,384,946,809,648,430,989,359,543,156 amuchlongertest
Final comment. Oracle trims the value stored in sys.histgrm$ and knackers up the untranslate. Why? Well, this is only me thinking out aloud, but this reduces the number from 36 to 15 significant digits and numerics are stored as one byte per two digits plus an offset byte {am I missing a length byte?}. So 10 bytes or so. This will keep the size of data stored down. I mean, as I posted here, the sys.histgrm table gets big enough as it is!
Tomorrow {or whenever my brain recovers} I’ll show how I use it and one or two oddities I have found or come across from other people’s blogs.
Why is my SYSTEM Tablespace so Big? August 3, 2009
Posted by mwidlake in internals, VLDB.Tags: data dictionary, statistics, VLDB
25 comments
How big is your SYSTEM tablespace? 512M? 1GB? 4GB? Bigger? Is it bigger than the whole of the first critical database you ever looked after? Why has the SYSTEM tablespace got so big?
You can easily check the size of the SYSTEM tablespace with any oracle development/admin GUI or with this script.
@dbf_lst
Enter value for ts_name: system
old 8: where tablespace_name like upper('&ts_name'||'%')
new 8: where tablespace_name like upper('system'||'%')
continue>
FILE_NAME F_ID TS_NAME SIZE_M THOU_BL
----------------------------------------- ----- ---------------- -------- -------
+DISCG/sid/datafile/system.272.596722817 1 SYSTEM 24,920 3,190
That is 24GB. It is not the biggest one I have seen, but it is up there in the top 3.
Not so many versions ago, it was not unusual to have a system tablespace of less then 128MB in size. In fact, if you did not use stored PL/SQL this was probably the case up to and through 8i and into 9i. So – it’s PL/SQL causing the bloat? No, probably not. Ahhh, what about all that AWR automatic real-time monitoring gubbins you hear so much about? Nope, that goes into SYSAUX.
No, it’s probably down to one or both of two internal tables, SYS.AUD$ and SYS.HISTGRM$.
Let’s check out what are the biggest objects in this particular SYSTEM tablespace:
select owner,segment_name,segment_type
,bytes/(1024*1024) size_m
from dba_segments
where tablespace_name = 'SYSTEM'
and bytes/(1024*1024) >1
order by size_m desc
OWNER SEGMENT_NAME SEGMENT_TYPE SIZE_M
------------ -------------------- ------------------ --------
SYS C_OBJ#_INTCOL# CLUSTER 13,313
SYS AUD$ TABLE 5,688
SYS I_H_OBJ#_COL# INDEX 1,863
SYS I_AUD1 INDEX 1,606
SYS HIST_HEAD$ TABLE 311
SYS SOURCE$ TABLE 224
SYS IDL_UB1$ TABLE 224
SYS C_FILE#_BLOCK# CLUSTER 208
SYS INDPART$ TABLE 160
SYS OBJ$ TABLE 144
SYS I_HH_OBJ#_COL# INDEX 128
SYS I_HH_OBJ#_INTCOL# INDEX 128
SYS I_OBJ2 INDEX 80
SYS I_FILE#_BLOCK# INDEX 62
SYS TABPART$ TABLE 57
SYS I_OBJ1 INDEX 49
{snip}
To be a little obtuse, I’ll start with the second largest object, SYS.AUD$, which you can see is getting close to 6GB in size {Oh, and SYS.I_AUD1, the fourth largest object, is the PK index on AUD$}.
AUD$ is the underlying table that holds all of the system auditing information that is turned on and off with the SQL AUDIT command. You can use AUDIT to monitor particular users, changes to system priveleges, access on key tables or particular activity like table changes. Go check the SQL manual if you want to know more.
The records for this captured activity all go into this table. On this particular system a few key priveleges and logons by certain users are the only things being audited, but the data has built up over 3 or 4 years.
On version 9 and 10 the options for managing this table are…poor. Bascially, you have to do it yourself. I think I am right to say that it is the only SYS-owned table that Oracle Corp say it is OK for you to move into another tablespace yourself and that you are allowed to delete and truncate data from. If you AUDIT a lot of things or activities you probably need to delete some data. {If you AUDIT a lot of things or activities and you don’t delete any of the data, you probably did not turn on AUDIT very long ago or have not checked the size of your SYSTEM tablespace for a while…}.
Oh, and one final thing on the AUD$ table. It is lacking indexes – it has only one {ignoring LOB indexes}. This is a good thing as indexes slow down inserts quite significantly {I use a rule of thumb that if it takes “N“ms to insert a record into a table, it will take “2.2*N“ms if there is an index on the table, “3.4*N“ms if there are two indexes, “4.6*N“ms if there are three… Get the idea? However, I have not checked this for real in years. Treat this statement as a potential myth}. I would recommend you do not add further indexes to AUD$. If you need to investigate the information in there, I would strongly suggest you extract what you want into a new table with a CTAS statement, then index that copy table and do your work on that. You do not want to slow down further everything that is already slowed down a little by being audited.
What about the largest thing in the SYSTEM tablespace? It’s SYS.C_OBJ#_INTCOL# nd is over 13GB in size. I said it would be SYS.HISTGRM$ didn’t I?
Well, SYS.C_OBJ#_INTCOL# is a cluster. Clusters are structures, segments, that hold one or more tables. All records for the same key across all those tables are held together, in the same block or blocks. The key is either a normal index or a hash. The idea is that it speeds up selecting records from different tables with the same key – they are all in the same block or blocks. It does work too, but they seem to have fallen out of favour with Oracle designers, but are worth considering.
In this particular cluster there is only one table, SYS.HISTGRM$:
select owner,table_name from dba_tables where cluster_name='C_OBJ#_INTCOL#' OWNER TABLE_NAME ------------------------------ ------------------------ SYS HISTGRM$ 1 row selected.
So, SYS.HISTGRM$ is the only table in the cluster. It {and it’s cluster index, on columns OBJ# and INTCOL#} are hiding in this cluster segment. It IS the biggest thing in this particular SYSTEM tablespace. The only index, other than the cluster index, on the table SYS.HISTGRM$ is I_H_OBJ#_COL# and is the third largest object in the SYSTEM tablespace.
--ind_cols
select
INDEX_OWNER ind_owner
,INDEX_NAME ind_Name
,TABLE_NAME tab_Name
,substr(column_position,1,3) Psn
,COLUMN_NAME Col_Name
from dba_ind_columns
where table_name like upper(nvl('&tab_name','WHOOPS')||'%')
order by 3,1,2,4,5
--eof
@ind_cols
Enter value for tab_name: histgrm$
IND_OWNER IND_NAME TAB_NAME PSN COL_NAME
----------- ------------------ ------------------ --------- --------------
SYS I_H_OBJ#_COL# HISTGRM$ 1 OBJ#
2 COL#
2 rows selected.
Elapsed: 00:00:02.07
-- you find the cluster index by looking on DBA_INDEXES and
--DBA_IND_COLUMNS where the "TABLE_NAME" is the name of the cluster
@ind_cols
Enter value for tab_name: C_OBJ#_INTCOL#
IND_OWNER IND_NAME TAB_NAME PSN COL_NAME
----------- ------------------ ------------------ --------- --------------
SYS I_OBJ#_INTCOL# C_OBJ#_INTCOL# 1 OBJ#
2 INTCOL#
2 rows selected.
Elapsed: 00:00:00.93
What is HISTGRM$? Do you think it looks like it could be something to do with column Histograms? Well it is, it’s the underlying data dictionary table that holds column histograms. It started to become common to see a large HISTGRM$ table with Oracle 9 systems with many partitions {and of course subpartitions, if used}, where statistics were gathered on these partitioned tables aggressively and included a METHOD_OPT clause that causes column statistics to be collected. eg FOR ALL INDEXED COLUMNS SIZE 200 or FOR ALL COLUMNS SIZE 100.
Where most people have seen it balloon in size though is with Oracle 10 and the introduction of…the Automated Stats gathering job {so, there is a connection to previous recent posts
}.
Every night, every weekend, any partition or sub-partiton that is 10% or more different to last time stats were gathered, get stats gathered. The automatic job gathers states with METHOD_OPT=AUTO. Which means your Oracle software decides which columns to gather histogram stats on and how many buckets it collects.
Whoever wrote that part of the stats gathering feature really liked to gather information.
If you have lots of partitions {or sub-partitions} and gather stats with the METHOD_OPT=AUTO, the two factors added together result in a massive HISTGRM$ table and thus a massive cluster.
Is space important? No, of course not, disc acreage is cheap, just buy bigger discs
. {That, of course, was a highly questionable suggestion}.
Being serious, I’m personally not too concerend about my SYSTEM tablespace being that big, after all compared to the size of databases that use thousands of partitions, 32GB is usually not a major concern, but I know of people who did not expect their system tablespace to get that big and have issues with simple running out of strage available to them.
I am more concerned about the impact on parse times and general data dictionary perforance of such large objects. I’ve found the data dictionary can be very slow for some queries when there are large numbers of extents, objects, columns…Histograms are just in there with the mix. I’ve had to work around such slow data dictionary access several times in the past and I suspect {though have not had time to in any way support this with evidence} that parsing is slowed down by unneccessary histograms. As an example of this, which actually has nothing to do with SYS.HISTGRM$, if you look back up this post at the code listing the indexes on the cluster and table, the code took between a second and two seconds to run. Normally it takes less than 1/10 of a second.
Can you do anything about the size of these objects? Well, with SYS.AUD$ Oracle allow you to delete from it and move the table to a new tablespace. If you use the SQL AUDIT command, you should include housekeeping the SYS.AUD$ table in your overall implementation.
SYS.C_OBJ#_INTCOL# and SYS.HISTGRM in it are slightly trickier. It is not easy to reduce the space used by these objects and thus the SYSTEM tablespace once they have grown large, but you can stop them getting bigger by collecting fewer column histogram statistics on your partitioned tables. Obviously that is not as simple as it sounds, but many sites with large numbers of objects do turn off the automated stats gathering and run their own code.
Automatic Statistics Gathering Fails #3 July 29, 2009
Posted by mwidlake in internals, performance.Tags: data dictionary, performance, statistics
7 comments
I’m steeling a few minutes from my lunch time to finish off this thread on the automated statistics gathering failing.
<<first post … <second post
There is another way to spot what the automated stats gathering choked on but it is not that reliable. But it does involve digging a little into Oracle’s internals, which is fun.
Under 10.2 onwards, when you gather stats on a segment, either directly or via a schema, database or the automatic stats collection, Oracle stores the stats that are replaced for you automatically {I say 10.2 as I am not sure that segment-level gather_table/index_stats are automatically backed up like this. I have some old notes saying it seemed not but I am not sure if this was 9i, 10.1 or me just getting confused at the time. If anyone knows or can check back, let me know
}
This means you can get the stats back using the dbma_stats.restore_xxxxxxxx_stats procedures. eg dbms_stats.restore_table_stats(ownname=>user,tabname=>’TEST_P’,as_of_timestamp=>systimestamp-1);
This will restore the stats of my table TEST_P to what they were at this time yesterday. You do not need to create a stattab table and store the prevous stats manually.
I’m not going to blog any more right now about this handy feature, the “PL/SQL packages and types” manual will tell you what you need, but I will comment that by default you can only go back 31 days.
Oracle gives you a table to see the history of stats stored, DBA_TAB_STATS_HIST
desc dba_tab_stats_history Name Null? Type ----------------------------------------- -------- ---------------------------- OWNER VARCHAR2(30) TABLE_NAME VARCHAR2(30) PARTITION_NAME VARCHAR2(30) SUBPARTITION_NAME VARCHAR2(30) STATS_UPDATE_TIME TIMESTAMP(6) WITH TIME ZONE select owner,table_name, partition_name,stats_update_time from dba_tab_stats_history where table_name='TEST_P' and owner=user OWNER TABLE_NAME PARTITION_NAME STATS_UPDATE_TIME -------- ---------- --------------- -------------------------------- WIDLAKEM TEST_P 28-JUL-09 13.59.43.647785 +00:00 WIDLAKEM TEST_P ID_10K 28-JUL-09 13.59.43.647785 +00:00 WIDLAKEM TEST_P ID_30K 28-JUL-09 13.59.43.647785 +00:00 WIDLAKEM TEST_P ID_MAX 28-JUL-09 13.59.43.647785 +00:00 WIDLAKEM TEST_P ID_40K 28-JUL-09 13.59.43.647785 +00:00 WIDLAKEM TEST_P ID_20K 28-JUL-09 13.59.43.647785 +00:00 WIDLAKEM TEST_P 28-JUL-09 13.37.42.043557 +00:00 WIDLAKEM TEST_P ID_MAX 28-JUL-09 13.37.42.043557 +00:00 WIDLAKEM TEST_P ID_40K 28-JUL-09 13.37.42.043557 +00:00 WIDLAKEM TEST_P ID_30K 28-JUL-09 13.37.42.043557 +00:00 WIDLAKEM TEST_P ID_20K 28-JUL-09 13.37.42.043557 +00:00 WIDLAKEM TEST_P ID_10K 28-JUL-09 13.37.42.043557 +00:00 WIDLAKEM TEST_P 28-JUL-09 09.11.40.098445 +00:00 WIDLAKEM TEST_P ID_MAX 28-JUL-09 09.11.40.098445 +00:00 WIDLAKEM TEST_P ID_40K 28-JUL-09 09.11.40.098445 +00:00 WIDLAKEM TEST_P ID_30K 28-JUL-09 09.11.40.098445 +00:00 WIDLAKEM TEST_P ID_20K 28-JUL-09 09.11.40.098445 +00:00 WIDLAKEM TEST_P ID_10K 28-JUL-09 09.11.40.098445 +00:00
You can see that I probably gathered stats on my test table 3 times yesterday, each time I gathered at a granularity of all, ie global and partition stats. Note also, all partitions and the table get the same timestamp. I think this is because oracle records the timestamp as when the stats for the set of segments was swapped into the data dictionary {note, not started}.
That’s quite useful. However, there is no similar view for indexes. That’s a shame. But you can recover index stats and this view is saying nothing about the stats as they were?…Is there more to be found?… Let’s go see what that view is looking it
@vw_txt
Enter value for vw_name: dba_tab_stats_history
OWNER VIEW_NAME TEXT_LENGTH
-------- ------------------------------ -----------
TEXT
--------------------------------------------------------------------------
SYS DBA_TAB_STATS_HISTORY 876
select u.name, o.name, null, null, h.savtime
from sys.user$ u, sys.obj$ o, sys.wri$_optstat_tab_history h
where h.obj# = o.obj# and o.type# = 2 and o.owner# = u.user#
union all
-- partitions
select u.name, o.name, o.subname, null, h.savtime
from sys.user$ u, sys.obj$ o, sys.wri$_optstat_tab_history h
where h.obj# = o.obj# and o.type# = 19 and o.owner# = u.user#
union all
-- sub partitions
select u.name, osp.name, ocp.subname, osp.subname, h.savtime
from sys.user$ u, sys.obj$ osp, obj$ ocp, sys.tabsubpart$ tsp,
sys.wri$_optstat_tab_history h
where h.obj# = osp.obj# and osp.type# = 34 and osp.obj# = tsp.obj# and
tsp.pobj# = ocp.obj# and osp.owner# = u.user#
union all
-- fixed tables
select 'SYS', t.kqftanam, null, null, h.savtime
from sys.x$kqfta t, sys.wri$_optstat_tab_history h
where
t.kqftaobj = h.obj#
Now that is interesting. Let’s go look at that table:
desc sys.wri$_optstat_tab_history Name Null? Type ----------------------------------------- -------- --------------------------- OBJ# NOT NULL NUMBER SAVTIME TIMESTAMP(6) WITH TIME ZONE FLAGS NUMBER ROWCNT NUMBER BLKCNT NUMBER AVGRLN NUMBER SAMPLESIZE NUMBER ANALYZETIME DATE CACHEDBLK NUMBER CACHEHIT NUMBER LOGICALREAD NUMBER SPARE1 NUMBER SPARE2 NUMBER SPARE3 NUMBER SPARE4 VARCHAR2(1000) SPARE5 VARCHAR2(1000) SPARE6 TIMESTAMP(6) WITH TIME ZONE
Some of those columns look interesting. Unforturnately CACHEDBLK onwards are empty, even in version 11 {but it shows that Oracle have built in the ability to use those yet-to-be used columns about average cached blocks and cache hit ratios for segments you might have spotted in eg DBA_TAB_STATISTICS}.
Could there be an index version? Of course there is :-
desc sys.wri$_optstat_ind_history Name Null? Type ----------------------------------------- -------- --------------------------- OBJ# NOT NULL NUMBER SAVTIME TIMESTAMP(6) WITH TIME ZONE FLAGS NUMBER ROWCNT NUMBER BLEVEL NUMBER LEAFCNT NUMBER DISTKEY NUMBER LBLKKEY NUMBER DBLKKEY NUMBER CLUFAC NUMBER SAMPLESIZE NUMBER ANALYZETIME DATE GUESSQ NUMBER CACHEDBLK NUMBER CACHEHIT NUMBER LOGICALREAD NUMBER SPARE1 NUMBER SPARE2 NUMBER SPARE3 NUMBER SPARE4 VARCHAR2(1000) SPARE5 VARCHAR2(1000) SPARE6 TIMESTAMP(6) WITH TIME ZONE
I’ve written a couple of scripts to extract data from these underlying tables, with the extra data you can grab. Use these links to download the table script and the index script.
Here below is the index script though:-
-- show_ish_full
-- Martin Widlake 14/4/08
-- this gets the index stats history - there is no index equiv of dba_tab_stats_hist
-- that I can see in 10.2 (or 11)
--
col owner form a12
col ind_full_name form a40
col stat_upd_time form a15
col blevel form 99 head bl
select owner,ind_full_name,
to_char(stat_upd_time,'MMDD HH24:MI:SS') stat_upd_time
,numrows,numleafs,blevel,dist_kys,lf_p_ky, dbl_p_ky,clufac,samp_size,obj_hash
,greatest(nvl(samp_size,1),1)/greatest(nvl(numrows,1),1) samp_pct
from (
select u.name owner, o.name ind_name,o.name ind_full_name
, h.savtime stat_upd_time
,h.rowcnt numrows,h.leafcnt numleafs, h.blevel blevel,h.distkey dist_kys
,h.lblkkey lf_p_ky,h.dblkkey dbl_p_ky,h.clufac clufac, h.samplesize samp_size,h.obj# obj_hash
from sys.user$ u, sys.obj$ o, sys.wri$_optstat_ind_history h
where h.obj# = o.obj# and o.type# = 1 and o.owner# = u.user#
union all
-- partitions
select u.name owner, o.name ind_name, o.name||'-'||o.subname ind_full_name
, h.savtime stat_upd_time
,h.rowcnt numrows,h.leafcnt numleafs, h.blevel blevel,h.distkey dist_kys
,h.lblkkey lf_p_ky,h.dblkkey dbl_p_ky,h.clufac clufac, h.samplesize samp_size,h.obj# obj_hash
from sys.user$ u, sys.obj$ o, sys.wri$_optstat_ind_history h
where h.obj# = o.obj# and o.type# = 20 and o.owner# = u.user#
union all
-- sub partitions
select u.name owner, osp.name ind_name, osp.name||'-'||ocp.subname||'='|| osp.subname ind_full_name
, h.savtime stat_upd_time
,h.rowcnt numrows,h.leafcnt numleafs, h.blevel blevel,h.distkey dist_kys
,h.lblkkey lf_p_ky,h.dblkkey dbl_p_ky,h.clufac clufac, h.samplesize samp_size,h.obj# obj_hash
from sys.user$ u, sys.obj$ osp, obj$ ocp, sys.indsubpart$ tsp,
sys.wri$_optstat_ind_history h
where h.obj# = osp.obj# and osp.type# = 35 and osp.obj# = tsp.obj# and
tsp.pobj# = ocp.obj# and osp.owner# = u.user#
) where ind_name like upper(nvl('&indname','whoops')||'%')
and owner like upper('&indown')||'%'
order by owner,ind_name,stat_upd_time desc
/
clear colu
I’ll leave you to play with these as you wish, especially as I ran out of time to finish this blog an hour ago and now owe my current employer some of my weekend in return, but I’ll finish off with a use for the scripts.
One thing to bear in mind, these tables are holding the stats that were replaced, not the stats that were created. So they hold historical information.
If you read the earlier posts, you may remember that the automatic stats job can choke and fail on extremely large segments. Well, sometimes {and I have to admit, only sometimes} you can spot what segments it failed on. That is because, at the end of the window when the job gets stopped, it attempts to clean up after itself. It appears to check for stats that have changed during the run but it failed to process correctly and resets them. It copes the stats back. So you sometimes see this:-
@show_tsh_full OWNER TAB_FULL_NAME STAT_UPD_TIME NUMROWS NUMBLKS AVG_RL SAMP_SIZE OBJ_HASH SAMP_PCT ------------ ---------------------------------------- -------------------- ---------- ---------- ------ ---------- -------- -------- TABOWNER1 HUGETABLE4 0522 06:00:01 1473990266 27319245 128 5660375 54714 .004 TABOWNER1 HUGETABLE4 0522 05:48:22 1327119100 24544873 128 13271191 54714 .010
In the above case, you can see that this massive table had stats set at 5:48am and then again at 6:00.01am. That second record is the automated stats swapping the last known good stats back in place. That is the segment it choked on.
How do you stop the automatic stats job chocking on this extent? You do a manual gather of course. You will see the above job attempted a 0.1% sample size that almost worked at 05:48. The value it swapped back is 0.04 – which was the sample size of the manual gather I did a few days earlier to get over the last choke. {I also tweaked the histograms I gathered to save time}. I had not yet finished converting the manual fix to an automated one when I recorded the above.
Why do I think that the automatic job sets the timestamp in the WRI$_OPTSTAT_xxx_HIST tables when it swaps in the new stats and not when it starts? Because of the time of the entries after a large object has been analysed. Why do I think it checks for stats that have changed since the start of the run and replaces them rather than keeping track of the information as it goes? Because, when I first came across this choking issue, I was veryifying the problem after 10pm at night. When I realised there was an issue at around 1am, I started manually gathering stats. It took me until 4am, I checked they were in place and went to bed. Imagine my bad temper the next day when I found out that the automatic job had gone and re-wound most of the stats I had gathered, resetting them to what they had been at 10am the previous night. It was a seriously bad temper.
Addition – I’ve had a comment that may or may not be spam, saying they do not understand “the last bit” (If you are a real person molamola, I apologise). Reading it, it is maybe not clear, so this is what happened and why I was angry
1) I had realised the auto stats job was failing overnight.
2) I logged on from home at just before 10pm and watched the system
3) at 10pm the auto stats job started and collected stats on some small tables and indexes
4) around 11,11:30pm I realised is was not progressing,
5) from 11:30 to 3am in the morning, I created my own gather_table_stats and gather_index_stats statements and ran them.
6) at around 4am I went to bed, having checked stats were gathered and execution plans were good.
7) Got to work around 9am, a bit tired.
8) Found lots of performance issues due to old stats.
9) Shouted randomly at people and got angry, started re-gathering stats.
10) Worked out that at 6am, when the auto stats job finished, all the tables/indexes IT had intended to gather stats on but had failed to do so, it reset the stats to what they had been set to at 10pm the previous night, when it started.
ie very old and bad stats.
