Data Dictionary Performance #2 September 28, 2009
Posted by mwidlake in internals, performance.Tags: data dictionary, performance
4 comments
A couple of posts ago I demonstrated that the DBA_ views are general faster than the ALL_ views and that adding an owner filter to DBA_ views is potentially quicker than selecting from the USER_ views.
Something else I do automatically, to help performance, is always specify and link via the schema owner.
The below is based on a 10.2.0.3 database with fixed object statistics gathered, though a long while ago. A have a test partitioned table called TEST_P and I want to see how many blocks are in the table and also in the table partitions, so I link between DBA_TABLES and DBA_TAB_PARTITIONS. I only specify the table name as I know there is only one table called TEST_P:
select dbta.table_name,dbta.blocks ,dtp.partition_name,dtp.blocks from dba_tables dbta ,dba_tab_partitions dtp where dbta.table_name=dtp.table_name and dbta.table_name='TEST_P' TABLE_NAME BLOCKS PARTITION_ BLOCKS ---------- ---------- ---------- ---------- TEST_P 7394 ID_MAX 370 TEST_P 7394 ID_40K 1756 TEST_P 7394 ID_30K 1756 TEST_P 7394 ID_20K 1756 TEST_P 7394 ID_10K 1756 5 rows selected. Elapsed: 00:00:00.04 Statistics --------------------------------------------- 0 recursive calls 0 db block gets 782 consistent gets
Now I do the same but specify the schema owner in the join.
select dbta.table_name,dbta.blocks ,dtp.partition_name,dtp.blocks from dba_tables dbta ,dba_tab_partitions dtp where dbta.table_name =dtp.table_name AND DBTA.OWNER =DTP.TABLE_OWNER and dbta.table_name ='TEST_P' TABLE_NAME BLOCKS PARTITION_ BLOCKS ---------- ---------- ---------- ---------- TEST_P 7394 ID_MAX 370 TEST_P 7394 ID_40K 1756 TEST_P 7394 ID_30K 1756 TEST_P 7394 ID_20K 1756 TEST_P 7394 ID_10K 1756 5 rows selected. Elapsed: 00:00:00.04 Statistics -------------------------------------------- 0 recursive calls 0 db block gets 580 consistent gets
Just for the record, I have run these statements several time so the consistent gets are constant. The form with the “AND DBTA.OWNER =DTP.TABLE_OWNER” line is consistently running at 580 consistent gets and the version without at 782 consistent gets.
If I now add in the where clause to look only for tables within the schema I am interested in, the code is even more efficient:
select dbta.table_name,dbta.blocks ,dtp.partition_name,dtp.blocks from dba_tables dbta ,dba_tab_partitions dtp where dbta.table_name =dtp.table_name AND DBTA.OWNER =DTP.TABLE_OWNER and dbta.table_name ='TEST_P' and dbta.owner =user TABLE_NAME BLOCKS PARTITION_ BLOCKS ---------- ---------- ---------- ---------- TEST_P 7394 ID_10K 1756 TEST_P 7394 ID_20K 1756 TEST_P 7394 ID_30K 1756 TEST_P 7394 ID_40K 1756 TEST_P 7394 ID_MAX 370 5 rows selected. Elapsed: 00:00:00.03 Statistics -------------------------------------------- 0 recursive calls 0 db block gets 137 consistent gets
Down to 137 consistent gets.
It’s maybe not at all suprising that fully specifying the schema owner of the table we are interested in allows a more efficient path through the data dictionary, but it is something we all often fail to specify as it means writing more SQL, and it works quickly enough. For one table.
But then if you convert a quick check script into part of your system housekeeping scripts and then use that housekeeping script on a very large database, things can start to run very slowly. I’ve never seen that happen, not with one of my own scripts, honest 🙂
The reason to always specify the schema owner, at all times when possible, when drilling down the data dictionary, is that so many things link back to sys.obj$ and it has an index with a leading column on owner {well,owner#}. The whole data dictionary tends towards 3rd normal form and parent-child relationships through IDs (name#, obj#, user#, ts#). Schema and name identify the objects and are one of the few links into this structure, which then links to other tables via obj# and other hash IDs.
IND_NAME TAB_NAME PSN COL_NAME --------------- ------------------ --------- ------------- I_OBJ1 OBJ$ 1 OBJ# I_OBJ2 OBJ$ 1 OWNER# 2 NAME 3 NAMESPACE 4 REMOTEOWNER 5 LINKNAME 6 SUBNAME I_OBJ3 OBJ$ 1 OID$
So my advice is, if you are looking at data dictionary views where the schema owner is involved, always include that column in the join and always filter by schema owner if you can. And if things are still slow, pull out the view code and look at the indexes on the underlying sys.$ views. After all, the data dictionary is, to all intents and purposes, just a normal normalised database.