Update to Decoding High and Low values February 24, 2010
Posted by mwidlake in internals, performance.Tags: data dictionary, statistics
9 comments
After what seems an age, I finally got a little time to enhance my col_stats code with the extra info people provided as comments to
this original posting on it. I also now handle the translation of dates correctly and also timestamps.
I had a real struggle with timestamps. I knew I could find the answer in Tom Kyte’s “Expert Oracle Database Architecture” as one of the comments said so, but I tried to work it out myself as my copy of the book was at my Mothers {no, she is not learning Oracle Databases, I left it there by accident}. As the other elements of timestamps are held as numerical elements stored as hex (number of days, number of months) I tried to interpret it like that. I was being too complex, it is just an 8-digit hex number – to_number(value,’XXXXXXXX’).
Anyway, this is new-improved, cleanes whiter-than-white script {I’ve increased the width of myt blog to better show code like the belwo, but try clicking on “show source” if you want a plain text version}:
-- col_stats
-- Martin Widlake mdw 21/03/2003
-- MDW 11/12/09 enhanced to include more translations of low_value/high_value
-- pilfered from Gary Myers blog
-- MDW 20/02/10 added in the handling of timestamps.
col owner form a6 word wrap
col table_name form a15 word wrap
col column_name form a22 word wrap
col data_type form a12
col M form a1
col num_vals form 99999,999
col dnsty form 0.9999
col num_nulls form 99999,999
col low_v form a30
col low_v2 form a18
col hi_v form a30
col data_type form a10
col low_value form a25
col high_value form a25
set lines 110
break on owner nodup on table_name nodup
spool col_stats.lst
select --owner
-- ,table_name
column_name
,data_type
,decode (nullable,'N','Y','N') M
,num_distinct num_vals
,num_nulls
,density dnsty
,decode(substr(data_type,1,9) -- as there are several timestamp types
,'NUMBER' ,to_char(utl_raw.cast_to_number(low_value))
,'VARCHAR2' ,to_char(utl_raw.cast_to_varchar2(low_value))
,'NVARCHAR2' ,to_char(utl_raw.cast_to_nvarchar2(low_value))
,'BINARY_DO',to_char(utl_raw.cast_to_binary_double(low_value))
,'BINARY_FL' ,to_char(utl_raw.cast_to_binary_float(low_value))
,'DATE',rtrim(
to_char(100*(to_number(substr(low_value,1,2),'XX')-100)
+ (to_number(substr(low_value,3,2),'XX')-100),'fm0000')||'-'||
to_char(to_number(substr(low_value,5,2),'XX'),'fm00')||'-'||
to_char(to_number(substr(low_value,7,2),'XX'),'fm00')||' '||
to_char(to_number(substr(low_value,9,2),'XX')-1,'fm00')||':'||
to_char(to_number(substr(low_value,11,2),'XX')-1,'fm00')||':'||
to_char(to_number(substr(low_value,13,2),'XX')-1,'fm00'))
,'TIMESTAMP',rtrim(
to_char(100*(to_number(substr(low_value,1,2),'XX')-100)
+ (to_number(substr(low_value,3,2),'XX')-100),'fm0000')||'-'||
to_char(to_number(substr(low_value,5,2),'XX'),'fm00')||'-'||
to_char(to_number(substr(low_value,7,2),'XX'),'fm00')||' '||
to_char(to_number(substr(low_value,9,2),'XX')-1,'fm00')||':'||
to_char(to_number(substr(low_value,11,2),'XX')-1,'fm00')||':'||
to_char(to_number(substr(low_value,13,2),'XX')-1,'fm00')
||'.'||to_number(substr(low_value,15,8),'XXXXXXXX') )
) low_v
,decode(substr(data_type,1,9) -- as there are several timestamp types
,'NUMBER' ,to_char(utl_raw.cast_to_number(high_value))
,'VARCHAR2' ,to_char(utl_raw.cast_to_varchar2(high_value))
,'NVARCHAR2' ,to_char(utl_raw.cast_to_nvarchar2(high_value))
,'BINARY_DO',to_char(utl_raw.cast_to_binary_double(high_value))
,'BINARY_FL' ,to_char(utl_raw.cast_to_binary_float(high_value))
,'DATE',rtrim(
to_char(100*(to_number(substr(high_value,1,2),'XX')-100)
+ (to_number(substr(high_value,3,2),'XX')-100),'fm0000')||'-'||
to_char(to_number(substr(high_value,5,2),'XX'),'fm00')||'-'||
to_char(to_number(substr(high_value,7,2),'XX'),'fm00')||' '||
to_char(to_number(substr(high_value,9,2),'XX')-1,'fm00')||':'||
to_char(to_number(substr(high_value,11,2),'XX')-1,'fm00')||':'||
to_char(to_number(substr(high_value,13,2),'XX')-1,'fm00'))
,'TIMESTAMP',rtrim(
to_char(100*(to_number(substr(high_value,1,2),'XX')-100)
+ (to_number(substr(high_value,3,2),'XX')-100),'fm0000')||'-'||
to_char(to_number(substr(high_value,5,2),'XX'),'fm00')||'-'||
to_char(to_number(substr(high_value,7,2),'XX'),'fm00')||' '||
to_char(to_number(substr(high_value,9,2),'XX')-1,'fm00')||':'||
to_char(to_number(substr(high_value,11,2),'XX')-1,'fm00')||':'||
to_char(to_number(substr(high_value,13,2),'XX')-1,'fm00')
||'.'||to_char(to_number(substr(low_value,15,8),'XXXXXXXX')))
, high_value
) hi_v
,low_value,high_value
from dba_tab_columns
where owner like upper('&tab_own')
and table_name like upper(nvl('&tab_name','WHOOPS')||'%')
ORDER BY owner,table_name,COLUMN_ID
/
clear colu
spool off
clear breaks
Yes, I know, it is a considerable chunk of code just to get the high and low values for columns, as identified by the last stats gather to be run on the table. {it is important to remember that these high and low values will only be as accurate as the stats gather that was run and when. An “estimate_percent=> 100″ run a minute ago will give probaby give accurate values, an “estimate_percent=>0.1′” run last month is not – which is usually what I am checking for}. I could make the script a lot neater by writing a set of functions to call to get the translation, as was suggested as a comment on the original posting and is in fact somethign I have done in the past, but one is often not allowed to put new stored PL/SQL code onto production systems, so a script which does it is, for me, best.
I should show some proof of it working, I guess. The below will create a table with a nice spread of column type and three records with a reasonable spread of data:
-- test_col_hilo
-- M Widlake 11/12/09
--
--testing script to show high and low values for various column types
--
set lines 90 pages 1000 trims on pause off
--set sqlpro 'test102>'
set sqlpro'>'
set echo on
set autotrace off
spool test_col_hilo.lst
-- basic setup info
col name form a30
col value form a20
select substr(name,1,30) name,substr(value,1,20) value
from v$parameter
where name in ('db_block_size','compatible','cpu_count','db_file_multiblock_read_count'
,'optimizer_mode','sga_target','sort_area_size')
order by name
/
drop table test_hilo purge;
--
--
COL TABLE_NAME FORM A15
COL TABLE_OWNER FORM A10
create table test_hilo
(id number(8) not null
,numsmall number(4)
,numlarge number
,datesmall date
,datelarge date
,ts timestamp
,ts0 timestamp(0)
,ts3 timestamp(3)
,ts6 timestamp(6)
,ts9 timestamp(9)
,vcsmall varchar2(10)
,vclarge varchar2(100)
,vcodd varchar2(20)
,nvcsmall nvarchar2(10)
,nvclarge nvarchar2(100)
,bfsmall binary_float -- NB precision is only 6 or 7 digits
,bflarge binary_float
,bdsmall binary_double
,bdlarge binary_double
)
tablespace users
/
-- insert low values
insert into test_hilo
(id,numsmall,numlarge
,datesmall,datelarge
,ts,ts0,ts3
,ts6,ts9
,vcsmall,vclarge
,vcodd
,nvcsmall,nvclarge
,bfsmall,bflarge
,bdsmall,bdlarge
)
values
(1
,-10,-123456789.12345
,trunc(sysdate-1000),sysdate-500000
,systimestamp,systimestamp
,to_timestamp('01-FEB-2010 12:34:56.123','DD-MON-YYYY HH24:MI:SS.FF')
,to_timestamp('01-FEB-2010 12:34:56.1234567','DD-MON-YYYY HH24:MI:SS.FF')
,to_timestamp('01-FEB-2010 12:34:56.987654321','DD-MON-YYYY HH24:MI:SS.FF')
,'ABCDE','AABCABCDEABCDEFGABCDEFGHI'
,'Abc!"£$%^&*()deF'
,'ABCDE','AABCABCDEABCDEFGABCDEFGHI'
,-1.23,-12345.6
,12345,1234567890
)
/
-- insert mid values
insert into test_hilo
(id,numsmall,numlarge
,datesmall,datelarge
,ts,ts0,ts3
,ts6,ts9
,vcsmall,vclarge
,vcodd
,nvcsmall,nvclarge
,bfsmall,bflarge
,bdsmall,bdlarge
)
values
(2
,15,1515151515151
,trunc(sysdate-10),sysdate-5000
,systimestamp,systimestamp
,to_timestamp('05-FEB-2010 12:34:56.123','DD-MON-YYYY HH24:MI:SS.FF')
,to_timestamp('05-FEB-2010 12:34:56.1234567','DD-MON-YYYY HH24:MI:SS.FF')
,to_timestamp('05-FEB-2010 12:34:56.987654321','DD-MON-YYYY HH24:MI:SS.FF')
,'CCCCC','CCBCABCDEABCDEFGABCDEFGHI'
,'CbaaBC'
,'EFGHI','ABCDEFGHIJKLMNOPQRSTUV'
,1.23,12345.6
,54321,5432112345
)
/
--insert high values
insert into test_hilo
(id,numsmall,numlarge
,datesmall,datelarge
,ts,ts0,ts3
,ts6,ts9
,vcsmall,vclarge
,vcodd
,nvcsmall,nvclarge
,bfsmall,bflarge
,bdsmall,bdlarge
)
values
(3
,99,9898989898989
,trunc(sysdate-1),sysdate+1000
,systimestamp+10,systimestamp+10
,to_timestamp('20-FEB-2010 18:17:16.876','DD-MON-YYYY HH24:MI:SS.FF')
,to_timestamp('21-FEB-2010 17:16:15.555555','DD-MON-YYYY HH24:MI:SS.FF')
,to_timestamp('22-FEB-2010 16:15:14.123456789','DD-MON-YYYY HH24:MI:SS.FF')
,'ZYXWV','ZZZZZYZZXZZWZZVZZUZZTZZSZZRZZQZZP'
,'Z;#[]{}~@:'
,'VWXYZ','ZYXWVUTSRQPONMLKJIHGFEDCBA'
,9.87,98765.4
,987654321,987654321.1234567
)
/
commit;
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=> USER, TABNAME=>'TEST_HILO')
set autotrace off
set echo off
--
spool off
clear colu
--
And the below is the translation of the values – I used a cut-down version of my script, so it woud fit on the blog better:
COLUMN_NAME DATA_TYPE LOW_V HI_V
------------ ------------- ----------------------------- -----------------------------
LOW_VALUE HIGH_VALUE
------------------------- -------------------------
ID NUMBER 1 3
C102 C104
NUMSMALL NUMBER -10 99
3E5B66 C164
NUMLARGE NUMBER -123456789.12345 9898989898989
3A644E38220C59433366 C70A5A5A5A5A5A5A
DATESMALL DATE 2007-05-31 00:00:00 2010-02-23 00:00:00
786B051F010101 786E0217010101
DATELARGE DATE 0641-03-10 17:36:47 2012-11-20 17:36:47
6A8D030A122530 78700B14122530
TS TIMESTAMP(6) 2010-02-24 17:36:47.255015000 2010-03-06 17:36:47.255015000
786E02181225300F333858 786E0306122530
TS0 TIMESTAMP(0) 2010-02-24 17:36:47. 2010-03-06 17:36:47.
786E0218122530 786E0306122530
TS3 TIMESTAMP(3) 2010-02-01 12:34:56.123000000 2010-02-20 18:17:16.123000000
786E02010D23390754D4C0 786E02141312113436B300
TS6 TIMESTAMP(6) 2010-02-01 12:34:56.123457000 2010-02-21 17:16:15.123457000
786E02010D2339075BCDE8 786E0215121110211D18B8
TS9 TIMESTAMP(9) 2010-02-01 12:34:56.987654321 2010-02-22 16:15:14.987654321
786E02010D23393ADE68B1 786E021611100F075BCD15
VCSMALL VARCHAR2 ABCDE ZYXWV
4142434445 5A59585756
VCLARGE VARCHAR2 AABCABCDEABCDEFGABCDEFGHI ZZZZZYZZXZZWZZVZZUZZTZZSZZRZZ
QZZ
4141424341424344454142434 5A5A5A5A5A595A5A585A5A575
4454647414243444546474849 A5A565A5A555A5A545A5A535A
5A525A5A515A5A
VCODD VARCHAR2 Abc!"£$%^&*()deF Z;#[]{}~@:
4162632122C2A324255E262A2 5A3B235B5D7B7D7E403A
829646546
NVCSMALL NVARCHAR2 ABCDE VWXYZ
00410042004300440045 0056005700580059005A
NVCLARGE NVARCHAR2 AABCABCDEABCDEFG ZYXWVUTSRQPONMLK
0041004100420043004100420 005A005900580057005600550
0430044004500410042004300 0540053005200510050004F00
44004500460047 4E004D004C004B
BFSMALL BINARY_FLOAT 3.53999972E+000 -9.86999989E+000
40628F5B C11DEB85
BFLARGE BINARY_FLOAT 3.64494306E-004 -9.87653984E+004
39BF1999 C7C0E6B3
BDSMALL BINARY_DOUBLE -1.2345E+004 -9.87654321E+008
C0C81C8000000000 C1CD6F3458800000
BDLARGE BINARY_DOUBLE -9.8765432112345672E+008 -5.432112345E+009
C1CD6F34588FCD6E C1F43C774D900000
I could go on and look at handling intervals and time zones but I leave this to you as an exercise. In other words, I have no real need for those data types right now and can’t justify the time! I hope the above is enough to answer whatever questions you may have about the high and low values of your columns…
