jump to navigation

Update to Decoding High and Low values February 24, 2010

Posted by mwidlake in internals, performance.
Tags: ,
10 comments

After what seems an age, I finally got a little time to enhance my col_stats code with the extra info people provided as comments to
this original posting on it. I also now handle the translation of dates correctly and also timestamps.

I had a real struggle with timestamps. I knew I could find the answer in Tom Kyte’s “Expert Oracle Database Architecture” as one of the comments said so, but I tried to work it out myself as my copy of the book was at my Mothers {no, she is not learning Oracle Databases, I left it there by accident}. As the other elements of timestamps are held as numerical elements stored as hex (number of days, number of months) I tried to interpret it like that. I was being too complex, it is just an 8-digit hex number – to_number(value,’XXXXXXXX’).

Anyway, this is new-improved, cleanes whiter-than-white script {I’ve increased the width of myt blog to better show code like the belwo, but try clicking on “show source” if you want a plain text version}:

-- col_stats
-- Martin Widlake mdw 21/03/2003
-- MDW 11/12/09 enhanced to include more translations of low_value/high_value
-- pilfered from Gary Myers blog
-- MDW 20/02/10 added in the handling of timestamps.
col owner        form a6 word wrap
col table_name   form a15 word wrap
col column_name  form a22 word wrap
col data_type    form a12
col M            form a1
col num_vals     form 99999,999
col dnsty        form 0.9999
col num_nulls    form 99999,999
col low_v        form a30
col low_v2       form a18
col hi_v         form a30
col data_type    form a10
col low_value    form a25
col high_value   form a25
set lines 110
break on owner nodup on table_name nodup
spool col_stats.lst
select --owner
--      ,table_name
      column_name
      ,data_type
      ,decode (nullable,'N','Y','N')  M
      ,num_distinct num_vals
      ,num_nulls
      ,density dnsty
,decode(substr(data_type,1,9) -- as there are several timestamp types
  ,'NUMBER'       ,to_char(utl_raw.cast_to_number(low_value))
  ,'VARCHAR2'     ,to_char(utl_raw.cast_to_varchar2(low_value))
  ,'NVARCHAR2'    ,to_char(utl_raw.cast_to_nvarchar2(low_value))
  ,'BINARY_DO',to_char(utl_raw.cast_to_binary_double(low_value))
  ,'BINARY_FL' ,to_char(utl_raw.cast_to_binary_float(low_value))
  ,'DATE',rtrim(
               to_char(100*(to_number(substr(low_value,1,2),'XX')-100)
                      + (to_number(substr(low_value,3,2),'XX')-100),'fm0000')||'-'||
               to_char(to_number(substr(low_value,5,2),'XX'),'fm00')||'-'||
               to_char(to_number(substr(low_value,7,2),'XX'),'fm00')||' '||
               to_char(to_number(substr(low_value,9,2),'XX')-1,'fm00')||':'||
               to_char(to_number(substr(low_value,11,2),'XX')-1,'fm00')||':'||
               to_char(to_number(substr(low_value,13,2),'XX')-1,'fm00'))
  ,'TIMESTAMP',rtrim(
               to_char(100*(to_number(substr(low_value,1,2),'XX')-100)
                      + (to_number(substr(low_value,3,2),'XX')-100),'fm0000')||'-'||
               to_char(to_number(substr(low_value,5,2),'XX'),'fm00')||'-'||
               to_char(to_number(substr(low_value,7,2),'XX'),'fm00')||' '||
               to_char(to_number(substr(low_value,9,2),'XX')-1,'fm00')||':'||
               to_char(to_number(substr(low_value,11,2),'XX')-1,'fm00')||':'||
               to_char(to_number(substr(low_value,13,2),'XX')-1,'fm00')
              ||'.'||to_number(substr(low_value,15,8),'XXXXXXXX')  )
       ) low_v
,decode(substr(data_type,1,9) -- as there are several timestamp types
  ,'NUMBER'       ,to_char(utl_raw.cast_to_number(high_value))
  ,'VARCHAR2'     ,to_char(utl_raw.cast_to_varchar2(high_value))
  ,'NVARCHAR2'    ,to_char(utl_raw.cast_to_nvarchar2(high_value))
  ,'BINARY_DO',to_char(utl_raw.cast_to_binary_double(high_value))
  ,'BINARY_FL' ,to_char(utl_raw.cast_to_binary_float(high_value))
  ,'DATE',rtrim(
               to_char(100*(to_number(substr(high_value,1,2),'XX')-100)
                      + (to_number(substr(high_value,3,2),'XX')-100),'fm0000')||'-'||
               to_char(to_number(substr(high_value,5,2),'XX'),'fm00')||'-'||
               to_char(to_number(substr(high_value,7,2),'XX'),'fm00')||' '||
               to_char(to_number(substr(high_value,9,2),'XX')-1,'fm00')||':'||
               to_char(to_number(substr(high_value,11,2),'XX')-1,'fm00')||':'||
               to_char(to_number(substr(high_value,13,2),'XX')-1,'fm00'))
  ,'TIMESTAMP',rtrim(
               to_char(100*(to_number(substr(high_value,1,2),'XX')-100)
                      + (to_number(substr(high_value,3,2),'XX')-100),'fm0000')||'-'||
               to_char(to_number(substr(high_value,5,2),'XX'),'fm00')||'-'||
               to_char(to_number(substr(high_value,7,2),'XX'),'fm00')||' '||
               to_char(to_number(substr(high_value,9,2),'XX')-1,'fm00')||':'||
               to_char(to_number(substr(high_value,11,2),'XX')-1,'fm00')||':'||
               to_char(to_number(substr(high_value,13,2),'XX')-1,'fm00')
              ||'.'||to_char(to_number(substr(low_value,15,8),'XXXXXXXX')))
  ,  high_value
       ) hi_v
,low_value,high_value
from dba_tab_columns
where owner      like upper('&tab_own')
and   table_name like upper(nvl('&tab_name','WHOOPS')||'%')
ORDER BY owner,table_name,COLUMN_ID
/
clear colu
spool off
clear breaks

Yes, I know, it is a considerable chunk of code just to get the high and low values for columns, as identified by the last stats gather to be run on the table. {it is important to remember that these high and low values will only be as accurate as the stats gather that was run and when. An “estimate_percent=> 100″ run a minute ago will give probaby give accurate values, an “estimate_percent=>0.1′” run last month is not – which is usually what I am checking for}. I could make the script a lot neater by writing a set of functions to call to get the translation, as was suggested as a comment on the original posting and is in fact somethign I have done in the past, but one is often not allowed to put new stored PL/SQL code onto production systems, so a script which does it is, for me, best.

I should show some proof of it working, I guess. The below will create a table with a nice spread of column type and three records with a reasonable spread of data:

-- test_col_hilo
-- M Widlake 11/12/09
--
--testing script to show high and low values for various column types
--
set lines 90 pages 1000 trims on pause off
--set sqlpro 'test102>'
set sqlpro'>'
set echo on
set autotrace off
spool test_col_hilo.lst
-- basic setup info
col name form a30
col value form a20
select substr(name,1,30) name,substr(value,1,20) value
from v$parameter
where name in ('db_block_size','compatible','cpu_count','db_file_multiblock_read_count'
              ,'optimizer_mode','sga_target','sort_area_size')
order by name
/
drop table test_hilo purge;
--
--
COL TABLE_NAME FORM A15
COL TABLE_OWNER FORM A10
create table test_hilo
(id          number(8) not null
,numsmall    number(4)
,numlarge    number
,datesmall   date
,datelarge   date
,ts          timestamp
,ts0        timestamp(0)
,ts3	     timestamp(3)
,ts6	     timestamp(6)
,ts9	     timestamp(9)
,vcsmall     varchar2(10)
,vclarge     varchar2(100)
,vcodd       varchar2(20)
,nvcsmall    nvarchar2(10)
,nvclarge    nvarchar2(100)
,bfsmall     binary_float -- NB precision is only 6 or 7 digits
,bflarge     binary_float
,bdsmall     binary_double
,bdlarge     binary_double
)
tablespace users
/
-- insert low values
 insert into test_hilo
(id,numsmall,numlarge
,datesmall,datelarge
,ts,ts0,ts3
,ts6,ts9
,vcsmall,vclarge
,vcodd
,nvcsmall,nvclarge
,bfsmall,bflarge
,bdsmall,bdlarge
)
values
(1
,-10,-123456789.12345
,trunc(sysdate-1000),sysdate-500000
,systimestamp,systimestamp
,to_timestamp('01-FEB-2010 12:34:56.123','DD-MON-YYYY HH24:MI:SS.FF')
,to_timestamp('01-FEB-2010 12:34:56.1234567','DD-MON-YYYY HH24:MI:SS.FF')
,to_timestamp('01-FEB-2010 12:34:56.987654321','DD-MON-YYYY HH24:MI:SS.FF')
,'ABCDE','AABCABCDEABCDEFGABCDEFGHI'
,'Abc!"£$%^&*()deF'
,'ABCDE','AABCABCDEABCDEFGABCDEFGHI'
,-1.23,-12345.6
,12345,1234567890
)
/
-- insert mid values
 insert into test_hilo
(id,numsmall,numlarge
,datesmall,datelarge
,ts,ts0,ts3
,ts6,ts9
,vcsmall,vclarge
,vcodd
,nvcsmall,nvclarge
,bfsmall,bflarge
,bdsmall,bdlarge
)
values
(2
,15,1515151515151
,trunc(sysdate-10),sysdate-5000
,systimestamp,systimestamp
,to_timestamp('05-FEB-2010 12:34:56.123','DD-MON-YYYY HH24:MI:SS.FF')
,to_timestamp('05-FEB-2010 12:34:56.1234567','DD-MON-YYYY HH24:MI:SS.FF')
,to_timestamp('05-FEB-2010 12:34:56.987654321','DD-MON-YYYY HH24:MI:SS.FF')
,'CCCCC','CCBCABCDEABCDEFGABCDEFGHI'
,'CbaaBC'
,'EFGHI','ABCDEFGHIJKLMNOPQRSTUV'
,1.23,12345.6
,54321,5432112345
)
/
--insert high values
 insert into test_hilo
(id,numsmall,numlarge
,datesmall,datelarge
,ts,ts0,ts3
,ts6,ts9
,vcsmall,vclarge
,vcodd
,nvcsmall,nvclarge
,bfsmall,bflarge
,bdsmall,bdlarge
)
values
(3
,99,9898989898989
,trunc(sysdate-1),sysdate+1000
,systimestamp+10,systimestamp+10
,to_timestamp('20-FEB-2010 18:17:16.876','DD-MON-YYYY HH24:MI:SS.FF')
,to_timestamp('21-FEB-2010 17:16:15.555555','DD-MON-YYYY HH24:MI:SS.FF')
,to_timestamp('22-FEB-2010 16:15:14.123456789','DD-MON-YYYY HH24:MI:SS.FF')
,'ZYXWV','ZZZZZYZZXZZWZZVZZUZZTZZSZZRZZQZZP'
,'Z;#[]{}~@:'
,'VWXYZ','ZYXWVUTSRQPONMLKJIHGFEDCBA'
,9.87,98765.4
,987654321,987654321.1234567
)
/
commit;
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=> USER, TABNAME=>'TEST_HILO')
set autotrace off
set echo off
--
spool off
clear colu
--

And the below is the translation of the values – I used a cut-down version of my script, so it woud fit on the blog better:

COLUMN_NAME  DATA_TYPE     LOW_V                         HI_V
------------ ------------- ----------------------------- -----------------------------
LOW_VALUE                 HIGH_VALUE
------------------------- -------------------------
ID           NUMBER        1                             3
C102                      C104
NUMSMALL     NUMBER        -10                           99
3E5B66                    C164
NUMLARGE     NUMBER        -123456789.12345              9898989898989
3A644E38220C59433366      C70A5A5A5A5A5A5A
DATESMALL    DATE          2007-05-31 00:00:00           2010-02-23 00:00:00
786B051F010101            786E0217010101
DATELARGE    DATE          0641-03-10 17:36:47           2012-11-20 17:36:47
6A8D030A122530            78700B14122530
TS           TIMESTAMP(6)  2010-02-24 17:36:47.255015000 2010-03-06 17:36:47.255015000
786E02181225300F333858    786E0306122530
TS0          TIMESTAMP(0)  2010-02-24 17:36:47.          2010-03-06 17:36:47.
786E0218122530            786E0306122530
TS3          TIMESTAMP(3)  2010-02-01 12:34:56.123000000 2010-02-20 18:17:16.123000000
786E02010D23390754D4C0    786E02141312113436B300
TS6          TIMESTAMP(6)  2010-02-01 12:34:56.123457000 2010-02-21 17:16:15.123457000
786E02010D2339075BCDE8    786E0215121110211D18B8
TS9          TIMESTAMP(9)  2010-02-01 12:34:56.987654321 2010-02-22 16:15:14.987654321
786E02010D23393ADE68B1    786E021611100F075BCD15
VCSMALL      VARCHAR2      ABCDE                         ZYXWV
4142434445                5A59585756
VCLARGE      VARCHAR2      AABCABCDEABCDEFGABCDEFGHI     ZZZZZYZZXZZWZZVZZUZZTZZSZZRZZ
                                                         QZZ
4141424341424344454142434 5A5A5A5A5A595A5A585A5A575
4454647414243444546474849 A5A565A5A555A5A545A5A535A
                          5A525A5A515A5A
VCODD        VARCHAR2      Abc!"£$%^&*()deF              Z;#[]{}~@:
4162632122C2A324255E262A2 5A3B235B5D7B7D7E403A
829646546
NVCSMALL     NVARCHAR2     ABCDE                         VWXYZ
00410042004300440045      0056005700580059005A
NVCLARGE     NVARCHAR2     AABCABCDEABCDEFG              ZYXWVUTSRQPONMLK
0041004100420043004100420 005A005900580057005600550
0430044004500410042004300 0540053005200510050004F00
44004500460047            4E004D004C004B
BFSMALL      BINARY_FLOAT  3.53999972E+000               -9.86999989E+000
40628F5B                  C11DEB85
BFLARGE      BINARY_FLOAT  3.64494306E-004               -9.87653984E+004
39BF1999                  C7C0E6B3
BDSMALL      BINARY_DOUBLE -1.2345E+004                  -9.87654321E+008
C0C81C8000000000          C1CD6F3458800000
BDLARGE      BINARY_DOUBLE -9.8765432112345672E+008      -5.432112345E+009
C1CD6F34588FCD6E          C1F43C774D900000

I could go on and look at handling intervals and time zones but I leave this to you as an exercise. In other words, I have no real need for those data types right now and can’t justify the time! I hope the above is enough to answer whatever questions you may have about the high and low values of your columns…

Follow

Get every new post delivered to your Inbox.

Join 152 other followers