
Command Line or GUI – Which is Best? February 18, 2010

Posted by mwidlake in performance.
15 comments

At present I am suffering ever so slightly from “split personality disorder”* in respect of my liking for Command Line and GUI interfaces.

On the one hand, much to my colleagues' mild reproach, I use SQL*Plus and not PL/SQL Developer for my day-to-day work. Even worse, I got sick of using Notepad to hack around scripts {I am working in a Windows client environment and you simply can't use MS Word with SQL files!} so I have retrograded recently and downloaded a Windows-compliant 'vi' interface, and it is soooooo nice to be able to use powerful 'vi' commands on my files once more. ":.,$s/^ /,/". Ahhh, it is so much easier. I can do stuff in 3 seconds in 'vi' that would take me 10 minutes in Notepad in a large, complex file. That and, I'm sorry, but Notepad seems unable to manage a 100MB file, despite me having 2GB of real memory and a decent machine, whereas 'vi' has no problem with it at all.
Even more retrograde, I have direct telnet access to my Linux servers and I am getting back to that as it makes me so much more productive. "ls -alrt ^d" for all directories anyone? "df -k ." to see how many data files I can add? Yep, it's all arcane and means so little to many modern IT "Java/Struts/CDE" people but boy it is direct and fast. I might even dig out that book on SED and AWK.

On the other hand, I have finally (after much very painful discussion back and forth) got agreement that my site probably has access to AWR, ASH and all that good performance repository stuff. So I am hacking around with the OEM screens that focus on performance and snapshots and stuff. Now, I am traditionally not a big fan of GUI DBA tools. Partly it is because I am a bit old and stuck in my ways and partly it is because GUIs are really just "menus of options". You are limited to the options available in your DBA GUI tool and you have a harder time learning all the options available or what is really going on "under the covers".

But with AWR and all those graphs, links and utilities, you can drill down into real problems in real time, or in the past, so effectively that, well, once they start using this tool properly they will not need me at all. It is a fantastic boon to performance management and problem resolution, as well as proactive performance management.

So there you are, I am with Doug Burns on this one, in that I have Learned to Love Pictures. That is, when the Pictures are well thought out, well connected and simple enough to help make sense of a complex system {and Oh Boy, Oracle performance has become sooo Complex!!!!}.

So right now, I spend half my day in vi/linux/command line world and half of it in pretty picture/GUI world. I think what really makes me happy is to leave behind the half-way-house of text-like Windows World {Windows SQL*Plus, Notepad}.

Just to finish, you can't mention AWR without someone raising the ugly issue of licence cost and how Evil Oracle Corp were to charge for it. Well, I think it has been established that the guys and gals who developed AWR/ASH did not expect it to become a cost option, but it did. And I suspect that what kept it a cost option was the community's OutRage at it being a cost option. Anything that popular, hey, a commercial company is going to charge for. I still reckon Oracle Corp ballsed up, as making it free and helping people use it a bit would have made 90% of customers' lives easier and would have translated into user happiness and a certain % of sales for training courses to learn more, but heck, my day job is to make things work, not maintain sales percentages, so my opinion counts for nowt. *sigh*

(*apologies to real sufferers of Dissociative Identity Disorder, I am using the term in the loose, non-scientific, "common usage" sense of "not sure of my opinion" rather than having truly disparate personalities and memories. And note, I certainly do not mean schizophrenia which, despite the ongoing public misunderstanding, is rarely anything to do with multiple personality disorders or "split minds" AT ALL, and is more to do with a difficulty in distinguishing between reality and hallucination.)

Richard Foote on the Impact of stats staying the same. February 16, 2010

Posted by mwidlake in performance.
add a comment

I just wanted to highlight this very good posting by Richard Foote. He is talking about how SQL execution plans can change when "nothing else does". Not the table structures, not the code, not the initialisation parameters, not even the table and index stats.

But something does change, which is what day it is (or what hour it is, or what week it is). Time moves on and our data does too. If the statistics on the tables do NOT move on, then the CBO thinks that the range of data in the table has not changed. So the CBO thinks your queries are getting further and further "out of range" and expects to find less and less data in the tables to bring back. That will lead to plan changes.
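
You can see this for yourself with a quick sketch like the one below {the table and column names are invented for illustration}: generate a plan for a date inside the column's known LOW_VALUE/HIGH_VALUE range, then one for a date beyond it, and compare the estimated Rows.

explain plan for
select count(*) from orders where order_date = to_date('15-JAN-2010','DD-MON-YYYY');

select * from table(dbms_xplan.display);

-- now repeat with a date well beyond the column's HIGH_VALUE and watch the
-- estimated Rows figure shrink the further "out of range" the predicate goes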

If you have noticed my preoccupation with identifying the contents of histograms and high/low values in column stats, you may appreciate that this topic has become a large part of my day job.

Richard explains the point very well, as always, so go have a look.

Stats Need Stats to Gather Stats February 16, 2010

Posted by mwidlake in performance.
10 comments

Did you know that you sometimes need good stats so that the stats-gathering package can gather stats in an efficient way? This is a recent, quite extreme example.

I’ve been forced to learn a lot about gathering Oracle stats using DBMS_STATS over the last 4 or 5 years. But no matter how much I learn about the “challengingly bizarre” way in which it works, it seems every week or two there is a new oddity. I plan a whole series on the topic “soon”.

This particular example is from a 10.2.0.3 system.

I am gathering partition-only table stats as we are using the ability of Oracle to roll up Partition stats to Global stats under certain specific conditions. One of the conditions is that you need stats for every partition. Plus, to get global column stats, each partition must have stats for each column. Some of our partitions lacked or had very bad stats.

So I quickly knocked up a script-generating script to create DBMS_STATS.GATHER_TABLE_STATS statements that collected, for those partitions:

  • ONLY partition stats
  • NO cascade down to indexes
  • BLOCK sampling, so it is fast {and poor, but there you go}
  • an ESTIMATE_PERCENT of 2, which is quite low for block sampling
  • histograms, as we decided the devil of having them was better than the devil of not having them

The above is not ideal for getting "good stats", but it is quick and gets OK stats, which is what we need right now. An example statement is:

begin                                                                                              
-- part SD_INFO-DY07032004 rows 34554                                                              
dbms_stats.gather_table_stats(ownname => 'ERIC',tabname=> 'SD_INFO'
,partname=> 'DY07032004'
,granularity=> 'PARTITION', estimate_percent => 2,block_sample=> true ,cascade=> FALSE
,method_opt=> 'FOR ALL COLUMNS SIZE AUTO' ,no_invalidate=> false);
END;
/
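
For reference, a script-generating script for this sort of job need be nothing cleverer than a SELECT against DBA_TAB_PARTITIONS spooled to a file, something along these lines {a sketch only; the owner and table are from the example above and the WHERE clause picking out the partitions with missing or poor stats is site-specific}:

set pages 0 lines 200 trimspool on feed off
spool gather_part_stats.sql
select 'begin dbms_stats.gather_table_stats(ownname=>'''||table_owner||''''
     ||',tabname=>'''||table_name||''',partname=>'''||partition_name||''''
     ||',granularity=>''PARTITION'',estimate_percent=>2,block_sample=>true'
     ||',cascade=>FALSE,method_opt=>''FOR ALL COLUMNS SIZE AUTO'''
     ||',no_invalidate=>false); end;'||chr(10)||'/'
from dba_tab_partitions
where table_owner = 'ERIC'
and   table_name  = 'SD_INFO'
and   last_analyzed is null -- or whatever test identifies the partitions needing stats
order by partition_name
/
spool off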

That example partition is not big (34554 rows, and not very wide rows) and so did not take long to gather:

PARTITION_NAME                 LAST_ANALYZED
------------------------------ --------------------
DY04012004                     12-FEB-2010 16:26:57
DY05012004                     12-FEB-2010 16:27:00
DY06012004                     12-FEB-2010 16:27:04
DY07012004                     12-FEB-2010 16:27:07
DY08012004                     12-FEB-2010 16:27:11 -- 4 seconds difference
DY09012004                     12-FEB-2010 16:27:15

I’ve collected statistics for a few thousand partitions over the last couple of days and the time taken is anything between just under half a second to 10 seconds per partition, the odd unusually large partition taking a minute or so. {I believe it takes half a second to gather stats on an empty partition, on our system at least, due to the time it takes for the internal housekeeping, including copying the old statistics information to the SYS.WRI$_OPSTAT_… tables to support restoring stats}. Sorry, I drift away from my main point.

This partition took a lot longer than 10 seconds:

begin
-- part W_ACTIVITY_FAILURE-DY02092008 rows 49425
dbms_stats.gather_table_stats(ownname => 'ERIC',tabname=> 'W_ACTIVITY_FAILURE'
,partname=> 'DY02092008'
,granularity=> 'PARTITION', estimate_percent => 2,block_sample=> true ,cascade=> FALSE
,method_opt=> 'FOR ALL COLUMNS SIZE AUTO' ,no_invalidate=> false);
END;
/

After 10 minutes it was still running. WHY? I quickly checked the number of rows in the partition and then the size of the partition segment, in case either was much larger than I expected. Neither was.
select count(*) from eric.W_ACTIVITY_FAILURE partition (DY07092008);

  COUNT(*)
----------
     42182

From dba_segments:

     BYTES     BLOCKS
---------- ----------
   2621440        320

There is one advantage of a DBMS_STATS statement running for a long time – you can grab from the SGA the actual code being executed for the DBMS_STATS statement. I saw this.

select substrb(dump(val,16,0,32),1,120) ep, cnt 
from (select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact
use_weak_name_resl dynamic_sampling(0) no_monitoring */
"ACCOUNT_ID" val,count(*) cnt 
from "ERIC"."W_ACTIVITY_FAILURE" t 
where TBL$OR$IDX$PART$NUM("ERIC"."W_ACTIVITY_FAILURE",0,4,0,"ROWID") = :objn
and "ACCOUNT_ID" is not null 
group by "ACCOUNT_ID") 
order by val

DBMS_STATS is scanning the ACCOUNT_ID column on this table and it is taking a long time about it. The only index on the table is on ACCOUNT_ID. It then struck us.

The index is a global index.
INDEX_NAME                     PARTITIONED
------------------------------ -----------
API_XXXXXXXXXX_ACCOUNTS_IND    NO

And that global index actually lacked stats {another feature of DBMS_STATS and rolling up partition stats had led to that}.

INDEX_NAME      TYP UNQ   BL     L_BLKS  DIST_KEYS     CLUSTF     LB_KEY     DB_KEY LST_ANL
--------------- --- --- ---- ---------- ---------- ---------- ---------- ---------- --------
API_XXXXXXX_    NOR NON
ACCOUNTS_IND

By this time 3 partitions for this table had been processed by my code, taking around 15 minutes each one. Incredibly slow.
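
The obvious thing to try was to give that global index some statistics, even from a tiny sample, with a call of this form {a sketch; the index name is the masked one shown above}:

begin
  dbms_stats.gather_index_stats(ownname          => 'ERIC'
                               ,indname          => 'API_XXXXXXXXXX_ACCOUNTS_IND'
                               ,estimate_percent => 0.01
                               ,no_invalidate    => false);
end;
/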

I did a very quick 0.01% sample size DBMS_STATS.GATHER_INDEX_STATS on that index, which took about 1 minute. As soon as the partition DBMS_STATS.GATHER_TABLE_STATS statement that was in flight finished, the following similar statements on that table's partitions took under 3 seconds each. I'll buy a pint for the first person to guess (within 10 minutes) WHEN I collected the global index stats {You can collect from any pub in central London if you give me a few days' notice :-) }.

PARTITION_NAME   NUM_ROWS     BLOCKS LAST_ANALYZED
-------------- ---------- ---------- ---------------------
DY01092008          31461        452 16-FEB-2010 09:51:41
DY02092008          49425        686 16-FEB-2010 10:03:44
DY03092008          54472        719 16-FEB-2010 10:16:31
DY04092008          35762        491 16-FEB-2010 10:30:52
DY05092008          42182        587 16-FEB-2010 10:44:24
DY06092008          21186        312 16-FEB-2010 10:56:13
DY07092008          20898        313 16-FEB-2010 11:12:59
DY08092008         469500       1233 16-FEB-2010 11:13:02
DY09092008         480300        741 16-FEB-2010 11:13:04
DY10092008          15724        223 16-FEB-2010 11:31:01
DY11092008          55671        732 16-FEB-2010 11:31:06
DY12092008         820100       1779 16-FEB-2010 11:31:08
DY13092008          80215       1113 16-FEB-2010 11:31:16
DY14092008          10094        155 16-FEB-2010 11:31:18
DY15092008          10268        158 16-FEB-2010 11:31:20
DY16092008          24578        330 16-FEB-2010 11:31:22
DY17092008          21012        290 16-FEB-2010 11:31:24
DY18092008          22755        318 16-FEB-2010 11:31:27
DY19092008          21276        293 16-FEB-2010 11:31:29
DY20092008          20882        281 16-FEB-2010 11:31:31
DY21092008          24131        323 16-FEB-2010 11:31:34

I will investigate this oddity further if I get time this week, but the lesson to myself so far is:

Global indexes lacking stats can result in very long stats gathering times for partitions EVEN WHEN YOU DO NOT CASCADE DOWN TO INDEXES.

Turning on SQL Audit February 2, 2010

Posted by mwidlake in development, performance, Testing.
6 comments

<Previous post…

I want to test out using the SQL AUDIT functionality.

The first thing you have to do (having read up on it a bit) is to enable it for the database. You do this by setting the initialization parameter AUDIT_TRAIL. On version 10.2 you have the option to write the audit entries to:

  • The DB, by setting it to DB or DB_EXTENDED {not DB,EXTENDED as the manual says, that is the old format}. This puts the entries into the table SYS.AUD$.
  • The operating system, by setting it to OS, XML or XML_EXTENDED. If you set it to XML or XML_EXTENDED then the data is written out in XML format. You also optionally set AUDIT_FILE_DEST to where you want the data to be written {a sketch of setting the parameter follows this list}.
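
Setting the parameter is a one-liner {a minimal sketch, assuming you are using an spfile; as discussed below, the change only takes effect after a restart}:

alter system set audit_trail=DB_EXTENDED scope=spfile;
-- or, to write XML to the operating system instead {the directory is just an example}:
-- alter system set audit_trail=XML_EXTENDED scope=spfile;
-- alter system set audit_file_dest='/u01/app/oracle/admin/audit' scope=spfile;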

Writing the audit trail to the OS is potentially more secure, as you can stop those cunning and devious DBAs messing with the audit trail. {I’m not so sure that this really helps that much – if anyone knows of any DBAs caught out being naughty solely as a result of using the OS to store the SQL AUDIT records, it would be a fascinating comment}

I want to write to the DB as I want to be able to get at the audit data easily and I am not sure how I want to interrogate it. I’m faster with SQL than SED and AWK.

I also decided up front I wanted to use DB_EXTENDED so that the triggering SQL statement and all bind variables are captured, so I can see more about what is triggering the audit record. I am cautious of the impact of storing CLOBs though, which is what these two values are stored as. I've had performance issues moving lots of CLOBs around and I know from some old colleagues that Secure Files are a lot faster. If Secure Files are faster, that means CLOBs are slower :-). If the audit trail seems to add too much burden on my system, swapping back to just DB will be my first step.
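
A sketch of the sort of query the extended option makes possible {SQL_TEXT and SQL_BIND in DBA_AUDIT_TRAIL hold the statement and its binds, and are only populated when the extended option is on}:

select os_username, username, timestamp, obj_name, action_name
      ,sql_text, sql_bind
from   dba_audit_trail
where  timestamp > sysdate - 1
order  by timestamp
/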

Now for the bad news. You can’t just turn on AUDIT. That initialization parameter is not dynamic. You can’t even enable it for your session. It will need a restart of your database.

This tells me something. Oracle needs to do some setting up for SQL AUDIT when it starts the instance. Either start a new process, enable functionality in one of its regular processes, or set up structures in memory to cope. Or a mixture thereof. I strongly suspect the need for memory structures {but this is only because, in reality, I have done some testing and I am writing this up afterwards}.

I should not really need to say this but DON'T go turning this on for a production system without extensive testing somewhere else first. There is not a lot "Out There" about the details of the performance impact of AUDIT, but the general opinion is that there is some; and that is reasonable given it is going to write database records for every action audited. Also, you have no idea yet of any knock-on effects. You know, things you did not expect that cause your database to lock or crash and you to get fired.

{Question, what happens if you alter the initialization file and restart only one node of a RAC database? I don’t know and so I should test that. My current test system is not RAC, but the final destination for this stuff is RAC}.

You probably also want to check that no one has gone and tried turning on SQL AUDIT on things already. You never know if someone else decided to have a play with this and issued a load of AUDIT statements only to find nothing happened – and left what they did in place “as nothing happened”. I already know of one example of this happening…

Here is a little script I knocked up to see what is currently set to be audited:

-- what_is_audited.sql
-- Martin Widlake 11/01/10
-- simple listing of what auditing is currently set
set pages 100
set pause on
spool what_is_audited.lst
select * from dba_priv_audit_opts
order by user_name,privilege
/
select * from sys.dba_stmt_audit_opts
order by user_name,audit_option
/
select * from DBA_OBJ_AUDIT_OPTS
order by owner,object_name
/
spool off
clear col
--
-- EOF
--

And some sample output. I’m not going to explain it in this post, but you can have a look though it.

DEV3> @what_is_audited
USER_NAME                      PROXY_NAME
------------------------------ ------------------------------
PRIVILEGE                                SUCCESS    FAILURE
---------------------------------------- ---------- ----------
MDW1
ALTER SYSTEM                             BY ACCESS  BY ACCESS
MDW1
AUDIT SYSTEM                             BY ACCESS  BY ACCESS
MDW1
CREATE SESSION                           BY ACCESS  BY ACCESS

ALTER SYSTEM                             BY ACCESS  BY ACCESS

AUDIT SYSTEM                             BY ACCESS  BY ACCESS

CREATE SESSION                           BY ACCESS  BY ACCESS

6 rows selected.

USER_NAME                      PROXY_NAME
------------------------------ ------------------------------
AUDIT_OPTION                             SUCCESS    FAILURE
---------------------------------------- ---------- ----------
MDW1
ALTER SYSTEM                             BY ACCESS  BY ACCESS
MDW1
CLUSTER                                  BY ACCESS  BY ACCESS
MDW1
CONTEXT                                  BY ACCESS  BY ACCESS
MDW1
CREATE SESSION                           BY ACCESS  BY ACCESS
MDW1
DATABASE LINK                            BY ACCESS  BY ACCESS
MDW1
DELETE TABLE                             BY ACCESS  BY ACCESS
MDW1
...
TYPE                                     BY ACCESS  BY ACCESS
MDW1
UPDATE TABLE                             BY ACCESS  BY ACCESS
MDW1
USER                                     BY ACCESS  BY ACCESS
MDW1
VIEW                                     BY ACCESS  BY ACCESS

ALTER SYSTEM                             BY ACCESS  BY ACCESS

CLUSTER                                  BY ACCESS  BY ACCESS

CONTEXT                                  BY ACCESS  BY ACCESS

CREATE SESSION                           BY ACCESS  BY ACCESS

DATABASE LINK                            BY ACCESS  BY ACCESS

DIMENSION                                BY ACCESS  BY ACCESS

DIRECTORY                                BY ACCESS  BY ACCESS

INDEX                                    BY ACCESS  BY ACCESS

MATERIALIZED VIEW                        BY ACCESS  BY ACCESS
...
USER                                     BY ACCESS  BY ACCESS

VIEW                                     BY ACCESS  BY ACCESS

56 rows selected.

OWNER                          OBJECT_NAME                    OBJECT_TYPE
------------------------------ ------------------------------ --------------
ALT     AUD     COM     DEL     GRA     IND     INS     LOC     REN     SEL
------- ------- ------- ------- ------- ------- ------- ------- ------- ----
UPD     REF EXE     CRE     REA     WRI     FBK
------- --- ------- ------- ------- ------- -------
MWPERF                         FORN_M_SEQ                     SEQUENCE
-/-     -/-     -/-     -/-     -/-     -/-     -/-     -/-     -/-     A/A
-/-     -/- -/-     -/-     -/-     -/-     -/-
MWPERF                         PERSON                         TABLE
A/A     A/A     A/A     A/A     A/A     A/A     A/A     A/A     A/A     A/A
A/A     -/- -/-     -/-     -/-     -/-     A/A
MWPERF                         ROAD_TYPE                      TABLE
-/-     -/-     -/-     A/A     -/-     -/-     A/A     -/-     -/-     A/A
A/A     -/- -/-     -/-     -/-     -/-     -/-

If you discover you have a lot of things set to be audited, ESPECIALLY if they are auditing select access, think about turning some or all of it off before you enable AUDITING by setting that initialization parameter.
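
Turning existing settings off is just the mirror-image NOAUDIT statement. A sketch, using the user and objects from the sample output above:

noaudit all by mdw1;              -- statement auditing for that user
noaudit all privileges by mdw1;   -- system privilege auditing for that user
noaudit select on mwperf.person;  -- object-level auditing of selects on that table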

Once you have turned on the feature, you can start testing it…

Testing Methodology – How to Test #2 January 26, 2010

Posted by mwidlake in performance, Testing.
1 comment so far

<previous post …

Last post (see link above) I said something about how I like timing how long things take {usually SQL queries, but you can apply this to more complex situations} as it is simple. But I highlighted that it is an approach that misses a lot of information and can also catch you out if you do not appreciate that the first run of any test script is likely to be slow (due to parsing and data caching overheads, which are reduced for subsequent iterations).

I want to continue on that theme now, of simple testing with just how long things take. And how to cope with variation.

Hands up all of you who have Oracle on a PC at home? Great, you can test what you want on a dedicated machine. No one else is running anything on your machine, no one else is using your storage, no one else is using your network (if you have a network at home) and no one is running the SQL statement from hell on your database.

BUT. You might have some download of the latest security patch set running in the background, your virus scanner may kick in (or you might just have Norton full-security-everything-running-on-your-cpu-all-the-time turned on). Your miles per gallon may vary, especially on Windows but also on Linux.

For many of us we are testing on a server at work, where developers are working, where the storage is shared, where the network is dealing with that lady behind you watching “Celebrity Strictly come roller-skate dancing impersonations” {well, she sat behind me for 4 months and she was in the Quality team, what does that tell you}. Bottom line, the performance of your server will vary, your testing needs to cope with that.

How do you cope with a varying workload on the system you are running tests on? Well, you run the new and old versions of your test several times, in succession, interleaved. This is especially important in the very un-ideal situation of having to test on Live. Testing on "Live" is wrong, in the same way that spelling "initialisation" with a Z is wrong, but it is often forced upon us.

As an example, I want to check into the questionable suggestion someone made that selecting count(*) from dba_tables where owner=user is faster than selecting count(*) from user_tables.

So, I knock up a quick sql*plus script that runs the two alternatives in quick succession, interleaved.

--testa.sql
set timi on pause off
spool testa.lst
--
prompt run 1
select count(*) from user_tables;
select count(*) from dba_tables where owner=user;
--
prompt run 2
select count(*) from user_tables;
select count(*) from dba_tables where owner=user;
--
prompt run 3
select count(*) from user_tables;
select count(*) from dba_tables where owner=user;
--
prompt run 4
select count(*) from user_tables;
select count(*) from dba_tables where owner=user;
--
prompt run 5
select count(*) from user_tables;
select count(*) from dba_tables where owner=user;
--
spool off
-- eof

Just a couple of things to mention. I always spool out to a file called the same thing as the script, with .lst at the end (which is the default file extension for spool commands, but I like to state it) and I like to use "prompt" to record in the output what is going on. Here it is obvious, sometimes it is not. Oh, and I like to put -- EOF at the end of my files, so I can spot cut-and-paste errors. I know, I am very "command line" centric. I love pictures too, Doug. Finally, I should not run the two versions with the same one first each time; I should swap them over after every two runs (not each run as then you are running A,A,B,B,A,A,B,B. A more complex A,B,A,B,B,A,B,A is better). But the results look messy, so I didn't.

Here is the output (trimmed a little, for the sake of brevity):-

run 1

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.58

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.73

run 2

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.28

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.10

run 3

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.28

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.10

run 4

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.28

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.10

run 5

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.29

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.09

At this point I might pull the results into Excel, as it is so good at drawing pretty graphs, but I think we can interpret the results “by eye” in this case.

Run    User Tabs   DBA_Tabs
1        0.58          0.73
2        0.28          0.10
3        0.28          0.10
4        0.28          0.10
5        0.29          0.09

AVG      0.28          0.10

My average ignores the first run for each version of the code, as previously explained {parse overhead/warming caches}.

You can see that version B is considerably faster than A. In this case, there is no overlap between the two result sets, by which I mean every run of version B is faster than every run of version A. Also, there is little variation in the performance of each version.

What if you got a result set like:-

Run    User Tabs   DBA_Tabs
1        0.77          0.68
2        0.35          0.25
3        0.42          0.34
4        0.48          0.38
5        0.48          0.30
6        0.41          0.29
7        0.40          0.28

AVG      0.42          0.31

If you look at the figures, both A and B slow down and then start to speed up. Something happened to slow down the system in the middle of the test and it was recovering but not back to “normal” by the end. If we had done 7 runs of A followed by 7 runs of B, B would have been hard hit by the temporary system slow-down {I opened a WORD document :-) }.

Also, there is some overlap in the results for A and B. The slowest result for B, 0.38, is slower than for the fastest result for A, 0.35. When the spread of times for test A and test B overlap like this, you need to do a few more tests to ensure the results are consistent. I did 7 but I should really have done 9 or maybe 15.
In any case, B is consistently 25% faster than A. It is Faster.

How about this result set:

Run    User Tabs   DBA_Tabs
1        0.87          0.71
2        0.41          0.42
3        0.47          0.43
4        0.37          0.42
5        0.39          0.34
6        0.51          0.38
7        0.42          0.40

AVG      0.43          0.40

The variation in the result sets is higher and in fact if you only looked at results for paired runs 2, 3 and 4 you would come to the conclusion that A was very slightly faster than B. In fact, B is slightly faster than A.

Or is it? There are only 6 tests {remember, ignore run 1}, the variation within the sets is high (.37 to .51 in A, .34 to .42 in B) and the overall difference is low. You could run a statistical analysis on this, a "Student's t" test I think, to see if the difference was significant. But unless I was looking at something utterly business-critical at this point, where the smallest fraction of improvement was important, I would not bother. I would be far better off looking for solution C or doing something else completely. If you ever find yourself doing scientific statistical analysis to decide which of two options is faster, it is probably time to go home and consider what else you could be doing to help your employer more…
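
If you really did want the statistics to do the work, Oracle has a built-in aggregate that performs an independent-samples t-test. A sketch, assuming the timings have been loaded into a hypothetical TEST_TIMINGS table with VERSION ('A' or 'B'), RUN_NO and ELAPSED_SECS columns:

select stats_t_test_indep(version, elapsed_secs, 'TWO_SIDED_SIG') two_sided_sig
from   test_timings
where  run_no > 1  -- ignore the first run of each version, as before
/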

Enough for tonight and HOPEFULLY the WordPress pixies will not eat the end of this post (yesterday's effort was better than tonight's, I think…)

Testing Methodology – How To Test January 26, 2010

Posted by mwidlake in performance, Testing.
1 comment so far

<Previous Post…

On those rare but pleasant occasions when I find myself running a training course on performance, something I always want to press home is “How to Test”.

I believe you can learn everything in every manual on Oracle technology and internals and still be very poor at performance tuning. Similarly, I think you can be lacking an awful lot of knowledge and still be good at performance tuning. It comes down to how you test. Not just how to test, but how you as an individual design and run your tests.

I joke with whoever I am talking to that what you need most of all to test is a watch and someone calling out "start" and "stop". i.e. you need to be able to time things. It is a bit of a throwaway statement, but actually most of us will do exactly this on a regular basis. We will be working on something, run it and it will come back in a few seconds. Then we will tweak it and run it again and mentally {or in my case, just very quietly under my breath} count. We even discuss it with people like that: "How long did it take that time?" "Ohh, a couple of seconds faster than last time".

I like tuning by timing very, very, very much.

Firstly, it is simple. If a SQL query runs faster, it runs faster. If you tune by looking at the explain plan and you see a full table scan being replaced with a nested loop and an index lookup, is it faster? It depends. If you tune by looking at the buffer gets and other statistics from the SGA (or wherever), and the "buffer gets" go down, is it faster? This depends again. If the disk reads went up, even by a relatively small amount, maybe not. If the memory usage went through the roof because of a sort that spills down to disc, well, quite probably not. Obviously if all of buffer gets, disk reads and CPU usage went down, you can be pretty certain you are having a positive impact in speeding up the statement. But timing the statement from start to finish gives you a nice, simple answer on whether it runs faster.

Secondly, it is simple. You do not have to look at plans, at runtime statistics, at deltas of session statistics, at trace files. All those things are good but you need more knowledge and experience to use them and time to set up, collect and study them. This is not to say you do not get a lot more out of tuning if you understand all of the stuff about trace files, explain plans etc, I hasten to add – but you do not need that to get going with performance tuning.

Thirdly, it is simple. You tell your manager's manager that you use 47% fewer buffer gets, 12% fewer disk reads but 9% more CPU and they will ask you what all that means. You tell them that version A runs in 2 minutes and 9 seconds and version B in 34 seconds, and they will understand. Better still, graph it and give them a PowerPoint…

Fourthly, it is simple. You can see that statement (a) is faster or slower than statement (b). Now you can look at the plan, at the statistics for the statement, at what wait events are occurring, and start tying up book knowledge with real-world results.

Of course, timing with a watch is very crude. You have a very capable piece of computing power underlying that database of yours, so let’s get it to do the timing.

As a simple start, in SQL*Plus use “set timing on” (it can be abbreviated to “set timi on”). Most GUI SQL tools will either default to telling you the elapsed time or have an obvious option to switch on the functionality.

{Oh, I'll just mention that if you cannot get timing in SQL*Plus or wherever to work, you might want to check what the initialisation parameter "TIMED_STATISTICS" is set to. Some ancient memory is telling me that if it is set to FALSE, you may not be able to use TIMING in SQL*Plus, but another memory is telling me that stopped being true a while back, version 8 maybe. I tried setting TIMED_STATISTICS to false in my session on 10.2 but TIMING still worked, but then I left STATISTICS_LEVEL alone and that causes TIMED_STATISTICS to be set. It is so long ago that I worked on a system that had TIMED_STATISTICS set to false! Even a quick google did not throw up an immediate answer}.

DB10.2> select count(*) from sn
  2  /
any key> 

  COUNT(*)
----------
       116

1 row selected.

DB10.2> set timi on
DB10.2> select count(*) from sn
  2  /
any key> 

  COUNT(*)
----------
       116

1 row selected.

Elapsed: 00:00:01.09

There you go, the count of SN records, all 116, took 1.09 seconds.

Yes, I had "set pause on" and the timing includes how long it takes me to spot that the query has finished and press a key. We all do it. However, sometimes the need for user input is still missed {something I see quite often is, for example, not pressing the button in PL/SQL Developer to get ALL the rows for the SQL statement, rather than just the first screenful}. A key thing is to try and remove from testing any waiting for user response, as we humans are slow and erratic.

so SET PAUSE OFF.

Now you have TIMING on and pause off. Time to run something and see how long it takes, then try and speed it up and run again:

DB10.2> select count(*) from person where to_char(dob,'YYYY') = 1988
  2  /

  COUNT(*)
----------
        29
Elapsed: 00:00:01.15

DB10.2> create index per_dob on person(dob);

Index created.

Elapsed: 00:00:01.31
DB10.2> select count(*) from person where to_char(dob,'YYYY') = 1988;

  COUNT(*)
----------
        29
Elapsed: 00:00:00.14

There you go, I added an index and the query went down from 1.15 seconds to 0.14, that is 8 times faster. Timing is timing.
Well, no, and this is something you need to be careful of if you are new to tuning.

The second iteration is nearly always faster.

Why? Well, the first time you run a piece of SQL it has to be parsed for a start, and that takes some time. More importantly, the first time you select the data it is going to be on disk. Oracle reads it in and puts it into memory. The second time you query the same data, it will be found in memory. That {nearly always} makes the next access to the data a lot faster. The index was not being used anyway, as I had a function on the column in the WHERE clause and that stops the index from being used.
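
As an aside, if I did want that index to be used, the usual fixes would be either to rewrite the predicate as a range on the unmodified column or to create a function-based index matching the expression {a sketch; note the original predicate compares to the number 1988, so the function-based index would also want the predicate changed to compare to the string '1988'}:

-- (a) range predicate on the raw column, so a plain index on DOB can be used
select count(*) from person
where  dob >= to_date('01-JAN-1988','DD-MON-YYYY')
and    dob <  to_date('01-JAN-1989','DD-MON-YYYY');

-- (b) or a function-based index matching the expression in the WHERE clause
create index per_dob_yr on person(to_char(dob,'YYYY'));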

So having said I love testing by timing, you need to be cautious about one-off tests. Oh, and here below is proof that the index I created is making no real difference to the speed of the SQL query:

DB10.2> set autotrace on
DB10.2> select count(*) from person where to_char(dob,'YYYY') = 1988;

  COUNT(*)
----------
        29
Elapsed: 00:00:00.12

Execution Plan
-----------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost 
-----------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     1 |     9 |   335
|   1 |  SORT AGGREGATE    |        |     1 |     9 |      
|*  2 |   TABLE ACCESS FULL| PERSON |    36 |   324 |   335
-----------------------------------------------------------

Statistics
----------------------------------------------------------
         20  recursive calls
         19  db block gets
        727  consistent gets
          0  physical reads
       1992  redo size

DB10.2> drop index per_dob
  2  /

Index dropped.

Elapsed: 00:00:00.03
DB10.2> select count(*) from person where to_char(dob,'YYYY') = 1988
  2  /

  COUNT(*)
----------
        29
Elapsed: 00:00:00.15

Execution Plan
----------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost
----------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     1 |     9 |  335
|   1 |  SORT AGGREGATE    |        |     1 |     9 |      
|*  2 |   TABLE ACCESS FULL| PERSON |    36 |   324 |  335
----------------------------------------------------------


Statistics
----------------------------------------------------------
        261  recursive calls
         19  db block gets
        841  consistent gets
          0  physical reads
       1992  redo size

We see 0.12 seconds go to 0.15 seconds, exactly the same explain plan for both statements, a small increase in consistent gets, and no physical reads by either statement (so the data is cached).
Why is the second statement a little slower, and why the large increase in recursive calls and db block gets? Because I dropped the index and the statement had to be reparsed. Now, if you are new to tuning you would almost certainly not have appreciated what the recursive calls and db block gets were all about, and it could distract you from the question of "does it run faster?". It is certainly good to know all about that, but when you are starting off, you want to keep things simple and learn in stages.

What the above demonstrates, I hope, is that the second thing you run will have an advantage and could probably run faster even though, in reality, it is using more resource. And we tend to run old code first and new code second. So swap them over, give the old code the advantage of being run second.

Do not test things once, as you can easily be caught out. Test each version several times. And ignore the first run of each version. This is not perfect advice, code in production may well be being parsed and gathering data from disk, but unless you can allow for this in your testing, I think it is generally better, for simple testing, to run each version 6 times. Of the six runs, ignore the first run of each and average the results of the other 5. Whichever one runs faster on average is, well, fastest. IF the difference is significant.

Oh. Where is the SQL AUDIT in all this? Well, ponder on why am I generating REDO for a simple select…

Decoding high_value and low_value January 3, 2010

Posted by mwidlake in internals, performance.
13 comments

There is an update to this post here.

The table DBA_TAB_COLUMNS holds the LOW_VALUE and HIGH_VALUE for columns. This information is potentially very useful to us, especially as when the CBO looks at a WHERE clause on a column for a given value, the High and Low values for that column are considered. If the value in the WHERE clause is outside the known range, the expected number of values to be found is decreased in proportion to the distance outside the range.

What is not so helpful is that Oracle stores, and displays, the information in an internal raw format. Which is utterly unhelpful to us of course. Why can’t they at least expose the information in a view in a human-readable format? Tsch.

So here is some code on how to decode low_value and high_value. I pulled most of this together a few years back but always struggled to decode the low and high values for dates, until I found this post by Gary Myers.

-- col_stats
-- Martin Widlake mdw 21/03/2003
-- MDW 11/12/09 enhanced to include more translations of low_value/high_value
-- pilfered from Gary Myers blog
col owner        form a6 word wrap
col table_name   form a15 word wrap
col column_name  form a22 word wrap
col data_type    form a12
col M            form a1
col num_vals     form 99999,999
col dnsty        form 0.9999
col num_nulls    form 99999,999
col low_v        form a18
col hi_v         form a18
col data_type    form a10
set lines 110
break on owner nodup on table_name nodup
spool col_stats.lst
select --owner
--      ,table_name
      column_name
      ,data_type
      ,decode (nullable,'N','Y','N')  M
      ,num_distinct num_vals
      ,num_nulls
      ,density dnsty
,decode(data_type
  ,'NUMBER'       ,to_char(utl_raw.cast_to_number(low_value))
  ,'VARCHAR2'     ,to_char(utl_raw.cast_to_varchar2(low_value))
  ,'NVARCHAR2'    ,to_char(utl_raw.cast_to_nvarchar2(low_value))
  ,'BINARY_DOUBLE',to_char(utl_raw.cast_to_binary_double(low_value))
  ,'BINARY_FLOAT' ,to_char(utl_raw.cast_to_binary_float(low_value))
  ,'DATE',to_char(1780+to_number(substr(low_value,1,2),'XX')
         +to_number(substr(low_value,3,2),'XX'))||'-'
       ||to_number(substr(low_value,5,2),'XX')||'-'
       ||to_number(substr(low_value,7,2),'XX')||' '
       ||(to_number(substr(low_value,9,2),'XX')-1)||':'
       ||(to_number(substr(low_value,11,2),'XX')-1)||':'
       ||(to_number(substr(low_value,13,2),'XX')-1)
,  low_value
       ) low_v
,decode(data_type
  ,'NUMBER'       ,to_char(utl_raw.cast_to_number(high_value))
  ,'VARCHAR2'     ,to_char(utl_raw.cast_to_varchar2(high_value))
  ,'NVARCHAR2'    ,to_char(utl_raw.cast_to_nvarchar2(high_value))
  ,'BINARY_DOUBLE',to_char(utl_raw.cast_to_binary_double(high_value))
  ,'BINARY_FLOAT' ,to_char(utl_raw.cast_to_binary_float(high_value))
  ,'DATE',to_char(1780+to_number(substr(high_value,1,2),'XX')
         +to_number(substr(high_value,3,2),'XX'))||'-'
       ||to_number(substr(high_value,5,2),'XX')||'-'
       ||to_number(substr(high_value,7,2),'XX')||' '
       ||(to_number(substr(high_value,9,2),'XX')-1)||':'
       ||(to_number(substr(high_value,11,2),'XX')-1)||':'
       ||(to_number(substr(high_value,13,2),'XX')-1)
,  high_value
       ) hi_v
from dba_tab_columns
where owner      like upper('&tab_own')
and   table_name like upper(nvl('&tab_name','WHOOPS')||'%')
ORDER BY owner,table_name,COLUMN_ID
/
clear colu
spool off
clear breaks

Most of the translation is done via the utl_raw.cast_to_datatype functions but no such function is provided for dates, which is where most of us come unstuck. Gary recognised that the data was being stored in hex format, with an offset for the year.

I hope the script helps some of you.
{There are also procedures under DBMS_STATS, DBMS_STATS.CONVERT_RAW_VALUE and friends, that can also be called to translate many datatypes.}
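
A sketch of using the NUMBER overload {set serveroutput on first; the owner, table and column are just the ones from the example output below}:

set serveroutput on
declare
  l_raw raw(32);
  l_num number;
begin
  select low_value into l_raw
  from   dba_tab_columns
  where  owner = 'DWPERF' and table_name = 'AD_SQL_EXEC_P' and column_name = 'INST_ID';
  dbms_stats.convert_raw_value(l_raw, l_num);
  dbms_output.put_line('low_value = '||l_num);
end;
/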

If anyone knows how to decode timestamps, I'd love to know as it would save me trying to work it out/find it on the Net somewhere. I'll add it to the post so that there is one place to find all the translations.

Here is a quick output for a test table

@col_stats
Enter value for tab_own: dwperf
old  40: where owner      like upper('&tab_own')
new  40: where owner      like upper('dwperf')
Enter value for tab_name: ad_sql_exec_p
old  41: and   table_name like upper(nvl('&tab_name','WHOOPS')||'%')
new  41: and   table_name like upper(nvl('ad_sql_exec_p','WHOOPS')||'%')
any key> 

COLUMN_NAME            DATA_TYPE  M   NUM_VALS  NUM_NULLS   DNSTY LOW_V              HI_V
---------------------- ---------- - ---------- ---------- ------- ------------------ ---------------
INST_ID                NUMBER     Y          4          0  0.2500 1                  4
SQL_ID                 VARCHAR2   Y     87,104          0  0.0000 008d71chrks14      gzw309ttkx862
PLAN_HASH_VALUE        NUMBER     Y      1,884          0  0.0005 2646229            4294043874
SPC_ID                 NUMBER     N         83          0  0.0120 1076               2269
ADDRESS                RAW        N    118,198          0  0.0000 00000003E33123A0   0000001342FF3FF8
HASH_VALUE             NUMBER     N     87,104          0  0.0000 2758710            4294676643
CREATED_DATE           DATE       N        120          0  0.0083 2009-10-23 8:19:10 2009-12-10 9:19:13
LATEST_FIRST_LOAD_TIME DATE       N     11,791          0  0.0001 2009-9-14 11:55:59 2009-12-13 9:33:24
TOTAL_LOADS            NUMBER     N         98          0  0.0102 1                  55047
TOTAL_PARSE_CALLS      NUMBER     N         92          0  0.0109 0                  2972
TOTAL_EXECUTIONS       NUMBER     N        235          0  0.0043 0                  834624
TOTAL_ROWS_PROCESSED   NUMBER     N        809          0  0.0012 0                  26946123
TOTAL_FETCHES          NUMBER     N        313          0  0.0032 0                  834624
TOTAL_BUFFER_GETS      NUMBER     N      3,016          0  0.0003 0                  3355576809
TOTAL_DISK_READS       NUMBER     N        985          0  0.0010 0                  28189240
TOTAL_DIRECT_WRITES    NUMBER     N         98          0  0.0102 0                  751289
TOTAL_SORTS            NUMBER     N        106          0  0.0094 0                  5283
TOTAL_CPU_TIME         NUMBER     N     94,401          0  0.0000 1337               12183936207
TOTAL_ELAPSED_TIME     NUMBER     N    115,203          0  0.0000 1337               139692482086
TOTAL_OPTIMIZER_COST   NUMBER     N      1,467          0  0.0007 0                  369740902209315000
                                                                                     0000000

Using AWR data for OS information December 22, 2009

Posted by mwidlake in performance.
5 comments

This is a demo of using AWR data to get some OS performance information.

This posting was actually a comment by (and is the work of) Bernard Polarski on a recent post about command_types I did a week or two back. I've lifted it into a separate posting as (a) I thought it might be useful to people and it would not get as much exposure as a comment as it would as a post, (b) I want to try and lay it out better for Bernard (WordPress likes to garble code in comments, which it did to Bernard's efforts) and (c) I strongly believe in nicking other people's code, if they are happy to put it "out there". Why re-code what someone has already done and is happy to share :-)

I tried the code (I added the SYS. ownership on tables, so it will work even if you can’t log in as sys but have access to the objects) and it worked fine for me.

I should just add, use of AWR data is covered by an Oracle licence, the diagnostic pack. If you do not have a licence for it, Oracle deem the WRH$ views as not yours to view, even though they are on your box, in the database you paid for, containing data gathered by your CPUs by code that is damned tricky to turn off and is on by default. Sheesh.

Any mistakes or typos, blame me, not Bernard. I took the liberty of tweaking some of the English; hope that is OK, Bernard.

*************************************
Okay, a challenge: let me try to surprise you with something improbable. Some months ago, while investigating an unusual load on a Linux box, the Unix sysadmin told us he could not go back more than 2 weeks with 'sar' history. Since Oracle also gathers system stats, I quickly enriched smenu with an Oracle sar.
The example below gives the sar stats for the last 10 snaps, it is stored into shortcut sts and called with param ‘-sar’


/tmp: sts -sar 10

+ sqlplus -s '/ as sysdba'
+
set feed off verify off lines 190 pages 66
col inst  new_value inst noprint;
col id1      head 'Idle time(%)'    justify c
col usr1     head 'User time(%)'    justify c
col sys1     head 'Sys time(%)'     justify c
col io1      head 'Io Wait time(%)' justify c
col nice1    head 'Nice time (%)'   justify c
col snap_len head 'Interval| (Secs)' justify c
col num_cpus   new_value p_num_cpus  head 'Number of CPU';
col a1         new_value secs noprint;
col SNAP_BEGIN format a20 head 'Snap begin' justify c
col SNAP_END   format a20 head 'Snap end'   justify c

 select instance_number inst from v$instance;
 SELECT value num_cpus 
 FROM v$osstat WHERE stat_name = 'NUM_CPUS';
 prompt
 prompt Negatives values correspond to Shutdown:
 prompt
select  snap_id, snap_len, round(id1 /snap_len *100,2) id1,
                       round(usr1 /snap_len *100,2) usr1,
                       round(sys1 /snap_len *100,2) sys1,
                       round(io1 /snap_len *100,2) io1,
                       round(nice1 /snap_len *100,2) nice1 
                     , snap_begin, snap_end
from (
     select  snap_id,  id1, usr1,sys1, io1, nice1
           , snap_begin, snap_end ,
             round( extract( day from diffs) *24*60*60+
                    extract( hour from diffs) *60*60+
                    extract( minute from diffs )* 60 +
                    extract( second from diffs )) snap_len 
             -- above is the exact length of the snapshot in seconds
     from ( select /*+ at this stage, each row show the cumulative value.
                       r1    7500  8600
                       r2    7300  8300
                       r3    7200  8110
                    we use [max(row) - lag(row)] to have the difference 
                    between [row and row-1], to obtain differentials values:
                       r1    200   300
                       r2    100   190
                       r3    0       0
                    */
        a.snap_id,
        (max(id1)    - lag( max(id1))   over (order by a.snap_id))/100        id1 ,
        (max(usr1)   - lag( max(usr1))  over (order by a.snap_id))/100        usr1,
        ( max(sys1)  - lag( max(sys1))  over (order by a.snap_id))/100        sys1,
        ( max(io1)   - lag( max(io1))   over (order by a.snap_id))/100        io1,
        ( max(nice1) - lag( max(nice1)) over (order by a.snap_id))/100        nice1,
          -- for later display
          max(to_char(BEGIN_INTERVAL_TIME,' YYYY-MM-DD HH24:mi:ss'))          snap_begin,
          -- for later display
          max(to_char(END_INTERVAL_TIME,' YYYY-MM-DD HH24:mi:ss'))            snap_end,
          -- exact len of snap used for percentage calculation
        ( max(END_INTERVAL_TIME)-max(BEGIN_INTERVAL_TIME))                    diffs
        from ( /*+  perform a pivot table so that the 5 values selected appear 
                    on one line. The CASE distributes col a.value among 5 new 
                    columns, but creates a row for each. We will use the group 
                    by (snap_id) to condense the 5 rows into one. If you don't 
                    see how this works, just remove the group by and max function, 
                    then re-add it and you will see the use of max(). 
                    Here is what you will see:
        Raw data :   1000     2222
                     1000     3333
                     1000     4444
                     1000     5555
                     1000     6666
        The SELECT CASE creates populate our inline view with structure:
                       ID    IDLE    USER   SYS   IOWAIT   NICE
                     1000    2222
                     1000             3333
                     1000                   4444
                     1000                          5555
                     1000                                  6666
         the group by(1000) condenses the rows in one:
                       ID    IDLE    USER   SYS   IOWAIT  NICE
                      1000   2222    3333   4444   5555   6666
               */
               select a.snap_id,
                  case b.STAT_NAME
                       when 'IDLE_TIME' then a.value / &p_num_cpus
                   end  id1,
                  case b.STAT_NAME
                       when 'USER_TIME' then a.value / &p_num_cpus
                  end usr1 ,
                  case b.STAT_NAME
                       when 'SYS_TIME' then  a.value / &p_num_cpus
                  end sys1 ,
                  case b.STAT_NAME
                       when 'IOWAIT_TIME' then  a.value / &p_num_cpus
                  end io1,
                  case b.STAT_NAME
                       when 'NICE_TIME' then  a.value / &p_num_cpus
                  end nice1
                  from  sys.WRH$_OSSTAT a,  sys.WRH$_OSSTAT_NAME b
                       where
                            a.dbid      = b.dbid       and
                            a.STAT_ID   = b.stat_id    and
                            instance_number = &inst    and
                            b.stat_name in ('IDLE_TIME','USER_TIME','SYS_TIME','IOWAIT_TIME','NICE_TIME') and
                            a.snap_id > (( select max(snap_id) from  sys.WRH$_OSSTAT) - 10  -1 )
                   order by 1 desc
              ) a,  sys.wrm$_snapshot s
         where  a.snap_id = s.snap_id
         group by a.snap_id
         order by snap_id desc
        )
   )where rownum < (10+1);





Number of CPU
-------------
            2

Negatives values correspond to Shutdown:


            Interval
   SNAP_ID   (Secs)   Idle time(%) User time(%) Sys time(%) Io Wait time(%) Nice time (%)      Snap begin            Snap end
---------- ---------- ------------ ------------ ----------- --------------- ------------- -------------------- --------------------
      2212       3603        87.07         9.43        2.84           16.19             0  2009-12-15 14:00:19  2009-12-15 15:00:21
      2211       3613        90.15         7.31        2.14           18.97             0  2009-12-15 13:00:06  2009-12-15 14:00:19
      2210       3593        92.58         5.91         1.4            7.91             0  2009-12-15 12:00:13  2009-12-15 13:00:06
      2209       3574        91.24         6.78        1.73            9.37             0  2009-12-15 11:00:38  2009-12-15 12:00:13
      2208       3620        90.04          7.8        1.96           11.81             0  2009-12-15 10:00:19  2009-12-15 11:00:38
      2207       3594         89.8         6.89        2.37           12.99             0  2009-12-15 09:00:25  2009-12-15 10:00:19
      2206       3611        91.75         5.81        1.87            8.29             0  2009-12-15 08:00:14  2009-12-15 09:00:25
      2205       3599        92.38         5.63        1.68            8.57             0  2009-12-15 07:00:15  2009-12-15 08:00:14
      2204       3605        91.72         6.51        1.58           10.79             0  2009-12-15 06:00:11  2009-12-15 07:00:15
      2203       3601        86.36        10.35        2.94           10.62             0  2009-12-15 05:00:10  2009-12-15 06:00:11

Ensuring Correlated Partition Exclusion #2 December 20, 2009

Posted by mwidlake in performance, Uncategorized, VLDB.
2 comments

<Previous Post

A few days ago {oh dear, it is now two weeks!} I showed how you could better control Correlated Partition Exclusion by creating a lookup table that showed, for each partition, the minimum and maximum ID (the partition key) and the minimum and maximum CRE_DATETIME (what you want to limit your query on). Using this range table in the way I described, you did not have to take an educated guess as to what range of IDs would include the partitions for the date range you were interested in; you could check the table.

But this only worked reliably where the ID and CRE_DATETIME increase directly in proportion to each other. It was a special case.

What about the more normal case, where there is a correlation between the two columns but it is not a perfect correlation? This would happen, for example, if the CRE_DATETIME is entered from another system, or from paper records, where the order of the records is not enforced. So some records from today get put into the system before some from yesterday. Or the correlation may be even looser than this, e.g. you are looking at orders for new customers. You "know" there are going to be no orders for these customers from 5 years ago but you are not sure how far back in the orders table you should go to find what you want.

You can still use the lookup table method. The lookup table in effect becomes a meta-index – an index of the segments where you will find data but not actually the rows.

To demonstrate this, I created a table where the ID and CRE_DATETIME increase in order:

-- create a partitioned table test_p4
create table test_p4
(id           number(10) not null
,cre_datetime date not null
,status        number(1) not null
,num_1         number(4) not null -- random 20
,num_2         number(4) -- random 500
,num_3         number(5) -- cycle smoothly
,num_4         number(5) -- Random 10000
,vc_1          varchar2(10)
,vc_2          varchar2(10)
,vc_pad        varchar2(2000))
tablespace users
partition by range (id)
(partition id_01k values less than (1000)
tablespace users
,partition id_02k values less than (2000)
tablespace users
...
,partition id_45k values less than (45000)
tablespace users
,partition id_max values less than (maxvalue)
tablespace users
)
--
-- local partitioned indexes on the table
 IND_NAME           TAB_NAME           PSN       COL_NAME
 ------------------ ------------------ --------- ------------
 TP4_CRE_DT         TEST_P4            1         CRE_DATETIME

 TP4_PK             TEST_P4            1         ID
--
-- populate the table with data
insert into test_p4(id,cre_datetime,status,num_1,num_2,num_3,num_4
                  ,vc_1,vc_2,vc_pad)
select rownum
,to_date('01-JUL-2009','DD-MON-YYYY')+(rownum/360)
,decode(mod(rownum,100),0,1
              ,0)
,trunc(dbms_random.value(1,20)) 
,trunc(dbms_random.value(1,50)) 
,mod(rownum,10)+1
,trunc(dbms_random.value(1,10000))
,dbms_random.string('U',5)
,lpad(chr(mod(rownum,6)+65),5,chr(mod(rownum,6)+65) )
,lpad('A',1000,'A')
from dba_objects
where rownum < 43000
/

I then messed with the data, updating 10% of records and setting the CRE_DATETIME to plus or minus a random amount up to 2 days different, so data in partitions would overlap. I then created a range table in the same way as I did for the previous post.
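For reference, the “messing” and the rebuild of the ranges table were along these lines. This is a sketch of the idea rather than the exact script I ran; picking the rows with mod(id,10)=0 and grouping by 1,000-ID bands are assumptions that happen to line up with the partitioning used here.

-- shift cre_datetime by up to +/- two days for roughly 10% of the rows
-- (mod(id,10)=0 is just one way to pick ~10% of them)
update test_p4
set    cre_datetime = cre_datetime + dbms_random.value(-2,2)
where  mod(id,10) = 0;
commit;

-- build the ranges lookup table, one row per 1,000-ID band (= one per partition)
create table tp4_range as
select min(id)            min_id
      ,max(id)            max_id
      ,min(cre_datetime)  min_cre_dati
      ,max(cre_datetime)  max_cre_dati
from   test_p4
group  by trunc(id/1000);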

I ended up with a range table like the one below:

MIN_CRE_DATI      MAX_CRE_DATI          MIN_ID     MAX_ID
----------------- ----------------- ---------- ----------
29-JUN-2009 20:35 05-JUL-2009 14:24          1        999
02-JUL-2009 04:50 07-JUL-2009 12:56       1000       1999
05-JUL-2009 02:34 10-JUL-2009 08:31       2000       2999
08-JUL-2009 05:23 13-JUL-2009 03:32       3000       3999
11-JUL-2009 08:07 15-JUL-2009 18:41       4000       4999
14-JUL-2009 00:14 18-JUL-2009 18:27       5000       5999
16-JUL-2009 08:58 21-JUL-2009 14:18       6000       6999
19-JUL-2009 01:28 24-JUL-2009 11:21       7000       7999
22-JUL-2009 08:02 27-JUL-2009 07:01       8000       8999
24-JUL-2009 22:06 30-JUL-2009 05:37       9000       9999
28-JUL-2009 04:59 01-AUG-2009 10:57      10000      10999
...
24-SEP-2009 01:52 28-SEP-2009 18:36      31000      31999
26-SEP-2009 16:49 01-OCT-2009 01:26      32000      32999
29-SEP-2009 13:20 04-OCT-2009 13:43      33000      33999
02-OCT-2009 08:40 07-OCT-2009 10:11      34000      34999
05-OCT-2009 04:29 10-OCT-2009 04:09      35000      35999
08-OCT-2009 02:04 12-OCT-2009 17:34      36000      36999
10-OCT-2009 20:03 15-OCT-2009 08:39      37000      37999
13-OCT-2009 15:09 18-OCT-2009 12:01      38000      38999
16-OCT-2009 06:49 21-OCT-2009 03:53      39000      39999
18-OCT-2009 20:16 23-OCT-2009 21:01      40000      40999
21-OCT-2009 12:10 26-OCT-2009 07:13      41000      41999
25-OCT-2009 01:29 29-OCT-2009 20:56      42000      42999

You can see that the MIN_CRE_DATI-MAX_CRE_DATI from record to record (partition to partition) overlap but generally increase.

How do you find the start and end of the ID range to cover a date period you are interested in? I always have to sit down with a pen and paper to work this out. It is the classic “overlapping ranges” check {two ranges overlap when each one starts before the other one ends}, but my brain cannot hold on to it. So here goes. I want all records between the 1st of October and the 6th of October, 2009.

You want to find the partition with the lowest ID range that could hold a record falling into the date range you want, i.e. one whose maximum CRE_DATETIME is at or later than the start of the range you are interested in. If a partition's maximum CRE_DATETIME is before the start of the date range, no records of interest will be in that partition.

Here is the code to find the lowest qualifying partition:

select min(min_id)
from tp4_range
where MAX_CRE_DATI>=to_date('01-OCT-2009','DD-MON-YYYY');

MIN(MIN_ID)
-----------
      32000

Execution Plan
----------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost 
----------------------------------------------------------
|   0 | SELECT STATEMENT             |                |     1 |    12 |     2
|   1 |  SORT AGGREGATE              |                |     1 |    12 |         
|   2 |   TABLE ACCESS BY INDEX ROWID| TP4_RANGE      |    12 |   144 |     2
|*  3 |    INDEX RANGE SCAN          | TP4R_MACD_MIAD |    12 |       |     1
----------------------------------------------------------

To cover the whole of the partition of interest you need to look for records with an ID greater than the minimum in that partition, thus the selection of the min(min_id).

Similarly, you only want partitions where the minimum CRE_DATETIME is before the end of the date range you want. If all records in the partition have a CRE_DATETIME beyond the range, you are not interested in it.

select max(max_id)
from tp4_range
where MIN_CRE_DATI<=to_date('06-OCT-2009','DD-MON-YYYY');

MAX(MAX_ID)
-----------
      35999

Execution Plan
----------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost 
----------------------------------------------------------
|   0 | SELECT STATEMENT             |                |     1 |    13 |     2
|   1 |  SORT AGGREGATE              |                |     1 |    13 |         
|   2 |   TABLE ACCESS BY INDEX ROWID| TP4_RANGE      |    37 |   481 |     2
|*  3 |    INDEX RANGE SCAN          | TP4R_MICD_MIID |    37 |       |     1
----------------------------------------------------------

Then you put these sub-queries into the query that selects the data you want. Below I show the “original” code, which does not use the partition range table, and then the version that does use it, to prove that the same number of records comes back and to see how the plan and statistics change:

select count(*) from test_p4
where cre_datetime between to_date('01-OCT-2009','DD-MON-YYYY')
                   and     to_date('06-OCT-2009','DD-MON-YYYY')
and num_2 = 5

  COUNT(*)
----------
        48

Execution Plan
----------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name       | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |            |     1 |    11 |   356   (1)| 00:00:02 |       |       |
|   1 |  SORT AGGREGATE                     |            |     1 |    11 |            |          |       |       |
|   2 |   PARTITION RANGE ALL               |            |    36 |   396 |   356   (1)| 00:00:02 |     1 |    46 |
|*  3 |    TABLE ACCESS BY LOCAL INDEX ROWID| TEST_P4    |    36 |   396 |   356   (1)| 00:00:02 |     1 |    46 |
|*  4 |     INDEX RANGE SCAN                | TP4_CRE_DT |  1788 |       |    52   (0)| 00:00:01 |     1 |    46 |
----------------------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        676  consistent gets
          0  physical reads

Note the 676 consistent gets and the Pstart/Pstop of 1-46

select count(*) from test_p4
where cre_datetime between to_date('01-OCT-2009','DD-MON-YYYY')
                   and     to_date('06-OCT-2009','DD-MON-YYYY')
and id> (select min(min_id)
             from tp4_range where MAX_CRE_DATI
                 >=to_date('01-OCT-2009','DD-MON-YYYY'))
and id < (select max(max_id)
             from tp4_range where MIN_CRE_DATI
                <=to_date('06-OCT-2009','DD-MON-YYYY'))
and num_2 = 5

  COUNT(*)
----------
        48

Execution Plan
------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name           | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                |     1 |    16 |    66   (5)| 00:00:01 |       |       |
|   1 |  SORT AGGREGATE                      |                |     1 |    16 |            |          |       |       |
|   2 |   PARTITION RANGE ITERATOR           |                |     1 |    16 |    62   (5)| 00:00:01 |   KEY |   KEY |
|*  3 |    TABLE ACCESS BY LOCAL INDEX ROWID |                |     1 |    16 |    62   (5)| 00:00:01 |   KEY |   KEY |
|   4 |     BITMAP CONVERSION TO ROWIDS      |                |       |       |            |          |       |       |
|   5 |      BITMAP AND                      |                |       |       |            |          |       |       |
|   6 |       BITMAP CONVERSION FROM ROWIDS  |                |       |       |            |          |       |       |
|   7 |        SORT ORDER BY                 |                |       |       |            |          |       |       |
|*  8 |         INDEX RANGE SCAN             | TP4_PK         |  1788 |       |     2   (0)| 00:00:01 |   KEY |   KEY |
|   9 |          SORT AGGREGATE              |                |     1 |    12 |            |          |       |       |
|  10 |           TABLE ACCESS BY INDEX ROWID| TP4_RANGE      |    12 |   144 |     2   (0)| 00:00:01 |       |       |
|* 11 |            INDEX RANGE SCAN          | TP4R_MACD_MIAD |    12 |       |     1   (0)| 00:00:01 |       |       |
|  12 |          SORT AGGREGATE              |                |     1 |    13 |            |          |       |       |
|  13 |           TABLE ACCESS BY INDEX ROWID| TP4_RANGE      |    37 |   481 |     2   (0)| 00:00:01 |       |       |
|* 14 |            INDEX RANGE SCAN          | TP4R_MICD_MIID |    37 |       |     1   (0)| 00:00:01 |       |       |
|  15 |       BITMAP CONVERSION FROM ROWIDS  |                |       |       |            |          |       |       |
|  16 |        SORT ORDER BY                 |                |       |       |            |          |       |       |
|* 17 |         INDEX RANGE SCAN             | TP4_CRE_DT     |  1788 |       |    52   (0)| 00:00:01 |   KEY |   KEY |
------------------------------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        350  consistent gets
          0  physical reads

The plan is admittedly more complex and the SQL can be tricky to understand if you are not used to code that looks for the overlap of ranges. But consistent gets are down to 350 and the Pstart/Pstop values are KEY-KEY. The CBO cannot tell you WHAT the ranges will be when the code is parsed {I think it should check, but in 10.2.0.3 at least the CBO is not that smart at parse time}, but it knows there will be a start and a stop.

I am using tiny partitions for my example and only 45 of them for the table. When the partitions are for millions of rows and there are a couple of thousand of them, excluding partitions in a manner you can rely on is a powerful tool. Which leads onto a final word of caution.

Nothing is enforcing that the range table is maintained.

You could do things with triggers or a regular re-calculation of the ranges table, but it is something you need to consider if you use ranges tables to help partition exclusion. Flipping tablespaces to read-only can help the worry go away though… :-)
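If you go down the regular re-calculation route, a minimal sketch of the refresh {again assuming the 1,000-ID banding used above} could be a simple MERGE. For genuinely large tables you would want to restrict it to recently changed ID bands rather than re-scanning everything:

-- re-derive each ID band's date range and merge it into the lookup table
merge into tp4_range r
using (select min(id)            min_id
             ,max(id)            max_id
             ,min(cre_datetime)  min_cre_dati
             ,max(cre_datetime)  max_cre_dati
       from   test_p4
       group  by trunc(id/1000)) n
on    (r.min_id = n.min_id)
when matched then update
     set r.max_id       = n.max_id
        ,r.min_cre_dati = n.min_cre_dati
        ,r.max_cre_dati = n.max_cre_dati
when not matched then insert (min_id, max_id, min_cre_dati, max_cre_dati)
     values (n.min_id, n.max_id, n.min_cre_dati, n.max_cre_dati);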

As an example of the issue of needing to maintain the ranges table, and also a demonstration that the ranges table does work correctly if maintained, I’ll update a record well outside of the “expected window”, show that it does not appear in my range-check controlled code, then update the ranges table and try again.

update test_p4
set cre_datetime=to_date('02-OCT-2009 11:15','DD-MON-YYYY HH24:MI')
   ,NUM_2=5
where id=22100
/

select count(*) from test_p4
where cre_datetime between to_date('01-OCT-2009','DD-MON-YYYY')
                   and     to_date('06-OCT-2009','DD-MON-YYYY')
and id> (select min(min_id)
             from tp4_range where MAX_CRE_DATI
                 >=to_date('01-OCT-2009','DD-MON-YYYY'))
and id < (select max(max_id)
             from tp4_range where MIN_CRE_DATI
                <=to_date('06-OCT-2009','DD-MON-YYYY'))
and num_2 = 5
/

  COUNT(*)
----------
        48
-- the updated record is missed; ID 22100 is outside the ID window derived
-- from the now-stale ranges table

-- refresh the tp4_range row covering ID 22100 and run the same query again

  COUNT(*)
----------
        49
-- success

I wonder if I will do the next posting on this topic in less than two weeks!

Scalar Functions Not Behaving as Scalar Functions December 8, 2009

Posted by mwidlake in performance, Uncategorized.
Tags: ,
3 comments

My thanks go to Graeme Hobbs, who I currently have the pleasure of working with, for explaining this to me.

This is all about scalar functions that do not act like scalar functions.

Where I am spending much of my time at the moment, we have an issue with a very simple SQL statement. It takes on average 2 buffer gets per execution. How could that be an issue? Well, it is being executed a few thousand times an hour. By “a few thousand” read 50,000 times. Even a simple statement being executed 15 times a second is not good if it is not needed. Especially when, as in this case, there is a context switch involved.

I got a look at the code and it is doing something like this:

HAVING SUM( col.vol_gbp ) >=
my_user.pkg_application_settings.get_number_value( 'rp157_eric' )

I quickly batted an email to my esteemed colleague Mr Hobbs saying “but this is a scalar. It will be executed once for the statement and the value passed into each execution of the whole query, it can’t be the source of the 50,000 executions an hour”.

Oh woe, for my ignorance has undone me. But it’s OK, Graeme not only took time to explain, but knocked up the following nice demo of WHY this is a problem. And said I could put it on my Blog. What a nice chap.

From Graeme:
******************************************************

Consider this package:

CREATE OR REPLACE PACKAGE BODY ghtest
IS
l_called_count NUMBER := 0;

-- Function and procedure implementations
FUNCTION f_get_called_count
RETURN NUMBER
IS
BEGIN
RETURN( l_called_count );
END;

FUNCTION f_call
RETURN NUMBER
IS
BEGIN
l_called_count := l_called_count + 1;
RETURN 1;
END;

FUNCTION f_reset
RETURN NUMBER
IS
BEGIN
l_called_count := 0;
RETURN 1;
END;
END ghtest;
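
{The matching package specification is not shown above; to run the demo yourself you would need something along the lines of this minimal sketch.}

CREATE OR REPLACE PACKAGE ghtest
IS
   -- reports how many times f_call has been invoked in this session
   FUNCTION f_get_called_count RETURN NUMBER;
   FUNCTION f_call             RETURN NUMBER;
   FUNCTION f_reset            RETURN NUMBER;
END ghtest;
/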

-- nothing up my sleeve
SELECT ghtest.f_get_called_count
FROM DUAL;

F_GET_CALLED_COUNT
------------------
0
i.e. no calls have been made yet.

SQL> SELECT   id
           , COUNT( * )
        FROM ( SELECT *
                FROM MY_USR.tbl_SOME_DATA
               WHERE ROWNUM < 10 )
    GROUP BY id
      HAVING COUNT( * ) > ghtest.f_call;

        ID   COUNT(*)
---------- ----------
1624361594          3

SELECT ghtest.f_get_called_count
FROM DUAL;

F_GET_CALLED_COUNT
------------------
7

NOTE the counter has gone up to 7, as there were 7 IDs found by that code {6 IDs with one record each and one ID with 3 records}.

SELECT ghtest.f_reset
FROM DUAL;

F_RESET
----------
1

That just blanks the counter (see the package).

-- nothing up my sleeve
SELECT ghtest.f_get_called_count
FROM DUAL;

F_GET_CALLED_COUNT
------------------
0

SELECT   id
          , COUNT( * )
       FROM ( SELECT *
               FROM my_usr.tbl_some_data
              WHERE ROWNUM < 10 )
   GROUP BY id
     HAVING COUNT( * ) > (SELECT ghtest.f_call FROM dual);

        ID   COUNT(*)
---------- ----------
1624361594          3

SELECT ghtest.f_get_called_count
FROM DUAL;

F_GET_CALLED_COUNT
------------------
1

You can see that the scalar subquery
HAVING COUNT( * ) > (SELECT ghtest.f_call FROM dual);
calls f_call once.

Whereas
HAVING COUNT( * ) > ghtest.f_call;
results in it being called 7 times – the number of groups returned by the GROUP BY.

That is precisely the problem.
******************************************************
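Applying this back to the statement that started it all, the fix is presumably just to wrap the package call in a scalar subquery, something like the below, so the function (and its context switch) is invoked once per execution of the statement rather than once for every group produced by the GROUP BY:

HAVING SUM( col.vol_gbp ) >=
       ( SELECT my_user.pkg_application_settings.get_number_value( 'rp157_eric' )
         FROM dual )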

Apparently our developers should know this, as it is a “known issue”. But it was not known to me until now (unless I knew it and had forgotten it).

So, hats off to Graeme.

{apologies, wordpress utterly screwed the layout/truncated text when I pasted in the example, have fixed now}
