Richard Foote on the Impact of stats staying the same. February 16, 2010
Posted by mwidlake in performance.Tags: Blogging, performance, statistics
add a comment
I just wanted to highlight this very good posting by Richard Foote. He is talking about how SQL execution plans can change when “nothing else does”. Not the table structures, not the code, note the initialisation parameters, not even the table and index stats.
But something does change, which is what day it is (or what hour it is or what week it is). Time moves on and our data does too. If the statistics on the tables does NOT move on, then the CBO thinks that the range of data in the table does not change. So, the CBO thinks your queries are getting further and further “out of range” and so would expect to find less and less data in the tables to bring back. That will lead to plan changes.
If you have noticed my preoccupation with identifying the contents of histograms and high/low values in column stats, you may appreciate that this topic is one I have been finding is a large part of my day job.
Richard explains the point very well, as always, so go have a look.
Stats Need Stats to Gather Stats February 16, 2010
Posted by mwidlake in performance.Tags: partitions, performance, statistics
10 comments
Did you know that you sometimes need good stats so that the stats-gathering package can gather stats in an efficient way? This is a recent, quite extreme example.
I’ve been forced to learn a lot about gathering Oracle stats using DBMS_STATS over the last 4 or 5 years. But no matter how much I learn about the “challengingly bizarre” way in which it works, it seems every week or two there is a new oddity. I plan a whole series on the topic “soon”.
This particular example is from a 10.2.0.3 system.
I am gathering partition-only table stats as we are using the ability of Oracle to roll up Partition stats to Global stats under certain specific conditions. One of the conditions is that you need stats for every partition. Plus, to get global column stats, each partition must have stats for each column. Some of our partitions lacked or had very bad stats.
So I quickly knocked up a script-generating script to create DBMST_STATS.GATHER_TABLE_STATS statements that collected, for those partitions:
- ONLY partition stats.
- NO cascade down to indexes
- BLOCK sampling so it is fast {and poor, but there you go}
- ESTIMATE_PERCENT of 2, which is quite low for block sampling
- collect histograms as we decided the devil of having them was better than the devil of not having them.
the above is not ideal to get “good stats”, but it is quick and gets OK stats which is what we need right now. An example statement is:
begin -- part SD_INFO-DY07032004 rows 34554 dbms_stats.gather_table_stats(ownname => 'ERIC',tabname=> 'SD_INFO' ,partname=> 'DY07032004' ,granularity=> 'PARTITION', estimate_percent => 2,block_sample=> true ,cascade=> FALSE ,method_opt=> 'FOR ALL COLUMNS SIZE AUTO' ,no_invalidate=> false); END; /
Not a big partition (34554 rows and not very wide rows) and so did not take long to gather:
PARTITION_NAME LAST_ANALYZED ------------------------------ -------------------- DY04012004 12-FEB-2010 16:26:57 DY05012004 12-FEB-2010 16:27:00 DY06012004 12-FEB-2010 16:27:04 DY07012004 12-FEB-2010 16:27:07 DY08012004 12-FEB-2010 16:27:11 -- 4 seconds difference DY09012004 12-FEB-2010 16:27:15
I’ve collected statistics for a few thousand partitions over the last couple of days and the time taken is anything between just under half a second to 10 seconds per partition, the odd unusually large partition taking a minute or so. {I believe it takes half a second to gather stats on an empty partition, on our system at least, due to the time it takes for the internal housekeeping, including copying the old statistics information to the SYS.WRI$_OPSTAT_… tables to support restoring stats}. Sorry, I drift away from my main point.
This partition took a lot longer than 10 seconds:
begin -- part W_ACTIVITY_FAILURE-DY02092008 rows 49425 dbms_stats.gather_table_stats(ownname => 'ERIC',tabname=> 'W_ACTIVITY_FAILURE' ,partname=> 'DY02092008' ,granularity=> 'PARTITION', estimate_percent => 2,block_sample=> true ,cascade=> FALSE ,method_opt=> 'FOR ALL COLUMNS SIZE AUTO' ,no_invalidate=> false); END; /
After 10 minutes it was still running. WHY? I quickly checked the number of rows in the partition and then the size of the partition segment, incase either was much larger than I expected. Neither were.
select count(*) from eric.W_ACTIVITY_FAILURE partition (DY07092008)
COUNT(*)
———-
42182
From dba_segments.
BYTES BLOCKS
———- ———-
2621440 320
There is one advantage of a DBMS_STATS statement running for a long time – you can grab from the SGA the actual code being executed for the DBMS_STATS statement. I saw this.
select substrb(dump(val,16,0,32),1,120) ep, cnt
from (select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact
use_weak_name_resl dynamic_sampling(0) no_monitoring */
"ACCOUNT_ID" val,count(*) cnt
from "ERIC"."W_ACTIVITY_FAILURE" t
where TBL$OR$IDX$PART$NUM("ERIC"."W_ACTIVITY_FAILURE",0,4,0,"ROWID") =
bjn
and "ACCOUNT_ID" is not null
group by "ACCOUNT_ID")
order by val
DBMS_STATS is scanning the ACCOUNT_ID column on this table and it is taking a long time about it. The only index on the table is on ACCOUNT_ID. It then struck us.
The index is a global index.
INDEX_NAME PARTITIONED
—————————— ————-
API_XXXXXXXXXX_ACCOUNTS_IND NO
And that global index actually lacked stats {another feature of DBMS_STATS and rolling up partition stats had led to that}.
INDEX_NAME TYP UNQ BL L_BLKS DIST_KEYS CLUSTF LB_KEY DB_KEY LST_ANL
————— ——— — —- ———- ———– ———– ———- ———- ——–
API_XXXXXXX_ NOR NON
ACCOUNTS_IND
By this time 3 partitions for this table had been processed by my code, taking around 15 minutes each one. Incredibly slow.
I did a very quick 0.01% sample size DBMS_STATS.GATHER_INDEX_STATS on that index which took about 1 minute. As soon as the partition DBMS_STATS.GATHER_TABLE_STATS statement that was in flight finished, the following similar statements on that table’s partitions took under 3 seconds each. I’ll buy a pint for the first person to guess (within 10 minutes) WHEN I collected the global index stats {You can collect from any pub in central London if you give me a few day’s notice
}.
PARTITION_NAME NUM_ROWS BLOCKS LAST_ANALYZED -------------- ---------- ---------- --------------------- DY01092008 31461 452 16-FEB-2010 09:51:41 DY02092008 49425 686 16-FEB-2010 10:03:44 DY03092008 54472 719 16-FEB-2010 10:16:31 DY04092008 35762 491 16-FEB-2010 10:30:52 DY05092008 42182 587 16-FEB-2010 10:44:24 DY06092008 21186 312 16-FEB-2010 10:56:13 DY07092008 20898 313 16-FEB-2010 11:12:59 DY08092008 469500 1233 16-FEB-2010 11:13:02 DY09092008 480300 741 16-FEB-2010 11:13:04 DY10092008 15724 223 16-FEB-2010 11:31:01 DY11092008 55671 732 16-FEB-2010 11:31:06 DY12092008 820100 1779 16-FEB-2010 11:31:08 DY13092008 80215 1113 16-FEB-2010 11:31:16 DY14092008 10094 155 16-FEB-2010 11:31:18 DY15092008 10268 158 16-FEB-2010 11:31:20 DY16092008 24578 330 16-FEB-2010 11:31:22 DY17092008 21012 290 16-FEB-2010 11:31:24 DY18092008 22755 318 16-FEB-2010 11:31:27 DY19092008 21276 293 16-FEB-2010 11:31:29 DY20092008 20882 281 16-FEB-2010 11:31:31 DY21092008 24131 323 16-FEB-2010 11:31:34
I will investigate this oddity further if I get time this week, but the lesson to myself so far is:
Global indexes lacking stats can result in very long stats gathering times for partitions EVEN WHEN YOU DO NOT CASCADE DOWN TO INDEXES.
Turning on SQL Audit February 2, 2010
Posted by mwidlake in development, performance, Testing.Tags: audit, performance, security, Testing
6 comments
I want to test out using the SQL AUDIT functionality.
The first thing you have to do (having read up on it a bit) is to enable it for the database. You do this by setting the initialization parameter AUDIT_TRAIL. On version 10.2 you have the options to write the audit entires to:
- The DB, by setting it to DB or DB_EXTENDED {not DB,EXTENDED as the manual says, that is the old format}. This puts the entries into the table SYS.AUD$
- The operating system, by setting it to OS, XML or XML_EXTENDED. If you set it to XML or XML_EXTENDED then the data is written out in XML format. You also optionally set AUDIT_FILE_DEST to where you want to data to be written to.
Writing the audit trail to the OS is potentially more secure, as you can stop those cunning and devious DBAs messing with the audit trail. {I’m not so sure that this really helps that much – if anyone knows of any DBAs caught out being naughty solely as a result of using the OS to store the SQL AUDIT records, it would be a fascinating comment}
I want to write to the DB as I want to be able to get at the audit data easily and I am not sure how I want to interrogate it. I’m faster with SQL than SED and AWK.
I also decided up front I wanted to use DB_EXTENDED so that the triggering SQL statement and all bind variables are caught, so I can see more about what it triggering the audit record. I am cautious of the impact of storing CLOBs though, which these two values are stored as. I’ve had performance issues moving lots of CLOBS around and I know from some old colleagues that Secure Files are a lot faster. If Secure Files are faster, that means CLOBs are slower
. If the audit trail seems to add too much burden on my system, swapping back to just DB will be my first step.
Now for the bad news. You can’t just turn on AUDIT. That initialization parameter is not dynamic. You can’t even enable it for your session. It will need a restart of your database.
This tells me something. Oracle needs to do some setting up for SQL AUDIT when it starts the instance. Either start a new process, enable functionality in one of it’s regular processes or set up structures in memory to cope. Or a mixture thereof. I strongly suspect the need for memory structures {but this is only because, in reality, I have done some testing and I am writing this up afterwards}.
I should not really need to say this but DON’T go turning this on for a production system without extensive testing somewhere else first. There is not a lot “Out There” about the details of the performance impact of AUDIT but the general opinion is there is some; and that is reasonable given it is going to write database records for every action audited. Also, you have no idea yet of any knock-on effects. You know, things you did not expect that causes your database to lock or crash and you to get fired.
{Question, what happens if you alter the initialization file and restart only one node of a RAC database? I don’t know and so I should test that. My current test system is not RAC, but the final destination for this stuff is RAC}.
You probably also want to check that no one has gone and tried turning on SQL AUDIT on things already. You never know if someone else decided to have a play with this and issued a load of AUDIT statements only to find nothing happened – and left what they did in place “as nothing happened”. I already know of one example of this happening…
Here is a little script I knocked up to see what is currently set to be audited:
-- what_is_audited.sql -- Martin Widlake 11/01/10 -- simple listing of what auditing is currently set set pages 100 set pause on spool what_is_audited.lst select * from dba_priv_audit_opts order by user_name,privilege / select * from sys.dba_stmt_audit_opts order by user_name,audit_option / select * from DBA_OBJ_AUDIT_OPTS order by owner,object_name / spool off clear col -- -- EOF --
And some sample output. I’m not going to explain it in this post, but you can have a look though it.
DEV3> @what_is_audited USER_NAME PROXY_NAME ------------------------------ ------------------------------ PRIVILEGE SUCCESS FAILURE ---------------------------------------- ---------- ---------- MDW1 ALTER SYSTEM BY ACCESS BY ACCESS MDW1 AUDIT SYSTEM BY ACCESS BY ACCESS MDW1 CREATE SESSION BY ACCESS BY ACCESS ALTER SYSTEM BY ACCESS BY ACCESS AUDIT SYSTEM BY ACCESS BY ACCESS CREATE SESSION BY ACCESS BY ACCESS 6 rows selected. USER_NAME PROXY_NAME ------------------------------ ------------------------------ AUDIT_OPTION SUCCESS FAILURE ---------------------------------------- ---------- ---------- MDW1 ALTER SYSTEM BY ACCESS BY ACCESS MDW1 CLUSTER BY ACCESS BY ACCESS MDW1 CONTEXT BY ACCESS BY ACCESS MDW1 CREATE SESSION BY ACCESS BY ACCESS MDW1 DATABASE LINK BY ACCESS BY ACCESS MDW1 DELETE TABLE BY ACCESS BY ACCESS MDW1 ... TYPE BY ACCESS BY ACCESS MDW1 UPDATE TABLE BY ACCESS BY ACCESS MDW1 USER BY ACCESS BY ACCESS MDW1 VIEW BY ACCESS BY ACCESS ALTER SYSTEM BY ACCESS BY ACCESS CLUSTER BY ACCESS BY ACCESS CONTEXT BY ACCESS BY ACCESS CREATE SESSION BY ACCESS BY ACCESS DATABASE LINK BY ACCESS BY ACCESS DIMENSION BY ACCESS BY ACCESS DIRECTORY BY ACCESS BY ACCESS INDEX BY ACCESS BY ACCESS MATERIALIZED VIEW BY ACCESS BY ACCESS ... USER BY ACCESS BY ACCESS VIEW BY ACCESS BY ACCESS 56 rows selected. OWNER OBJECT_NAME OBJECT_TYPE ------------------------------ ------------------------------ -------------- ALT AUD COM DEL GRA IND INS LOC REN SEL ------- ------- ------- ------- ------- ------- ------- ------- ------- ---- UPD REF EXE CRE REA WRI FBK ------- --- ------- ------- ------- ------- ------- MWPERF FORN_M_SEQ SEQUENCE -/- -/- -/- -/- -/- -/- -/- -/- -/- A/A -/- -/- -/- -/- -/- -/- -/- MWPERF PERSON TABLE A/A A/A A/A A/A A/A A/A A/A A/A A/A A/A A/A -/- -/- -/- -/- -/- A/A MWPERF ROAD_TYPE TABLE -/- -/- -/- A/A -/- -/- A/A -/- -/- A/A A/A -/- -/- -/- -/- -/- -/-
If you discover you have a lot of things set to be audited, ESPECIALLY if they are auditing select access, think about turning some or all of it off before you enable AUDITING by setting that initialization parameter.
Once you have turned on the feature, you can start testing it…
Testing Methodology – How to Test #2 January 26, 2010
Posted by mwidlake in performance, Testing.Tags: performance, Testing
1 comment so far
Last post I (see link above) I said something about how I like timing how long things {usually SQL queries but you can apply this to more complex situations} take as it is simple. But I highlighted that it is an approach that misses a lot of information and also can catch you out if you do not appreciate that the first run of any test script is likely to be slow (due to parsing and data caching overheads which are reduced for subsequent itterations).
I want to continue on that theme now, of simple testing with just how long things take. And how to cope with variation.
Hands up all of you who have Oracle on a PC at home? Great, you can test what you want on a dedicated machine. No one else is running anything on your machine, no one else is using your storage, no one else is using your network (if you have a network at home) and no one is running the SQL statement from hell on your database.
BUT. You might have some download of the latest security patch set running in the background, your virus scanner may kick in (or you might just have Norton full-security-everything-running-on-your-cpu-all-the-time turned on). Your miles per gallon may vary, especially on Windows but also on Linux.
For many of us we are testing on a server at work, where developers are working, where the storage is shared, where the network is dealing with that lady behind you watching “Celebrity Strictly come roller-skate dancing impersonations” {well, she sat behind me for 4 months and she was in the Quality team, what does that tell you}. Bottom line, the performance of your server will vary, your testing needs to cope with that.
How do you cope with a varying workload on the system you are running tests on? Well, you run the new and old version of your test several times, in succession, interleaved. This is ideal when , in the very un-ideal situation of having to test on a Live situation. Testing on “Live” is wrong, in the same way that spelling “initialisation” with a Z is wrong, but it is often forced upon us.
As an example, I want to check into the questionable suggestion someone made that selecting count(*) from dba_tables where owner=user is faster than selecting count(*) from user_tables.
So, I knock up a quick sql*plus script that runs the two alternatives in quick succession, interleaved.
--testa.sql set timi on pause off spool testa.lst -- prompt run 1 select count(*) from user_tables; select count(*) from dba_tables where owner=user; -- prompt run 2 select count(*) from user_tables; select count(*) from dba_tables where owner=user; -- prompt run 3 select count(*) from user_tables; select count(*) from dba_tables where owner=user; -- prompt run 4 select count(*) from user_tables; select count(*) from dba_tables where owner=user; -- prompt run 5 select count(*) from user_tables; select count(*) from dba_tables where owner=user; -- spool off -- eof
Just a couple of things to mention. I always spool out to a file called the same thing as the script, with .lst at the end (which is the default file extension for spool commands but I like to state it) and I like to use “prompt” to record in the output what is going on. Here it is obvious, sometimes it is not. Oh, and I like to put –EOF at the end of my files, so I can spot cut-and-paste errors. I know, I am very “command line” centric. I love pictures too Doug. Finally, I should not run the two versions with the same one first each time, I should swap them over after every two runs (not each run as then you are running A,A,B,B,A,A,B,B. A more complex A,B,A,B,B,A,B,A is better). But the results look messy, so I didn’t.
Here is the output (trimmed a little, for the sake of brevity):-
run 1
COUNT(*)
----------
35
1 row selected.
Elapsed: 00:00:00.58
COUNT(*)
----------
35
1 row selected.
Elapsed: 00:00:00.73
run 2
COUNT(*)
----------
35
1 row selected.
Elapsed: 00:00:00.28
COUNT(*)
----------
35
1 row selected.
Elapsed: 00:00:00.10
run 3
COUNT(*)
----------
35
1 row selected.
Elapsed: 00:00:00.28
COUNT(*)
----------
35
1 row selected.
Elapsed: 00:00:00.10
run 4
COUNT(*)
----------
35
1 row selected.
Elapsed: 00:00:00.28
COUNT(*)
----------
35
1 row selected.
Elapsed: 00:00:00.10
run 5
COUNT(*)
----------
35
1 row selected.
Elapsed: 00:00:00.29
COUNT(*)
----------
35
1 row selected.
Elapsed: 00:00:00.09
At this point I might pull the results into Excel, as it is so good at drawing pretty graphs, but I think we can interpret the results “by eye” in this case.
Run User Tabs DBA_Tabs 1 0.58 0.73 2 0.28 0.10 3 0.28 0.10 4 0.28 0.10 5 0.29 0.09 AVG 0.28 0.10
My Average is ignoring the first run for each version of the code, as previously explained {parse overhead/warming caches}
You can see that version B is considerably faster than A. In this case, there is no overlap between the two result sets, by which I mean every run o version B is faster than every run of version A. Also, there is little variation in the performance of each version.
What if you got a result set like:-
Run User Tabs DBA_Tabs 1 0.77 0.68 2 0.35 0.25 3 0.42 0.34 4 0.48 0.38 5 0.48 0.30 6 0.41 0.29 7 0.40 0.28 AVG 0.42 0.31
If you look at the figures, both A and B slow down and then start to speed up. Something happened to slow down the system in the middle of the test and it was recovering but not back to “normal” by the end. If we had done 7 runs of A followed by 7 runs of B, B would have been hard hit by the temporary system slow-down {I opened a WORD document
}.
Also, there is some overlap in the results for A and B. The slowest result for B, 0.38, is slower than for the fastest result for A, 0.35. When the spread of times for test A and test B overlap like this, you need to do a few more tests to ensure the results are consistent. I did 7 but I should really have done 9 or maybe 15.
In any case, B is consistently 25% faster than A. It is Faster.
How about this result set:
Run User Tabs DBA_Tabs 1 0.87 0.71 2 0.41 0.42 3 0.47 0.43 4 0.37 0.42 5 0.39 0.34 6 0.51 0.38 7 0.42 0.40 AVG 0.43 0.40
The variation in the result sets is higher and if fact if you only looked at results for paired runs 2,3 and 4 you would come to the conclusion A was very slightly faster than B. In fact, B is slightly faster than A.
Or is it? There are only 6 tests {remember, ignore run 1}, the variation within the sets is high (.37 to .51 in A, .34 ot .42 in B) and the overall difference is low. You could run a statistical analysis on this, a “Student T” test I think, to see if the difference was significant. But unless I was looking at something utterly business critical at this point where the smallest fraction of improvement was important, I would not bother. I would be far better off looking for solution C or doing something else completely. If you ever find yourself doing scientific statistical analysis to decide which of two options is faster, it is probably time to go home and consider what else you could be doing to help your employer more…
Enough for tonight and HOPEFULLY the wordpress pixies will not eat the end of this post (yesterday’s effort was better than tonights I think…)
Testing Methodology – How To Test January 26, 2010
Posted by mwidlake in performance, Testing.Tags: explain plan, performance, Testing
1 comment so far
On those rare but pleasant occasions when I find myself running a training course on performance, something I always want to press home is “How to Test”.
I believe you can learn everthing in every manual on oracle technology and internals and still be very poor at performance tuning. Similarly, I think you can be lacking an awful lot of knowledge and still be good at performance tuning. It comes down to how you test. Not just how to test, but how you as an individual design and run your tests.
I joke with whoever I am talking to that what you need most of all to test is a watch and someone calling out “start” and “Stop”. ie you need to be able to time things. It is a bit of a throw-away statement, but actually most of us will do exactly this on a regular basis. We will be working on something, run it and it will come back in a few seconds. Then we will tweak it and run it again and mentally {or in my case, just very quietly under my breath} count. We even discuss it with people like that “How long did it take that time?” Ohh, a couple of seconds faster than last time”.
I like tuning by timing very, very, very much.
Firstly, it is simple – If a SQL query runs faster, it runs faster. If you tune by looking at the explain plan and you see a full table scan being replace with a nested loop and an index look up, is it faster? It depends. If you tune by looking at the buffer gets and other statistics from the SGA (or wherever), if the “buffer gets” go down, is it faster? This depends again. If the disk reads went up, even by a realtively small amount, maybe not. If the memory usage went through the roof because of a sort that spills down to disc, well quite probably not. Obviously if all of buffer gets, disk reads and cpu usage went down, you can be pretty certain you are having a positive impact in speeding up the statement. But timing the statement from start to finish gives you a nice, simple answer on if it runs faster.
Secondly, it is simple. You do not have to look at plans, at runtime statistics, at deltas of session statistics, at trace files. All those things are good but you need more knowledge and experience to use them and time to set up, collect and study them. This is not to say you do not get a lot more out of tuning if you understand all of the stuff about trace files, explain plans etc, I hasten to add – but you do not need that to get going with performance tuning.
Thirdly, it is simple. You tell your manager’s manager that you use 47% less buffer gets, 12% less disk reads but 9% more CPU then they will ask you what all that means. You tell them that version A runs in 2 minutes and 9 seconds and version B in 34 seconds, they will understand. Better still, graph it and give them a power point…
Fourthly, it is simple. You can see that statement (a) is faster or slower than statement (b). Now you can look at the plan, at the statistics for the statement, at what wait events are occuring and start tying up book knowledge with real-world results.
Of course, timing with a watch is very crude. You have a very capable piece of computing power underlying that database of yours, so let’s get it to do the timing.
As a simple start, in SQL*Plus use “set timing on” (it can be abbreviated to “set timi on”). Most GUI SQL tools will either default to telling you the elapsed time or have an obvious option to switch on the functionality.
{Oh, I’ll just mention that if you cannot get timing in sql*plus or wherever to work you, might want to check what the initialisation parameter “TIMED_STATISTICS” is set to. Some ancient memory is telling me that if it is set to FALSE, you may not be able to use TIMING in SQL*Plus but another memory is telling me that stopped being true a while back, version 8 maybe. I tried setting TIMED_STATISTICS to false in my session on 10.2 but TIMING still worked, but then I left STATISTICS_LEVEL alone and that cause TIMED_STATISTICS to be set. It is so long ago that I worked on a system that had TIMED_STATISTICS set to false! Even a quick google did not throw up an immediate answer}.
DB10.2> select count(*) from sn
2 /
any key>
COUNT(*)
----------
116
1 row selected.
DB10.2> set timi on
DB10.2> select count(*) from sn
2 /
any key>
COUNT(*)
----------
116
1 row selected.
Elapsed: 00:00:01.09
There you go, the count of SN records, all 116, took 1.09 seconds.
Yes, I had “set pause on” and the timing includes how long it takes me to spot that the query has finished and press a key. We all do it. However, sometimes the need for user input it still missed {Something I see quite often is, for example, not pressing the button in PL/SQL developer to get ALL the rows for the SQL statement, rather than just the first screenful}. A key thing is to try and remove from testing all and any waiting for user response, as we humans are slow and irratic.
so SET PAUSE OFF.
Now you have TIMING on and pause off. Time to run something and see how long it takes, then try and speed it up and run again:
DB10.2> select count(*) from person where to_char(dob,'YYYY') = 1988
2 /
COUNT(*)
----------
29
Elapsed: 00:00:01.15
DB10.2> create index per_dob on person(dob);
Index created.
Elapsed: 00:00:01.31
DB10.2> select count(*) from person where to_char(dob,'YYYY') = 1988;
COUNT(*)
----------
29
Elapsed: 00:00:00.14
There you go, I added an index and the query went down from 1.15 seconds to 0.14, that is 8 times faster. Timing is timing.
Well, no, and this is something you need to be careful of if you are new to tuning.
The second itteration is nearly always faster.
Why? Well, the first time you run a piece of SQL it has to be parsed for a start, and that takes some time. More importantly, the first time you select the data, it is going to be on disk. Oracle has read it in and put it into memory. The second time you query the same data, it will be found in memory. That {nearly always} makes the next access to the data a lot faster. The index was not being used as I had a function on the column in the WHERE clause and this stops the index from being used.
So having said I love testing by timing, you need to be cautious about one-off tests. Oh, and here below is proof that the index I created is making no real difference to the speed of the SQL query:
DB10.2> set autotrace on
DB10.2> select count(*) from person where to_char(dob,'YYYY') = 1988;
COUNT(*)
----------
29
Elapsed: 00:00:00.12
Execution Plan
-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 335
| 1 | SORT AGGREGATE | | 1 | 9 |
|* 2 | TABLE ACCESS FULL| PERSON | 36 | 324 | 335
-----------------------------------------------------------
Statistics
----------------------------------------------------------
20 recursive calls
19 db block gets
727 consistent gets
0 physical reads
1992 redo size
DB10.2> drop index per_dob
2 /
Index dropped.
Elapsed: 00:00:00.03
DB10.2> select count(*) from person where to_char(dob,'YYYY') = 1988
2 /
COUNT(*)
----------
29
Elapsed: 00:00:00.15
Execution Plan
----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost
----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 335
| 1 | SORT AGGREGATE | | 1 | 9 |
|* 2 | TABLE ACCESS FULL| PERSON | 36 | 324 | 335
----------------------------------------------------------
Statistics
----------------------------------------------------------
261 recursive calls
19 db block gets
841 consistent gets
0 physical reads
1992 redo size
We see 0.12 seconds goes to 0.14 seconds, exactly the same explain plan for both statements, a small increase in consistent gets, no physical gets by either statement (so the data is cached).
Why is the second statement a little slower and has a large increase in recursive calls and db block gets? Because I dropped the index and the statement had to be reparsed. Now, if you are new to tuning you would almost certainly not have appreciated what the recursive calls and DB block gets were all about, it could distract you from the question of “does it run faster”. It is certainly good to know all about that, but when you are starting off, you want to keep things simple and learn in stages.
What the above demonstrates, I hope, is that the second thing you run will have an advantage and could probably run faster even though, in reality, it is using more resource. And we tend to run old code first and new code second. So swap them over, give the old code the advantage of being run second.
Do not test things once, as you can easily be caught out. Test each version several times. And ignore the first run of each version. This is not perfect advice, code in production may well be being parsed and gathering data from disk, but unless you can allow for this in your testing, I think it is generally better, for simple testing, to run each version 6 times. Of the six runs, ignore the first run of each and average the results of the other 5. Which ever one runs faster on average is, well, fastest. IF the difference is significant.
Oh. Where is the SQL AUDIT in all this? Well, ponder on why am I generating REDO for a simple select…
Decoding high_value and low_value January 3, 2010
Posted by mwidlake in internals, performance.Tags: data dictionary, performance, SQL
13 comments
There is an update to this post here>
The table DBA_TAB_COLUMNS holds the LOW_VALUE and HIGH_VALUE for columns. This information is potentially very useful to us, especially as when the CBO looks at a WHERE clause on a column for a given value, the High and Low values for that column are considered. If the value in the WHERE clause is outside the known range, the expected number of values to be found is decreased in proportion to the distance outside the range.
What is not so helpful is that Oracle stores, and displays, the information in an internal raw format. Which is utterly unhelpful to us of course. Why can’t they at least expose the information in a view in a human-readable format? Tsch.
So here is some code on how to decode low_value and high_value. I pulled most of this together a few years back but always struggled to decode the low and high values for dates, until I found this post by Gary Myers.
-- col_stats
-- Martin Widlake mdw 21/03/2003
-- MDW 11/12/09 enhanced to include more translations of low_value/high_value
-- pilfered from Gary Myers blog
col owner form a6 word wrap
col table_name form a15 word wrap
col column_name form a22 word wrap
col data_type form a12
col M form a1
col num_vals form 99999,999
col dnsty form 0.9999
col num_nulls form 99999,999
col low_v form a18
col hi_v form a18
col data_type form a10
set lines 110
break on owner nodup on table_name nodup
spool col_stats.lst
select --owner
-- ,table_name
column_name
,data_type
,decode (nullable,'N','Y','N') M
,num_distinct num_vals
,num_nulls
,density dnsty
,decode(data_type
,'NUMBER' ,to_char(utl_raw.cast_to_number(low_value))
,'VARCHAR2' ,to_char(utl_raw.cast_to_varchar2(low_value))
,'NVARCHAR2' ,to_char(utl_raw.cast_to_nvarchar2(low_value))
,'BINARY_DOUBLE',to_char(utl_raw.cast_to_binary_double(low_value))
,'BINARY_FLOAT' ,to_char(utl_raw.cast_to_binary_float(low_value))
,'DATE',to_char(1780+to_number(substr(low_value,1,2),'XX')
+to_number(substr(low_value,3,2),'XX'))||'-'
||to_number(substr(low_value,5,2),'XX')||'-'
||to_number(substr(low_value,7,2),'XX')||' '
||(to_number(substr(low_value,9,2),'XX')-1)||':'
||(to_number(substr(low_value,11,2),'XX')-1)||':'
||(to_number(substr(low_value,13,2),'XX')-1)
, low_value
) low_v
,decode(data_type
,'NUMBER' ,to_char(utl_raw.cast_to_number(high_value))
,'VARCHAR2' ,to_char(utl_raw.cast_to_varchar2(high_value))
,'NVARCHAR2' ,to_char(utl_raw.cast_to_nvarchar2(high_value))
,'BINARY_DOUBLE',to_char(utl_raw.cast_to_binary_double(high_value))
,'BINARY_FLOAT' ,to_char(utl_raw.cast_to_binary_float(high_value))
,'DATE',to_char(1780+to_number(substr(high_value,1,2),'XX')
+to_number(substr(high_value,3,2),'XX'))||'-'
||to_number(substr(high_value,5,2),'XX')||'-'
||to_number(substr(high_value,7,2),'XX')||' '
||(to_number(substr(high_value,9,2),'XX')-1)||':'
||(to_number(substr(high_value,11,2),'XX')-1)||':'
||(to_number(substr(high_value,13,2),'XX')-1)
, high_value
) hi_v
from dba_tab_columns
where owner like upper('&tab_own')
and table_name like upper(nvl('&tab_name','WHOOPS')||'%')
ORDER BY owner,table_name,COLUMN_ID
/
clear colu
spool off
clear breaks
Most of the translation is done via the utl_raw.cast_to_datatype functions but no such function is provided for dates, which is where most of us come unstuck. Gary recognised that the data was being stored in hex format, with an offset for the year.
I hope the script helps some of you.
{There are also functions under DBMS_STATS, DBMS_STATS.CONVERT_RAW_VALUES, that can also be called to translate many datatypes}
If anyone knows how to decode timestamps, I’d love to know as it would save me trying to work it out/find it on the Net somewhere. I’ll add it to the post so that there is one place to find all translatations.
Here is a quick output for a test table
@col_stats
Enter value for tab_own: dwperf
old 40: where owner like upper('&tab_own')
new 40: where owner like upper('dwperf')
Enter value for tab_name: ad_sql_exec_p
old 41: and table_name like upper(nvl('&tab_name','WHOOPS')||'%')
new 41: and table_name like upper(nvl('ad_sql_exec_p','WHOOPS')||'%')
any key>
COLUMN_NAME DATA_TYPE M NUM_VALS NUM_NULLS DNSTY LOW_V HI_V
---------------------- ---------- - ---------- ---------- ------- ------------------ ---------------
INST_ID NUMBER Y 4 0 0.2500 1 4
SQL_ID VARCHAR2 Y 87,104 0 0.0000 008d71chrks14 gzw309ttkx862
PLAN_HASH_VALUE NUMBER Y 1,884 0 0.0005 2646229 4294043874
SPC_ID NUMBER N 83 0 0.0120 1076 2269
ADDRESS RAW N 118,198 0 0.0000 00000003E33123A0 0000001342FF3FF8
HASH_VALUE NUMBER N 87,104 0 0.0000 2758710 4294676643
CREATED_DATE DATE N 120 0 0.0083 2009-10-23 8:19:10 2009-12-10 9:19:13
LATEST_FIRST_LOAD_TIME DATE N 11,791 0 0.0001 2009-9-14 11:55:59 2009-12-13 9:33:24
TOTAL_LOADS NUMBER N 98 0 0.0102 1 55047
TOTAL_PARSE_CALLS NUMBER N 92 0 0.0109 0 2972
TOTAL_EXECUTIONS NUMBER N 235 0 0.0043 0 834624
TOTAL_ROWS_PROCESSED NUMBER N 809 0 0.0012 0 26946123
TOTAL_FETCHES NUMBER N 313 0 0.0032 0 834624
TOTAL_BUFFER_GETS NUMBER N 3,016 0 0.0003 0 3355576809
TOTAL_DISK_READS NUMBER N 985 0 0.0010 0 28189240
TOTAL_DIRECT_WRITES NUMBER N 98 0 0.0102 0 751289
TOTAL_SORTS NUMBER N 106 0 0.0094 0 5283
TOTAL_CPU_TIME NUMBER N 94,401 0 0.0000 1337 12183936207
TOTAL_ELAPSED_TIME NUMBER N 115,203 0 0.0000 1337 139692482086
TOTAL_OPTIMIZER_COST NUMBER N 1,467 0 0.0007 0 369740902209315000
0000000
Using AWR data for OS information December 22, 2009
Posted by mwidlake in performance.Tags: performance, SQL
5 comments
This is a demo of using AWR data to get some OS performance information.
This posting was actually a comment by (and is the work of) Bernard Polarski on a recent post about command_types I did a week or two back. I’ve lifted it into a separate posting as (a) I thought it might be useful to people and it would not get as much exposure as a comment as it would as a post and (b) I want to try and lay it out better for Bernard (WordPress likes to garble code in comments, which it did to Bernard’s efforts) and (c) I strognly believe in nicking other people’s code, if they are happy to put it “out there”. Why re-code what someone has already done and is happy to share
I tried the code (I added the SYS. ownership on tables, so it will work even if you can’t log in as sys but have access to the objects) and it worked fine for me.
I should just add, use of AWR data is covered by an oracle licence, the diagnostic pack. If you do not have a licence for it, Oracle deem the WRH$ views as not yours to view, even though they are on your box, in the database you paid for, containing data gathered by your CPUs by code that is damned tricky to turn off and is on by default. Shesh.
Any mistakes or typoes, blame me not Bernard. I took the liberty of tweaking some of the English, hope that is OK Bernard.
*************************************
Okay a challenge : Let me try to surprise you with something improbable. Some months ago while investigating an unsual load on a Linux box, the unix sysadmin told us he could not go back more than 2 weeks with ‘sar’ history. Since Oracle takes also system stats I quickly enriched smenu with an Oracle sar.
The example below gives the sar stats for the last 10 snaps, it is stored into shortcut sts and called with param ‘-sar’
( replace workd Superior_TO and Inferior_to with mathematic signs, system eat them)
/tmp: sts -sar 10
+ sqlplus -s '/ as sysdba'
+
set feed off verify off lines 190 pages 66
col inst new_value inst noprint;
col id1 head 'Idle time(%)' justify c
col usr1 head 'User time(%)' justify c
col sys1 head 'Sys time(%)' justify c
col io1 head 'Io Wait time(%)' justify c
col nice1 head 'Nice time (%)' justify c
col snap_len head 'Interval| (Secs)' justify c
col num_cpus new_value p_num_cpus head 'Number of CPU';
col a1 new_value secs noprint;
col SNAP_BEGIN format a20 head 'Snap begin' justify c
col SNAP_END format a20 head 'Snap end' justify c
select instance_number inst from v$instance;
SELECT value num_cpus
FROM v$osstat WHERE stat_name = 'NUM_CPUS';
prompt
prompt Negatives values correspond to Shutdown:
prompt
select snap_id, snap_len, round(id1 /snap_len *100,2) id1,
round(usr1 /snap_len *100,2) usr1,
round(sys1 /snap_len *100,2) sys1,
round(io1 /snap_len *100,2) io1,
round(nice1 /snap_len *100,2) nice1
, snap_begin, snap_end
from (
select snap_id, id1, usr1,sys1, io1, nice1
, snap_begin, snap_end ,
round( extract( day from diffs) *24*60*60*60+
extract( hour from diffs) *60*60+
extract( minute from diffs )* 60 +
extract( second from diffs )) snap_len
-- above is the exact length of the snapshot in seconds
from ( select /*+ at this stage, each row show the cumulative value.
r1 7500 8600
r2 7300 8300
r3 7200 8110
we use [max(row) - lag(row)] to have the difference
between [row and row-1], to obtain differentials values:
r1 200 300
r2 100 190
r3 0 0
*/
a.snap_id,
(max(id1) - lag( max(id1)) over (order by a.snap_id))/100 id1 ,
(max(usr1) - lag( max(usr1)) over (order by a.snap_id))/100 usr1,
( max(sys1) - lag( max(sys1)) over (order by a.snap_id))/100 sys1,
( max(io1) - lag( max(io1)) over (order by a.snap_id))/100 io1,
( max(nice1) - lag( max(nice1)) over (order by a.snap_id))/100 nice1,
-- for later display
max(to_char(BEGIN_INTERVAL_TIME,' YYYY-MM-DD HH24:mi:ss')) snap_begin,
-- for later display
max(to_char(END_INTERVAL_TIME,' YYYY-MM-DD HH24:mi:ss')) snap_end,
-- exact len of snap used for percentage calculation
( max(END_INTERVAL_TIME)-max(BEGIN_INTERVAL_TIME)) diffs
from ( /*+ perform a pivot table so that the 5 values selected appear
on one line. The case, distibute col a.value among 5 new
columns, but creates a row for each. We will use the group
by (snap_id) to condense the 5 rows into one. If you don't
see how this works, just remove the group by and max function,
then re-add it and you will see the use of max().
Here is what you will see:
Raw data : 1000 2222
1000 3333
1000 4444
1000 5555
1000 6666
The SELECT CASE creates populate our inline view with structure:
ID IDLE USER SYS IOWAIT NICE
1000 2222
1000 3333
1000 4444
1000 5555
1000 6666
the group by(1000) condenses the rows in one:
ID IDLE USER SYS IOWAIT NICE
1000 2222 3333 4444 5555 6666
*/
select a.snap_id,
case b.STAT_NAME
when 'IDLE_TIME' then a.value / &p_num_cpus
end id1,
case b.STAT_NAME
when 'USER_TIME' then a.value / &p_num_cpus
end usr1 ,
case b.STAT_NAME
when 'SYS_TIME' then a.value / &p_num_cpus
end sys1 ,
case b.STAT_NAME
when 'IOWAIT_TIME' then a.value / &p_num_cpus
end io1,
case b.STAT_NAME
when 'NICE_TIME' then a.value / &p_num_cpus
end nice1
from sys.WRH$_OSSTAT a, sys.WRH$_OSSTAT_NAME b
where
a.dbid = b.dbid and
a.STAT_ID = b.stat_id and
instance_number = &inst and
b.stat_name in ('IDLE_TIME','USER_TIME','SYS_TIME','IOWAIT_TIME','NICE_TIME') and
a.snap_id > (( select max(snap_id) from sys.WRH$_OSSTAT) - 10 -1 )
order by 1 desc
) a, sys.wrm$_snapshot s
where a.snap_id = s.snap_id
group by a.snap_id
order by snap_id desc
)
)where rownum < (10+1);
Number of CPU
-------------
2
Negatives values correspond to Shutdown:
Interval
SNAP_ID (Secs) Idle time(%) User time(%) Sys time(%) Io Wait time(%) Nice time (%) Snap begin Snap end
---------- ---------- ------------ ------------ ----------- --------------- ------------- -------------------- --------------------
2212 3603 87.07 9.43 2.84 16.19 0 2009-12-15 14:00:19 2009-12-15 15:00:21
2211 3613 90.15 7.31 2.14 18.97 0 2009-12-15 13:00:06 2009-12-15 14:00:19
2210 3593 92.58 5.91 1.4 7.91 0 2009-12-15 12:00:13 2009-12-15 13:00:06
2209 3574 91.24 6.78 1.73 9.37 0 2009-12-15 11:00:38 2009-12-15 12:00:13
2208 3620 90.04 7.8 1.96 11.81 0 2009-12-15 10:00:19 2009-12-15 11:00:38
2207 3594 89.8 6.89 2.37 12.99 0 2009-12-15 09:00:25 2009-12-15 10:00:19
2206 3611 91.75 5.81 1.87 8.29 0 2009-12-15 08:00:14 2009-12-15 09:00:25
2205 3599 92.38 5.63 1.68 8.57 0 2009-12-15 07:00:15 2009-12-15 08:00:14
2204 3605 91.72 6.51 1.58 10.79 0 2009-12-15 06:00:11 2009-12-15 07:00:15
2203 3601 86.36 10.35 2.94 10.62 0 2009-12-15 05:00:10 2009-12-15 06:00:11
Ensuring Correlated Partition Exclusion #2 December 20, 2009
Posted by mwidlake in performance, Uncategorized, VLDB.Tags: partitions, performance, VLDB
2 comments
A few days ago {oh dear, it is now two weeks!} I showed how you could better control Correlated Partition Exclusion by creating a lookup table that showed, for each partition, the minimum and maximum ID (the partition key) and the minimum and maximum CRE_DATETIME (what you want to limit your query on). Using this range table in the way I described, you did not have to take an educated guess as to what range of IDs you used to include the partitions for the date range you were interested in, you could check the table.
But this only worked reliably where the ID and CRE_DATETIME increase directly in proportion to each other. It was a special case.
What about the more normal case, where there is a correlation between the two columns but it is not a perfect correlation? This would happen, for example, if the CRE_DATETIME is entered from another system, or from paper records, where the order of the records is not enforced. So some records from today get put into the system before some from yesterday. Or if the correlation is even loser than this. eg you are looking at orders for new customers. You “know” there are going to be no orders for these customers from 5 years ago but you are not sure how far back in the orders table you should go to find what you want.
You can still use the lookup table method. The lookup table in effect becomes a meta-index – an index of the segments where you will find data but not actually the rows.
To demonstrate this, I created a table where the ID and CRE_DATETIME increase in order:
create a partitioned table test_p4
(id number(10) not null
,cre_datetime date not null
,status number(1) not null
,num_1 number(4) not null -- random 20
,num_2 number(4) -- random 500
,num_3 number(5) -- cycle smoothly
,num_4 number(5) -- Random 10000
,vc_1 varchar2(10)
,vc_2 varchar2(10)
,vc_pad varchar2(2000))
tablespace users
partition by range (id)
(partition id_01k values less than (1000)
tablespace users
,partition id_02k values less than (2000)
tablespace users
...
,partition id_45k values less than (45000)
tablespace users
,partition id_max values less than (maxvalue)
tablespace users
)
--
-- local partitioned indexes on the table
IND_NAME TAB_NAME PSN COL_NAME
------------------ ------------------ --------- ------------
TP4_CRE_DT TEST_P4 1 CRE_DATETIME
TP4_PK TEST_P4 1 ID
--
-- populate the table with data
insert into test_p4(id,cre_datetime,status,num_1,num_2,num_3,num_4
,vc_1,vc_2,vc_pad)
select rownum
,to_date('01-JUL-2009','DD-MON-YYYY')+(rownum/360)
,decode(mod(rownum,100),0,1
,0)
,trunc(dbms_random.value(1,20))
,trunc(dbms_random.value(1,50))
,mod(rownum,10)+1
,trunc(dbms_random.value(1,10000))
,dbms_random.string('U',5)
,lpad(chr(mod(rownum,6)+65),5,chr(mod(rownum,6)+65) )
,lpad('A',1000,'A')
from dba_objects
where rownum <43000
I then messed with the data, updating 10% of records and setting the CRE_DATETIME to plus or minus a random amount up to 2 days different, so data in partitions would overlap. I then created a range table in the same way as I did for the previous post.
I ended up with a range table like the below:
MIN_CRE_DATI MAX_CRE_DATI MIN_ID MAX_ID ----------------- ----------------- ---------- ---------- 29-JUN-2009 20:35 05-JUL-2009 14:24 1 999 02-JUL-2009 04:50 07-JUL-2009 12:56 1000 1999 05-JUL-2009 02:34 10-JUL-2009 08:31 2000 2999 08-JUL-2009 05:23 13-JUL-2009 03:32 3000 3999 11-JUL-2009 08:07 15-JUL-2009 18:41 4000 4999 14-JUL-2009 00:14 18-JUL-2009 18:27 5000 5999 16-JUL-2009 08:58 21-JUL-2009 14:18 6000 6999 19-JUL-2009 01:28 24-JUL-2009 11:21 7000 7999 22-JUL-2009 08:02 27-JUL-2009 07:01 8000 8999 24-JUL-2009 22:06 30-JUL-2009 05:37 9000 9999 28-JUL-2009 04:59 01-AUG-2009 10:57 10000 10999 ... 24-SEP-2009 01:52 28-SEP-2009 18:36 31000 31999 26-SEP-2009 16:49 01-OCT-2009 01:26 32000 32999 29-SEP-2009 13:20 04-OCT-2009 13:43 33000 33999 02-OCT-2009 08:40 07-OCT-2009 10:11 34000 34999 05-OCT-2009 04:29 10-OCT-2009 04:09 35000 35999 08-OCT-2009 02:04 12-OCT-2009 17:34 36000 36999 10-OCT-2009 20:03 15-OCT-2009 08:39 37000 37999 13-OCT-2009 15:09 18-OCT-2009 12:01 38000 38999 16-OCT-2009 06:49 21-OCT-2009 03:53 39000 39999 18-OCT-2009 20:16 23-OCT-2009 21:01 40000 40999 21-OCT-2009 12:10 26-OCT-2009 07:13 41000 41999 25-OCT-2009 01:29 29-OCT-2009 20:56 42000 42999
You can see that the MIN_CRE_DATI-MAX_CRE_DATI from record to record (partition to partition) overlap but generally increase.
How do you find the start and end of the ID range to cover a date period you are interested in? I always have to sit down with a pen a paer to work this out. It is the classic “overlapping ranges” check, but my brain cannot hold on to it. So here goes. I want all records between the 1st of October and the 6th, this year.
You want to find the partition with the lowest ID range which has a record that falls into the date range you want. ie the Maximum CRE_DATETIME record is as late or later than the range you are interested in. If the maximum CRE_DATETIME is less than the date range you are interested in, no records will be in that partition.
Here is the code to find the lowest partition
select min(min_id)
from tp4_range
where MAX_CRE_DATI>=to_date('01-OCT-2009','DD-MON-YYYY')
MIN(MIN_ID)
-----------
32000
Execution Plan
----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost
----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 2
| 1 | SORT AGGREGATE | | 1 | 12 |
| 2 | TABLE ACCESS BY INDEX ROWID| TP4_RANGE | 12 | 144 | 2
|* 3 | INDEX RANGE SCAN | TP4R_MACD_MIAD | 12 | | 1
----------------------------------------------------------
To cover the whole of the partition of interest you need to look for records with and I greater than the minimum in that partition, thus the selection of the min(min_id)
Similarly, you only want partitions where the minimum CRE_DATETIME is before the end of the date range you want. If all records in the partition have a CRE_DATETIME beyond the range, you are not interested in it.
select max(max_id)
from tp4_range where MIN_CRE_DATI<=to_date('06-OCT-2009','DD-MON-YYYY')
MAX(MAX_ID)
-----------
35999
Execution Plan
----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost
----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2
| 1 | SORT AGGREGATE | | 1 | 13 |
| 2 | TABLE ACCESS BY INDEX ROWID| TP4_RANGE | 37 | 481 | 2
|* 3 | INDEX RANGE SCAN | TP4R_MICD_MIID | 37 | | 1
----------------------------------------------------------
Then you put these sub-queries into the query to select the data you want. Below I show the “original” code which does not use the partition range table and then using the partition range table, to prove the same number of records come back and to see the plan and statistics change:
select count(*) from test_p4
where cre_datetime between to_date('01-OCT-2009','DD-MON-YYYY')
and to_date('06-OCT-2009','DD-MON-YYYY')
and num_2 = 5
COUNT(*)
----------
48
Execution Plan
------------------------------------------------------
| Id | Operation | Name | R
ows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------
------------------------------------------------------
| 0 | SELECT STATEMENT | |
1 | 11 | 356 (1)| 00:00:02 | | |
| 1 | SORT AGGREGATE | |
1 | 11 | | | | |
| 2 | PARTITION RANGE ALL | |
36 | 396 | 356 (1)| 00:00:02 | 1 | 46 |
|* 3 | TABLE ACCESS BY LOCAL INDEX ROWID| TEST_P4 |
36 | 396 | 356 (1)| 00:00:02 | 1 | 46 |
|* 4 | INDEX RANGE SCAN | TP4_CRE_DT |
1788 | | 52 (0)| 00:00:01 | 1 | 46 |
------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
676 consistent gets
0 physical reads
Note the 676 consistent gets and the Pstart/Pstop of 1-46
select count(*) from test_p4
where cre_datetime between to_date('01-OCT-2009','DD-MON-YYYY')
and to_date('06-OCT-2009','DD-MON-YYYY')
and id> (select min(min_id)
from tp4_range where MAX_CRE_DATI
>=to_date('01-OCT-2009','DD-MON-YYYY'))
and id < (select max(max_id)
from tp4_range where MIN_CRE_DATI
<=to_date('06-OCT-2009','DD-MON-YYYY'))
and num_2 = 5
COUNT(*)
----------
48
Execution Plan
-----------------------------------------------------------
| Id | Operation | Name
| Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------
-----------------------------------------------------------
| 0 | SELECT STATEMENT |
| 1 | 16 | 66 (5)| 00:00:01 | | |
| 1 | SORT AGGREGATE |
| 1 | 16 | | | | |
| 2 | PARTITION RANGE ITERATOR |
| 1 | 16 | 62 (5)| 00:00:01 | KEY | KEY |
|* 3 | TABLE ACCESS BY LOCAL INDEX ROWID | TEST_P4
| 1 | 16 | 62 (5)| 00:00:01 | KEY | KEY |
| 4 | BITMAP CONVERSION TO ROWIDS |
| | | | | | |
| 5 | BITMAP AND |
| | | | | | |
| 6 | BITMAP CONVERSION FROM ROWIDS |
| | | | | | |
| 7 | SORT ORDER BY |
| | | | | | |
|* 8 | INDEX RANGE SCAN | TP4_PK
| 1788 | | 2 (0)| 00:00:01 | KEY | KEY |
| 9 | SORT AGGREGATE |
| 1 | 12 | | | | |
| 10 | TABLE ACCESS BY INDEX ROWID| TP4_RANGE
| 12 | 144 | 2 (0)| 00:00:01 | | |
|* 11 | INDEX RANGE SCAN | TP4R_MACD_MIA
D | 12 | | 1 (0)| 00:00:01 | | |
| 12 | SORT AGGREGATE |
| 1 | 13 | | | | |
| 13 | TABLE ACCESS BY INDEX ROWID| TP4_RANGE
| 37 | 481 | 2 (0)| 00:00:01 | | |
|* 14 | INDEX RANGE SCAN | TP4R_MICD_MII
D | 37 | | 1 (0)| 00:00:01 | | |
| 15 | BITMAP CONVERSION FROM ROWIDS |
| | | | | | |
| 16 | SORT ORDER BY |
| | | | | | |
|* 17 | INDEX RANGE SCAN | TP4_CRE_DT
| 1788 | | 52 (0)| 00:00:01 | KEY | KEY |
------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
350 consistent gets
0 physical reads
The plan is admittedly more complex and the SQL can be tricky to understand if you are not used to code that looks for overlap of ranges. But Consistent gets is down to 350 and the Pstart/Pstop values are KEY-KEY. The CBO cannot tell you WHAT the ranges will be when the code is parsed {I think it should check but in 10.2.0.3 at least, the CBO is not that smart at parse time}, but it knows there will be a start and stop.
I am using tiny partitions for my example and only 45 of them for the table. When the partitions are for millions of rows and there are a couple of thousand of them, excluding partitions in a manner you can rely on is a powerful tool. Which leads onto a final word of caution.
Nothing is enforcing that the range table is maintained.
You could do things with triggers or regular re-calculation of the ranges table but this will be something you need to consider if you use ranges tables to help partition exclusion. Flipping tablespaces to read-only can help the worry go away though…
As an example of the issue of needing to maintain the ranges table and also a demonstration that the ranges table does work correctly if maintainted, I’ll update a record well outside of the “expected window”, show that it does not appear in my range-check controlled code, then update the ranges table and try again.
update test_p4
set cre_datetime=to_date(’02-OCT-2009 11:15′,’DD-MON-YYYY HH24:MI’)
,NUM_2=5
where id=22100
/
3> select count(*) from test_p4
where cre_datetime between to_date(’01-OCT-2009′,’DD-MON-YYYY’)
and to_date(’06-OCT-2009′,’DD-MON-YYYY’)
and id> (select min(min_id)
from tp4_range where MAX_CRE_DATI>=to_date(’01-OCT-2009′,’DD-MON-YYYY’))
and id < (select max(max_id)
from tp4_range where MIN_CRE_DATI (select min(min_id)
from tp4_range where MAX_CRE_DATI>=to_date(’01-OCT-2009′,’DD-MON-YYYY’))
and id < (select max(max_id)
from tp4_range where MIN_CRE_DATI<=to_date('06-OCT-2009','DD-MON-YYYY'))
and num_2 = 5
COUNT(*)
———-
49
– success
–
I wonder if I will do the next posting on this topic in less than two weeks!
Scalar Functions Not Behaving as Scalar Functions December 8, 2009
Posted by mwidlake in performance, Uncategorized.Tags: performance, PL/SQL
3 comments
My thanks go to Graeme Hobbs, who I currently have the pleasure of working with, for explaining this to me.
This is all about scalar functions that do not act like scalar functions.
Where I am spending much of my time at the moment, we have an issue with a very simple sql statement. It takes on average 2 buffer gets per execution. How could that be an issue? Well, it is being executed a few thousand times an hour. By “a few thousand” read 50,000 times. Even a simple statement being executed 15 times a second is not good if it is not needed. Especially, as in this case, there is a context switch.
I got a look at the code and it is doing something like this:
HAVING SUM( col.vol_gbp ) >=
my_user.pkg_application_settings.get_number_value( ‘rp157_eric’ )
I quickly batted an email to my esteemed colleague Mr Hobbs saying “but this is a scalar. It will be executed once for the statement and the value passed into each execution of the whole query, it can’t be the source of the 50,000 executions an hour”.
Oh woe, for my ignorance has undone me. But it’s OK, Graeme not only took time to explain, but knocked up the following nice demo of WHY this is a problem. And said I could put it on my Blog. What a nice chap.
From Graeme:
******************************************************
Consider package:
CREATE OR REPLACE PACKAGE BODY ghtest IS l_called_count NUMBER := 0; -- Function and procedure implementations FUNCTION f_get_called_count RETURN NUMBER IS BEGIN RETURN( l_called_count ); END; FUNCTION f_call RETURN NUMBER IS BEGIN l_called_count := l_called_count + 1; RETURN 1; END; FUNCTION f_reset RETURN NUMBER IS BEGIN l_called_count := 0; RETURN 1; END; END ghtest;
– nothing up my sleeve
SELECT ghtest.f_get_called_count
FROM DUAL;
F_GET_CALLED_COUNT
——————
0
ie no calls made yet
SQL> SELECT id
, COUNT( * )
FROM ( SELECT *
FROM MY_USR.tbl_SOME_DATA
WHERE ROWNUM < 10 )
GROUP BY id
HAVING COUNT( * ) > ghtest.f_call;
ID COUNT(*)
---------- ----------
1624361594 3
SELECT ghtest.f_get_called_count
FROM DUAL;
F_GET_CALLED_COUNT
------------------
7
NOTE the counter has gone up to 7 as there were 7 IDs found from that code -6 IDs with one record and one with 3.
SELECT ghtest.f_reset
FROM DUAL;
F_RESET
———-
1
That just blanks the counter (see the package).
-- nothing up my sleeve
SELECT ghtest.f_get_called_count
FROM DUAL;
F_GET_CALLED_COUNT
------------------
0
SELECT id
, COUNT( * )
FROM ( SELECT *
FROM my_usr.tbl_some_data
WHERE ROWNUM < 10 )
GROUP BY id
HAVING COUNT( * ) > (SELECT ghtest.f_call FROM dual);
ID COUNT(*)
---------- ----------
1624361594 3
SELECT ghtest.f_get_called_count
FROM DUAL;
F_GET_CALLED_COUNT
------------------
1
You can see that the scalar subquery
HAVING COUNT( * ) > (SELECT ghtest.f_call FROM dual);
calls f_call once.
Whereas
HAVING COUNT( * ) > ghtest.f_call;
results in it being called 7 times – the number of rows returned from the group by
That is precisely the problem.
******************************************************
Apparently our developers should know this, as it is a “known issue”. But not known to me until now (unless I knew it and forgot it)
So, hats of to Graeme.
{apologies, wordpress utterly screwed the layout/truncated text when I pasted in the example, have fixed now}
Ensuring Correlated Partition Exclusion December 7, 2009
Posted by mwidlake in performance, Uncategorized, VLDB.Tags: partitions, performance, VLDB
9 comments
<Previous Post…Next Pos >
I’ve posted a couple of times recently about a tuning method with partition exclusion where you infer a relationship between the value you want to limit a query on and the partition key. It takes a while to explain the theory so I am going to give it a name, in the vain hope it catches on {forgive me if someone has already done this – in fact, just flame me with pointers if you already know a name for this}. At the very least, for my own postings, I can use this name from now on and link back to a single posting explaining it.
I’m going to call it Correlated Partition Exclusion. In essence, you have partitioned the table on a key, in my case the primary key ID, which is an ascending numeric probably sourced from a sequence.
You have a second column, in my case CRE_DATETIME, which increases in line with the PK. If you limit your query on the CRE_DATETIME partition exclusion is not possible as there is no guarantee which CRE_DATETIME values appear in which partition. But, as a human, you understand that if you create 10,000 records a day, if you want to look at the last week’s date you can use:
WHERE ID > MAX_ID-(7*10000)
to exclude partitions with an id more than 7 days worth ago.
So, you have your Correlated Partition Exclusion.
How can you be sure that going back 70,000 IDs is going to safely cover one week of data and how can you maximise your efficiency of including only partitions that cover the date range? {note, I am using a numeric ID as my partition range key and a datetime as my correlated column, this principle works just as well if you partition on datetime and want to correlate to a (generally) ascending numeric key}
Here is my test table :-
create table test_p3 (id number(10) not null ,cre_datetime date not null ,status number(1) not null ,num_1 number(4) not null -- random 20 ,num_2 number(4) -- random 500 ,num_3 number(5) -- cycle smoothly ,num_4 number(5) -- Random 10000 ,vc_1 varchar2(10) ,vc_2 varchar2(10) ,vc_pad varchar2(2000)) tablespace users partition by range (id) (partition id_01k values less than (1000) tablespace users ,partition id_02k values less than (2000) tablespace users ,partition id_03k values less than (3000) tablespace users ,partition id_04k values less than (4000) tablespace users ... ,partition id_45k values less than (45000) tablespace users -- ,partition id_max values less than (maxvalue) tablespace users ) / --@ind_cols IND_NAME TAB_NAME PSN COL_NAME ------------------ ------------------ --------- -------------------- TP3_PK TEST_P3 1 ID TP_CRE_DT TEST_P3 1 CRE_DATETIME
If I want to look for all records between two dates I could use code like the below (based in that suggested by Bernard Polarski, any mistakes are mine).
with get_min_id as
(select max(id) min_id from test_p3
where cre_datetime >= TO_DATE('18-OCT_2009','DD-MON-YYYY') )
,get_max_id as
(select min(id) max_id from test_p3
where cre_datetime <= TO_DATE('20-OCT_2009','DD-MON-YYYY') )
select count(*)
from test_p3
,get_min_id
,get_max_id
where id >get_min_id.min_id
and id < get_max_id.max_id
and cre_datetime between TO_DATE('18-OCT_2009','DD-MON-YYYY')
and TO_DATE('20-OCT_2009','DD-MON-YYYY')
/
COUNT(*)
----------
721
1 row selected.
ie find the minimum ID for the start date and the maximum ID for the end date and query between them.
This works fine. Unfortunately, you get a plan like the below
------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ------------------------------------------------------------------ ------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 39 | 887 (4)| 00:00:04 | | | | 1 | SORT AGGREGATE | | 1 | 39 | | | | | | 2 | NESTED LOOPS | | 2 | 78 | 887 (4)| 00:00:04 | | | | 3 | NESTED LOOPS | | 1 | 26 | 717 (5)| 00:00:03 | | | | 4 | VIEW | | 1 | 13 | 505 (5)| 00:00:02 | | | |* 5 | FILTER | | | | | | | | | 6 | SORT AGGREGATE | | 1 | 26 | | | | | |* 7 | VIEW | index$_join$_001 | 38393 | 974K| 505 (5)| 00:00:02 | | | |* 8 | HASH JOIN | | | | | | | | | 9 | PARTITION RANGE ALL | | 38393 | 974K| 418 (6)| 00:00:02 | 1 | 46 | |* 10 | INDEX RANGE SCAN | TP_CRE_DT | 38393 | 974K| 418 (6)| 00:00:02 | 1 | 46 | | 11 | PARTITION RANGE ALL | | 38393 | 974K| 152 (4)| 00:00:01 | 1 | 46 | | 12 | INDEX FAST FULL SCAN | TP3_PK | 38393 | 974K| 152 (4)| 00:00:01 | 1 | 46 | | 13 | VIEW | | 1 | 13 | 212 (5)| 00:00:01 | | | | 14 | SORT AGGREGATE | | 1 | 26 | | | | | |* 15 | VIEW | index$_join$_002 | 2972 | 77272 | 212 (5)| 00:00:01 | | | |* 16 | HASH JOIN | | | | | | | | | 17 | PARTITION RANGE ALL | | 2972 | 77272 | 59 (6)| 00:00:01 | 1 | 46 | |* 18 | INDEX RANGE SCAN | TP_CRE_DT | 2972 | 77272 | 59 (6)| 00:00:01 | 1 | 46 | | 19 | PARTITION RANGE ALL | | 2972 | 77272 | 152 (4)| 00:00:01 | 1 | 46 | | 20 | INDEX FAST FULL SCAN | TP3_PK | 2972 | 77272 | 152 (4)| 00:00:01 | 1 | 46 | | 21 | PARTITION RANGE ITERATOR | | 2 | 26 | 170 (1)| 00:00:01 | KEY | KEY | |* 22 | TABLE ACCESS BY LOCAL INDEX ROWID| TEST_P3 | 2 | 26 | 170 (1)| 00:00:01 | KEY | KEY | |* 23 | INDEX RANGE SCAN | TP_CRE_DT | 707 | | 48 (0)| 00:00:01 | KEY | KEY | ------------------------------------------------------------------ Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 892 consistent gets 0 physical reads
If you look at the plan {towards the end, and sorry about the naff layout, my blog is not wide enough for this sort of printout} there are two checks on the TP_CRE_DT indexes that scan all partitions of the index – Pstart/Pstop are 1-46. This is the CBO looking for the partitions where the stated CRE_DATETIME records occur in the whole table. Cost is 892 consistent gets, much of which is the checking of local index partitions that will hold no relevant entries.
Bernard also spotted that the code was considering all partitions and was not as efficient as it could be but did not know how to get around it except with a Global index, which brings it’s own issues.
One way is to just say “well, I know how many records per day I get so I will fake up the ID limits based on this”. The problem is, and this is why the CBO cannot make the decision for you, is that there is no guarantee, no fixed rule, saying that CRE_DATETIME will always increment with the ID. In fact, there is nothing stopping you altering a record which has an ID from yesterday having a CRE_DATETIME from 10 years ago {but forget I said that until tomorrow’s post}. Now, in my example the CRE_DATETIME is going to increment with ID. We will use this special case for now, I know it is flawed and will address that flaw {tomorrow}.
So to ensure yo do not miss data you end up making your ID window pretty large to endure you do not miss records. Say you want all records for the last day, you process 1500 records a day, so you consider a window covering all samples with an ID within the last 10,000. It will still be more efficient than scanning all partitions and should be safe. Fairly safe.
The way out of this is to have a ranges table. Something that tells you, for each partition, which is the maximum and minimum CRE_DATE and the IDs covered by that range. You can then use that to identify the partitions that cover the date range you are interested in.
Here is an example. I will create a simple table to hold the ranges:
create table tp3_range (min_cre_dati date ,max_cre_dati date ,min_id number ,max_id number)
Now you have to populate it.
The below code will create the first record for the first partition.
insert into tp3_range SELECT MIN(CRE_DATETIME) ,MAX(CRE_DATETIME) ,MIN(ID) ,MAX(ID) FROM TEST_P3 PARTITION (ID_01K) / Execution Plan ---------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost ( %CPU)| Time | Pstart| Pstop | ------------------------------------------------------------------ | 0 | INSERT STATEMENT | | 1 | 12 | 102 (1)| 00:00:01 | | | | 1 | SORT AGGREGATE | | 1 | 12 | | | | | | 2 | PARTITION RANGE SINGLE| | 999 | 11988 | 102 (1)| 00:00:01 | 1 | 1 | | 3 | TABLE ACCESS FULL | TEST_P3 | 999 | 11988 | 102 (1)| 00:00:01 | 1 | 1 | ------------------------------------------------------------------ Statistics ---------------------------------------------------------- 1 recursive calls 3 db block gets 195 consistent gets 0 physical reads
As you can see from the plan and cost, this is not very efficient as it has to scan the whole partition. Maybe not a problem if you do this once, but there are indexes on both these columns, can’t this be done more efficiently? Yes, if you split up the code into four in-line selects (if you want more details about it being more performant to do MIN and MAX on their own than in one statement then see this post on the topic ):
insert into tp3_range SELECT (SELECT MIN(CRE_DATETIME) FROM TEST_P3 PARTITION (ID_04K)) ,(SELECT MAX(CRE_DATETIME) FROM TEST_P3 PARTITION (ID_04K)) ,(SELECT MIN(id ) FROM TEST_P3 PARTITION (ID_04K)) ,(SELECT MAX(ID ) FROM TEST_P3 PARTITION (ID_04K)) FROM DUAL PLAN --------------------------------------------------------- SORT AGGREGATE PARTITION RANGE SINGLE cst:2 rws:1000 INDEX FULL SCAN (MIN/MAX) TP_CRE_DT cst:2 rws:1000 Statistics ---------------------------------------------------------- 1 recursive calls 5 db block gets 9 consistent gets 0 physical reads
{You may wonder why the Explain Plan section above has a different look. This is because there seems to be a bug in 10.2.0.3 where the autotrace plan for the insert statement comes out wrong, as a FAST DUAL access, so I had to Explain the statement in a different way}
You would run one of the above statements against each partition. Probably, the “Best Practice” way would be to generate a SQL script from a query against DBA_TAB_PARTITIONS.
To populate my table I cheated – I just assumed all my partitions are of the same size (1000 records) and used:-
insert into tp3_range
select min(cre_datetime),max(cre_datetime),min(id),max(id)
from test_p3
– where id between 10000 and 20000
group by trunc(id/1000)
/
Note I left in a commented line. You could run the above against the whole table and then limit it to just new partitions as you add them. This is a tad risky though, you are relying on the partitioning being perfect and it would not scale to hundreds of very large partitions. You would be better off with the partition-by-partition methods above.
You end up with a table like the below:-
select * from tp3_range order by min_cre_dati MIN_CRE_DATI MAX_CRE_DATI MIN_ID MAX_ID ----------------- ----------------- ---------- ---------- 01-JUL-2009 00:04 03-JUL-2009 18:36 1 999 03-JUL-2009 18:40 06-JUL-2009 13:16 1000 1999 06-JUL-2009 13:20 09-JUL-2009 07:56 2000 2999 09-JUL-2009 08:00 12-JUL-2009 02:36 3000 3999 12-JUL-2009 02:40 14-JUL-2009 21:16 4000 4999 14-JUL-2009 21:20 17-JUL-2009 15:56 5000 5999 17-JUL-2009 16:00 20-JUL-2009 10:36 6000 6999 20-JUL-2009 10:40 23-JUL-2009 05:16 7000 7999 23-JUL-2009 05:20 25-JUL-2009 23:56 8000 8999 26-JUL-2009 00:00 28-JUL-2009 18:36 9000 9999 28-JUL-2009 18:40 31-JUL-2009 13:16 10000 10999 31-JUL-2009 13:20 03-AUG-2009 07:56 11000 11999 ... 14-OCT-2009 13:20 17-OCT-2009 07:56 38000 38999 17-OCT-2009 08:00 20-OCT-2009 02:36 39000 39999 20-OCT-2009 02:40 22-OCT-2009 21:16 40000 40999 22-OCT-2009 21:20 25-OCT-2009 15:56 41000 41999 25-OCT-2009 16:00 28-OCT-2009 10:36 42000 42999
add the two below indexes:
create index tp3r_micd_miid on tp3_range(min_cre_dati,min_id);
create index tp3r_macd_miad on tp3_range(max_cre_dati,max_id);
And now you can find the upper and lower ID bounds for a date range with the following code:
select min_id from tp3_range
where min_cre_dati = (select max(min_cre_dati)
from tp3_range
where min_cre_dati <TO_DATE('18-OCT_2009','DD-MON-YYYY')
)
MIN_ID
----------
39000
Execution Plan
---------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost
---------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 22 | 2
|* 1 | INDEX RANGE SCAN | TP3R_MICD_MIID | 1 | 22 | 1
| 2 | SORT AGGREGATE | | 1 | 9 |
| 3 | FIRST ROW | | 40 | 360 | 1
|* 4 | INDEX RANGE SCAN (MIN/MAX)| TP3R_MICD_MIID | 40 | 360 | 1
---------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
select max_id from tp3_range
where max_cre_dati = (select min(max_cre_dati)
from tp3_range
where max_cre_dati >to_date('20-OCT_2009','DD-MON-YYYY')
)
MAX_ID
----------
39999
Execution Plan
----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost
----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 22 | 2
|* 1 | INDEX RANGE SCAN | TP3R_MACD_MIAD | 1 | 22 | 1
| 2 | SORT AGGREGATE | | 1 | 9 |
| 3 | FIRST ROW | | 4 | 36 | 1
|* 4 | INDEX RANGE SCAN (MIN/MAX)| TP3R_MACD_MIAD | 4 | 36 | 1
----------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
Put it all together into one statement and let’s see how much it costs:
with get_min_id as
(select min_id from tp3_range
where min_cre_dati = (select max(min_cre_dati)
from tp3_range
where min_cre_dati <TO_DATE('18-OCT_2009','DD-MON-YYYY')
) )
,get_max_id as
(select max_id from tp3_range
where max_cre_dati = (select min(max_cre_dati)
from tp3_range
where max_cre_dati >to_date('20-OCT_2009','DD-MON-YYYY')
) )
select count(*)
from test_p3
,get_min_id
,get_max_id
where id >get_min_id.min_id
and id < get_max_id.max_id
and cre_datetime between TO_DATE('18-OCT_2009','DD-MON-YYYY')
and TO_DATE('20-OCT_2009','DD-MON-YYYY')
COUNT(*)
----------
721
1 row selected.
Execution Plan
----------------------------------------------------------
| Id | Operation | Name | Row
s | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------
----------------------------------------------------
| 0 | SELECT STATEMENT | |
1 | 57 | 67 (5)| 00:00:01 | | |
| 1 | SORT AGGREGATE | |
1 | 57 | | | | |
| 2 | TABLE ACCESS BY LOCAL INDEX ROWID | TEST_P3 |
2 | 26 | 65 (5)| 00:00:01 | | |
| 3 | NESTED LOOPS | |
2 | 114 | 65 (5)| 00:00:01 | | |
| 4 | MERGE JOIN CARTESIAN | |
1 | 44 | 2 (0)| 00:00:01 | | |
|* 5 | INDEX RANGE SCAN | TP3R_MICD_MIID |
1 | 22 | 1 (0)| 00:00:01 | | |
| 6 | SORT AGGREGATE | |
1 | 9 | | | | |
| 7 | FIRST ROW | |
40 | 360 | 1 (0)| 00:00:01 | | |
|* 8 | INDEX RANGE SCAN (MIN/MAX) | TP3R_MICD_MIID |
40 | 360 | 1 (0)| 00:00:01 | | |
| 9 | BUFFER SORT | |
1 | 22 | 1 (0)| 00:00:01 | | |
|* 10 | INDEX RANGE SCAN | TP3R_MACD_MIAD |
1 | 22 | 1 (0)| 00:00:01 | | |
| 11 | SORT AGGREGATE | |
1 | 9 | | | | |
| 12 | FIRST ROW | |
4 | 36 | 1 (0)| 00:00:01 | | |
|* 13 | INDEX RANGE SCAN (MIN/MAX) | TP3R_MACD_MIAD |
4 | 36 | 1 (0)| 00:00:01 | | |
| 14 | PARTITION RANGE ITERATOR | |
| | | | KEY | KEY |
| 15 | BITMAP CONVERSION TO ROWIDS | |
| | | | | |
| 16 | BITMAP AND | |
| | | | | |
| 17 | BITMAP CONVERSION FROM ROWIDS| |
| | | | | |
| 18 | SORT ORDER BY | |
| | | | | |
|* 19 | INDEX RANGE SCAN | TP3_PK | 7
07 | | 6 (0)| 00:00:01 | KEY | KEY |
| 20 | BITMAP CONVERSION FROM ROWIDS| |
| | | | | |
| 21 | SORT ORDER BY | |
| | | | | |
|* 22 | INDEX RANGE SCAN | TP_CRE_DT | 7
07 | | 48 (0)| 00:00:01 | KEY | KEY |
------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
133 consistent gets
0 physical reads
If you look way back up this post, you will see that the number of records selected by the above is the same as from the code indicated by Bernard (721 records) and for a lot less cost – 133 consistent gets compared to 892.
So I have hopefully explained the issue (having to visit all the index partitions to check the date range), shown how a ranges table can help, given some simple and a less-simple-but-more-efficient examples of code to populate the table and finally shown that using the range table can be more efficient.
And I arrogantly gave it a name – Correlated Partition Exclusion
Some of you may remember tha this example has been a special case, as there is no overlap between the dates; the relationship between the CRE_DATETIME and ID is perfect. You are unlikely to have that in real life. Also, dome of you may also be tired of reading this post. So I will cover the general case in the NEXT post.
