cost of full table scans June 17, 2009
Posted by mwidlake in performance. Tags: cost, explain plan, performance
As many sites point out, the cost stated in an explain plan is {to be a little simplistic} the CBO's calculation of the number of I/O operations it thinks will be needed to complete that step of the plan. {It includes the cost of any previous step or steps that feed into it, but many examples are nice and simple and have only a couple of steps in them}.
However, it can be confusing when you decide to look at some simple examples and don't see what you expect, and that starts with simple table scans.
My test script can be found here.
I set up a simple pair of test tables:
-- create a table with 40k rows, an ascending numeric PK
-- a unique VC
-- a non-unique numeric
-- a sparsely populated numeric
-- and some padding fields.
create table test_big
as select rownum id
,mod(rownum,10) num_1to10
,trunc(dbms_random.value(1,1000)) num_1k_vals
,'A'||to_char(rownum+1000) id_2
,trunc(dbms_random.value(1,100)) num_100_vals
,mod(rownum,5) num_4
,decode(mod(rownum,10),0
       ,trunc(dbms_random.value(1,30))
       ,null) num_sparse
,lpad('A',50,'A') vc_1
,lpad('B',50,'B') vc_2
,lpad('C',250,'C') vc_3
from dba_objects
where rownum < 40001
/
-- test_small is identical but for only 100 rows
create table test_small
as select rownum id
,mod(rownum,10) num_1to10
,trunc(dbms_random.value(1,1000)) num_1k_vals
,'A'||to_char(rownum+1000) id_2
,trunc(dbms_random.value(1,100)) num_100_vals
,mod(rownum,5) num_4
,decode(mod(rownum,10),0
       ,trunc(dbms_random.value(1,30))
       ,null) num_sparse
,lpad('A',50,'A') vc_1
,lpad('B',50,'B') vc_2
,lpad('C',250,'C') vc_3
from dba_objects
where rownum < 101
/
-- now add statistics
-- NB each exec must be on one line
exec dbms_stats.gather_table_stats (ownname=>user,tabname=>'TEST_BIG',estimate_percent=>100,cascade=>true)
exec dbms_stats.gather_table_stats (ownname=>user,tabname=>'TEST_SMALL',estimate_percent=>100,cascade=>true)
--
desc test_big
desc test_small
--
set autotrace on
set timi on
set pause off
select count(*) from test_big;
select count(*) from test_small;
I'm running Oracle 11.1.0.6 on a Windows Vista laptop; the multiblock read count is 128 and no system statistics have been collected.
The script creates one table with 40,000 records and one with 100 records and gathers stats on them. I'll now select count(*) from both and look at the plans.
select count(*) from test_big;

  COUNT(*)
----------
     40000

Execution Plan
------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Cost (%CPU)|
------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |     1 |   617   (1)|
|   1 |  SORT AGGREGATE    |            |     1 |            |
|   2 |   TABLE ACCESS FULL| TEST_BIG   | 40000 |   617   (1)|
------------------------------------------------------------

select count(*) from test_small;

  COUNT(*)
----------
       100

Execution Plan
------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Cost (%CPU)|
------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |     1 |     4   (0)|
|   1 |  SORT AGGREGATE    |            |     1 |            |
|   2 |   TABLE ACCESS FULL| TEST_SMALL |   100 |     4   (0)|
------------------------------------------------------------
The cost of the full table scan is 617 for the large table and 4 for the small one. We have full stats on the tables and they are just simple heap tables with no indexes, so we should be able to see where the numbers come from.
Any guesses? Well, it could be the number of blocks making up the objects:-
select table_name,num_rows,blocks,empty_blocks
from dba_tab_statistics
where owner=user
and table_name in ('TEST_SMALL','TEST_BIG');

TABLE_NAME   NUM_ROWS     BLOCKS EMPTY_BLOCKS
---------- ---------- ---------- ------------
TEST_BIG        40000       2267            0
TEST_SMALL        100          9            0
Err, no. 2267 blocks in TEST_BIG and 9 blocks in TEST_SMALL.
{btw the view DBA_TAB_STATISTICS holds a subset of the information on DBA_TABLES and I tend to use it if I just want stats info}.
It could be the number of blocks in the segment maybe?
Run @seg_dets.sql.
Enter value for seg_name: test
Enter value for owner: t

SEG_NAME     SEG TS_NAME      BYTES_K     BLOCKS exts  INI_K  NXT_K
------------------------------------------------------
TEST_BIG     TAB DATA_01       18,432      2,304   33     64
TEST_SMALL   TAB DATA_01          128         16    2     64
No, it isn't that. 2304 is not very different from 2267, as you would expect, since we simply created and populated the table.
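As a quick sanity check, the segment figures are self-consistent if we assume the default 8KB block size (an assumption, but one the later arithmetic also supports):

```python
BLOCK_SIZE_KB = 8  # assumed default block size

# segment figures reported by seg_dets.sql
big_bytes_k, big_blocks = 18432, 2304
small_bytes_k, small_blocks = 128, 16

assert big_bytes_k / BLOCK_SIZE_KB == big_blocks      # 18432/8 = 2304
assert small_bytes_k / BLOCK_SIZE_KB == small_blocks  # 128/8 = 16

# stats BLOCKS (2267) vs segment blocks (2304): the gap is likely just
# segment overhead plus blocks above the high water mark
print(big_blocks - 2267)  # 37 blocks difference
```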
Hands up anyone who is thinking about db_file_multiblock_read_count? MBRC is the number of blocks Oracle will try to read from a segment in one I/O operation when scanning it. Well, MBRC is set to 128 on this system {it's "out of the box" and I question it being that high, actually. On 10.2 on Windows it is, I think, 8}.
That sounds incredibly sensible, so we can check whether the CBO has costed the scan in chunks dictated by the MBRC. Well, divide 2304 {the number of blocks used by the segment} by 617 {the I/O cost} and you get… 3.734. Nothing like the MBRC.
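To make the mismatch concrete: if the cost really were "number of multiblock reads needed", with an MBRC of 128 you would expect something like 18, not 617. A rough sketch of the arithmetic:

```python
import math

seg_blocks = 2304   # blocks in the segment
cost = 617          # cost reported for the full scan
mbrc = 128          # db_file_multiblock_read_count

# naive expectation: one I/O per MBRC-sized chunk
naive_reads = math.ceil(seg_blocks / mbrc)
print(naive_reads)        # 18 - nowhere near the reported 617

# working backwards from the reported cost instead:
print(seg_blocks / cost)  # ~3.73 blocks per "costed" I/O, not 128
```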
What’s going on?!!!
Well, Oracle modifies the value of db_file_multiblock_read_count before using it in the calculation. But it IS the source of the calculation. You can change MBRC at session level to check:-
alter session set db_file_multiblock_read_count=4;

Session altered.

select count(*) from test_big;

  COUNT(*)
----------
     40000

Execution Plan
----------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Cost (%CPU)|
----------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |   853   (1)|
|   1 |  SORT AGGREGATE    |          |     1 |            |
|   2 |   TABLE ACCESS FULL| TEST_BIG | 40000 |   853   (1)|
----------------------------------------------------------

alter session set db_file_multiblock_read_count=8;

Session altered.

select count(*) from test_big;

  COUNT(*)
----------
     40000

Execution Plan
----------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Cost (%CPU)|
----------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |   617   (1)|
|   1 |  SORT AGGREGATE    |          |     1 |            |
|   2 |   TABLE ACCESS FULL| TEST_BIG | 40000 |   617   (1)|
----------------------------------------------------------
At an MBRC of 4 the CBO calculates a cost of 853 to scan the TEST_BIG table and at an MBRC of 8 the CBO calculates a cost of 617.
This script here will run through a set of values for db_file_multiblock_read_count and run the count(*), so you can see the cost going down as MBRC goes up; Oracle discounts the effect of the parameter as it gets larger.
To save you the effort, these are the costs at each value of MBRC I checked, along with the modified MBRC I calculate from them {based on 2267 blocks; 2267 is the best number to use as it is likely to be the high water mark or very close}.
MBRC   Cost   modified MBRC
  2    1325   1.711
  3    1010   2.245  -- not a Magic Number
  4     853   2.658
  6     695   3.262
  8     617   3.674
 10     569   3.984
 12     538   4.214
 16     499   4.543
 20     475   4.773
 24     459   4.939
 32     440   5.152
 33     438   5.176  -- not a Magic Number
 64     410   5.529
128     395   5.729
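For what it's worth, these figures line up closely with the published "noworkload" system-statistics model (described, for example, in Jonathan Lewis's Cost-Based Oracle Fundamentals): with no gathered system stats, the CBO synthesises a single-block read time and a multiblock read time from the default IOSEEKTIM (10ms) and IOTFRSPEED (4096 bytes/ms), and charges the scan as (blocks/MBRC) * (MREADTIM/SREADTIM). A sketch, assuming an 8KB block size and those default noworkload values:

```python
def noworkload_fts_io_cost(blocks, mbrc, block_size=8192,
                           ioseektim=10, iotfrspeed=4096):
    """Model I/O cost of a full scan under noworkload system statistics.

    SREADTIM and MREADTIM are synthesised from IOSEEKTIM and IOTFRSPEED;
    the defaults assumed here give SREADTIM=12ms, MREADTIM=10+2*mbrc ms.
    """
    sreadtim = ioseektim + block_size / iotfrspeed
    mreadtim = ioseektim + mbrc * block_size / iotfrspeed
    return (blocks / mbrc) * (mreadtim / sreadtim)

# costs observed above for TEST_BIG (2267 blocks in the stats)
observed = {2: 1325, 4: 853, 8: 617, 16: 499, 32: 440, 64: 410, 128: 395}
for mbrc, cost in observed.items():
    model = noworkload_fts_io_cost(2267, mbrc)
    # the observed costs run consistently 2-3 higher than the modelled
    # I/O cost, which looks like the small CPU component of the cost
    print(f"MBRC {mbrc:3d}: model {model:7.1f}  observed {cost}")
```

If this model is right it also explains why the cost keeps shrinking but never gets near blocks/MBRC: MREADTIM grows with MBRC, so each multiblock read is charged as taking proportionally longer.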
What do I mean by the comment "not a Magic Number"? Well, MBRC is normally set to 2, 4, 8, 16, 32, 64… a power of 2. All the examples I can remember seeing have used a Magic Number, so I thought I would check whether Oracle does any rounding to such numbers, or insists on even values. It does not.
{quick comment – the table has changed since the first post, I stuck the segment size not the hwm into my spreadsheet}
A couple of final comments for today:
- How in heck can you know about this fudge unless someone tells you? And yes, I know because someone told me.
- The values I gained on 10.2 were very, very similar: consistently a cost 3 higher than on 11.1.
- Why does scanning 9 blocks have a cost of 4?
- Why was I getting a cost of 617 with the default MBRC of 128 and yet when I set my session to have an MBRC of 128 the cost is calculated at 395?
I don’t know all the answers to the above yet.