jump to navigation

IOT Part 6(B) – OLTP Inserts into an IOT November 10, 2011

Posted by mwidlake in development, performance, Testing.
Tags: , , , , , ,
trackback

<..IOT1 – the basics
<….IOT2 – Examples and proofs
<……IOT3 – Significantly reducing IO
<……..IOT4 – Boosting Buffer Cache efficiency
<……….IOT5 – Primary Key Drawback
<…………IOT6(A) – Bulk Insert slowed down

IOT 6 part A covered the sort of insert work used to create an IOT or push a large number of rows into it via an “insert into…select from” type format. ie Bulk Load.
In this, part B, I am going to test inserting a small number of rows (10,000) into a table already holding a million rows. Something more like an hourly load or the inserts that would occur over a working day – a more OLTP-like test

I’m going to do the tests against four things. A “non-insert” test to ensure my data creation is not taking a lot of resource {the Null test}, then against my three versions of test table – a normal heap table, my IOT and my Partitioned IOT. I run each test on each segment type three times, the minimum number needed to help identify variation that could invalidate the test.

You will probably be surprised by the results….

All three tables have the same column structure, the heap has indexes on the primary key and a unique constraint, the IOT and partitioned IOT are organized on a modified version of the Primary key and have the second index for the unique key. Below is the creation statement for the partitioned IOT (the IOT is the same without the partition clause). You can get full creation details from IOT part 2:

create table transaction_IOT_P
(tran_type number(2)     not null
,tran_id   number(10)    not null
,ACCO_type number(2)     not null
,ACCO_id   number(10)    not null
,cre_date  date          not null
,vc_1      varchar2(1000) not null
,vc_2      varchar2(1000)
,date_1    date
,num_1     number(2)
,num_2     number(2)
,constraint tip_pk primary key(ACCO_type,ACCO_id,cre_date,tran_type,tran_id) 
--  using index tablespace index_01
,constraint tip_tran_uq unique (tran_type,tran_id)
  using index tablespace index_01
)
organization index
tablespace data_01
partition by range  (cre_date)
(partition rm20110901 values less than (to_date('01-09-2011','DD-MM-YYYY')) 
  tablespace data_01
,partition rm20111001 values less than (to_date('01-10-2011','DD-MM-YYYY'))
  tablespace data_01
,partition rm20111101 values less than (to_date('01-11-2011','DD-MM-YYYY'))
  tablespace data_01
,PARTITION RMTOP  VALUES LESS THAN (MAXVALUE)
  tablespace USERS
)

The three tests are:

  • Insert 10,000 records for random accounts, in time order for the same day. This would be like processing data for or during a “day”
  • Generating the 10,000 records in order of account, to better match the IOT structure, again for a single day. For any given account 0,1,2 or 3 records are created, to be a little more “real life”. This would be like loading a pre-processed set of data.
  • All 10,000 records are created for the one account, in datetime order, as an extreme example of inserting into an IOT data that is ordered to perfectly match the order of the IOT.

The code is written ‘badly’, ie a single record is prepared and inserted into the table at a time, within a loop of 10000 iterations. This is to avoid the bulk-processing that may benefit the heap table and also to better match the reality of an OLTP system. I do however only commit after all records are created for any given test, which is maybe a mistake but that’s what I did.

Now, before I go any further, does anyone want to guess by how much inserting into the heap table will be faster than inserting into the IOT (partitioned or otherwise)? Also, whether the partitioned IOT will be faster or slower than the IOT and what, if any, impact there will be by the ordering of data?

To allow you to think about that and not see the results by accident, I’ll stick in one of the insert harnesses here so you can see how I am doing it. Skip over if you just want the results

--
execute s_snap.my_snap(' finished non-insert test1')
--
-- Transaction_heap random data test
declare
v_num         number :=10000; -- number of accounts
v_offset      number :=-4;
v_str1        varchar2(60);
v_str2        varchar2(60);
v_tran_type   number(2);
v_tran_id     number(10);
v_acco_type   number(2);
v_acco_id     number(10);
v_cre_date    date;
v_vc_1        varchar2(1000);
v_vc_2        varchar2(1000);
v_date_1      date :=sysdate;
v_num_1       number(2) :=10;
v_num_2       number(2) :=15;
begin
v_str1:=dbms_random.string('U',60);
v_str2:=dbms_random.string('U',60);
dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF'));
for i in 1..10000 loop
    v_tran_type :=mod(1,3)+1;
    v_tran_id   := 1000000+i;
    v_acco_type := 5+(trunc(dbms_random.value(1,3))*5); -- 10 or 15
    v_acco_id   :=trunc(dbms_random.value(1,v_num/2)); -- 1 to 5000
    v_cre_date  :=trunc(sysdate-v_offset)+(i/(60*60*24));
    v_vc_1      :=substr(v_str1,1,51+mod(i,10));
    v_vc_2      :=substr(v_str2,1,46+mod(i,15));
    insert into transaction_heap
      (tran_type,tran_id,ACCO_type,ACCO_id,cre_date,vc_1,vc_2,date_1,num_1,num_2)
    values
      (v_tran_type,v_tran_id,v_ACCO_type,v_ACCO_id,v_cre_date
      ,V_vc_1,v_vc_2,v_date_1,v_num_1,v_num_2);
end loop;
commit;
dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF'));
end;
/
--
execute s_snap.my_snap(' finished th insert test1')
--

Here are the results of three runs for all tests:

Elapsed time in seconds:

                    random insert             ordered-by-account       perfect IOT order match
Segment type    Test 1    2    3            Test 1    2    3           Test 1    2     3    
------------    ------ ---- ----            ------ ---- ----            ----- ---- ----
No insert          2.1  0.1  0.1               0.1  0.1  0.1              0.1  0.0  0.1   

                                    AVG                         AVG                        AVG
                                   ----                        ----                       ----
Heap              52.3 20.5 11.4   28.1       33.4 13.8 10.7   19.3       9.1 13.0  9.8   10.6
IOT              137.1 83.7 26.4   82.4      110.8 63.6 39.0   71.2       6.4  6.7  4.0    5.8
Partn'd IOT       33.2  9.5  8.7   17.1       17.1 10.4 14.3   13.9       8.1  5.3  4.5    6.0
                                   ----                        ----                       ----

Allow me to summarise that.

First, the preparation of the data and cycling around the loop, the no-insert tests, are negligible. {The first no_insert test is taking 2 seconds due to the parsing and collecting dictionary information overhead of the first thing done}. We can ignore the overhead of the test harness.

Much more significantly:

  • Each iteration of each test, things get faster {with one odd exception}
  • For Random and semi-ordered inserts, inserting into the IOT is slower than the Heap by a factor of 300% to 400% (3 times to 4 times slower) – what we all probably expected.
  • Inserting into the IOT is Faster than the heap when the insert order matches the structure of the IOT in the perfect_IOT test. Not just the same but 40% faster
  • The partitioned IOT is as Fast or Faster than the the Heap table, never mind the IOT. Over all the partitioned IOT takes 50% TO 75% THE TIME OF THE HEAP to be populated.

Hands up everyone expecting this? NB My hand would be down if I had not done similar tests before.

Why do we see these results?

I’ll just mention that the reason all the tests get quicker with each iteration is that the computer is doing the same thing over and over again and gets used to it, the relevant circuits in the chip are “oiling in”. Of course not, how silly – but it seems to be how non-IT people think of it 🙂 Basically more and more of each segment is getting cached and that reduces physical IO and speed things up. You see this a lot in testing and you need to be constantly vigilant about it. You might think I should really run the tests until they are steady-state, but that is a very false situation. On real system you don’t tend to get the same thing run over and over and over again on a system with nothing else running.

Much more interestingly, why is the partitioned IOT winning? Well, for the Heap table data is being poured into the leading edge of the table, which takes a little processing and some blocks to be created. However, there are also two indexes being maintained, the Primary Key and that Unique key. For each insert oracle has to find the right spot in each index (root block ->branch block->leaf block->check for row) and insert the new record, occasionally splitting a full leaf block.

For the IOT and the Partitioned IOT there is no table to maintain. In effect there are just two indexes to maintain. For the normal IOT this benefit is outweighed by the cost of working with a less space efficient Primary Key index. As there is the whole of the row in each index entry there are fewer index entries possible per leaf block and so more block need to be read in from disc and more splits are needed. Thus the inserts into the IOT are slower.
But for the partitioned IOT not only is there no table to maintain but the table (really the Primary Key) is partitioned – on date. We are creating data for a single day at a time. All the relevant blocks will be in the one partition (and it’s the latest and smallest partition) and quickly all cached in memory and processed in memory. It’s the efficient use of the cache again. For both the IOT and the Heap, the primary key segment being processed is larger and the inserts are spread across the whole index.

Ordering data to match the index order (ordered_by_account tests) helps for all three segment types – heap, IOT and partitioned IOT. We have two indexes and the data is being ordered for both in effect, as one is on the columns we are ordering the data creation for and the other, the unique key, is generated by an ascending sequence. The gain is about 25, 30% and is due to working through the indexes in order as opposed to skipping about. If the indexes were significantly larger than the buffer cache or on a highly active system where the turnover in the cache was very high (indeicated by a low buffer cache hit ratio 🙂 ) then the benefit would be higher.
The perfect ordering test (perfect_iot_order_match) is a special case. Not only is the data being processed in order of the indexes but only one part of the index is being updated – for a single account. In this situation the maintaining of the index is so efficient that the time saved not processing the heap table means both the IOT and IOT_P win out over the heap. So in this special case, inserts into IOTs are faster than inserts into a heap table.

And finally a quick summary:

  • Inserting records into an IOT carries about a 200-400% overhead, depending on the number of indexes
  • Ordering the data to match the IOT gives a 25-30% advantage, but it similarly aids update of the primary key on the heap
  • Partitioning the IOT such that a smaller chunk of the index is processed can be highly beneficial. I don’t cover it specifically in the above but you can probably appreciate that partitioning on a heap can be beneficial to processing indexes on the heap also.
  • If you can organise to create your IOT such that data is created only in one or a few distinct points in the structure (you might have to tweak you data load too of course), insert performance might match or even beat that for a heap table with the same indexes
  • Using methods to maximise efficient use of your buffer cache and reduce IO is a very powerful performance tuning aid, not just for select

BTW I am conscious of the fact that the above may surprise many and you might wonder if my data or tests are flawed. I have stripped out most of the code from this post as it was just becoming massive and confusing, but contact me and I will happily supply all the example scripts and data.

Comments»

1. Mark W. Farnham - November 10, 2011

About the non-partitioned IOT:

“As there is the whole of the row in each index entry there are fewer index entries possible per leaf block and so more block need to be read in from disc and more splits are needed. Thus the inserts into the IOT are slower.”

True when there is a lot of row “payload” as compared to the index keys. An important special case is when an IOT contains just a globally unique identifier or a globally unique identifier and a very small payload such as a timestamp. This is not a contradiction of your findings; I’m just noting that the insert performance profile for even non-partitioned IOTs is dramatically different than your tests indicate when the “payload” is small (including nothing). The reason, of course, being exactly in tune with your correct analysis of why they are significantly slower when the whole row is large as compared to the keys.

Thanks for an excellent blog series.

mwf

2. mwidlake - November 10, 2011

Nods in agreement. Yes, it’s a really valid point and one I do not think I have even touched on yet. Thanks for that.

Piet de Visser and I have exchanged emails (maybe even comments in this series) about the power of IOTs when the Row Payload {love the term Mark} is very low, in particular the case of intersection entities.
In this case, a table exists just to link Table A to table B and probably the other way around. Table A and Table B both have a globally unique identifier. What you want to drive this (traditionally) is a table with the globally unique identifiers as the two columns of the table and then two indexes, one on (A,B), the other on (B,A).

The table is redundant, all you want are the two indexes. So create an IOT of (A,B) {one of the indexes} and add an index on (B,A). Save one heap table and for the inserts, you will probably have a hot spot in the IOT and a hot spot in the index.

There is of course a way to reduce the Row Payload for an IOT and help make the IOT more efficient…(A packet of Smarties for the first person to say what that is…).

Thanks for both the comment and the encouragement Mark.

Mark W. Farnham - November 11, 2011

Insert the iot in the order of the second key…

3. ramsey - November 11, 2011

Store columns in overflow segment

4. Randolf Geist - November 11, 2011

Hi Martin,

great series, really appreciated.

I do have two points to comment:

– If you show performance related details, like session statistics, it would also be great to see corresponding wait event details, if not taken from a SQL trace then for example an ASH like report. Although I agree that providing even more details probably rather tend to confuse than help

– For the OLTP test case a “concurrency” scenario where multiple sessions apply changes to the data at the same time would obviously be also pretty relevant and interesting

Randolf

5. mwidlake - November 13, 2011

Sorry for the delay in responding, I had to go off and do domestic duties at my mother’s – she has no IT at all and I’ve lost my 3G dongle!

So, Randolf, very good points. I did think about adding an ASH report (or parts from it) but I figured I was already throwing a lot of information at people. I look at ASH/AWR a lot when I am investigating things like this, especially if for a client. On a similar topic, for my tests I kept it simple and did not generate any concurrency – in fact I avoided it as much as possible. When I did similar tests for a client, I did the tests “clean”, ie just running the test, and then with a workload running to simulate concurrent read access and some update access.

Mark and Ramsey, it’s overflow segments I was thinking of, so you win the packet of smarties. Not sure how I get them to you but if you are going to the UKOUG conference this year I’ll be there!

Jonathan Lewis - November 23, 2011

I don’t suppose I win a packet of smarties for this because it’s not about the payload, but you can reduce the size of the IOT’s segment by applying index compression to the IOT’s PK (and its secondary indexes, of course).

In your case the first two columns of the PK (acco_type, acco_id) are likely to be very repetitive so “compress 2” seems appropriate. The syntax requires you to add thie clause as a table attribute, viz:

...
organization index
compress 2
mwidlake - November 23, 2011

Compressing the index is certainly worth investigating on an IOT, especially if the intention of the IOT is the grouping of data for a leading portion of the PK, as it is in my example. It should aid the space efficiency.

I plan to do this in a future post. I have to confess, I have never implemented a compressed IOT for real. The client usually shows signs of stress when faced with two things they are not familiar with together and when asked if I can categorically promise there will be no issues I simply can’t.

6. IOTs « Oracle Scratchpad - November 22, 2011

[…] Part 6b: OLTP inserts into an IOT Share this:TwitterLike this:LikeBe the first to like this post. Leave a Comment […]

7. IOT Part 6 – Inserts and Updates Slowed Down (part A) « Martin Widlake's Yet Another Oracle Blog - November 27, 2011

[…] <..IOT1 – the basics <….IOT2 – Examples and proofs <……IOT3 – Significantly reducing IO <……..IOT4 – Boosting Buffer Cache efficiency <……….IOT5 – Primary Key Drawback …………>IOT6(B) – OLTP Inserts […]

8. Sergey - December 9, 2011

Amazing and very informative study of IOT.
Thank you for all the work.

9. Index Orgnaized Tables – A Start (Star) « Richard Foote’s Oracle Blog - January 3, 2012

[…] better news is that the subject of IOTs have already been covered by Martin Widlake in a truly excellent series of articles on his blog. I would strongly recommend giving them a read as they cover many aspects […]

10. Index Organized Tables – An Introduction Of Sorts (Pyramid Song) « Richard Foote’s Oracle Blog - January 10, 2012

[…] I mentioned in a previous post, Martin Widlake has recently written an excellent series on IOTs, which I highly recommend. I’ll try to […]

11. IOT Part 5 – Primary Key Drawback – and Workaround « Martin Widlake's Yet Another Oracle Blog - January 18, 2012

[…] <..IOT1 – the basics <….IOT2 – Examples and proofs <……IOT3 – Significantly reducing IO <……..IOT4 – Boosting Buffer Cache efficiency ……….>IOT6a – Slowing Down Insert …………>IOT6(B) – OLTP Inserts […]


Leave a comment