jump to navigation

IOT 2 – First examples and proofs July 26, 2011

Posted by mwidlake in development, performance.
Tags: , , , ,
trackback

<.. IOT1 – Basics
..>IOT3 – Great reductions in IO for IOTs
….>IOT4 – Boosting Buffer Cache Efficiency
……>IOT5 – Primary Key issues
……….>IOT6a – Slowing Down Insert
…………>IOT6(B) – OLTP Inserts

In my first post on IOTs I ran through the basics of what they are. Here I am going to create some test tables and show you a few things.

I am going to create a simple PARENT table with 9,999 records and then two CHILD tables. CHILD_HEAP, a normal table, and CHILD_IOT, an Index Organized Table. They have the same columns and will hold very similar data.

All of this is on Oracle 11.1 but is exactly the same on 10.2. 8K block size, tablespaces are auto segment space managed.

Here are the creation statements:

--first create the parent table, keyed by ID.
-- The other columns are not significant, they just represent "information"
create table mdw.parent
(id       number(10)    not null 
,name     varchar2(100) not null
,date_1   date          not null
,num_1    number(2)
,num_2    number(2) 
,constraint pare_pk primary key(id) 
 using index tablespace index_01
)
tablespace data_01
/
--
--Now put my 9999 parents into the table.
insert into parent
select rownum
,dbms_random.string('U',mod(rownum,10)+50)
,sysdate-(mod(rownum,500)+1000)
,mod(rownum,99)+1
,trunc(dbms_random.value(0,100))
from dual connect by level < 10000
/
--
-- create the table to hold the children as a heap table
create table child_heap
(pare_id   number(10)    not null
,cre_date  date          not null
,vc_1      varchar2(100) not null
,date_1    date
,num_1     number(2)
,num_2     number(2)
,constraint chhe_pk primary key(pare_id,cre_date) 
 using index tablespace index_01
)
tablespace data_01
/
--
-- create the table to hold the children as an IOT table
create table child_iot
(pare_id   number(10)    not null
,cre_date  date          not null
,vc_1      varchar2(100) not null
,date_1    date
,num_1     number(2)
,num_2     number(2)
,constraint chio_pk primary key(pare_id,cre_date) 
-- using index tablespace index_01 -- CANNOT STATE for IOT. State in table definition
)
ORGANIZATION INDEX -- This is it. This makes the table an IOT
tablespace data_01
/

There are only two differences between the statements creating the CHILD_HEAP and the CHILD_IOT tables.

The main one is the inclusion of the line ORGANIZATION INDEX and is what instructs Oracle to create the table as an IOT. Note that it does not state the index and you cannot state the index. The IOT is created based on the Primary Key.
The other change is that you now cannot state the tablespace for the Primary Key index. I’ve not played with this at all but I don’t think you can state anything with the “using index” as the table storage clauses are used for the Primary Key index. I personally find this a little illogical as it is the index segment that is created, but I guess others would find it more natural that you still state this at the table level.

When I create IOTs on a real system, I put the IOT in a table tablespace {I still maintain table and index tablespaces, for reasons I won’t go into here}. I put it there as it holds the actual data. If I lose that Primary Key index I am losing real data, not duplicated data.

I then populated the two CHILD tables with data. The method of creating this test data is very important.

I am simulating a very common situation, where data is coming in for a set of Parents (think customers, accounts, scientific instruments, financial entities) and the data is coming in as a record or set of records each day. ie not where the parent and all of it’s child records are created at one time, like an order and it’s order lines. I am simulating where the child data is created a few records at a time, not all in one go.

The code is simple. it loops for one hundred days and for each day it creates 10,000 records for random parents. On each day any given parent will have none, one or several records. On average, each parent will end up with 100 records, but some will have more and some less. The key thing is that the data for any given parent is created a record at a time, with lots of records created for other parents before the next record for that given parent.

The two tables will have the same pattern of data but not identical data. {I could have seeded the random number generator to make the two data sets the same but this will do}. Below is the statement for one table, you just change the table name to populate each table. {BTW I like using the from dual connect by level <=x method of getting the number of rows desired – it is fast and is neat, once you have seen it once}.

declare
v_num number :=10000; -- number of people
v_str varchar2(60);
begin
dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF'));
for i in 1..100 loop --days to do
  v_str:=dbms_random.string('U',60);
  insert into CHILD_HEAP
    (pare_id,cre_date,vc_1,date_1,num_1,num_2)
  select
    trunc(dbms_random.value(1,v_num))
   ,sysdate-(100-i) + (rownum/(60*60*24) )
   ,substr(v_str,1,51+mod(rownum,10))
   ,sysdate-(100-i) + ((mod(rownum,30)+1)/3)
   ,mod(rownum,20)+1
   ,mod(rownum,99)+1
  from dual connect by level <=v_num;
end loop;
dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF'));
end;
/

I then gathered objects stats on the tables.
Let’s check the size of the tables:

select segment_name, segment_type,tablespace_name,blocks
from dba_segments where owner=USER and segment_name like 'CHILD%';

SEGMENT_NAME    SEGMENT_TYPE    TABLESPACE_NAME     BLOCKS
--------------- --------------- --------------- ----------
CHILD_HEAP      TABLE           DATA_01              12288

1 row selected.

ONE row? Where is the other table, where is CHILD_IOT? It does not exists.

Remember from my first post that I made the comment I would have prefered it if Index Organized Tables had been called something like ‘Table Containing Indexes’? The table data has been placed in the Primary Key index and the table segment does not even exist. If you start using IOTs this will catch you out periodically – it does me anyway and I’ve been using them on and off for years :-).

Let’s look at the size of the primary key indexes:

select segment_name, segment_type,tablespace_name,blocks
from dba_segments where owner=USER and segment_name like 'CH%PK'
and segment_name not like '%ORD%'

SEGMENT_NAME    SEGMENT_TYPE    TABLESPACE_NAME     BLOCKS
--------------- --------------- --------------- ----------
CHHE_PK         INDEX           INDEX_01              4224
CHIO_PK         INDEX           DATA_01              19456

2 rows selected.

Note that the Primary Key index for CHILD_HEAP, CHHE_PK, is there and is 4,224 blocks in size, and the CHILD_IOT Primary Key, CHIO_PK, is a lot larger at 19,456 blocks. In fact, not only is the CHIO_PK index larger than the CHILD_HEAP table, it is larger than the combined size of the CHILD_HEAP table and CHHE_PK index combines. So much for me saying last post that IOTs can save disk space? I’ll come back to that in a later post…

Here are some other stats from one of my scripts:

mdw11> @tab_sci_own
owner for Table: mdw
Name for Table: child_heap

OWNER    TABLE_NAME          NUM_ROWS      BLOCKS AVG_L GLS ULS LST_ANL      PRT  SAMP_SIZE
-------- -------------- ------------- ----------- ----- --- --- ------------ --- ----------
MDW      CHILD_HEAP          1000,000      12,137    83 YES NO  250711 22:01 NO     1000000


INDEX_NAME      TYP PRT UNQ BL     L_BLKS   DIST_KEYS       CLUSTF     LB_KEY     DB_KEY LST_ANL
--------------- --- --- --- -- ---------- ----------- ------------ ---------- ---------- ------------
CHHE_PK         NOR NO  UNI  2      4,034    1000,000      995,857          1          1 250711 22:02

INDEX_NAME                   TABLE_NAME       PSN COL_NAME
---------------------------- ---------------- --- ------------------------------------------------
CHHE_PK                      CHILD_HEAP       1   PARE_ID
CHHE_PK                      CHILD_HEAP       2   CRE_DATE

--
--
owner for Table: mdw
Name for Table: child_iot

OWNER    TABLE_NAME          NUM_ROWS      BLOCKS AVG_L GLS ULS LST_ANL      PRT  SAMP_SIZE
-------- -------------- ------------- ----------- ----- --- --- ------------ --- ----------
MDW      CHILD_IOT           1000,000                83 YES NO  250711 22:03 NO     1000000

INDEX_NAME      TYP PRT UNQ BL     L_BLKS   DIST_KEYS       CLUSTF     LB_KEY     DB_KEY LST_ANL
--------------- --- --- --- -- ---------- ----------- ------------ ---------- ---------- ------------
CHIO_PK         IOT NO  UNI  2     17,855     910,881            0          1          1 250711 22:03

INDEX_NAME                   TABLE_NAME       PSN COL_NAME
---------------------------- ---------------- --- ------------------------------------------------
CHIO_PK                      CHILD_IOT        1   PARE_ID
CHIO_PK                      CHILD_IOT        2   CRE_DATE

Note the lack of BLOCKS for the CHILD_IOT table and the CLUSTERING_FACTOR of 0 for the CHIO_PK.

The clustering factor is the number of times Oracle, when scanning the whole index in order, would have to swap to a different Table block to look up the table record for each index entry. If it is close to the number of blocks in the table, then the clustering factor is low and the order of records in the table matches the order of entries in the index. This would make index range scans that need to visit the table reasonably efficient.

If the clustering factor is close to the number of records in the table then it means there is no correlation between index order and table row order and such index ranges scans that have to visit the table would be inefficient. Again, this is significant and will be the major topic of the next post.

The depth of the index does not change, being 3 in each case (BL or blevel 2)

So, can we see evidence of the theoretical efficiency of looking up single records via the IOT that I mentioned in the fist post? Here we go {oh, usual disclaimer, I run the code twice and show the second run, to remove the parsing overhead}:

-- First the Heap table
select * from child_HEAP where PARE_ID=1234
AND cre_date=to_date('24-JUN-11 20:13:21','DD-MON-YY HH24:MI:SS')

   PARE_ID CRE_DATE  VC_1
---------- --------- ------------------------------------------------------
DATE_1         NUM_1      NUM_2
--------- ---------- ----------
      1234 24-JUN-11  LUTFHOCIJNYREYICQNORREAJOVBRIHFVLXNIGIVZDMFJCTGYFWC
25-JUN-11         11         16
1 row selected.

Execution Plan
------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |     1 |    83 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| CHILD_HEAP |     1 |    83 |     3   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | CHHE_PK    |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets


--and now the IOT table

select * from child_IOT where PARE_ID=1234
AND cre_date=to_date('24-JUN-11 21:23:41','DD-MON-YY HH24:MI:SS')

   PARE_ID CRE_DATE  VC_1
---------- --------- -------------------------------------------------------
DATE_1         NUM_1      NUM_2
--------- ---------- ----------
      1234 24-JUN-11
CSIGBHSXWNDDTCFRCNWYPRNLEQWPCRYTXQQZHACDEXHOBEYXLNYBHRUHJ
27-JUN-11          7         52
1 row selected.

Execution Plan
-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |     1 |    83 |     2   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| CHIO_PK |     1 |    83 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets

{I had to look up the exact values of CRE_DATE of a couple of records to do the above queries}

To look up a single row with the heap table you can see that the explain plan was to carry out a unique scan on the primary key and then look up the row via the rowid and took 4 consistent gets. 3 to walk down the index and get the rowid, one to look up the row block.

For the IOT table the explain plan reveals that there was simply an index unique scan of the Primary Key, nothing more. All data for the row was there in the index entry rather than the rowid. Thus only 3 consistent gets were required.

For single row lookups on the Primary Key, IOTS are more efficient than traditional Heap tables with a Primary Key index. {Please, no one point out that if all the columns you need are in the index you also do not need to go to the table, that is a different topic}.

Quite a few people have shown this efficiency before but the next step is far, far more interesting and shows a much more significant impact of IOTs. That is the topic of the next post :-).

For now, I am going to finish off with what happens with range scans as I suggested they could slow down with an IOT.
Below, I select count(*) for just one of the parent values.

select count(*) from child_heap where pare_id = 2

  COUNT(*)
----------
        98

Execution Plan
-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |     1 |     4 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |         |     1 |     4 |            |          |
|*  2 |   INDEX RANGE SCAN| CHHE_PK |   100 |   400 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets

--
--

select count(*) from child_iot where pare_id = 2

  COUNT(*)
----------
        93

Execution Plan
-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |     1 |     4 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |         |     1 |     4 |            |          |
|*  2 |   INDEX RANGE SCAN| CHIO_PK |   100 |   400 |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets

Both statements carry out a range scan on the Primary Key of the table. For the normal HEAP table this takes 3 consistent gets, which is no suprise as we have an 8k block size and only 100 rows for a given parent, they happen to fit into one block of the index. So Oracle works down the depth of the index and looks at one block.

For the IOT the scan works down the index but has to scan three blocks. Even though there are fewer entries, 93 compared to 98, they span three blocks and thus the total number of consistent gets is 5.

Admittedly I was a little lucky in my example above. Sometimes the entries for one parent will scan 2 blocks for the heap table’s Primary Key and occasionally the entries for the IOT will fit into 2 blocks. But if you look at the number of leaf blocks in the earlier stats (4,034 for the normal and 17,855 for the IOT, both for 10,000 entries) usually the 100 or so entries for single parent in the normal index will all fall into one block and the entries for the IOT will fall into between 2 and 3 blocks.

A select count(*) will full scan the smallest segment that can satisfy the query. Let’s try it:

mdw11> select count(*) from child_heap

  COUNT(*)
----------
   1000000

Execution Plan
-------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     1 |   989   (1)| 00:00:15 |
|   1 |  SORT AGGREGATE       |         |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| CHHE_PK |  1000K|   989   (1)| 00:00:15 |
-------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          2  db block gets
       4109  consistent gets
       4088  physical reads

mdw11> select count(*) from child_iot

  COUNT(*)
----------
   1000000

Execution Plan
-------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     1 |  4359   (1)| 00:01:05 |
|   1 |  SORT AGGREGATE       |         |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| CHIO_PK |  1000K|  4359   (1)| 00:01:05 |
-------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      19298  consistent gets
      19246  physical reads

The number of consistent gets (and physical reads) are close to the number of leaf blocks in the two segments, though higher. This is because Oracle is scanning the whole index, leaf blocks and branch blocks. The scan is far more expensive for the IOT, simply as the index is so much larger. I’ve not shown timings but on my little laptop, the count(*) takes about 3 seconds on CHILD_HEAP and about 5 seconds on the CHILD_IOT.

That is enough for one post.

About these ads

Comments»

1. Index Organized Tables – the Basics. « Martin Widlake's Yet Another Oracle Blog - July 26, 2011

[...] …Next post, giving examples>> [...]

2. IOT part 3 – Significantly Reducing IO « Martin Widlake's Yet Another Oracle Blog - August 2, 2011

[...] <..IOT1 – the basics <….IOT2 – Examples and proofs [...]

3. IOT Part 4 – Greatly Boosting Buffer Cache Efficiency « Martin Widlake's Yet Another Oracle Blog - August 8, 2011

[...] – the basics <….IOT2 – Examples and proofs <……IOT3 – Significantly reducing [...]

4. IOT Part 5 – Primary Key Drawback – and Workaround « Martin Widlake's Yet Another Oracle Blog - August 17, 2011

[...] – the basics <….IOT2 – Examples and proofs <……IOT3 – Significantly reducing [...]

5. IOT Part 6 – Inserts and Updates Slowed Down (part A) « Martin Widlake's Yet Another Oracle Blog - November 1, 2011

[...] – the basics <….IOT2 – Examples and proofs <……IOT3 – Significantly reducing IO <……..IOT4 – Boosting [...]

6. IOTs « Oracle Scratchpad - November 22, 2011

[...] Part 2: Examples and Proofs [...]

7. IOTs « Ukrainian Oracle User Group - November 23, 2011

[...] Part 2: Examples and Proofs [...]

8. ibneyhasan - January 6, 2014

Can you answer or provide the link for below your comment.
So much for me saying last post that IOTs can save disk space? I’ll come back to that in a later post…


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 165 other followers

%d bloggers like this: