IOT 2 – First examples and proofs July 26, 2011
Posted by mwidlake in development, performance.Tags: design, index organized tables, IOT, performance, system development
trackback
<.. IOT1 – Basics
..>IOT3 – Great reductions in IO for IOTs
….>IOT4 – Boosting Buffer Cache Efficiency
……>IOT5 – Primary Key issues
……….>IOT6a – Slowing Down Insert
…………>IOT6(B) – OLTP Inserts
In my first post on IOTs I ran through the basics of what they are. Here I am going to create some test tables and show you a few things.
I am going to create a simple PARENT table with 9,999 records and then two CHILD tables. CHILD_HEAP, a normal table, and CHILD_IOT, an Index Organized Table. They have the same columns and will hold very similar data.
All of this is on Oracle 11.1 but is exactly the same on 10.2. 8K block size, tablespaces are auto segment space managed.
Here are the creation statements:
--first create the parent table, keyed by ID. -- The other columns are not significant, they just represent "information" create table mdw.parent (id number(10) not null ,name varchar2(100) not null ,date_1 date not null ,num_1 number(2) ,num_2 number(2) ,constraint pare_pk primary key(id) using index tablespace index_01 ) tablespace data_01 / -- --Now put my 9999 parents into the table. insert into parent select rownum ,dbms_random.string('U',mod(rownum,10)+50) ,sysdate-(mod(rownum,500)+1000) ,mod(rownum,99)+1 ,trunc(dbms_random.value(0,100)) from dual connect by level < 10000 / -- -- create the table to hold the children as a heap table create table child_heap (pare_id number(10) not null ,cre_date date not null ,vc_1 varchar2(100) not null ,date_1 date ,num_1 number(2) ,num_2 number(2) ,constraint chhe_pk primary key(pare_id,cre_date) using index tablespace index_01 ) tablespace data_01 / -- -- create the table to hold the children as an IOT table create table child_iot (pare_id number(10) not null ,cre_date date not null ,vc_1 varchar2(100) not null ,date_1 date ,num_1 number(2) ,num_2 number(2) ,constraint chio_pk primary key(pare_id,cre_date) -- using index tablespace index_01 -- CANNOT STATE for IOT. State in table definition ) ORGANIZATION INDEX -- This is it. This makes the table an IOT tablespace data_01 /
There are only two differences between the statements creating the CHILD_HEAP and the CHILD_IOT tables.
The main one is the inclusion of the line ORGANIZATION INDEX and is what instructs Oracle to create the table as an IOT. Note that it does not state the index and you cannot state the index. The IOT is created based on the Primary Key.
The other change is that you now cannot state the tablespace for the Primary Key index. I’ve not played with this at all but I don’t think you can state anything with the “using index” as the table storage clauses are used for the Primary Key index. I personally find this a little illogical as it is the index segment that is created, but I guess others would find it more natural that you still state this at the table level.
When I create IOTs on a real system, I put the IOT in a table tablespace {I still maintain table and index tablespaces, for reasons I won’t go into here}. I put it there as it holds the actual data. If I lose that Primary Key index I am losing real data, not duplicated data.
I then populated the two CHILD tables with data. The method of creating this test data is very important.
I am simulating a very common situation, where data is coming in for a set of Parents (think customers, accounts, scientific instruments, financial entities) and the data is coming in as a record or set of records each day. ie not where the parent and all of it’s child records are created at one time, like an order and it’s order lines. I am simulating where the child data is created a few records at a time, not all in one go.
The code is simple. it loops for one hundred days and for each day it creates 10,000 records for random parents. On each day any given parent will have none, one or several records. On average, each parent will end up with 100 records, but some will have more and some less. The key thing is that the data for any given parent is created a record at a time, with lots of records created for other parents before the next record for that given parent.
The two tables will have the same pattern of data but not identical data. {I could have seeded the random number generator to make the two data sets the same but this will do}. Below is the statement for one table, you just change the table name to populate each table. {BTW I like using the from dual connect by level <=x method of getting the number of rows desired – it is fast and is neat, once you have seen it once}.
declare v_num number :=10000; -- number of people v_str varchar2(60); begin dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF')); for i in 1..100 loop --days to do v_str:=dbms_random.string('U',60); insert into CHILD_HEAP (pare_id,cre_date,vc_1,date_1,num_1,num_2) select trunc(dbms_random.value(1,v_num)) ,sysdate-(100-i) + (rownum/(60*60*24) ) ,substr(v_str,1,51+mod(rownum,10)) ,sysdate-(100-i) + ((mod(rownum,30)+1)/3) ,mod(rownum,20)+1 ,mod(rownum,99)+1 from dual connect by level <=v_num; end loop; dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF')); end; /
I then gathered objects stats on the tables.
Let’s check the size of the tables:
select segment_name, segment_type,tablespace_name,blocks from dba_segments where owner=USER and segment_name like 'CHILD%'; SEGMENT_NAME SEGMENT_TYPE TABLESPACE_NAME BLOCKS --------------- --------------- --------------- ---------- CHILD_HEAP TABLE DATA_01 12288 1 row selected.
ONE row? Where is the other table, where is CHILD_IOT? It does not exists.
Remember from my first post that I made the comment I would have prefered it if Index Organized Tables had been called something like ‘Table Containing Indexes’? The table data has been placed in the Primary Key index and the table segment does not even exist. If you start using IOTs this will catch you out periodically – it does me anyway and I’ve been using them on and off for years :-).
Let’s look at the size of the primary key indexes:
select segment_name, segment_type,tablespace_name,blocks from dba_segments where owner=USER and segment_name like 'CH%PK' and segment_name not like '%ORD%' SEGMENT_NAME SEGMENT_TYPE TABLESPACE_NAME BLOCKS --------------- --------------- --------------- ---------- CHHE_PK INDEX INDEX_01 4224 CHIO_PK INDEX DATA_01 19456 2 rows selected.
Note that the Primary Key index for CHILD_HEAP, CHHE_PK, is there and is 4,224 blocks in size, and the CHILD_IOT Primary Key, CHIO_PK, is a lot larger at 19,456 blocks. In fact, not only is the CHIO_PK index larger than the CHILD_HEAP table, it is larger than the combined size of the CHILD_HEAP table and CHHE_PK index combines. So much for me saying last post that IOTs can save disk space? I’ll come back to that in a later post…
Here are some other stats from one of my scripts:
mdw11> @tab_sci_own owner for Table: mdw Name for Table: child_heap OWNER TABLE_NAME NUM_ROWS BLOCKS AVG_L GLS ULS LST_ANL PRT SAMP_SIZE -------- -------------- ------------- ----------- ----- --- --- ------------ --- ---------- MDW CHILD_HEAP 1000,000 12,137 83 YES NO 250711 22:01 NO 1000000 INDEX_NAME TYP PRT UNQ BL L_BLKS DIST_KEYS CLUSTF LB_KEY DB_KEY LST_ANL --------------- --- --- --- -- ---------- ----------- ------------ ---------- ---------- ------------ CHHE_PK NOR NO UNI 2 4,034 1000,000 995,857 1 1 250711 22:02 INDEX_NAME TABLE_NAME PSN COL_NAME ---------------------------- ---------------- --- ------------------------------------------------ CHHE_PK CHILD_HEAP 1 PARE_ID CHHE_PK CHILD_HEAP 2 CRE_DATE -- -- owner for Table: mdw Name for Table: child_iot OWNER TABLE_NAME NUM_ROWS BLOCKS AVG_L GLS ULS LST_ANL PRT SAMP_SIZE -------- -------------- ------------- ----------- ----- --- --- ------------ --- ---------- MDW CHILD_IOT 1000,000 83 YES NO 250711 22:03 NO 1000000 INDEX_NAME TYP PRT UNQ BL L_BLKS DIST_KEYS CLUSTF LB_KEY DB_KEY LST_ANL --------------- --- --- --- -- ---------- ----------- ------------ ---------- ---------- ------------ CHIO_PK IOT NO UNI 2 17,855 910,881 0 1 1 250711 22:03 INDEX_NAME TABLE_NAME PSN COL_NAME ---------------------------- ---------------- --- ------------------------------------------------ CHIO_PK CHILD_IOT 1 PARE_ID CHIO_PK CHILD_IOT 2 CRE_DATE
Note the lack of BLOCKS for the CHILD_IOT table and the CLUSTERING_FACTOR of 0 for the CHIO_PK.
The clustering factor is the number of times Oracle, when scanning the whole index in order, would have to swap to a different Table block to look up the table record for each index entry. If it is close to the number of blocks in the table, then the clustering factor is low and the order of records in the table matches the order of entries in the index. This would make index range scans that need to visit the table reasonably efficient.
If the clustering factor is close to the number of records in the table then it means there is no correlation between index order and table row order and such index ranges scans that have to visit the table would be inefficient. Again, this is significant and will be the major topic of the next post.
The depth of the index does not change, being 3 in each case (BL or blevel 2)
So, can we see evidence of the theoretical efficiency of looking up single records via the IOT that I mentioned in the fist post? Here we go {oh, usual disclaimer, I run the code twice and show the second run, to remove the parsing overhead}:
-- First the Heap table select * from child_HEAP where PARE_ID=1234 AND cre_date=to_date('24-JUN-11 20:13:21','DD-MON-YY HH24:MI:SS') PARE_ID CRE_DATE VC_1 ---------- --------- ------------------------------------------------------ DATE_1 NUM_1 NUM_2 --------- ---------- ---------- 1234 24-JUN-11 LUTFHOCIJNYREYICQNORREAJOVBRIHFVLXNIGIVZDMFJCTGYFWC 25-JUN-11 11 16 1 row selected. Execution Plan ------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 83 | 3 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| CHILD_HEAP | 1 | 83 | 3 (0)| 00:00:01 | |* 2 | INDEX UNIQUE SCAN | CHHE_PK | 1 | | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------ Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets --and now the IOT table select * from child_IOT where PARE_ID=1234 AND cre_date=to_date('24-JUN-11 21:23:41','DD-MON-YY HH24:MI:SS') PARE_ID CRE_DATE VC_1 ---------- --------- ------------------------------------------------------- DATE_1 NUM_1 NUM_2 --------- ---------- ---------- 1234 24-JUN-11 CSIGBHSXWNDDTCFRCNWYPRNLEQWPCRYTXQQZHACDEXHOBEYXLNYBHRUHJ 27-JUN-11 7 52 1 row selected. Execution Plan ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 83 | 2 (0)| 00:00:01 | |* 1 | INDEX UNIQUE SCAN| CHIO_PK | 1 | 83 | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 3 consistent gets
{I had to look up the exact values of CRE_DATE of a couple of records to do the above queries}
To look up a single row with the heap table you can see that the explain plan was to carry out a unique scan on the primary key and then look up the row via the rowid and took 4 consistent gets. 3 to walk down the index and get the rowid, one to look up the row block.
For the IOT table the explain plan reveals that there was simply an index unique scan of the Primary Key, nothing more. All data for the row was there in the index entry rather than the rowid. Thus only 3 consistent gets were required.
For single row lookups on the Primary Key, IOTS are more efficient than traditional Heap tables with a Primary Key index. {Please, no one point out that if all the columns you need are in the index you also do not need to go to the table, that is a different topic}.
Quite a few people have shown this efficiency before but the next step is far, far more interesting and shows a much more significant impact of IOTs. That is the topic of the next post :-).
For now, I am going to finish off with what happens with range scans as I suggested they could slow down with an IOT.
Below, I select count(*) for just one of the parent values.
select count(*) from child_heap where pare_id = 2 COUNT(*) ---------- 98 Execution Plan ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 4 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 4 | | | |* 2 | INDEX RANGE SCAN| CHHE_PK | 100 | 400 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 3 consistent gets -- -- select count(*) from child_iot where pare_id = 2 COUNT(*) ---------- 93 Execution Plan ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 4 | 4 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 4 | | | |* 2 | INDEX RANGE SCAN| CHIO_PK | 100 | 400 | 4 (0)| 00:00:01 | ----------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets
Both statements carry out a range scan on the Primary Key of the table. For the normal HEAP table this takes 3 consistent gets, which is no suprise as we have an 8k block size and only 100 rows for a given parent, they happen to fit into one block of the index. So Oracle works down the depth of the index and looks at one block.
For the IOT the scan works down the index but has to scan three blocks. Even though there are fewer entries, 93 compared to 98, they span three blocks and thus the total number of consistent gets is 5.
Admittedly I was a little lucky in my example above. Sometimes the entries for one parent will scan 2 blocks for the heap table’s Primary Key and occasionally the entries for the IOT will fit into 2 blocks. But if you look at the number of leaf blocks in the earlier stats (4,034 for the normal and 17,855 for the IOT, both for 10,000 entries) usually the 100 or so entries for single parent in the normal index will all fall into one block and the entries for the IOT will fall into between 2 and 3 blocks.
A select count(*) will full scan the smallest segment that can satisfy the query. Let’s try it:
mdw11> select count(*) from child_heap COUNT(*) ---------- 1000000 Execution Plan ------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 989 (1)| 00:00:15 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| CHHE_PK | 1000K| 989 (1)| 00:00:15 | ------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 1 recursive calls 2 db block gets 4109 consistent gets 4088 physical reads mdw11> select count(*) from child_iot COUNT(*) ---------- 1000000 Execution Plan ------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 4359 (1)| 00:01:05 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| CHIO_PK | 1000K| 4359 (1)| 00:01:05 | ------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 19298 consistent gets 19246 physical reads
The number of consistent gets (and physical reads) are close to the number of leaf blocks in the two segments, though higher. This is because Oracle is scanning the whole index, leaf blocks and branch blocks. The scan is far more expensive for the IOT, simply as the index is so much larger. I’ve not shown timings but on my little laptop, the count(*) takes about 3 seconds on CHILD_HEAP and about 5 seconds on the CHILD_IOT.
That is enough for one post.
[…] …Next post, giving examples>> […]
[…] <..IOT1 – the basics <….IOT2 – Examples and proofs […]
[…] – the basics <….IOT2 – Examples and proofs <……IOT3 – Significantly reducing […]
[…] – the basics <….IOT2 – Examples and proofs <……IOT3 – Significantly reducing […]
[…] – the basics <….IOT2 – Examples and proofs <……IOT3 – Significantly reducing IO <……..IOT4 – Boosting […]
[…] Part 2: Examples and Proofs […]
[…] Part 2: Examples and Proofs […]
Can you answer or provide the link for below your comment.
So much for me saying last post that IOTs can save disk space? I’ll come back to that in a later post…
Great post.Thank you.
I am not sure why a Index Range scan on IOT is more expensive even you queried based on Primary key(may be because of more blocks IOT holds where as for the same data for heap table it holds less blocks?)if that is true do you know Why the IOT holds more blocks for the same data than heap table ,i would assume IOT packs more rows into blocks than a heap table because it is sorted?
I’m not sure I completely understand your question but I’ll see if I can clarify things for your.
In my example it is a range scan of the primary key as I am only supplying one (pare_id)of the two (pare_id,cre_date) columns of the primary key.
Where I have a heap table and a standard primary key, Oracle scans that primary key. All the primary key holds is those two columns. Each entry consists of the primary key columns and a rowid to the relevent row in the heap table.
However, the IOT (as I defined it) holds all the columns in the primary key. Each entry consists of the primary key columns – and then all the other columns in the “table”. There is a lot more data for each entry.
As such fewer index entries can fit in one block and so a range scan for a similar number of records will be spread over more blocks for an IOT than a standard Primary Key.
Of course, if you want columns that are not in the primary key, with the PK and head table you would need to not only scan the PK but also visit blocks in the heap table. With an IOT you do not.