Friday Philosophy – Picture Theft!!! July 28, 2011
Posted by mwidlake in Blogging, Friday Philosophy, Perceptions.Tags: behaviour, Blogging, ethics, Humour
7 comments
Last week’s Friday Philosophy was a bit of a moan about how hard I find it to make nice graphics, how long it takes and no one seems to care that much about the results.
Well, after those two days of effort on the pictures and the aforementioned moan, irony of ironies, someone has stolen one of my graphics! So someone likes my efforts ;-). It is the one that represents how you scan down the levels of an index and then link across to the table via the rowid.
Before I go any further I had better make it clear that I am not really upset about it at all :-). In fact, since the scoundrel included a link back to my web page and they are considerably better known than I, my little blog has had a big up-swing in traffic as a result, which is nice. Mind you, as the person who borrowed my diagram is SQL Server expert Steve Jones, of SQLServerCentral/Redgate fame, most of my new audience are probably pretty focused on the SQL Server RDBMS and not Oracle, so unlikely to make many return visits unless they work across the RDBMS boundaries.
What also gives me a little smile is that I have stumbled over the fact that I myself, back in November 2009, was looking for such a diagram {of the way Oracle steps down the index to the leaf blocks, gets the rowid and then straight to the table row} to ‘borrow’ for a post of my own on BLevel and heights of indexes. I even confessed at the time to looking for and failing to find one to use…
Humour aside, it set me thinking. Borrowing content is a perennial and thorny issue.
Occasionally someone will start putting content out on their blog or web site and it turns out that much of that content is directly obtained from other people's blogs and websites – copy&pasted straight in or with little change. That is generally seen by the original author as unacceptable, and once they find out they object. In such cases it sometimes seems the culprit is unaware of this being a transgression and, once it is explained that they have effectively stolen many hours or days of someone's effort, they remove the material. Others seem aware this is theft but do not care until caught. Occasionally the culprit sees no error in their ways at all, even when challenged, as the material had been put “out there” so they now consider it free to all. I certainly do not agree. Perhaps the worst thing you see, though, is people including parts of published books, or even putting the whole book out there for download. Such people should of course have their hands stapled to their backsides in punishment; that is simple theft. Writing blogs takes a long time and effort, writing technical books takes forever and monumental effort. I know from friends that the financial return for such efforts is pitiful enough as it is.
On the other side of the coin, many of us put our stuff out there on the web to be read and used and are very happy for it to spread, to be borrowed from and disseminated. Like nearly all DBAs and developers, over the years I have written lots of little SQL scripts to pull information out of the data dictionary or do little database management tasks. I happily give away copies of these to anyone who wants them (and you can get them off my web site if you like, but just pretend it is not my website, as it is truly awful). All I ever ask is that whoever takes them leaves my name in them.
I think that is core to the issue. I suspect many of us bloggers are happy for small parts of our output to be borrowed so long as credit is given. I certainly am {but please note, this is my personal opinion – other bloggers may object very strongly and any repercussions on you in respect of taking material from other blogs and web sites are your concern}. However, volume is also part of it. The larger the chunk you borrow, the more acknowledgement I would need to be happy about it. Borrowing a single diagram or a paragraph out of a page of text is OK, given I am cited for it. Taking most of a post would probably not be, unless you asked first and were really nice about it and about me. Nicking a set of course notes I wrote is certainly unacceptable, no matter how much you put “originally written by that wonderful Martin Widlake” on it.
So, I think you need to cite the source as “payment” for using it. Perhaps the best way to do it is by simply linking to the material rather than putting it on your blog/website, but that does not work if you need the content within yours to make sense. In which case, I think Steve Jones' approach of putting the content in his and including a link is reasonable. It might have been nice if there had been a comment saying where the image came from, but I can live without it. And despite my joking about the extra traffic, it does not actually matter that his is a popular web site; even if a site gets no traffic, if someone has borrowed a small part of my output but cited me as the source, I'm cool with that.
The problem though is judging what is a “small” part to borrow and what is acceptable to the original author. We all perceive such things differently. So the safest thing is to ask the original author. If I want to use an idea that came from someone else in one of my blogs or a solution they came up with, I always ask and I ask if they want to be cited. This includes discussions in email or in the pub. I ask. If when preparing my blogs I learn a lot from someone else’s blog, I stick in a link and a comment, even though I will have written my own text. I hope that so far I have not upset anyone when I borrow a little.
Photos are a different issue though. I am not going to even attempt to cover that one!
IOT 2 – First examples and proofs July 26, 2011
Posted by mwidlake in development, performance.Tags: design, index organized tables, IOT, performance, system development
10 comments
<.. IOT1 – Basics
..>IOT3 – Great reductions in IO for IOTs
….>IOT4 – Boosting Buffer Cache Efficiency
……>IOT5 – Primary Key issues
……….>IOT6a – Slowing Down Insert
…………>IOT6(B) – OLTP Inserts
In my first post on IOTs I ran through the basics of what they are. Here I am going to create some test tables and show you a few things.
I am going to create a simple PARENT table with 9,999 records and then two CHILD tables. CHILD_HEAP, a normal table, and CHILD_IOT, an Index Organized Table. They have the same columns and will hold very similar data.
All of this is on Oracle 11.1 but works exactly the same on 10.2. The block size is 8K and the tablespaces are auto segment space managed.
Here are the creation statements:
--first create the parent table, keyed by ID.
--The other columns are not significant, they just represent "information"
create table mdw.parent
(id       number(10)    not null
,name     varchar2(100) not null
,date_1   date          not null
,num_1    number(2)
,num_2    number(2)
,constraint pare_pk primary key(id)
 using index tablespace index_01
)
tablespace data_01
/
--
--Now put my 9999 parents into the table.
insert into parent
select rownum
,dbms_random.string('U',mod(rownum,10)+50)
,sysdate-(mod(rownum,500)+1000)
,mod(rownum,99)+1
,trunc(dbms_random.value(0,100))
from dual connect by level < 10000
/
--
-- create the table to hold the children as a heap table
create table child_heap
(pare_id   number(10)    not null
,cre_date  date          not null
,vc_1      varchar2(100) not null
,date_1    date
,num_1     number(2)
,num_2     number(2)
,constraint chhe_pk primary key(pare_id,cre_date)
 using index tablespace index_01
)
tablespace data_01
/
--
-- create the table to hold the children as an IOT table
create table child_iot
(pare_id   number(10)    not null
,cre_date  date          not null
,vc_1      varchar2(100) not null
,date_1    date
,num_1     number(2)
,num_2     number(2)
,constraint chio_pk primary key(pare_id,cre_date)
-- using index tablespace index_01 -- CANNOT STATE for IOT. State in table definition
) ORGANIZATION INDEX -- This is it. This makes the table an IOT
tablespace data_01
/
There are only two differences between the statements creating the CHILD_HEAP and the CHILD_IOT tables.
The main one is the inclusion of the line ORGANIZATION INDEX, which is what instructs Oracle to create the table as an IOT. Note that it does not state the index and you cannot state the index; the IOT is created based on the Primary Key.
The other change is that you now cannot state the tablespace for the Primary Key index. I've not played with this at all, but I don't think you can state anything with a “using index” clause, as the table storage clauses are used for the Primary Key index. I personally find this a little illogical, as it is the index segment that is created, but I guess others would find it more natural that you still state this at the table level.
When I create IOTs on a real system, I put the IOT in a table tablespace {I still maintain table and index tablespaces, for reasons I won’t go into here}. I put it there as it holds the actual data. If I lose that Primary Key index I am losing real data, not duplicated data.
I then populated the two CHILD tables with data. The method of creating this test data is very important.
I am simulating a very common situation, where data is coming in for a set of parents (think customers, accounts, scientific instruments, financial entities) and the data is coming in as a record or set of records each day. i.e. not where the parent and all of its child records are created at one time, like an order and its order lines. I am simulating where the child data is created a few records at a time, not all in one go.
The code is simple. It loops for one hundred days and for each day it creates 10,000 records for random parents. On each day any given parent will have none, one or several records. On average, each parent will end up with 100 records, but some will have more and some less. The key thing is that the data for any given parent is created a record at a time, with lots of records created for other parents before the next record for that given parent.
The two tables will have the same pattern of data but not identical data. {I could have seeded the random number generator to make the two data sets the same, but this will do.} Below is the statement for one table; you just change the table name to populate each one. {BTW I like using the from dual connect by level <=x method of getting the number of rows desired – it is fast and neat, once you have seen it once.}
declare
v_num number :=10000; -- number of people
v_str varchar2(60);
begin
dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF'));
for i in 1..100 loop --days to do
  v_str:=dbms_random.string('U',60);
  insert into CHILD_HEAP
    (pare_id,cre_date,vc_1,date_1,num_1,num_2)
  select
    trunc(dbms_random.value(1,v_num))
   ,sysdate-(100-i) + (rownum/(60*60*24))
   ,substr(v_str,1,51+mod(rownum,10))
   ,sysdate-(100-i) + ((mod(rownum,30)+1)/3)
   ,mod(rownum,20)+1
   ,mod(rownum,99)+1
  from dual connect by level <=v_num;
end loop;
dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF'));
end;
/
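In case the connect-by row generator is new to you, here is a minimal standalone sketch of the trick on its own (nothing from my test schema is needed; the column aliases are just for illustration):

```sql
-- Generate 5 rows without needing a real table.
-- CONNECT BY LEVEL repeatedly "connects" dual to itself,
-- incrementing LEVEL each time, until the condition fails.
select level          as rn
      ,mod(level,3)+1 as num_1
from dual
connect by level <= 5;

--  RN  NUM_1
--  --  -----
--   1      2
--   2      3
--   3      1
--   4      2
--   5      3
```

Swap the 5 for any number of rows you want, and wrap expressions around LEVEL (or ROWNUM) to manufacture the column values, as the population code above does.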
I then gathered objects stats on the tables.
Let’s check the size of the tables:
select segment_name, segment_type,tablespace_name,blocks
from dba_segments
where owner=USER and segment_name like 'CHILD%';

SEGMENT_NAME    SEGMENT_TYPE    TABLESPACE_NAME     BLOCKS
--------------- --------------- --------------- ----------
CHILD_HEAP      TABLE           DATA_01              12288

1 row selected.
ONE row? Where is the other table, where is CHILD_IOT? It does not exist.
Remember from my first post that I made the comment that I would have preferred it if Index Organized Tables had been called something like ‘Table Containing Indexes’? The table data has been placed in the Primary Key index and the table segment does not even exist. If you start using IOTs this will catch you out periodically – it does me anyway, and I’ve been using them on and off for years :-).
Let’s look at the size of the primary key indexes:
select segment_name, segment_type,tablespace_name,blocks
from dba_segments
where owner=USER and segment_name like 'CH%PK'
and segment_name not like '%ORD%';

SEGMENT_NAME    SEGMENT_TYPE    TABLESPACE_NAME     BLOCKS
--------------- --------------- --------------- ----------
CHHE_PK         INDEX           INDEX_01              4224
CHIO_PK         INDEX           DATA_01              19456

2 rows selected.
Note that the Primary Key index for CHILD_HEAP, CHHE_PK, is there and is 4,224 blocks in size, while the CHILD_IOT Primary Key, CHIO_PK, is a lot larger at 19,456 blocks. In fact, not only is the CHIO_PK index larger than the CHILD_HEAP table, it is larger than the CHILD_HEAP table and the CHHE_PK index combined. So much for me saying last post that IOTs can save disk space! I’ll come back to that in a later post…
Here are some other stats from one of my scripts:
mdw11> @tab_sci_own
owner for Table: mdw
Name for Table: child_heap

OWNER    TABLE_NAME          NUM_ROWS      BLOCKS AVG_L GLS ULS LST_ANL      PRT  SAMP_SIZE
-------- -------------- ------------- ----------- ----- --- --- ------------ --- ----------
MDW      CHILD_HEAP          1000,000      12,137    83 YES NO  250711 22:01 NO     1000000

INDEX_NAME      TYP PRT UNQ BL     L_BLKS   DIST_KEYS       CLUSTF     LB_KEY     DB_KEY LST_ANL
--------------- --- --- --- -- ---------- ----------- ------------ ---------- ---------- ------------
CHHE_PK         NOR NO  UNI  2      4,034    1000,000      995,857          1          1 250711 22:02

INDEX_NAME                   TABLE_NAME       PSN COL_NAME
---------------------------- ---------------- --- ------------------------------------------------
CHHE_PK                      CHILD_HEAP       1   PARE_ID
CHHE_PK                      CHILD_HEAP       2   CRE_DATE
--
--
owner for Table: mdw
Name for Table: child_iot

OWNER    TABLE_NAME          NUM_ROWS      BLOCKS AVG_L GLS ULS LST_ANL      PRT  SAMP_SIZE
-------- -------------- ------------- ----------- ----- --- --- ------------ --- ----------
MDW      CHILD_IOT           1000,000                83 YES NO  250711 22:03 NO     1000000

INDEX_NAME      TYP PRT UNQ BL     L_BLKS   DIST_KEYS       CLUSTF     LB_KEY     DB_KEY LST_ANL
--------------- --- --- --- -- ---------- ----------- ------------ ---------- ---------- ------------
CHIO_PK         IOT NO  UNI  2     17,855     910,881            0          1          1 250711 22:03

INDEX_NAME                   TABLE_NAME       PSN COL_NAME
---------------------------- ---------------- --- ------------------------------------------------
CHIO_PK                      CHILD_IOT        1   PARE_ID
CHIO_PK                      CHILD_IOT        2   CRE_DATE
Note the lack of BLOCKS for the CHILD_IOT table and the CLUSTERING_FACTOR of 0 for the CHIO_PK.
The clustering factor is the number of times Oracle, when scanning the whole index in order, would have to swap to a different Table block to look up the table record for each index entry. If it is close to the number of blocks in the table, then the clustering factor is low and the order of records in the table matches the order of entries in the index. This would make index range scans that need to visit the table reasonably efficient.
If the clustering factor is close to the number of records in the table then there is no correlation between index order and table row order, and such index range scans that have to visit the table would be inefficient. Again, this is significant and will be the major topic of the next post.
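If you want to check the clustering factor without a wrapper script, it is held in the standard dictionary views. A sketch, run as the owning user (the join and column names are as documented for USER_INDEXES and USER_TABLES):

```sql
-- Compare each index's clustering factor with its table's size.
-- A value near the table's block count = rows are well clustered
-- in index order; near the table's row count = poorly clustered.
select i.index_name
      ,i.clustering_factor
      ,t.blocks   as table_blocks
      ,t.num_rows as table_rows
from   user_indexes i
       join user_tables t on t.table_name = i.table_name
where  i.index_name in ('CHHE_PK','CHIO_PK');
```

For an IOT the clustering factor is reported as 0, as above; the "table" rows are by definition in index order, so the concept does not really apply.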
The depth of the index does not change, being 3 in each case (a BL, or blevel, of 2).
So, can we see evidence of the theoretical efficiency of looking up single records via the IOT that I mentioned in the first post? Here we go {oh, usual disclaimer, I ran the code twice and show the second run, to remove the parsing overhead}:
-- First the Heap table
select * from child_HEAP
where PARE_ID=1234
and   cre_date=to_date('24-JUN-11 20:13:21','DD-MON-YY HH24:MI:SS');

   PARE_ID CRE_DATE  VC_1
---------- --------- ------------------------------------------------------
DATE_1         NUM_1      NUM_2
--------- ---------- ----------
      1234 24-JUN-11 LUTFHOCIJNYREYICQNORREAJOVBRIHFVLXNIGIVZDMFJCTGYFWC
25-JUN-11         11         16

1 row selected.

Execution Plan
------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |     1 |    83 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| CHILD_HEAP |     1 |    83 |     3   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | CHHE_PK    |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets

-- and now the IOT table
select * from child_IOT
where PARE_ID=1234
and   cre_date=to_date('24-JUN-11 21:23:41','DD-MON-YY HH24:MI:SS');

   PARE_ID CRE_DATE  VC_1
---------- --------- -------------------------------------------------------
DATE_1         NUM_1      NUM_2
--------- ---------- ----------
      1234 24-JUN-11 CSIGBHSXWNDDTCFRCNWYPRNLEQWPCRYTXQQZHACDEXHOBEYXLNYBHRUHJ
27-JUN-11          7         52

1 row selected.

Execution Plan
-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |     1 |    83 |     2   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| CHIO_PK |     1 |    83 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
{I had to look up the exact values of CRE_DATE of a couple of records to do the above queries}
To look up a single row in the heap table, you can see that the plan was to carry out a unique scan on the primary key and then fetch the row via the rowid, taking 4 consistent gets: 3 to walk down the index and get the rowid, and one to get the table block.
For the IOT table the explain plan reveals that there was simply an index unique scan of the Primary Key, nothing more. All the data for the row was there in the index entry, rather than a rowid pointing to it. Thus only 3 consistent gets were required.
For single row lookups on the Primary Key, IOTs are more efficient than traditional heap tables with a Primary Key index. {Please, no one point out that if all the columns you need are in the index you also do not need to go to the table – that is a different topic}.
Quite a few people have shown this efficiency before but the next step is far, far more interesting and shows a much more significant impact of IOTs. That is the topic of the next post :-).
For now, I am going to finish off with what happens with range scans as I suggested they could slow down with an IOT.
Below, I select count(*) for just one of the parent values.
select count(*) from child_heap where pare_id = 2;

  COUNT(*)
----------
        98

Execution Plan
-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |     1 |     4 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |         |     1 |     4 |            |          |
|*  2 |   INDEX RANGE SCAN| CHHE_PK |   100 |   400 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
--
--
select count(*) from child_iot where pare_id = 2;

  COUNT(*)
----------
        93

Execution Plan
-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |     1 |     4 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |         |     1 |     4 |            |          |
|*  2 |   INDEX RANGE SCAN| CHIO_PK |   100 |   400 |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
Both statements carry out a range scan on the Primary Key of the table. For the normal HEAP table this takes 3 consistent gets, which is no surprise: we have an 8K block size and only around 100 rows for a given parent, so they happen to fit into one block of the index. Oracle works down the depth of the index and looks at one leaf block.
For the IOT the scan works down the index in the same way but has to visit more leaf blocks. Even though there are fewer entries, 93 compared to 98, they span two leaf blocks and thus the total number of consistent gets is 4.
Admittedly I was a little lucky in my example above. Sometimes the entries for one parent will span 2 blocks in the heap table’s Primary Key and occasionally the entries for the IOT will fit into 2 blocks. But if you look at the number of leaf blocks in the earlier stats (4,034 for the normal index and 17,855 for the IOT, both for 1,000,000 entries), usually the 100 or so entries for a single parent in the normal index will all fall into one block and the entries for the IOT will fall into between 2 and 3 blocks.
A select count(*) will full scan the smallest segment that can satisfy the query. Let’s try it:
mdw11> select count(*) from child_heap;

  COUNT(*)
----------
   1000000

Execution Plan
-------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     1 |   989   (1)| 00:00:15 |
|   1 |  SORT AGGREGATE       |         |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| CHHE_PK |  1000K|   989   (1)| 00:00:15 |
-------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          2  db block gets
       4109  consistent gets
       4088  physical reads

mdw11> select count(*) from child_iot;

  COUNT(*)
----------
   1000000

Execution Plan
-------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     1 |  4359   (1)| 00:01:05 |
|   1 |  SORT AGGREGATE       |         |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| CHIO_PK |  1000K|  4359   (1)| 00:01:05 |
-------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      19298  consistent gets
      19246  physical reads
The numbers of consistent gets (and physical reads) are close to the number of leaf blocks in the two segments, though a little higher. This is because Oracle is scanning the whole index, branch blocks as well as leaf blocks. The scan is far more expensive for the IOT, simply because the index is so much larger. I’ve not shown timings, but on my little laptop the count(*) takes about 3 seconds on CHILD_HEAP and about 5 seconds on CHILD_IOT.
That is enough for one post.
Fastest £1,000 Server – back from supplier July 23, 2011
Posted by mwidlake in One Grand Server.Tags: hardware, private
6 comments
At the risk of turning my blog into some sort of half-way-house tweet update thing (correct, I’ve never logged into Twitter), as a couple of people asked about the outcome with the broken £1,000 server, I’m happy to report it came back this week. The motherboard had died. I’d convinced myself it was the PSU when I trawled the net, as that seems to be the component most likely to die after having fired up fine in the first place, but no, the motherboard. I guess some solder “dried” or the pc pixies just don’t like me. One month turnaround is not very impressive…
They had another motherboard exactly the same in stock so I got a like-for-like swap. I was kind of hoping for a different one with more SATA3 and USB3 headers 🙂
Now I’m trying to download the latest Oracle 11 for 64-bit Windows. I live out in the wilds of North Essex (for non-UK people, this is all of 62 kilometers north-northeast of London as the crow flies, so not exactly an obscure and remote part of the UK! For those who DO know the UK, it is nothing like “The Only Way is Essex” out here. We have trees, fields, wildlife and a lack of youth culture.) As such, my broadband connection is sloooow. The connection keeps breaking and I lose the download. *tsch*. I’m sure I had a download manager somewhere which got around these issues…
Friday Philosophy – PowerPoint Picture Pain July 21, 2011
Posted by mwidlake in Friday Philosophy, humour.Tags: documentation, Presenting
16 comments
The title does not refer to the agony of sitting through a presentation with loads of pointless graphics and zip-all content. Rather, it encompasses the agony of trying to produce some decent graphical content for PowerPoint – or Word or any other software-based method of presenting information. I think we generally had better presentations when we had overhead projectors or flip charts, and I sometimes go “PowerPoint naked” to present.
However, I do present quite often using the ubiquitous PowerPoint. I spend hours on my slides, trying to put good textual content into them both for the actual presentation and for anyone who downloads them later. I also spend ages trying to put together graphics, as we humans are visual creatures and a good {let’s pause for a second and stress the word good} picture or diagram can help enormously. I did a little animated graphic a few years back to show how DNA is unzipped and duplicated, allowing the near perfect transmission of our DNA as cells divide. Modesty prevents me saying it was good {drat, I just said it} but several people told me it really helped them understand something that text descriptions struggle to put across.
{Hmm, OK, I’ll see if I can get this to work. Try this how-dna-duplicates powerpoint show. It shows how the two strands of DNA are “matched”. A, C, T and G are the 4 letters of the DNA and each one is associated with its “pair” in the second strand. A to T, T to A, C to G and G to C. Split the pair into two strands and you can make each a perfect copy of the original by adding back the pairs. Understand? Oh, and there are very, very, very rare copy mistakes, in this case a C is matched to a T rather than an A. Go on, fire up the slide.}
The reality is, though, that most of my presentations have very average slides. When you present for the UK Oracle User Group, they solicit feedback from the audience and feed it back to the presenter. You get scored on topic, content, style, slides, charisma, sex appeal… The lowest part of my score is always, Always, ALWAYS the slides. This started getting on my nerves, so one year I tried really hard on the slides. It was still my weakest point.
The thing is, I have never had a graphics team (let alone a whole marketing department) to help me out. Either I have worked for a company with no such thing or I have been a lone consultant doing what I can in my spare time. It is especially frustrating as large companies that do have access to teams of graphic artists seem to present with a large number of very pretty, very well drawn slides with zip-all content and the graphics add almost nothing to understanding (and we are back to the first line of this posting). In fact, I often find that slides with lots of pretty icons representing part of the technology stack with chunky arrows sweeping across them showing flow to be {cough} pointless to my understanding. One thing I have picked up from all those pretty pictures though is that a feeling of depth and texture is very effective in altering how a slide looks.
So, I decided for my posts on Index Organised Tables I would spend some time messing around with such things in PowerPoint. My Graphic of the Thread is:
I’m quite pleased with it, especially as in doing so I found out quite a bit about how to use the gradient fills, shadowing, 3D formatting and rotating of simple shapes. It is certainly better than my previous stock graphic for an index:
I think most people would agree it is an improvement. But is it two days’ worth of effort better? Because yes, that is how long I spent messing around with the image options in PowerPoint to replace a flat, uninspiring green blob with 3 lines on it with a nicely shaped, pseudo-3D image holding a tad more information. *sigh*.
I guess I can now convert other flat pictures made up of boxes, triangles and lines into something that looks a bit more like a professional diagram than something knocked up by a database geezer late one evening the day before the presentation. But if it does not help me produce a diagram that aids understanding rather than just presentation fluff, like those marketing slides, then I’ve just wasted a lot of time.
I tell you what though, having spent so long on that diagram, I’m going to use it to death 🙂
Index Organized Tables – the Basics. July 18, 2011
Posted by mwidlake in development, internals, performance.Tags: Database Design, index organized tables, IOT, performance, system development
37 comments
..>IOT2 – Examples and proofs
….>IOT3 – Greatly reducing IO with IOTs
……>IOT4 – Boosting Buffer Cache Efficiency
……..>IOT5 – Primary Key issues
……….>IOT6a – Slowing Down Insert
…………>IOT6(B) – OLTP Inserts
I think Index Organized Tables (IOTs) are a much under-used and yet very useful feature of Oracle. Over the next few postings I’m going to cover some aspects of Index Organised Tables, both good and not-so-good. I am going to cover some benefits of IOTs that I think many people are unaware of. In this first post I am just going to run through the basics of IOTs.
The idea behind an IOT is simple. You hold all the data for the table in the ordered structure of an index. Why would you want to do that? Let us consider a very common requirement, accessing a row in a “large” table via a known, unique key.
Traditionally you have a heap table holding the data you want to access and a standard index to support access to that table. See the first diagram below. The 4-layer triangle represents the index, with a root block, two levels of branch blocks and then the leaf blocks at the “bottom”. The blue rectangle represents the table with the squares being individual rows. Of course, in a large table there would be thousands or millions of “squares”, this is just a simple diagram to show the idea.
When you issue a SQL statement to select the row via the indexed column(s), Oracle will read the root block (1), find the relevant block in the first level of branch blocks (2), then the relevant block in the second level of branch blocks (3) and finally (as far as the index is concerned) the relevant leaf block for the unique key (4). The leaf block holds the indexed column(s) and also the rowid. The rowid is the fastest way to look up a record: it states the file, block and row offset for the row. This allows Oracle to go straight to the block and get the row. That is read number (5).
The number of branch blocks {and thus the number of blocks that need to be read to find a row} will vary depending on how much data is indexed, the number and size of the columns in the index, how efficiently the space has been used in the blocks and one or two other factors. In my experience most indexes for tables with thousands or millions of rows have one, two or three levels of branch blocks.
The second diagram shows a representation of the Index Organized Table. The table has in effect disappeared as a distinct object and the information has been moved into the leaf blocks of the index {part of me feels Index Organized Tables should really be called Table Organized Indexes or Table Containing Indexes as that would better indicate what is physically done}:
So with the IOT Oracle reads the root block (1), the two branch level blocks (2 and 3) and finally the leaf block (4). The leaf block does not hold a rowid but rather the rest of the columns for the table {this can be changed: a more advanced feature allows you to store some or all of the extra columns in an overflow segment}. Thus to access the same data, Oracle has to read only 4 blocks, not 5. Using an IOT saves one block read per unique lookup.
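As an aside, that overflow segment is worth a quick sketch. This is not from my examples above – the table and column names are made up for illustration – but the ORGANIZATION INDEX, PCTTHRESHOLD, INCLUDING and OVERFLOW clauses are standard Oracle syntax:

```sql
-- An IOT where rarely-needed trailing columns can be pushed out
-- to a separate overflow segment, keeping the index rows compact.
create table child_iot_ovf
(pare_id   number(10)    not null
,cre_date  date          not null
,vc_1      varchar2(100) not null
,big_notes varchar2(4000)              -- bulky, rarely-read column
,constraint chiv_pk primary key (pare_id, cre_date)
)
organization index
tablespace data_01
pctthreshold 10   -- rows taking more than 10% of a block overflow
including vc_1    -- columns after VC_1 go to the overflow segment
overflow tablespace data_01
/
```

The leaf blocks then stay dense with the key and leading columns, and Oracle only visits the overflow segment when a query actually asks for the pushed-out columns.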
This saving of block reads is probably the main feature that IOTs are known for, but there are others which I will cover in later posts. Two things I will mention now are that, firstly, using an IOT potentially saves disc space. An index is in effect a duplication of data held in the table: when you create an index no new information is created, but space is used up holding some of the table’s information in a structure suitable for fast lookup. Secondly, both the index and the table have to be maintained whenever a change is made to the columns that are indexed. IOTs reduce this maintenance overhead as there is only one structure to maintain.
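The syntax difference is small. A hypothetical sketch {again, names are invented for illustration}:

```sql
-- The same table created as an IOT: the ORGANIZATION INDEX clause
-- stores the whole row inside the primary key index structure.
-- Note an IOT must have a primary key.
CREATE TABLE orders_iot
( order_id   NUMBER        NOT NULL
, cust_id    NUMBER
, order_date DATE
, status     VARCHAR2(10)
, CONSTRAINT orders_iot_pk PRIMARY KEY (order_id)
)
ORGANIZATION INDEX;
```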
Now for some drawbacks.
- The IOT has to be indexed on the primary key. There is no option to create an IOT based on other indexes. As such you have to either be accessing the table via the primary key to get the benefit – or you have to be a little cunning.
- The index is going to be larger than it was and very often larger than the original table. This can slow down range scans or full scans of the index and a “full table scan” will now be a full index scan on this large object, so that can also negatively impact performance. However, if a range scan would then have resulted in access to the table to get extra columns, the IOT gives a similar benefit in reducing IO to that for single row lookups.
- I just want to highlight that you now have no rowid for the rows.
- Secondary indexes are supported but will potentially be less efficient due to this lack of rowid.
So, a brief summary: Index Organized Tables effectively move the table data into the primary key index, reduce the number of block reads needed to select one row and can save some disc space. But you can only organize the table via the primary key, and it can make full or partial table scans and lookups via other indexes slower.
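As an aside, a hedged sketch of the overflow segment option mentioned earlier {column names invented}: the INCLUDING clause nominates the last column to be kept in the leaf blocks, and columns after it are pushed out to the overflow segment.

```sql
-- Sketch: keep only the "hot" columns in the index leaf blocks.
-- Columns after STATUS (here, NOTES) are stored in the overflow
-- segment, keeping the leaf blocks small and dense.
CREATE TABLE orders_iot_ovf
( order_id   NUMBER         NOT NULL
, status     VARCHAR2(10)
, notes      VARCHAR2(4000)
, CONSTRAINT orders_iot_ovf_pk PRIMARY KEY (order_id)
)
ORGANIZATION INDEX
INCLUDING status
OVERFLOW TABLESPACE users;
```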
There are several more important benefits to IOTs {in my opinion} which I will come to over the next week or two.
When to Fix & When to Find Out – Friday Philosophy July 15, 2011
Posted by mwidlake in AWR, Friday Philosophy, Testing.Tags: knowledge, Testing
8 comments
{warning, this is one of my long, rambling Friday Philosophy blogs. Technical blogs are on the way – though there are some nice AWR screen shots in this one 🙂 }
As a DBA (or System Administrator or Network admin or any other role that involves fixing things that are broken before the shouting starts) there is often a tension between two contending “best practices”:
– getting the system working again as soon as possible.
or
– understanding exactly what the problem is.
Some experts point out {and I generally agree} that unless you actually find out exactly what the problem was, what you did to fix it and, via a test case, demonstrate why the fix worked, you are not really solving the problem – you are just making it go away. You (hopefully!) know what you changed or the action you took, so you have something you can repeat which might fix it again next time. (NB this does not apply to the classic “turn it off, turn it back on again”; that is nearly always just aversion therapy).
But it might not fix it next time.
Or you might be going to way more trouble to fix it than is needed {I’m remembering how flushing the shared pool used to get touted as a way to resolve performance issues, but it is a pretty brutal thing to do to a production database}.
You might even be making other problems worse {like slowing down everything on the production database as the caches warm up again, having flushed out all that data in the SGA, or that good-old-standard of rebuilding indexes that simply do not benefit from the rebuild}.
There is another issue with “just fixing it by trying things” in that you are not really learning about the cause of the issue or about how the technology you are looking after works. A big part of what makes an “expert” an expert is the drive, desire and opportunity to take the time to work this out. It’s that last part I sometimes get grumpy about, the opportunity.
For many of us, we do not have the luxury of investigating the root cause. Because we are under so much pressure to “get it fixed right now and worry about the detail later”. But we do not get to the detail as there is then the next “fix this problem right now” and so it goes on. {Kellyn Pot’Vin does a nice post about what she calls the Superman Conundrum on this topic}.
I had exactly this dilemma just a couple of months ago. Below are the details; it’s a bit long, so skip to the end of the post if you like…
I was creating test data and I decided to use parallel processing to speed it up. I created a month’s worth of data with PL/SQL and then decided to copy it with a simple “insert into …select” statement, updating dates and a couple of other incrementing columns as part of the insert, using parallel. The creation of the second month’s data took longer than the PL/SQL creation of the first month had. What the!!!??? I pulled up the AWR information and could see that the problem was (possibly) with inter-process communication between the parallel processes, as shown by the PX Deq Credit: send blkd wait event.
The below screenshot shows the overall instance workload; the green is CPU and the pink is “Other”. Only one SQL statement is responsible for all of this effort, via 5 sessions (four of which are parallel threads). You can see that the issue had been going on for over an hour {oh, and to a certain extent these pretty pictures are pointless – I am not looking for the exact solution now, but having loaded the pictures up to the blog site I am damn well going to pretty-up my blog with them}:
Drilling into that one session shows that the bulk of the waits, by far, is for PX Deq Credit: send blkd:
By selecting that wait event, I got the histogram of wait times since the system started {I love those little histograms of wait times}:
Note that these waits are for long periods of time, around 1/10 of a second on average and some are for up to 2 or 4 seconds.
The thing is, I had anticipated this problem and had already increased my PARALLEL_EXECUTION_MESSAGE_SIZE from the default of 2K to 16K, as I knew from experience that the default is way too small and has slowed down parallel execution for me before. So why was I seeing poorer performance than I anticipated? I was not understanding stuff. So I needed to change one thing at a time, see what impact it had and repeat until I got to the bottom of it.
Except I could not – the next team in line was waiting for this data and they already had massive pressure to deliver. My job, what my employer was giving me money to do, was to fix this problem and move on. So, in this instance I just pragmatically got on with getting the task done as quickly as I could.
I did what we all do {don’t we?} under times of acute time pressure. I made a handful of changes, using the knowledge I already had and guessing a little, hoping that one of them would do the trick. This included reducing the degree of parallelism, adding the /*+ append */ hint (I had simply forgotten it the first time around), pre-allocating the required space in the tablespace, muttering “pleaseopleaseoplease” under my breath….
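For what it’s worth, the shape of the statement was roughly as below. This is a hedged sketch only – the table, columns and degree of parallelism are invented, and the real statement also updated several other incrementing columns:

```sql
-- Sketch of the copy: a direct-path, parallel insert ... select,
-- shifting one month's data forward. The APPEND hint requests a
-- direct-path load (writing above the high water mark); the
-- PARALLEL hints request a (reduced) degree of parallelism.
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ append parallel(t2, 4) */ INTO transactions t2
SELECT /*+ parallel(t1, 4) */
       trans_id  + 1000000   -- hypothetical incrementing key
     , trans_date + 31       -- shift dates forward a month
     , amount
FROM   transactions t1
WHERE  trans_date < DATE '2011-02-01';

COMMIT;
```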
The job ran in less than 20 minutes and used less resource during that time. Well, it waited less anyway.
The wait histograms show lots and lots of shorter duration waits:
The duplication took 20 minutes, where the previous attempt had still been running after 2 hours when other factors forced me to terminate it. Job done.
But the thing is, the problem was not fixed. I got the task done in a timescale that was required, I satisfied the need of the client, but I was and am not sure exactly why.
If I were a permanent employee I would consider pressing to be allowed to spend some time working this out, as my employer benefits from me extending my knowledge and skills. This is not always a successful strategy 🙂 {but it does indicate to me that my employer has a Problem and I need to address that}. In any case, as I was a consultant on this job, I was being paid to already know stuff. So it is now down to me, some personal time, the internet and people more knowledgeable than me who I can ask for help to sort this out.
And that was my main point. Often at work you have to get over the issue. Later on, you work out what the issue was. I wish I could get paid for the latter as well as the former. The real blow for me is that I no longer have access to that site and the information. My job was complete and, whether they have really shut down my access or not, I am not allowed to go back into the systems to dig around. I think I now know the issue, but I can’t prove it.
Fastest £1,000 server – what happened? July 12, 2011
Posted by mwidlake in One Grand Server, performance.Tags: hardware
7 comments
A couple of people have asked me recently what happened to that “fastest Oracle server for a grand” idea I had last year, after all I did announce I had bought the machine.
{Update – it came back.}
Well, a couple of things happened. Firstly, what was a small job for a client turned into a much more demanding job for a client – not so much mentally harder as time-consuming harder and very time consuming it was. So the playing had to go on hold, the client comes first. The server sat in the corner of the study, nagging me to play with it, but it remained powered down.
Secondly, when work life quietened down last month and I decided to spend a weekend getting that server set up, I hit an issue. I turned on the server and it turned itself straight off. It then rested for 5 seconds and turned itself back on for half a second – and then straight off. It would cycle like that for as long as I was willing to let it.
OK, duff power switch, mother board fault, something not plugged in right, PSU not reaching stable voltage… I opened the case and checked everything was plugged in OK and found the manufacturer had covered everything with that soft resin to hold things in place. I pressed on all the cards etc in hope but no, it was probably going to have to go back. It is still in warranty, the manufacturer can fix it.
So I rang the manufacturer and had the conversation. They were not willing to try and diagnose over the phone so I had to agree to ship it back to them to be fixed {I did not go for on-site support as the only time I did, with Evesham Micros, they utterly refused to come out to fix the problem. Mind you, it turns out they were counting down the last week or two before going bust and, I suspect, knew this}. I shipped it back and the waiting began. Emails ignored, hard to get in touch over the phone. Over three weeks on and they only started looking at the machine last Friday (they claim).
On the positive side, this delay means that solid state storage is becoming very affordable and I might be able to do some more interesting things within my budget.
On the bad side the technology has moved on and I could get a better server for the same money now, but that is always the case. Mine does not have the latest Sandy Bridge Intel processor for example. Also, I have time now to work on it, I hope not to have time next month as I’d like to find some clients to employ me for a bit!
I better go chase the manufacturer. If it is not fixed and on its way back very, very soon then they will be off my list of suppliers and I’ll be letting everyone know how good their support isn’t.
Friday Philosophy – Why do I work with Oracle Technology? July 8, 2011
Posted by mwidlake in Friday Philosophy.Tags: behaviour
23 comments
As an Oracle Expert {*cough* bear with me, despite the lack of humility} I make a living based on my skills and knowledge about Oracle Technology. But why Oracle?
I was prompted to think about this as a side issue to a discussion within the OakTable network, about being aligned with companies – and it was suggested we are aligned with Oracle {my personal feeling is strongly that we are not – we are independent of Oracle Corp}.
How many people reading this blog woke up one morning and thought “Hey, I’m going to review all database technologies, work out the best one and work with that!” And then, after investigations, threw their weight behind Oracle? I certainly did not. I joined a company called Oracle UK almost as a way to escape my then management structure. I thought they were a teletext company {see the “about me” tab if you want a bit more detail on that}. I remain working with Oracle technology primarily because that is what I know the most about. It is by luck that I had stumbled into a technical area that went from strength to strength.
Don’t get me wrong, I think some Oracle technology is very, very good. Most of it is OK and some of it is, well, neither of those two. It is much better than the technology I came from (a language called MUMPS) which, though I still have fond memories of, I decided not to return to when the opportunity came up a couple of years into my Oracle life.
I think it is fair to say that I do not work within the Oracle sphere because I am dedicated to the Larry Ellison world vision {if I could work out what it is, but it seems to have less and less to do with a relational database and more and more about being the IT answer to all business needs, for good or bad}. I work in it because it is where I am and it is good enough technology to not demoralise me too much, plus there is enough work to pay for the cat food and the beer. I also suspect most of you are like me – you work with Oracle Technology because you woke up one day and realised that was where you were. Of course, this probably applies to 95% of people in 95% of jobs.
If I was to have the chance to choose my working career again, would I do the same? Would I work with Oracle technology? Well, it is one of the largest technologies around and so it provides a good source of work. It is probably not a bad choice if you are starting out right now. Actually, if I was to play this game again I would probably not be aiming for computer technology at all, I would hope to be brave enough to aim for what I really wanted to do and try to do medicine and become a surgeon {I did not as I feared my woodwork skills were so poor I would end up a GP, which strikes me as a bloody awful job} or stick with the genetics, which I still love. But if it had been IT? Hmmm, I think I would have gone lower level. I wish I knew how hardware really works. But then, how many jobs are there now in low-level firmware?
So I am in the world of Mr Ellison’s RDBMS and happy there. But importantly, I feel independent of Oracle. I can like what I like about the technology and dislike what I don’t, and I can say which is which – just to link back to the topic of being aligned to a company. That is a small part of why I went back to being an independent consultant too. I do not feel obligated to support a company just because they pay my wage. I suppose I feel obligated to be not-negative about a company that employs my services, but that can be another topic for another day.