
Time to wake up April 1, 2014

Posted by mwidlake in Meeting notes, Private Life, UKOUG, Uncategorized.
3 comments

This post is nothing more than a line in the sand, really.

After my post on “retiring” in November last year, and then the one just before the 2013 year end about networking to work, as opposed to working to network, I have been very, very quiet. Well, I had things outside of Oracle and IT to do and they were more important to me. I’d say I have made a reasonable stab at them. My wife would say I have been a lazy, good-for-nothing so-and-so. If you do not know me, trust my wife. If you do know me, you already know to trust my wife :-) .

I do need to nod my head at a few friends who got in touch when it all went quiet after my retirement post and privately checked I was not dying. I’m not, I’m fine, and I was touched by the concern. I do seem to be currently surrounded by people who have died or are dying, but so far no one I married, lived with, am related to by blood or bought (the CAT, you strange people – but she is old and was “odd” last month; I did think for a while it was white-coat time) has hit that particular end-point. It has become a constant background concern though. Hmm, sometimes foreground, but part of the benefit of not working is being able to be there when others need you.

However, in my state of not-working, I did a rare check on the calendar this week and saw that 1st April 2014 was approaching – and that is my line in the sand. I had to make a break from working in London (or, rather, commuting in and out of London) and also sort out a few non-IT things in my real life, which I think I have. So I am entering the weird world of Oracle IT once more. Last week I went to the second Oracle Midlands user group meeting and it was very, very good. If you are within 100 miles of Birmingham, google it and get along. The next meeting is on 20th May and Christian Antognini is doing a double-header presentation – you will benefit from being there.

The next day I was in a meeting in London about organising the next UKOUG tech conference. This year it will be in Liverpool and a week later than normal. That might sound negative (way out THERE and THAT late?) but the venue is just right for the conference. There is more to do around that area of Liverpool than there was last year in Manchester, or indeed for so many comfortable years in central Birmingham {I did start to drift more to locations in central Brum these last 3 or 4 years, but it was a real effort to get people to go with me}, and a lot of effort is going into looking at the feedback from prior events to improve this next event in 2014. I am determined to get some of that feedback acted upon – though I would say that, seeing as I am involved :-). I’ll mention some more details later this week or next, depending on the demands my non-Oracle life makes on me.

Tomorrow (today?) I am at the next UKOUG AIM SIG – it needs a better title – which is the old RAC-HA SIG conjoined with my SIG about managing Oracle in large, complex or demanding environments. But as the AIM SIG had the word “management” in it, it scared many IT people away (it was more about *coping* with corporate management than being *about* corporate management). Anyway, we need to re-title it so you buggers realise it is actually a technical SIG aimed at helping us look at and handle cluster issues and massive-system issues. Yes, it needs to be two SIGs again, but the UKOUG is struggling with that, partly as your companies stopped letting you lot come to these meetings. I despair of large corporations, I really do… :-)

So that was a load of fluff about me coming back to the user-based fold and playing a role. I do intend to do some technical posts too, but they take a lot of effort. I have some half written, but I have lost access to the systems I did the real work on {hmmm, some I can still access but, legally, I should not even be TRYing}, which makes it less than easy for me to demonstrate my points with real-world but obfuscated examples. Recreating those examples on play systems is not exactly play-time.

Which leads me on to one odd point I am sure I will come back to:

I’m “retired”.
I do not need to earn.
Do you have an interesting performance/architecture issue with Oracle you are stumped with?
I won’t work for free (after all, some people pay the bills doing this stuff and I DO need to earn enough to go and present/teach and the garden needs my free time). But I am kind of an easy mark at the moment.

Anyway, April 1st and I need to be in Reading for the next AIM SIG so I better finish this off.

So finally….

It’s (worryingly) good to be back.

Martin W

Time for Bed, said Zebedee November 15, 2013

Posted by mwidlake in Private Life, Uncategorized.
6 comments

Zebedee has just boinged onto my TV screen on his big-spring lower body and told us all that it is “time for bed”. I concur Zebedee. Good night everyone!


For those of you who are not of UK origin and of “a certain age”, Zebedee was a character on the BBC Children’s TV Program The Magic Roundabout. At the end of the program (5 minutes of oddly surreal kids entertainment) more often than not Zebedee would arrive (if not already present as part of the narrative arc) and announce “Time for Bed” and that would be the end of the program. I won’t go into the rest of the details but for those of us who grew up with it, Dougal, Florence, Brian, Zebedee and Ermintrude the cow were large in our childhood. Dylan though was odd… {for anyone wanting a nostalgia trip, try this episode}

Well, for me it is “Time for Bed” children. I started working in IT almost exactly 24 years ago on the 13th November 1989, arriving wearing my new (cheap) suit with (shiny) briefcase clutched in my hand at a (tatty) office block in Bristol owned by the NHS. It housed their regional computer department for the South West of England.
And on the 15th November 2013, I will exit a much plusher office, no suit on, no briefcase, and I will go home and, for many complex reasons, I will “retire”. This does not mean I will stop working in IT and it *may* result in me being a more regular blogger and more of a teacher than I have been for 2 years, but let’s just see, Boys and Girls.

What it does mean is I am stopping doing the 2 hour commute into and out of London every day, wasting my life. It means I am not signing up for 6 months on a single project on the basis of a couple of 30 minute interviews. I am also not signing up for 4 weeks and then finding it becomes six months – as when that happens it is *always* 6 months of misery trying to undo the work of a dedicated horde of hopeful but under-skilled youngsters being forced to do a job they lack the training for, by people who should not be allowed anywhere near management positions (as they see “management” as a way to gauge their importance, polish their egos and garner more cash, rather than as a role that should encompass nurturing, developing, admonishing and making the most of your team(s)). It means I won’t have to play by corporate rules any more, which is good as I am finding it harder and harder and harder to do that without coming home and being really quite mean to my cat, and she does not deserve that as she is friendly and quite forgiving really. Neither does my wife deserve it, but if I am mean to her, she punches me.

What I do hope will happen is that, after a couple of months to get a few things sorted, I will continue to get the odd call I seem to get about once a month where someone would like me to just help out with a specific problem. Occasionally I have been able to say “yes” and it has been interesting and enjoyable, but usually I have been in the middle of one of the situations that resulted in my rant above and have had to say no. If I do get the calls, it will be great. I would like to earn enough so I can still do the conferences and present and meet people and talk about interesting problems and drink beer. If not, I have a lot of other things I want to have a crack at.

So I am not sure what awaits me now. But I am “retired”. I have calculated that from now to my demise I can afford to buy a tin of beans every day until I die. Plus some bread, so beans on toast is possible. That is enough. I did not retire 9 months ago (as was my original plan) so, as a result, now I can afford once a week to buy a very cheap bottle of wine. Something better than Blue Nun. If I get more work, I could step up to beans and cheese on toast. Hmmmm!

Hey Mum, I’m Famous!!! April 28, 2013

Posted by mwidlake in Private Life, Uncategorized.
6 comments

I got a mail this week from Richard Harrison:

“Hi Martin
See you made it in to oracle magazine this month.That’s the pinnacle of any oracle professionals career – all downhill from here on in :-)”

I was not aware of my sudden rise to fame, but Richard is right – I’m in this month’s “Peer to Peer” section, which just gives some details about recent Oracle ACEs, I think. I’d forgotten that I had filled in a form they sent me before Christmas, answering a set of questions. It is interesting to see what they picked out of all my answers to include.

I think most of us would feel it is nice to see something about ourselves in print (so long as it is not derogatory or critical, of course!), though when I come to think of it, I don’t really know why it is nice – other than the rather self-serving feeling of having our egos polished. And as my friends I drink with would (and probably will) comment, my ego certainly does not need much polishing :-). I’ve of course made it worse by blogging about how famous I now am. Polish, polish, polish.

Don’t worry, my wife stepped in to put me back in my place. “You could tell your mum when you next ring her – not that she’ll be impressed at all!”. Thanks wife. She’s right. My mum will just say “that’s nice” in a tone that in no way convinces me she means it, and will then proceed to talk at me about her new cats, what’s on TV and all the terrible things going on in the world, according to the “Daily Mail” (an utterly horrible and vacuous daily tabloid paper here in the UK).

So thank you for the heads-up Richard. I’m looking forward to the rapid decline of my career as you predict… :-)

Re-forming the Martin Cluster in Norway April 5, 2013

Posted by mwidlake in Friday Philosophy, Meeting notes, Presenting, Uncategorized.
7 comments

Later this month, on April 17-20th, I am presenting again at the Norwegian Oracle user group (OUGN) spring conference {modern browsers will allow you to translate any Norwegian if you need to} . I loved it last year, as you can see from my third-day post on it. I’ve been lucky to attend some very good conferences over the last few years but those three days at the OUGN conference last year were, I think, my favourite single event to date. If you are within striking distance of Oslo and can negotiate the time out of the office, I would strongly recommend the event. If you can’t negotiate the time, heck, take a holiday and go anyway :-)

Part of what I really enjoyed about the event was the fact that two of the days are spent on a ferry/cruise ship from Oslo to Kiel and back. And oddly enough, that is what initially put me off going to the conference – I am very susceptible to Sea Sickness. I had no problems though, partly due to the large quantities of travel calm pills I swallowed, partly due to the good weather, but mostly because the talks were excellent and the atmosphere was fantastic. I don’t mean “hey, it was a bit like a holiday” {though in some ways it was as it was such fun} but because the speakers and the attendees can’t escape, at least not without a long swim, everyone is around all day and all evening. It just adds to the whole event. I spoke to more “new” people during that conference than I ever have before.

At most conferences the presentations at the end of the day tend to be less well attended and there can be a feeling of wind-down, especially on the last day. A fair number of people feel the need to make an early exit to get home before the worst of the traffic or are under pressure to get back to the office and just sort out some issue that is pressing. The people around in the evening tend to be the presenters and the conference die-hards and so are usually the same sad old bunch of geezers and gals :-) . However, on the OUGN Boat this is not the case. All sessions tend to be well attended and in the evening nearly everyone is out having something to eat, a few drinks (those North Europeans sure do like the odd drink, but in a relaxed and affable way) and just being sociable.

Over the last few years the conference has developed a reputation for being technically strong too. This is partly due to the excellent atmosphere attracting good presenters, and the good presenters in turn help make the conference better, more popular and well attended – and that in turn attracts presenters. A nice positive feedback loop. I certainly learnt a lot of stuff last year and I cannot think of a poor presentation that I attended. Hmm, maybe one of mine was a little weak :-| . The organisers do an excellent job of helping the presenters feel relaxed and appreciated too. For example, I was nervous about the boat part of the trip so they gave me one slot on the mainland the day before we sailed and suggested I could bail out at Kiel if I was suffering. As a presenter, that sort of consideration counts for a lot. I don’t want or expect to be treated like some minor celebrity and I was not, but for the whole conference I just felt like the organisers appreciated my taking time out from work and flying out to present.

The final reason I am looking forward to the event (and thus the odd title) is the re-forming of the Martin Oracle Cluster :-) – this is myself, Martin Nash and Martin Bach. We all do several conferences a year, we all used to go along to the London Oracle Beers and we have become good friends. Other Oracle Martins are welcome to join us – at the OUGN last year there was also Martin Büchi, whom I had not met before, but this year I think we are the only Martins presenting. We just don’t seem to have managed to re-form the cluster for many months now, partly as Mr Bach returned to Germany.


Martin Nash – Martin Büchi – Martin Bach – Martin Widlake
Thanks to Øyvind Isene for the picture.

I suppose I should mention what I am presenting on? Well, as I mentioned in my last Friday Philosophy, I am concentrating more on introductory presentations. You can see my official agenda here. I am doing:

  • an introductory presentation on Row Level Security, VPD and hiding rows or columns of data {it will be interesting to see how large the audience is for that one!}
  • an introduction to SQL tuning where I cover the absolute basics, but hopefully in a way that allows those new to it (or maybe even not so new) to treat tuning as a logical and sensible process, as opposed to Black Magic of favourite hints and arcane practices
  • my disasters talk. I love giving my disasters talk. I’ve “been in the vicinity” of a lot of disasters and I only ever talk about things I have seen first hand, so no urban myths.

And so the evenings start drawing out (honest!) December 13, 2011

Posted by mwidlake in Uncategorized.
1 comment so far

I know I’ve blogged about this before, but it was early on when very few people read my ramblings, so I am mentioning it again…

For those of us in the Northern Hemisphere, today is the day when the evenings start drawing out, which personally I find a relief as going home in the dark depresses me. Sunset tomorrow will be later than today – by all of a few seconds but, heck, later is later. {If I am out by a day, please don’t tell me – don’t shatter my good mood!}

However, as many of you are probably thinking, the shortest day in the Northern Hemisphere is not until the 22nd of December (it’s the 21st or 22nd, depending on how long ago the last leap year was). Mornings continue to get later until around the 3rd of January. This is because the earth is not “standing totally upright” in its orbit. If you think of the plane in which the earth circles the sun as a flat surface, with the north pole at the top of the planet and a pole sticking through the earth that it spins around every day, then that pole is leaning back away from the sun today, and slightly to one side, like a staggering drunk.

For the timing of sunrise and sunset for the city nearest you, check out this nice website here. This link will show London but you can change that.

The original post is here. It does not say any more but there are a couple of pretty sunset pictures on it.

Of course, if you are in the Southern Hemisphere {say Perth, Australia} then your sunrises have just started getting later by today. But time for the Barby in the evening is still drawing out for a week or two. We can all be happy :-)

Was the Oracle UK logo Blue back in 1991? December 6, 2011

Posted by mwidlake in history, Uncategorized.
2 comments

I think I might be going mad. I was sure that when I joined Oracle UK back in 1991 that the massive “Oracle” sign above the main office on “The Ring” in Bracknell was blue. It was the building that looked like a load of cubes balanced on each other.

As I remember it, the office stationery had “Oracle UK” on it in blue and my business cards were similarly coloured. I can’t find any 20-year-old stationery to prove it and I owe Bryn Llewellyn a bottle of wine if I turn out to be wrong.

I’m sure I also remember fellow consultants joking, in around 1993 when the annual bonus was particularly poor, that it was due to all the money spent going from blue to red stationery and signs when our UK identity was absorbed into the parent beast…

IOT Part 5 – Primary Key Drawback – and Workaround August 17, 2011

Posted by mwidlake in Architecture, development, performance, Uncategorized.
18 comments

<..IOT1 – the basics
<….IOT2 – Examples and proofs
<……IOT3 – Significantly reducing IO
<……..IOT4 – Boosting Buffer Cache efficiency
……….>IOT6a – Slowing Down Insert
…………>IOT6(B) – OLTP Inserts

One of the drawbacks of IOTs is that they have to be organised by the primary key of the table. If your table does not have a primary key, it cannot be Index Organized.

I would argue that any table that holds persistent data (ie it is not transient data about to be loaded into the database proper or a temporary working set) should have a Primary Key. If I am working on a system and come across a table without a Primary Key I immediately challenge it. {There are occasional, valid reasons for a persistent table to lack a PK, but I confess I am struggling right now to come up with one – but I digress}. I’m a big fan of database-enforced referential integrity.

The problem is, if you are making a table into an Index Organized Table so that the records are clustered to match how you process the data, it could well be that the primary key is not related to how you want to order the data. Let me give you an example. {Oh, and for brevity, I’ll put the SQL statements to create the examples at the end of this post}.

mdw11> desc ACCOUNT
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- ----------------------
 ACCO_TYPE                                             NOT NULL NUMBER(2)  ---PK
 ACCO_ID                                               NOT NULL NUMBER(10) ---PK
 NAME                                                  NOT NULL VARCHAR2(100)
 DATE_1                                                NOT NULL DATE
 NUM_1                                                          NUMBER(2)
 NUM_2                                                          NUMBER(2)

mdw11> desc TRANSACTION_HEAP
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- ----------------------
 TRAN_TYPE                                             NOT NULL NUMBER(2)  ---PK
 TRAN_ID                                               NOT NULL NUMBER(10) ---PK
 ACCO_TYPE                                             NOT NULL NUMBER(2)
 ACCO_ID                                               NOT NULL NUMBER(10)
 CRE_DATE                                              NOT NULL DATE
 VC_1                                                  NOT NULL VARCHAR2(100)
 DATE_1                                                         DATE
 NUM_1                                                          NUMBER(2)
 NUM_2                                                          NUMBER(2)

This is a classic parent-child relationship, each account has a set of transactions. I’ve expanded on my prior example by:

  • changing the parent to be called ACCOUNT and giving it a two-part Primary Key, ACCO_TYPE and ACCO_ID.
  • Changing the child to be called TRANSACTION and giving it a Primary Key of TRAN_TYPE and TRAN_ID.
  • In a real system I would create a foreign key from TRANSACTION.ACCO_TYPE,ACCO_ID to the ACCOUNT table primary key.

Note that the Primary Key on the TRANSACTION table is NOT based on the account columns. Maybe in theory the primary key on the transaction table would be the account columns plus the CRE_DATE – if the CRE_DATE held a datetime AND two records could not be created in the same second. If we used a timestamp then you might be able to argue that no two records would be created in the same fraction of a second – except that transactions often get given a fixed time. Midnight springs to mind (consider when you would add the accrued interest on a savings account). So a new surrogate Primary Key is introduced: a transaction type and ID. TRAN_TYPE and TRAN_ID are the primary key of the TRANSACTION table.

I’d say that I see such two-part primary keys more often than single-column primary keys these days. Possibly because so many databases receive information from other systems, or even other applications on the same database.

As before, I create 10,000 parent records (ACCOUNT) and 10,000 random child records (TRANSACTION_HEAP) each day for 100 days. 
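The post promises the creation statements at the end; a minimal sketch of the two tables, based purely on the DESC output above, might look like this (the constraint names and the absence of storage clauses are my assumptions, not the author's actual script):

```sql
-- Sketch only: column types taken from the DESC listings above;
-- constraint names (acco_pk, trhe_pk) are illustrative.
CREATE TABLE account
(acco_type  NUMBER(2)     NOT NULL
,acco_id    NUMBER(10)    NOT NULL
,name       VARCHAR2(100) NOT NULL
,date_1     DATE          NOT NULL
,num_1      NUMBER(2)
,num_2      NUMBER(2)
,CONSTRAINT acco_pk PRIMARY KEY (acco_type, acco_id)
);

CREATE TABLE transaction_heap
(tran_type  NUMBER(2)     NOT NULL
,tran_id    NUMBER(10)    NOT NULL
,acco_type  NUMBER(2)     NOT NULL
,acco_id    NUMBER(10)    NOT NULL
,cre_date   DATE          NOT NULL
,vc_1       VARCHAR2(100) NOT NULL
,date_1     DATE
,num_1      NUMBER(2)
,num_2      NUMBER(2)
,CONSTRAINT trhe_pk PRIMARY KEY (tran_type, tran_id)
);
```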

Also as before, I want to select information grouped by account. I want all the transactions for an account, not all transactions on a day or for a range of transaction IDs. Hopefully this is a scenario most of you will recognise. 

Selecting a sum of one of the non-indexed columns and a count of records for a given account takes quite a bit of effort on the part of the HEAP table:

select sum(num_1), count(*) from transaction_heap th where acco_type=10 and acco_id=123

SUM(NUM_1)   COUNT(*)
---------- ----------
      1201        116
Elapsed: 00:00:02.68

Execution Plan
---------------------------------------------------------------------------------------
| Id  | Operation          | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                  |     1 |    10 |  3466   (1)| 00:00:52 |
|   1 |  SORT AGGREGATE    |                  |     1 |    10 |            |          |
|*  2 |   TABLE ACCESS FULL| TRANSACTION_HEAP |   100 |  1000 |  3466   (1)| 00:00:52 |
---------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      13929  consistent gets
      13921  physical reads

Of course, it has to do a full table scan as my Primary Key is on two columns that have nothing to do with the query. I can repeat this statement as often as I like, it takes the same number of physical reads and consistent gets as it is not caching the information.
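The index added in the next step shows up as TRHE_ACCO_CRDA_IDX in the execution plan that follows; creating it would be something along these lines (a sketch – the exact options used in the post are not shown):

```sql
-- Supporting index for lookups by account and creation date.
-- The name matches the execution plan; other options are assumed.
CREATE INDEX trhe_acco_crda_idx
ON transaction_heap (acco_type, acco_id, cre_date);
```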

I add an index on the ACCO_TYPE, ACCO_ID and CRE_DATE columns and re-run the query:

select sum(num_1),count(*) from transaction_heap th where acco_type=10 and acco_id=123

SUM(NUM_1)   COUNT(*)
---------- ----------
      1201        116
Elapsed: 00:00:00.01

Execution Plan
---------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                    |     1 |    10 |   103   (0)| 00:00:02 |
|   1 |  SORT AGGREGATE              |                    |     1 |    10 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| TRANSACTION_HEAP   |   100 |  1000 |   103   (0)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN          | TRHE_ACCO_CRDA_IDX |   100 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        120  consistent gets
          0  physical reads

I ran it twice to get rid of the parse overhead, but the first time it did a load of physical reads to support those 120 consistent gets.

I could recreate the TRANSACTION_HEAP table as an IOT of course – but it will be organized by the TRAN_TYPE and TRAN_ID columns. That is useless to me. Even if I add a secondary index on the ACCO_TYPE, ACCO_ID and CRE_DATE columns it will at best be no better than the above HEAP table and, because the secondary index will hold rowid guesses and will sometimes have to use the primary key information to walk down the index, it will be worse. {I am not sure I have explained that bit yet about row guesses. Post 6?}

So, if you want the information organized in an order that is not helped by the Primary Key of the table, an IOT is useless to you. You cannot achieve that physical record grouping by the IOT method.

I am going to do something else though. I’m going to sort of change the rules to work around the issue.

As far as the physical implementation is concerned, a Primary Key is in effect just a unique index and two rules. The rules are that all the columns in the Primary Key must be mandatory and there can only be one PK on a table. I can have as many unique indexes as I like, so long as the key combinations lead to no duplicate rows. I can alter my Primary Key – it is not set in stone.
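For instance, swapping the Primary Key on a heap table is just a couple of statements, assuming no foreign keys currently reference it (a sketch; the constraint name is my invention):

```sql
-- Drop the old Primary Key, then add the new, wider one.
-- Constraint name trhe_pk is illustrative, not from the original post.
ALTER TABLE transaction_heap DROP CONSTRAINT trhe_pk;
ALTER TABLE transaction_heap
  ADD CONSTRAINT trhe_pk
  PRIMARY KEY (acco_type, acco_id, cre_date, tran_type, tran_id);
```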

Before I go any further I am going to stress that I am about to abuse the concept of the Primary Key. I’d need a separate blog post to fully justify saying what a Primary Key is, but part of the concept is that no column must be derivable from other columns in the PK and it must be the minimum number of columns required to make the key unique.

We want to group the data by the account columns and the creation date. So let’s define a Primary Key that is ACCO_TYPE, ACCO_ID, CRE_DATE and whatever else we need to guarantee the key is unique. In our case that would be TRAN_TYPE and TRAN_ID – the current Primary Key! If I knew I would always want all records for the account, I could drop the CRE_DATE out of my fake Primary Key, but I know that the creation date is very often important. You may want activity for the last month, last quarter, a stated date or even an exact datetime. For all those cases, including the CRE_DATE column is highly beneficial.

So, I create TRANSACTION_IOT below and populate it with data.
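A sketch of what that creation might look like – the ORGANIZATION INDEX clause is what makes it an IOT, the PK column order matches the TRIO_PK index columns shown in the statistics below, and everything else (storage, PCTTHRESHOLD and so on) is omitted as an assumption:

```sql
-- IOT clustered by account, creation date, then the surrogate
-- transaction key. TRIO_PK matches the index name in the stats below.
CREATE TABLE transaction_iot
(tran_type  NUMBER(2)     NOT NULL
,tran_id    NUMBER(10)    NOT NULL
,acco_type  NUMBER(2)     NOT NULL
,acco_id    NUMBER(10)    NOT NULL
,cre_date   DATE          NOT NULL
,vc_1       VARCHAR2(100) NOT NULL
,date_1     DATE
,num_1      NUMBER(2)
,num_2      NUMBER(2)
,CONSTRAINT trio_pk PRIMARY KEY (acco_type, acco_id, cre_date
                                ,tran_type, tran_id)
)
ORGANIZATION INDEX;
```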

desc transaction_iot
 Name                                                        Null?    Type
 ----------------------------------------------------------- -------- --------------
 TRAN_TYPE                                                   NOT NULL NUMBER(2)
 TRAN_ID                                                     NOT NULL NUMBER(10)
 ACCO_TYPE                                                   NOT NULL NUMBER(2)
 ACCO_ID                                                     NOT NULL NUMBER(10)
 CRE_DATE                                                    NOT NULL DATE
 VC_1                                                        NOT NULL VARCHAR2(100)
 DATE_1                                                               DATE
 NUM_1                                                                NUMBER(2)
 NUM_2                                                                NUMBER(2)

--
--

OWNER    TABLE_NAME          NUM_ROWS      BLOCKS AVG_L GLS ULS LST_ANL      PRT  SAMP_SIZE
-------- -------------- ------------- ----------- ----- --- --- ------------ --- ----------
MDW      TRANSACTION_IOT     1000,000                94 YES NO  160811 23:05 NO     1000000
INDEX_NAME      TYP PRT UNQ BL     L_BLKS   DIST_KEYS       CLUSTF     LB_KEY     DB_KEY LST_ANL
--------------- --- --- --- -- ---------- ----------- ------------ ---------- ---------- ------------
TRIO_PK         IOT NO  UNI  2     21,433    1058,381            0          1          1 160811 23:05
TRIO_TRAN_UQ    NOR NO  UNI  2      4,386    1000,000      999,405          1          1 160811 23:05

INDEX_NAME                   TABLE_NAME       PSN COL_NAME
---------------------------- ---------------- --- ------------------------------------------------
TRIO_PK                      TRANSACTION_IOT  1   ACCO_TYPE
TRIO_PK                      TRANSACTION_IOT  2   ACCO_ID
TRIO_PK                      TRANSACTION_IOT  3   CRE_DATE
TRIO_PK                      TRANSACTION_IOT  4   TRAN_TYPE
TRIO_PK                      TRANSACTION_IOT  5   TRAN_ID
TRIO_TRAN_UQ                 TRANSACTION_IOT  1   TRAN_TYPE
TRIO_TRAN_UQ                 TRANSACTION_IOT  2   TRAN_ID

Now let’s select our data from that IOT.

select sum(num_1),count(*) from transaction_IOT th where acco_type=10 and acco_id=123

SUM(NUM_1)   COUNT(*)
---------- ----------
      1030         97
Elapsed: 00:00:00.00

Execution Plan
-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |     1 |    10 |     5   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |         |     1 |    10 |            |          |
|*  2 |   INDEX RANGE SCAN| TRIO_PK |   100 |  1000 |     5   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads

5 consistent gets. It has walked down the IOT and scanned 3 blocks to collect that data. Our IOT, based on an abused Primary Key, does the job of supporting range scans efficiently, with the benefits to the Block Buffer Cache I referred to in IOT4.

That “Primary Key” I created is NOT a real Primary Key. It is not the minimum number of columns I need to uniquely identify a record. My Primary Key is on ACCO_TYPE, ACCO_ID, CRE_DATE, TRAN_TYPE and TRAN_ID – the account, the datetime of the transaction and the transaction. What if I was to alter the datetime by a second? I could create a record with the same account and the same transaction id as an existing record, but a second into the future. That is just wrong. After all, the whole point of the TRAN_TYPE and TRAN_ID is to uniquely identify a record. If I created the new record I described above, there would be two records for the one TRAN_TYPE/TRAN_ID.

I protect against this ability to create incorrect records by also creating a UNIQUE KEY on the table, against the columns TRAN_TYPE and TRAN_ID. This is the unique index TRIO_TRAN_UQ displayed in the information above. A Primary Key is usually the referenced parent of any referential integrity, ie foreign keys, between this table and any children. However, a Unique Key can also be the target of Referential Integrity. I cannot create a record in TRANSACTION_IOT with the same TRAN_TYPE/TRAN_ID as already exists, due to this unique constraint:

insert into transaction_iot_p
values
(2,163 -- existing transaction type and id
,10,11111
,sysdate,'ASCAFWEWEHGWSHERJH',SYSDATE,7,7)
/

insert into transaction_iot_p
*
ERROR at line 1:
ORA-00001: unique constraint (MDW.TIP_TRAN_UQ) violated

Elapsed: 00:00:00.34

So, I have my IOT to support the querying code and I have my Unique Constraint to police my original Primary Key and act as the target for any Foreign Key requirements I might need. This is not a perfect solution – the design will look a little strange to anyone who looks at this database, and the Unique Key is supported by a secondary index on an IOT, which can have its own issues. But it does work.

My “primary key” is no longer a true Primary Key. It is just a tool for allowing me to organise the data physically in a way that will support my application. That is what I meant about changing the rules.

I am willing to abuse a Primary Key in this way because of the performance benefits. It is a solution for a system where most of the query access is against a set of records which would be scatter-gunned across a table if you did not use some sort of physical grouping. If you are reading this and thinking “oh, I am not sure about you doing that to a Primary Key, Martin” then you are probably OK to consider this solution. If you can’t see a problem with it then either you are very used to turning off referential integrity and understand the consequences – or you simply do not understand what RI does for your database. If you are in the latter camp, do not even consider doing this. And if you are one of those people who work on data warehouses and for whom it is just part of the DW process to turn off RI, as that is what you do for data warehouses – DON’T do this!

OK, I’m nearly at the end of this topic but I want to touch on partitioning. You can range partition an Index Organized Table from 9i, I think; it is certainly supported in Oracle 10 upwards. Partitioning is important in this technique because a unique index must contain the partition key if the index is to be locally partitioned – otherwise the index must be global, ie the one index object references all the partitions across the table.
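As a quick sketch of that rule (illustrative only – I have not reproduced an actual run, and the index name here is made up), Oracle rejects a locally partitioned unique index that omits the partition key:

```sql
-- A local unique index on a table partitioned by CRE_DATE must include
-- CRE_DATE; this attempt on (TRAN_TYPE, TRAN_ID) alone fails.
create unique index tip_tran_uq_l
on transaction_iot_p (tran_type, tran_id)
local;
-- ORA-14039: partitioning columns must form a subset of key columns of a UNIQUE index
```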

Below is my table creation statement for the IOT, organized by the account, creation date and transaction. The table is range partitioned by CRE_DATE, into months.

create table transaction_IOT_P
(tran_type number(2)     not null
,tran_id   number(10)    not null
,acco_type number(2)     not null
,acco_id   number(10)    not null
,cre_date  date          not null
,vc_1      varchar2(100) not null
,date_1    date
,num_1     number(2)
,num_2     number(2)
,constraint tip_pk primary key(ACCO_TYPE,ACCO_ID,CRE_DATE,TRAN_TYPE,TRAN_ID) 
--  using index tablespace index_01
,constraint tip_tran_uq unique (TRAN_TYPE,TRAN_ID)
  using index tablespace index_01
)
organization index
tablespace data_01
partition by range  (cre_date)
(partition rm20110601 values less than (to_date('01-06-2011','DD-MM-YYYY')) 
  tablespace data_01
,partition rm20110701 values less than (to_date('01-07-2011','DD-MM-YYYY'))
  tablespace data_01
,partition rm20110801 values less than (to_date('01-08-2011','DD-MM-YYYY'))
  tablespace data_01
,PARTITION RMTOP  VALUES LESS THAN (MAXVALUE)
  tablespace USERS
)
/

You can see the definition of my fake Primary Key and the fact that it does not have a tablespace defined for it – the ‘organization index’ clause lower down makes the table an IOT, so the primary key segments go into the “table” tablespace. I then state my Unique Index to police the integrity of my table – TIP_TRAN_UQ. I then state the partition clause, ‘partition by range (cre_date)’, followed by my initial partition definitions. It’s as simple as that to partition an IOT.

What gets created? A set of four segments for the IOT, which are primary key index segments of course, not table segments:

@seg_dets
Enter value for seg_name: tip_pk
Enter value for owner: mdw

OWNER    SEG_NAME          SEG TS_NAME     BYTES_K    BLOCKS exts   INI_K   NXT_K
-------- ----------------- --- -------- ---------- --------- ---- ------- -------
MDW      TIP_PK RM20110601 IP  DATA_01      45,056     5,632   59      64    1024
MDW      TIP_PK RM20110701 IP  DATA_01      60,416     7,552   74      64    1024
MDW      TIP_PK RM20110801 IP  DATA_01      61,440     7,680   75      64    1024
MDW      TIP_PK RMTOP      IP  USERS        34,816     4,352   49      64    1024

Note that the SEG (type) is “IP” – my script decodes the type into a short mnemonic and IP is Index Partition. You can see the tablespaces those segments are in and the size of the segments. What about that unique index I created?

@seg_dets
Enter value for seg_name: tip_tran_uq
Enter value for owner: mdw

OWNER    SEG_NAME        SEG TS_NAME     BYTES_K    BLOCKS exts   INI_K   NXT_K
-------- --------------- --- -------- ---------- --------- ---- ------- -------
MDW      TIP_TRAN_UQ     IND INDEX_01     35,840     4,480   50      64    1024

It is a single segment, a normal index. I cannot have it as a locally partitioned index as it is a unique index and lacks the partitioning key in its definition.

This could be a problem. The usual reason you partition a table is because it is too large to comfortably be held as a single segment {and also for the benefit of partition exclusion, but you don’t usually need that on small tables!}. This means that the global index supporting that unique key is going to be large. Now, I made a “mistake” when I created my partitioned IOT – I did not create a partition for this month, so some data has gone into the MAXVALUE partition (see the size of the segment above, 34MB and 49 extents). If I split that last partition to create a new partition for this month and a new MAXVALUE partition, I will invalidate the global index and will have to rebuild it. Very large indexes can take a long time and a heck of a lot of temporary space to gather and sort the data. That could be an ongoing maintenance nightmare.
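To sketch the maintenance involved (illustrative, using my example names – I have not shown a run of this, and recent versions can skip the invalidation if one side of the split is empty, which is not the case here):

```sql
-- Split the MAXVALUE partition to add one for September 2011.
alter table transaction_iot_p
split partition rmtop at (to_date('01-09-2011','DD-MM-YYYY'))
into (partition rm20110901 tablespace data_01
     ,partition rmtop      tablespace users)
/

-- The global unique index is now marked UNUSABLE and must be rebuilt,
-- reading and sorting the whole table's worth of keys.
alter index tip_tran_uq rebuild tablespace index_01
/
```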

In a recent implementation I did using IOTs, I did not create a global unique index to police the original primary key. I created a non-unique, locally partitioned index to support some queries using those columns, and the table had no children so no Foreign Keys were needed. But there was something else I needed to do, as I had removed the referential integrity rules for that table. Remember I said I am a fan of database-enforced referential integrity? Now, I “know” the application will not create data that will break the removed Primary Key rule, and I “know” I documented what I had done. And I know that in 12 months’ time there will almost certainly be data with duplicate values for that Primary Key if it is not enforced somehow, because it always happens. I need to implement a little script to regularly check for duplicate TRAN_TYPE/TRAN_ID combinations being created. If you remove RI from a relational database, you should replace it in some way. Otherwise, you will pretty soon have a non-relational database.
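A minimal sketch of the sort of check I mean, against my example table (the scheduling and alerting around it are left out):

```sql
-- Any TRAN_TYPE/TRAN_ID pair appearing more than once breaks the
-- business rule the removed Primary Key used to enforce.
select tran_type, tran_id, count(*)
from   transaction_iot_p
group  by tran_type, tran_id
having count(*) > 1
/
```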

That’s it for this topic. Below is my example script for creating most of the above, in case anyone wants it or wants to verify what I have said.

-- test_iot2.sql
-- create test tables to show how you can work around the PK issue and
-- partition an IOT - and the possible impact on my PK workaround.
spool test_iot2.lst
--
set feed on timi on pause off
--
drop table account purge;
drop table transaction_heap purge;
drop table transaction_iot purge;
drop table transaction_iot_p purge;
--
-- create 10,000 parent records
create table mdw.account
(ACCO_type  number(2)     not null 
,ACCO_id       number(10)    not null 
,name     varchar2(100) not null
,date_1   date          not null
,num_1    number(2)
,num_2    number(2) 
,constraint ACCO_pk primary key(ACCO_type,ACCO_id) 
 using index tablespace index_01
)
tablespace data_01
/
insert into account
select 10 
,rownum
,dbms_random.string('U',mod(rownum,10)+50)
,sysdate-(mod(rownum,500)+1000)
,mod(rownum,99)+1
,trunc(dbms_random.value(0,100))
from dual connect by level <= 5000
/
insert into account
select 15 
,rownum
,dbms_random.string('U',mod(rownum,10)+50)
,sysdate-(mod(rownum,500)+1000)
,mod(rownum,99)+1
,trunc(dbms_random.value(0,100))
from dual connect by level <= 5000
/
--
-- create the table to hold the children as a heap table
create table transaction_heap
(tran_type number(2)     not null
,tran_id   number(10)    not null
,ACCO_type number(2)     not null
,ACCO_id   number(10)    not null
,cre_date  date          not null
,vc_1      varchar2(100) not null
,date_1    date
,num_1     number(2)
,num_2     number(2)
,constraint trhe_pk primary key(tran_type,tran_id) 
 using index tablespace index_01
)
tablespace data_01
/
--
create index trhe_ACCO_crda_idx
on transaction_heap(ACCO_type,ACCO_id,cre_date)
tablespace index_01
/
-- populate the Heap table
-- 100 days, 10000 people
declare
v_num number :=10000; -- number of people
v_str varchar2(60);
begin
dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF'));
for i in 1..100 loop --days to do
  v_str:=dbms_random.string('U',60);
  insert into transaction_heap
    (tran_type,tran_id,ACCO_type,ACCO_id,cre_date,vc_1,date_1,num_1,num_2)
  select mod(rownum,3)+1 
   ,((i-1)*v_num)+rownum
   , 5+(trunc(dbms_random.value(1,3))*5)
   ,trunc(dbms_random.value(1,v_num/2))
   ,sysdate-(100-i) + (rownum/(60*60*24) )
   ,substr(v_str,1,51+mod(rownum,10))
   ,sysdate-(100-i) + ((mod(rownum,30)+1)/3)
   ,mod(rownum,20)+1
   ,mod(rownum,99)+1
  from dual connect by level <=v_num;
end loop;
dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF'));
end;
/
--
--
--
create table transaction_IOT
(tran_type number(2)     not null
,tran_id   number(10)    not null
,ACCO_type number(2)     not null
,ACCO_id   number(10)    not null
,cre_date  date          not null
,vc_1      varchar2(100) not null
,date_1    date
,num_1     number(2)
,num_2     number(2)
,constraint trio_pk primary key(ACCO_type,ACCO_id,cre_date,tran_type,tran_id) 
--  using index tablespace index_01
,constraint trio_tran_uq unique (tran_type,tran_id)
  using index tablespace index_01
)
organization index
tablespace data_01
/
--
-- populate the IOT table
-- 100 days, 10000 people
declare
v_num number :=10000; -- number of people
v_str varchar2(60);
begin
dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF'));
for i in 1..100 loop --days to do
  v_str:=dbms_random.string('U',60);
  insert into transaction_IOT
    (tran_type,tran_id,ACCO_type,ACCO_id,cre_date,vc_1,date_1,num_1,num_2)
  select mod(rownum,3)+1 
   ,((i-1)*v_num)+rownum
   , 5+(trunc(dbms_random.value(1,3))*5)
   ,trunc(dbms_random.value(1,v_num/2))
   ,sysdate-(100-i) + (rownum/(60*60*24) )
   ,substr(v_str,1,51+mod(rownum,10))
   ,sysdate-(100-i) + ((mod(rownum,30)+1)/3)
   ,mod(rownum,20)+1
   ,mod(rownum,99)+1
  from dual connect by level <=v_num;
end loop;
dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF'));
end;
/
create table transaction_IOT_P
(tran_type number(2)     not null
,tran_id   number(10)    not null
,ACCO_type number(2)     not null
,ACCO_id   number(10)    not null
,cre_date  date          not null
,vc_1      varchar2(100) not null
,date_1    date
,num_1     number(2)
,num_2     number(2)
,constraint tip_pk primary key(ACCO_type,ACCO_id,cre_date,tran_type,tran_id) 
--  using index tablespace index_01
,constraint tip_tran_uq unique (tran_type,tran_id)
  using index tablespace index_01
)
organization index
tablespace data_01
partition by range  (cre_date)
(partition rm20110601 values less than (to_date('01-06-2011','DD-MM-YYYY')) 
  tablespace data_01
,partition rm20110701 values less than (to_date('01-07-2011','DD-MM-YYYY'))
  tablespace data_01
,partition rm20110801 values less than (to_date('01-08-2011','DD-MM-YYYY'))
  tablespace data_01
,PARTITION RMTOP  VALUES LESS THAN (MAXVALUE)
  tablespace USERS
)
/
-- populate the IOT_P table
-- 100 days, 10000 people
declare
v_num number :=10000; -- number of people
v_str varchar2(60);
begin
dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF'));
for i in 1..100 loop --days to do
  v_str:=dbms_random.string('U',60);
  insert into transaction_IOT_P
    (tran_type,tran_id,ACCO_type,ACCO_id,cre_date,vc_1,date_1,num_1,num_2)
  select mod(rownum,3)+1 
   ,((i-1)*v_num)+rownum
   , 5+(trunc(dbms_random.value(1,3))*5)
   ,trunc(dbms_random.value(1,v_num/2))
   ,sysdate-(100-i) + (rownum/(60*60*24) )
   ,substr(v_str,1,51+mod(rownum,10))
   ,sysdate-(100-i) + ((mod(rownum,30)+1)/3)
   ,mod(rownum,20)+1
   ,mod(rownum,99)+1
  from dual connect by level <=v_num;
end loop;
dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF'));
end;
/
commit;
--
exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>'ACCOUNT')
exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>'TRANSACTION_HEAP')
exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>'TRANSACTION_IOT')
exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>'TRANSACTION_IOT_P')
--
select * from transaction_iot_p
where rownum < 10
/
insert into transaction_iot_p
values
(2,163 -- existing transaction type and id
,1,11111
,sysdate,'ASCAFWEWEHGWSHERJH',SYSDATE,7,7)
/
insert into transaction_iot_p
values
(3,163 -- new transaction type and id
,1,11111 -- but the whole of the rest of the record is the same.
,sysdate,'ASCAFWEWEHGWSHERJH',SYSDATE,7,7)
/
--
BEGIN
dbms_output.put_line (to_char(SYSTIMESTAMP,'HH24:MI:SS.FF'));
END;
/
--
spool off

What, me? An OakTable member? August 23, 2010

Posted by mwidlake in Uncategorized.
Tags: , ,
22 comments

The title rather gives it away, but I have been invited to become a member of the OakTable network. For anyone not aware of the OakTable, it is a group of some of the very best Oracle practitioners around and you would recognise many of the names in the group. Most of them also present at conferences around the globe and set up the Oak Table challenge at various of these venues, where they try to address any Oracle-based question you might have.

All of them are very bright, all are very knowledgeable.

Which is where I come a little unstuck. Without wanting to sound like some vapid actor at a Hollywood award ceremony decrying “I am so unworthy of this nomination”, whilst secretly thinking “I so deserve this”… Well, my initial thought when receiving the invite was “I am so unworthy…”. I’ve had the weekend to think about it though. And I still think “I am so unworthy…”

I’m actually on record as suggesting that we might also need a “Formica Table”, though the only online reference to it I can now find {I MUST put my old presentations on my web site} is from the archives of Andrew Clark’s Radiofreetooting blog about a presentation I did at the UKOUG in 2007. {Follow the link and search for Widlake or Formica, it is way down the bottom}. If you can’t be bothered looking at the original, Andrew said this:

I was particularly taken with the Formica Table. This would be a forum where “not bad” DBAs answered questions which fitted 95% of all scenarios; sort of an Oak Table For The Rest Of Us.

I think his quote of me was actually better than the original. The idea was that the real experts on the Oak Table {is it actually one word guys? “OakTable”!?} deal with the hard, tricky, complex issues and this secondary formica table could deal with the rest of the world. Because I could just about cope with formica level. The intention being, of course, that I would sit on said plastic-laminate-coated-chipboard table.

Am I being falsely modest here? I do not think I am. I know I am good at what I do and I know I have achieved some impressive things. I also know most people who employ me ask me to stay longer (and I usually do). But I am realistic. I’m very good but I am not fantastic (at Oracle anyway :-) ). And no way as capable as many OakTable members. But the people on the OakTable have some other things in common. From the home page of the website:

The OakTable network is a network for the Oracle scientist, who believes in better ways of administering and developing Oracle based systems.

The impression I get from spending some time with the handful of members of the OakTable that I already know is that they generally all feel you need not only to be knowledgeable about Oracle (or whatever area of knowledge you are interested in) but to be able to demonstrate and show that the knowledge is real. You create test cases and try things out. Just saying “you should use a large block size for data warehouses” is not really enough; it is so much more powerful if you can say why you think this is so and then produce test cases to show it. And if someone produces a test case to show the opposite, well, you need to reconsider. It is what is at the core of the scientific method. You test things and have to adapt or change if the tests refute your theory. If someone will not provide test cases or real-world examples to support their facts, then they are, in fact, opinions. Which is fine, just don’t sell them as facts.

The other common thread is a willingness (and perhaps a worrying compulsion) to teach. I’ve seen many of the OakTable present and I know a lot of them do courses all over the globe. Sometimes it is paid work, often it is not, it is done as a benefit to the community. That is nearly always the case with user group presentations.

I’m figuring that is why I’ve been invited to join. Technically, most if not all the OakTable are a step or three better than me and I reserve my right to respect that. But I really believe in demonstrating what you think is going on with Oracle is what is really going on and I have an almost worryingly compulsive willingness to teach.

So, have I turned down the invite? Are you kidding!?! It’s great to be invited and I really look forward to having more to do with this bunch of talented and helpful people. And I am also looking forward to contributing my little bit to the group and, through it, to the wider Oracle community.

It is slightly ironic that I have been asked to join a group of people right now who are characterised by their willingness and drive to scientifically investigate and then disseminate information on Oracle-based technology when I have spent the last month doing nothing of the sort. I have been digging ditches, cleaning out ponds, chopping down trees and doing major DIY, all of which I am utterly unsuited to but I enjoy. So I now feel obliged to stop that, pick up a keyboard and continue to investigate the edges of my ignorance. I’ll try and keep you informed of progress.

Oh, and I have another problem now. How do I get the OakTable Icon onto this blog? Somewhere on the right I think…

Friday Philosophy – To Manage or Not to Manage March 26, 2010

Posted by mwidlake in Friday Philosophy, Management, Uncategorized.
Tags: ,
6 comments

Recently a friend of mine, Graham Oaks, blogged about his decision to step back from management and return to the Technical Coal Face.

I made a similar decision 3 or 4 years back, so I have a lot of empathy for his position and his decision. I found that doing the job of a manager takes up a lot more time, effort, patience and emotional energy than I had realised. Team leading is bad enough, having to coordinate the efforts of half a dozen people and sort out the myriad issues they throw your way. Being in charge of multiple teams, responsible for strategy, dealing with staff development and morale, being a buffer against HR and having to deal with the politics created by people who WANT to be managers and wield power is more than a full-time job. Trying to hold onto a technical element as well, I found I could only manage by doing the technical job as a “hobby”, in my own time. It was just too much to keep going year after year.

I had to choose: give up the technical work to leave me enough personal resource to remain a manager and get better at it, or stop being a manager and start regaining my technical skills. I chose the latter.

Since I made my decision 3 years ago, I have met several people who have made the same conscious decision to step away from management and return to a more technical role. You may stand to earn more as a manager {which is something I objected to before being a manager and still object to having been one – it should be possible to earn the same doing either}, but for some of us that is not enough to make losing the hands-on work a sacrifice worth making.

One of the points Graham makes in his blog is that his spell as a manager has given him an appreciation of the challenges of management and the particular hells and stresses of the role. I think this is something that people who have never been managers have trouble really understanding.

I was working with a guy a couple of years ago and he was telling me how much of “a Moron” his boss was. In fact, he felt his current boss was even more of a moron than his previous boss. He then confessed that all of his bosses had been morons. “What, every single one of them?” I asked. Yes, absolutely all of them. That struck me as incredibly unfortunate, that every single one of these managers (and he’d had a lot as he moved between teams and positions on a regular basis), most of whom had come up through the technical ranks, were all Morons. I pointed out this unfortunate coincidence and wondered if there might actually be a common factor with all of these managers. He told me there was; They were all Morons.

He himself had never been a manager. He said he was too smart. Not smart enough to get what I was hinting at with the common factor suggestion though.

Obviously, some managers are poor at what they do; there are poor people in every job. But something I took away from my time being a manager is a lack of empathy for anyone saying all managers are a waste of time when they have never done the job themselves.

After all, I doubt there is any job where just doing it means you are an idiot.

Except Sys Admins – They are all idiots :-) (ducks behind server).

Friday Philosophy – Software being Good is not Good Enough February 5, 2010

Posted by mwidlake in Friday Philosophy, Uncategorized.
Tags: ,
4 comments

In a previous Friday Philosophy on In Case of Emergency I mentioned that something being simply a “Good Idea” in technology is not good enough. Even being a GREAT idea is not enough. It also has to be:

  1. Easy and simple to use. After all, using that funny stick thing behind the steering wheel in a car, to indicate which direction you are turning, seems to be too much effort for many people. If installing some bit of software or using a web page is more than a little bit of effort, most people will not bother.
  2. Quick. No one has patience any more, or spare time. This will probably be the fourth year in a row I do not plant any vegetables in our garden, as I need to spend a day or two clearing and digging over the spot for said veg. You can’t beat home-grown veg. Similarly, I won’t use a web page that takes as long to load as it does to plant a carrot seed.
  3. Known about. There could be a really fantastic little program Out There that allows you to take a screen shot, add a title and comment and pastes it straight into a document for you, converting to half a dozen common formats on the fly. But I do not know about it. { ScreenHunter is pretty good, I have to say, and when I have shown it to people a lot of them like it}.
  4. Popular. This is not actually the same as “known about”. For a stand-alone application to be good for you, you just need to know it exists. Take a free building-architecture package: whether thousands of people use it is moot, so long as you can get your extension drawings done in it with ease. But for something that relies on the community, like a service to rate local eateries, unless lots of people use it and add ratings, well, who cares? There are dozens (if not hundreds) of such community “good ideas” started every day, but unless enough people start to use one, it will fizzle out, as the vast majority of them do.

Point 4 is highly relevant to “In Case Of Emergency” as it is simple, quick and relatively known about. It just needs to be ubiquitous.

I became very aware of point 3 a few years ago and also of the ability for very clever people to be sometimes very stupid when it comes to dealing with their fellow humans.

I was working on a database holding vast quantities of DNA information. If you don’t know, DNA information is basically represented by huge long strings of A, C, T and G. So something like AACTCGTAGGTACGGGTAGGGGTAGAGTTTGAGATTGACTGAGAGGGGGAAAAATGTGTAGTGA…etc, etc, etc. These strings are hundreds, thousands, even hundreds of thousands of letters long. And scientists like to search against these strings, of which there are millions and millions. Not for an exact match, mind, but for kind-of-similar, fuzzy matches where, for example, 95% of the ACTGs match but some do not. It’s called a BLAST match.

Anyway, suffice to say, it takes a lot of compute power to do this and a fair amount of time to run. There was a service in America which would allow you to submit a BLAST query and get the answer in 20 minutes or so {I have no idea how fast it is now}.

Some extremely clever chaps I had the pleasure of working with came up with a faster solution. Same search, under 5 seconds. Now that is GREAT. We put together the relevant hardware and software and started the service. I thought it went beyond Good or even Great. It was Amazing (and I mean it, I was amazed we could do a fuzzy search against a billion such strings in 2 or 3 seconds using a dozen or so PC-type servers).

No one used it. This was because almost no one knew about it and there was already this slow service people were used to using. People who used the old service never really thought to look for a new one and the chances were they would not have found ours anyway.

I pushed for more to be made of this new, faster service, that it should be advertised to the community, that it should be “sold” to people (it was free to use, by “sold” I mean an attempt made to persuade the scientific community it was worth their while investigating). The response I was given?

“If the service is worth using, people will come and use it”.

No they won’t. And indeed they didn’t. It was, I felt, a stupid position for an incredibly intelligent person to take. How were people to know it existed? Were they just supposed to wake up one morning knowing a better solution was out there? Would the internet pixies come along in the night and whisper about it in your ear? In the unlikely event of someone who would be interested in it just coming across it, were they then going to swap to using it? After all, no one else seemed to know about it and it was 2 orders of magnitude faster – suspiciously fast; how could it be any good?

The service got shut down as it was just humming in the corner consuming electricity. No one knew it existed, no one found it, no one came. I can’t help but wonder how much it could have helped the scientific community.

There must be thousands of other “failed” systems across the IT world that never took off just because the people who could use it never knew it existed. Depressing huh?
