jump to navigation

Saturday Philosophy – The unbelievably small world of VLDBs June 12, 2010

Posted by mwidlake in VLDB.
Tags: , , ,
9 comments

Yesterday I posted about the potential for a Oracle in Science community within the UK Oracle user group {and wider for that matter, there is after all a world Oracle Life Science community but it is currently less vibrant than it was, sadly}.

My friend and occasional drinking partner Peter Scott replied to say he felt there was “a place for a SIG for stonking great databases” {now wouldn’t SGDB be a better TLA than VLDB? :-) }.

Well, I would agree but for one small detail. An apparent lack of anyone willing to be part of the community.

When I was building a very considerable VLDB {and I’m sorry I keep going on about it, I’ll try and stop soon} back in the early to mid 2000’s I seemed to be working in a vacuum of information, let alone prior experience. Yes, there was stuff in the Oracle manuals about how big things could theoretically be made and some vague advice on some aspects of it, but an absolute lack of any visible Oracle customers with anything even approaching the sizes I was contemplating. 2TB was about the limit and I was already way beyond that. Was this because I really was pushing the boundaries of database size? Well, I have since found out that whilst I was up there just behind the leading edge, there were several databases much, much bigger than mine and others already envisioned that might hit the Petabyte level, let alone Terabyte.

The thing is, no one would speak about them. At all.

We were left to do it all pretty much from scratch and it would not have been possible if I had not spent years building up with VLDBS as the definition of a VLDB size increased, plus of course cracking support by the other DBAs and Systems Admins around me. And to be fair, Oracle Corp helped us a lot with our efforts to build these massive databases. Interestingly, one Oracle Consultant would regularly tell me that our systems really were not so unusually big and there were plenty larger. He usually said this when I asked, exasperatedly as something else failed to scale, if Oracle had every tested things at this level :-). But despite constantly asking to meet with these people with massive systems, so we could exchange war stories and share advice, and being promised such contacts by Oracle, they never materialized except for CERN – who we already talked to as a fellow scientific organisation – and Amazon, who it turns out did things in a very different way to us {but it was really good to talk to them and find out how they did do their big databases, thanks guys}. Both were at the same scale or just behind where we were.

This is because most of the people with massive oracle databases will not talk about them as they are either run by the largest financial organisations, are to do with defense or in some other way just not talked about. In his comment Peter refers to a prior client with an OLTP-type system that is now around the PB scale. I would be pretty sure Peter can’t say who the client is or any details about how the system was designed.

So although I think there is a real need for a “stonking great databases” forum, I think there is a real problem in getting a user community of such people/organisations together. And if you did, none of the members would be allowed to say much about how they achieved it, so all you could do would be sit around and brag about who has the biggest. There is an Oracle community about such things, called the Terabyte Club, but last I knew it was invite-only and when I managed to get invited, it turned out that mine was biggest by a considerable margin, so I was still not meeting these elusive groups with 500TB databases. Maybe there is an Oracle-supported ├╝ber database society but as I never signed the official secrets act might not have been eligible to play.

If I am wrong and anyone does form such a user group (or is in one!) I would love to be a member and I would strive to present and help.

I’ll finish with what appears to be a contradiction to what I have just written. There already is a UKOUG User Group that deals with large systems and I chair it – the Management and Infrastructure SIG. {sorry, the info on the web page could do with some updating}. Part of what we cover is VLDBs. But we also cover Very Many DataBases (companies with thousands of instances) and Very Complex DataBases plus how you go about the technical and management aspects of working in a massive IT Infrastructure. It might be that we could dedicate a meeting to VLDBs and see how it goes, but I know that whilst many who come along are dealing with database of a few TB, no one is dealing with hundreds of TB or PB database. Either that or they are keeping quiet about it, which takes us back to my main point. The MI SIG is probably the closest to a VLDB SIG we have in Europe though, and is a great bunch of people, so if you have a VLDB and want to meet some fellow sufferers, we have our next meeting on 23rd September in the Oracle City office.

Follow

Get every new post delivered to your Inbox.

Join 158 other followers