
Oracle Exadata – does it work? July 26, 2009

Posted by mwidlake in development, VLDB.
12 comments

Does Oracle Exadata work? 

That is a tricky question as, unlike most Oracle database features, you can’t download it and give it a test.

You can try out partitioning, bigfiles, Oracle Text, InterMedia {sorry, Multimedia}, all sorts of things by downloading the software. You can even try out RAC pretty cheaply, using either VMware or a couple of old machines and Linux, and many hundreds of Oracle techies have. The conclusion is that it works. The expert conclusion is “yes it works, but is it a good idea? It depends {my fees are reasonable}” :-).

I digress; this ability to download and play allows Oracle technophiles to get some grounding in these things, even if their employer is not currently looking to implement them {BTW how often do you look at something in your own private time that your company will not give you bandwidth for – only to have them so very interested once you have gained a bit of knowledge? Answers on a postcard please…}.

Exadata is another beast, as it is hardware. I think this is an issue.

I was lucky enough to get John Nangle to come and present on Exadata at the last UKOUG Management and Infrastructure meeting, having seen his talk at a prior meeting. John gave a very good presentation and interest was high. I have also seen Joel Goodman talk {another top presenter}, so I understand the theory. I have to say, it looks very interesting, especially in respect of what is, perhaps, my key area of personal expertise, VLDB – databases of 10’s of terabytes.

I don’t plan to expand here on the concepts or physical attributes of Exadata too much; it is enough to say that it appears to gain its advantage via two main aspects:-

  • Intelligence is sited at the “disc controller” level {which in this case is a cheap 4-cpu HP server, not really the disc controller} which basically pre-filters the data coming off storage so only the data that is of interest is passed back to the database. This means that only blocks of interest are chucked across the network to the database.
  • The whole system is balanced. Oracle have looked at the CPU-to-IO requirements of data warehouses and decided what seems to be a good balance; they have implemented fast, low-latency IO via InfiniBand and made sure there are a lot of network pipes from the storage up the stack to the database servers. That’s good.

The end result is that there is lots of fast, balanced IO from the storage layer to the database and only data that is “of interest” is passed up to the database.

It all sounds great in theory and Oracle Corp bandy around figures of up to 100 times (not 100%, 100 times) speedup for datawarehouse activity, with no need to re-design your implementation. At the last M&I UKOUG meeting there was also someone who had tried it in anger and they said it was 70 times faster. Unless this was a clever plant by Oracle, that is an impressive independently stated increase.

I am still very interested in the technology, but still sceptical. After all, RAC can be powerful, but in my experience it is highly unlikely that by dropping an existing system onto RAC you will get any performance (or high availability) increase. In fact, you are more likely to just make life very, very difficult for yourself. RAC works well when you design your system up-front with the intention of working on the same data on the same nodes. {Please note, this is NOT the oft-cited example of doing different work types on different nodes, ie data load on one node, OLTP on another and batch work on the third. If all three are working on the same working set, you could well be in trouble. You are better off having all load, OLTP and batch for one set of data on one node, OLTP-load-batch for another set of data on another node etc, etc, etc. If your RAC system is not working well, this might be why}. Similarly, partitioning is an absolutely brilliant feature – IF you designed it up-front into your system. I managed to implement a database that has scaled to 100 TB with 95% of the database read-only {so greatly reducing the backup and recovery woes} as it was designed in from the start.
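To illustrate the sort of designed-in-up-front partitioning I mean, here is a minimal sketch {table and tablespace names are made up}. The point is that each year of data lands in its own tablespace, and once a year stops being updated its tablespace can be made read-only and dropped out of the regular backup cycle.

    -- Hypothetical schema: range-partition the big table by date from day one,
    -- placing each year's data in its own tablespace.
    CREATE TABLE sequence_results (
      result_id    NUMBER         NOT NULL,
      loaded_date  DATE           NOT NULL,
      payload      VARCHAR2(4000)
    )
    PARTITION BY RANGE (loaded_date) (
      PARTITION p2003 VALUES LESS THAN (TO_DATE('01-01-2004','DD-MM-YYYY')) TABLESPACE ts_res_2003,
      PARTITION p2004 VALUES LESS THAN (TO_DATE('01-01-2005','DD-MM-YYYY')) TABLESPACE ts_res_2004,
      PARTITION p2005 VALUES LESS THAN (TO_DATE('01-01-2006','DD-MM-YYYY')) TABLESPACE ts_res_2005
    );

    -- Once a year's data stops changing, its tablespace only ever needs backing up once more.
    ALTER TABLESPACE ts_res_2003 READ ONLY;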

Where was I? Oh yes, I remain unconvinced about Exadata. It sounds great, it sounds like it will work for datawarehouse systems where full table scans are used to get all the data and the Oracle instance then filters most of the data out. Now the storage servers will do that for you. You can imagine how instead of reading 500GB of table off disc, across the network and into Oracle memory and then filtering it, the eight disc servers will do the filtering and send a GB of data each up to the database. It has to be faster.
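As a purely illustrative sketch {the table and column names are made up}, this is the shape of query that argument applies to: a full scan of a big fact table where only a small slice of rows and columns is actually wanted. On a conventional system every block streams up to the database server to be filtered there; the Exadata claim is that the storage cells apply the WHERE clause and the column projection themselves and ship back only the matching data.

    -- Hypothetical warehouse query: scan a large, un-indexed fact table and
    -- throw most of it away. The filtering is what Exadata pushes down to
    -- the storage layer.
    SELECT cust_id, SUM(amount_sold)
    FROM   sales_fact
    WHERE  sale_date >= TO_DATE('01-01-2009','DD-MM-YYYY')
    AND    channel_id = 4
    GROUP  BY cust_id;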

BUT.

What if you have some OLTP activity and some of the data is in the SGA? That is what stops full table scans working at multi-block-read-count levels of efficiency.

What happens if some of the table is being updated by a load process at the same time?

What happens if you want some of the data hosted under ASM and full Exadata performance brilliance but you have several 10’s of TB of less-active data you just want to store on cheap SATA RAID 5 discs as well? How does Exadata integrate then?

You can’t test any of this out. I did email and ask John about this inability to play with and discover stuff about a solution that is hardware and very expensive. And he was good enough to respond, but I think he missed the point of my question {I should ask again, he is a nice chap and will help if he can}. He just said that the DBA does not have to worry about the technology, it just works. There are no special considerations.

Well, there are. And I can’t play with it as I would need to buy a shed load of hardware to do so. I can’t do that, I have a wife and cat to feed.

So even though Exadata sounds great, it is too expensive for anyone but large, seriously interested companies to look into.

And I see that as a problem. Exadata experts will only come out of organisations that have invested in the technology or Oracle itself. And I’m sorry, I’ve worked for Oracle and as an employee you are always going to put the best face forward. So, skills in this area are going to stay scarce unless it takes off, and I struggle to see how it will take off unless it is not just as good as Oracle says, but better than Netezza and Teradata by a large margin.

Does anyone have an Exadata system I can play on? I’d love to have a try on it.

Friday Philosophy – Simply Complex July 24, 2009

Posted by mwidlake in development, Management.
3 comments

Piet de Visser is an ardent champion of simple solutions within the Oracle arena – and I completely agree with his outlook on this. {Almost}.

Simple solutions usually win long-term as they are easy to understand, easy to look after and easy to change. OK, you may not be getting the absolute best performance you can, you may not be doing the job as completely as you can, but if it is a simple solution then you probably implemented it easily and quickly. This probably also means it cost you not-a-lot in person time and brain power, so you can personally afford to blow it away and do it again. In fact, with all that saved time, money and brain power you can probably afford to create a couple more simple solutions to other problems too.

Perhaps the only thing you are probably losing out on is the kudos of having been smart enough to come up with something very cunning and complicated, to impress everyone with. You’ll get over it, people don’t stay impressed for very long, especially when your mind-bendingly cunning and complicated solution melts into a heap {as a couple of mine have, ho-hum}.

Is your chosen solution simple? I give you a simple test – Explain it to someone.

  • If you can explain it to your colleagues, it is probably quite simple. If you can’t, either the solution is not so simple or your colleagues are.
  • If you can explain it to your boss then it is probably an excellent example of simplicity.
  • If you can explain it to your mum, you have blindingly clever simplicity and your work here is done.

Oh, you remembered that I said I almost agreed with Piet.

I can think of four reasons for not implementing a simple solution. I have listed them in ascending order of being a good reason (best last). And, unfortunately, also descending order of likelihood (most likely first).

  • You were sold a complex solution as complex solutions earn the vendor more money.
  • You needed to impress someone with your incredible technical skills {this could be your peers, quite often it is your boss, but for most of us it is usually ourselves, let’s be honest :-) }
  • You really do need to do something complex for a very valid business reason, like 99.999% availability {eg for that system people can ring up as they have a cough but are convinced they are dying of “swine flu”}.
  • You are creating a generic solution.

What do I mean by the last point? I mean that your one solution has to work for a lot of different situations or usages. The system has to work for a large range of inputs or do a lot of things.

The whole Oracle database is {in my opinion} a generic solution. A vast and complex one to be sure, but it is intended to work for a little database on a desktop keeping track of the parts in your workshop, an integrated system in eg medical or scientific robots keeping track of thousands of samples, vast data stores of telephone calls so you can do your bills, huge on-line web-based shopping systems, a front end to image or video stores… the list is a big one. You might need a little Oracle database to hold the list in.

With version 10 Oracle Corp made a big thing about the database looking after itself. The database was a generic, self-managing, handle-anything solution and you would not need those pesky DBA types to keep the beast working for much longer.

That is why it is so complex and, no matter how much some of us complain {as I myself often do}, it has to be and it is getting more complex with every version. I’ll take my current favourite topic, stats gathering, as an example.

Back with the rule based optimiser, you had a set of rules. 15-16 I think {I’ll let you lot check – google “rule based optimizer oracle -burleson”}. You learnt the rules, understood them and you were home and dry. Except that the RBO could not cope with data-specific oddities, how different access methods were better for different table sizes and index specificity.

So Oracle added statistics and the cost based optimiser. To make use of the cost based logic a load of mathematical calculations and considerations had to be added (and continue to be added), based on statistics you had to collect at the right time and the right level – and many sites did not. People complained the CBO “just did not work”, which it did not if you didn’t collect the stats {and sometimes even when you had}, but it was doing a lot to cope with a wider range of systems automatically. Histogram stats now allowed skewed data to be coped with, in most situations.

So they introduced a job to do it for you, but it had to detect the right level and type of statistics to gather on all objects, be they tiny tables, massive tables, tables with skewed data, indexes, global indexes on partitioned tables… And yes, once again, it is another complexity you have to get to grips with if it does not fit your particular system demands.
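For anyone who has not had to wrestle with it, this is roughly what taking that decision back into your own hands looks like – a minimal sketch with made-up schema and table names. METHOD_OPT is the bit that decides whether histograms get built, and it is exactly the sort of per-object judgement the automatic job has to guess at.

    -- Gather stats manually for one awkward table (hypothetical names).
    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(
        ownname          => 'APP_OWNER',
        tabname          => 'ORDERS',
        estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
        method_opt       => 'FOR ALL COLUMNS SIZE AUTO',  -- let Oracle decide where histograms are needed
        cascade          => TRUE);                        -- gather the index stats as well
    END;
    /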

I’m sure you can argue with me over the details, but I think I’m making a valid point that every time a system {be it Oracle or an O/S} is modified to cope automatically with more scenarios, it becomes a more complex system. You need a DBA with several manuals welded to their brains to get the best out of this thing now, not less as claimed back at 10’s release {did they make the same claims for 11? I was too busy keeping systems running to really notice at the time}.

Maybe the answer is to stop using generic systems like Oracle and swap them out for a mixture of spreadsheets, MySQL-type simplistic databases, Netezza-type systems for datawarehouses, hand-cut ‘C’ applications for handling image stores, Java apps and flat files for web sites… Only you are going to have to learn how to use all those things to create all the systems you need to create.

You are only going to manage this if you create those dozens of systems as simple ones.

Fuggles was very simple. The lights were on but nobody was home. But that was fine as all she wanted was to sit on you and be scratched.


A 100TB Database June 23, 2009

Posted by mwidlake in development, VLDB.
add a comment

Some of you may know of me as the guy who constantly presented on the “massive database of genetic information” he was creating. I started presenting about it around 2003 and I said it would be 7TB. As I built it, the data kept flooding in and by the time I had it up and running and fully populated, around 2005, it was getting scary – it had grown to over 25TB. Who am I kidding? It was beyond scary, it kept me awake at nights.

Well, it still exists and continues to grow. I saw it up to 45TB before I left the Sanger Institute {where I built it} and it continues to grow towards the 100TB I designed it to scale to.

Why am I bragging about this? {“Hey, have you seen the size of my database?!”} Well, I am very proud of it. It was my pet project.

But pet project or not, I had a team of DBAs at the Sanger and of course, when I say “I built it” I should say “My team and I built it”. And they looked after it after I departed and it got even bigger.

Last week I got an email off one of them to invite me over for a small celebration this week. What are we celebrating? The first database on-site to hit 100TB. Am I proud? Hell yes, I am proud.

But not proud of what you probably think I am, given my careful preamble.

It is not my database that has broached the 100TB limit.

It is another one, a database the team put together after I left and that they have looked after without my input. What I am really proud of is that, with Shanthi Sivadasan, who was holding the fort when I arrived at the Sanger {and remains there}, we put together a team of DBAs that is capable of creating and looking after such a large behemoth. It could not be done without excellent support from the Systems Administrators as well, but I feel particularly proud of the DBAs.

So congratulations to the DBAs at the Wellcome Trust Sanger Institute: Shanthi Sivadasan, Tony Webb, Andy Bryant, Aftab Ahmed, Kalyan Kallepally and Karen Ambrose. You went further with this than I did.

I hope that the cake to celebrate is very nice :-)

The Knowledge Curtain. June 8, 2009

Posted by mwidlake in development, Management, Perceptions.
5 comments

I came across the concept of “The Knowledge Curtain” in the late 90’s, from a man called Lee Young, who I worked with for 2 or 3 years, along with a friend Mike Cox, who was key to showing me how to peek through the curtain.

The below is taken from a PowerPoint presentation I sometimes give on project disasters (how to avoid, not how to encounter!):

When systems designers talk to end users, both parties usually end up not really understanding each other


The basic principle is that the knowledge of your users/customers is partly hidden from you. When you ask people for whom you are designing a computer system about their job, they don’t tell you lots of stuff:

  • Things that they assume you know.
  • Things that it does not occur to them to mention.
  • Things that they do not want to admit to doing as it sounds silly or badly managed.
  • I’ve even had people not mention parts of their job as they don’t want their manager knowing they do it.

But that is OK, when you talk about information technology and computer systems to them, they have exactly the same problems with what you say :-).

Lee’s presentation, with N. Mehandjiev, predated the all-encompassing rise of the internet and this is one of the few references to it I can find. {Among the relatively few other hits on the topic, mixed in with the ones about how to make curtains, are references to the “Knowledge Curtain” as the concept that the Web is not as available in other areas of the world. A related but very different issue}.

So, how do you peek beyond the knowledge curtain? Systems designers and developers with experience learn how to ask the questions “what are the exceptions to what you have just told me” and “what else do you do” in many, many ways and without giving offence. After all, you have to keep asking these two questions, and people naturally get irritated with that and some feel you are suggesting they are either stupid or lying. It can be tricky.

I think that unless you have someone who is fully IT literate and has done the job the computer system is to assist with, you will inevitably only catch a part of the requirements.

For massive projects over many months or years, I think this lack of a clear understanding of what people do is a major factor in their failures. This is made even worse when the analysis is done by one group of people and the specifications are then shipped out to an external party for the application to be developed. With all that missing knowledge, it is no wonder some systems are initially so poor.

I only know of one method that reliably allows you to really see beyond the knowledge curtain. That is prototyping and user feedback. Only when you show the people who are going to use the system what you have put together to help them will they know if it works. These sessions are incredibly valuable, and only in development projects where they have been key to the process have I seen the project deliver something truly useful.

I now have a general way of developing anything.

  • I ask the users for 3 or 4 major things that they want the system to do.
  • I develop those 3 or 4 features as quickly as possible and show them to the users.
    • One will be almost exactly what they need.
    • One or two will be close to what they need.
    • One will be utterly useless.
    • During the above, one or two critical new needs will come up.
  • Fix the close ones, dump or redo the useless one and add the new needs to the list.

Simply cycle around the above, asking for new features when you have got fewer than 4 features you are actively working on. And only polish a feature (add all the nice screen touches and widgets) once it is exactly what they need or pretty damned close. {You hardly ever run out of features before you run out of time and money!} You end up with an excellent system that evolves to better help the customer as time goes on.

There are lots of development methodologies that have the above principle (or a variation of it) at their core, so I am certainly not the first person to find that this method works. Which is why I find it difficult to understand why so many projects don’t use it.

BTW, I thought I would just add that one of the factors in my starting a blog was a comment by Andrew Clarke on his, several years ago before blogging really took off. It was for a presentation I did which included the Knowledge Curtain concept. He was very nice about my presentation and I still appreciate it. This is the link, but as it is an archive you will have to search for my name.
