Oracle Exadata – does it work? July 26, 2009

Posted by mwidlake in development, VLDB.

Does Oracle Exadata work? 

That is a tricky question as, unlike most Oracle database features, you can’t download it and give it a test.

You can try out partitioning, bigfiles, Oracle Text, InterMedia {sorry, Multimedia}, all sorts of things by downloading the software. You can even try out RAC pretty cheaply, using either VMware or a couple of old machines and Linux, and many hundreds of Oracle techies have. The conclusion is that it works. The expert conclusion is “yes it works, but is it a good idea? It depends {my fees are reasonable}” :-).

I digress. This ability to download and play allows Oracle technophiles to get some grounding in these things, even if their employer is not currently looking to implement them {BTW how often do you look at something in your own private time that your company will not give you bandwidth for – only to have them become so very interested once you have gained a bit of knowledge? Answers on a postcard please…}.

Exadata is another beast, as it is hardware. I think this is an issue.

I was lucky enough to get John Nangle to come and present on Exadata at the last UKOUG Management and Infrastructure meeting, having seen his talk at a prior meeting. John gave a very good presentation and interest was high. I have also seen Joel Goodman talk {another top presenter}, so I understand the theory. I have to say, it looks very interesting, especially in respect of what is, perhaps, my key area of personal expertise: VLDB, databases of tens of terabytes.

I don’t plan to expand here on the concepts or physical attributes of Exadata too much. It is enough to say that it appears to gain its advantage via two main aspects:

  • Intelligence is sited at the “disc controller” level {which in this case is a cheap 4-cpu HP server, not really the disc controller}, which basically pre-filters the data coming off storage. This means that only blocks of interest are chucked across the network to the database.
  • The whole system is balanced. Oracle have looked at the CPU-to-IO requirements of data warehouses and decided what seems to be a good balance. They have implemented fast, low-latency IO via InfiniBand and made sure there are a lot of network pipes from the storage up the stages to the database servers. That’s good.

The end result is that there is lots of fast, balanced IO from the storage layer to the database and only data that is “of interest” is passed up to the database.
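To make the pre-filtering idea concrete, here is a toy sketch in Python. It is emphatically not how Exadata is implemented {I can't see inside one, which is rather the point of this post}. The `StorageCell` class, the two scan functions and the made-up `region` column are all hypothetical names of mine, and the sketch filters rows where the real thing deals in blocks. All it shows is that both approaches give the same answer while shipping very different volumes of data to the database.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]  # a toy "row": column name -> value


@dataclass
class StorageCell:
    """Hypothetical stand-in for one storage server holding part of a table."""
    rows: List[Row]

    def read_all(self) -> List[Row]:
        # Conventional storage: hand back everything that was asked for.
        return list(self.rows)

    def read_filtered(self, predicate: Callable[[Row], bool]) -> List[Row]:
        # "Intelligent" storage: apply the filter before anything is shipped.
        return [r for r in self.rows if predicate(r)]


def scan_filter_at_database(
    cells: List[StorageCell], predicate: Callable[[Row], bool]
) -> Tuple[List[Row], int]:
    """Ship every row to the database, then filter there."""
    shipped = 0
    result: List[Row] = []
    for cell in cells:
        rows = cell.read_all()
        shipped += len(rows)  # everything crosses the network
        result.extend(r for r in rows if predicate(r))
    return result, shipped


def scan_filter_at_storage(
    cells: List[StorageCell], predicate: Callable[[Row], bool]
) -> Tuple[List[Row], int]:
    """Filter on each storage cell and ship only the matching rows."""
    shipped = 0
    result: List[Row] = []
    for cell in cells:
        rows = cell.read_filtered(predicate)
        shipped += len(rows)  # only rows of interest cross the network
        result.extend(rows)
    return result, shipped


# Eight cells of 1,000 rows each; 'region' cycles through 50 values.
cells = [
    StorageCell([{"id": c * 1000 + i, "region": i % 50} for i in range(1000)])
    for c in range(8)
]


def wanted(row: Row) -> bool:
    return row["region"] == 7


db_rows, db_shipped = scan_filter_at_database(cells, wanted)
st_rows, st_shipped = scan_filter_at_storage(cells, wanted)

print(f"filter at database: {db_shipped} rows shipped, {len(db_rows)} kept")
print(f"filter at storage:  {st_shipped} rows shipped, {len(st_rows)} kept")
```

The numbers only reflect the selectivity I picked for the toy data. A query that wants most of the table would see little benefit from this sort of filtering.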

It all sounds great in theory and Oracle Corp bandy around figures of up to 100 times (not 100%, 100 times) speedup for data warehouse activity, with no need to re-design your implementation. At the last M&I UKOUG meeting there was also someone who had tried it in anger and they said it was 70 times faster. Unless this was a clever plant by Oracle, that is an impressive, independently stated increase.

I am still very interested in the technology, but still sceptical. After all, RAC can be powerful, but in my experience it is highly unlikely that by dropping an existing system onto RAC you will get any performance (or high availability) increase. In fact, you are more likely to just make life very, very difficult for yourself. RAC works well when you design your system up-front with the intention of working on the same data on the same nodes. {Please note, this is NOT the oft-cited example of doing different work types on different nodes, i.e. data load on one node, OLTP on another and batch work on the third. If all three are working on the same working set, you could well be in trouble. You are better off having all load, OLTP and batch for one set of data on one node, load, OLTP and batch for another set of data on another node, etc, etc, etc. If your RAC system is not working well, this might be why}. Similarly, partitioning is an absolutely brilliant feature – IF you designed it up-front into your system. I managed to implement a database that has scaled to 100 TB with 95% of the database read-only {so greatly reducing the backup and recovery woes} because partitioning was designed in from the start.

Where was I? Oh yes, I remain unconvinced about Exadata. It sounds great; it sounds like it will work for data warehouse systems where full table scans are used to get all the data and the Oracle instance then filters most of the data out. Now the storage servers will do that for you. You can imagine how, instead of reading 500GB of table off disc, across the network and into Oracle memory and then filtering it, the eight disc servers will do the filtering and send a GB of data each up to the database. It has to be faster.
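A quick back-of-envelope calculation on that example, using only the figures above {500GB of table, eight disc servers, a GB returned by each}. The variable names are mine, and this only compares the volume of data crossing the network. It says nothing about actual elapsed time, which I cannot measure without the hardware.

```python
# Figures taken from the example in the text above.
table_gb = 500              # size of the table being scanned
storage_servers = 8         # number of disc servers doing the filtering
gb_returned_per_server = 1  # filtered data each server sends to the database

# Volume shipped to the database when the storage servers do the filtering.
shipped_gb = storage_servers * gb_returned_per_server

# How many times less data crosses the network than with a conventional scan.
reduction = table_gb / shipped_gb

print(f"Conventional scan ships {table_gb} GB to the database")
print(f"Filtered scan ships {shipped_gb} GB to the database")
print(f"That is {reduction:.1f} times less data over the network")
```

So in this imagined case the network carries about 62 times less data, which is at least in the same ballpark as the speedups being quoted.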

BUT.

What if you have some OLTP activity and some of the data is in the SGA? That is what stops full table scans working at multi-block-read-count levels of efficiency.

What happens if some of the table is being updated by a load process at the same time?

What happens if you want some of the data hosted under ASM and full Exadata performance brilliance but you have several tens of TB of less-active data you just want to store on cheap SATA RAID 5 discs as well? How does Exadata integrate then?

You can’t test any of this out. I did email and ask John about this inability to play with and discover stuff about a solution that is hardware and very expensive. And he was good enough to respond, but I think he missed the point of my question {I should ask again, he is a nice chap and will help if he can}. He just said that the DBA does not have to worry about the technology, it just works. There are no special considerations.

Well, there are. And I can’t play with it as I would need to buy a shed load of hardware to do so. I can’t do that, I have a wife and cat to feed.

So even though Exadata sounds great, it is too expensive for anyone but large, seriously interested companies to look into.

And I see that as a problem. Exadata experts will only come out of organisations that have invested in the technology, or out of Oracle itself. And I’m sorry, I’ve worked for Oracle and as an employee you are always going to put the best face forward. So, skills in this area are going to stay scarce unless it takes off, and I struggle to see how it will take off unless it is not just as good as Oracle says, but better than Netezza and Teradata by a large margin.

Does anyone have an Exadata system I can play on? I’d love to have a try on it.
