Big Discs are Bad September 27, 2009
Posted by mwidlake in development, performance, Uncategorized. Tags: performance, Storage, system development, VLDB
I recently came across this article on large discs for databases by Paul Vallee. The article is over 3 years old but is still entirely valid. It’s a very good description of why big discs are a problem for Oracle Database Performance. {Paul also introduces the BAHD-DB campaign – Battle Against Huge Disks for Databases, which I quite like}.
To summarise the article, and the problem in general, IT managers will buy big discs as they provide more GB per pound sterling. It saves money.
However, fewer discs is Bad For Performance. As an extreme example, you can now buy a single disc that is a TB in size, so you could put a 1TB Oracle database on one such disc. This one disc can only transfer so much data per second and it takes this one disc, say, 10ms to search for any piece of data. If you want the index entry from one place and the table row from another, that is at least two seeks. This will not be a fast database {and I am not even touching on the consideration of disc resilience}.
Now spread the data over 10 discs. In theory these 10 discs can transfer 10 times the total data volume and one disc can be looking for information while the others are satisfying IO requests {This is a gross over-simplification, but it is the general idea}.
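To put rough numbers on the 1-to-10 argument, here is a minimal sketch in Python. It assumes the 10ms seek figure used above, treats every IO as a full seek, and ignores caching, queueing, RAID overhead and transfer time, so it is an idealised upper bound rather than a prediction. The function names are made up for illustration.

```python
def random_iops(seek_ms: float, discs: int = 1) -> float:
    """Idealised random IOs per second, assuming each disc
    services one seek at a time and all discs are kept busy."""
    return discs * (1000.0 / seek_ms)


def lookups_per_second(seek_ms: float, discs: int, seeks_per_lookup: int = 2) -> float:
    """Row lookups per second when each lookup needs an index
    seek plus a table seek (two seeks, as in the example above)."""
    return random_iops(seek_ms, discs) / seeks_per_lookup


one_disc = lookups_per_second(seek_ms=10.0, discs=1)
ten_discs = lookups_per_second(seek_ms=10.0, discs=10)

print(f"1 disc:   {one_disc:.0f} lookups/sec")
print(f"10 discs: {ten_discs:.0f} lookups/sec")
```

Under those assumptions the single disc manages about 50 index-plus-row lookups a second and the ten discs about 500. Real systems will fall short of the tenfold gain, but the direction of the argument holds.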
IT Managers will understand this 1-to-10 argument when you go through it.
Kind of.
But then discussions ensue about how many modern “fast” discs are needed to replace the old “slow” discs. It can be very, very hard to get the message through that modern discs are not much faster. A 1TB disc compared to a 4-year-old 100GB disc will not have a transfer speed 10 times faster and it will certainly not have a seek time ten times lower; chances are the seek time is the same. And then there is the discussion of how much impact the larger memory caches of modern storage units have. Answer: (a) quite a lot, so long as it is caching what you want, and (b) even if it is perfectly caching what you want, as soon as you have read a cache-sized set of data, you are back to disc IO speed.
Bottom line: disc drives are now slower in proportion to their disc acreage than they used to be.
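One way to express that bottom line is random IOs per second per GB stored. The sketch below is illustrative only: it assumes both the old 100GB disc and the new 1TB disc have the same 10ms seek time {the "chances are the seek time is the same" case}, takes 1TB as 1000GB, and the function name is my own invention.

```python
def iops_per_gb(capacity_gb: float, seek_ms: float) -> float:
    """Idealised random IOs per second available for each GB on the disc."""
    iops = 1000.0 / seek_ms  # one seek at a time
    return iops / capacity_gb


old_disc = iops_per_gb(capacity_gb=100, seek_ms=10.0)   # 4-year-old 100GB disc
new_disc = iops_per_gb(capacity_gb=1000, seek_ms=10.0)  # modern 1TB disc, same seek time (assumed)

print(f"100GB disc: {old_disc:.2f} IOPS per GB")
print(f"1TB disc:   {new_disc:.2f} IOPS per GB")
print(f"The old disc has {old_disc / new_disc:.0f}x the IO per GB")
```

With identical seek times, a full 1TB disc gives each GB of data a tenth of the random IO it had on the 100GB disc.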
Anyway, I am certainly not the only person to have had these discussions, though I have had them for longer than most {due to my accidental work history of having worked on VLDBs for so long}. There are certainly practitioners of Oracle Arts who understand all of this far better than I and one of them, James Morle, recently started blogging. It’s looking good so far. If he keeps it up for a month, I’ll put him on my blog roll 🙂
There is, however, one aspect of the Big Disc Performance issue that does not seem to get much mention but is something I have suffered from more than a couple of times.
As a Database Performance person you have had the argument about needing spindles not disc acreage and won. The IT manager buys enough spindles to provide the I/O performance your system needs. Success.
However, the success has left a situation behind. You need 10 spindles over a couple of RAID 10 arrays to give you the IO you need. 250GB discs were the smallest you could buy. So you have 1.25TB of available storage (RAID 10 halves the storage) and have a 500GB database sitting on it. There is 750GB of empty storage there…
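The arithmetic behind those figures, as a small Python sketch. It uses only the numbers quoted above {10 discs of 250GB, RAID 10, a 500GB database}; the function name is hypothetical and it ignores any filesystem or hot-spare overhead.

```python
def raid10_usable_gb(discs: int, disc_gb: float) -> float:
    """Usable capacity of RAID 10: everything is mirrored,
    so half of the raw capacity is available."""
    return discs * disc_gb / 2


usable = raid10_usable_gb(discs=10, disc_gb=250)  # 2500GB raw, 1250GB usable
database_gb = 500
spare = usable - database_gb

print(f"Usable: {usable:.0f}GB, database: {database_gb}GB, spare: {spare:.0f}GB")
print(f"{spare / usable:.0%} of the storage you specified is sitting empty")
```

So 60% of the storage bought purely for its IO capacity looks, to anyone else, like free space.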
That 750GB of empty storage will not be left inviolate. Someone will use it. Someone will need “a bit of temporary storage” and that nice chunk of fast storage will be too inviting. Especially if it IS fast storage. It will be used.
Now your database, whose storage you specified to support said database, is sharing its storage with another app. An app that steals some of your IO and potentially {heck, let’s say this straight, WILL} impact your database performance. And the galling thing? Twice, I had no idea my storage had become shared until I started getting odd IO latency issues on the database.
You may be able to make a logical argument for the spindles you need at design time. But you have almost no chance of protecting those spindles in the future. But who said working life was easy? 🙂