
Return from The Temple of Apple June 22, 2015

Posted by mwidlake in Private Life, rant.
Tags: , ,
4 comments

I doubt many of you are on tenterhooks as to how I got on with my phone today {after my << rant last Friday}. But I’m going to tell you anyway.

Overall, Apple have gone some way to redeeming themselves.

I got myself down into Cambridge this morning to visit the Apple Store, at my allotted slot of 10:10 {I later witnessed someone attempting to be 15 minutes early for their slot – and they were asked to go and have a coffee and come back. The customer was unimpressed as they had lugged some huge Apple monitor in with them}.

I have to say, walking into the store was somewhat like entering some form of modern temple. The clean lines, the two parallel runs of “desks” with precisely & spaciously laid-out items to worship, lit by discrete banks of lights in the ceiling. Down the centre was a clear path to allow you to move deeper into the hallowed space, with a scattering of worshipful believers moving between the icons. And, at the end, a cluster of acolytes in blue tops gathered around and before the “Apple Genius Bar” altar.

I approached the altar…err, service desk… and was very soon approached by an acolyte holding a prayer tablet (iPad mini 3) in front of them. My name was on the list, my time was now. I would be granted an audience. I was directed to a stool to one side to await my turn.

Thankfully, the wait was short and ended when Dave came over. Dave turned out to be a friendly, open and helpful chap who managed to take the edge off what was frankly a bit of an OTT ambience if you ask me. So far my impression had been that (a) you can see why the kit is so expensive to support this sort of shop frontage and space-to-item ratio, something I had only really come across before in Bose shops & car dealerships and (b) it’s just a shop selling I.T. kit, get over yourselves. Dave (not his real name, I’m afraid I forgot his real name, but he looked like a Dave – and had a great beard) listened to my potted history of the battery woes and upgrade deaths, looked over the phone briefly and then plugged it into one of the banks of Macs. It pulled up the ID of the phone and {Huzzah!!!!} set about blatting everything on it and reloading the OS, I think. It took a few minutes (I read my paper magazine – “New Scientist”) and then the phone rebooted…. and put up the Apple icon… and thought about it. I could see Dave thinking “this is taking a bit longer than normal”. Anyway, the thing finally came alive.

We chatted about what the root cause could be as he said he had not heard of anyone having multiple upgrade issues and it just locking up like this. He went and asked a more senior acolyte (perhaps already in the priesthood) and his opinion was that it might be a faulty motherboard – in which case all bets were off and I’d have to basically buy a new phone for £200. Dave said I might as well not bother and put the money towards getting a nice, new iPhone 6, as they were only £500 or so. I wonder what the Apple shop staff get paid to think £500 is no big shakes.

Meanwhile, Dave had verified the phone battery was indeed covered by the recall and it would be two hours to complete the work. Was I happy to get that done today? Sure, I’m happy to drink coffee and eat a bun somewhere for 2 hours. So off I went. And came back (witnessing the taking to task of a customer arriving before their time – they did let them leave the monitor behind in the end). My phone was presented back to me, working, and I just had to sign on a tablet. Sorry about using the indelible marker pen, guys. I took a photo of the temple and made a quick test call outside the shop to ensure all was OK – and it was. And apart from the brief suggestion of buying a new iPhone 6, no financial cost had been incurred (except the park & ride in, cost of coffee & bun and a lost morning).

I was soon back home and ready to restore my backup from last week. I plugged in the phone, iTunes recognised it, ran the restore… and the phone is no different – none of my contacts, no change to icons, layout or background, nothing – but now iTunes says it does not recognise the device. Ohhhh shit. Oh, and the photo of the Apple Temple is gone (it was going to be at the start of this update). A couple of hours later, after trying many things, I think I know what the issues are (and maybe were):

1) The device is just a bit dodgy and sometimes/often the connection with iTunes just ends (I’ve swapped cables, I know it is not that).
2) It would not restore the backup with “Find My iPhone” running – but due to (1) it usually did not get so far as telling me that. I wonder if updates would fail for the same reason? They were very insistent I turn off the feature before I went into the shop, but of course with a locked up phone I could only do this at the web end.

I turned off the feature on the phone, ran the restore again and this time it completed and left me with a phone that worked and looked like it did a week ago.

So I eventually got the phone restored and it works as well as it did – but hopefully with more battery life. It will be interesting to see if the reception issues are any better. I kind of doubt it. It’s now at iOS 8.3 as well. Deep Joy.

My final conundrum now is that, given that my phone contract that partially paid for the phone in the first place ended a couple of months back, do I stick with this device and hope all is now OK? Or do I spend more money replacing something that is only just over 2 years old? And do I get anything but an iPhone? After all, both my wife’s iPhones have worked OK and they are nice when working. But I’m not a member of the Apple Congregation and have no desire to join.

One thing I do know. I won’t be putting the old Samsung phone I’ve had to fall back on away just yet.

Friday Philosophy – Friday Afternoon Phone June 19, 2015

Posted by mwidlake in Friday Philosophy, Private Life.
Tags: , ,
6 comments

{<<my earlier attempts to sort out my phone}
{Update on my trip to the Apple Store >>}

There used to be a phrase in the car industry in the UK (I don’t know about elsewhere): a “Friday Afternoon Car”. This is a car which is unusually unreliable, as it was built on Friday afternoon when the workers were tired, the weekend was coming and, heck, they might have been to the pub at lunch. It is occasionally used just to describe something that is a bit crap and unreliable.

I have a Friday Afternoon Phone it would seem. I am fast becoming quite disillusioned with it. You may remember my post about my sitting on said phone to make it work again. It’s an iPhone 5, I bought it as I was finally persuaded that it would be more useful to have a smart phone than the “temporary” cheap Samsung I had bought about 2 years prior to then – as an emergency replacement for my previous web-enabled phone that committed suicide with a jar of pickled onions (it’s a long, hardly believable story). I expected the Samsung to keep me going for a month or two but it was so simple and reliable it just stayed in use for over 2 years.

Your Honour, allow me to present evidence item A and item B

Comparison:
. . . . . . . . . . . . . . . . . Phone A . . . . . . . . . . . . . Phone B
Cost . . . . . . . . . . . . . . .£400 or so. . . . . . . . . . . . £15 with a free £10 pay-as-you-go top up
Battery (new). . . . . . . . . . .8-12 hours. . . . . . . . . . . . A week
Battery (now). . . . . . . . . . .4-5 hours . . . . . . . . . . . . A week!
Reliability. . . . . . . . . . . .Breaks every update . . . . . . . No issues ever
Making calls . . . . . . . . . . .6/10. . . . . . . . . . . . . . . 9/10
Receiving calls. . . . . . . . . .4/10. . . . . . . . . . . . . . . 9/10
Plays Angry Birds. . . . . . . . .Yes . . . . . . . . . . . . . . . No
Taking pictures. . . . . . . . . .9/10. . . . . . . . . . . . . . . 0/10
Helps me up a mountain . . . . . .Yes . . . . . . . . . . . . . . . No
Connection to web. . . . . . . . .6/10. . . . . . . . . . . . . . . Are you kidding? But I’m mostly sat at a computer anyway
Impresses friends. . . . . . . . .No. . . . . . . . . . . . . . . . Yes, for all the wrong reasons :-)

{There must be a better way to line up text in this wordpress theme!!!}

The Web Enabled Phone that Does not Like to Connect
In some ways the iPhone has been really good. The screen (at the time I bought it) was very good, apps run nice and fast, way too much software is available for it and it can hold a lot of pictures and videos before running out of space. Its size also suits me. But phone and web reception has always been a bit poor and its ability to hold onto a wireless connection seems to be especially questionable – as soon as a few other devices are contending for a router with my iPhone, my iPhone seems to give up its hold on the connection and sulk. I’ve had this in several places in several countries. I’m the only one up? Phone connects fine. 2 others wake up and connect? I’m off the network. I’ve also often been in a busy place (conference, sports event) and everyone else seems to be on the net but my phone just pretends.

Battery Blues
And of course, there is the issue of the battery becoming very poor. It runs on a full charge for only a few hours and if it gets cold it has a tendency to act like a squirrel and hibernate. I now carry around the spare battery pack my wife got given by her work for her work phone use abroad. The good news is, having been put on to it by Neil Chandler, I am now aware my phone has been recalled for a battery replacement. What I am a little irked about is that Apple have my details and the serial number of the phone but have never contacted me directly to let me know. OK, it is not a car (it’s just like a car – a Friday Afternoon Car) so I am unlikely to die as a result of the fault, but if they know it has a fault and it did cost a good whack of cash to buy, they should do the moral thing and contact me.

Upsetting Upgrades
But the thing that utterly hacks me off is how it does not handle upgrading to the next version of iOS. I had an upgrade early on in my relationship with the phone and it blew everything off the phone. Not a big issue then as I had not had it long. But it made me cautious about upgrading. About this time last year the phone was insisting it must be upgraded and things were getting flaky (I suspect software manufacturers do this on purpose – I’ve noticed my PC running Windows can start acting odd when an update is due). Before doing anything, I backed it up. Or tried to. The first attempt said it worked but it was too swift to have backed up anything, let alone back up my photos. After that it just refused to back up. And the phone utterly refused to allow me access to the photos from my PC – it should do of course, but no, nothing would prise those images out of the phone. I was particularly concerned as I had lots of snaps from a friend’s wedding. Said friend eventually helped me out by pushing all my photos to an iCloud account (It’s Just A Server On The Net) in a way he could access. I then updated the phone and, yep, it failed. And locked the phone, and I had to factory reset and lost all the photos. It had also lied about uploading the pictures to the net (which it took hours to not do) so they had gone for good. Grrrrr.

So this time when it started getting dodgy I managed to save all my photos (Huzzah!), backed it up, ran through the update – and it failed and locked up the phone. *sigh!!!!*. Only, this time it won’t respond even after a factory reset. My iTunes is up-to-date, it could see the phone OK at the start of the update (because I was doing it via iTunes!) but now it won’t see the phone and, once I try, guess what, iTunes also locks up. So the phone is useless. I can’t help but wonder if the battery issue and the failure to ever upgrade smoothly are linked somehow (by, for example, it being rubbish).

So I pop along to the kitchen drawer with the odds n’ sods in and pull out the old Samsung & charger and plug it in. 20 minutes later, I have a working phone. Turns out I have no credit on it anymore but I can sort that out. It even gets reception in the kitchen (I have to lean out the window of the back bedroom to get the iPhone to pick up a reliable signal at home).

Oh No! I have to Contact Apple!
Now the real fun starts. I contact my local Apple shop. Only I don’t, I access a damned annoying voice system that smugly announces “I understand full sentences” and immediately knows who I am and what my device is and when it was bought (as Apple have my details including home phone) – and it was over 2 years ago and it wants me to agree to a paid support package to go further. Of course it won’t give me options to speak to a human or understand “full sentences” even when I shout “battery issue recall” and “your update killed my phone!” plus various permutations at it. It also did not understand the sentence “I want to speak to a person”.

I eventually trick it by pretending that I will buy a support package. Huzzah, a human to talk to. Said human is helpful, pleasant, a bit hard to understand (usual call center woes of background noise and she has the microphone clipped to her socks). I explain that the phone has a recall on it and I just want that sorted and a proper reset. She’s not sure I can have this without a support package {after all, her job is to sell me some support and I am breaking her script} but she says the battery might be replaced under the recall (she has all my details, she can see the iPhone serial number, she could check!). “So I can drop it off at the store?”.

I expect “yes”. I get “no”. I have to organise an appointment. A 10 minute slot. Why? I want to drop off some kit for you to repair and I’ll come back another day. I am not making an appointment to see a doctor to discuss my piles. No, I have to have an appointment. On Monday at 10:10 or “plrbsburhpcshlurp” as the mike once more slips down the sock. OK, 10:10 Monday, she’s getting tired of me saying “please repeat that”. Then she says what sounds like “and the repair may cost up to £210 if there is a hardware fault”. WHAT?!? I don’t fully understand what she says next – but she understands I am not going to pay £210 to fix a device that has a known fault and has been screwed over again by their software update, so she backs off to “they can look at the device and advise me”.

It’ll be interesting to see how it goes on Monday. At 10:10 am. If they try and charge me hundreds of pounds to reset the damned thing or tell me (after I’ve checked) that they won’t replace the dying battery, I can imagine me becoming one of those ranting, incoherent people you see on YouTube. If they want anything more than the cost of an evening in the pub to get it working, I think it will become a shiny, expensive paperweight.

Meanwhile, welcome back Reliable Samsung Phone. You still seem to make calls just fine. Still not able to play Angry Birds though.

With Modern Storage the Oracle Buffer Cache is Not So Important. May 27, 2015

Posted by mwidlake in Architecture, Hardware, performance.
Tags: , , , ,
11 comments

With Oracle’s move towards engineered systems we all know that “more” is being done down at the storage layer and modern storage arrays have hundreds of spindles and massive caches. Does it really matter if data is kept in the Database Buffer Cache anymore?

Yes. Yes it does.

Time for a cool beer

With much larger data sets and the still-real issue of fewer disk spindles per GB of data, the Oracle database buffer cache is not so important as it was. It is even more important.

I could give you some figures but let’s put this in a context most of us can easily understand.

You are sitting in the living room and you want a beer. You are the Oracle database, the beer is the block you want. Going to the fridge in the kitchen to get your beer is like you going to the Buffer Cache to get your block.

It takes 5 seconds to get to the fridge, 2 seconds to pop it open with the always-to-hand bottle opener and 5 seconds to get back to your chair. 12 seconds in total. Ahhhhh, beer!!!!

But – what if there is no beer in the fridge? The block is not in the cache. So now you have to get your car keys, open the garage, get the car out and drive to the shop to get your beer. And then come back, pop the beer in the fridge for half an hour and now you can drink it. That is like going to storage to get your block. It is that much slower.

It is only that much slower if you live a six-hour drive from your beer shop. Think taking the scenic route from New York to Washington DC.

The difference in speed really is that large. If your data happens to be in the memory cache in the storage array, that’s like the beer already being in a fridge – in that shop 6 hours away. Your storage is SSD-based? OK, you’ve moved house to Philadelphia, 2 hours closer.

Let’s go get beer from the shop

To back this up, some rough (and I mean really rough) figures. Access time to memory is measured in microseconds (“us” – millionths of a second) down to hundreds of nanoseconds (“ns” – billionths of a second). Somewhere around 500ns seems to be an acceptable figure. Access to disc storage is more like milliseconds (“ms” – thousandths of a second). Go check an AWR report or statspack or OEM or whatever you use; you will see that db file scattered reads are anywhere from 2 or 3ms up to the low teens, depending on what your storage and network is. For most sites, that speed has hardly altered in years as, though hard discs get bigger, they have not got much faster – and often you end up with fewer spindles holding your data as you get allocated space, not spindles, from storage (and the total sustainable speed of hard disc storage is limited to the total speed of all the spindles involved). Oh, the storage guys tell you that your data is spread over all those spindles? So is the data for every other system – you have maximum contention.
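
If you want a quick look without pulling up a full AWR report, something like this sketch against V$SYSTEM_EVENT gives the same sort of average figures (my illustration, not taken from any particular system):

-- Average multi-block and single-block read times, instance-wide since startup.
-- time_waited_micro is in microseconds, so divide by 1,000 for milliseconds.
select event,
       total_waits,
       round(time_waited_micro / nullif(total_waits, 0) / 1000, 2) as avg_ms
from   v$system_event
where  event in ('db file scattered read', 'db file sequential read')
order  by event;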

However, memory speed has increased over that time, and so has CPU speed (though single-core CPU speed has really stopped improving now; the gains come from more cores).

Even allowing for latching and pinning and messing around, accessing a block in memory is going to be at the very least 1,000 times faster than going to disc, maybe 10,000 times. Sticking to a conservative 2,000 times faster for memory than disc, that 12-second trip to the fridge equates to 24,000 seconds driving. That’s 6.66 hours.

This is why you want to avoid physical IO in your database if you possibly can. You want to maximise the use of the database buffer cache as much as you can, even with all the new Exadata-like tricks. If you can’t keep all your working data in memory, in the database buffer cache (or the In-Memory option, or the result cache), then you will have to do that achingly slow physical IO and then the intelligence-at-the-hardware comes into its own, true Data Warehouse territory.
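
As one hedged example of what “keep it in the cache” can mean in practice (the table name below is invented purely for illustration), a small, hot lookup segment can be nailed into the KEEP pool:

-- Sketch only: give the KEEP pool some memory and assign a hot lookup table
-- to it. "my_hot_lookup" is a made-up name; size the pool to fit the segment.
alter system set db_keep_cache_size = 512M scope=both;
alter table my_hot_lookup storage (buffer_pool keep);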

So the take-home message is – avoid physical IO, design your database and apps to keep as much as you can in the database buffer cache. That way your beer is always to hand.

Cheers.

Update. Kevin Fries commented to mention this wonderful little latency table. Thanks Kevin.

“Here’s something I’ve used before in a presentation. It’s from Brendan Gregg’s book – Systems Performance: Enterprise and the Cloud”

Fixing my iPhone with my Backside May 18, 2015

Posted by mwidlake in Hardware, off-topic, Perceptions, Private Life.
Tags: , ,
10 comments

{Things got worse with this phone >>}

iPhone 5 battery getting weak? Damn thing is saying “out of charge” and won’t charge? Read on…

NB: a link to the battery recall page is at the end of this post, and the eventual “death” of my phone and my subsequent experience of the recall at my local Apple Store can be found by following the above link…

Working with Oracle often involves fixing things – not because of the Oracle software (well, occasionally it is) but because of how it is used or the tech around it. Sometimes the answer is obvious, sometimes you can find help on the web and sometimes you just have to sit on the issue for a while. Very, very occasionally, quite literally.

Dreaded “out of battery” icon

Last week I was in the English Lake District, a wonderful place to walk the hills & valleys and relax your mind, even as you exhaust your body. I may have been on holiday but I did need to try and keep in touch – and it was proving difficult. No phone reception at the cottage, the internet was a bit slow and pretty random, my brother’s laptop died – and then my iPhone gave up the ghost. Up on the hills, midday, it powers off and any attempt to use it just shows the “feed me” screen. Oh well, annoying but not fatal.

However, I get back to base, plug it in…and it won’t start. I still get the “battery out of charge” image. I leave it an hour, still the same image. Reset does not help, it is an ex-iPhone, it has ceased to be.

My iPhone is the version 5 model, black as opposed to the white one shown (picture stolen from “digitaltrends.com” and trimmed), not that the colour matters! I’ve started having issues with the phone’s battery not lasting so well (as, I think, has everyone with the damned thing) and especially with its opinion of how much charge is left being inaccurate. As soon as it gets down to 50% it very rapidly drops to under 20%, starts giving me the warnings about low battery and then hits 1%. And sometimes stays at 1% for a good hour or two, even with me taking the odd picture. And then it will shut off. If it is already below 20% and I do something demanding like take a picture with flash or use the torch feature, it just switches off and will only give me the “out of charge” image. But before now, it has charged up fine and, oddly enough, when I put it on to charge it immediately shows say 40-50% charge and may run for a few hours again.

So it seemed the battery was dying and had finally expired. I’m annoyed; with the unreliable internet, that phone was the only verbal way to keep in touch with my wife, and for that contact I had to leave the cottage and go 200 metres up the road (thankfully, in the direction of a nice pub).

But then I got thinking about my iPhone and its symptoms. Its opinion of its charge would drop off quickly, sudden drain had a tendency to kill it and I had noticed it lasting less well if it was cold (one night a couple of months ago it went from 75% to dead in 10-15 mins when I was in a freezing cold car with, ironically, a dead battery). I strongly suspect the phone detects its level of charge by monitoring the amperage drop, or possibly the voltage drop, as the charge is used. And older rechargeable batteries tend to drop in amperage. And so do cold batteries {oddly, fully charged batteries can have a slightly higher voltage as the internal resistance is less, I understand}.

Perhaps my battery is just not kicking out enough amperage for the phone to be able to either run on it or “believe” it can run on it. The damn thing has been charging for 2 or 3 hours now and still is dead. So, let’s warm it up. Nothing too extreme, no putting it in the oven or on top of a radiator. Body temperature should do – we used to do this on scout camps with torches that were almost exhausted. So I took it out of its case (I have a stupid, bulky case) so that its metal body was uncovered and I, yep, sat on it. And drank some wine and talked balls with my brother.

15 minutes later, it fires up and reckons it is 70% charged. Huzzah, it is not dead!

Since then I have kept it out of its case, well charged and, frankly, warm. If I am out it is in my trouser pocket against my thigh. I know I need to do something more long-term but right now it’s working.

I tend to solve a lot of my IT/Oracle issues like this. I think about the system before the critical issue: was there anything already going awry, what other symptoms were there, and can I think of a logical or scientific cause for such a pattern? I can’t remember any other case where chemistry has been the answer to a technology issue I’m having (though a few where physics did, like the limits on IO speed or network latency that simply cannot be exceeded), but maybe others have?

Update: if you have a similar problem with your iPhone5, there is a recall on some iPhone5’s sold between December 2012 and January 2013 – https://www.apple.com/uk/support/iphone5-battery
{thanks for letting me know that, Neil}.

Friday Philosophy – Tosh Talked About Technology February 17, 2012

Posted by mwidlake in Friday Philosophy, future, Hardware, rant.
Tags: , ,
9 comments

Sometimes I can become slightly annoyed by the silly way the media puts out total tosh and twaddle(*) that over-states the impact or drawbacks of technology (and science (and especially medicine (and pretty much anything the media decides to talk about))). Occasionally I get very vexed indeed.

My attention was drawn to some such thing about SSDs (Solid State Discs) via a tweet by Gwen Shapira yesterday {I make no statement about her opinion in this in any way, I’m just thanking her for the tweet}. According to Computerworld

SSDs have a ‘bleak’ future, researchers say

So are SSDs somehow going to stop working or no longer be useful? No, absolutely not. Are SSDs not actually going to be more and more significant in computing over the next decade or so? No, they are and will continue to have a massive impact. What this is, is a case of a stupidly exaggerated title over not a lot. {I’m ignoring the fact that SSDs can’t have any sort of emotional future as they are not sentient and cannot perceive – the title should be something like “the future usefulness of SSDs looks bleak”}.

What the article is talking about is a reasonable little paper about how, if NAND-based SSDs continue to use smaller die sizes, errors could increase and access times get worse. That is, if the same technology is used in the same way and manufacturers continue to shrink die sizes. It’s something the memory technologists need to know about and perhaps find fixes for. Nothing more, nothing less.

The key argument is that by 2024 we will be using something like 6.4nm dies and, at that size, the physics of it all means everything becomes a little more flaky. After all, silicon atoms are around 0.28nm wide (most atoms of things solid at room temperature are between 0.2nm and 0.5nm wide), so a 6.4nm feature is only a couple of dozen atoms across – we are building structures from things only an order of magnitude or so smaller than the structures themselves. We have all heard of quantum effects and tunneling, which means that at such scales and below odd things can happen. So error correction becomes more significant.

But taking a reality check, is this really an issue:

  • I look at my now 4-year-old 8GB micro-USB stick (90nm die?) and it is 2*12*30mm, including packaging. The 1TB disc on my desk next to it is 24*98*145mm. I can get about 470 of those chips in the same space as the disc, so that’s 3.8TB based on now-old technology (a quick check of the arithmetic is sketched after this list).
  • Even if the NAND materials stay the same and the SSD layout stays the same and the packaging design stays the same, we can expect about 10-50 times the current density before we hit any problems
  • The alternative of spinning platters of metal oxides is pretty much a stagnant technology now, the seek time and per-spindle data transfer rate is hardly changing. We’ve even exceeded the interface bottleneck that was kind-of hiding the non-progress of spinning disk technology
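
A quick, hedged check of the arithmetic in the first bullet, using nothing but the volumes quoted above:

-- Back-of-envelope: how many 2x12x30mm sticks fit in the volume of a
-- 24x98x145mm disc, and how much capacity that is at 8GB per stick.
select round((24*98*145) / (2*12*30))     as sticks_per_disc_volume,
       round((24*98*145) / (2*12*30)) * 8 as gb_in_same_space
from   dual;
-- Roughly 474 sticks and ~3,800GB - in line with the "about 470 chips, 3.8TB" above.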

The future of SSD technology is not bleak. There are some interesting challenges ahead, but things are certainly going to continue to improve in SSD technology between now and when I hang up my keyboard. I’m particularly interested to see how the technologists can improve write times and overall throughput to something closer to SDRAM speeds.

I’m willing to lay bets that a major change is going to be in form factor, for both processing chips and memory-based storage. We don’t need smaller dies, we need lower power consumption and a way to stack the silicon slices and package them (for processing chips we also need a way to make thousands of connections between the silicon slices too). What might also work is simply wider chips, though that scales less well. What we see as chips on a circuit board is mostly the plastic wrapper. If part of that plastic wrapper was either a porous honeycomb air could move through or a heat-conducting strip, the current technology used for SSD storage could be stacked on top of each other into blocks of storage, rather than the in-effect 2D sheets we have at present.

What could really be a cause of technical issues? The bl00dy journalists and marketing. Look at digital cameras. Do you really need 12, 16 mega-pixels in your compact point-and-shoot camera? No, you don’t, you really don’t, as the optics on the thing are probably not up to the level of clarity those megapixels can theoretically give you, the lens is almost certainly not clean any more and, most significantly, the chip is using smaller and smaller areas to collect photons (the sensor is not getting bigger with more mega-pixels you know – though the sensor size is larger in proper digital SLRs, which is a large part of why they are better). Fewer photons per pixel means less sensitivity and more artefacts. What we really need is maybe to stay at 8MP and get more light sensitivity. But the mega-pixel count is what is used to market the camera at you and me. As a result, most people go for the higher figures and buy something technically worse, so we are all sold something worse. No one really makes domestic-market cameras where the mega-pixel count stays put and the rest of the camera improves.

And don’t forget. IT procurement managers are just like us idiots buying compact cameras.

(*) For any readers where UK English is not a first language, “twaddle” and “tosh” both mean statements or arguments that are silly, wrong, pointless or just asinine. Oh, and “asinine” means to talk like an ass :-) {and I mean the four-legged animal, not one’s bottom, Mr Brooks}

Will the Single Box System make a Comeback? December 8, 2011

Posted by mwidlake in Architecture, future, Hardware.
Tags: , ,
15 comments

For about 12 months now I’ve been saying to people(*) that I think the single box server is going to make a comeback and nearly all businesses won’t need the awful complexity that comes with the current clustered/exadata/RAC/SAN solutions.

Now, this blog post is more a line-in-the-sand and not a well researched or even thought out white paper – so forgive me the obvious mistakes that everyone makes when they make a first draft of their argument and before they check their basic facts, it’s the principle that I want to lay down.

I think we should be able to build incredibly powerful machines based on PC-type components, machines capable of satisfying the database server requirements of anything but the most demanding or unusual business systems. And possibly even them. Heck, I’ve helped build a few pretty serious systems where the CPU, memory and inter-box communication is PC-like already. If you take the storage component out of needing to be centralised (and thus shared), I think a major change is just over the horizon.

At one of his talks at the UKOUG conference this year, Julian Dyke showed a few tables of CPU performance, based on a very simple PL/SQL loop test he has been using for a couple of years now. The current winner is 8 seconds by a… Core i7 2600K, i.e. a PC chip and one that is popular with gamers. It has 4 cores and runs two threads per core, at 3.4GHz, and can boost a single core to 3.8GHz. These modern chips are very powerful. However, chips are no longer getting faster so much as wider – more cores. More ability to do lots of the same thing at the same speed.
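
For context, the sort of test being referred to is just a tight, CPU-bound PL/SQL loop – something like this sketch (my illustration of the idea, not Julian’s actual script):

-- A simple CPU-bound PL/SQL loop, timed in SQL*Plus. The iteration count and
-- arithmetic are arbitrary; the point is that a single core does all the work.
set timing on
declare
  l_total number := 0;
begin
  for i in 1 .. 10000000 loop
    l_total := l_total + mod(i, 97);
  end loop;
  dbms_output.put_line('checksum: ' || l_total);
end;
/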

Memory prices continue to tumble, especially with smart devices and SSD demands pushing up the production of memory of all types. Memory has fairly low energy demands so you can shove a lot of it in one box.

Another bit of key hardware for gamers is the graphics card – if you buy a top-of-the-range graphics card for a PC that is a couple of years old, the graphics card probably has more pure compute grunt than your CPU, and a massive amount of data is pushed to and fro across the PCIe interface. I was saying something about this to some friends a couple of days ago but James Morle brought it back to mind when he tweeted about this attempt at a standard about using PCI-e for SSD. A PCI-e 16X interface has a theoretical throughput of 4000MB per second – each way. This compares to 600MB per second for SATA III, which is enough for a modern SSD. A single modern SSD. {what I am not aware of is the latency for PCI-e but I’d be surprised if it was not pretty low}. I think you can see where I am going here.

Gamers and image editors have probably been most responsible for pushing along this increase in performance and intra-system communication.

SSD storage is being produced in packages with a form factor and interface to enable an easy swap into the place of spinning rust, with for example a SATA3 interface and 3.5inch hard disk chassis shape. There is no reason that SSD (or other memory-based) storage cannot be manufactured in all sorts of different form factors, there is no physical constraint of having to house a spinning disc. Density per dollar of course keeps heading towards the basement. TB units will soon be here but maybe we need cheap 256GB units more than anything. So, storage is going to be compact and able to be in form factors like long, thin slabs or even odd shapes.

So when will we start to see cheap machines something like this: Four sockets for 8/16/32 core CPUs, 128GB main memory (which will soon be pretty standard for servers), memory-based storage units that clip to the external housing (to provide all the heat loss they require) that combine many chips to give 1Gb IO rates, interfaced via the PCIe 16X or 32X interface. You don’t need a HBA, your storage is internal. You will have multipath 10GbE going in and out of the box to allow for normal network connectivity and backup, plus remote access of local files if need be.

That should be enough CPU, memory and IO capacity for most business systems {though some quote from the 1960’s about how many companies could possibly need a computer springs to mind}. You don’t need shared storage for this; in fact I am of the opinion that shared storage is a royal pain in the behind as you are constantly having to deal with the complexity of shared access and maximising contention on the flimsy excuse of “sweating your assets”. And paying for the benefit of that overly complex, shared, contended solution.

You don’t need a cluster as you have all the CPU, working memory and storage you need in a 1U server. “What about resilience, what if you have a failure?”. Well, I am swapping back my opinion on RAC to where I was in 2002 – it is so damned complex it causes way more outage than it saves. Especially when it comes to upgrades. Talking to my fellow DBA-types, the pain of migration and the number of bugs that come and go from version to version, the mix of CRS, RDBMS and ASM versions – that is taking up massive amounts of their time. Dataguard is way simpler and I am willing to bet that, for 99.9% of businesses, other IT factors cause costly system outages an order of magnitude more often than the difference between what a good MAA Dataguard solution can provide and what a good stretched RAC one can.

I think we are already almost at the point where most “big” systems that use SAN or similar storage don’t need to be big. If you need hundreds of systems, you can virtualize them onto a small number of “everything local” boxes.

A reason I can see it not happening is cost. The solution would just be too cheap; hardware suppliers will resist it because, hell, how can you charge hundreds of thousands of USD for what is in effect a PC on steroids? But desktop games machines will soon have everything 99% of business systems need except component redundancy and, if your backups are on fast SSD and you use a way simpler Active/Passive/MAA Dataguard type configuration (or the equivalent for your RDBMS technology) rather than RAC and clustering, you don’t need that total redundancy. Dual power supply and a spare chunk of solid-state you can swap in for a failed RAID 10 element is enough.

Friday Philosophy – Oracle Performance Silver Bullet August 5, 2011

Posted by mwidlake in Architecture, Friday Philosophy, performance.
Tags: , , ,
15 comments

Silver Cartridge and Bullet

For as long as I have been working with Oracle technology {which is now getting towards 2 decades, and isn’t that a pause for thought} there has been a constant search for Performance Silver Bullets – some trick or change or special init.ora parameter {alter system set go_faster_flag=’Y’} you can set to give you a guaranteed boost in performance. For all that time there has been only one.

There are a few performance Bronze Bullets…maybe Copper Bullets. The problem is, though, that the Oracle database is a complex piece of software and what is good for one situation is terrible for another. Often this is not even a case of “good 90% of the time, indifferent 9% of the time and tragic 1% of the time”. Usually it is more like 50%:30%:20%.

Cartridge with copper bullet & spent round

I’ve just been unfair to Oracle software actually; a lot of the problem is not with the complexity of Oracle, it is with the complexity of what you are doing with Oracle. There are the two extremes of OnLine Transaction Processing (lots of short-running, concurrent, simple transactions that many users want to run very quickly) and Data Warehouse, where a small number of users want to process a vast amount of data. You may well want to set certain initialisation parameters to favour quick response time (OLTP) or fastest processing time to completion (DW). Favouring one usually means a negative impact on the other. Many systems have both requirements in one… In between that there are the dozens and dozens of special cases and extremes that I have seen, and I am just one guy. People get their database applications to do some weird stuff.

Partitioning is a bronze bullet. For many systems, partitioning the biggest tables makes them easier to manage, allows some queries to run faster and aids parallel activity. But sometimes (more often than you might think) partitioning can drop rather than increase query or DML performance. In earlier versions of Oracle setting optimizer_index_caching and optimizer_index_cost_adj was often beneficial, and in Oracle 9/8/7 setting db_file_multiblock_read_count “higher” was good for DWs…. Go back to Oracle 7 and doing stuff to increase the buffer cache hit ratio towards 98% was generally good {and I will not respond to any comments citing Connor’s magnificent “choose your BCHR and I’ll achieve it” script}.
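
For the record, those old “bronze bullet” optimizer tweaks look like the sketch below – shown at session level only, with illustrative values rather than a recommendation; on modern versions, test carefully before touching them at all:

-- Session-level only, as a test; the defaults are 0 and 100 respectively.
alter session set optimizer_index_caching  = 90;
alter session set optimizer_index_cost_adj = 25;
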
You know what? There was an old trick in Oracle 7 you could maybe still look at as a bronze bullet. Put your online redo logs and key index tablespaces on the fastest storage you have and split your indexes/tables/partitions across the faster/slower storage as best fits. Is all your storage the same speed? Go buy some SSD and now it isn’t….

Cartridge with Wooden Bullet

Then there are bronze bullets that you can use that very often improve performance but the impact can be catastrophic {let’s call them wooden bullets :-) }. Like running your database in noarchivelog mode. That can speed up a lot of things, but if you find yourself in the situation of needing to do a recovery and your last cold backup is not recent enough – catastrophe. A less serious but more common version of this is doing things nologging. “Oh, we can just re-do that after a recovery”. Have you done a test recovery that involved that “oh, we can just do it” step? And will you remember it when you have a real recovery situation and the pressure is on? Once you have one of these steps, you often end up with many of them. Will you remember them all?

How many of you have looked at ALTER SYSTEM SET COMMIT_WRITE=’BATCH,NOWAIT’? It could speed up response times and general performance on your busy OLTP system. And lose your data on crash recovery. Don’t even think about using this one unless you have read up on the feature, tested it, tested it again and then sat and worried about what could possibly go wrong for a good while.
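
If you do want to see what it does, try it at session level on a test system first – and note that from 11gR2 the single parameter is split in two. A sketch, not a recommendation:

-- Asynchronous, batched commits: faster, but commits not yet written to the
-- redo log are lost if the instance crashes. Test systems only.
alter session set commit_write = 'batch,nowait';
-- From 11gR2 the two separate parameters are preferred:
alter session set commit_logging = 'batch';
alter session set commit_wait    = 'nowait';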

That last point is maybe at the core of all these Performance Bronze Bullets. Each of these things may or may not work but you have to understand why and you have to understand what the payback is. What could now take longer or what functionality have I now lost? {hint, it is often recovery or scalability}.

So, what was that one Silver Bullet I tantalizingly left hanging out for all you people to wait for? You are not going to like this…

Look at what your application is doing and look at the very best that your hardware can do. Do you want 10,000 IOPS and your storage consists of fewer than 56 spindles? Forget it, your hardware cannot do it (a 15K rpm spindle manages maybe 180 random IOPS, so 56 of them top out at around 10,000). No matter what you tune or tweak or fiddle with. The one and only Performance Silver Bullet is to look at your system and your hardware configuration and work out what is being asked and what can possibly be delivered. Now you can look at:

  • What is being asked of it. Do you need to do all of that (and that might involve turning some functionality off, if it is a massive drain and does very little to support your business).
  • Are you doing stuff that really is not needed, like management reports that no one has looked at in the last 12 months?
  • Is your system doing a heck of a lot to achieve a remarkably small amount? Like several hundred buffer gets for a single indexed row? That could be a failure to do partition exclusion (a quick way of spotting such statements is sketched after this list).
  • Could you do something with physical data positioning to speed things up, like my current blogging obsession with IOTs?
  • You can also look at what part of your hardware is slowing things down. Usually it is spindle count/RAID level, i.e. something dropping your IOPS. Ignore all sales blurb from vendors and do some real-world tests that match what your app does or wants to do.
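
As a hedged sketch of how you might spot the “lots of effort for a small result” statements mentioned in the list (the cut-off of 20 rows is arbitrary):

-- Top 20 statements by buffer gets per execution - candidates for the
-- "hundreds of buffer gets for a single indexed row" check.
select *
from  (select sql_id,
              executions,
              round(buffer_gets / executions) as gets_per_exec,
              substr(sql_text, 1, 60)         as sql_text_start
       from   v$sql
       where  executions > 0
       order  by buffer_gets / executions desc)
where rownum <= 20;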

It’s hard work but it is possibly the only Silver Bullet out there. Time to roll up our sleeves and get cracking…

{Many Thanks to Kevin Closson for providing all the pictures – except the Silver Bullet, which he only went and identified in his comment!}

Fastest £1,000 Server – back from supplier July 23, 2011

Posted by mwidlake in One Grand Server.
Tags: ,
6 comments

At the risk of turning my Blog into some sort of half-way-house tweet update thing (correct, I’ve never logged into twitter), as a couple of people asked about the outcome with the broken £1,000 server, I’m happy to report it came back this week. The motherboard had died. I’d convinced myself it was the PSU when I trawled the net, as it seems to be one of those things most likely to die having fired up in the first place, but no, the motherboard. I guess some solder “dried” or the PC pixies just don’t like me. One month turnaround is not very impressive…

They had another motherboard exactly the same in stock so I got a like-for-like swap. I was kind of hoping for a different one with more SATA3 and USB3 headers :-)

Now I’m trying to download the latest Oracle 11 for 64 bit Windows. I live out in the wilds of North Essex (for non-UK people, this is all of 62 kilometres North-Northeast of London as the crow flies, so not exactly in an obscure and remote part of the UK! For those who DO know the UK, it is nothing like “the only way is Essex” out here. We have trees, fields, wildlife and a lack of youth culture.) As such, my broadband connection is sloooow. The connection keeps breaking and I lose the download. *tsch*. I’m sure I had a download manager somewhere which got around these issues…

Fastest £1,000 server – what happened? July 12, 2011

Posted by mwidlake in One Grand Server, performance.
Tags:
7 comments

A couple of people have asked me recently what happened to that “fastest Oracle server for a grand” idea I had last year, after all I did announce I had bought the machine.

{Update – it came back.}
Well, a couple of things happened. Firstly, what was a small job for a client turned into a much more demanding job for a client – not so much mentally harder as time-consuming harder and very time consuming it was. So the playing had to go on hold, the client comes first. The server sat in the corner of the study, nagging me to play with it, but it remained powered down.
Secondly, when the work life quietened down last month and I decided to spend a weekend getting that server set up, I hit an issue. I turned on the server and it turned itself straight off. It then rested for 5 seconds and turned itself back on for half a second – and then straight off. It would cycle like that for as long as I was willing to let it.

OK, duff power switch, motherboard fault, something not plugged in right, PSU not reaching stable voltage… I opened the case and checked everything was plugged in OK and found the manufacturer had covered everything with that soft resin to hold things in place. I pressed on all the cards etc in hope, but no, it was probably going to have to go back. It is still in warranty; the manufacturer can fix it.

So I rang the manufacturer and had the conversation. They were not willing to try and diagnose over the phone so I had to agree to ship it back to them to be fixed {I did not go for on-site support as the only time I did, with Evesham Micros, they utterly refused to come out to fix the problem. Mind you, it turns out they were counting down the last week or two before going bust and, I suspect, knew this}. I shipped it back and the waiting began. Emails ignored, hard to get in touch over the phone. Over three weeks on and they only started looking at the machine last Friday (they claim).

On the positive side, this delay means that solid state storage is becoming very affordable and I might be able to do some more interesting things within my budget.
On the bad side the technology has moved on and I could get a better server for the same money now, but that is always the case. Mine does not have the latest Sandy Bridge Intel processor for example. Also, I have time now to work on it, I hope not to have time next month as I’d like to find some clients to employ me for a bit!

I better go chase the manufacturer. If it is not fixed and on its way back very, very soon then they will be off my list of suppliers and I’ll be letting everyone know how good their support isn’t.
