
Friday Philosophy – The Singular Stupidity of the Sole Solution April 22, 2016

Posted by mwidlake in Architecture, Exadata, Friday Philosophy, Hardware.

I don't like the 'C' word, it's offensive to some people and gets used way too much. I mean "cloud" of course. Across all of I.T. it's the current big trend that every PR department seems to feel the need to trumpet about and it's what all Marketing people are trying to sell us. I'm not just talking Oracle here either; read any computing, technical or scientific magazine and there are the usual ads by big I.T. companies like IBM, and they are all pushing clouds (and the best way to push a cloud is with hot air). And we've been here before so many times. It's not so much the current technical trend that is the problem, it is the obsession with the one architecture as the solution to fit all requirements that is damaging.

No clouds here yet

When a company tries to insist that X is the answer to all technical and business issues and promotes it almost to the exclusion of anything else, it leads to a lot of customers being disappointed when it turns out that the new golden bullet is no such thing for their needs. Especially when the promotion of the solution translates to a huge push in sales of it, irrespective of fit. Technicians get a load of grief from the angry clients and have to work very hard to make the poor solution actually do what is needed or quietly change the solution for one that is suitable. The sales people are long gone of course, with their bonuses in the bank.

But often the customer confidence in the provider of the solution is also long gone.

Probably all of us technicians have seen it, some of us time after time, and a few of us rant about it (occasionally quite a lot). But I must be missing something, as how can an organisation like Oracle or IBM not realise they are damaging their reputation? But they do it in a cyclical pattern every few years, so whatever they gain by mis-selling these solutions is somehow worth the abuse of the customer – as that is what it is. I suppose the answer could be that all large tech companies are so guilty of this that customers end up feeling it's a choice between a list of equally dodgy second-hand car salesmen.

Looking at the Oracle sphere, when Exadata came along it was touted by Oracle Sales and PR as the best solution – for almost everything. Wrongly. Utterly and stupidly wrongly. Those of us who got involved in Exadata with the early versions, especially I think V2 and V3, saw it being implemented for OLTP-type systems where it was a very, very expensive way of buying a small amount of SSD. The great shame was that the technical solution of Exadata was fantastic for a sub-set of technical issues. All the clever stuff in the storage cell software and maximising hardware usage for a small number of queries (small sometimes being as small as 1) was fantastic for some DW work with huge full-segment-scan queries – and of no use at all for the small, single-account-type queries that OLTP systems run. But Oracle just pushed and pushed and pushed Exadata. Sales staff got huge bonuses for selling them and the marketing teams seemed incapable of referring to the core RDBMS without at least a few mentions of Exadata.

Like many Oracle performance types, I ran into this mess a couple of times. I remember one client in particular who had been told Exadata V2 would fix all their problems – I suspect based solely on the fact it was going to be a multi-TB data store. But they had an OLTP workload on the data set and any volume of work was slaying the hardware. At one point I suggested that moving a portion of the workload onto a dirt cheap server with a lot of spindles (where we could turn off archive redo – it was a somewhat unusual system) would sort them out. But my telling them a hardware solution 1/20th the cost would fix things was politically unacceptable.

Another example of the wonder solution is Agile. Agile is fantastic: rapid, focused development that gets a solution to a constrained user requirement in timescales that can be months, weeks, even days. It is also one of the most abused terms in I.T. Implementing Agile is hard work: you need excellent designers, programmers who can adapt rapidly and a lot, and I mean a LOT, of control of the development and testing flow. It is also a methodology that blows up very quickly when you try to include fix-on-fail or production support workloads. It also goes horribly wrong when you have poor management, which makes the irony that it is often implemented when management is already failing even more tragic. I've seen 5 Agile disasters for each success, and on every project there are the shiny-eyed Agile zealots who seem to think just implementing the methodology, no matter what the aims of the project or the culture they are in, is guaranteed success. It is not. For many IT departments, Agile is a bad idea. For some it is the best answer.

Coming back to "cloud", I think I have something of a reputation for not liking it – which is not a true representation of my thoughts on it, but is partly my fault as I quickly tired of the over-sell and hype. I think some aspects of cloud solutions are great. The idea that service providers can use virtualisation and container technology to spin up a virtual server, a database, an application, an application sitting in a database on a server, all in an automated manner in minutes, is great. The fact that the service provider can do this using a restricted number of parts that they have tested integrate well means they have a way more limited support matrix and thus better reliability. With the Oracle cloud, they are using their engineered systems (which is really just a fancy term for a set of servers, switches, network & storage configured in a specific way, with their software configured in a standard manner) so they can test thoroughly and not have the confusion of an unusual type of network switch being used or a flavour of Linux that is not very common. I think these two items are what really make cloud systems interesting – fast, automated provisioning and a small support matrix. Being available over the internet is not such a great benefit in my book, as that introduces its own set of reasons why it is not necessarily a great solution.

But right now Oracle (amongst others) is insisting that cloud is the golden solution to everything. If you want to talk at Oracle Open World 2016 I strongly suspect that not including the magic word in the title will seriously reduce your chances. I’ve got some friends who are now so sick of the term that they will deride cloud, just because it is cloud. I’ve done it myself. It’s a potentially great solution for some systems, ie running a known application that is not performance critical that is accessed in a web-type manner already. It is probably not a good solution for systems that are resource heavy, have regulations on where the data is stored (some clinical and financial data cannot go outside the source country no matter what), alter rapidly or are business critical.

I hope that everyone who uses cloud also insists that the recovery of their system from backups is proven beyond doubt on a regular basis. Your system is running on someone else's hardware, probably managed by staff you have no say over and quite possibly with no actual visibility of what the DR is. No amount of promises or automated mails saying backups occurred is a guarantee of recovery reliability. I'm willing to bet that within the next 12 months there is going to be some huge fiasco where a cloud services company loses data or system access in a way that seriously compromises a "top 500" company. After all, how often are we told by companies that security is their top priority? About as often as they mess it up and try to embark on a face-saving PR exercise. So that would be a couple a month.

I just wish Tech companies would learn to be a little less single solution focused. In my book, it makes them look like a bunch of excitable children. Give a child a hammer and everything needs a pounding.

The “as a Service” paradigm. October 27, 2015

Posted by mwidlake in Architecture, Hardware, humour.

For the last few days I have been at Oracle Open World 2015 (OOW15) learning about the future plans and directions for Oracle. I’ve come to a striking realisation, which I will reveal at the end.

The message being pressed forward very hard is that of compute services being provided “As A Service”. This now takes three flavours:

  1. Being provided by a 3rd party’s hardware via the internet, ie in The Cloud.
  2. Having your own hardware controlled and maintained by you but providing services with the same tools and quick-provisioning ideology as "cloud". This is being called On Premise (or just "On Prem" if you are aiming to annoy the audience), irrespective of the probable inaccuracy of that label (think hosting & dedicated compute away from head office).
  3. A mix of the two where you have some of your system in-house and some of it floating in the Cloud. This is called Hybrid Cloud.

There are many types of "as a Service" offerings, the main ones probably being:

  • SaaS – Software as a Service
  • PaaS – Platform as a Service
  • DBaaS – Database as a Service
  • IaaS – Infrastructure as a Service.

Whilst there is no denying that there is a shift towards some computer systems being provided by any of these, or one of the other {X}aaS offerings, it seems to me that what we are really moving towards is providing the hardware, software, network and monitoring required for an IT system. It is the whole architecture that has to be considered and provided, and we can think of it as Architecture as a Service or AaaS. This quick provisioning of the architecture is a main win with Cloud, be it externally provided or your own internal systems.

We all know that whilst the provision time is important, it is really the management of the infrastructure that is vital to keeping a service running, avoiding outages and allowing for upgrades. We need a Managed Infrastructure (what I term MI) to ensure the service provided is as good as or better than what we currently have. I see this as a much more important aspect of Cloud.

Finally, it seems to me that the aspects that need to be considered are more than initially spring to mind. Technically the solutions are potentially complex, especially with hybrid cloud, but also there are complications of a legal, security, regulatory and contractual aspect. If I have learnt anything over the last 2+ decades in IT it is that complexity of the system is a real threat. We need to keep things simple where possible – the old adage of Keep It Simple, Stupid is extremely relevant.

I think we can sum up the whole situation by combining these three elements of architecture, managed infrastructure and simplicity into one encompassing concept, which is:

KISS MI AaaS.

.

.

And yes, that was a very long blog post for a pretty weak joke. 5 days of technical presentations and non-technical socialising does strange things to your brain.

With Modern Storage the Oracle Buffer Cache is Not So Important. May 27, 2015

Posted by mwidlake in Architecture, Hardware, performance.

With Oracle’s move towards engineered systems we all know that “more” is being done down at the storage layer and modern storage arrays have hundreds of spindles and massive caches. Does it really matter if data is kept in the Database Buffer Cache anymore?

Yes. Yes it does.

Time for a cool beer

With much larger data sets and the still-real issue of fewer disk spindles per GB of data, the Oracle database buffer cache is not so important as it was. It is even more important.

I could give you some figures but let’s put this in a context most of us can easily understand.

You are sitting in the living room and you want a beer. You are the Oracle database, the beer is the block you want. Going to the fridge in the kitchen to get your beer is like going to the Buffer Cache to get your block.

It takes 5 seconds to get to the fridge, 2 seconds to pop it open with the always-to-hand bottle opener and 5 seconds to get back to your chair. 12 seconds in total. Ahhhhh, beer!!!!

But – what if there is no beer in the fridge? The block is not in the cache. So now you have to get your car keys, open the garage, get the car out and drive to the shop to get your beer. And then come back, pop the beer in the fridge for half an hour and now you can drink it. That is like going to storage to get your block. It is that much slower.

Well, it is only that much slower if you live a 6-hour drive from your beer shop. Think taking the scenic route from New York to Washington DC.

The difference in speed really is that large. If your data happens to be in the memory cache in the storage array, that’s like the beer already being in a fridge – in that shop 6 hours away. Your storage is SSD-based? OK, you’ve moved house to Philadelphia, 2 hours closer.

Let's go get beer from the shop

To back this up, some rough (and I mean really rough) figures. Access time to memory is measured in microseconds ("us" – millionths of a second) down to hundreds of nanoseconds ("ns" – billionths of a second). Somewhere around 500ns seems to be an acceptable figure. Access to disc storage is more like milliseconds ("ms" – thousandths of a second). Go check an AWR report or statspack or OEM or whatever you use; you will see that db file scattered reads are anywhere from the low teens down to, say, 2 or 3 ms, depending on what your storage and network is. For most sites, that speed has hardly altered in years as, though hard discs get bigger, they have not got much faster – and often you end up with fewer spindles holding your data, as you get allocated space not spindles from storage (and the total sustainable speed of hard disc storage is limited to the total speed of all the spindles involved). Oh, the storage guys tell you that your data is spread over all those spindles? So is the data for every other system, so you have maximum contention.
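
If you want to put a rough number on this for your own site, the cumulative wait statistics will do. A minimal sketch (v$system_event holds figures since instance startup, so this is a long-run average rather than the neat interval an AWR report gives you):

  -- Rough average latency of single-block and multi-block reads since startup.
  -- time_waited_micro is in microseconds, so divide by 1,000 for milliseconds.
  SELECT event,
         total_waits,
         ROUND(time_waited_micro / NULLIF(total_waits, 0) / 1000, 2) AS avg_wait_ms
  FROM   v$system_event
  WHERE  event IN ('db file sequential read', 'db file scattered read');

Anything in the milliseconds there is the drive to the shop; a read from the buffer cache is the trip to the fridge.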

However, memory speed has increased over that time, and so has CPU speed (though CPU speed has really stopped improving now; it is more down to more CPUs).

Even allowing for latching and pinning and messing around, accessing a block in memory is going to be at the very least 1,000 times faster than going to disc, maybe 10,000 times. Sticking to a conservative 2,000 times faster for memory than disc, that 12-second trip to the fridge equates to 24,000 seconds driving. That's 6.66 hours.

This is why you want to avoid physical IO in your database if you possibly can. You want to maximise the use of the database buffer cache as much as you can, even with all the new Exadata-like tricks. If you can't keep all your working data in memory, in the database buffer cache (or In-Memory, or the result cache), then you will have to do that achingly slow physical IO, and then the intelligence-at-the-hardware comes into its own – true Data Warehouse territory.
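
If you want to gauge how much extra buffer cache would actually buy you, the standard cache advisory gives a rough steer. A minimal sketch, assuming the DEFAULT buffer pool and db_cache_advice left at its default of ON:

  -- Estimated physical reads at various buffer cache sizes.
  -- size_factor = 1 is the current cache size.
  SELECT size_for_estimate AS cache_mb,
         size_factor,
         estd_physical_read_factor,
         estd_physical_reads
  FROM   v$db_cache_advice
  WHERE  name = 'DEFAULT'
  ORDER  BY size_for_estimate;

If estd_physical_reads falls away sharply as the cache grows, your working set nearly fits in memory and more cache is cheap performance; if the line is flat, you are off to the shop however big the fridge.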

So the take-home message is – avoid physical IO, design your database and apps to keep as much as you can in the database buffer cache. That way your beer is always to hand.

Cheers.

Update. Kevin Fries commented to mention this wonderful little latency table. Thanks Kevin.

“Here’s something I’ve used before in a presentation. It’s from Brendan Gregg’s book – Systems Performance: Enterprise and the Cloud”

Fixing my iPhone with my Backside May 18, 2015

Posted by mwidlake in Hardware, off-topic, Perceptions, Private Life.

{Things got worse with this phone >>}

iPhone 5 battery getting weak? Damn thing is saying “out of charge” and won’t charge? Read on…

NB a link to the battery recall page is at the end of this post, and the eventual "death" of my phone and my subsequent experience of the recall with my local Apple store can be found by following the above link…

Working with Oracle often involves fixing things – not because of the Oracle software (well, occasionally it is) but because of how it is used or the tech around it. Sometimes the answer is obvious, sometimes you can find help on the web and sometimes you just have to sit on the issue for a while. Very, very occasionally, quite literally.

Dreaded “out of battery” icon

Last week I was in the English Lake District, a wonderful place to walk the hills & valleys and relax your mind, even as you exhaust your body. I may have been on holiday but I did need to try and keep in touch – and it was proving difficult. No phone reception at the cottage, the internet was a bit slow and pretty random, my brother's laptop died – and then my iPhone gave up the ghost. Up on the hills, midday, it powers off and any attempt to use it just shows the "feed me" screen. Oh well, annoying but not fatal.

However, I get back to base, plug it in…and it won’t start. I still get the “battery out of charge” image. I leave it an hour, still the same image. Reset does not help, it is an ex-iPhone, it has ceased to be.

My iPhone is a version 5 model, black as opposed to the white one shown (picture stolen from "digitaltrends.com" and trimmed), not that the colour matters! I've started having issues with the phone's battery not lasting so well (as, I think, has everyone with the damned thing) and especially with its opinion of how much charge is left being inaccurate. As soon as it gets down to 50% it very rapidly drops to under 20%, starts giving me the warnings about low battery and then hits 1%. And sometimes stays at 1% for a good hour or two, even with me taking the odd picture. And then it will shut off. If it is already down below 20% and I do something demanding like take a picture with flash or use the torch feature, it just switches off and will only give me the "out of charge" image. But before now, it has charged up fine and, oddly enough, when I put it on to charge it immediately shows say 40-50% charge and may run for a few hours again.

So it seemed the battery was dying and had finally expired. I'm annoyed: with the unreliable internet, that phone was the only verbal way to keep in touch with my wife, and for that contact I had to leave the cottage and go up the road 200 metres (thankfully, in the direction of a nice pub).

But then I got thinking about my iPhone and its symptoms. Its opinion of its charge would drop off quickly, sudden drain had a tendency to kill it and I had noticed it lasting less well if it was cold (one night a couple of months ago it went from 75% to dead in 10, 15 mins when I was in a freezing cold car with, ironically, a dead battery). I strongly suspect the phone detects its level of charge by monitoring the amperage drop, or possibly the voltage drop, as the charge is used. And older rechargeable batteries tend to drop in amperage. And so do cold batteries {oddly, fully charged batteries can have a slightly higher voltage as the internal resistance is less, I understand}.

Perhaps my battery is just not kicking out enough amperage for the phone to be able to either run on it or "believe" it can run on it. The damn thing has been charging for 2 or 3 hours now and still is dead. So, let's warm it up. Nothing too extreme, no putting it in the oven or on top of a radiator. Body temperature should do – we used to do this on scout camps with torches that were almost exhausted. So I took it out of its case (I have a stupid, bulky case) so that its metal body is uncovered and I, yep, sat on it. And drank some wine and talked balls with my brother.

15 minutes later, it fires up and reckons it is 70% charged. Huzzah, it is not dead!

Since then I have kept it out of its case, well charged and, frankly, warm. If I am out it is in my trouser pocket against my thigh. I know I need to do something more long-term but right now it's working.

I tend to solve a lot of my IT/Oracle issues like this. I think about the system before the critical issue: was there anything already going awry, what other symptoms were there, and can I think of a logical or scientific cause for such a pattern? I can't remember any other case where chemistry has been the answer to a technology issue I was having (though a few where physics did, like the limits on IO speed or network latency that simply cannot be exceeded), but maybe others have?

Update: if you have a similar problem with your iPhone 5, there is a recall on some iPhone 5s sold between December 2012 and January 2013 – https://www.apple.com/uk/support/iphone5-battery
{thanks for letting me know that, Neil}.

Friday Philosophy – Tosh Talked About Technology February 17, 2012

Posted by mwidlake in Friday Philosophy, future, Hardware, rant.

Sometimes I can become slightly annoyed by the silly way the media puts out total tosh and twaddle(*) that over-states the impact or drawbacks of technology (and science (and especially medicine (and pretty much anything the media decides to talk about))). Occasionally I get very vexed indeed.

My attention was drawn to some such thing about SSDs (Solid State Discs) via a tweet by Gwen Shapira yesterday {I make no statement about her opinion on this in any way, I'm just thanking her for the tweet}. According to Computerworld:

SSDs have a ‘bleak’ future, researchers say

So are SSDs somehow going to stop working or no longer be useful? No, absolutely not. Are SSDs not actually going to be more and more significant in computing over the next decade or so? No, they are and will continue to have a massive impact. What this is, is a case of a stupidly exaggerated title over not a lot. {I’m ignoring the fact that SSDs can’t have any sort of emotional future as they are not sentient and cannot perceive – the title should be something like “the future usefulness of SSDs looks bleak”}.

What the article is talking about is a reasonable little paper about how, if NAND-based SSDs continue to use smaller die sizes, errors could increase and access times could lengthen. That is, if the same technology is used in the same way and manufacturers continue to shrink die sizes. It's something the memory technologists need to know about and perhaps find fixes for. Nothing more, nothing less.

The key argument is that by 2024 we will be using something like 6.4nm dies and at that size the physics of it all means everything becomes a little more flaky. After all, silicon atoms are around 0.28nm wide (most atoms of things solid at room temperature are between 0.2nm and 0.5nm wide), so at that scale we are building structures only an order of magnitude or so larger than the atoms themselves. We have all heard of quantum effects and tunnelling, which means that at such scales and below odd things can happen. So error correction becomes more significant.

But taking a reality check, is this really an issue?

  • I look at my now 4-year-old 8GB micro-USB stick (90nm die?) and it is 2*12*30mm, including packaging. The 1 TB disc on my desk next to it is 24*98*145mm. I can get about 470 of those chips in the same space as the disc, so that's 3.8TB based on now-old technology (a quick sanity check of that arithmetic is sketched after this list).
  • Even if the NAND materials stay the same and the SSD layout stays the same and the packaging design stays the same, we can expect about 10-50 times the current density before we hit any problems
  • The alternative of spinning platters of metal oxides is pretty much a stagnant technology now; the seek time and per-spindle data transfer rate is hardly changing. We've even exceeded the interface bottleneck that was kind-of hiding the non-progress of spinning disk technology.
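
That first density figure is easy to sanity-check. A throwaway query (the dimensions are simply the ones quoted in the list above, and dual is just a convenient one-row table to do the arithmetic against):

  -- How many 2*12*30mm packages fit in the volume of a 24*98*145mm 3.5" disc,
  -- and what capacity is that at 8GB per package?
  SELECT ROUND((24*98*145) / (2*12*30))               AS chips_per_disc_volume,
         ROUND((24*98*145) / (2*12*30) * 8 / 1024, 1) AS capacity_tb
  FROM   dual;
  -- Roughly 474 chips and about 3.7TB - in line with the ~470 chips / 3.8TB above.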

The future of SSD technology is not bleak. There are some interesting challenges ahead, but things are certainly going to continue to improve in SSD technology between now and when I hang up my keyboard. I’m particularly interested to see how the technologists can improve write times and overall throughput to something closer to SDRAM speeds.

I'm willing to lay bets that a major change is going to be in form factor, for both processing chips and memory-based storage. We don't need smaller dies, we need lower power consumption and a way to stack the silicon slices and package them (for processing chips we also need a way to make thousands of connections between the silicon slices). What might also work is simply wider chips, though that scales less well. What we see as chips on a circuit board is mostly the plastic wrapper. If part of that plastic wrapper was either a porous honeycomb that air could move through or a heat-conducting strip, the current technology used for SSD storage could be stacked on top of each other into blocks of storage, rather than the in-effect 2D sheets we have at present.

What could really be a cause of technical issues? The bl00dy journalists and marketing. Look at digital cameras. Do you really need 12, 16 mega-pixels in your compact point-and-shoot camera? No, you don't, you really don't, as the optics on the thing are probably not up to the level of clarity those megapixels can theoretically give you, the lens is almost certainly not clean any more and, most significantly, the chip is using smaller and smaller areas to collect photons (the sensor is not getting bigger with more mega-pixels you know – though the sensor size is larger in proper digital SLRs, which is a large part of why they are better). Fewer photons per pixel means less sensitivity and more artefacts. What we really need is maybe to stay at 8MP and get more light sensitivity. But the mega-pixel count is what is used to market the camera at you and me. As a result, most people go for the higher figures and buy something technically worse, so we are all sold something worse. No one really makes domestic-market cameras where the mega-pixel count stays where it is and the rest of the camera improves.

And don’t forget. IT procurement managers are just like us idiots buying compact cameras.

(*) For any readers where UK English is not a first language, "twaddle" and "tosh" both mean statements or arguments that are silly, wrong, pointless or just asinine. Oh, and asinine means to talk like an ass 🙂 {and I mean the four-legged animal, not one's bottom, Mr Brooks}

Will the Single Box System make a Comeback? December 8, 2011

Posted by mwidlake in Architecture, future, Hardware.

For about 12 months now I’ve been saying to people(*) that I think the single box server is going to make a comeback and nearly all businesses won’t need the awful complexity that comes with the current clustered/exadata/RAC/SAN solutions.

Now, this blog post is more a line in the sand than a well-researched or even thought-out white paper – so forgive me the obvious mistakes that everyone makes when they make a first draft of their argument before checking their basic facts; it's the principle that I want to lay down.

I think we should be able to build incredibly powerful machines based on PC-type components, machines capable of satisfying the database server requirements of anything but the most demanding or unusual business systems. And possibly even them. Heck, I've helped build a few pretty serious systems where the CPU, memory and inter-box communication is PC-like already. If you take the storage component out of needing to be centralised (and thus shared), I think a major change is just over the horizon.

At one of his talks at the UKOUG conference this year, Julian Dyke showed a few tables of CPU performance, based on a very simple PL/SQL loop test he has been using for a couple of years now. The current winner is 8 seconds by a… Core i7 2600K – ie a PC chip, and one that is popular with gamers. It has 4 cores and runs two threads per core at 3.4GHz, and can boost a single core to 3.8GHz. These modern chips are very powerful. However, chips are no longer getting faster so much as wider – more cores. More ability to do lots of the same thing at the same speed.
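
For anyone who wants to try the idea on their own kit, something along these lines gives a crude single-threaded CPU figure. To be clear, this is just a sketch of the sort of tight PL/SQL loop involved, not Julian's actual test, so the absolute numbers will not be comparable with his table (run it from SQL*Plus or similar):

  -- Crude single-threaded CPU test: time a tight PL/SQL loop.
  -- DBMS_UTILITY.GET_TIME counts in hundredths of a second.
  SET SERVEROUTPUT ON
  DECLARE
    l_total NUMBER := 0;
    l_start NUMBER;
  BEGIN
    l_start := DBMS_UTILITY.GET_TIME;
    FOR i IN 1 .. 100000000 LOOP
      l_total := l_total + i;
    END LOOP;
    DBMS_OUTPUT.PUT_LINE('Loop took '
      || TO_CHAR((DBMS_UTILITY.GET_TIME - l_start) / 100) || ' seconds');
  END;
  /

The absolute time matters less than running the identical block on the various servers (and desktops) you have to hand and comparing the results.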

Memory prices continue to tumble, especially with smart devices and SSD demands pushing up the production of memory of all types. Memory has fairly low energy demands so you can shove a lot of it in one box.

Another bit of key hardware for gamers is the graphics card – if you buy a top-of-the-range graphics card for a PC that is a couple of years old, the graphics card probably has more pure compute grunt than your CPU, and a massive amount of data is pushed to and fro across the PCIe interface. I was saying something about this to some friends a couple of days ago, but James Morle brought it back to mind when he tweeted about this attempt at a standard for using PCI-e for SSD. A PCI-e 16X interface has a theoretical throughput of 4000MB per second – each way. This compares to 600MB per second for SATA III, which is enough for a modern SSD. A single modern SSD. {What I am not aware of is the latency for PCI-e, but I'd be surprised if it was not pretty low.} I think you can see where I am going here.

Gamers and image editors have probably been most responsible for pushing along this increase in performance and intra-system communication.

SSD storage is being produced in packages with a form factor and interface to enable an easy swap into the place of spinning rust, with for example a SATA3 interface and 3.5inch hard disk chassis shape. There is no reason that SSD (or other memory-based) storage cannot be manufactured in all sorts of different form factors, there is no physical constraint of having to house a spinning disc. Density per dollar of course keeps heading towards the basement. TB units will soon be here but maybe we need cheap 256GB units more than anything. So, storage is going to be compact and able to be in form factors like long, thin slabs or even odd shapes.

So when will we start to see cheap machines something like this: four sockets for 8/16/32-core CPUs, 128GB main memory (which will soon be pretty standard for servers), memory-based storage units that clip to the external housing (to provide all the heat loss they require) and combine many chips to give 1Gb IO rates, interfaced via the PCIe 16X or 32X interface? You don't need an HBA, your storage is internal. You will have multipath 10GbE going in and out of the box to allow for normal network connectivity and backup, plus remote access of local files if need be.

That should be enough CPU, memory and IO capacity for most business systems {though some quote from the 1960s about how many companies could possibly need a computer springs to mind}. You don't need shared storage for this; in fact I am of the opinion that shared storage is a royal pain in the behind, as you are constantly having to deal with the complexity of shared access and with contention being maximised on the flimsy excuse of "sweating your assets". And paying for the benefit of that overly complex, shared, contended solution.

You don't need a cluster as you have all the CPU, working memory and storage you need in a 1U server. "What about resilience, what if you have a failure?" Well, I am swapping back my opinion on RAC to where I was in 2002 – it is so damned complex it causes way more outage than it saves. Especially when it comes to upgrades. Talking to my fellow DBA-types, the pain of migration and the number of bugs that come and go from version to version, and the mix of CRS, RDBMS and ASM versions, is taking up massive amounts of their time. Dataguard is way simpler and I am willing to bet that, for 99.9% of businesses, other IT factors cause costly system outages an order of magnitude more often than the difference between what a good MAA Dataguard solution can provide and what a good stretched RAC one can.

I think we are already almost at the point where most "big" systems that use SAN or similar storage don't need to be big. If you need hundreds of systems, you can virtualize them onto a small number of "everything local" boxes.

A reason I can see it not happening is cost. The solution would just be too cheap; hardware suppliers will resist it because, hell, how can you charge hundreds of thousands of USD for what is in effect a PC on steroids? But desktop games machines will soon have everything 99% of business systems need except component redundancy and, if your backups are on fast SSD and you have a way simpler Active/Passive/MAA dataguard-type configuration (or the equivalent for your RDBMS technology) rather than RAC and clustering, you don't need that total redundancy. Dual power supply and a spare chunk of solid-state you can swap in for a failed RAID 10 element is enough.