Free Webinar – How Oracle Works! September 15, 2017

Posted by mwidlake in Architecture, internals, Knowledge, Presenting.
3 comments

Next Tuesday (19th September) I am doing a free webinar for ProHuddle. It lasts under an hour and is an introduction to how some of the core parts of the Oracle RDBMS work. I call it “The Heart of Oracle: How the Core RDBMS Works”. Yes, I try and explain all of the core Oracle RDBMS in under an hour! I’m told I just about manage it. You can see details of the event and register for it here. I’ve done this talk a few times at conferences now and I really like doing it, partly as it seems to go down so well and people give me good feedback about it (and occasionally bad feedback, but I’ll get on to that).

The idea behind the presentation is not to do the usual “Intro” and list what the main Oracle operating system processes – SMON, PMON, RECO etc – are or what the various components of the shared memory do. I always found those talks a little boring and they do not really help you understand why Oracle works the way it does when you use it. I aim to explain what redo is, why it is so important, what actually happens when you commit, how data is written to storage and read from it into the cache – and what is actually put in the buffer cache. I explain the concept of point-in-time view, how Oracle does it and why it is so fantastic. And a few other bits and pieces.

I’m not trying to explain to people the absolute correct details of what goes on with all these activities that the database does for you. I’m attempting to give people an understanding of the principles so that more advanced topics make more sense and fit together. The talk is, of course, aimed at people who are relatively new to Oracle – students, new DBAs or developers who have never had explained to them why Oracle works the way it does. But I have found that even some very experienced DBA-types have learnt the odd little nugget of information from the talk.

Of course, in an hour there is only so much detail I can go into when covering what is a pretty broad set of topics. And I lie about things. I say things that are not strictly true, that do not apply if more advanced features of Oracle are used, or that ignore a whole bucket full of exceptions. But it’s like teaching astrophysics at school. You first learn about how the Sun is at the centre of the solar system, all the planets & moons revolve around each other due to gravity and the sun is hot due to nuclear fusion. No one mentions how the earth’s orbit varies over thousands and millions of years until you have the basics. Or that GPS satellites have to take into account the theory of relativity to be as accurate as they are. Those finer details are great to learn but they do not change the fundamental principles of planets going around suns and rocks falling out of the sky – and you need to know the simpler overall “story” to slot in the more complex information.

I talk about this picture.

I start off the talk explaining this simplification and I do try to indicate where people will need to dig deeper if they, for example, have Exadata – but with a webinar I am sure people will join late, drop in and out and might miss that. I must remember to keep reminding people I’m ignoring details. And amongst the audience will be people who know enough to spot some of these “simplifications” and I think the occasional person might get upset. Remember I mentioned the bad feedback? I got accosted at a conference once after I had done this talk by a couple of experts, who were really angry with me that I had said something that was not accurate. But they had missed the start of the talk and my warnings of simplification and did not seem to be able to understand that I would have needed half an hour to explain the details of that one thing that they knew – but I had only 50 minutes in total for everything!

As I said, this is the first Webinar I will have done. I am sure it will be strange for me to present with “no audience” and I’m sure I’ll trip up with the pointer and the slides at some point. I usually have some humour in my presentations but that might not work with no crowd feedback and a worldwide audience. We will see. But I am excited about doing it and, if it works, I may well offer to do more.

As a taster, I explain the above diagram. A lot. I mostly just talk about pictures, there will be very few “wordy” slides.

I invite you all to register for the talk – as I said, it is free – and please do spread the word.

click here to register for the Webinar

Friday Philosophy – “Technical Debt” is a Poor Term. Try “Technical Burden”? June 30, 2017

Posted by mwidlake in database design, development, Friday Philosophy, Management.
5 comments

Recently my friend Sabine Heimsath asked a few of us native English speakers what the opposite of “technical debt” was. My immediate reaction was to say:

I’d say (sarcastically) “proper development” or “decent designer” or even “what we did 25 bloody years ago when we were allowed to take pride in the software we created!”

But my next comment was less reactive and more considered. And that was to say that I did not like the phrase “Technical Debt”:

A debt is when you owe something to someone, to be paid back. You do not owe anything to someone when you build poor systems, you are actually creating a “technical burden” – something those in the future will need to live with and may eventually have to sort out. Those who created the bad app or design will probably not be the ones fixing it – as in paying the debt.

That is of course not always true. Some of us have had to end up fixing a poor solution that we implemented – usually implemented despite our protestations that it was a daft thing to do. But the usual case is that a badly thought-out solution is implemented in a rush, with little design, or with inadequate testing, because of a management pressure to be “agile” or “fast moving”. And it is done with cheap or over-stretched resource.

Also, “technical debt” to me sounds too organised and too easy to fix. If you have a financial debt, you simply pay it back with some interest. In almost all situations I have seen where there is a “technical debt”, the interest to pay – the extra effort and time – is considerably more than was saved in the first place. Sometimes it is more than the original cost of the whole project! Loan Shark territory.

When the poorly designed/implemented system falls over in a heap, sometimes the hard-pressed local staff lack the skills or bandwidth to fix it and “Experts” are called in to sort it out. And part of the time taken to fix it is the expert going “why in f**k did you ever think this was a good idea?” (Maybe using better terminology, but that is what they mean!). Being more serious, sometimes the largest slice of time is when as an “Expert” you have to persuade the people who own this mess that it really does need sorting out properly, not just another quick hack – and it really will take much, much more effort than what they originally saved by deciding to implement this fast & dirty. Sorry, I mean “lean & mean”.

This situation often has a secondary impact – it makes the people who originally implemented the solution look poor. And that may or may not be fair. I’ve seen many cases where the original staff (including myself) were forced to do things they did not like by timing constraints, lack of budget or simply the ridiculous demands by someone higher up the organisation who thought simply shouting and demanding would make good things happen. They don’t, you end up creating a burden. Though I have also seen poor solutions because the original team were poor.

I think at the moment a lot of what is called “systems development” is more like a desperate drive to constantly cut corners and do things quicker. If it goes wrong, it’s just a debt, we pay it back. No, no it is not. It’s often a bloody mess and a Burden for years. I keep hoping that, like many things in I.T., this will be a phase we can cycle out of and back into doing proper planning and implementation again. Yes, anything that speeds things up without losing planning/design is great. And if you have the skills, you can do proper Agile, designing the detail as you go – IF you have the general over-arching design already in place. But I wish there was more consideration of the later cost of quick & dirty.

So what was the term Sabine wanted? Well, I can’t speak for her, I am not 100% sure what she was looking for. But from my perspective, we should not say “Technical Debt” but “Technical Burden”. And the opposite might be “Technical Investment”. You spend a bit of time and effort now in considering how you can create a solution that can expand or is flexible. I know from my own personal experience that it is when you are given the chance to do those things that you provide a solution that lasts and lasts and lasts. Only when I have been allowed to properly consider the business need do I create something still used in 10 years. Or 15. Or maybe even 20. That might need some checking!

So, if you really want to build systems to support a successful business, and not a short-lived flash, perhaps you should be saying something like:

You are asking me to create a Technical Burden. Would you rather not help me create a Technical Investment?

If nothing else, you might at least be adding a new entry to “Buzzword Bingo”.

Why oh Why Do We Still Not Have a Fast Bulk “SQL*Unloader” Facility? December 1, 2016

Posted by mwidlake in Architecture, database design, performance.
6 comments

Way back in 2004 I was working at the UK side of the Human Genome project. We were creating a massive store of DNA sequences in an Oracle database (this was one of two world-wide available stores of this information, for free & open use by anyone {* see note!}). The database was, for back then, enormous at 5-6TB. And we knew it would approx double every 12 months (and it did, it was 28TB when I had to migrate it to Oracle 10 in 2006, over 40TB 6 months later and grew to half a petabyte before it was moved to another organisation). And we were contemplating storing similar massive volumes in Oracle – Protein, RNA and other sequence stores, huge numbers of cytological images (sorry, microscope slides).

I did my little bit to work out how we all tick

Significant chunks of this data were sometimes processed in specialist ways using Massively Parallel Processing farms (hundreds or thousands of compute nodes so we could do “2 years of compute” in a day or even a few hours). Oracle was never going to be able to support this within the database.

But Oracle had no fast, bulk data unloading offering, a “SQL*Unloader”. Other RDBMSs did. And that gave me an issue. I had to argue as to why hold this data in Oracle where it was hard to yank out chunks of it rather than keep it in say a simple file store or MySQL?

That year I was invited to Oracle Open World and I was given a slot with a senior VP in the RDBMS/OCFS area (as that is what we were using – I know! But we were). I had been warned by someone that he could be a little “robust”. In fact my friend called him “Larry’s Rottweiler”. However my chat with the SVP went fine. Until he asked me for the one thing I would really want and I said “well, it’s been promised for the last 2 or 3 versions of Oracle, but it still does not exist in V10 – I’d like a SQL*Unloader so we can rapidly extract bulk data from Oracle into a structured file format”. He went very cold. I continued: “We need to do really intensive processing of the data with ‘C’ and if it is locked into the database we can’t do that. It’s hard for me to justify using Oracle as our primary data store if the data is now stuck there…”

I honestly thought he would bite!

He went ballistic. He was furious, he was raising his voice, leaning forward, I honestly thought he might hit me. I can’t remember the whole rant but one line stuck: “We will NEVER help you get your data out of Oracle!”. I ended the discussion as soon as I could – which obviously suited him fine also.

And here we are over 10 years later and no SQL*Unloader has appeared. And, as far as I can see, there is no longer even a suggestion that “one day” one will exist. You can obviously get data out of Oracle using SQL*Plus, SQLcl or PL/SQL’s UTL_FILE. But you have to write your own code and to make it parallel or fast is not that simple. And there are some commercial tools out there to do it. But fundamentally we do not have a simple, robust & fast data unloader tool within the Oracle toolset.
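To make the "you have to write your own code" point concrete, here is a minimal sketch of the sort of unloader people end up writing. It is not an Oracle tool and it is certainly not fast in the SQL*Unloader sense. It assumes a Python DB-API connection, and for the sake of being runnable it uses the standard library's `sqlite3` as a stand-in for an Oracle driver. The function names and the `dna_seq` table are made up for illustration.

```python
import csv
import os
import sqlite3
import tempfile


def unload_query(conn, sql, out_path, batch_size=10_000):
    """Stream the result of `sql` into a CSV file; return the number of rows written."""
    cur = conn.cursor()
    cur.execute(sql)
    rows_written = 0
    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        # Header row: column names from the DB-API cursor description.
        writer.writerow([col[0] for col in cur.description])
        while True:
            # fetchmany keeps memory use flat however large the table is.
            batch = cur.fetchmany(batch_size)
            if not batch:
                break
            writer.writerows(batch)
            rows_written += len(batch)
    return rows_written


def demo_unload():
    """Build a tiny stand-in table, unload it, and return (row_count, csv_lines)."""
    conn = sqlite3.connect(":memory:")  # stand-in for a real Oracle connection
    conn.execute("CREATE TABLE dna_seq (id INTEGER PRIMARY KEY, bases TEXT)")
    conn.executemany(
        "INSERT INTO dna_seq (id, bases) VALUES (?, ?)",
        [(i, "ACGT" * i) for i in range(1, 6)],
    )
    out_path = os.path.join(tempfile.mkdtemp(), "dna_seq.csv")
    count = unload_query(
        conn, "SELECT id, bases FROM dna_seq ORDER BY id", out_path, batch_size=2
    )
    with open(out_path, newline="") as fh:
        lines = fh.read().splitlines()
    conn.close()
    return count, lines
```

Even this toy shows where the effort goes. Making it parallel means splitting the table into key ranges, running one of these per range on its own connection and stitching the files back together, and that is exactly the part that is "not that simple".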

But I remain mystified as to why Oracle Corp will not provide a tool to do this. I can see the argument that doing so would help you migrate your Oracle Data set to a different platform and thus move away from Oracle, but I think that is a dumb argument. If you are willing to dump your Oracle environment for something else you are already not happy – and making life difficult is only going to make you less happy and almost certain to never come back! It’s incredibly inconvenient to extract all your data at present but compared to the cost to your organisation of changing a central part of your infrastructure, it’s just a bloody annoyance. It’s like domestic service providers (telephone, internet, TV) that make it easy for you to sign up but a nightmare to leave. Thus guaranteeing that once you leave (cos you will) you will certainly NOT be going back to them in a hurry!

So for the sake of what I see as a misplaced protectionist stance they close the door to rapidly extracting data from Oracle databases for processing in other ways. I’ve come across other situations like this, before and since, but the Human Genome issue threw it into sharp relief for me. The end result I’ve seen a few times (and effectively what we had at the Sanger) is the data gets stored at least twice – once in Oracle and then somewhere else where it is easy to access. Now that’s a daft and bad place to be, multiple copies and one of them at least lacking RI. Thanks for forcing that one on us Oracle.

Something that is changing is Cloud. Oracle wants us all to move our data and systems into the sky. Their whole marketing message at the moment is nothing but cloud-cloud-cloud and they give the impression that cloud is in fact everything, every solution. So how do you get 10TB of data into the cloud? It’s not so hard really. You can trickle it in over time, after all networks are constantly getting faster, but for bulk data this is a poor solution. Or you can ship your data physically. You can’t beat the bandwidth of a transit van. I’ve heard Oracle people at OOW and other conferences saying this is easy-peasy to arrange (as it should be, I’ve done it for clients doing migrations a couple of times).
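A rough back-of-envelope sketch of the trickle-versus-van argument follows. The 10TB figure is from the paragraph above; the link speeds and the 24-hour van journey are assumptions picked purely for illustration, and the sums use decimal terabytes and ignore protocol overhead.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_tb, link_mbit_per_s):
    """Days needed to push data_tb (decimal TB) over a link running flat out."""
    bits = data_tb * 1e12 * 8
    return bits / (link_mbit_per_s * 1e6) / SECONDS_PER_DAY


def shipping_mbit_per_s(data_tb, transit_hours):
    """Effective bandwidth of physically moving data_tb in transit_hours."""
    bits = data_tb * 1e12 * 8
    return bits / (transit_hours * 3600) / 1e6


# Assumed link speeds, not measurements of any real network.
print(round(transfer_days(10, 100), 2))        # 10TB over 100 Mbit/s, in days
print(round(transfer_days(10, 1000), 2))       # 10TB over 1 Gbit/s, in days
print(round(shipping_mbit_per_s(10, 24), 1))   # 10TB in a van for 24 hours
```

On those assumptions 10TB takes over nine days on a saturated 100 Mbit/s line, while a van that takes a full day to arrive is worth a little over 900 Mbit/s sustained. The same sums apply in the other direction when you want the data back out.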

But how are you going to get it out again? How well do you think that a company that has never provided SQL*Unloader is going to help you get bulk data back out of your Oracle cloud-based databases into on-premises systems? And how well are they going to help you shift it from one of their systems to another cloud system? One they have not sold you? Sometimes people talk about business relationships being like a marriage. Well once you have gone cloud, you might have ensured any divorce is going to be really messy!

Update – see the comment by Noons on the fact that VLDBs and Data Warehouses are often not as static as you think – so there can be a constant need to move a lot of data into and out of such very large data stores. Cloud is not currently going to support this well. And if you have a data life-cycle management policy, say archiving off data 3+ years old, where are you archiving it off to and how will you do this once it is in the web?

* Please note, this was not human DNA sequence information gathered for medical reasons! It was mostly sequence information for other organisms (chimps, flies, potatoes, rice, viruses, bacteria, all sorts) generated by the scientific and, in some cases, biopharma community. There was human DNA sequence in there but by consent and anonymised – you can’t have a human genome sequence without involving some humans! We did not hold your Dodgy Uncle’s DNA test that was taken for that court case…

Friday Philosophy – The Singular Stupidity of the Sole Solution April 22, 2016

Posted by mwidlake in Architecture, Exadata, Friday Philosophy, Hardware.
13 comments

I don’t like the ‘C’ word, it’s offensive to some people and gets used way too much. I mean “cloud” of course. Across all of I.T. it’s the current big trend that every PR department seems to feel the need to trumpet about and it’s what all Marketing people are trying to sell us. I’m not just talking Oracle here either, read any computing, technical or scientific magazine and there are the usual ads by big I.T. companies like IBM and they are all pushing clouds (and the best way to push a cloud is with hot air). And we’ve been here before so many times. It’s not so much the current technical trend that is the problem, it is the obsession with the one architecture as the solution to fit all requirements that is damaging.

No clouds here yet

When a company tries to insist that X is the answer to all technical and business issues and promotes it almost to the exclusion of anything else, it leads to a lot of customers being disappointed when it turns out that the new golden bullet is no such thing for their needs. Especially when the promotion of the solution translates to a huge push in sales of it, irrespective of fit. Technicians get a load of grief from the angry clients and have to work very hard to make the poor solution actually do what is needed or quietly change the solution for one that is suitable. The sales people are long gone of course, with their bonuses in the bank.

But often the customer confidence in the provider of the solution is also long gone.

Probably all of us technicians have seen it, some of us time after time and a few of us rant about it (occasionally quite a lot). But I must be missing something, as how can an organisation like Oracle or IBM not realise they are damaging their reputation? But they do it in a cyclical pattern every few years, so whatever they gain by mis-selling these solutions is somehow worth the abuse of the customer – as that is what it is. I suppose the answer could be that all large tech companies are so guilty of this that customers end up feeling it’s a choice between a list of equally dodgy second hand car salesmen.

Looking at the Oracle sphere, when Exadata came along it was touted by Oracle Sales and PR as the best solution – for almost everything. Wrongly. Utterly and stupidly wrongly. Those of us who got involved in Exadata with the early versions, especially I think V2 and V3, saw it being implemented for OLTP-type systems where it was a very, very expensive way of buying a small amount of SSD. The great shame was that the technical solution of Exadata was fantastic for a sub-set of technical issues. All the clever stuff in the storage cell software and maximizing hardware usage for a small number of queries (small sometimes being as small as 1) was fantastic for some DW work with huge full-segment-scan queries – and of no use at all for the small, single-account-type queries that OLTP systems run. But Oracle just pushed and pushed and pushed Exadata. Sales staff got huge bonuses for selling them and the marketing teams seemed incapable of referring to the core RDBMS without at least a few mentions of Exadata.

Like many Oracle performance types, I ran into this mess a couple of times. I remember one client in particular who had been told Exadata V2 would fix all their problems. I suspect based solely on the fact it was going to be a multi-TB data store. But they had an OLTP workload on the data set and any volume of work was slaying the hardware. At one point I suggested that moving a portion of the workload onto a dirt cheap server with a lot of spindles (where we could turn off archive redo – it was a somewhat unusual system) would sort them out. But my telling them a hardware solution 1/20th the cost would fix things was politically unacceptable.

Another example of the wonder solution is Agile. Agile is fantastic: rapid, focused development, that gets a solution to a constrained user requirement in timescales that can be months, weeks, even days. It is also one of the most abused terms in I.T. Implementing Agile is hard work, you need to have excellent designers, programmers that can adapt rapidly and a lot, and I mean a LOT, of control of the development and testing flow. It is also a methodology that blows up very quickly when you try to include fix-on-fail or production support workloads. It also goes horribly wrong when you have poor management, which makes the irony that it is often implemented when management is already failing even more tragic. I’ve seen 5 agile disasters for each success, and on every project there are the shiny-eyed Agile zealots who seem to think just implementing the methodology, no matter what the aims of the project or the culture they are in, is guaranteed success. It is not. For many IT departments, Agile is a bad idea. For some it is the best answer.

Coming back to “cloud”, I think I have something of a reputation for not liking it – which is not a true representation of my thoughts on it, but is partly my fault as I quickly tired of the over-sell and hype. I think some aspects of cloud solutions are great. The idea that service providers can use virtualisation and container technology to spin up a virtual server, a database, an application, an application sitting in a database on a server, all in an automated manner in minutes, is great. The fact that the service provider can do this using a restricted number of parts that they have tested integrate well means they have a way more limited support matrix and thus better reliability. With the Oracle cloud, they are using their engineered systems (which is just a fancy term really for a set of servers, switches, network & storage configured in a specific way with their software configured in a standard manner) so they can test thoroughly and not have the confusion of a type of network switch being used that is unusual or a flavor of linux that is not very common. I think these two items are what really make cloud systems interesting – fast, automated provisioning and a small support matrix. Being available over the internet is not such a great benefit in my book as that introduces reasons why it is not necessarily a great solution.

But right now Oracle (amongst others) is insisting that cloud is the golden solution to everything. If you want to talk at Oracle Open World 2016 I strongly suspect that not including the magic word in the title will seriously reduce your chances. I’ve got some friends who are now so sick of the term that they will deride cloud, just because it is cloud. I’ve done it myself. It’s a potentially great solution for some systems, ie running a known application that is not performance critical that is accessed in a web-type manner already. It is probably not a good solution for systems that are resource heavy, have regulations on where the data is stored (some clinical and financial data cannot go outside the source country no matter what), alter rapidly or are business critical.

I hope that everyone who uses cloud also insists that the recovery of their system from backups is proven beyond doubt on a regular basis. Your system is running on someone else’s hardware, probably managed by staff you have no say over and quite possibly with no actual visibility of what the DR is. No amount of promises or automated mails saying backups occurred is a guarantee of recovery reliability. I’m willing to bet that within the next 12 months there is going to be some huge fiasco where a cloud services company loses data or system access in a way that seriously compromises a “top 500” company. After all, how often are we told by companies that security is their top priority? About as often as they mess it up and try to embark on a face-saving PR exercise. So that would be a couple a month.

I just wish Tech companies would learn to be a little less single solution focused. In my book, it makes them look like a bunch of excitable children. Give a child a hammer and everything needs a pounding.

The “as a Service” paradigm. October 27, 2015

Posted by mwidlake in Architecture, Hardware, humour.
4 comments

For the last few days I have been at Oracle Open World 2015 (OOW15) learning about the future plans and directions for Oracle. I’ve come to a striking realisation, which I will reveal at the end.

The message being pressed forward very hard is that of compute services being provided “As A Service”. This now takes three flavours:

  1. Being provided by a 3rd party’s hardware via the internet, ie in The Cloud.
  2. Having your own hardware controlled and maintained by you but providing services with the same tools and quick-provisioning ideology as “cloud”. This is being called On Premise (or just “On Prem” if you are aiming to annoy the audience), irrespective of the probable inaccuracy of that label (think hosting & dedicated compute away from head office).
  3. A mix of the two where you have some of your system in-house and some of it floating in the Cloud. This is called Hybrid Cloud.

There are many types of “as a Service” offerings, the main ones probably being:

  • SaaS – Software as a Service
  • PaaS – Platform as a Service
  • DBaaS – Database as a Service
  • IaaS – Infrastructure as a Service

Whilst there is no denying that there is a shift of some computer systems being provided by any of these, or one of the other {X}aaS offerings, it seems to me that what we are really moving towards is providing the hardware, software, network and monitoring required for an IT system. It is the whole architecture that has to be considered and provided and we can think of it as Architecture as a Service or AaaS. This quick provisioning of the architecture is a main win with Cloud, be it externally provided or your own internal systems.

We all know that whilst the provision time is important, it is really the management of the infrastructure that is vital to keeping a service running, avoiding outages and allowing for upgrades. We need a Managed Infrastructure (what I term MI) to ensure the service provided is as good as or better than what we currently have. I see this as a much more important aspect of Cloud.

Finally, it seems to me that the aspects that need to be considered are more than initially spring to mind. Technically the solutions are potentially complex, especially with hybrid cloud, but also there are complications of a legal, security, regulatory and contractual aspect. If I have learnt anything over the last 2+ decades in IT it is that complexity of the system is a real threat. We need to keep things simple where possible – the old adage of Keep It Simple, Stupid is extremely relevant.

I think we can sum up the whole situation by combining these three elements of architecture, managed infrastructure and simplicity into one encompassing concept, which is:

KISS MI AaaS.

.

.

And yes, that was a very long blog post for a pretty weak joke. 5 days of technical presentations and non-technical socialising does strange things to your brain.

Friday Philosophy – Building for the Future August 14, 2015

Posted by mwidlake in Architecture, development, Friday Philosophy.
2 comments

I started my Oracle working life as a builder – a Forms & Reports Builder (briefly on SQL*Forms V2.3 but thankfully within a month or two we moved up to SQL*Forms V3, SQL*reportwriter V1.1 and SQL*Menu 5 – who remembers SQL*Menu?). Why were we called Builders? I guess as you could get a long way with those tools by drawing screens, utilising the (pretty much new) RI in the underlying Oracle V7 to enforce simple business rules and adding very simple triggers – theoretically not writing much in the way of code. It was deemed to be more like constructing stuff out of bits I guess. But SQL*Forms V3 had PL/SQL V1 built in and on that project we used it a *lot*.

I had been an “Analyst Programmer” for 3 years before then and I’ve continued to be a developer/programmer/constructor-of-code on and off over the intervening couple of decades. I’m still a developer at times. But sometimes I still think of it as being a “builder” as, if you do it write {sorry, little word-play joke there} you are using bits of existing stuff and code designs/patterns you know work well and constructing your system. The novel part, the bit or bits that have never been done before (at least by me), the “architecting” of those units into something interestingly different or the use of improved programming features or techniques vary from almost-none to a few percent. That is the part which I have always considered true “Software Development”.

So am I by implication denigrating the fine and long-standing occupation of traditional builders? You know, men and women who know what a piece of two-by-four is and put up houses that stay put up? No. Look at the below.
House_and_odd_feature

This is part of my neighbour Paul’s house. He is a builder and the black part in the centre with the peaked roof is an extension he added a few years back, by knocking his garage down. The garage was one of three, my two were where the garage doors you can see are and to the left. So he added in his two-story extension, with kitchen below and a very nice en-suite bedroom above, between his house and my ratty, asbestos-riddled garages. Pretty neat. A few years later he knocked down my garages and built me a new one with a study on top (without the asbestos!) and it all looks like it was built with his extension. Good eh? But wait, there is more. You will have noticed the red highlight. What is that white thing?

Closer in - did he forget some plumbing?


This pipe goes clean through the house

When I noticed that white bit after Paul had finished his extension I figured he had planned more plumbing than he put in. I kept quiet. Then, when he had built my new garage and study, I could not help asking him about the odd plumbing outlet. So he opened it. And it goes through the dividing wall all the way through to the other side of the house. Why?

“Well Martin, putting in cables and pipes and s**t into an existing house that go from one side to the other, especially when there is another building next door, is a real pain in the a**e. It does my ‘ead in. So when I build something that is not detached, I put in a pipe all the way through. Now if I need to run a cable from one side of the house to the other, I have my pipe and I know it is straight, clean, and sloping ever so slightly downwards”. Why downwards? “Water Martin. You don’t want water sitting in that pipe!”.

I’ve noticed this about builders. When I’ve had work done that is good, there is at least one person on the team who thinks not just about how to erect or do what needs to be done today, they do indeed think about what you will need after the build is done, or in a few years. Such as hanging doors so they do not smack into the cupboards you will put in next… *sigh*. Paul is the thinking guy in his little team. I suspect one of the others is pretty smart too.

But isn’t this what the architect is for? To think about living with the building? Well, despite the 7 years plus needed to become a true architect (as that term really means, not as some stolen label for software designers with too much ego) I’ve had builders spot the pragmatic needs a couple of times that the architect missed.

And as I think we would all agree, a good software developer always has an eye on future maintenance and modification of the software they develop. And they want to create something that fits in the existing system and looks right. Just like my builder neighbour does.

I’m not a software architect. I’m a code builder. And I’m proud of it.

With Modern Storage the Oracle Buffer Cache is Not So Important. May 27, 2015

Posted by mwidlake in Architecture, Hardware, performance.

With Oracle’s move towards engineered systems we all know that “more” is being done down at the storage layer and modern storage arrays have hundreds of spindles and massive caches. Does it really matter if data is kept in the Database Buffer Cache anymore?

Yes. Yes it does.

Time for a cool beer

With much larger data sets and the still-real issue of fewer disk spindles per GB of data, the Oracle database buffer cache is not so important as it was. It is even more important.

I could give you some figures but let’s put this in a context most of us can easily understand.

You are sitting in the living room and you want a beer. You are the oracle database, the beer is the block you want. Going to the fridge in the kitchen to get your beer is like you going to the Buffer Cache to get your block.

It takes 5 seconds to get to the fridge, 2 seconds to pop it open with the always-to-hand bottle opener and 5 seconds to get back to your chair. 12 seconds in total. Ahhhhh, beer!!!!

But – what if there is no beer in the fridge? The block is not in the cache. So now you have to get your car keys, open the garage, get the car out and drive to the shop to get your beer. And then come back, pop the beer in the fridge for half an hour and now you can drink it. That is like going to storage to get your block. It is that much slower.

It is only that much slower if you live 6 hours drive from your beer shop. Think taking the scenic route from New York to Washington DC.

The difference in speed really is that large. If your data happens to be in the memory cache in the storage array, that’s like the beer already being in a fridge – in that shop 6 hours away. Your storage is SSD-based? OK, you’ve moved house to Philadelphia, 2 hours closer.

Let’s go get beer from the shop

To back this up, some rough (and I mean really rough) figures. Access time to memory is measured in anything from Microseconds (“us” – millionths of a second) down to hundreds of Nanoseconds (“ns” – billionths of a second). Somewhere around 500ns seems to be an acceptable figure. Access to disc storage is more like Milliseconds (“ms” – thousandths of a second). Go check an AWR report or statspack or OEM or whatever you use, you will see that db file scattered reads are anywhere from the low teens down to say 2 or 3 ms, depending on what your storage and network are. For most sites, that speed has hardly altered in years as, though hard discs get bigger, they have not got much faster – and often you end up with fewer spindles holding your data as you get allocated space not spindles from storage (and the total sustainable speed of hard disc storage is limited to the total speed of all the spindles involved). Oh, the storage guys tell you that your data is spread over all those spindles? Then so is the data for every other system, so you have maximum contention.

However, memory speed has increased over that time, and so has CPU speed (though CPU speed has really stopped improving now, it is more down to More CPUs).

Even allowing for latching and pinning and messing around, accessing a block in memory is going to be at the very least 1,000 times faster than going to disc, maybe 10,000 times. Sticking to a conservative 2,000 times faster for memory than disc, that 12 seconds trip to the fridge equates to 24,000 seconds driving. That’s about 6.67 hours.
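If you want to play with the numbers yourself, here is a minimal sketch of that sum in Python. The 12 second fridge trip and the 2,000 times ratio are the rough figures from above; the function and constant names are just mine for illustration.

```python
# Rough figures from the post: 5s to the fridge, 2s to open the bottle, 5s back.
FRIDGE_TRIP_SECONDS = 5 + 2 + 5

# The "conservative" disc-to-memory slowdown used in the post.
DISC_TO_MEMORY_RATIO = 2_000


def shop_trip_hours(fridge_seconds: float, ratio: float) -> float:
    """Scale the fridge trip by the disc-vs-memory slowdown and convert to hours."""
    return fridge_seconds * ratio / 3600


# The post's conservative case: prints 6.67 hours.
print(f"{shop_trip_hours(FRIDGE_TRIP_SECONDS, DISC_TO_MEMORY_RATIO):.2f} hours")

# The same trip across the whole range quoted above (1,000x to 10,000x).
for ratio in (1_000, 2_000, 10_000):
    hours = shop_trip_hours(FRIDGE_TRIP_SECONDS, ratio)
    print(f"{ratio:>6}x slower -> {hours:.2f} hours to the shop and back")
```

At the optimistic end (1,000 times) the beer run is still over three hours; at 10,000 times it is more than a day.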

This is why you want to avoid physical IO in your database if you possibly can. You want to maximise the use of the database buffer cache as much as you can, even with all the new Exadata-like tricks. If you can’t keep all your working data in memory, in the database buffer cache (or in-memory or use the results cache) then you will have to do that achingly slow physical IO and then the intelligence-at-the-hardware comes into its own, true Data Warehouse territory.
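To see why even a few trips to the shop hurt, here is a rough sketch of the average time per block for a given fraction of reads found in the buffer cache. It is a deliberately naive model and the assumptions are mine: every read is either a pure memory hit at the rough 500ns figure or a pure disc read at 2,000 times that, with no storage-array cache, no contention and no multi-block reads.

```python
MEMORY_ACCESS_SECONDS = 500e-9                        # ~500ns, the rough memory figure
DISC_ACCESS_SECONDS = MEMORY_ACCESS_SECONDS * 2_000   # conservative 2,000x, i.e. ~1ms


def average_access_seconds(hit_ratio: float) -> float:
    """Average time per block when hit_ratio of reads are served from memory."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * MEMORY_ACCESS_SECONDS + (1.0 - hit_ratio) * DISC_ACCESS_SECONDS


for hit_ratio in (1.0, 0.99, 0.9, 0.5):
    average = average_access_seconds(hit_ratio)
    slowdown = average / MEMORY_ACCESS_SECONDS
    print(f"hit ratio {hit_ratio:.2f}: {average * 1e6:8.2f} us per block, "
          f"{slowdown:7.1f}x slower than all-in-memory")
```

In this toy model, missing the cache on just 1 read in 100 makes the average block access about 21 times slower than having everything in memory.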

So the take-home message is – avoid physical IO, design your database and apps to keep as much as you can in the database buffer cache. That way your beer is always to hand.

Cheers.

Update. Kevin Fries commented to mention this wonderful little latency table. Thanks Kevin.

“Here’s something I’ve used before in a presentation. It’s from Brendan Gregg’s book – Systems Performance: Enterprise and the Cloud”

An Oracle Instance is Like An Upmarket Restaurant January 28, 2015

Posted by mwidlake in Architecture.

I recently did an Introduction to Oracle presentation, describing how the oracle instance worked – technically, but from a very high level. In it I used the analogy of a restaurant, which I was quite happy with. I am now looking at converting that talk into a set of short articles and it struck me that the restaurant analogy is rather good!

Here is a slide from the talk:

Simple partial overview of an Oracle Instance

As a user of the oracle instance, you are the little, red blob at the bottom left. You (well, your process, be it SQL*Plus, SQL Developer, a Java app or whatever) do nothing to the database directly. It is all done for you by the Oracle Server Process – and this is your waiter.

Now, the waiter may wait on many tables (Multi-threaded server) but this is a very posh restaurant, you get your own waiter.

You ask the waiter for food and the waiter goes off and asks the restaurant to provide it. There are many people working in the restaurant, most of them doing specific jobs and they go off and do whatever they do. You, the customer, have no idea who they are or what they do and you don’t really care. You don’t see most of them. You just wait for your food (your SQL results) to turn up. And this is exactly how an Oracle Instance works. Lots of specific processes carry out their own tasks but they are coordinated and they do the job without most of us having much of an idea what each bit does. Finally, some of the food is ready and the waiter delivers the starter to you – The server process brings you the first rows of data.

Let’s expand the analogy a bit, see how far we can take it.

When you arrive at the restaurant, the Maître d’ greets you and allocates you to your waiter. This is like the Listener process waiting for connection requests and allocating you a server process. The Listener Process listens on a particular port, which is the front door to the restaurant. When you log onto an oracle database your session is created, ie your table is laid. If someone has only just logged off the database their session might get partially cleared and re-used for you (you can see this as the SID may well get re-used), as creating a session is a large task for the database. If someone had just left the restaurant that table may have a quick brush down and the cutlery refreshed, but the table cloth, candle and silly flower in a vase stay. Completely stripping a table and re-laying it takes more time.

The restaurant occupies a part of the building, the database occupies part of the server. Other things go in the server, the restaurant is in a hotel.

The PMON process is the restaurant manager or Head of House maybe and SMON is the kitchen manager, keeping an eye on the processes/staff and parts of the restaurant they are responsible for. To be candid, I don’t really know what PMON and SMON do in detail and I have no real idea how you run a large kitchen.

There are lots of other processes, these are equivalent to the Sous-chef, Saucier, commis-chef, Plongeur (washes up, the ARC processes maybe?), Rôtisseur, Poissonnier, Pâtissier etc. They just do stuff, let’s not worry about the details, we just know there are lots of them making it all happen and we the customer or end user never see them.

The PGA is the table area in the restaurant, where all the dishes are arranged and provided to each customer? That does not quite work as the waiter does not sit at our table and feed us.

The SGA is the kitchen, where the ingredients are gathered together and converted into the dishes – the data blocks are gathered in the block buffer cache and processed. The Block Buffer Cache are the tables and kitchen surfaces, where all the ingredients sit. The Library cache is, yes, the recipes. They keep getting re-used as our kitchen only does certain recipes, it’s a database with a set of standard queries. It’s when some fool orders off-menu that it all goes to pot.

Food is kept in the larder and fridges – the tablespaces on disc. You do not prepare the dishes in the larder or fridge, let alone eat food out of them (well, some of the oracle processes might nick the odd piece of cooked chicken or chocolate). Everything is brought into the kitchen {the SGA} and processed there, on the kitchen tables.

The orders for food are the requests for change – the redo deltas. Nothing is considered ordered until it is on that board in the kitchen, that is the vital information. All the orders are preserved (so you know what was ordered, you can do the accounts and you can re-stock). The archived redo. You don’t have to keep this information but if you don’t, it’s a lot harder to run the restaurant and you can’t find out what was ordered last night.

The SCN is the clock on the wall and all orders get the time they were placed on them, so people get their food prepared in order.

When you alter the ingredients, eg grate some of the Parmesan cheese into a sauce, the rest of the cheese (which, being an ingredient is in the SGA) is not put back into the fridge immediately, ie put back into storage. It will probably be used again soon. That’ll push it up the LRU list. Eventually someone will put it back, probably the Garçon de cuisine (the kitchen boy). A big restaurant will have more than one Garçon de cuisine, all with DBW1 to x written on the back of their whites, and they take the ingredients back to the larder or fridge when they get around to it – or are ordered to do so by one of the chefs.
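For those who prefer code to cheese, the idea can be sketched as a toy cache in Python. To be clear, this is my own illustration and not how Oracle actually implements the buffer cache: the class and method names are made up, the replacement policy is a plain least-recently-used list, and there is no redo at all, so it only shows the "changed blocks go back to storage later" behaviour.

```python
from collections import OrderedDict


class ToyBufferCache:
    """Toy model of a block cache: LRU replacement plus lazy write-back."""

    def __init__(self, capacity, storage):
        self.capacity = capacity
        self.storage = storage        # dict of block id -> contents (the larder)
        self.buffers = OrderedDict()  # cached blocks, least recently used first
        self.dirty = set()            # blocks changed in memory but not yet stored
        self.physical_reads = 0
        self.physical_writes = 0

    def _write_back(self, block_id):
        """The kitchen boy's job: put a changed block back into storage."""
        self.storage[block_id] = self.buffers[block_id]
        self.dirty.discard(block_id)
        self.physical_writes += 1

    def _load(self, block_id):
        """Make sure the block is cached and mark it as most recently used."""
        if block_id in self.buffers:
            self.buffers.move_to_end(block_id)
            return
        if len(self.buffers) >= self.capacity:
            victim = next(iter(self.buffers))  # least recently used block
            if victim in self.dirty:
                self._write_back(victim)
            del self.buffers[victim]
        self.buffers[block_id] = self.storage[block_id]
        self.physical_reads += 1

    def read(self, block_id):
        self._load(block_id)
        return self.buffers[block_id]

    def write(self, block_id, contents):
        """Change the cached copy only; storage is updated later."""
        self._load(block_id)
        self.buffers[block_id] = contents
        self.dirty.add(block_id)

    def checkpoint(self):
        """A chef orders everything changed to be put away now."""
        for block_id in list(self.dirty):
            self._write_back(block_id)


larder = {"parmesan": "whole block", "butter": "250g", "cream": "1 pint"}
kitchen = ToyBufferCache(capacity=2, storage=larder)

kitchen.write("parmesan", "partly grated")
print(larder["parmesan"])   # still "whole block": the change lives only in the kitchen

kitchen.read("butter")
kitchen.read("cream")       # kitchen is full, so the parmesan is put back first
print(larder["parmesan"])   # now "partly grated"
```

The second read of a block already in the kitchen costs no physical read, which is the whole point of the analogy.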

Can we pull in the idea of RAC? I think we can. We can think of it as a large hotel complex which will have several restaurants, or at least places to eat. They have their own kitchens but the food is all stored in the central store rooms of the hotel complex. I can’t think what can be an analogy of block pinging as only a badly designed or run restaurant would for example only have one block of Parmesan cheese – oh, maybe it IS a lot like some of the RAC implementations I have seen 🙂

What is the Sommelier (wine waiter) in all of this? Suggestions on a post card please.

Does anyone have any enhancements to my analogy?

How do you Explain Oracle in 50 Minutes? December 2, 2014

Posted by mwidlake in Architecture, conference, UKOUG.

I’ve done a very “brave”* thing. I’ve put forward a talk to this year’s UKOUG Tech14 conference titled “How Oracle Works – in under 50 minutes”. Yes, I really was suggesting I could explain to people how the core of Oracle functions in that time. Not only that, but the talk is aimed at those new to Oracle technology. And it got accepted, so I have to present it. I can’t complain about that too much, I was on the paper selection committee…

* – “brave”, of course, means “stupid” in this context.

As a result I am now strapped to the chair in front of my desk, preparing an attempt to explain the overall structure of an Oracle instance, how data moves in and out of storage, how ACID works and a few other things. Writing this blog is just avoidance behaviour on my part as I delay going back to it.

Is it possible? I’m convinced it is.

If you ignore all the additional bits, the things that not all sites use, such as Partitioning, RAC, Resource Manager, Materialized Views etc, etc, etc, then that removes a lot. And if not everyone uses it, then it is not core.
There is no need or intention on my part to talk about details of the core – for example, how the Cost Based aspect of the optimizer works, Oracle permissions or the steps needed for instance recovery. We all use those but the details are ignored by some people for their whole career {not usually people who I would deem competent, despite them holding down jobs as Oracle technicians, but they do}.

You are left with a relatively small set of things going on. Don’t get me wrong, it is still a lot of stuff to talk about and is almost certainly too much for someone to fully take in and digest in the time I have. I’m going to have to present this material as if I am possessed. But my intention is to describe a whole picture that makes sense and will allow people to understand the flow. Then, when they see presentations on aspects of it later in the conference, there is more chance it will stick. I find I need to be taught something 3 or 4 times. The first time simply opens my mind to the general idea, the second time I retain some of the details and the third or fourth time I start integrating it into what I already knew.

My challenge is to say enough so that it makes sense and *no more*. I have developed a very bad habit of trying to cram too much into a presentation and of course this is a real danger here. I’m trying to make it all visual. There will be slides of text, but they are more for if you want to download the talk after the conference. However, drawing pictures takes much, much, much longer than banging down a half dozen bullet points.

One glimmer in the dark is that there is a coffee break after my session. I can go right up to the wire and then take questions after I officially stop, if I am not wrestled to the ground and thrown out the room.

If anyone has any suggestions or comments about what I should or should not include, I’d love to hear them.

This is all part of my intention to provide more conference content for those new to Oracle. As such, this “overview” talk is at the start of the first day of the main conference, 10am Monday. I have to thank my fellow content organisers for allowing me to stick it in where I wanted it. If you are coming to the conference and don’t know much Oracle yet – then I am amazed you read my blog (or any other blog other than maybe AskTom). But if you have colleagues or friends coming who are still relatively new to the tech, tell them to look out for my talk. I really hope it will help them get that initial understanding.

I had hoped to create a fully fledged thread of intro talks running through all of Monday and Tuesday, but I brought the idea up too late. We really needed to promote the idea at the call for papers and then maybe source a couple of talks. However, using the talks that were accepted we did manage to get a good stab at a flow of intro talks through Monday. I would suggest:

  • 08:50 – Welcome and Introduction
    • Get there in time for the intro if you can, as if you are newish to the tech you are probably newish to a conference
  • 09:00 – RMAN the basics, by Michael Abbey.
    • If you are a DBA type, backup/recovery is your number one concern.
  • 10:00 – How Oracle Works in 50 Minutes
    • I think I have said enough!
  • 11:30 – All about Joins by Tony Hasler
    • Top presenter, always good content
  • 12:30 – Lunch. Go and talk to people, lots of people, find some people you might like to talk with again. *don’t stalk anyone*
  • 13:20 – Go to the Oracle Keynote.
    • Personally, I hate whole-audience keynotes, I am sick of being told every year how “there has never been a better time to invest in oracle technology” – but this one is short and after it there is a panel discussion by technical experts.
  • 14:30 is a bit tricky. Tim Hall on Analytical Functions is maybe a bit advanced, but Tim is a brilliant teacher and it is an intro to the subject. Failing that, I’d suggest the Oracle Enterprise Manager round table hosted by Dev Nayak as Database-centric oracle people should know OEM.
  • 16:00 – Again a bit tricky for someone new but I’d plump for The role of Privileges and Roles in Oracle 12C by Carl Dudley. He lectures (lectured?) in database technology and knows his stuff, but this is a New Feature talk…
  • 17:00 – Tuning by Explain Plan by Arian Stijf
    • This is a step-by-step guide to understanding the most common tool used for performance tuning
  • 17:50 onwards – go to the exhibition drinks, the community drinks and just make friends. One of the best things to come out of conferences is meeting people and swapping stories.

I better get back to drawing pictures. Each one takes me a day and I need about 8 of them. Whoops!

Friday Philosophy – The Importance of Context November 23, 2012

Posted by mwidlake in Friday Philosophy, Perceptions.

A couple of weeks ago I was making my way through the office. As I came towards the end of the large, open-plan room I became aware that there was someone following behind me so, on passing through the door I held it briefly for the person behind me {there was nowhere else they could be going}, turned left and through the next door – and again held it and this time looked behind me to see if the person was still going the same way as I. The lady behind gave me the strangest look.

The strange look was reasonable – the door I’d just held for her was the one into the gentleman’s bathroom. *sigh*

I was doing the correct thing, I was attempting to be helpful to a fellow person, I was in fact being very polite. But because I had utterly failed to consider the context, there is now a lady who works on the same floor as I who considers me, at best, as strange. At worst she thinks I am very strange – and more than a little creepy. I fear the latter given her reaction when she saw me in the kitchen area recently and turned around. {By the way, if anyone can think of a good way I can clear this up I’d appreciate it. After all, I can’t exactly go up to her and say “sorry about holding the door to the gents for you the other day, I did not realise you were a woman”}.

My point is that you can do what you believe is the right thing but, because you are not thinking of the context or are unaware of the full situation, you end up giving utterly the wrong impression. I had a work situation like this a while back.

Without going into too much detail, I was working with a client on a data warehouse project. The Oracle database bulk-processed large quantities of data, did classic big-data queries and was sitting on some fairly expensive hardware with dedicated storage and the intention of implementing Dataguard. One of the issues they had was with a subsidiary part of the system that created a very large number of small transactions, lots of updates. High volume OLTP on a DW setup. It was hammering the storage and eating up all the available IO. The data for this subsidiary system was transient, no need to protect it.
I realised that the hardware was not correct for this subsidiary system and it needed no archived redo. Archiving redo is an all or nothing situation for an Oracle instance. So happy that I had worked out what to do I proposed {with a smile} moving the subsidiary system to its own database on its own hardware.
When I said this to the client, their response was a stony look and the comment “We’ve just spent a fortune on this platform……”. Having dug my hole I proceeded to jump right in there “It’s OK, what I am proposing is only about 5, 7 thousand pounds of kit – nothing compared to what you spent already!”. The client now got very, very annoyed indeed.

You see, the context is that they had been sold a system that was very expensive – it was to do a demanding job. They had been getting poor performance with the system and that is partly why I was there. They also did not really understand the technical nuances well (at least, not the chaps I was talking to) and they did not appreciate why I said what I did. From their perspective, this smiling loon was suggesting that a system costing 2-3% of what they had spent on their data warehouse platform was going to be able to do the processing that the expensive system could not. Either they had spent waaay too much, this new “expert” was an idiot or else I was lying to them. And they did not like any of those options.

Looking back it is clear I should have been more aware of how they would receive what I said. I’ve done this before {several times}, bounded into a situation like a wide-eyed puppy and gone “Look! We can just do that!” without considering things like upsetting the guy who had suggested the original solution, or making the on-site expert look stupid or blowing away a salesman’s pitch. Or that I have missed a glaring and valid reason why they can’t “just do that”.

I suspect that a few people would say “no, you just tell them the way it is and if they don’t like it or you upset someone then tough”. Well, maybe, but not if you want to be there to help fix the next problem. Also, I know I am not great at appreciating the context sometimes. That is part of why I will never run a company or be a senior manager, I lack the skills to judge the impact of what I propose or say sometimes, in my rush to be helpful. I am slowly learning to just hold back on ideas, though, and to run things past friends or colleagues with more “whole picture” skills first. I might be rubbish at it but I can learn I am rubbish at it.

In the case of the situation above, the expensive system was correct for what they wanted to do – and maybe not quite expensive enough. I was suggesting a slightly unusual fix for a specific problem and I should have been more laboured in explaining the problem and more circumspect in leading them to the solution. I should have taken more time.

I should have checked who was following me and where I was going before I held the door open.