
User Group Meetings Next Week (free training everyone!) July 11, 2014

Posted by mwidlake in Exadata, Meeting notes, UKOUG.

I know, posts about upcoming user group meetings are not exactly exciting, but it’s good to be reminded. You can’t beat a bit of free training, can you?

On Monday 14th I am doing a lightning talk at the 4th Oracle Midlands event. The main reason to come along is to see Jonathan Lewis talk about designing efficient SQL, and then he will also do a 10-minute session on Breaking Exadata (to achieve that aim I suggest you just follow the advice of the Oracle Sales teams – that will break Exadata for you pretty efficiently!).

If you are not familiar with the Oracle Midlands events, they are FREE evening events run in Birmingham, just north of the centre in Aston, at the Innovation Birmingham centre. See the web site for details and to register. The great thing about them being in the evening is you do not have to take time out from the day job to attend. The disadvantage is they are shorter, of course. (And for me personally, I feel morally obliged to pop in on my dear old Mother on the way and listen to her latest crazy theories. I think it’s what Mums are for.) Samosas were provided last time to keep you going and I know a couple of us will retire to a nearby pub after, to continue discussions.

I’ll be doing just a short talk, along with half a dozen others, my topic being “Is the optimizer getting too smart to be understood?”.

This is a user group in the truest sense of it, organised pretty much by one chap (Mike Mckay-Dirden) in his spare time, with help from interested people and some financial input from the sponsor Redgate.

I get a day off and then I am at the combined RAC CIA and Database SIG on Wednesday 16th. This UKOUG SIG is probably at the other end of the user group experience (in organisational and size terms). They both fulfil a need and I have no problem being involved in both. In fact, I now notice that Patrick Hurley is going to be at both events too.

The RAC CIA & Database SIG is also free, IF you are a member of the UKOUG. You can also attend if you pay a one-off fee. It’s an all-day event and as it is a combined SIG it is a two-track event. It’s almost a mini-conference! Presenters include myself (doing my intro to Exadata talk, probably for the last time), Julian Dyke, Patrick Hurley, Martin Bach, Neil Chandler, Neil Johnson, Martin Nash (twice!), Ron Ekins, John Jezewski, Alex Evans and David Kurtz. If that is not enough, Owen Ireland is going to give a support update and then we have Mike Appleyard giving a keynote on a brand new 12.1 feature, Oracle Database In-Memory option.

I’d be going along even if I was not presenting or helping run the RAC CIA SIG and I’m retired for goodness sake! (well, sort of, my wife has not ordered me back to the working life yet). And of course, we will no doubt retire to a hostelry after (to count how many Neils and Martins are involved).

I hope to see as many UK people as possible at these two days. As I said at the top, it’s free training, you can’t get better than that.

Friday Philosophy – Level of Presentations March 8, 2013

Posted by mwidlake in Exadata, Friday Philosophy, Perceptions, Presenting.

This FF is a bit of a follow-up to the one I posted last week on PL/SQL skills and a comment made by Noons on how much knowledge you need to be an OakTable member.

I have a question to answer and I would appreciate other people’s opinion. Should there be more intro talks at conferences? If so, should the experts be giving them?

I am an OakTable member (and really quite worryingly proud about that) and I guess that means I know a lot about some aspects of Oracle. But also, and this is a key part of being asked to be a member of the OakTable, I both try and base my “knowledge” on scientific testing of what I think I know (be it by myself or others willing to show their workings) and I try and pass that knowledge on. I don’t think there is a member of the OT that does not abide by those two concepts.

This is not false modesty on my part, but most other people on the OT know a shed load {UK colloquialism for “really quite a lot”} more than I do about the Oracle database and how it works. Some of them are so smart I can’t but secretly dislike them for it :-). But I have a reasonable stash of knowledge in my head and I am a strong proponent of those last two factors. In particular, I want to put what I have in my head about {ohh, let’s pick partition pruning} in other people’s heads. Which is why for the last 4 years most of my presentations have run over horribly. I must get every detail into the audience’s heads, even if they don’t want to know it!

Of late I have started to feel that I present in the wrong way. I take a subject, I write down what I know about it and I try to present that knowledge. I’ve only picked topics I feel I know well and in which I can handle almost any question the audience asks. For me to be that confident I have to have used that aspect of Oracle technology a lot and had months, if not years of experience. You cannot pass that on in 1 hour. I’ve already said I am not that smart, but I learn over time. So I started to strip out the basics and present just the clever stuff, which shows how fabulous I am. {British self-deprecating sarcasm, please}. Just like so many other experts. Hell, if we are experts, we should be talking expert stuff!!!

To balance that I think there is a gap in what is talked about at conferences. I know this is felt by some conference organisers and attendees too, but there is just way too much “impressive smart stuff” and not enough “how X works in the real world, for people who have never done it”. Not 10,000 feet up sales pitch rubbish that gives no substance, but talks on a topic where you can go from level 1 for knowledge-free beginners to level 5 for the 25 people at the conference who know this stuff inside out – and the talk stops at level 2. I’ve made a decision to try and address that gap, which is why I now offer more talks on “an intro to Exadata” and “how to get going with performance tuning” than the smart stuff.

The problem is, how many people, probably mostly young people, go to conferences? Am I wasting my time trying to offer these talks if no one going to the conferences wants them? Do people going to conferences, generally speaking, want the technical nitty-gritty or do they want the intro stuff? Yes, I know there is a spread but where is the real need? I suppose I thought of this recently when I did a talk on Index Organized Tables and almost everyone in the room already used them. Then, a few hours later, I did an intro to database design. Almost everyone in the room was a seasoned database designer… I doubt I said much of real value to either audience.

That leaves my last point: should the experts do intro talks? A problem with experts doing intro talks is that the expert knows it all. It can be hard to remember what you really needed to know at the start (and also, my own problem, knowing what to drop out of the talk as, really, it is “being smart as an ego-trip” that the new people can do without). But if you are not an expert, can you handle the What If questions? I have played with this issue with my Intro to Exadata talk. I wrote the draft when I had very little real experience and I have only modified it as I gained more experience. I’m glad I did, as when I revisited the presentation recently I started putting loads of stuff in that only makes sense when you are used to its peculiarities. Thankfully, I realised this and stripped most of it out again. So well, in fact, that one person who wrote about the talk said “it was nice to see someone talk about it who was an Oracle expert who obviously knew little about the topic”. Well, something like that :-)

Enough. I would appreciate other people’s opinions and experiences on this.

Friday Philosophy – Do good DBAs need PL/SQL Skills? March 1, 2013

Posted by mwidlake in Friday Philosophy, PL/SQL.

This Friday Philosophy was prompted by a discussion between some OakTable people about whether we thought “good” DBAs should know PL/SQL. Not all the tricks, bulk processing, using all the built-ins, but able to write PL/SQL with cursor loops and some exception handling that could, for example, cycle through tables and archive off data or implement some logon trigger functionality.
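
For the avoidance of doubt, the sort of thing we had in mind is something like this minimal sketch (all the table and column names here are invented for illustration):

  BEGIN
    -- Archive off anything more than three years old, row by row,
    -- coping with the odd row that will not archive cleanly.
    FOR rec IN (SELECT order_id FROM orders
                WHERE  order_date < ADD_MONTHS(SYSDATE, -36))
    LOOP
      BEGIN
        INSERT INTO orders_archive
        SELECT * FROM orders WHERE order_id = rec.order_id;
        DELETE FROM orders WHERE order_id = rec.order_id;
      EXCEPTION
        WHEN OTHERS THEN
          DBMS_OUTPUT.PUT_LINE('Could not archive '||rec.order_id||': '||SQLERRM);
      END;
    END LOOP;
    COMMIT;
  END;
  /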

My response was “that depends on the age of the DBA”.

If you had asked me that question 15 years ago I would have said Yes, a good DBA would and should know PL/SQL.
If you had asked me 10 years ago I would have said I’d hope they would, and most DBAs I respected had some PL/SQL skills.
If you had asked me 5 years ago I would have sighed and had a little rant about how they should, but the younger ones don’t, and that is wrong.

But now, I would say that no, a good DBA does not need PL/SQL skills, as they have so many other things they have to do and the tools to manage the database are somewhat better than they were. But inside I would still be thinking any DBA beyond their first year or two in the job would benefit from knowing the basics of PL/SQL.

It seems to me that a DBA now is generally expected to look after a very large number of instances, application servers, agents etc. and all their time is taken doing the bread-and-butter tasks of backups and recoveries, patching, duplicating data, raising SRs (and each SR seems to take more and more time every year), unlocking accounts, sorting out permissions…

Not only that, but there is more and more to Oracle that a good DBA needs to understand as the technology gets more complex. Oracle has tried to make Oracle look after itself more, but the result seems to be that for larger systems there are more moving parts to go wrong – and when they do, it is often the DBA who has to sort it out. As an example, you no longer need to set several instance parameters to allocate memory to the components of the PGA and SGA. Just set the Memory Target. But if the system starts to throw odd errors about components being out of memory, the DBA needs to sort that out. They need to know about the dynamic memory adjustments and check them out. They need to understand that certain components are now calculated in a different way, like the log buffer size. And probably they will have to revert to the old parameters, so they still need to know all about those!
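
As a sketch of the sort of checking I mean (these are the standard 11g views; the queries themselves are just examples):

  -- Where has automatic memory management put things right now?
  SELECT component, current_size/1024/1024 AS current_mb, last_oper_type
  FROM   v$memory_dynamic_components
  WHERE  current_size > 0;

  -- What resize operations has it been carrying out?
  SELECT component, oper_type, initial_size, final_size, start_time
  FROM   v$memory_resize_ops
  ORDER  BY start_time;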

So unless a DBA is an old hand and “grew up” with all this, they have no time to develop PL/SQL skills. Thus my response was “that depends on the age of the DBA”.

Should a good DBA know SQL?

Yes. I still see that as a given. Buttons, widgets and assistants will only get you so far.

Friday Philosophy – The Tech to Do What You Need Probably Exists Already November 30, 2012

Posted by mwidlake in Friday Philosophy, Perceptions.

How many of you have read the Oracle Concepts manual for the main version you are working on?

This is a question I ask quite often when I present and over the last 10 years the percentage of hands raised has dropped. It was always less than 50%, it’s been dropping to more like 1 in 10, and last year (at the UKOUG 2011 conference) was the nadir, when not a single hand was raised. {Interestingly, I asked this at the Slovenian User Group 3 months ago and something like 40% raised their hand – impressive!}

Why do I feel this is important? Well, do you know all the technology solutions available across just the core RDBMS with no-cost options? No, you don’t; you (and I) really don’t. If you read the concepts manual, even just skimming it, you will be reminded of a whole load of stuff you have only dim memories of and perhaps you will even see some features that passed you by when they were first introduced.

Of course, you would need to read a few more manuals to get the full picture, such as the PL/SQL Packages and Types References, as so much good stuff is introduced via built-in packages, and the SQL Language Reference, as SQL has been extended quite a lot over the last couple of versions. Dull reading indeed but I’d estimate that if you read those three you would be aware of 90%+ of the Oracle technologies that are available to you out-of-the-box and considerably more than all but a handful of Oracle Experts. You’d know more than I as (a) I have not skimmed the PL/SQL one for years and (b) I have a rotten memory.

My point is, you can’t consider using Oracle technologies you don’t know about or remember – and they could be just what you need to fix the problem you see in front of you.

I’ll give a couple of examples.

Problem: physical IO is too high, your storage system is bottlenecked.
Answers – reducing physical IO:
-First up, Index Organized tables. Some of you will be aware that I am very keen on IOTs and the reason I am is that I’ve used them to physically group data that the application needed to select over and over again. It can make a massive improvement to that sort of system. They are rarely used.
-Clustered Tables. Even less used, in fact has anyone reading this used them in anger in the last 10 years? Great for situations where you need parent-children or parent-children-type-1+parent-children-type-2 data. I confess, I have not used them in anger for years.
-Move tables (ordering the rows as you do it!) and rebuild indexes to remove “dead” space. This one got a bad name, especially the index rebuilds, as people were doing it needlessly without appreciating what the intention was, but now I hardly ever see it done – even when it would be of benefit.
-Compress your tables and indexes. With normal Oracle compression (no need for HCC). In tests I’ve done I pretty much always see a drop in physical IO and run time. Being candid, I can’t remember doing any tests and NOT seeing an improvement, but I usually only test when I expect an improvement and I don’t want to give anyone the impression it will always help.

All the above were available in Oracle 7 or 8 and all have improved over the versions.
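
And none of it needs exotic syntax. A sketch, with invented table and index names:

  -- An Index Organized Table: the rows live in the primary key index,
  -- physically grouped by the key the application drives on.
  CREATE TABLE child_events
  ( parent_id NUMBER
  , event_seq NUMBER
  , payload   VARCHAR2(200)
  , CONSTRAINT child_events_pk PRIMARY KEY (parent_id, event_seq)
  ) ORGANIZATION INDEX;

  -- Shrink out dead space and apply basic (non-HCC) compression in one go.
  -- The move leaves the indexes unusable, hence the rebuild.
  ALTER TABLE big_history MOVE COMPRESS;
  ALTER INDEX big_history_pk REBUILD;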

Problem: you want to carry out some long, complex data processing in PL/SQL and, if a step fails, be able to handle it and carry on.
Answers:
– Savepoints. You can roll back to a savepoint, not just to the last commit. In this way you can break the task into chunks and, if something towards the end fails, you roll back one step (or several, your choice) and call an alternative routine to handle the exception.
– Autonomous Transactions. You want to record that an error occurred, but without failing the original action and without committing anything it has done to date. An autonomous transaction runs in its own sub-session, and commits in it do not affect the calling session.
– Temporary Tables. You can put your working information in them as you progress and, if you need to bomb out (or some evil DBA kills your session for running too long), the temp table contents just disappear. No clean-up needed.

Maybe the above is not so fair, as I have not been a proper PL/SQL developer for a while now, but I hardly ever see the above used, especially Savepoints. I can’t remember not having savepoints available (hmm, maybe Oracle 6), and Autonomous Transactions and Temporary Tables are Oracle 8 (I thought Temporary Tables might be 9 but Tim Hall’s OracleBase says 8).
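
For the record, the Savepoint and Autonomous Transaction patterns look something like this minimal sketch (the step_one/step_two procedures and the error_log table are hypothetical stand-ins for real work):

  DECLARE
    PROCEDURE log_error(p_msg VARCHAR2) IS
      PRAGMA AUTONOMOUS_TRANSACTION; -- the commit below does not affect the caller
    BEGIN
      INSERT INTO error_log (logged_at, msg) VALUES (SYSTIMESTAMP, p_msg);
      COMMIT;
    END log_error;
  BEGIN
    step_one;                        -- hypothetical processing steps
    SAVEPOINT after_step_one;
    BEGIN
      step_two;
    EXCEPTION
      WHEN OTHERS THEN
        ROLLBACK TO after_step_one;  -- undo just step two, keep step one's work
        log_error('step_two failed: '||SQLERRM);
        step_two_alternative;        -- carry on down the alternative path
    END;
    COMMIT;
  END;
  /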

Another thing I have noticed over the years is that so often I will read up on some Oracle feature I know little about, only for it to come up in the next few weeks! There is a psychological aspect to this: we only remember these “coincidences” and not the more common situations where we read up on something which does not subsequently come up before we forget about it. But I think it is also that we tend to use only a few solutions to solve the problems we see, and adding another solution to our list means the chances are high it will be suitable for something soon.

OK, so it would help us all to read the manuals (or other Oracle technical books) more. Now the big problem is finding the time.

Friday Philosophy – I Am An Exadata Expert August 10, 2012

Posted by mwidlake in Exadata, Friday Philosophy, Perceptions.

(Can I feel the angry fuming and dagger looks coming from certain quarters now?)

I am an Exadata Expert.

I must be! – I have logged onto an Exadata quarter rack and selected sysdate from Dual.

The pity is that, from some of the email threads and conversations I have had with people over the last 12 months, this is more real-world experience than that of some people I have heard of who are offering consultancy services. It’s also more experience than some people I have actually met, who have extolled their knowledge of Exadata – which is based solely on the presentations by Oracle sales people looking at the data sheets from 10,000 feet up and claiming it will solve world hunger.

Heck, hang the modesty – I am actually an Exadata Guru!

This must be true as I have presented on Exadata and it was a damned fine, technical presentation based on real-world experience, and I have even debated, in public, the pros and cons of point releases of Exadata. Touching base with reality once more, I did an intro talk, “the first 5 things you need to know about Exadata”, and the “debate” was asking Julian Dyke if he had considered the impact of serial direct IO on a performance issue he had seen – and he had not only done so but had looked into the issue far more than I, so he was able to correct me.

But joking aside – I really am a true consulting demi-god when it comes to Exadata.

I have years of experience across a wide range of Exadata platforms. That would be 0.5 years and I’ve worked intensively on just one system and am in a team now with some people who are proper experts. So a range of two. Yes, tongue is still firmly in cheek.

This situation always happens with the latest-greatest from Oracle (and obviously all other popular computing technologies). People feel the need to claim knowledge they do not have. Sometimes it is to try and get consultancy sales or employment, sometimes it is because they don’t want to be seen to be behind the times and sometimes it is because they are just deluded. The deluded have seen some presentations, a few blog posts and maybe even got the book and read the first few chapters, and are honestly convinced in their own minds that they now know enough to make effective use of the technology, teach {or, more usually, preach} others and so proclaim on it. {See the Dunning-Kruger effect, the certainty of idiots}. I’m certainly not arguing against going to presentations, reading blogs and books and learning, just don’t make the mistake of thinking theoretical, second-hand knowledge equates to expert.

With Exadata this situation is made worse as the kit is expensive and much of what makes it unusual cannot be replicated on a laptop, so you cannot as an individual set up a test system and play with it. Real-world experience is required. This is growing but is still limited. So the bullshit-to-real-skills quotient remains very, very high.

If you are looking for help or expertise with Exadata, how do you spot the people with real knowledge from the vocal but uninformed? Who do you turn to? {NB don’t call me – I’m busy for 6 months and I really am not an expert – as yet}. If your knowledge to date is based on sales presentations and tidbits from the net which may or may not be based on a depth of experience, it is going to be hard to spot. When I was still without real-world experience I had an unfair advantage in that I saw email threads between my fellow OakTable members, and of course some of those guys and gals really are experts. But I think I was still hoodwinked by the odd individual on the web or presenting and, I can tell you, though this background knowledge really helped, when I DID work on my first Exadata system I soon realised I did not understand a lot about the subtleties and not-so-subtleties of using a system where massively improved IO was available under key conditions. I had to put in a lot of time, effort and testing to move from informed idiot to informed, partially experienced semi-idiot.

I know this issue of the non-expert proclaiming their skills really frustrates some people who do know their stuff for real, and it is of course very annoying if you take someone’s advice (or even hire them) only to find their advice to be poor. Let’s face it, it is simple lying at best and potentially criminal mis-selling.

I guess the only way is for people needing help to seek out someone who has already proven themselves to be honest about their skills or can demonstrate a real-world level of experience and success. I would suggest the real experts should do that most difficult task of pointing out the mistakes of the false prophets, but it is very tricky to do without looking like a smartarse or coming over as big-headed or jealous.

I’ll finish on one thing. Last year I said how I thought maybe I should do more blog posts about things I did not know much about, and be honest about it and explore the process of learning. I did actually draft out about 3 posts on such a topic but never pushed them out as I was way too busy to complete them… That and, being candid, I really did not want to look like an idiot. After all, this Oracle lark is what puts beer in my hand, hat fabric on my wife’s millinery worktop and food in my cat’s bowl. The topic was….? Correct, Exadata. Maybe I should dust them off and put them out for you all to laugh at.

OUGN 2012 Second Day – Out on the Open Seas March 23, 2012

Posted by mwidlake in Meeting notes.

As I said yesterday, I am not one for blogging about conferences. So what the heck, I’ll do another post on this one :-).

You might have picked up on the fact that I am not very good on the sea and have a lot of nervousness about this idea of spending 2 days on an ocean-going liner. So today is the day we move over to being on the water, the OUGN conference proper. I’m delighted to say I am not the only one nervous about this boat lark; Maria Colgan {or, as I should be calling her, “The Optimizer Lady” – since Doug Burns christened her back at the UKOUG 2011 conference} feels the same. There is nothing better to reduce one’s apprehension about something than finding someone else who is just as if not more anxious about it. I suspect this is an evil way to think but I can’t help it.

The day went well, my overview of VLDBs could have maybe been a little more polished – I think I was trying to cover too many topics and not spending enough time on anything, apart from “why a small number of spindles is bad” which I spent too long on. But I did make two points that I think are major considerations with VLDBs and are actually not very technical. (1) If you are going to have to look after a massive database, try very, very, very hard to keep everything as simple, standardised and controlled as you can. (2) Find other people to talk to about VLDBs. Most companies that have them are reluctant to talk, either because of the industry type (Defense, Banks, Pharma) or a perceived giving away of corporate advantage (Banks, Pharma, Utilities and pretty much all companies really).

Anyhow, having done my talk I was now free for the next 2 days to enjoy other talks, socialise, relax – and keep internally checking if I felt nauseous or not. The sea has turned out to be very calm and the boat very stable. But I keep checking and, of course, that makes me feel a little anxious – and thus ill.

However, I have to say that the travel sickness pills I have been taking do seem to be very effective. They are effective in making me feel woozy. But, and this is important, not ill. I’m having to rely on beer for the latter.

One thing I really, really like about this conference: everyone is stuck on a boat, they can’t escape. Which means you get to see lots of people all day, and it makes the whole thing have a nice sense of community.

Right, Maria is about to present again. I’m going to go and sit in the front row and sway lightly from side to side. Apparently it makes her feel even worse…

Headlong rush to Conference – Preparing the Presentations November 29, 2011

Posted by mwidlake in Meeting notes, UKOUG.

With only a few days to go before the UKOUG conference this year I’m preparing my presentations. I know pretty much what I want to say and, for the IOT talk at least, it is not as if I do not have enough material already prepared – some of which has been in the blog posts and some of which has not. (Though it did strike me that I could just fire up the blog and walk through the thread, taking questions.)

My big problem is not what to say – it is what not to say.

I’ve always had this problem when I want to impart knowledge: I have this desire to grab the audience by the throat, take what I know about the subject and somehow just cram the information into the heads of the people in front of me. All of it. I want them to know everything about it that I know – the core knowledge, the oddities, the gotchas, how it meshes with other topics. It’s ridiculous of course; if I’ve spent many hours (days, weeks, 20 years) acquiring experience, reading articles and learning, I can’t expect to pass that all on in a one-hour presentation – especially as I like to provide proof and examples for what I say. But I think the desire to do so is part of what makes me a good presenter and tutor. I bounce around in front of the audience, lobbing information at them and constantly trying to judge if I need to back up and give more time to anything or if I can plough on, skipping the basics. Hint: if you are in the audience and I’m going too fast or garbling my topic, then I am always happy to be asked questions or told to reverse up a little. I’ve never been asked to speed up though :-)

It gets even worse. If I am putting myself up there to talk about a topic then I don’t want to be found wanting. I want to be able to handle any question and have a slide or example up my sleeve to demonstrate it. It’s exhausting and, again, pointless. At somewhere like the UKOUG there is bound to be someone who knows something I don’t know about any topic.

For me the trick is to pare it down, to keep reminding myself that if the audience leaves with more knowledge than they came in with, that is a win. If they actually enjoyed the experience, I’m even happier. Maybe I should forget the topic and just take drinks and nibbles…

So, I’m currently doing what I always do, which is trying to force myself to remove stuff that is not directly relevant whilst still leaving a few little oddities and interesting items. Plus getting the 200 slides down to something more reasonable – like say 120 :-)

If I can get it down to one slide per minute (some of which I skip on the day as they are there for anyone downloading the talk) then I’m OK.

Of course, having done this, the day before the conference I’ll do one last “final review” – and add a couple of dozen slides to just clarify a few points…

Want to Know More about Oracle’s Core? October 19, 2011

Posted by mwidlake in performance, Private Life, publications.

I had a real treat this summer during my “time off” in that I got to review Jonathan Lewis’s upcoming new book. I think it’s going to be a great book. If you want to know how Oracle actually holds its data in memory, how it finds records already in the cache and how it manages to control everything so that all that committing and read consistency really works, it will be the book for you.

{Update, Jonathan has confirmed that, unexpected hiccups aside, Oracle Core: Essential Internals for DBAs and Developers should be available from October 24, 2011}

{Thanks to Mike Cox, who let me know it is already available to be reserved at Amazon}

Jonathan got in touch with me around mid-May to say he was working on the draft of his new book, one that would cover “how does Oracle work”, the core mechanics. Would I be willing to be one of his reviewers? Before anyone comments that there is not likely to be much about core Oracle that I know and Jonathan does not, he did point out that he had already lined up someone to be his technical reviewer, ie someone he expected to know as much as he and help spot actual errors. The technical reviewer is the most excellent Tanel Poder, who posted a little mention of it a couple of months back.

I was to act more like a typical reader – someone who knew the basics and wanted to learn more. I would be more likely to spot things he had assumed we all know but don’t, or bits that did not clearly explain the point if you did not already know the answer. In other words, an incomplete geek. I figured I could manage that :-).

It was a lot harder work than I expected and I have to confess I struggled to supply feedback as quickly as Jonathan wanted it – I was not working but I was very busy {and he maybe did not poke me with a sharp stick for feedback soon enough}. As anybody who has had to review code specifications or design documents will probably appreciate, you don’t just read stuff when you review it: you try to consider if all the information is there and whether it can be misunderstood and, if you find that you don’t understand a section, you need to work out if the fault is with you, with the way it is written or with what is written. When I read a technical {or scientific} document and I do not fully understand it, I usually leave it a day, re-read it and, if it still seems opaque, I just move on. In this case I could not do that; I had to ensure I understood it or else tell Jonathan why I thought I did not. If there are sections in the end book that people find confusing, I’ll feel I let Jonathan down.

Just as tricky, on the one hand, as I’ve been using Oracle for so long and I do know quite a lot about Oracle {although clearly not enough in the eyes of the author :-) }, I had to try and “not know” stuff to be able to decide if something was missing. On the other, when I wanted to know more about something, was I just being a bit too nerdy? I swung more towards the opinion that if I wanted to know more, others would too.

I have to say that I really enjoyed the experience and I learnt a lot. I think it might change how I read technical books a little. I would run through each chapter once to get the feel of it all and then re-read it properly, constantly checking things in both versions 11 and 10 of Oracle as I read the drafts, and would not let myself skip over anything until I felt I really understood it. As an example, I’ve never dug into internal locks, latches and mutexes much before, and now that I’ve had to learn more to review the book, I have a much better appreciation of some issues I’ve seen in the wild.

Keep an eye out for the book, it should be available by the end of this year and be called something like “Oracle Core” {I’ll check with Jonathan and update this}. I won’t say it will be an easy read – though hopefully a little easier as a result of my input – as understanding things always takes some skull work. But it will certainly be a rewarding read and packed full of information and knowledge.

When to Fix & When to Find Out – Friday Philosophy July 15, 2011

Posted by mwidlake in AWR, Friday Philosophy, Testing.

{warning, this is one of my long, rambling Friday Philosophy blogs. Technical blogs are on the way – though there are some nice AWR screen shots in this one :-) }

As a DBA (or System Administrator or Network Admin or any other role that involves fixing things that are broken before the shouting starts) there is often a tension between two contending “best practices”:

– getting the system working again as soon as possible
or
– understanding exactly what the problem is.

Some experts point out {and I generally agree} that unless you actually find out exactly what the problem was, what you did to fix it and, via a test case, demonstrate why the fix worked, you are not really solving the problem – you are just making it go away. You (hopefully!) know what you changed or the action you took, so you have something that you can repeat which might fix it again next time. (NB this does not apply to the classic “turn it off, turn it back on again”; that nearly always is just aversion therapy.)

But it might not fix it next time.

Or you might be going to way more trouble to fix it than is needed {I’m remembering how flushing the shared pool used to get touted as a way to resolve performance issues, but it is a pretty brutal thing to do to a production database}.

You might even be making other problems worse {like slowing down everything on the production database as the caches warm up again, having flushed out all that data in the SGA, or that good-old-standard of rebuilding indexes that simply do not benefit from the rebuild}.

There is another issue with “just fixing it by trying things” in that you are not really learning about the cause of the issue or about how the technology you are looking after works. A big part of what makes an “expert” an expert is the drive, desire and opportunity to take the time to work this out. It’s that last part I sometimes get grumpy about, the opportunity.
Many of us do not have the luxury of investigating the root cause, because we are under so much pressure to “get it fixed right now and worry about the detail later”. But we do not get to the detail, as there is then the next “fix this problem right now”, and so it goes on. {Kellyn Pot’Vin does a nice post about what she calls the Superman Conundrum on this topic}.

I’ve had exactly this dilemma just a couple of months ago. Below are the details; it’s a bit long, so skip to the end of the post if you like…

I was creating test data and I decided to use parallel processing to speed it up. I created a month’s worth of data with PL/SQL and then decided to copy it with a simple “insert into … select” statement, updating dates and a couple of other incrementing columns as part of the insert, using parallel processing. Creating the second month’s data took longer than the PL/SQL code had taken for the first month. What the!!!??? I pulled up the AWR information and I could see that the problem was (possibly) with inter-process communication between the parallel processes, as shown by the PX Deq Credit: send blkd wait event.
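
The statement was along these lines (a simplified sketch – the real table and column names are long gone, so these are invented):

  ALTER SESSION ENABLE PARALLEL DML;

  INSERT /*+ parallel(m2, 8) */ INTO test_data_month2 m2
  SELECT /*+ parallel(m1, 8) */
         id + 1000000            -- shift the incrementing key
       , ADD_MONTHS(created, 1)  -- move the dates on a month
       , payload
  FROM   test_data_month1 m1;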

The below screenshot shows the overall instance workload; the green is CPU and the pink is “Other”. Only one SQL statement is responsible for all of this effort, via 5 sessions (four of which are parallel threads). You can see that the issue had been going on for over an hour {oh, and to a certain extent these pretty pictures are pointless – I am not looking for the exact solution now, but having loaded the pictures up to the blog site I am damn well going to pretty-up my blog with them}:

Drilling into that one session shows that the bulk of the waits by far is for PX Deq Credit: send blkd:

By selecting that wait event, I got the histogram of wait times since the system started {I love those little histograms of wait times}:

Note that these waits are for long periods of time, around 1/10 of a second on average and some are for up to 2 or 4 seconds.
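
If you want the raw numbers rather than the pretty picture, v$event_histogram holds the wait-time histogram since instance startup:

  SELECT wait_time_milli, wait_count
  FROM   v$event_histogram
  WHERE  event = 'PX Deq Credit: send blkd'
  ORDER  BY wait_time_milli;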

The thing is, I had anticipated this problem and had already increased my PARALLEL_EXECUTION_MESSAGE_SIZE to 16K from the default of 2K, as I knew from experience that the default is way too small and has slowed down parallel execution for me before. So why was I seeing poorer performance than I anticipated? I was not understanding stuff. So I needed to change one thing, see what impact it had, and repeat until I got to the bottom of it.

Except I could not – the next team in line was waiting for this data and they already had massive pressure to deliver. My job, what my employer was giving me money to do, was to fix this problem and move on. So, in this instance I just pragmatically got on with getting the task done as quickly as I could.

I did what we all do {don’t we?} under times of acute time pressure. I made a handful of changes, using the knowledge I already have and guessing a little, hoping that one of them would do the trick. This included reducing the degree of parallelism, adding the /*+ append */ hint (I simply forgot the first time around), pre-allocating the required space to the tablespace, muttering “pleaseopleaseoplease” under my breath….
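
In SQL terms, the rerun amounted to something like this (again a sketch with invented names):

  -- The PX message buffer had already been bumped (it needs a restart):
  -- ALTER SYSTEM SET parallel_execution_message_size = 16384 SCOPE = SPFILE;

  ALTER SESSION ENABLE PARALLEL DML;

  -- Direct-path insert this time (the append hint I forgot), and a lower DOP.
  INSERT /*+ append parallel(m2, 4) */ INTO test_data_month2 m2
  SELECT /*+ parallel(m1, 4) */
         id + 1000000, ADD_MONTHS(created, 1), payload
  FROM   test_data_month1 m1;
  COMMIT;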

It worked:

The job ran in less than 20 minutes and used less resource during that time. Well, it waited less anyway.
The wait histograms show lots and lots of shorter duration waits:

The duplication took 20 minutes, where the previous attempt had been terminated after 2 hours when other factors forced it to be curtailed. Job done.

But the thing is, the problem was not fixed. I got the task done in a timescale that was required, I satisfied the need of the client, but I was and am not sure exactly why.

If I was a permanent employee I would consider pressing to be allowed to spend some time working this out, as my employer benefits from me extending my knowledge and skills. This is not always a successful strategy :-) {but it does indicate to me that my employer has a Problem and I need to address that}. In any case, as I was a consultant on this job, I was being paid to already know stuff. So it is now down to me, some personal time, the internet and people more knowledgeable than me who I can ask for help to sort this out.

And that was my main point. Often at work you have to get over the issue. Later on, you work out what the issue was. I wish I could get paid for the latter as well as the former. The real blow for me is that I no longer have access to that site and the information. My job was complete and, whether they have really shut down my access or not, I am not allowed to go back into the systems to dig around. I think I now know the issue, but I can’t prove it.

Friday Philosophy – Software being Good is not Good Enough February 5, 2010

Posted by mwidlake in Friday Philosophy, Uncategorized.

In a previous Friday Philosophy on In Case of Emergency I mentioned that something being simply a “Good Idea” with technology is not good enough. Even being a GREAT idea is not enough. It also has to be:

  1. Easy and simple to use. After all, using that funny stick thing behind your steering wheel in a car, to indicate which direction you are turning, seems to be too much effort for many people. If installing some bit of software or running a web page is more than a little bit of effort, most people will not bother.
  2. Quick. No one has patience anymore, or spare time. This will probably be the fourth year in a row I do not plant any vegetables in our garden, as I need to spend a day or two clearing and digging over the spot for said veg. You can’t beat home-grown veg. Similarly, I won’t use a web page that takes as long to load as it does to plant a carrot seed.
  3. Known about. There could be a really fantastic little program Out There that allows you to take a screen shot, add a title and comment, and paste it straight into a document for you, converting to half a dozen common formats on the fly. But I do not know about it. {ScreenHunter is pretty good, I have to say, and when I have shown it to people a lot of them like it}.
  4. Popular. This is not actually the same as “known about”. For a stand-alone application to be good for you, you just need to know that it exists. Like maybe a free building architecture package. Whether thousands of people use it is moot; so long as you can get your extension drawings done in it with ease, that makes it great. But something that relies on the community, like a service to rate local eateries, unless lots of people use it and add ratings, well, who cares? There are dozens (if not hundreds) of such community “good ideas” started every day but unless enough people start to use them, they will fizzle out, as the vast majority of them do.

Point 4 is highly relevant to “In Case Of Emergency” as it is simple, quick and relatively known about. It just needs to be ubiquitous.

I became very aware of point 3 a few years ago and also of the ability for very clever people to be sometimes very stupid when it comes to dealing with their fellow humans.

I was working on a database holding vast quantities of DNA information. If you don’t know, DNA information is basically represented by huge long strings of A, C, T and G. So something like AACTCGTAGGTACGGGTAGGGGTAGAGTTTGAGATTGACTGAGAGGGGGAAAAATGTGTAGTGA…etc, etc, etc. These strings are hundreds, thousands, hundreds of thousands of letters long. And scientists like to search against these strings. Of which there are millions and millions. Not for exact matches mind, but kind-of-similar, fuzzy matches where, for example, 95% of the ACTGs match but some do not. It’s called a BLAST match.

Anyway, suffice to say, it takes a lot of compute power to do this and a fair amount of time to run. There was a service in America which would allow you to submit a BLAST query and get the answer in 20 minutes or so {I have no idea how fast it is now}.

Some extremely clever chaps I had the pleasure of working with came up with a faster solution. Same search, under 5 seconds. Now that is GREAT. We put together the relevant hardware and software and started the service. Now I thought it went beyond Good or even Great. It was Amazing (and I mean it, I was amazed we could do a fuzzy search against a billion such strings in 2, 3 seconds using a dozen or so PC-type servers).

No one used it. This was because almost no one knew about it and there was already this slow service people were used to using. People who used the old service never really thought to look for a new one and the chances were they would not have found ours anyway.

I pushed for more to be made of this new, faster service, that it should be advertised to the community, that it should be “sold” to people (it was free to use, by “sold” I mean an attempt made to persuade the scientific community it was worth their while investigating). The response I was given?

“If the service is worth using, people will come and use it”.

No they won’t. And indeed they didn’t. It was, I felt, a stupid position to take by an incredibly intelligent person. How were people to know it existed? Were they just supposed to wake up one morning knowing a better solution was out there? Would the internet pixies come along in the night and whisper about it in your ear? In the unlikely event of someone who would be interested in it just coming across it, were they then going to swap to using it? After all, no one else seemed to know about it and it was 2 orders of magnitude faster, suspiciously fast; how could it be any good?

The service got shut down as it was just humming in the corner consuming electricity. No one knew it existed, no one found it, no one came. I can’t but help wonder how much it could have helped the scientific community.

There must be thousands of other “failed” systems across the IT world that never took off just because the people who could use it never knew it existed. Depressing huh?
