
Headlong rush to Conference – Preparing the Presentations November 29, 2011

Posted by mwidlake in Meeting notes, UKOUG.

With only a few days to go before the UKOUG conference this year I’m preparing my presentations. I know pretty much what I want to say and, for the IOT talk at least, it is not as if I do not have enough material already prepared – some of which has been in blog posts and some of which has not (though it did strike me that I could just fire up the blog and walk through the thread, taking questions).

My big problem is not what to say – it is what not to say.

I’ve always had this problem when I want to impart knowledge: I have this desire to grab the audience by the throat, take what I know about the subject and somehow just cram the information into the heads of the people in front of me. All of it. I want them to know everything about it that I know – the core knowledge, the oddities, the gotchas, how it meshes with other topics. It’s ridiculous of course; if I’ve spent many hours (days, weeks, 20 years) acquiring experience, reading articles and learning, I can’t expect to pass all that on in a one-hour presentation – especially as I like to provide proof and examples for what I say. But I think the desire to do so is part of what makes me a good presenter and tutor. I bounce around in front of the audience, lobbing information at them and constantly trying to judge if I need to back up and give more time to anything or if I can plough on, skipping the basics. Hint: if you are in the audience and I’m going too fast or garbling my topic, then I am always happy to be asked questions or told to reverse up a little. I’ve never been asked to speed up though :-)

It gets even worse. If I am putting myself up there to talk about a topic then I don’t want to be found wanting. I want to be able to handle any question and have a slide or example up my sleeve to demonstrate it. It’s exhausting and, again, pointless. At somewhere like the UKOUG there is bound to be someone who knows something I don’t know about any topic.

For me the trick is to pare it down, to keep reminding myself that if the audience leaves with more knowledge than they came in with, that is a win. If they actually enjoyed the experience, I’m even happier. Maybe I should forget the topic and just take drinks and nibbles…

So, I’m currently doing what I always do, which is trying to force myself to remove stuff that is not directly relevant whilst still leaving a few little oddities and interesting items. Plus getting the 200 slides down to something more reasonable – like say 120 :-)

If I can get it down to one slide per minute (some of which I skip on the day as they are there for anyone downloading the talk) then I’m OK.

Of course, having done this, the day before the conference I’ll do one last “final review” – and add a couple of dozen slides to just clarify a few points…

Want to Know More about Oracle’s Core? October 19, 2011

Posted by mwidlake in performance, Private Life, publications.

I had a real treat this summer during my “time off” in that I got to review Jonathan Lewis’s upcoming new book. I think it’s going to be a great book. If you want to know how Oracle actually holds its data in memory, how it finds records already in the cache and how it manages to control everything so that all that committing and read consistency really works, it will be the book for you.

{Update, Jonathan has confirmed that, unexpected hiccups aside, Oracle Core: Essential Internals for DBAs and Developers should be available from October 24, 2011}

{Thanks to Mike Cox, who let me know it is already available to be reserved at Amazon}

Jonathan got in touch with me around mid-May to say he was working on the draft of his new book, one that would cover “how does Oracle work” – the core mechanics. Would I be willing to be one of his reviewers? Before anyone comments that there is not likely to be much about core Oracle that I know and Jonathan does not, he did point out that he had already lined up someone to be his technical reviewer, i.e. someone he expected to know as much as he does and to help spot actual errors. The technical reviewer is the most excellent Tanel Poder, who posted a little mention of it a couple of months back.

I was to act more like a typical reader – someone who knew the basics and wanted to learn more. I would be more likely to spot things he had assumed we all know but don’t, or bits that did not clearly explain the point if you did not already know the answer – i.e. an incomplete geek. I figured I could manage that :-).

It was a lot harder work than I expected and I have to confess I struggled to supply feedback as quickly as Jonathan wanted it – I was not working but I was very busy {and he maybe did not poke me with a sharp stick for feedback soon enough}. As anybody who has had to review code specifications or design documents will probably appreciate, you don’t just read stuff when you review it: you try to consider whether all the information is there, whether it can be misunderstood and, if you find that you don’t understand a section, you need to work out if the fault is with you, with the way it is written or with what is written. When I read a technical {or scientific} document and I do not fully understand it, I usually leave it a day, re-read it and, if it still seems opaque, I just move on. In this case I could not do that; I had to ensure I understood it or else tell Jonathan why I thought I did not understand it. If there are sections in the finished book that people find confusing, I’ll feel I let Jonathan down.

Just as tricky: on the one hand, as I’ve been using Oracle for so long and do know quite a lot about it {although clearly not enough in the eyes of the author :-) }, I had to try and “not know” stuff to be able to decide if something was missing. On the other, when I wanted to know more about something, was I just being a bit too nerdy? I swung more towards the opinion that if I wanted to know more, others would too.

I have to say that I really enjoyed the experience and I learnt a lot. I think it might change how I read technical books a little. I would run through each chapter once to get the feel of it all and then re-read it properly, constantly checking things in both version 11 and 10 of Oracle as I read the drafts and would not let myself skip over anything until I felt I really understood it. As an example, I’ve never dug into internal locks, latches and mutexes much before and now that I’ve had to learn more to review the book, I have a much better appreciation of some issues I’ve seen in the wild.

Keep an eye out for the book, it should be available by the end of this year and be called something like “Oracle Core” {I’ll check with Jonathan and update this}. I won’t say it will be an easy read – though hopefully a little easier as a result of my input – as understanding things always takes some skull work. But it will certainly be a rewarding read and packed full of information and knowledge.

When to Fix & When to Find Out – Friday Philosophy July 15, 2011

Posted by mwidlake in AWR, Friday Philosophy, Testing.

{warning, this is one of my long, rambling Friday Philosophy blogs. Technical blogs are on the way – though there are some nice AWR screen shots in this one :-) }

As a DBA (or System Administrator or Network admin or any other role that involves fixing things that are broken before the shouting starts) there is often a tension between two contending “best practices”:

– getting the system working again as soon as possible.
or
– understanding exactly what the problem is.

Some experts point out {and I generally agree} that unless you actually find out exactly what the problem was, what you did to fix it and, via a test case, demonstrate why the fix worked, you are not really solving the problem – you are just making it go away. You (hopefully!) know what you changed or the action you took, so you have something that you can repeat which might fix it again next time. (NB this does not apply to the classic “turn it off, turn it back on again”; that is nearly always just aversion therapy.)

But it might not fix it next time.

Or you might be going to way more trouble to fix it than is needed {I’m remembering how flushing the shared pool used to get touted as a way to resolve performance issues, but it is a pretty brutal thing to do to a production database}.

You might even be making other problems worse {like slowing down everything on the production database as the caches warm up again, having flushed out all that data in the SGA, or that good-old-standard of rebuilding indexes that simply do not benefit from the rebuild}.

There is another issue with “just fixing it by trying things”, in that you are not really learning about the cause of the issue or about how the technology you are looking after works. A big part of what makes an “expert” an expert is the drive, desire and opportunity to take the time to work this out. It’s that last part I sometimes get grumpy about – the opportunity.

For many of us, we do not have the luxury of investigating the root cause, because we are under so much pressure to “get it fixed right now and worry about the detail later”. But we never get to the detail, as there is then the next “fix this problem right now”, and so it goes on. {Kellyn Pot’Vin does a nice post about what she calls the Superman Conundrum on this topic}.

I had exactly this dilemma just a couple of months ago. Below are the details; it’s a bit long, so skip to the end of the post if you like…

I was creating test data and I decided to use parallel processing to speed it up. I created a month’s worth of data with PL/SQL and then decided to copy it with a simple “insert into … select” statement, updating dates and a couple of other incrementing columns as part of the insert, using parallel. The creation of the second month’s data took longer than the PL/SQL code for the first month took. What the!!!??? I pulled up the AWR information and I could see that the problem was (possibly) with inter-process communication between the parallel processes, as shown by the PX Deq Credit: send blkd wait event.
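For flavour, the statement was along these general lines. This is a minimal sketch only – the table and column names are invented for illustration, not the actual code:

    -- hypothetical sketch of the copy (table and columns made up)
    alter session enable parallel dml;

    insert /*+ parallel(tgt 4) */ into sale_data tgt
    select /*+ parallel(src 4) */
           sale_id + 10000000            -- shift the incrementing key
         , add_months(sale_date, 1)      -- move the dates on one month
         , customer_id
         , amount
      from sale_data src
     where sale_date >= date '2011-06-01'
       and sale_date <  date '2011-07-01';

    commit;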

The below screenshot shows the overall instance workload; the green is CPU and the pink is “Other”. Only one SQL statement is responsible for all of this effort, via 5 sessions (four of which are parallel threads). You can see that the issue had been going on for over an hour {oh, and to a certain extent these pretty pictures are pointless – I am not looking for the exact solution now, but having loaded the pictures up to the blog site I am damn well going to pretty-up my blog with them}:

Drilling into that one session shows that by far the bulk of the waits is for PX Deq Credit: send blkd:

By selecting that wait event, I got the histogram of wait times since the system started {I love those little histograms of wait times}:

Note that these waits are for long periods of time, around 1/10 of a second on average and some are for up to 2 or 4 seconds.
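If you want the same numbers without a GUI, the histogram of waits since instance startup is available straight from V$EVENT_HISTOGRAM (10g onwards), with something like:

    -- wait-time histogram for the event, since instance startup
    select wait_time_milli, wait_count
      from v$event_histogram
     where event = 'PX Deq Credit: send blkd'
     order by wait_time_milli;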

The thing is, I had anticipated this problem and had already increased my PARALLEL_EXECUTION_MESSAGE_SIZE to 16K from the default of 2K, as I knew from experience that the default is way too small and has slowed down parallel execution for me before. So why was I seeing poorer performance than I anticipated? I was not understanding stuff. So I needed to change one thing, see what impact it had and repeat until I got to the bottom of it.
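For reference, that parameter is static, so changing it means bouncing the instance – a sketch, assuming an spfile is in use:

    -- check the current message buffer size (the default is around 2K)
    select value from v$parameter
     where name = 'parallel_execution_message_size';

    -- static parameter: set it in the spfile, then restart the instance
    alter system set parallel_execution_message_size = 16384 scope = spfile;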

Except I could not – the next team in line was waiting for this data and they already had massive pressure to deliver. My job, what my employer was giving me money to do, was to fix this problem and move on. So, in this instance I just pragmatically got on with getting the task done as quickly as I could.

I did what we all do {don’t we?} under times of acute time pressure. I made a handful of changes, using the knowledge I already have and guessing a little, hoping that one of them would do the trick. This included reducing the degree of parallelism, adding the /*+ append */ hint (I simply forgot the first time around), pre-allocating the required space to the tablespace, muttering “pleaseopleaseoplease” under my breath….
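So the second attempt looked more like the sketch below – again with invented names, showing the direct-path append hint, the reduced degree of parallelism and the space pre-allocated up front:

    -- pre-allocate the space the copy needs, rather than growing the
    -- datafile in increments mid-load (path and size hypothetical)
    alter database datafile '/u01/oradata/dev/data01.dbf' resize 20g;

    alter session enable parallel dml;

    -- direct-path insert this time, with the parallel degree wound down
    insert /*+ append parallel(tgt 2) */ into sale_data tgt
    select /*+ parallel(src 2) */
           sale_id + 10000000
         , add_months(sale_date, 1)
         , customer_id
         , amount
      from sale_data src
     where sale_date >= date '2011-06-01'
       and sale_date <  date '2011-07-01';

    commit;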

It worked:

The job ran in less than 20 minutes and used less resource during that time. Well, it waited less anyway.
The wait histograms show lots and lots of shorter duration waits:

The duplication took 20 minutes, where the previous attempt had been terminated after 2 hours because other factors forced it to be curtailed. Job done.

But the thing is, the problem was not fixed. I got the task done in a timescale that was required, I satisfied the need of the client, but I was and am not sure exactly why.

If I was a permanent employee I would consider pressing to be allowed to spend some time working this out, as my employer benefits from me extending my knowledge and skills. This is not always a successful strategy :-) {but it does indicate to me that my employer has a Problem and I need to address that}. In any case, as I was a consultant on this job, I was being paid to already know stuff. So it is now down to me, some personal time, the internet and people more knowledgeable than me who I can ask for help to sort this out.

And that was my main point. Often at work you have to get over the issue. Later on, you work out what the issue was. I wish I could get paid for the latter as well as the former. The real blow for me is that I no longer have access to that site and the information. My job was complete and, whether they have really shut down my access or not, I am not allowed to go back into the systems to dig around. I think I now know the issue, but I can’t prove it.

Friday Philosophy – Software being Good is not Good Enough February 5, 2010

Posted by mwidlake in Friday Philosophy, Uncategorized.

In a previous Friday Philosophy on In Case of Emergency I mentioned that something being simply a “Good Idea” with technology is not good enough. Even being a GREAT idea is not enough. It also has to be:

  1. Easy and simple to use. After all, using that funny stick thing behind your steering wheel in a car, to indicate which direction you are turning, seems to be too much of an effort for many people. If installing some bit of software or running a web page is more than a little bit of effort, most people will not bother.
  2. Quick. No one has patience anymore, or spare time. This will probably be the fourth year in a row I do not plant any vegetables in our garden, as I need to spend a day or two clearing and digging over the spot for said veg. You can’t beat home-grown veg. Similarly, I won’t use a web page that takes as long to load as it does to plant a carrot seed.
  3. Known about. There could be a really fantastic little program Out There that allows you to take a screen shot, add a title and comment and paste it straight into a document for you, converting to half a dozen common formats on the fly. But I do not know about it. {ScreenHunter is pretty good, I have to say, and when I have shown it to people a lot of them like it}.
  4. Popular. This is not actually the same as “known about”. For a stand-alone application to be good for you, you just need to know where it exists. Like maybe a free building architecture package: whether thousands of people use it is moot; so long as you can get your extension drawings done in it with ease, that makes it great. But for something that relies on the community, like a service to rate local eateries, unless lots of people use it and add ratings, well, who cares? There are dozens (if not hundreds) of such community “good ideas” started every day, but unless enough people start to use them, they will fizzle out, as the vast majority of them do.

Point 4 is highly relevant to “In Case Of Emergency” as it is simple, quick and relatively known about. It just needs to be ubiquitous.

I became very aware of point 3 a few years ago and also of the ability for very clever people to be sometimes very stupid when it comes to dealing with their fellow humans.

I was working on a database holding vast quantities of DNA information. If you don’t know, DNA information is basically represented by huge long strings of A, C, T and G. So something like AACTCGTAGGTACGGGTAGGGGTAGAGTTTGAGATTGACTGAGAGGGGGAAAAATGTGTAGTGA…etc, etc, etc. These strings are hundreds, thousands, even hundreds of thousands of letters long. And scientists like to search against these strings. Of which there are millions and millions. Not for exact matches, mind, but kind-of-similar, fuzzy matches, where for example 95% of the ACTGs match but some do not. It’s called a BLAST match.

Anyway, suffice to say, it takes a lot of compute power to do this and a fair amount of time to run. There was a service in America which would allow you to submit a BLAST query and get the answer in 20 minutes or so {I have no idea how fast it is now}.

Some extremely clever chaps I had the pleasure of working with came up with a faster solution. Same search, under 5 seconds. Now that is GREAT. We put together the relevant hardware and software and started the service. Now I thought it went beyond Good or even Great. It was Amazing (and I mean it, I was amazed we could do a fuzzy search against a billion such strings in 2, 3 seconds using a dozen or so PC-type servers).

No one used it. This was because almost no one knew about it and there was already this slow service people were used to using. People who used the old service never really thought to look for a new one and the chances were they would not have found ours anyway.

I pushed for more to be made of this new, faster service, that it should be advertised to the community, that it should be “sold” to people (it was free to use, by “sold” I mean an attempt made to persuade the scientific community it was worth their while investigating). The response I was given?

“If the service is worth using, people will come and use it”.

No they won’t. And indeed they didn’t. It was, I felt, a stupid position to take by an incredibly intelligent person. How were people to know it existed? Were they just supposed to wake up one morning knowing a better solution was out there? The internet pixies would come along in the night and whisper about it in your ear? In the unlikely event of someone who would be interested in it just coming across it, were they then going to swap to using it? After all, no one else seemed to know about it and it was 2 orders of magnitude faster – suspiciously fast; how could it be any good?

The service got shut down as it was just humming in the corner consuming electricity. No one knew it existed, no one found it, no one came. I can’t help but wonder how much it could have helped the scientific community.

There must be thousands of other “failed” systems across the IT world that never took off just because the people who could use it never knew it existed. Depressing huh?

Testing Methodology – Getting Community Information January 22, 2010

Posted by mwidlake in Testing, Uncategorized.

Last post I tried to highlight that knowing specifically what you want to understand before you start testing is important, and that it is beneficial to start with the boring official documentation. This is so that you can filter what is Out There on the web, as well as get a good grounding in what is officially possible.

There is no doubt that you can get a lot more information on Oracle features from the community than you can from the manuals, and that it is often presented in a more easily understood and relevant way. But some of it is not right for your version or is just plain wrong. I can do no better than to refer to this recent posting by Jonathan Lewis on the topic. In this case, it is the persistence of old and now irrelevant information, perpetuated by the kind but flawed habit people have of pulling information together – but not testing it.

Treat everything on the net with caution if it has no proof. Even then be cautious :-) – I know I have “proved” a couple of things that turned out to be misunderstandings…

You can of course post to forums to ask for information and help, and people will often come to your aid. But you are often not really understanding the subject by doing that; you are just learning facts. It’s like learning the dates of the two World Wars and not understanding why the wars occurred. I would point you to this posting by Lisa Dobson from a couple of years back, which makes the point about what you stand to learn, and also says something about how to ask for help. If it helps you to do the “right thing”, think of it selfishly. You will get better and more specific help if you have already looked into your issue and can ask specific questions. Also, if you properly learn something rather than just get a couple of facts about it, you are less likely to look foolish when you repeat those facts in a context they are not valid in. And you will.

Sorry, side tracked a little into one of my pet annoyances. Back to topic.

I want to know about SQL AUDIT. So I google SQL AUDIT ORACLE. Lots of hits. Let’s start filtering. First off, anything by “experts-exchange” and the other pay-for forums I can forget – sorry guys, but if I am going to spend money I’ll buy a manual. I will also ignore any stuff by sites that constantly suggest you go on training courses on boats with them or are “Team America” {everyone seen the film by Trey Parker? Puts the claim “We are the American Team” into a different light…}.

Now I go and look at the sites and I try making my search more intelligent, maybe adding in the word DBA_AUDIT_TRAIL and the like. I try and think what words and phrases someone explaining the topic would use which would not be common for other topics. That is why I often use database object names and commands in my searches.

In the case of SQL AUDIT, there are plenty of sites that generally pull together the basics from the manuals and tell you how to audit a few things and see the results, and give some nice examples. It gets you further forward, but it’s mostly not very detailed – just cookery instructions on what to do, not how it works. Thankfully, there is an excellent article by one of the experts in the field, Pete Finnigan.

Maybe I could have replaced this post by simply suggesting you just go to Pete’s web site, read what is there and follow the links from it to the ones he likes. It would work for the topic of SQL AUDIT. However, although it is always good to go to the sites of people you know and trust, it is always worth doing a google and going beyond the first page or two of results. Those first pages are mostly for a handful of very popular sites and sources. A new article, or someone who is relatively unknown but has the answers you need, may be further down the list, like pages 3 or 4. It is worth 5 mins just checking.

However, you are not likely to find everything you need. Certainly with SQL AUDIT you won’t find much about performance impact, other than bland and not-very-helpful generic warnings that “use of SQL AUDIT can carry a significant performance overhead”. How much overhead? For what sort of audit? And how do you know? In fact, when searching I found this, admittedly old, comment by Pete about there being relatively little “out there” about the performance impact of auditing. That made me smile, as that information was exactly what I was after.

*sigh*. The information I want does not seem to be out there. I need to do some testing.

That is for the next post (“at LAST”, I hear you all cry).

Testing Methodology – The Groundwork January 20, 2010

Posted by mwidlake in Perceptions, Testing, Uncategorized.


I want to start learning about using SQL AUDIT, as I mentioned a couple of days ago.

First question. What do I want to know about SQL AUDIT? I might just have thought “well, I do not know much about this area and I think I should – so I want to know everything”. That is OK, but personally I find it helps to make up a specific aim. Otherwise you can just flounder about {well, I do}. In this case I have decided I want to know:

  1. The options for auditing who is selecting key tables (a hedged sketch of the sort of commands involved follows this list).
  2. How to audit when those key tables get changed.
  3. The impact on performance of that audit, and if it could be an issue.
  4. Following on from (3), how to control that performance impact.
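To give a taste of (1) and (2): object-level auditing is switched on with the AUDIT command, and the options already in place can be checked in the data dictionary. A minimal sketch only – the table is made up, and the AUDIT_TRAIL initialisation parameter has to be set (e.g. to DB) and the instance restarted before anything is actually recorded:

    -- hypothetical key table; audit who reads it and who changes it
    audit select on scott.key_table by access;
    audit insert, update, delete on scott.key_table by access;

    -- check what object-level audit options are now in place
    select *
      from dba_obj_audit_opts
     where owner = 'SCOTT' and object_name = 'KEY_TABLE';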

For any of you who have been or are code developers, you hopefully appreciate test-driven coding. That is, you decide up front what the new code must achieve and design tests to ensure the code does it. You do not write any code until you have at least thought of the tests and written them down in a document. {Ideally you have written the tests and built some test data before you start, but then in an ideal world you would get paid more, have more holidays and the newspapers would tell the truth rather than sensational rubbish, but there you go :-) }

I do not think that learning stuff/testing is much different from developing code, thus the list above. I now know what I want to understand.

What next? I’m going to go and check the Oracle Documentation for the feature. And I am very careful to check the documentation for the version I will use. This is 10.2 for me. I know, the Oracle documentation can be pretty dry, it can be rather unhelpful and it is not very good at examples. But you can expect it to be 90% accurate in what it tells you. You can also expect it to be not-very-forthcoming about the issues, gotchas and bits that don’t work. {I have this pet theory that the official documentation only mentions how things do not work once a feature has been out for a version and people have complained that the documentation should let on about the limitations}.

So, for SQL AUDIT I suggest you go and read:

  • Concepts Manual, chapter 20 Database Security. If I am not rushed I would read the whole chapter, I might realise that what I want to do is better done with some other tool (If I wanted to see who had changed records months down the line, I think I would pick up that database triggers were a better bet, for example).
  • SQL Reference, chapter 13, the section on AUDIT (no surprises there). I do not do much more than read through the SQL manual once though, as frankly I find it pretty useless for explaining stuff, but it puts into your head what the parts of the command are and pointers to other parts of the documentation. I’ll read the concepts manual with more attention. In this case, the manual will lead me to:
  • Database Security Guide chapter 8. Which is pretty good, actually.
  • My next source of information may not immediately spring to mind, but I find it very valuable: find out which data dictionary objects are involved in the feature. In this case, the previous sources would lead me to go to the Database Reference and check out:
  • DBA_AUDIT_TRAIL, DBA_OBJ_AUDIT_OPTS, DBA_PRIV_AUDIT_OPTS, DBA_STMT_AUDIT_OPTS. And of course, SYS.AUD$. {I actually queried DBA_OBJECTS for anything with the word “AUDIT” in it – something like the query below – checked out all the tables and also had a quick peek at the text of any views, which would have led me to SYS.AUD$ if I did not already know about it}.
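The sort of dictionary fishing I mean (the LIKE pattern will also net some false positives, but that is fine when exploring):

    -- fish the data dictionary for anything audit-related
    select owner, object_name, object_type
      from dba_objects
     where object_name like '%AUDIT%'
     order by owner, object_type, object_name;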

Why do I go and look at the data dictionary objects and the reference guide? After all, is it not nerdy enough to have put myself through reading the SQL Reference manual? {And let us be honest, it is rarely enjoyable reading the SQL Reference manual}. Because I want to know how it works, not just how to use it. Seeing the tables gives me a lot of information and the description of the columns may well tell me a lot more. First thing: SYS.AUD$ only has one index, on a column SESSIONID# (I know, there is another column in the index, I need to get to a DB to check this). Any queries not via this index are going to scan the whole damned thing. Right off, I suspect this will be an issue.
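Checking that index, and the full column list I left hanging above, is a quick query against the dictionary:

    -- see exactly which columns the index(es) on SYS.AUD$ cover
    select index_name, column_name, column_position
      from dba_ind_columns
     where table_owner = 'SYS'
       and table_name  = 'AUD$'
     order by index_name, column_position;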

I will DESC the tables, see if there is already any information in them. In this case, there was not. A clean sheet.

Why did I not just go to Google (Bing, Yahoo, whatever) and search? Because I can trust the manuals to tell me stuff which is mostly true and is valid for my release. Not stuff about older versions, about later versions, urban myths or just outright fallacies. The manuals certainly will not tell me everything, far from it, but what it does say is solid stuff. With this reliable start I can now go to other sources and have some chance of filtering  all the other stuff that is Out There. Once filtered, it is probably a lot more informative and easier to digest than the manuals.  

I’ll ramble on about that in the next posting.

New Year, same old rambling thoughts January 5, 2010

Posted by mwidlake in Blogging, Perceptions.

It’s not Friday but, heck, it’s a New Year, and there are many of us who might appreciate a non-techie, pointless ramble at the start of the first full working week of a new decade… a Friday Philosophy for the New Year. {If anyone wants to point out the New Decade starts on 1st Jan 2011, go take a running jump – popular opinion is against you, even if logic is for you}.

I found the UKOUG techie conference this year particularly interesting as it was the first major meeting I have been to since I started blogging, and I came across two main opinions about my attempts:

Those who like my blog as it is “chatty” and rambles a bit.
Those who dislike it – because it is “chatty” and rambles a bit…
{oh, and the third opinion, the most common, of utter ignorance of my blog – there goes the ego}.

Well, you can’t please everyone. I was a little saddened, however, as I spoke to a couple of people I really admire in the Oracle Knowledge world and they landed on the “chatty and rambling – bad” side of things. Damn. But they are so good at what they do, I forgive them. The swines.

But then I remembered what I said to a fellow blogger the other month. We bloggers/twitterers all put forward what we blog about in our own style. We might not blog something that is new, we might blog something that is “well known”, but we put it in our own style. Some like it, some do not. It matters not, so long as it adds to the sum of decent knowledge out there.
Some will hate our style and not read, some will read and enjoy. So long as the information gets out there to more people, that is fine.

So, do I think everything I blog is decent knowledge? Oh, I wish. I like to think it is mostly there {and I wish it was all correct} but I am realistic. I test most of what I blog, or I have lived for real most of what I blog, but I will make mistakes. And sometimes I will hit the edge of something good and I put it up there in the hope others will contribute {like the recent one on translating min-max column values into human-readable stuff}. And often people do contribute, and that is really, really good.

But I do and will continue to make mistakes, be daft, or just put things poorly. I have learned a fair bit in the last 8 months about written communication, the art of communicating to a global audience and also about how not to spread a topic over several weeks as you hope you can “just finish off those half-written blogs in an hour or two” and find it takes weeks. If anyone wants to give me any constructive criticism, please do, but maybe use my email (mwidlake@btinternet.com) rather than flame my postings.

And my last rambling thought for the start of 2010? I am probably going to post less in the next 6 months. I am always sad when the blog by someone I enjoy goes very quiet, but then we all have real jobs to do, so I try to be patient. In my own case, I have noticed I now read a lot less of other people’s blogs as writing my own takes so long. And I am missing too much. There are blogs I really admire or I have discovered in the last 6 months (sometimes both) that I simply fail to really read and they know stuff. So I need to read them. I am going to try and maintain a steady 2-3 blog entries a week, but for the next 6 months I am going to concentrate on learning. Something blogging has taught me is I am really quite ignorant :-)

Good wishes for 2010 to all and everyone who stumbles across my ramblings.

Friday Philosophy – Memory (of the Human Kind) November 14, 2009

Posted by mwidlake in Perceptions, Uncategorized.

One of the Joys of having been around for a while (I’ve hit that point where I occasionally find myself working with people who were not born when I got my first job in IT) is that you start to re-learn things. Well, I do. {I have to point out that, this being a Friday Philosophy, you will learn nothing yourself from this post, especially not about Oracle. If you want Oracle knowledge you might want to pop by Dion Cho, Tanel Poder or Richard Foote}.

Several times over the last couple of weeks I have learnt something “new” and realised a little while later that I already knew it, but had forgotten (BLEVEL is one, consistent gets being impacted by array fetch size is another). This is not something to be proud of – not knowing something you once did is far worse, and slightly more worrying, than never having known it…

I like to think that it is all the new stuff I am learning pushing previous knowledge out of my brain’s buffer cache {if it is not warmed by re-using it}, but I suspect that it might just be an uptime-related memory leak. A bit like having your database open for a few months and doing a lot of work that includes massive PL/SQL in-memory tables and hash joins (pickler fetches). No one seems sure of the exact mechanics, but after a while you have a hell of a lot less spare memory in your server than you started with :-)

Maybe the memory loss is closer to the “pickler query” analogy than I thought, you can preserve soft tissue for a long time in alcohol. I’ll have another glass of wine and think on that.

This Forgetting Stuff was actually a major factor in my starting a blog. I was in the process of putting a load of notes on Oracle Oddities and things I had learnt from years past onto a web site so I could get at them from wherever I was in the world, and a friend told me I was being stupid – I should put them on a blog. So I did. There are only two little issues with this.

  • I can’t remember what I have already put on my blog.
  • I’ve forgotten where I put the ZIP disk with the notes on.

So I was contemplating this drift of knowledge and two things struck me.

1) I reckon that the very best people, in any area of expertise, are blessed with excellent memories. There is a comedian in the UK called Stephen Fry, and he is renowned for being stunningly clever. I think he probably is stunningly clever, but he also remembers everything he has learnt (Rory McGrath is another UK comedian with a perfect memory, but somehow he lacks the charm of Stephen Fry, and he has a beard, so not so highly renowned).
2) My poor memory is not maybe such a bad thing. I don’t have to get so upset when a whole chunk of my Oracle knowledge becomes obsolete. I used to be really good at sizing objects and utilizing space, taking into account the 5-block minimum, getting the extents to fit nicely into the datafiles, little scripts to resize segments into a small but sensible number of extents to reduce wasted space, considering initrans, pctfree, pctused… Pretty much all pointless now :-)

Buffer Cache Hit Ratio – my “guilty” Confession November 1, 2009

Posted by mwidlake in Perceptions, performance.

My Friday Philosophy this week was on Rules of Thumb on buffer gets per row returned.

Piet de Visser responded with a nice posting of his own, confessing to using ratios to help tuning {We seem to be playing some sort of blog-comment tag team game at the moment}.

Well, I have a confession so “guilty” or “dirty” that I feel I cannot inflict it on someone else’s blog as a comment.

I use the Buffer Cache Hit Ratio.

And the Library Cache Hit Ratio and the other Ratios.

As has been blogged and forum’d extensively, using these ratios is bad and stupid and anyone doing so does not know what they are doing, as they do not help you solve performance problems. I mean, hell, you can download Connor McDonald’s/Jonathan Lewis’s script to set it to what you want, so it must be rubbish {go to the link and choose “tuning” and pick “Custom Hit Ratio” – it’s a rather neat little script}.

The point I am trying to make is that the Buffer Cache Hit Ratio (BCHR) was once wrongly elevated to being regarded as a vital piece of key information, but the reaction against this silly situation has been that it is now viewed by many (I feel) as the worst piece of misleading rubbish. Again, a silly situation.

I think of the BCHR as similar to a heart rate. Is a heart rate of 120 good or bad? It’s bad if it is an adult’s resting heart rate, but pretty good if it is a kitten’s resting heart rate. It’s also probably pretty good if it is your heart rate as you walk briskly. Like the BCHR it can be fudged. I can go for a run to get mine higher, I can drain a couple of pints of my blood from my body and it will go up {I reserve the right not to prove that last one}. I can go to sleep and it will drop. Comparing my resting heart rate to yours (so like comparing BCHRs between systems) is pretty pointless, as I am a different size, age and metabolism to you {probably} but looking at mine over a year of dieting and exercising is very useful. If only I could keep up dieting and exercising for a year…

So what do I think the much-maligned Buffer Cache Hit Ratio gives me? It gives me the percentage of SQL block accesses, across the whole database activity, satisfied from memory as opposed to disc. Or, put another way, the percentage of occurrences where a block has to be got from the I/O subsystem. Not how many blocks are read from storage or memory, though, but you can get that information easily enough. As physical IO is several orders of magnitude slower than memory access {ignoring I/O caches, I should add}, it gives me an immediate feel for where I can and can’t look for things to improve.
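For what it is worth, the classic instance-level calculation from V$SYSSTAT goes something like this (one common formulation; there are variants that treat direct-path reads differently):

    -- classic instance-level BCHR since startup: 1 - (physical / logical reads)
    select round(100 * (1 - phy.value / (db.value + con.value)), 2) as bchr_pct
      from v$sysstat phy, v$sysstat db, v$sysstat con
     where phy.name = 'physical reads'
       and db.name  = 'db block gets'
       and con.name = 'consistent gets';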

If I am looking at a system that is overall very slow (e.g. high process wait queues under Linux/Unix, the client has said the system is generally slow) and I see that the BCHR is low, say below 90%, this tells me I probably can get some performance increase by reducing physical access. I’ll go and look for those statements with the highest physical IO and the hottest tablespaces/objects in the DB.

If the BCHR is already up at the 99% level, I need to look at other things, such as tuning sorts, looking at removing activity in the database, and being very mindful of nested loop access where maybe it is not the best access method (very likely due to old stats on tables).
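Hunting down the statements with the highest physical IO can start with something as simple as this against V$SQL:

    -- quick and dirty: the current top-10 statements by disk reads
    select *
      from ( select sql_id, disk_reads, executions,
                    substr(sql_text, 1, 60) as sql_snippet
               from v$sql
              order by disk_reads desc )
     where rownum <= 10;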

When I have got to know a system and what its BCHR generally sits at, a sudden change, especially a drop, means there is some unusual physical IO going on. If the phones start going and someone is complaining “it’s all slow”, the BCHR is one of the first things to look at – especially as it is available from so many places.

Another thing the BCHR gives me is that, if I am looking at a given SQL statement or part of an application, its specific BCHR can be compared to the system BCHR. This does not help me tune the statement itself, but I know that if its specific BCHR is low then it has unusually high IO demands compared to the rest of the system. Further, reducing it might help the whole system, so I might want to keep an eye on overall system throughput. If I reduce the statement’s execution time by 75% and the whole system IO by 1%, the client is likely to be more happy, especially if that 1% equates to other programs running a little faster “for free”.

So, I don’t use the BCHR to tune individual statements but I feel confident using it to track the general health of my database, so long as I am mindful of the impact of new functionality or upgrades. It’s a rule of thumb. It’s a database heart rate. (and so is redo generation and half a dozen other things).

When do We Learn #2 October 20, 2009

Posted by mwidlake in Blogging, Perceptions.

I exchanged a couple of mails with a friend a few weeks back about how the same topic can arise in a couple of blogs at the same time. Well, I had just blogged myself on when we learn and, blow me over with a feather, Jonathan Lewis goes and posts in a similar vein. He must be nicking ideas off my blog :-) {and yes, I am being tongue-in-cheek here}. We both posted thoughts about needing spare capacity in your life to be able to spend the time to really understand how something works. Yes, you learn a lot in the heat of a crisis, but you rarely really understand the details, i.e. become an expert, without having time to digest and qualify that knowledge.

I did write a long comment on his posting, including some links back to my own meandering thoughts on the topic, then realised that I would come across as a bit “me too”, so I trimmed it and took out the links. But that is part of why I do my own blog: I found I was spamming other people’s pages with my diatribes and so decided to spam my own. {And I know I am meandering, I’m a bit sleep-deprived, stream of consciousness at the moment}. So here I can refer back to my own stuff and say “me too”, but you are already here reading this, so you only have yourself to blame :-)… Anyway, I wanted to refer back to a very early blog of mine about how much knowledge is enough. I try to make the point that you do not need to know everything; you can become a small-field or local expert just by being willing to learn a bit more.

Jonathan raises the point that he does not have a full-time commitment to one client and so has the luxury of investigating the details and oddities of what he looks into. He suggests this is a large part of why he is an expert, which I feel is true, and I am very happy to see one of the Oracle Names acknowledging that relative freedom from other pressures is key to having the luxury to chase down the details. Those of us in a full-time role doing e.g. DBA, development or design work have more than enough on our workday plates to keep us too busy. We cannot be top experts; we have a boss to satisfy and a role to fulfill. {Jonathan does not mention that choosing a career where you have the luxury of time is also a pretty brave choice – you stand a good chance of earning a lot, lot less whilst working very hard to establish enough of a reputation to be able to earn enough to feed yourself and the cat}.

But this is not a black and white situation. There is room for many of us to become experts in our domain or in our locality. Our breadth of knowledge may never be as wide as others’, and we may not know more than anyone else in a given area {and let’s face it, logically there can only be one person who knows the most about a given topic, and that one person is probably in denial about their superiority, which seems to be a defining quality of an expert – it is not so much humility, I think, as an acknowledgement of there being more to know and a desire to know it}. However, most of us can become the person in our organisation who knows most about X, or who can tie A, B and C together in a more holistic way than others (and that can be a real trick, you know). There are always the top experts you can call on for the worst problems, but you could become the person people come to first.

My advice would be to not try and learn everything about all aspects of Oracle, because you can’t, but rather learn a lot about one or two areas {and consider areas that are more unusual, not just “tuning SQL” or “the CBO”} and expand just your general knowledge of the wider field. And never forget that there is more to learn. So long as you are taking in more knowledge and understanding, you are improving. The best way to do it? Don’t just read other people’s stuff, try teaching someone else. It never ceases to amaze me how stupid I realise I am when I try and show someone else how something works. But that’s OK, so long as they learn it’s fine. If I learn as well, it’s great, and I nearly always do.

I’m getting on a bit, and I think I am finally getting the hang of the idea that the more you know, the more you realise you don’t know. I wish I had known that when I knew nothing.
