jump to navigation

Keeping the server and storage utilisation high October 9, 2009

Posted by mwidlake in performance.
Tags: ,
trackback

A friend of mine sent me this today, from an old site of mine:

******************************************
From: John Smith
Sent: 07 October 2009 10:08
To: Sarah Sims
Cc: DBA team
Subject: RE: Performance issues on your Test servers
 

Hi Sarah,

Please could somebody tell us why this query is running repeatedly on your database:

SELECT composite_key , exact_time , object_type , table_name , user_id , xml_data FROM usr1234.acramendlogshd ORDER BY 2;

It’s very prominent in both of the 1-hour time slices I’ve analyzed so far, and is fetching the entire 12GB 20M-row table.  This is so absurd that it looks almost like a programming error!

The same table in the production environment is almost the same size, 18M-rows.
 
Regards,

John
******************************************

******************************************
From: Sarah Sims
Sent: 07 October 2009 11:18
To: John Smith
Cc: DBA team
Subject: RE: Performance issues on your Test servers

John,

This would be the ACRALS service which has a bug in it currently which means that it runs continuously but never achieves anything.

Sarah
******************************************

So, not only inefficient and pointless but known to be inefficient and pointless. And still on.

And as some of you may suspect, yep this is a third party application where changing the code is forbidden. Seems like testing before release might also be forbidden….

Comments»

1. PdV - October 9, 2009

You need more hardware !
End of.

You need to order extra hardware, If only to be seen to do something.
(hardware vendors love packages apps…)

Hardware gets cheaper and faster, so we have to consume more of it all the time to keep the market going. I remember in 2007 a “visionary” stating: “Hardware is cheap. Soon, you will be ordered to solve a problem with hardware, period, no disucssion”. And low n behold exactly that happened in May-09. In a quite-rude way too.

Our real solution in May was to purge old data, btw. Every record of the “history” (sometimes thousands per entity) resulted in a dynamic-sql qry fired.. ?

Apparently, deep down in the releasenotes, the software-vendor had stated that “at most 2 weeks of history should be held in operation”, and we had nearly two years worth of old data. A miracle the shared-pool had withstood the bashing at all (hey, Oracle can be rather good. Me too, I often like Oracle – my career depends on it).
What prolly helped us was the fact that each dyn-sql also required a round trip from app to db, that prolly gave the db time to breathe (that is speculation on my part tho).

But the damage was done: new server was on (forced) order anyway.

The DBA had a good excercise-moment for physical standby when switching over. Quite a positive twist, as he is now much more confident with DataGuard: He got the opportunity to test it for himself on his own live system.

In your case, I suggest you pre-emptively contact your hardware vendor. They really want to help, you know.

2. Martin Berger - October 9, 2009

hmm,
maybe this is the reason, why Oracle introduced the query result cache?
But, as allways, if you cannot change the game, you even can manipulate some undefined areas in the play {or just buy the referee}?
A FGAC policy which adds a ‘where 1=2’ was my first idea (not sure if it can be applied only on specified SQL_ID, but worth trying).
A 2nd idea (if 11g) would be using SQL Plan Baselines with a ‘slightly different’ statement {did I mention ‘where 1=2’} and try to plant it on the database.
Yes, I also like Oracle, as there is {nearly} allways at least one more way to solve my problem than I can think 😉

3. Gary - October 9, 2009

If you are on Enterprise Edition, Advanced Query Rewrite can do what Martin was suggesting in regard to adding a 1=2 predicate
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_advrwr.htm#CFAJDGBJ

4. mwidlake - October 10, 2009

Aaaargghh! No guys, don’t work around the problem, make these damned slap-dash vendors fix their crap code!!! Withhold payment, ring up their managing director and shout at them, tell user groups how rubbish the vendor is, go around their offices and cause trouble, pay tramps to hang around the foyer, anything but “fix” their problem, as that is condoning bad behaviour!!! {can you see the blood vessels throbbing in my temple from there :-)?}

OK, OK, as Martin says, often you can’t change the game {ie the vendor already has your cash and is not helping, or is just so useless that they can’t fix a glaring bug like this “until the next release”, or are so big compared to you that they don’t care} and you’ll be damned if you will let your company pay the price of the duff code just to prove a moral point, in which case Martin and Gary’s fix comes to the rescue nicely. Very nicely in fact. – And Piet, just accept the new hardware. There is always some forgotten system somewhere that you know is critical but no one will pay for an upgrade of, but now you have some “old” kit you can recycle…

And that leads me to something about the situation that I keep thinking about. You guys and I would look for a solution. The person “looking after” the system knew about the duff code. But as far as I know did nothing to fix it or work around it or even go and ask the company experts for advice, she just complained the system was slow and wanted to know why. The connection between a repeated massive FTS and a slow system was not even made.

There are possibly two problems with the system. The code AND the person looking after it. If I was a manager there, I’d now be asking if this person needs more training, more motivation, some pointers in being proactive or simply moving to a job making tea for everyone else.

5. Martin Berger - October 10, 2009

I’m really sorry for throwing your philosophical blog into a technical direction. Really. I am!
I tried to reflect why I acted as I did, and I came to some answers:
*) I’m a technican. So I search for solutions in the area I know best – first.
*) I’m used to optimisation. So I use it first in choosing the method to target the problem: Should I a) push the vendor, b) push the person who favours ‘looking away’ over ‘looking after’, c) solve it myself?
a) will cost me a lot of effort in first {making someone} contact the vendor, and in subsequent more effort to ask for a solution, escalate, maly my manger to escalate, make the CTO … *grrr* Sometimes this is fun, but often it’s not.
b) sounds similar to a): As this person did not care until now, why now? So it will end in a kind of escalation within the company. Also boring and frustrating.
c) is much easier: After identifying the problem (which you prodivded, in this case), searching for a technical solution which meets the gameplay. I did this in my previous comment and Gary answsered within less than 1 day {what a performance! It’s still weekend}. The next steps would be development, testing, documentation and presentation of the solution. Probably not more than 2 days alltogether.
Which Solution to chose?
a) will probably last for months.
b) even for weeks
{a) and b) also cost a lot of in-company-effort}
c) can be very fast in comparison to a) and b) and even gives me more fun {did I mention I’m a technican?}
You said, there are possible two problems with this system: the code (vendor) or the person. But I’d say, there is only one: the manager who is responsible for both {the purchase and hiring decision}. That’s what managers are for: make decisions and solve problems resulting in them. Even here is a ‘BUT’: what’s the cheapest solution to choose: a), b) or c) ? Here the manager might be d’accord with my findings.
To come back to this blog and the thread, this shows my first comment was correct {even it piqued your initial question} – first just by intuition, now by deduction.

mwidlake - October 10, 2009

“I’m really sorry for throwing your philosophical blog into a technical direction”
Well, yes you did (and it was a very good technical solution) but your response is also very philosophical and very interesting. And I was just having a half-joking Rant when I replied.

I too am a technician, but one who has been a manager for many years as well, so I have been both sides.
As a technician I would agree that (c) fix the problem so that my system is OK and move on is easiest and quickest, if there is a technical bandage I can apply. This is not always true.
(b) fixing the person who did not actively handle the situation does take more time, but you should end up with a more effective person – and people can and do improve. Would you say you have not increased your problem solving skills and interpersonal skills as you got older? {and there is a circular argument that goes, if you do not realise you need to improve your interpersonal skills, you REALLY need to improve your interpersonal skills, hehe}.
Following on from (b), you suggest towards the end of your comment that the real problem is the manager. Maybe. Maybe not. The manager may have had no control over the introduction of the vendor’s software (and in this case, I happen to know it was foisted on the IT department despite the DBA manager’s opinion AND the head of IT’s opinion). The manager may have not hired the member of staff or is trying to take someone with junior skills and improve them. That would be a very good manager in my book.

Which leads to (a), sorting out the vendor. You are right, a lot of effort and payback to your company is only going to be worth it if you use that vendor a lot or the problem is extreme.

I would say, as an ex-manager, that (c) fixing the problem is paramount and cheapest, but (b) is probably the best overall cost option as I would be developing my asset, the member of staff. This is not empty theory, it is something I always try to do as a manager. It is one of the few things that made me a good manager, in my self-analysing opinion. (c) and (b) are not mutually exclusive options.

That still leaves (a) fix the vendor, which was my Rant. And I am going to find it hard to argue for from a business point of view. But from a Moral point of view, I DAMN well want to see more of it!

A thought – “by intuition and now by decuction”, you feel that (c) is still the correct answer. What happens when problem two comes along and this time (c) is not possible as, no matter how much fun you have trying to answer it as a technician, you can’t bandage the new problem? Then you only have (a) beating up on the vendor. You should have tackled that vendor last month when you had the first problem as now they would know you are not to be messed with and have a list of broken bits with their code to tell others about…. {Oh, I can dream on}

If all individual businesses stood up against poor vendors, demanded action on poor code, complained about untested code, basically were more demanding of quality, we might get better stuff from all our vendors and drive bad vendors to the wall.
However, capitalism and communism (which is pretty much what I want to see, the community of users acting for evereyone’s benefit), well, they are pretty much mutually exclusive arn’t they? Value for money for one company? Solve the problem and move on.
So I am on to a loser.
Ohhh, that makes me mad!
🙂


Leave a reply to mwidlake Cancel reply