jump to navigation

Friday Philosophy – Alternatives to Testing February 19, 2010

Posted by mwidlake in Friday Philosophy, Testing.
Tags: ,
trackback

We all know that we should test everything before we do it on a production system.

Or should we? Well yes, you should, but sometimes there is no suitable test system (or even any test system, which seems to be becoming a more common problem) or the testing regime takes weeks and you need a fix right now.

So given that there really is a proper business reason why doing the testing is expensive {in effort or time} or impossible, what can you do? I believe there is an alternative.

To be able to reduce or remove the need for proper testing you need to be absolutely sure the following three things are true, in order of importance:

  1. Be able to monitor the live system for negative impact of your changes.
  2. Have a proven and quick rollout.
  3. Be confident that the chance of negative impact on your system is low likelihood and low impact.

Note that, in my opinion, the least important {yet still very important} consideration is that the change be almost certain to work and won’t cause too much grief if something unexpected happens.  However, this seems to be the main concern when people are considering a “Quick Fix” on a production system. “Is it likely to go wrong? Is it going to matter if it does not help?”. At this point all eyes tend to swivel to the DBA or the Lead Developer who has to go on what I call DBA Instinct. Given your experience to date and what you know about how the database works, give it your best guess.

The problem with DBA Instinct is that if you are not very experienced you are unlikely to get it right. If you are very experienced, most of that experience was on earlier version of Oracle and you probably do little hands-on work now as you are a manger. So you are still likely unlikely to get it right :-)

So, I argue that point 1 (spotting things going wrong) and 2 (Getting back to where you started) are most important.

Let us take the classic example of just adding an index as our untested change (as we have no system capable of generating a realistic workload and thus the impact across the board).

3) Most of us would agree that adding an index is easy and the impact is low and the chance of plans going terribly slow by using the new index unsuitably are low…

2) Rollback is relatively simple, you drop the index again. Question, can you just drop an index on a table if a SQL statement is in-flight and using that index? You know, a SQL statement that the CBO has decided is going to be useful and, despite your DBA Instinct that nothing is likely to go wrong, now takes 20 seconds to run rather than 0.2 seconds. And it is being run all the time on your OLTP system so now there is no time when there are not 2 or 3 queries in flight.

3) Verifying impact. This point is vital and can be tricky. You can just rely on staring at your monitoring tool and hoping any serious impact shows up so you can catch it and do something about it. Or people ring up and complain. That would not be so good, you have impacted the business. 

I would pull out code from the SGA that is referencing the table to be indexed before you start (I’ll stick a bit of code at the bottom for doing this). If the list is not too large, you can do the same check after the code has gone in and compare changes to buffer gets and disk reads per execution.

A more targeted check is to pull out of v$SQL_PLAN anything that starts using the new index and check to make sure it is not now a problem.

A second example, I want to gather system statistics as I am sure the system does not appreciate how blindingly fast my disks are for multi block reads.

3) Impact? Well, I would suggest the potential impact is wide and across the board, else why do it?

2) Rollback? Delete the system statistics. What happens when you delete the system statistics? Are they all set to null? Are the defaults put back in place? Could I use DBMS_STATS.RESTORE_SYSTEM_STATS? What is the impact of any of them? {RESTORE appears to work reliably and, if you delete the system stats, you get the “out of the box” values again, which could be very different to what you had before your spur-of-the-moment gather}

1) How to assess negative impact? You are going to have to monitor the whole system and hope to pick up any code that starts running slowly before it impacts the business too much.

I would probably add the index but I would need to do at least some testing and proof of rollback before I gather system statistics. And I would only do so without proper testing if said proper testing was impossible due to a lack of test systems.

So, I do think you can implement changes without testing or with reduced testing, but it is not always the easy option.

I also think it is a valid (and often quicker, in terms of elapsed time) way of changing the productions system without going through full unit/integration/regression/NFR testing.

Now you just need to persuade the Change Advisory Board that it is OK to do the change. Good luck :-)

Oh, that code:-

-- chk_sga_txt.sql
-- Martin Widlake - look for sql in sga containing the provided txt
--
set pause on
set pages 36
col first_load_time form A20
spool chk_sga_txt
select
first_load_time
,parse_calls prse
,executions  excs
,buffer_gets buffs
,disk_reads   discs
,rows_processed rws
--,address        address
,hash_value     hash_value
,sql_id
--,plan_hash_value
,sql_text
from v$sqlarea
--where parsing_schema_id !='0'
where upper(sql_text) like upper('%'||nvl('&sql_txt','whoops')||'%')
and rownum <50
--order by first_load_time desc
order by (greatest(buffer_gets,1)/greatest(rows_processed,1)) desc
/
spool off
About these ads

Comments»

1. Coskan - February 19, 2010

Good points Martin,

Your post reminded me a post of Jonathan Lewis about dependent plans

http://jonathanlewis.wordpress.com/2009/05/05/dependent-plans/

very handy if you want to see what is dependent to the object you want to alter

mwidlake - February 19, 2010

I just checked out that link Coskan. The original posting went by me somehow but it is indeed a very cunning script. Thanks for that.

2. Niall Litchfield - February 19, 2010

There’s good news though, even web developers with trendy technology are beginning to think in the way described in your post http://www.contrast.ie/blog/there-are-no-small-changes/ (by way of Tom Kyte and his data centric view of the same issue).

I’m personally really fed-up with change advisory boards. Not because I think they are bad, but humans, being humans and liking talking, seem to have missed the point of them. The point isn’t to have a board, the point is to ensure that changes are technically reviewed for precisely the sorts of issues you describe above.

mwidlake - February 19, 2010

Hi Niall,

As soon as I clicked on the link I had a flash back image {NO not flashback(tm)} of Tom doing that exact presentation at the UKOUG TEBS conference this year.

CABs. *Sigh*. A whole Friday Philosophy coming up on CABs. Great idea, can be very useful, sometimes vital, but most often a painful blockage to the whole development and support process. It’s a person thing, you are right. Watch this space.

Peter Scott - February 19, 2010

Sigh indeed – Process is good… except when it is bad process.
The CAB that wanted to approve each dataload of data warehouse in case the new data affected live performance – and could we test each load first and send them the test results and test plan?
The CAB that approved a major change (20 hours outage) in the data centre without asking the business about the impact to them (major as there was a takeover in progress and the auditors needed to run every report known to man) – the CAB then said that we should have blocked the change at CAB meeting – then realised that we were deliberately excluded from their process
And don’t start me on the pure stupid changes that people do without even thinking / asking … like deciding to change the connect details in TNSNAMES so it had a “more meaningful” name and then NOT wondering why the BI server could no longer connect to the database..
I’m sounding an old man :-)


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 163 other followers

%d bloggers like this: