jump to navigation

Friday Philosophy – ICE {In Case of Emergency} January 31, 2010

Posted by mwidlake in Friday Philosophy.
Tags: , ,
6 comments

ICE is a simple concept. In this case it stand for In Case of Emergency and the idea is to store a contact number on your mobile phone under the identifier of ICE. Then if you are in an accident, suddenly become very ill or are in any other way incapacitated, your ICE number can be called. The person you list under ICE is someone who knows you well and hopefully knows of any medical conditions you have and who can get in touch with others. It can be incredibly useful to medical services to know if you are alergic to common drugs or have any known medical conditions (like heart issues or diabetes). I guess in worst scenario of you being, well, not alive anymore, it gives someone to contact who would want/need to know. What a nice, simple use of modern technology! Go one, get your phone out and put and ICE number in now.

The idea originally came from Bob Brotchie, a UK paramedic.
The idea has caught on enough that there are companies in America trying to make a bit of profit out of it {something that I feel slightly negative about} and it has a wikipedia entry.

There is one big issue, which is if you have locked your mobile phone, no one is going to be able to see your contacts and thus the ICE details. This is something that others have thought of and one suggestion is to make your wallpaper show your ICE contacts.

However, this idea of ICE only works if people know about it and use it. I became aware of this last year when I saw an old lady collapse in the street. She just stopped, staggered a little and went down, attempting to take a brick out of a wall with her head as she went down. Three or four of us rushed over, thankfully one was a nurse so she took charge of looking after the patient. We others quickly found her phone. I bet what you are now expecting me to say is that she had no ICE number. She might have, but none of us knew to LOOK for an ICE number. We took pot-luck on the number saying “Jack at work” or something and thankfully got her husband.

Maybe we were an unusually unknowing bunch of people and all you lot reading this know about ICE numbers, but given we were all turned on enough to come to the lady’s aid, one was a nurse, I work in IT and none of us knew about this useful use of technology, I suspect it is not that universal an idea yet. I only know about ICE as my H&S brother told me about it only a couple of weeks after the above incident. {He was asking me if I could see a negative side to it and all I could think of is that if someone found/stole your phone then they had a known “someone who cares” number for you, but then they also have your whole contact list if you do not lock your phone, with things like “Mum” and “Uncle Bob” on it}

However, it takes only a couple of minutes to put an ICE number on your phone, so nothing is stopping you doing it. If you have a modern “all singing, all dancing” phone you might have an ICE feature or app you can download to show ICE information with the phone still locked (my phone is only 4 months old but has no such feature, but then it was a cheap, temporary “just for the week” buy when my old phone committed suicide). A quick check shows the iPhone app can also show things like drug susceptibility and known medical issues too, which is more immediate help to emergency services than a contact number.

So, I would encourage you to put an ICE number on your mobile phone. I would also encourage you to spread the word about ICE. It’s a great, easy, simple use of technology, but only if it is popular.

Just a pointer to an update January 27, 2010

Posted by mwidlake in Testing.
Tags:
2 comments

This post is really just to highlight that I finished salvaging Yesterdays Post. If you found it of any interest before, go have a look at the end. If not, heck go look at dilbert. As funny as ever but BOY is that web site suffering from advertising hell and slow performance…

Update – I removed the link to the dilbert site because I went over to the site the other day and got one of those “Your PC is INFECTED” cons popping up in my browser. You know, the ones that tell you you cannotget rid of the infection without their software, but the only infection is their software. I could not get rid of it without shutting down my browser session (and yes, I have PC security software that is updated every day).

Testing Methodology – How to Test #2 January 26, 2010

Posted by mwidlake in performance, Testing.
Tags: ,
1 comment so far

<previous post …

Last post I (see link above) I said something about how I like timing how long things {usually SQL queries but you can apply this to more complex situations} take as it is simple. But I highlighted that it is an approach that misses a lot of information and also can catch you out if you do not appreciate that the first run of any test script is likely to be slow (due to parsing and data caching overheads which are reduced for subsequent itterations).

I want to continue on that theme now, of simple testing with just how long things take. And how to cope with variation.

Hands up all of you who have Oracle on a PC at home? Great, you can test what you want on a dedicated machine. No one else is running anything on your machine, no one else is using your storage, no one else is using your network (if you have a network at home) and no one is running the SQL statement from hell on your database.

BUT. You might have some download of the latest security patch set running in the background, your virus scanner may kick in (or you might just have Norton full-security-everything-running-on-your-cpu-all-the-time turned on). Your miles per gallon may vary, especially on Windows but also on Linux.

For many of us we are testing on a server at work, where developers are working, where the storage is shared, where the network is dealing with that lady behind you watching “Celebrity Strictly come roller-skate dancing impersonations” {well, she sat behind me for 4 months and she was in the Quality team, what does that tell you}. Bottom line, the performance of your server will vary, your testing needs to cope with that.

How do you cope with a varying workload on the system you are running tests on? Well, you run the new and old version of your test several times, in succession, interleaved. This is ideal when , in the very un-ideal situation of having to test on a Live situation. Testing on “Live” is wrong, in the same way that spelling “initialisation” with a Z is wrong, but it is often forced upon us.

As an example, I want to check into the questionable suggestion someone made that selecting count(*) from dba_tables where owner=user is faster than selecting count(*) from user_tables.

So, I knock up a quick sql*plus script that runs the two alternatives in quick succession, interleaved.

--testa.sql
set timi on pause off
spool testa.lst
--
prompt run 1
select count(*) from user_tables;
select count(*) from dba_tables where owner=user;
--
prompt run 2
select count(*) from user_tables;
select count(*) from dba_tables where owner=user;
--
prompt run 3
select count(*) from user_tables;
select count(*) from dba_tables where owner=user;
--
prompt run 4
select count(*) from user_tables;
select count(*) from dba_tables where owner=user;
--
prompt run 5
select count(*) from user_tables;
select count(*) from dba_tables where owner=user;
--
spool off
-- eof

Just a couple of things to mention. I always spool out to a file called the same thing as the script, with .lst at the end (which is the default file extension for spool commands but I like to state it) and I like to use “prompt” to record in the output what is going on. Here it is obvious, sometimes it is not. Oh, and I like to put –EOF at the end of my files, so I can spot cut-and-paste errors. I know, I am very “command line” centric. I love pictures too Doug. Finally, I should not run the two versions with the same one first each time, I should swap them over after every two runs (not each run as then you are running A,A,B,B,A,A,B,B. A more complex A,B,A,B,B,A,B,A is better). But the results look messy, so I didn’t.

Here is the output (trimmed a little, for the sake of brevity):-

run 1

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.58

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.73

run 2

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.28

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.10

run 3

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.28

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.10

run 4

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.28

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.10

run 5

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.29

  COUNT(*)                                                                      
----------                                                                      
        35                                                                      
1 row selected.
Elapsed: 00:00:00.09

At this point I might pull the results into Excel, as it is so good at drawing pretty graphs, but I think we can interpret the results “by eye” in this case.

Run    User Tabs   DBA_Tabs
1        0.58          0.73
2        0.28          0.10
3        0.28          0.10
4        0.28          0.10
5        0.29          0.09

AVG      0.28          0.10

My Average is ignoring the first run for each version of the code, as previously explained {parse overhead/warming caches}

You can see that version B is considerably faster than A. In this case, there is no overlap between the two result sets, by which I mean every run o version B is faster than every run of version A. Also, there is little variation in the performance of each version.

What if you got a result set like:-

Run    User Tabs   DBA_Tabs
1        0.77          0.68
2        0.35          0.25
3        0.42          0.34
4        0.48          0.38
5        0.48          0.30
6        0.41          0.29
7        0.40          0.28

AVG      0.42          0.31

If you look at the figures, both A and B slow down and then start to speed up. Something happened to slow down the system in the middle of the test and it was recovering but not back to “normal” by the end. If we had done 7 runs of A followed by 7 runs of B, B would have been hard hit by the temporary system slow-down {I opened a WORD document :-) }.

Also, there is some overlap in the results for A and B. The slowest result for B, 0.38, is slower than for the fastest result for A, 0.35. When the spread of times for test A and test B overlap like this, you need to do a few more tests to ensure the results are consistent. I did 7 but I should really have done 9 or maybe 15.
In any case, B is consistently 25% faster than A. It is Faster.

How about this result set:

Run    User Tabs   DBA_Tabs
1        0.87          0.71
2        0.41          0.42
3        0.47          0.43
4        0.37          0.42
5        0.39          0.34
6        0.51          0.38
7        0.42          0.40

AVG      0.43          0.40

The variation in the result sets is higher and if fact if you only looked at results for paired runs 2,3 and 4 you would come to the conclusion A was very slightly faster than B. In fact, B is slightly faster than A.

Or is it? There are only 6 tests {remember, ignore run 1}, the variation within the sets is high (.37 to .51 in A, .34 ot .42 in B) and the overall difference is low. You could run a statistical analysis on this, a “Student T” test I think, to see if the difference was significant. But unless I was looking at something utterly business critical at this point where the smallest fraction of improvement was important, I would not bother. I would be far better off looking for solution C or doing something else completely. If you ever find yourself doing scientific statistical analysis to decide which of two options is faster, it is probably time to go home and consider what else you could be doing to help your employer more…

Enough for tonight and HOPEFULLY the wordpress pixies will not eat the end of this post (yesterday’s effort was better than tonights I think…)

Testing Methodology – How To Test January 26, 2010

Posted by mwidlake in performance, Testing.
Tags: , ,
1 comment so far

<Previous Post…

On those rare but pleasant occasions when I find myself running a training course on performance, something I always want to press home is “How to Test”.

I believe you can learn everthing in every manual on oracle technology and internals and still be very poor at performance tuning. Similarly, I think you can be lacking an awful lot of knowledge and still be good at performance tuning. It comes down to how you test. Not just how to test, but how you as an individual design and run your tests.

I joke with whoever I am talking to that what you need most of all to test is a watch and someone calling out “start” and “Stop”. ie you need to be able to time things. It is a bit of a throw-away statement, but actually most of us will do exactly this on a regular basis. We will be working on something, run it and it will come back in a few seconds. Then we will tweak it and run it again and mentally {or in my case, just very quietly under my breath} count. We even discuss it with people like that “How long did it take that time?” Ohh, a couple of seconds faster than last time”.

I like tuning by timing very, very, very much.

Firstly, it is simple – If a SQL query runs faster, it runs faster. If you tune by looking at the explain plan and you see a full table scan being replace with a nested loop and an index look up, is it faster? It depends. If you tune by looking at the buffer gets and other statistics from the SGA (or wherever), if the “buffer gets” go down, is it faster? This depends again. If the disk reads went up, even by a realtively small amount, maybe not. If the memory usage went through the roof because of a sort that spills down to disc, well quite probably not. Obviously if all of buffer gets, disk reads and cpu usage went down, you can be pretty certain you are having a positive impact in speeding up the statement. But timing the statement from start to finish gives you a nice, simple answer on if it runs faster.

Secondly, it is simple. You do not have to look at plans, at runtime statistics, at deltas of session statistics, at trace files. All those things are good but you need more knowledge and experience to use them and time to set up, collect and study them. This is not to say you do not get a lot more out of tuning if you understand all of the stuff about trace files, explain plans etc, I hasten to add – but you do not need that to get going with performance tuning.

Thirdly, it is simple. You tell your manager’s manager that you use 47% less buffer gets, 12% less disk reads but 9% more CPU then they will ask you what all that means. You tell them that version A runs in 2 minutes and 9 seconds and version B in 34 seconds, they will understand. Better still, graph it and give them a power point…

Fourthly, it is simple. You can see that statement (a) is faster or slower than statement (b). Now you can look at the plan, at the statistics for the statement, at what wait events are occuring and start tying up book knowledge with real-world results.

Of course, timing with a watch is very crude. You have a very capable piece of computing power underlying that database of yours, so let’s get it to do the timing.

As a simple start, in SQL*Plus use “set timing on” (it can be abbreviated to “set timi on”). Most GUI SQL tools will either default to telling you the elapsed time or have an obvious option to switch on the functionality.

{Oh, I’ll just mention that if you cannot get timing in sql*plus or wherever to work you, might want to check what the initialisation parameter “TIMED_STATISTICS” is set to. Some ancient memory is telling me that if it is set to FALSE, you may not be able to use TIMING in SQL*Plus but another memory is telling me that stopped being true a while back, version 8 maybe. I tried setting TIMED_STATISTICS to false in my session on 10.2 but TIMING still worked, but then I left STATISTICS_LEVEL alone and that cause TIMED_STATISTICS to be set. It is so long ago that I worked on a system that had TIMED_STATISTICS set to false! Even a quick google did not throw up an immediate answer}.

DB10.2> select count(*) from sn
  2  /
any key> 

  COUNT(*)
----------
       116

1 row selected.

DB10.2> set timi on
DB10.2> select count(*) from sn
  2  /
any key> 

  COUNT(*)
----------
       116

1 row selected.

Elapsed: 00:00:01.09

There you go, the count of SN records, all 116, took 1.09 seconds.

Yes, I had “set pause on” and the timing includes how long it takes me to spot that the query has finished and press a key. We all do it. However, sometimes the need for user input it still missed {Something I see quite often is, for example, not pressing the button in PL/SQL developer to get ALL the rows for the SQL statement, rather than just the first screenful}. A key thing is to try and remove from testing all and any waiting for user response, as we humans are slow and irratic.

so SET PAUSE OFF.

Now you have TIMING on and pause off. Time to run something and see how long it takes, then try and speed it up and run again:

DB10.2> select count(*) from person where to_char(dob,'YYYY') = 1988
  2  /

  COUNT(*)
----------
        29
Elapsed: 00:00:01.15

DB10.2> create index per_dob on person(dob);

Index created.

Elapsed: 00:00:01.31
DB10.2> select count(*) from person where to_char(dob,'YYYY') = 1988;

  COUNT(*)
----------
        29
Elapsed: 00:00:00.14

There you go, I added an index and the query went down from 1.15 seconds to 0.14, that is 8 times faster. Timing is timing.
Well, no, and this is something you need to be careful of if you are new to tuning.

The second itteration is nearly always faster.

Why? Well, the first time you run a piece of SQL it has to be parsed for a start, and that takes some time. More importantly, the first time you select the data, it is going to be on disk. Oracle has read it in and put it into memory. The second time you query the same data, it will be found in memory. That {nearly always} makes the next access to the data a lot faster. The index was not being used as I had a function on the column in the WHERE clause and this stops the index from being used.

So having said I love testing by timing, you need to be cautious about one-off tests. Oh, and here below is proof that the index I created is making no real difference to the speed of the SQL query:

DB10.2> set autotrace on
DB10.2> select count(*) from person where to_char(dob,'YYYY') = 1988;

  COUNT(*)
----------
        29
Elapsed: 00:00:00.12

Execution Plan
-----------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost 
-----------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     1 |     9 |   335
|   1 |  SORT AGGREGATE    |        |     1 |     9 |      
|*  2 |   TABLE ACCESS FULL| PERSON |    36 |   324 |   335
-----------------------------------------------------------

Statistics
----------------------------------------------------------
         20  recursive calls
         19  db block gets
        727  consistent gets
          0  physical reads
       1992  redo size

DB10.2> drop index per_dob
  2  /

Index dropped.

Elapsed: 00:00:00.03
DB10.2> select count(*) from person where to_char(dob,'YYYY') = 1988
  2  /

  COUNT(*)
----------
        29
Elapsed: 00:00:00.15

Execution Plan
----------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost
----------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     1 |     9 |  335
|   1 |  SORT AGGREGATE    |        |     1 |     9 |      
|*  2 |   TABLE ACCESS FULL| PERSON |    36 |   324 |  335
----------------------------------------------------------


Statistics
----------------------------------------------------------
        261  recursive calls
         19  db block gets
        841  consistent gets
          0  physical reads
       1992  redo size

We see 0.12 seconds goes to 0.14 seconds, exactly the same explain plan for both statements, a small increase in consistent gets, no physical gets by either statement (so the data is cached).
Why is the second statement a little slower and has a large increase in recursive calls and db block gets? Because I dropped the index and the statement had to be reparsed. Now, if you are new to tuning you would almost certainly not have appreciated what the recursive calls and DB block gets were all about, it could distract you from the question of “does it run faster”. It is certainly good to know all about that, but when you are starting off, you want to keep things simple and learn in stages.

What the above demonstrates, I hope, is that the second thing you run will have an advantage and could probably run faster even though, in reality, it is using more resource. And we tend to run old code first and new code second. So swap them over, give the old code the advantage of being run second.

Do not test things once, as you can easily be caught out. Test each version several times. And ignore the first run of each version. This is not perfect advice, code in production may well be being parsed and gathering data from disk, but unless you can allow for this in your testing, I think it is generally better, for simple testing, to run each version 6 times. Of the six runs, ignore the first run of each and average the results of the other 5. Which ever one runs faster on average is, well, fastest. IF the difference is significant.

Oh. Where is the SQL AUDIT in all this? Well, ponder on why am I generating REDO for a simple select…

Testing Methodology – Getting Community Information January 22, 2010

Posted by mwidlake in Testing, Uncategorized.
Tags: ,
4 comments

<Last post I tried to highlight that knowing specifically what you want to understand before you start testing is important and that it is beneficial to start with the booring official documentation. This is so that you can filter what is Out There in the web, as well as get a good grounding in what is officially possible.

There is no doubt that you can get a lot more information on Oracle features from the community than you can from the manuals and that it is often presented in a more easily understood and relevant way. But some of it is not right for your version or is just plain wrong. I can do no better than to refer to this recent posting by Jonathan Lewis on the topic. In this case, it is the persistance of old and now irrelevant information, perpetuated by the kind but flawed habit people have of pulling information together – but not testing it.

Treat everything on the net with caution if it has no proof. Even then be cautious :-) – I know I have “proved” a couple of things that turned out to be misunderstandings…

You can of course post to forums to ask for information and help, and people will often come to your aid. But you are often not really understanding the subject by doing that, you are just learning facts. It’s like learning the dates of the two World Wars and not understanding why the wars occured. I would point you to this posting by Lisa Dobson from a couple of years back, which makes the point about what you stand to learn, and also says something about how to ask for help. If it helps you to do the “right thing”, think of it selfishly. You will get better and more specific help if you have already looked into your issue and can ask specific questions. Also, if you properly learn something rather than just get a couple of facts about it, you are less likely to look foolish when you repeat those facts in a context they are not valid in. And you will.

Sorry, side tracked a little into one of my pet annoyances. Back to topic.

I want to know about SQL AUDIT. So I google SQL AUDIT ORACLE. Lots of hits. Let’s start filtering. First off, anything by “experts-exchange” and the other pay-for forums can forget it. Sorry guys but if I am going to spend money I’ll buy a manual. I will also ignore any stuff by sites that constantly suggest you go on training courses on boats with them or are “Team America” {everyone seen the film by Trey Parker? Puts the claim “We are the American Team” into a different light…}.

Now I go and look at the sites and I try making my search more intelligent, maybe adding in the word DBA_AUDIT_TRAIL and the like. I try and think what words and phrases someone explaining the topic would use which would not be common for other topics. That is why I often use database object names and commands in my searches.

In the case of SQL AUDIT, there are plenty of sites that generally pull together the basics from the manuals and tell you how to audit a few things and see the results and give some nice examples. It gets you further forward, but it’s mostly not very detailed. Just cookery instructions on what to do but not how it works. Thankfully, there is an excellent article by one of the experts in the field, Pete Finnigan.
Maybe I could have replaced this post by simply suggesting you just go to Pete’s web site, read what is there and follow the link from it to the ones he likes. It would work for the topic of SQL AUDIT. However, although it is always good to go to the sites of people you know and trust, it is always worth doing a google and going beyond the first page or two of results. Those first pages are mostly for a handful of very popular sites and sources. A new article or someone who is relatively unknown but has the answers you need may be further down the list, like pages 3 or 4. It is worth 5 mins just checking.

However, you are not likely to find everything you need. Certainly with SQL AUDIT you won’t find much about performance impact other than bland and not-very-helpful generic warnings that “Use of SQL AUDIT can carry a significant performance overhead”. How much overhead? for what sort of audit? And how do you know? In fact, when searching I found this, admittedly old, comment by Pete about there being relatively little “out there” about performance impact of Auditing. That made me smile, as that information was exactly what I was after.

*sigh*. The information I want does not seem to be out there. I need to do some testing.

That is for the next post (“at LAST”, I hear you all cry).

Testing Methodolgy – The Groundwork January 20, 2010

Posted by mwidlake in Perceptions, Testing, Uncategorized.
Tags: , ,
add a comment

<Previous PostNext Post …>

I want to start learning about using SQL AUDIT, as I mentioned a couple of days ago.

First question. What do I want to know about SQL AUDIT? I might just have thought “well, I do not know much about this area and I think I should – so I want to know everything”. That is OK, but personally I find it helps to make up a specific aim. Otherwise you can just flounder about {well, I do}. In this case I have decided I want to know:

  1. The options for auditing who is selecting key tables.
  2. How to audit when those key tables get changed.
  3. The impact on performance of that audit, and if it could be an issue.
  4. (4) follows on from (3), in that I want to find out how to control that performance impact.

For any of you who have been or are code developers, you hopefully appreciate test-driven coding. That is, you decide up front what the new code must achieve and design tests to ensure the code does it. You do not write any code until you have at least thought of the tests and written them down in a document. {Ideally you have written the tests and built some test data before you start, but then in an ideal world you would get paid more, have more holidays and the newspapers would tell the truth rather than sensational rubbish, but there you go :-) }

I do not think that learning stuff/testing as much different from developing code, thus the list above. I now know what I want to understand.

What next? I’m going to go and check the Oracle Documentation for the feature. And I am very careful to check the documentation for the version I will use. This is 10.2 for me. I know, the Oracle documentation can be pretty dry, it can be rather unhelpful and it is not very good at examples. But you can expect it to be 90% accurate in what it tells you. You can also expect it to be not-very-forthcoming about the issues, gotchas and bits that don’t work. {I have this pet theory that the official documentation only mentions how things do not work once a feature has been out for a version and people have complained that the documentation should let on about the limitations}.

So, for SQL AUDIT I suggest you go and read:

  • Concepts Manual, chapter 20 Database Security. If I am not rushed I would read the whole chapter, I might realise that what I want to do is better done with some other tool (If I wanted to see who had changed records months down the line, I think I would pick up that database triggers were a better bet, for example).
  • SQL Reference, chapter 13 , the section on AUDIT (no surprises there). I do not do much more than read through the SQL manual once though, as frankly I find it pretty useless for explaining stuff, but it puts into your head what the parts of the command there are and pointers to other parts of the documentation. I’ll read the concepts manual with more attention. In this case, the manual will lead me to:
  • Database Security Guide chapter 8. Which is pretty good, actually.
  • My next source of information, may not immediately spring to mind but I find it very valuable, is to find out which data dictionary objects are involved in the feature. In this case, the previous sources would lead me to go to the Database Reference and check out:
  • DBA_AUDIT_TRAIL, DBA_OBJ_AUDIT_OPTS, DBA_PRIV_AUDIT_OPTS, DBA_STMT_AUDIT_OPTS. And of course, SYS.AUD$. {I actually queried DBA_OBJECTS for anything with the word “AUDIT” in it, check out all the tables and also had a quick peak at the text for any views, which would have led me to SYS.AUD$ if I did not already know about it}.

Why do I go and look at the data dictionary objects and the reference guide? After all, is it not nerdy enough to have put myself through reading the SQL Reference manual? {and let us be honest, it is rarely enjoyable reading the SQL Reference manual}. Because  I want to know how it works, not just how to use it. Seeing the table give me a lot of information and the description of the columns may well tell me a lot more. First thing, SYS.AUD$ only has one index, on a column SESSIONID# (I know, there is another column in the index, I need to get to a DB to check this). Any queries not via this index are going to scan the whole damned thing. Right off, I suspect this will be an issue.

I will DESC the tables, see if there is already any information in them. In this case, there was not. A clean sheet.

Why did I not just go to Google (Bing, Yahoo, whatever) and search? Because I can trust the manuals to tell me stuff which is mostly true and is valid for my release. Not stuff about older versions, about later versions, urban myths or just outright fallacies. The manuals certainly will not tell me everything, far from it, but what it does say is solid stuff. With this reliable start I can now go to other sources and have some chance of filtering  all the other stuff that is Out There. Once filtered, it is probably a lot more informative and easier to digest than the manuals.  

I’ll ramble on about that next posting.

Testing Methodology January 19, 2010

Posted by mwidlake in Testing.
Tags:
3 comments

This is my 101st posting and {almost} my start for 2010. I was going to kick off 2010 on something I knew well, so I could look all smart – specifically, handling stats for big databases. But I have decided to kick off on something I know very poorly. Security. Namely, use of SQL AUDIT. Why? Why blog about something I know little about?

Because I want to blog about the process of finding out about stuff and testing that stuff. It’s actually going to be a thread about testing – the SQL AUDIT is just an excuse {and matches my work life, which is jolly important when time is limited and you need to keep the employer happy}. I started out my whole blog (before almost anyone paid any attention to my meanderings) on what you could find out just from doing a “select count(*)” from a table. You can find out a lot from simple beginings.

How do you test something? Well, that question is actually a step or two down the line from “how do I understand something”. Sorry, I was trained as a scientist at college and you get sold this line as a trainee scientist about Baconian Science whereby you record all the data about something and then come to conclusions. It does not work really. You need an idea first and then you need to test that idea.

So, in the case of IT, you come across something (the initial idea) and you want to know how it works. It is usually a feature of the software new to you. With software (and the Oracle Database is really just a huge pile of software) there is a source of starter information and it is called the “documentation”. If I was a Baconian scientist I would just use the feature and record everything I could think of about it and record what I discovered. But this is software and some human (or humans) designed it and implemented it. And documented it.

So, First thing, don’t google it. Go and read the official manual on it. It is not going to tell you everything (it is, after all, written by people desperate to put it in the best light possible – and yet leave space of consultancy and training dollars to be earned) and remember that the manual can be wrong, but it is likely to be more accurate than 80% of what you come across on web searches. {If I was a tabloid newspaper I would go and ask a bunch of people who knew little about the subject what they thought and then publish an article on it, emphasising any shocking or outrahgeous parts. Remember that when you Google or Bing or Yahoo anything, Anything. }.

Having read (and I would suggest you start with) the Concepts Manual on the feature and then any specific manual covering the feature, now do your web search and…
(a) look for sources you have grown to trust. Not popular sources necessarily, sources where you have rarely found them wrong or they publish examples and worked proofs of their utterences.
(b) look for repeated claims about that feature and keep them in mind. They could be urban myths, they could be accurate. Just keep them in mind.

Now go and do some tests. Yourself. Otherwise, you just never know…

And this is the nub, I think, of testing. Get some basic information about how it should work, throw in there your experience to date (eg, this AUDIT stuff is putting data into an oracle table, so it is inserting and updating rows in a table, it should react like data going into a normal table. What do I know already about insert and updates to tables?) and ask yourself a few simple questions:-
- How do I think this should work?
- How could this go wrong?
- What scenarios can I think of for using this?
- What are the simplest examples I can think of for this process?
- How can I reduce tests down to trying only this feature or even one aspect of this feature?
- How could this go wrong?
Yep, I ask that last question twice and it is no oversight. I want to try and understand where this new feature might break. I need to ask the “ahh but” questions.

More on COMMAND_TYPE values. January 8, 2010

Posted by mwidlake in internals.
Tags: ,
3 comments

This is just a quick update on finding the values for COMMAND_TYPE as is held in V$SQL and V$SQLAREA. The original post is here. Quick sumary – table SYS.AUDIT_ACTIONS as an almost-complete list of COMMAND_TYPES.

In one of those little concidences of fate, I have been looking at use of SQL AUDIT for the last few days. I am not very knowledgeble about AUDIT or oracle security in general, so I do what I normally do when I don’t know much about an area. I look on the web, scan the manual and then go poking around in the data dictionary. I came across two potentially interesting tables, SYS.AUDIT_ACTIONS and SYS.STMT_AUDIT_OPTION_MAP. At first glance they look to hold similar information, things that can be audited. This seems to be what SYS.STMT_AUDIT_OPTION_MAP is:-

select * from SYS.STMT_AUDIT_OPTION_MAP
order by option#
   OPTION# NAME                                       PROPERTY
---------- ---------------------------------------- ----------
         3 ALTER SYSTEM                                      0
         4 SYSTEM AUDIT                                      0
         5 CREATE SESSION                                    0
         6 ALTER SESSION                                     0
         7 RESTRICTED SESSION                                0
         8 TABLE                                             0
         9 CLUSTER                                           0
        10 CREATE TABLESPACE                                 0
        11 ALTER TABLESPACE                                  0
        12 MANAGE TABLESPACE                                 0
        13 DROP TABLESPACE                                   0
        14 TABLESPACE                                        0
        15 UNLIMITED TABLESPACE                              0
-- lets look at dba_audit_trail that underlys many of the audit views.
@vw_txt
Enter value for vw_name: dba_audit_trail
Any Key...>

OWNER      VIEW_NAME                      TEXT_LENGTH
---------- ------------------------------ -----------
TEXT
------------------------------------------------------------------
SYS        DBA_AUDIT_TRAIL                       3424
select spare1           /* OS_USERNAME */,
       userid           /* USERNAME */,
       userhost         /* USERHOST */,
       terminal         /* TERMINAL */,
       cast (           /* TIMESTAMP */
           (from_tz(ntimestamp#,'00:00') at local) as date),
       obj$creator      /* OWNER */,
       obj$name         /* OBJECT_NAME */,
       aud.action#      /* ACTION */,
       act.name         /* ACTION_NAME */,
....
from sys.aud$ aud, system_privilege_map spm, system_privilege_map spx,
     STMT_AUDIT_OPTION_MAP aom, audit_actions act
where   aud.action#     = act.action    (+)
  and - aud.logoff$dead = spm.privilege (+)
  and   aud.logoff$dead = aom.option#   (+)
  and - aud.priv$used   = spx.privilege (+)

What about AUDIT_ACTIONS? Well, I looked at the table and thought “this looks very similar to the list of values for COMMAND_TYPE – I should check that out”.

I don’t have to. The very next day Christian Antognini posted a quick comment on the posting about COMMAND_TYPE telling me where one could find a list of these commands in the DB – Yep, AUDIT_ACTIONS.

Christian’s posting warned me that a couple of commands are missing (our old favorite from the original post MERGE being one and CREATE_DISKGROUP being the other he cites) but it includes several more that are not listed in the manual. If anyone is aware of COMMAND_TYPES not included in AUDIT_ACTIONS then let me know and I can maintain a list here.

{Thanks to Chris for providing many more missing actions and two new to 11.2, included below}

New ones to those liste in the manual are:-
128 – FLASHBACK
129 – CREATE_SESSION
197 – PURGE USER_RECYCLEBIN
198 – PURGE DBA_RECYCLEBIN
199 – PURGE TABLESAPCE
200 – PURGE TABLE
201 – PURGE INDEX
202 – UNDROP OBJECT
204 – FLASHBACK DATABASE
205 – FLASHBACK TABLE
206 – CREATE RESTORE POINT
207 – DROP RESTORE POINT
208 – PROXY AUTHENTICATION ONLY
209 – DECLARE REWRITE EQUIVALENCE
210 – ALTER REWRITE EQUIVALENCE
211 – DROP REWRITE EQUIVALENCE
and new in the table for 11.2
225 – ALTER DATABASE LINK
305 – ALTER PUBLIC DATABASE LINK

Missing codes are:-
88 – ALTER VIEW
90 – SET CONSTRAINTS
187 – CREATE SPFILE
188 – CREATE PFILE
189 – MERGE
192 – ALTER SYNONYM
193 – ALTER DISKGROUP
194 – CREATE DISKGROUP
195 – DROP DISKGROUP
XXX – FLASHBACK

Here is the list form AUDIT_ACTIONS from 10.2.0.3 (an eyeball check with 11.1.0.1 indicates the list has not increased in that version at least, which surprised me):

select * from audit_actions order by action

    ACTION NAME
---------- ----------------------------
         0 UNKNOWN
         1 CREATE TABLE
         2 INSERT
         3 SELECT
         4 CREATE CLUSTER
         5 ALTER CLUSTER
         6 UPDATE
         7 DELETE
         8 DROP CLUSTER
         9 CREATE INDEX
        10 DROP INDEX
        11 ALTER INDEX
        12 DROP TABLE
        13 CREATE SEQUENCE
        14 ALTER SEQUENCE
        15 ALTER TABLE
        16 DROP SEQUENCE
        17 GRANT OBJECT
        18 REVOKE OBJECT
        19 CREATE SYNONYM
        20 DROP SYNONYM
        21 CREATE VIEW
        22 DROP VIEW
        23 VALIDATE INDEX
        24 CREATE PROCEDURE
        25 ALTER PROCEDURE
        26 LOCK
        27 NO-OP
        28 RENAME
        29 COMMENT
        30 AUDIT OBJECT
        31 NOAUDIT OBJECT
        32 CREATE DATABASE LINK
        33 DROP DATABASE LINK
        34 CREATE DATABASE
        35 ALTER DATABASE
        36 CREATE ROLLBACK SEG
        37 ALTER ROLLBACK SEG
        38 DROP ROLLBACK SEG
        39 CREATE TABLESPACE
        40 ALTER TABLESPACE
        41 DROP TABLESPACE
        42 ALTER SESSION
        43 ALTER USER
        44 COMMIT
        45 ROLLBACK
        46 SAVEPOINT
        47 PL/SQL EXECUTE
        48 SET TRANSACTION
        49 ALTER SYSTEM
        50 EXPLAIN
        51 CREATE USER
        52 CREATE ROLE
        53 DROP USER
        54 DROP ROLE
        55 SET ROLE
        56 CREATE SCHEMA
        57 CREATE CONTROL FILE
        59 CREATE TRIGGER
        60 ALTER TRIGGER
        61 DROP TRIGGER
        62 ANALYZE TABLE
        63 ANALYZE INDEX
        64 ANALYZE CLUSTER
        65 CREATE PROFILE
        66 DROP PROFILE
        67 ALTER PROFILE
        68 DROP PROCEDURE
        70 ALTER RESOURCE COST
        71 CREATE MATERIALIZED VIEW LOG
        72 ALTER MATERIALIZED VIEW LOG
        73 DROP MATERIALIZED VIEW LOG
        74 CREATE MATERIALIZED VIEW
        75 ALTER MATERIALIZED VIEW
        76 DROP MATERIALIZED VIEW
        77 CREATE TYPE
        78 DROP TYPE
        79 ALTER ROLE
        80 ALTER TYPE
        81 CREATE TYPE BODY
        82 ALTER TYPE BODY
        83 DROP TYPE BODY
        84 DROP LIBRARY
        85 TRUNCATE TABLE
        86 TRUNCATE CLUSTER
        91 CREATE FUNCTION
        92 ALTER FUNCTION
        93 DROP FUNCTION
        94 CREATE PACKAGE
        95 ALTER PACKAGE
        96 DROP PACKAGE
        97 CREATE PACKAGE BODY
        98 ALTER PACKAGE BODY
        99 DROP PACKAGE BODY
       100 LOGON
       101 LOGOFF
       102 LOGOFF BY CLEANUP
       103 SESSION REC
       104 SYSTEM AUDIT
       105 SYSTEM NOAUDIT
       106 AUDIT DEFAULT
       107 NOAUDIT DEFAULT
       108 SYSTEM GRANT
       109 SYSTEM REVOKE
       110 CREATE PUBLIC SYNONYM
       111 DROP PUBLIC SYNONYM
       112 CREATE PUBLIC DATABASE LINK
       113 DROP PUBLIC DATABASE LINK
       114 GRANT ROLE
       115 REVOKE ROLE
       116 EXECUTE PROCEDURE
       117 USER COMMENT
       118 ENABLE TRIGGER
       119 DISABLE TRIGGER
       120 ENABLE ALL TRIGGERS
       121 DISABLE ALL TRIGGERS
       122 NETWORK ERROR
       123 EXECUTE TYPE
       128 FLASHBACK
       129 CREATE SESSION
       157 CREATE DIRECTORY
       158 DROP DIRECTORY
       159 CREATE LIBRARY
       160 CREATE JAVA
       161 ALTER JAVA
       162 DROP JAVA
       163 CREATE OPERATOR
       164 CREATE INDEXTYPE
       165 DROP INDEXTYPE
       167 DROP OPERATOR
       168 ASSOCIATE STATISTICS
       169 DISASSOCIATE STATISTICS
       170 CALL METHOD
       171 CREATE SUMMARY
       172 ALTER SUMMARY
       173 DROP SUMMARY
       174 CREATE DIMENSION
       175 ALTER DIMENSION
       176 DROP DIMENSION
       177 CREATE CONTEXT
       178 DROP CONTEXT
       179 ALTER OUTLINE
       180 CREATE OUTLINE
       181 DROP OUTLINE
       182 UPDATE INDEXES
       183 ALTER OPERATOR
       197 PURGE USER_RECYCLEBIN
       198 PURGE DBA_RECYCLEBIN
       199 PURGE TABLESAPCE
       200 PURGE TABLE
       201 PURGE INDEX
       202 UNDROP OBJECT
       204 FLASHBACK DATABASE
       205 FLASHBACK TABLE
       206 CREATE RESTORE POINT
       207 DROP RESTORE POINT
       208 PROXY AUTHENTICATION ONLY
       209 DECLARE REWRITE EQUIVALENCE
       210 ALTER REWRITE EQUIVALENCE
       211 DROP REWRITE EQUIVALENCE

New Year, same old rambling thoughts January 5, 2010

Posted by mwidlake in Blogging, Perceptions.
Tags: ,
3 comments

It’s not Friday but, heck, it’s a New Year, there are many of us who might appreciate a non-techie, pointless ramble at the start of the first full working week of a new decade…A Friday Philospohy for the New Year. {If anyone wants to point out the New Decade starts on 1st Jan 2011, go take a running jump – popular opinion is against you, even if logic is for you}.

I found the UKOUG techie conference this year particularly interesting as it was the first major meeting I have been to since I started blogging, and I came across two main opinions about my attempts:

Those who like my blog as it is “chatty” and rambles a bit.
Those who dislike it – because it is “chatty” and rambles a bit…
{oh, and the third opinion, the most common, of utter ignorance of my blog – there goes the ego}.

Well, you can’t please everyone. I was a little saddened, however, as I spoke to a couple of people I really admire in the Oracle Knowledge world and they landed on the “chatty and rambling – bad” side of things. Damn. But they are so good at what they do, I forgive them. The swines.

But then I remembered what I said to a fellow blogger the other month. We bloggers/twitterers all put forward what we blog about in our own style. We might not blog something that is new, we might blog something that is “well known”, but we put it in our own style. Some like it, some do not. It matters not, so long as it adds to the sum of decent knowledge out there.
Some will hate our style and not read, some will read and enjoy. So long as the information gets out there to more people, that is fine.

So, do I think everything I blog is decent knowledge? Oh, I wish. I like to think it is mostly there {and I wish it was all correct} but I am realistic. I test most of what I blog, or I have lived for real most of what I blog, but I will make mistakes. And sometimes I will hit the edge of something good and I put it up there in the hope others will contribute {like the recent one one translating min-max column values into human readable stuff}. And often people do contribute and that is really, really good.

But I do and will continue to make mistakes, be daft, or just put things poorly. I have learned a fair bit in the last 8 months about written communication, the art of communicating to a global audience and also about how not to spread a topic over several weeks as you hope you can “just finish of those half-written blogs in an hour or two” and find it takes weeks. If anyone wants to give me any constructive criticism, please do, but maybe use my email (mwidlake@btinternet.com) rather than flame my postings.

And my last rambling thought for the start of 2010? I am probably going to post less in the next 6 months. I am always sad when the blog by someone I enjoy goes very quiet, but then we all have real jobs to do, so I try to be patient. In my own case, I have noticed I now read a lot less of other people’s blogs as writing my own takes so long. And I am missing too much. There are blogs I really admire or I have discovered in the last 6 months (sometimes both) that I simply fail to really read and they know stuff. So I need to read them. I am going to try and maintain a steady 2-3 blog entries a week, but for the next 6 months I am going to concentrate on learning. Something blogging has taught me is I am really quite ignorant :-)

Good wishes for 2010 to all and everyone who stumbles across my ramblings.

Decoding high_value and low_value January 3, 2010

Posted by mwidlake in internals, performance.
Tags: , ,
13 comments

There is an update to this post here>

The table DBA_TAB_COLUMNS holds the LOW_VALUE and HIGH_VALUE for columns. This information is potentially very useful to us, especially as when the CBO looks at a WHERE clause on a column for a given value, the High and Low values for that column are considered. If the value in the WHERE clause is outside the known range, the expected number of values to be found is decreased in proportion to the distance outside the range.

What is not so helpful is that Oracle stores, and displays, the information in an internal raw format. Which is utterly unhelpful to us of course. Why can’t they at least expose the information in a view in a human-readable format? Tsch.

So here is some code on how to decode low_value and high_value. I pulled most of this together a few years back but always struggled to decode the low and high values for dates, until I found this post by Gary Myers.

-- col_stats
-- Martin Widlake mdw 21/03/2003
-- MDW 11/12/09 enhanced to include more translations of low_value/high_value
-- pilfered from Gary Myers blog
col owner        form a6 word wrap
col table_name   form a15 word wrap
col column_name  form a22 word wrap
col data_type    form a12
col M            form a1
col num_vals     form 99999,999
col dnsty        form 0.9999
col num_nulls    form 99999,999
col low_v        form a18
col hi_v         form a18
col data_type    form a10
set lines 110
break on owner nodup on table_name nodup
spool col_stats.lst
select --owner
--      ,table_name
      column_name
      ,data_type
      ,decode (nullable,'N','Y','N')  M
      ,num_distinct num_vals
      ,num_nulls
      ,density dnsty
,decode(data_type
  ,'NUMBER'       ,to_char(utl_raw.cast_to_number(low_value))
  ,'VARCHAR2'     ,to_char(utl_raw.cast_to_varchar2(low_value))
  ,'NVARCHAR2'    ,to_char(utl_raw.cast_to_nvarchar2(low_value))
  ,'BINARY_DOUBLE',to_char(utl_raw.cast_to_binary_double(low_value))
  ,'BINARY_FLOAT' ,to_char(utl_raw.cast_to_binary_float(low_value))
  ,'DATE',to_char(1780+to_number(substr(low_value,1,2),'XX')
         +to_number(substr(low_value,3,2),'XX'))||'-'
       ||to_number(substr(low_value,5,2),'XX')||'-'
       ||to_number(substr(low_value,7,2),'XX')||' '
       ||(to_number(substr(low_value,9,2),'XX')-1)||':'
       ||(to_number(substr(low_value,11,2),'XX')-1)||':'
       ||(to_number(substr(low_value,13,2),'XX')-1)
,  low_value
       ) low_v
,decode(data_type
  ,'NUMBER'       ,to_char(utl_raw.cast_to_number(high_value))
  ,'VARCHAR2'     ,to_char(utl_raw.cast_to_varchar2(high_value))
  ,'NVARCHAR2'    ,to_char(utl_raw.cast_to_nvarchar2(high_value))
  ,'BINARY_DOUBLE',to_char(utl_raw.cast_to_binary_double(high_value))
  ,'BINARY_FLOAT' ,to_char(utl_raw.cast_to_binary_float(high_value))
  ,'DATE',to_char(1780+to_number(substr(high_value,1,2),'XX')
         +to_number(substr(high_value,3,2),'XX'))||'-'
       ||to_number(substr(high_value,5,2),'XX')||'-'
       ||to_number(substr(high_value,7,2),'XX')||' '
       ||(to_number(substr(high_value,9,2),'XX')-1)||':'
       ||(to_number(substr(high_value,11,2),'XX')-1)||':'
       ||(to_number(substr(high_value,13,2),'XX')-1)
,  high_value
       ) hi_v
from dba_tab_columns
where owner      like upper('&tab_own')
and   table_name like upper(nvl('&tab_name','WHOOPS')||'%')
ORDER BY owner,table_name,COLUMN_ID
/
clear colu
spool off
clear breaks

Most of the translation is done via the utl_raw.cast_to_datatype functions but no such function is provided for dates, which is where most of us come unstuck. Gary recognised that the data was being stored in hex format, with an offset for the year.

I hope the script helps some of you.
{There are also functions under DBMS_STATS, DBMS_STATS.CONVERT_RAW_VALUES, that can also be called to translate many datatypes}

If anyone knows how to decode timestamps, I’d love to know as it would save me trying to work it out/find it on the Net somewhere. I’ll add it to the post so that there is one place to find all translatations.

Here is a quick output for a test table

@col_stats
Enter value for tab_own: dwperf
old  40: where owner      like upper('&tab_own')
new  40: where owner      like upper('dwperf')
Enter value for tab_name: ad_sql_exec_p
old  41: and   table_name like upper(nvl('&tab_name','WHOOPS')||'%')
new  41: and   table_name like upper(nvl('ad_sql_exec_p','WHOOPS')||'%')
any key> 

COLUMN_NAME            DATA_TYPE  M   NUM_VALS  NUM_NULLS   DNSTY LOW_V              HI_V
---------------------- ---------- - ---------- ---------- ------- ------------------ ---------------
INST_ID                NUMBER     Y          4          0  0.2500 1                  4
SQL_ID                 VARCHAR2   Y     87,104          0  0.0000 008d71chrks14      gzw309ttkx862
PLAN_HASH_VALUE        NUMBER     Y      1,884          0  0.0005 2646229            4294043874
SPC_ID                 NUMBER     N         83          0  0.0120 1076               2269
ADDRESS                RAW        N    118,198          0  0.0000 00000003E33123A0   0000001342FF3FF8
HASH_VALUE             NUMBER     N     87,104          0  0.0000 2758710            4294676643
CREATED_DATE           DATE       N        120          0  0.0083 2009-10-23 8:19:10 2009-12-10 9:19:13
LATEST_FIRST_LOAD_TIME DATE       N     11,791          0  0.0001 2009-9-14 11:55:59 2009-12-13 9:33:24
TOTAL_LOADS            NUMBER     N         98          0  0.0102 1                  55047
TOTAL_PARSE_CALLS      NUMBER     N         92          0  0.0109 0                  2972
TOTAL_EXECUTIONS       NUMBER     N        235          0  0.0043 0                  834624
TOTAL_ROWS_PROCESSED   NUMBER     N        809          0  0.0012 0                  26946123
TOTAL_FETCHES          NUMBER     N        313          0  0.0032 0                  834624
TOTAL_BUFFER_GETS      NUMBER     N      3,016          0  0.0003 0                  3355576809
TOTAL_DISK_READS       NUMBER     N        985          0  0.0010 0                  28189240
TOTAL_DIRECT_WRITES    NUMBER     N         98          0  0.0102 0                  751289
TOTAL_SORTS            NUMBER     N        106          0  0.0094 0                  5283
TOTAL_CPU_TIME         NUMBER     N     94,401          0  0.0000 1337               12183936207
TOTAL_ELAPSED_TIME     NUMBER     N    115,203          0  0.0000 1337               139692482086
TOTAL_OPTIMIZER_COST   NUMBER     N      1,467          0  0.0007 0                  369740902209315000
                                                                                     0000000
Follow

Get every new post delivered to your Inbox.

Join 156 other followers