jump to navigation

Friday Philosophy – Flippin’ Technology June 5, 2015

Posted by mwidlake in Friday Philosophy, off-topic, Perceptions, rant.
Tags: , ,
5 comments

Sometimes I think I would have been a Luddite or a member of other groups who have trashed new technology in frustration. Some days, I can just scream at it. You would think having worked in Information Technology for so long would make me more of a fan, but it actually makes me worse – as I know there is no need for there to be so much wrong with the electronic dross we all have to deal with day-to-day. And if I, someone who has used (heck, even programmed) computers for 3 decades, have trouble with these damned things, how frustrating must “normal” people find it?

Tesco Challenge - original on RevK Rant's blog

Tesco Challenge – original on RevK Rant’s blog

Take this morning’s source-for-a-rant. Self checkouts in supermarkets. I had popped into Tesco to get the weekend’s rations of baked beans, wine and cat food and there were large queues for all manned (though, I should more correctly say, mostly womanned) tills. And almost no queue for the self-service ones. We all know why almost no queue for the self-service ones, most of us hate them. But I had to get back home for a UKOUG conference call and there was very little chance the three-people-deep queues would be negotiated in time, so I manned up and went to one of the vacant screens.

Have I mentioned I’ve been using computers since before Wayne Rooney, Keira Knightley or Michael Phelps were born? So I have some affinity and experience to navigating screens of information. But, like all of theses devices, using them is painful. Given they are designed to be used by everyone including idiots, why is the “UX” so low? Why does the important information appear at different spots in the screen at different times? Why does there seem to be no button to press for a simple, key-word triggered guide (“How to weigh Veg?” Oh, look up there, press the correct icon, look down there and press another and then finally click over here to say “yes I really did ask you to weigh some bananas” – that would be nice). Why does the Waitrose one make me swipe my card to pull up my scanned items but insist I shove the card up the slot to pay? Plus all the times you have to get some human to come over and confirm you are over 18 (I need to be 18 to buy expanding foam?!?) or don’t look suicidal. I’m not being funny but the age check is just not needed, if I am using a credit card it can know I am 18 or over (you cannot you have a UK credit card under 18, partly as you are not allowed to sign up for a credit agreement below that age and if I am using someone else’s credit card, me buying a bottle of wine is the least of the potential issues). To give them their due, at least our local Tesco (unlike many other stores I have used around here) have someone on hand to constantly swipe, press, tap and harass the machines into playing correctly.

I can’t believe how badly these self service checkouts work. I can’t believe the companies have not tested them extensively with real people (I know, they claim to, but then I’ve seen “user system testing” in big banks and it is shockingly poor). How can they think such terrible systems are good for business? That people still insist on queuing for checkouts with real people rather than use them must tell the companies something! Why can’t these systems be better designed. Why are they so painful to use? It can’t be me. Maybe it is me….

Next rant. My internet & email supplier. Hi BT. BT, you are crap at running your internet & email service. I’ll only mention in passing the regular episodes of slow internet, the times mail seems to struggle to keep moving, the harassing emails to buy more allowance as I am reaching my limit…for the prior month (“HI BT, I think I might pop back in time and download 15 films last March, can I have more allowance for than as I’d used over half my limit”) – but I am going to complain over the fact that for the last 5 weeks now, each weekend you stop accepting valid connection requests from my Laptop – but allow them from my iPhone. So my account/pwd is working fine. But no, from the PC you tell me I have to validate my account. So I log on to the web site and as soon as I put in my details you tell me I have to change my password as there has been too many failed attempts at access my account? Well, if they failed, I chose a decent password maybe? Trying to force me to change it is likely to make me change it to something simpler maybe? Especially as this is the fourth time this month… but then, usually the system fails to actually process my password change and just hangs. My iPhone still keeps working with the old details and, usually by the next day, the errors have gone and I can access my mail with my old details with a real machine again too. My conclusion has been that it’s their software screwing up. Each. And. Every. Weekend.

It’s got worse, it now fouls up some workdays too. And I made an interesting discovery. When I log in to the web site to validate myself, if I put in a password of “scr3wy0uBT” – it accepts it and puts me into the Change Your Password screen exactly the same as before. No, my password is not “scr3wy0uBT” {it rejected it when I tried…}. So top security there. Whatever is happening, it’s just…..pants {UK phrase, it means “utterly rubbish”. Pants are not trousers, they are undergarments, you strange Americans.}

What is BT doing wrong to have this problem keep happening? Is this a good “UX” experience for me and all the other people who seem to have similar issues? Is it so hard to sort this out?

What was the third rant? Oh yes. Windows 8. Too many people before me have vented spleen and given pieces of their mind on Windows 8 for me to be able to add any more to the pile, but what I cannot fathom is, as soon as they got the almighty ass-kicking that they did for ballsing up the Start Button/Menu, why did they not in the next version just put it straight back as it was?. Or put out and advertise a simple “patch” to put back what millions of people were screaming they wanted back? All I can think is someone’s ego was too large to wave their hands in the air and say “Oh boy, did we make a dog’s dinner of that – let us help improve your “UX” and our reputation by fixing that straight away”.

Final rant. Games. Computer Games. I like running around shooting things. It gets rids of some of the IT-In-Daily-Life anger. But I am not very good at it and my broadband connection is slow and a bit laggy, so I am not interested in running around shooting things with friends. The same broadband issues mean I also don’t want to spend 4 hours downloading a game, I want to buy it in a floppy disc…..I mean CD…. Errr, DVD… and play it. So I went to this place called a “shop” and I bought a game on media in a box and checked the packaging. No where did it state I have to have an internet connection. I get the DVD out (hmm, there are actually three), put it in the machine and 30 mins later the software is loaded. And now it FORCES me to register with some crapola online gaming site to register my copy (like, if I had a bootleg version that would not be the first bit they strip out) and that takes an hour to download it’s own shitty software. That done, it will let me fire up the game – that immediately bombs out to download the latest patch (which I think it the crapola online site’s version) and that takes two hours as it is obviously much, much more than a patch. I suspect it is the whole damned game again. This is not a “UX” I wanted and, you can bet, next time I buy a game, crapola online gaming company is one thing I will be looking to avoid. It does not help that said game won’t fire up without logging into said game site or making me watch a minute of adverts about who wrote the game on who’s graphics card using what game engine. Thankfully a few minutes on the net explained how I could avoid all of that. But why do I have to take steps to stop these companies annoying me and, this is the bit that confuses me, what makes these companies think I’ll be impressed by being repeatedly exposed to their adds that I don’t want to see? I’ll just despise them a little bit more each time.

I just don’t get it. The number one thing any IT system needs to achieve is User Acceptance (as I have said before, if you check the link). Why do so many large companies miss this and inflict on the world a seriously sub-standard experience of IT and technology? If someone like me who has driven a screen, a keyboard and a mouse for 3 decades, understands some of the limits to IT and must have at least some brains in his skull, if I get endlessly caught out, befuddled and simply screaming-out-loud-frustrated by crap IT, how is my poor old mum (and everyone’s poor old parents) supposed to cope?

I’m going to become a Lumberjack. Chainsaws do not have screens and keyboards.

No I.T. Hassles Here

No I.T. Hassles Here

With Modern Storage the Oracle Buffer Cache is Not So Important. May 27, 2015

Posted by mwidlake in Architecture, Hardware, performance.
Tags: , , , ,
11 comments

With Oracle’s move towards engineered systems we all know that “more” is being done down at the storage layer and modern storage arrays have hundreds of spindles and massive caches. Does it really matter if data is kept in the Database Buffer Cache anymore?

Yes. Yes it does.

Time for a cool beer

Time for a cool beer

With much larger data sets and the still-real issue of less disk spindles per GB of data, the Oracle database buffer cache is not so important as it was. It is even more important.

I could give you some figures but let’s put this in a context most of us can easily understand.

You are sitting in the living room and you want a beer. You are the oracle database, the beer is the block you want. Going to the fridge in the kitchen to get your beer is like you going to the Buffer Cache to get your block.

It takes 5 seconds to get to the fridge, 2 seconds to pop it open with the always-to-hand bottle opener and 5 seconds to get back to your chair. 12 seconds in total. Ahhhhh, beer!!!!

But – what if there is no beer in the fridge? The block is not in the cache. So now you have to get your car keys, open the garage, get the car out and drive to the shop to get your beer. And then come back, pop the beer in the fridge for half an hour and now you can drink it. That is like going to storage to get your block. It is that much slower.

It is only that much slower if you live 6 hours drive from your beer shop. Think taking the scenic route from New York to Washington DC.

The difference in speed really is that large. If your data happens to be in the memory cache in the storage array, that’s like the beer already being in a fridge – in that shop 6 hours away. Your storage is SSD-based? OK, you’ve moved house to Philadelphia, 2 hours closer.

Let's go get beer from the shop

Let’s go get beer from the shop

To back this up, some rough (and I mean really rough) figures. Access time to memory is measured in Microseconds (“us” – millionths of a second) to hundreds of Nanoseconds (“ns” – billionths of a second). Somewhere around 500ns seems to be an acceptable figure. Access to disc storage is more like Milliseconds (“ms” – thousandths of a second). Go check an AWR report or statspack or OEM or whatever you use, you will see that db file scattered reads are anywhere from low teens to say 2 or 3 ms, depending on what your storage and network is. For most sites, that speed has hardly altered in years as, though hard discs get bigger, they have not got much faster – and often you end up with fewer spindles holding your data as you get allocated space not spindles from storage (and the total sustainable speed of hard disc storage is limited to the total speed of all the spindles involved). Oh, the storage guys tell you that your data is spread over all those spindles? So is the data for every system then, you have maximum contention.

However, memory speed has increased over that time, and so has CPU speed (though CPU speed has really stopped improving now, it is more down to More CPUs).

Even allowing for latching and pinning and messing around, accessing a block in memory is going to be at the very least 1,000 times faster than going to disc, maybe 10,000 times. Sticking to a conservative 2,000 times faster for memory than disc , that 12 seconds trip to the fridge equates to 24,000 seconds driving. That’s 6.66 hours.

This is why you want to avoid physical IO in your database if you possibly can. You want to maximise the use of the database buffer cache as much as you can, even with all the new Exadata-like tricks. If you can’t keep all your working data in memory, in the database buffer cache (or in-memory or use the results cache) then you will have to do that achingly slow physical IO and then the intelligence-at-the-hardware comes into it’s own, true Data Warehouse territory.

So the take-home message is – avoid physical IO, design your database and apps to keep as much as you can in the database buffer cache. That way your beer is always to hand.

Cheers.

Update. Kevin Fries commented to mention this wonderful little latency table. Thanks Kevin.

“Here’s something I’ve used before in a presentation. It’s from Brendan Gregg’s book – Systems Performance: Enterprise and the Cloud”

Extra session at OUG Ireland – Oracle Lego. March 12, 2015

Posted by mwidlake in database design, development, Presenting.
Tags: , , ,
add a comment

I’m now doing a second session at OUG Ireland 2015. {This is because one of the accepted speakers had to drop out – it sometimes happens that, despite your best intentions, you can’t make the conference and it is better to let them know as soon as you can, as they did}. This will be a talk called “Oracle Lego” and it is one I put together a couple of years ago when I decided to try and do more introductory talks – talks aimed at those who are not {yet} experts and who I think tend to get ignored by most conference and user group agenda. So it is aimed at those new to oracle or experts in other areas who have never really touched on the subject.

“Oracle Lego” is about the basics of database design. I have a personal soap box I occasionally get on that very little real database design seems to occur these days. There are exceptions, but often the database design seems to be a quick brain-dump of what the developers or business analysts know they need to store information about and the first-cut set of tables gets created – and then endlessly modified as the development rolls on.

Guess what, we don’t build houses, cars, ships, bridges or garden sheds like that – and there is a reason. If you build things piecemeal as you go along and with bits you either have to hand or have to quickly get, you end up with a pretty poor shed. In fact you don’t end up with a shed, you end up with a shack. With a leaking roof and a door that hits the potting table when you open it. I don’t want a shack and I never, ever want to go over a bridge or sail in a ship built “on the hoof” like that!

Further, just as with a proper architectural or engineering design, a database design does not fix the solution in stone, there is still scope for modification. A bespoke house plan gets tweaked and modified as you do the build and you realise what can be improved when you see it – but you do not suddenly decide to dig out a basement and change from wood walls to stone when you have already constructed the ground floor! I’ve seen database “designs” like this.

There is also more to doing a database design than coming up with tables that hold the records we want to store: We might want to consider if storing similar things in the same table could be better than a table for each “type” of something; How we index those tables and relate them together can have a huge impact on how easy it is to get the data out and store it in the first place; The expected volume and life cycle of the data may require us to consider eg archiving; The very-much-ignored aspect of physical placement of data and clustering of data.

You can spend weeks dedicated to learning about database design – but you can also learn a lot in 60 minutes, ie the basics. And it really is like Lego – once you know the basics you can build up a really complex design. And you learn stuff doing it (and turning it into a real system), just like you do the first time you build a Lego robot (or dog or house or car or bridge or spaceship or whatever). So the second time you build your Lego robot you use the same design basics and layer on top what you learnt last time.

So that is the aim of this talk, the basics of database design.

The strange thing is, last time I did this talk I asked the audience how much database design they did. Every single one of them was already an experienced and capable database designer! So why had they come to this intro talk? They had three reasons:

  1. It was the only talk on database design at the conference, and one more than they were used to getting.
  2. They had picked up their database design skills on-the-job and thought a “reminder” of the basics would be good.
  3. It was cold outside and all the other talks appealed less.

So, this time I am hoping some of the audience is new to database design and I get to teach them great stuff they did not know. If it is all experts again, I think I’ll have to retire this particular intro talk, at least for conferences.

As you can see from the agenda grid here, I’ll be talking at 10:15. You can’t link to an abstract of the talk yet, that just needs to be twiddled into place.

Update – Peter Scott stopped by this blog and it prompted a thought. He felt it was too much at a tangent to add as a comment but I felt it was a very valid and valuable point – so check it out over here on his blog.

BTW Pete has started blogging more, on his thoughts and opinions on Data Warehousing. Personally I think it is worth catching them.

{Oh, and in case any lawyers stop by, “Lego” is of course the copyright name of a popular plastic construction toy, made by the Danish company The Lego Group, that children love playing with and adults hate walking on in bare feet. Did anyone not know that?!?! I have no link to The Lego Group and no plastic bricks will form part of my talk.}

Friday Philosophy – The Tech to Do What You Need Probably Exists Already November 30, 2012

Posted by mwidlake in Friday Philosophy, Perceptions.
Tags: , , ,
4 comments

How many of you have read the Oracle Concepts manual for the main version you are working on?

This is a question I ask quite often when I present and over the last 10 years the percentage number of hands raised has dropped. It was always less than 50%, it’s been dropping to more like 1 in 10 and Last year (at the UKOUG 2011 conference) was the nadir when not a single hand was raised. {Interestingly I asked this at the Slovenian User Group 3 months ago and something like 40% raised their hand – impressive!}.

Why do I feel this is important? Well, do you know all the technology solutions available across just the core RDBMS with no cost options?. No, you don’t, you (and I) really don’t. If you read the concepts manual, even just skimming it, you will be reminded of a whole load of stuff you have only dim memories of and perhaps you will even see some features that passed you by when they were first introduced.

Of course, you would need to read a few more manuals to get the full picture, such as the PL/SQL Packages and Types References, as so much good stuff is introduced via built-in packages, and the SQL Language Reference, as SQL has been extended quite a lot over the last couple of versions. Dull reading indeed but I’d estimate that if you read those three you would be aware of 90%+ of the Oracle technologies that are available to you out-of-the-box and considerably more than all but a handful of Oracle Experts. You’d know more than I as (a) I have not skimmed the PL/SQL one for years and (b) I have a rotten memory.

My point it, you can’t consider using Oracle technologies you don’t know about or remember – and they could be just what you need to fix the problem you see in front of you.

I’ll give a couple of examples.

Problem, physical IO is too high, your storage system is bottle-necked.
Answers, Reducing Physical IO:
-First up, Index Organized tables. Some of you will be aware that I am very keen on IOTs and the reason I am is that I’ve used them to physically group data that the application needed to select over and over again. It can make a massive improvement to that sort of system. They are rarely used.
-Clustered Tables. Even less used, in fact has anyone reading this used them in anger in the last 10 years? Great for situations where you need parent-children or parent-children-type-1+parent-children-type-2 data. I confess, I have not used them in anger for years.
-Move table (and order them as you do it!) and re-build indexes to remove “dead” space. This one got a bad name, especially the index rebuilds, as people were doing it needlessly without appreciating what the intention was, but now I hardly ever see it done – even when it is of benefit.
-Compress your tables and indexes. With normal Oracle compress (no need for HCC). In tests I’ve done I pretty much always see a drop in physical IO and run time. Being candid, I can’t remember doing any tests and NOT seeing an improvement but I usually only test when I expect and improvement and I don’t want to give anyone the impression it will always help.

All the above were available in Oracle 7 or 8 and all have improved over the versions.

Problem, you want to carry out some long,complex data processing in PL/SQL and if a step fails, be able to handle it and carry on.
Answers:
– Savepoints. You can rollback to a savepoint, not just to the last commit. In this way you can break the task into chunks and, if something towards the end fails, you roll back one step (or several, your choice) and call an alternative routine to handle the exception.
– Autonomous Transactions. You want to record that an error occurred but not fail the original action or not save anything it has done to date. An autonomous transaction runs in it’s own sub-session and commits in it do not effect the calling session.
– Temporary Tables. You can put your working information in them as you progress and if you need to bomb out (or some evil DBA kills your session for running too long) the temp table contents just disappear. No clean up needed.

Maybe the above is not so fair, I have not been a proper PL/SQL developer for a while now, but I hardly see the above used. Especially Savepoints. I can’t remember not having savepoints available (hmm, maybe Oracle 6) and Autonomous Transactions and Temporary tables are Oracle 8 (I thought Temporary tables might be 9 but Tim Hall’s OracleBase says 8)

Another thing I have noticed over the years is that so often I will read up on some oracle feature I know little about, only for it to come up in the next few weeks! There is a psychological aspect to this, that we only remember these “coincidences” and not the more common situations where we read up on something which does not subsequently come up before we forget about it, but I think that it is also that we tend to use only a few solutions to solve the problems we see and adding another solution to our list means the chances are high it will be suitable for something soon.

OK, so it would help us all to read the manuals (or other Oracle technical books) more. Now the big problem is finding the time.

Friday Philosophy – The Importance of Context November 23, 2012

Posted by mwidlake in Friday Philosophy, Perceptions.
Tags: , , , ,
1 comment so far

A couple of weeks ago I was making my way through the office. As I came towards the end of the large, open-plan room I became aware that there was someone following behind me so, on passing through the door I held it briefly for the person behind me {there was no where else they could be going}, turned left and through the next door – and again held it and this time looked behind me to see if the person was still going the same way as I. The lady behind gave me the strangest look.

The strange look was reasonable – the door I’d just held for her was the one into the gentleman’s bathroom. *sigh*

I was doing the correct thing, I was attempting to be helpful to a fellow person, I was in fact being very polite. But because I had utterly failed to consider the context, there is now a lady who works on the same floor as I who considers me, at best, as strange. At worst she thinks I am very strange – and more than a little creepy. I fear the latter given her reaction when she saw me in the kitchen area recently and turned around. {By the way, if anyone can think of a good way I can clear this up I’d appreciate it. After all, I can’t exactly go up to her and say “sorry about holding the door to the gents for you the other day, I did not realise you were a woman”}.

My point is that you can do what you believe is the right thing but, because you are not thinking of the context or are unaware of the full situation, you end up giving utterly the wrong impression. I had a work situation like this a while back.

Without going into too much detail, I was working with a client on a data warehouse project. The Oracle database bulk-processed large quantities of data, did classic big-data queries and was sitting on some fairly expensive hardware with dedicated storage and the intention of implementing Dataguard. One of the issues they had was with a subsidiary part of the system that created a very large number of small transactions, lots of updates. High volume OLTP on a DW setup. It was hammering the storage and eating up all the available IO. The data for this subsidiary system was transient, no need to protect it.
I realised that the hardware was not correct for this subsidiary system and it needed no archived redo. Archiving redo is an all or nothing situation for an Oracle instance. So happy that I had worked out what to do I proposed {with a smile} moving the subsidiary system to it’s own database on it’s own hardware.
When I said this to the client, their response was a stony look and the comment “We’ve just spent a fortune on this platform……”. Having dug my hole I proceeded to jump right in there “It’s OK, what I am proposing is only about 5, 7 thousand pounds of kit – nothing compared to what you spent already!”. The client now got very, very annoyed indeed.

You see, the context is that they had been sold a system that was very expensive – it was to do a demanding job. They had been getting poor performance with the system and that is partly why I was there. They also did not really understand the technical nuances well (at least, not the chaps I was talking to) and they did not appreciate why I said what I did. From their perspective, this smiling loon was suggesting that a system costing 2-3% of what they had spent on their data warehouse platform was going to be able to do the processing that the expensive system could not. Either they had spent waaay too much, this new “expert” was an idiot or else I was lying to them. And they did not like any of those options.

Looking back it is clear I should have been more aware of how they would receive what I said. I’ve done this before {several times}, bounded into a situation like a wide-eyed puppy and gone “Look! We can just do that!” without considering things like upsetting the guy who had suggested the original solution, or making the on-site expert look stupid or blowing away a salesman’s pitch. Or that I have missed a glaring and valid reason why they can’t “just do that”.

I suspect that a few people would say “no, you just tell them the way it is and if they don’t like it or you upset someone then tough”. Well, maybe, but not if you want to be there to help fix the next problem. Also, I know I am not great at appreciating the context sometimes. That is part of why I will never run a company or be a senior manager, I lack the skills to judge the impact of what I propose or say sometimes, in my rush to be helpful. I am slowly learning to just hold back on ideas though and to run things past friends or colleagues with more “whole picture” skills first though. I might be rubbish at it but I can learn I am rubbish at it.

In the case of the situation above, the expensive system was correct for what they wanted to do – and maybe not quite expensive enough. I was suggesting a slightly unusual fix for a specific problem and I should have been more laboured in explaining the problem and more circumspect in leading them to the solution. I should have taken more time.

I should have checked who was following me and where I was going before I held the door open.

Row Level Security 3 – In Pictures! November 22, 2012

Posted by mwidlake in database design, security.
Tags: , ,
3 comments

<..Part one intro and examples
<….Part two Permissions

I’ve noticed that there has not been a lot of traffic on this series on Row Level Security (data masking) so far – maybe due to how I am presenting the material? So here is a summary to date in picture/diagram format:

This is the flow of logic as to whether Row Level Security is applied to the SQL statement. Usually this is a query and usually it is on a table but the policy can be applied to other DML and also on views and synonyms:

These are the steps you need to add RLS to a table:

Below is a summary of how oracle fires and uses RLS functions and the predicates they provide:

An finally, a summary of what happens if an RLS function is invalid:

If you have not looked at the previous posts on this topic, more details are in there.

Enjoy!

Row Level Security Part 2 – permissions November 20, 2012

Posted by mwidlake in internals, security, SQL.
Tags: , ,
3 comments

<..Part 1, introduction..
..Part 3 summary in pictures..>

In this second post on the topic of “an introduction to Row Level Security” I want to cover a few things about what permissions you need to implement RLS and some of the consequences. In my introduction in part one I just said my main user has “DBA type Privileges”.

{NB This is all on Oracle V11.2 and I believe everything below is applicable to V10 as well. Also, I should point out that I am not an Oracle security expert – but despite repeatedly saying this, it seems like at least once a year I am asked to improve a system’s security on the grounds of “more than we have now is an improvement”}.

Firstly, you need the permission to create and drop RLS policies. This is controlled by having the EXECUTE privilege on SYS.DBMS_RLS. That is it. If you can execute this package you can create and drop RLS Policies, enable and disable them and also create, manage and use Policy Groups.

Obviously EXECUTE on SYS.DBMS_RLS is a privilege you need to restrict to only “empowered” accounts- but you also want to be very careful about your definition of empowered. If an account has this privilege, it can alter (disable!) any RLS policies that ANYONE created on the instance. You may need to think carefully about having several accounts that can alter all the RLS policies across all schemas in your instance.

You might want to create a single user that has the EXECUTE on SYS.DBMS_RLS and use that to implement and control RLS across the instance {and even have the account locked when it is not needed}. Of course, under normal Oracle anyone with SYS access can run the procedure – but you have SYS access under control already, don’t you?… :-).

By default, who has this privilege?

MDW> @obj_privs
Enter value for obj_name: dbms_rls
Enter value for obj_own: sys
GRANTEE              OWNER        TABLE_NAME             GRANTOR  PRIVILEGE    GRA
-------------------- ------------ ---------------------- -------- ------------ ---
EXECUTE_CATALOG_ROLE SYS          DBMS_RLS               SYS      EXECUTE      NO
XDB                  SYS          DBMS_RLS               SYS      EXECUTE      NO
WMSYS                SYS          DBMS_RLS               SYS      EXECUTE      NO

--
-- and who has that EXECUTE_CATALOG_ROLE?
--
select * from dba_role_privs where granted_role='EXECUTE_CATALOG_ROLE'

GRANTEE                        GRANTED_ROLE                   ADM DEF
------------------------------ ------------------------------ --- ---
DBA                            EXECUTE_CATALOG_ROLE           YES YES
IMP_FULL_DATABASE              EXECUTE_CATALOG_ROLE           NO  YES
EXP_FULL_DATABASE              EXECUTE_CATALOG_ROLE           NO  YES
SYS                            EXECUTE_CATALOG_ROLE           YES YES

So, our old friend the DBA role has that privilege, via the EXECUTE_CATALOG_ROLE and IMP_/EXP_FULL_DATABASE.

Let’s go back to creating policies. I’m sticking to row-level RLS for now, not column-level. When you create a policy you basically are saying “when object X is accessed, run function Y and apply the resulting string to all access, as a predicate”

Lifted an example from the manual, but adding in a couple of bits the authors were too lazy to put in { :-) } :-

DBMS_RLS.ADD_POLICY (
 object_schema   =>   'hr'
,object_name     => 'employees'
,policy_name     => 'emp_policy'
,function_schema => 'hr_rls'
,policy_function => 'emp_sec'
,statement_types => 'select');

The function HR_RLS.EMP_SEC returns a string P1

When someone issues a statement that access the HR.EMPLOYEES table the function HR_RLS.EMP_SEC is executed as part of the parse and, internally, the following view is created and placed into the orginal statement:

SELECT * FROM hr.employees WHERE P1;

As this takes place as part of the parse part of the statement processing, errors with the function are found before any effort to access the actual data occur. It is also why it is not simple to avoid RLS – it is done as part of the overall SQL processing carried out by Oracle.

If the function is invalid, generates an error or the supplied “WHERE predicate” P1 causes the final statement to be syntactically incorrect, you get an error.
At this point Oracle fails “secure” and simply won’t fire the SQL statement. Even if you own the tables and have full privilege access to them. I think it is worth a quick demonstration

I am logged on as the owner of the table and RLS function, MDW:-

MDW> select * from test1
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC                           ERIC THE RED                            1
         2 BROWN                          BOB THE BROWN                           2
         3 GREEN                          GARY THE GREEN                          1
         4 BLUE                           BILL THE BLUE                           3
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
--
--All works OK
MDW> @rls_func_damage
--
CREATE OR REPLACE FUNCTION hide_rows (
 v_schema IN VARCHAR2
,v_objname IN VARCHAR2)
RETURN VARCHAR2 AS
con VARCHAR2 (200);
BEGIN
-- The line below states loser not user
con := (loser='''||v_schema||''' or MASK_FL > 3)';
RETURN (con);
END hide_rows;
/
Function created.
--
-- Let us check the predicate coming from the function
--
select hide_rows(v_schema=>user,v_objname=>'TEST1') from dual;

HIDE_ROWS(V_SCHEMA=>USER,V_OBJNAME=>'TEST1')
---------------------------------------------------------------------------------------
(loser='MDW' or MASK_FL > 3)

-- "loser" is not a valid piece of syntax or a known variable. So...
--
select * from TEST1
              *
ERROR at line 1:
ORA-28113: policy predicate has error

-- I now damage the statement creating the function so that it is syntactically incorrect
CREATE OR REPLACE FUNCTION hide_rows (
...
-- missing quote
con := (loser='''||v_schema||''' or MASK_FL > 3)';
...

Warning: Function created with compilation errors.

select * from TEST1
              *
ERROR at line 1:
ORA-28110: policy function or package MDW.HIDE_ROWS has error

--re-create the function as I started with

Function created.

select * from TEST1
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC                           ERIC THE RED                            1
         2 BROWN                          BOB THE BROWN                           2
...
         7 RED                            ROGER THE RED                           6
--
-- Back to a working system

As you can see, you get ORA-28110 if the function used by the RLS policy is damaged and ORA-28113 if the function returns a non-valid WHERE Predicate. In either case, the data in those tables is now inaccessible. That could be a problem…

It is of course very interesting that RLS is kind of “blind” to the functions it uses until it uses it. You can alter RLS activity if you can alter the functions.

Last post I mentioned that, as you RLS security is based on these functions, you should give thought to who can see them and change them. Hopefully the above highlights that. There is another access aspect to these functions that is key.

In the first example where I use DBMS_RLS.ADD_POLICY, I state function_schema – this is who owns the PL/SQL function. It defaults to the person who creates the Policy. The function is fired in the standard way with DEFINER RIGHTS – i.e. the rights of the function owner. {You could override this when creating the function such that it runs with executioner rights – but that strikes me as a potentially very confusing and a source of much pain and anguish}.

So the function can see what the owner can see, not the executioner. This has two key effects:

  • You can restrict the access to any DB objects that the function requires to the owner of the function. The end user does not need to have access and it may be more secure if they do not.
  • You have to grant access to objects directly to the function owner, via either object grants or system grants. Stored PL/SQL does not “see” roles if executed with definer rights.

So eg if your function is owned by MDW and references a master table called SEC_MASTER.MASKED_USERS then you need to grant select on SEC_MASTER.MASKED_USERS to MDW. No using roles.
{of course, calling a table MASKED_USERS might give anyone acquiring access to the DB a clue that it is important. I tend to call such “security important” tables things that are “obviously” boring, like ROTA_DATA.}

Finally, that WHERE Predicate P1 is in effect added to your SQL statement. It can be pretty much anything. It can reference other database objects. If it references other database objects THE EXECUTING USER MUST BE ABLE TO SEE THEM. Not the owner of the function; that function has been fired and the WHERE Predicate generated already. On actually running the SQL, the executing user must have access to all objects referenced, including those introduced by the P1 WHERE Predicate. Oh, and remember synonyms!

There is there a way to sidestep RLS and you sometimes need to.

If your logon has the EXEMPT ACCESS POLICY system privilege then RLS functions are utterly ignored by you. No check is made to see if policies apply to your SQL statements, let alone run them. You may need to use this privilege if you need to access functionality that RLS has issues with (see comments by Dom Brooks and Tony Sleight on my first post). You might also need it if you have errors with the policies but you need access to the data to keep the business moving. You probably need a locked-down user somewhere with the EXEMPT ACCESS POLICY privilege if you use RLS.
Of course, great care has to be taken in making use of this privilege as it side-steps all RLS_based security.
If you have master users that have full access to the data, this privilege also removes the overhead of the SQL engine identifying any policies, firing the function and including the extra predicates.

One very final point. I have found myself writing RLS functions that all check if the user was the owner of the table and, if so, allowing them to see all data. By granting EXEMPT ACCESS POLICY to that user I was able to do away with those checks and make the functions much simpler. In this case I did not so much think of EXEMPT ACCESS POLICY turning off security but the lack of it turning it on for restricted users.

That will do for part 2.

Row Level Security Part 1 November 15, 2012

Posted by mwidlake in database design, security.
Tags: , ,
9 comments

I’ve been working a little on Row Level Security (RLS) recently and wanted to mention a few things, so first some groundwork.

If you want to limit the rows certain users can see, you might think to use views or you might think to use RLS (part of VPD – Virtual Private Database). You can also (from V10 I think) limit which columns users can see. An example is probably the best way to show this. I’m doing this on Oracle 11.2.0.3.

I have two users, MDW and MDW_OFFSHORE. MDW has DBA-type privileges and MDW_OFFSHORE has connect, resource and one or two other simple privs. I will now demonstrate creating and populating a simple table under MDW, adding RLS to it and how it alters what MDW_OFFSHORE sees.

MDW> -- create a table and show row level and column level rls
MDW> --
MDW> drop table TEST1 purge;
MDW> --
MDW> create table test1
  2  (id         number       not null
  3  ,cp_name    varchar2(30) not null
  4  ,other_name varchar2(30) not null
  5  ,MASK_FL    number
  6  )
  7  /
MDW> insert into test1 VALUES (1,'ERIC','ERIC THE RED',1);
MDW> insert into test1 VALUES (2,'BROWN','BOB THE BROWN',2);
MDW> insert into test1 VALUES (3,'GREEN','GARY THE GREEN',1);
MDW> insert into test1 VALUES (4,'BLUE','BILL THE BLUE',3);
MDW> insert into test1 VALUES (5,'BLACK','DAVE THE BLACK',4);
MDW> insert into test1 VALUES (6,'PURPLE','PETE THE PURPLE',5);
MDW> insert into test1 VALUES (7,'RED','ROGER THE RED',6);
MDW> --
MDW> COMMIT;
MDW> SELECT * FROM TEST1;

        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC                           ERIC THE RED                            1
         2 BROWN                          BOB THE BROWN                           2
         3 GREEN                          GARY THE GREEN                          1
         4 BLUE                           BILL THE BLUE                           3
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
MDW> grant all on test1 to mdw_onshore;
MDW> grant select on test1 to mdw_offshore;
MDW> --
--
--This is where I create a function to generate the RLS rules
--
MDW> CREATE OR REPLACE FUNCTION hide_rows (
  2   v_schema IN VARCHAR2
  3  ,v_objname IN VARCHAR2)
  4  RETURN VARCHAR2 AS
  5  con VARCHAR2 (200);
  6  BEGIN
  7  -- note the use of backets to make it an atomic test
  8  con := '(user='''||v_schema||''' or MASK_FL > 3)';
  9  RETURN (con);
 10  END hide_rows;
 11  /
MDW> --
--
-- Now add the actual RLS policy to the table TEST1 using the function hide_rows
MDW> BEGIN
  2    sys.DBMS_RLS.ADD_POLICY (
  3      object_schema     => 'MDW'
  4     ,object_name       => 'TEST1'
  5     ,policy_name       => 'hide_rows_policy'
  6     ,policy_function   => 'hide_rows'
  7     ,function_schema   => 'MDW'
  8                            );
  9  END;
 10  /
MDW> --

That’s it. In effect you create a function, in this hide_rows, that creates a WHERE predicate (more on that later) and associate it with the table by creating a policy hide_ rows_policy.

Now let’s see what impact it has:

-- as the main user:
--
MDW> select * from test1
  2  /
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC                           ERIC THE RED                            1
         2 BROWN                          BOB THE BROWN                           2
         3 GREEN                          GARY THE GREEN                          1
         4 BLUE                           BILL THE BLUE                           3
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
7 rows selected.
MDW>
-- now as the offshore user:
mdw_offshore> select * from mdw.test1
  2  /
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
3 rows selected.

As you can see, mdw_offshore is seeing only a subset of rows in the table. This subset is determined by the function hide_rows. This function returns a WHERE PREDICATE that is added to any query run against the table. It is actually very simply to see what that predicate is, you just call the function passing in the table owner and name:

MDW> select hide_rows(v_schema => user,v_objname=>’TEST1′) from dual;

HIDE_ROWS(V_SCHEMA=>USER,V_OBJNAME=>’TEST1′)
——————————————————————————-
(user=’MDW’ or MASK_FL > 3)

Note that my refernce to v_owner in the function comes out as the text MDW – a way to ensure the predicate does not filter data looked at by the table owner. This implies of course that yes, you can stop the table owner from seeing data!

By default, this extra WHERE PREDICATE is applied to any SELECT, INSERT, UPDATE or DELETE statement run against this table. The function has to accept the v_schema and v_objname parameters and output a varchar2 string. At the time of being called as part of a RLS policy, oracle passes in the owner and name of the object being masked (which is usually a table and sometimes a view).
There are a few things I think it is important to keep in mind in respect of this extra Where Predicate:-

  • You have very little control over what other where predicates will be applied to DML against this table, so keep it self-contained – which is why I bound it in brackets.
  • It has to be syntactically correct but no check is made for this when you create the function – it is simply a string being outputted, as far as Oracle is concerned, at the time of creation.
  • It can be any valid Where Predicate you like, so long as it does not reference other rows in the table with the policy is applied to
  • This could be fired a lot. The performance impact could be considerable, which is why CONTEXTs are often used (as they are memory resident and can be configured by eg logon triggers) but I have skipped over that. Go look at the manuals.

I mentioned that the masking of data could be applied to just a column rather than whole rows. So let’s demonstrate that. I can do this by simply dropping and reapplying the same function in a different way:

MDW> BEGIN
-- first drop the policy, simply by stating the object owner, name and policy
  2    sys.dbms_rls.drop_policy(
  3      object_schema     => 'MDW'
  4     ,object_name       => 'TEST1'
  5     ,policy_name       => 'hide_rows_policy'
  6                            );
-- and recreate - note the two new paramters 
  7    sys.DBMS_RLS.ADD_POLICY (
  8      object_schema     => 'MDW'
  9     ,object_name       => 'TEST1'
 10     ,policy_name       => 'hide_rows_policy'
 11     ,policy_function   => 'hide_rows'
 12     ,function_schema   => 'MDW'
 13     ,sec_relevant_cols => 'OTHER_NAME,MASK_FL'
 14     ,sec_relevant_cols_opt => sys.dbms_rls.ALL_ROWS
 15                            );
 16  END;
 17  /
PL/SQL procedure successfully completed.
--
-- as the main user
MDW> select * from test1
  2  /
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC                           ERIC THE RED                            1
         2 BROWN                          BOB THE BROWN                           2
         3 GREEN                          GARY THE GREEN                          1
         4 BLUE                           BILL THE BLUE                           3
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
7 rows selected.
--
-- now as the offshore user:
mdw_offshore> select * from mdw.test1
  2  /
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC
         2 BROWN
         3 GREEN
         4 BLUE
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
7 rows selected.
--
-- but what can look a little odd is that if mdw_offshore describes the table..
mdw_offshore> desc mdw.test1
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- --------------
 ID                                                    NOT NULL NUMBER
 CP_NAME                                               NOT NULL VARCHAR2(30)
 OTHER_NAME                                            NOT NULL VARCHAR2(30)
 MASK_FL                                                        NUMBER
--
-- they see that OTHER_NAME is NOT NULL but...
mdw_offshore> select count(*) from mdw.test1 where other_name is null;

  COUNT(*)
----------
         4

That’s rather nice, I think. The same function can be used and rather than hide the rows it masks the stated columns. The sec_relevant_cols => ‘OTHER_NAME,MASK_FL’ parameter states the columns to apply the function on and the sec_relevant_cols_opt => sys.dbms_rls.ALL_ROWS parameter means “show a row even if a column or columns are hidden”. Without it, you only see rows if you do not reference the masked columns (even if indirectly via eg “select (*)”) – see this nice article by Tim Hall if you want an example of that.

With column masking the string from the function is not actually used as a Where Predicate but as a Boolean TRUE/FALSE test. If the test results in TRUE, you see the column(s), otherwise they are hidden.

As you can see from the example, this column-level RLS can result in some odd looking (though intentional) results. A DESC of the table shows the column OTHER_NAME is mandatory but records are found with it set to null. There can also be the situation where a user can insert a value for the masked column, but then cannot see it when they select the data back, though other users can see the populated column.

A limitation of column-level masking is that you can only mask columns with NULL and not, for example, a string of “MASKED”. Thus if a column you are masking is not manadatory, how would you distinguish between a true NULL and a masked NULL? {A nod of thanks goes to Pete Finnigan for confirming this “NULL only” limitation.}

One of the appeals of RLS over eg views is that it is hard to circumvent, as it is applied at the level of the SQL. Another is that the same function can be applied to many tables and might also be applied to both column and row masking, if designed well.

That will almost do for an introduction to the use of RLS to mask rows and columns. However I’ll finish with just a few important points that follow on from this being a security feature.

  • I demonstrated that I could see what the function responsible for this security actually returns. If you do not protect unauthorised execution of that function, that could be a security hole.
  • I masked the column that is used to control if the OTHER_NAME column is masked. Would you need to also do this?
  • Data that is missing is data that could be of interest.
  • Anyone who can alter that function can undo your security.
  • Anyone who can view DBA_SOURCE (eg has SELECT ANY DICTIONARY) can view the function and see how it works.
  • As you can appreciate from the above, apply Security securely is a right pain in the proverbial.

DBMS_APPLICATION_INFO for Instrumentation September 3, 2012

Posted by mwidlake in development, Instrumentation, performance.
Tags: , , , ,
24 comments

I just wanted to put up a post about DBMS_APPLICATION_INFO. This is a fantastic little built-in PL/SQL package that Oracle has provided since Oracle 8 to allow you to instrument your code. i.e record what it is doing. I’m a big fan of DBMS_APPLICATION_INFO and have used it several times to help identify where in an application time is being spent and how that pattern of time has altered.

Some PL/SQL developers use it and some don’t. It seems to me that it’s use comes down to where you work, as most PL/SQL developers are aware of it – but not everyone uses it (a friend of mine made the comment recently that “all good PL/SQL developers use it“. I can understand his point but don’t 100% agree).

It is incredibly easy to use. You use the procedures SET_MODULE(module_name,action_name), SET_ACTION(action_name) and SET_CLIENT_INFO(client_info) to set the values of the corresponding columns in V$SESSION for your current session. This is done in memory, there is no inserting or updating of any table rows, so it is incredibly quick and light-weight. The below shows a subset of V$SESSION including the columns that get set:

desc v$session
Name Null? Type
----------------------------------------- -------- ----------------
SADDR RAW(8)
SID NUMBER
SERIAL# NUMBER
AUDSID NUMBER
PADDR RAW(8)
USER# NUMBER
USERNAME VARCHAR2(30)
COMMAND NUMBER
...
MODULE VARCHAR2(64)
MODULE_HASH NUMBER
ACTION VARCHAR2(64)
ACTION_HASH NUMBER
CLIENT_INFO VARCHAR2(64)
...

Just something odd to note. MODULE, ACTION and CLIENT_INFO are shown at VC(64) but if you check the documentation you will see that:

MODULE is limited to 48 bytes
ACTION is limited to 32 bytes
CLIENT_INFO can be set to the whole 64 bytes

I have no idea why MODULE and ACTION are limited in this way but it might be historic, backward compatibility with prior versions.

As a quick example of their use:

test_11_2> --demo_dai.sql
test_11_2> -- quick demo of dbms_application_info
test_11_2> COL sid form 9999
test_11_2> col username form a12
test_11_2> col module form a18
test_11_2> col action form a22
test_11_2> col client_info form a15
test_11_2> -- set module and action
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_MODULE(MODULE_NAME=>'SALES_LOAD',ACTION_NAME=>'VALIDATE_ALL_PRICES')
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC SALES_LOAD VALIDATE_ALL_PRICES
test_11_2> --
test_11_2> -- Update the action within a module
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_ACTION(ACTION_NAME=>'Update all Prices')
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC SALES_LOAD Update all Prices

test_11_2> -- clear them
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_MODULE(null,null)
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC

test_11_2> -- you can set just an action
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_action(ACTION_NAME=>'orphan process')
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC orphan process

test_11_2> -- set them to something sensible
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_MODULE(MODULE_NAME=>'RETURNS_LOAD',ACTION_NAME=>'MATCH_TO_SALE')
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC RETURNS_LOAD MATCH_TO_SALE

test_11_2> -- client_info can be longer
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_client_info(client_info=>'entered this module at '||sysdate)
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ --------------- --------------- ----------------------------------------
650 ERIC RETURNS_LOAD MATCH_TO_SALE entered this module at 03-SEP-2012 13:07

The intention is that you set the MODULE as you enter a functional chunk of your application (be it a PL/SQL package or a chunk of JAVA, whatever {but be aware of the stateless nature of systems using middle tiers and pooled connections}) and then update the ACTION as you progress. You can set an action with no module, as demonstrated above, but I would recommend against it. On that topic I would make the following recommendations:

  • Set the MODULE at the start, eg as you enter a PL/SQL package’s main module
  • Set the MODULE before you do all the validation – you may wish to see if that validation is as quick as you think.
  • Use SET_ACTION to update the action prudently. ie when the activity of the code alters but not all the time. It will come clearer with use, but what you are after is to be able to judge the relative run durations of parts of the application. Too much detail can mask the overall picture. If you can alter code you can always add more instrumentation.
  • Clear the MODULE and ACTION explicitly as you exit a package. The activity is lightweight and you do not want any code that lacks this instrumentation to inherit values from calling code.
  • Do not set an ACTION without having set the MODULE earlier in the code.
  • Keep the text of these values short and specific.

You do not have to look at V$SESSION to see these values, which is good as you may lack the privs needed to do so. Since oracle 9 there have also been procedures to check what the values are set to. e.g:

test_11_2> BEGIN
2 DECLARE
3 V_MODULE VARCHAR2 (64);
4 V_ACTION VARCHAR2 (64);
5 begin
6 dbms_application_info.READ_MODULE(V_MODULE,V_ACTION);
7 DBMS_OUTPUT.PUT_LINE('Module IS '||V_MODULE||' Action IS '||V_ACTION);
8 END;
9 END;
10 /

Module IS RETURNS_LOAD Action IS MATCH_TO_SALE
test_11_2>

So as you can see, these values can be set, they can be accessed via the data dictionary or the package itself and, if you believe my wild claims, they are very lightweight to use. However, these values really come into themselves with eg OEM’s performance screens. I’ll leave examples of that to a second post. Here I just wanted to cover how easy it is to set and get the information.

That does lead me onto what I think is the main reason that DBMS_APPLICATION_INFO is not as widely used as it could {and should} be. Developers often lack access to eg OEM and so do not see the full benefit of using it. Also, developers tend to be {note the get-out-clause) more concerned with making the code work and less with monitoring it after it is released. DBA-types tend to be more concerned with monitoring it. Thus the DBA types would want it in the code and developers would not want to spend the time coding it in. We are all one big team though, so some negotiation and beer-buying (or management enforcement) might well resolve that one.

Friday Philosophy – Lead or Lag (When to Upgrade)? January 20, 2012

Posted by mwidlake in development, Friday Philosophy, Testing.
Tags: , , ,
10 comments

I was involved in a discussion recently with Debra Lilley which version of Oracle to use. You can see her blog about it here (and she would love any further feedback from others). Oracle now has a policy that it will release the quarterly PSUs for a given point release for 12 months once that point release is superseded. ie once 11.2.0.3 came out, Oracle will only guarantee to provide PSUs for 11.2.0.2 for 12 months. See “My Oracle Support” note ID 742060.1. However, an older Terminal release such as 11.1.0.7 is not superseded and is supported until 2015 – and will get the quarterly PSU updates. This left the customer with an issue. Should they start doing their development on the latest and theoretically greatest version of Oracle and be forced to do a point upgrade “soon” to keep getting the PSUs, or use an older version of Oracle and avoid the need to upgrade?

This is in many ways a special case of the perennial issue of should you use the latest version of Oracle (or in fact any complex software solution) or go with the version you know and trust? Plus, should you patch up to the latest version which in theory gives you protection against bugs and vulnerabilities (along with the CPUs). Yes, they are two separate issues but people tend to sit on the same side of both points, for the same reasons.

The arguments to stay using an older version are that it is working, it is stable, you do not need the new features and upgrading is a lot of work and effort. Plus the new version will have new bugs that come along with the new features you do not need and things might be turned on by default that you could do without (like stats collecting or not creating the actual segments when a new table or partition is created). If you remain on your favourite version long enough, you get another issue which is that the latest version of Oracle might not be compatible with your ancient version of the OS or another package or programming language critical to your system (I got caught in a terrible web with old perl, old O/S and old DB that resulted in a need to upgrade all three together – ouch!).

The arguments to moving forward are that you get access to the latest features, that over all older features will have more bugs fixed in newer version, performance will be better {again, overall, exceptions allowing}. Also, if you do hit bugs and problems there are no issues in having to first upgrade to a fully supported version. Plus, fixes are made for current versions first and then back-ported to older ones. Those pack-ported fixes can cause real problems when you DO decide to upgrade.

The big sticking points are the effort involved in upgrading and living with the bugs that you find that Oracle Testing didn’t.

I’ve got a few of other considerations to throw into the pot.

Firstly, if you are developing something new, it is not a lot more effort to use the latest version. This allows you to learn the new version and eases the transition of older systems to it.

Secondly, Oracle like you if you use the latest version, especially if it is the latest-latest version or even beta. Yeah, the helpdesk will not have a clue about some of your issues but in my experience you get access to those really smart guys and gals in Oracle who do the third-line support or even the development work.

Thirdly, if you are on the latest version, if you do decide to freeze on that version for a while, for stability and a quiet life, you have a lot longer before your version (at least at a major level) drops out of support.

Fourthly, dynamic, inquisitive, flexible staff like new things. In my experience, environments that freeze on an old version have a higher percentage of staff who either like it dull and repetitive, or hate it being dull and repetitive – and itch to get out. If I’m in charge, I know which type of staff I like to have more of {NB there are some very good arguments for having some staff who like it dull and repetitive}.

As you can guess, I am in the “be on the latest version” side of the argument. I was ambivalent about it until a few years ago when I noticed a trend:

Sites that like to move forward tend to (a) do it in a controlled manner and (b) have the infrastructure to do proper regression testing.
Site that like to stay still lack the ability to do regression testing and move forward only when forced – and in a pressured, unplanned and frankly chaotic manner.

That was it, that was the real key thing for me. The further you lag behind the more likely you are to eventually be forced to upgrade and it won’t be a nice time doing it. I know, there are exceptions, systems still running Oracle 6 absolutely fine on an old DOS6.1 box. In the same way you also get the odd 95-year-old life-long smokers – and thousands of 45-year-old smokers with emphysema.

When I have any sway over the situation I now always strive to be on modern versions of Oracle {OS, language, whatever} and to patch small and regular. To support all this, have very good regression testing. I’ve only a couple of times been able to get the regression testing sorted out as well as I would like, but when you do the pain of patching and upgrading, as well as developing and integrating, is so much reduced that not patching seems madness.

So to sum up:

  • If it is a new development, go for the very latest version, play with the latest features if potentially beneficial and see if you can get Oracle to be interested in your attempts. ie (B)lead.
  • If you have good regression testing, plan and carry out patch and version upgrades as they come available and stay current. ie Lead
  • If you have a complex solution in place and no/poor regression testing, do not move to a new major release, leave it a while for the worst new bugs to be found and fixed. Then move. ie Lag
  • If your system is old AND critical and all the guys and gals who implemented it are long gone, stay on that version for ever. ie stagnate.

Oh, and if that last one applies to many of your systems – dust off the CV and start reading technical manuals. One day you will need a new job in a hurry.

Follow

Get every new post delivered to your Inbox.

Join 188 other followers