
Memory Changes Everything July 12, 2010

Posted by mwidlake in Architecture, performance.

I’ve got this USB memory stick which I use to carry around my scripts, documents, presentations, Oracle manuals and enough music to keep me going for a few days. It is an 8GB Gizzmo Junior and it is tiny. By tiny I mean as wide as my little finger, the length of a matchstick and about the same thickness as said matchstick. So small that I did indeed lose the damn thing for 6 months before I realised it had got trapped behind a credit card in my wallet.

It cost me ten British pounds about 15 months ago (less than most 4GB USB sticks seem to cost now, but then it is nothing more than the memory chip and connectors wrapped in plastic) and it highlights how cheap solid-state “storage” is becoming.

Connected to this, I was looking at buying a new PC this week and the machine comes with 10 USB slots, if you include the ones on the supplied monitor and the stubs on the motherboard.
10 USB slots, 8GB gizzmo memory sticks… That would be 80GB of cheap and fast storage. Now get a few USB hubs and bulk-buy a few dozen cheap USB2 sticks and you could soon have a solid-state database of a few hundred GB for a thousand pounds. Then of course you can have fun seeing where the pinch-points in the system are (USB2 has a maximum speed per port, around 60MB/s theoretical, and going USB3 right now is going to break that 1 grand barrier. But give it a year…).

This really started me thinking about when memory-based storage will take over from spinning disk as the best option for enterprise-level storage, and my gut feeling is in about 5 years. I think it will be both technically possible and financially viable in much less than that, say as little as 2 years, though by then the cost of solid-state storage per MB will still be higher than disk, if potentially much faster. A few considerations going through my mind were:-

  • Disk is getting a lot slower in relation to acreage. By this I mean that, for a single disc drive, capacity is doubling about every 18 months but seek time has hardly reduced in a decade, and transfer rate (reading from the physical platters to the unit’s buffer) is again almost stationary, at about 120MB/s for a 10,000rpm disk and up towards 180MB/s for those very expensive and noisy 15,000rpm disks. Being a tad ridiculous to make the point, with modern 3TB discs you could build most Oracle databases on one disc. Let’s make it two discs in a RAID 1 mirror for redundancy. My point is, your 3TB database could well be being run right now, for real, across say 5 physical disks with a total sustainable physical throughput of around 500MB/s.
  • Solid state storage seems to be halving in price in more like 8-10 months.
  • IO subsystems are made faster by using RAID so that several physical discs can contribute to get towards the 300MB/s or so speed of the interface – but solid state is already that fast.
  • IO subsystems are made faster by building big caches into them and pre-fetching data that “might” be requested next. Oh, that is kind of solid state storage already.
  • Solid state storage, at least the cheap stuff in your USB stick, has the problem that you can only write to each bit a thousand or so times before it starts to get unreliable. But physical disk has exactly the same issue.
  • There are new methods of solid-state memory storage coming along – “New Scientist” had a nice article on it a few months ago, and these versions will be even higher density and more long-term reliable.
  • Seek time on solid-state memory is virtually zero, so random IO is going to be particularly fast compared to spinning disk.

Solid state memory needs less power, and thus less cooling, is silent, is potentially denser and is less vulnerable to temperature and humidity fluctuations. I can see it not needing to be kept in a specialist server room with all that air con and the ear defenders you need when you go in the room. Just somewhere with normal air con and a lock on the door should suffice.
We do not need solid-state storage to match the size of current discs, or even be as cheap, for it to take over. As I have already pointed out, it is not acreage you need with physical disks but enough spindles and caches to make the storage fast enough in relation to the space. Further, we can afford to pay more for solid state if we do not need to keep it in such expensive clean-room-like environments.

I can see that in a couple of years, for a given computer system – say a mixed-workload order processing system – the storage needs will be met by maybe a dozen solid-state chunks of storage, perhaps themselves consisting of several small units of memory in some sort of RAID for resilience, all able to flood the IO channels into our processing server. The issue will then be getting the network and IO channels into the server to go fast enough. So don’t: stick all the storage directly into the server. You just got rid of half your SAN considerations.

I’m going to stop there. Partly because I have run out of time and partly because, in checking out what I am writing, I’ve just spotted that someone did a better job of this before me. Over to James Morle, who did a fantastic post on this very topic back in May. Stupid me for not checking out his blog more often. James also mentions that often it is not total throughput you are interested in at all but IOPS. That near-zero latency of solid-state memory is going to be great for supporting very high IOPS.

Saturday Philosophy – The unbelievably small world of VLDBs June 12, 2010

Posted by mwidlake in VLDB.

Yesterday I posted about the potential for an Oracle in Science community within the UK Oracle user group {and wider for that matter; there is after all a world Oracle Life Science community, but it is currently less vibrant than it was, sadly}.

My friend and occasional drinking partner Peter Scott replied to say he felt there was “a place for a SIG for stonking great databases” {now wouldn’t SGDB be a better TLA than VLDB? :-) }.

Well, I would agree but for one small detail. An apparent lack of anyone willing to be part of the community.

When I was building a very considerable VLDB {and I’m sorry I keep going on about it, I’ll try and stop soon} back in the early-to-mid 2000s I seemed to be working in a vacuum of information, let alone prior experience. Yes, there was stuff in the Oracle manuals about how big things could theoretically be made and some vague advice on some aspects of it, but an absolute lack of any visible Oracle customers with anything even approaching the sizes I was contemplating. 2TB was about the limit and I was already way beyond that. Was this because I really was pushing the boundaries of database size? Well, I have since found out that whilst I was up there just behind the leading edge, there were several databases much, much bigger than mine, and others already envisioned that might hit the Petabyte level, never mind mere Terabytes.

The thing is, no one would speak about them. At all.

We were left to do it all pretty much from scratch, and it would not have been possible if I had not spent years building up experience with VLDBs as the definition of VLDB size increased, plus of course cracking support by the other DBAs and Systems Admins around me. And to be fair, Oracle Corp helped us a lot with our efforts to build these massive databases. Interestingly, one Oracle Consultant would regularly tell me that our systems really were not so unusually big and there were plenty larger. He usually said this when I asked, exasperatedly, as something else failed to scale, if Oracle had ever tested things at this level :-). But despite constantly asking to meet with these people with massive systems, so we could exchange war stories and share advice, and being promised such contacts by Oracle, they never materialised except for CERN – who we already talked to as a fellow scientific organisation – and Amazon, who it turns out did things in a very different way to us {but it was really good to talk to them and find out how they did do their big databases, thanks guys}. Both were at the same scale or just behind where we were.

This is because most of the people with massive Oracle databases will not talk about them, as they are either run by the largest financial organisations, are to do with defence, or are in some other way just not talked about. In his comment Peter refers to a prior client with an OLTP-type system that is now around the PB scale. I would be pretty sure Peter can’t say who the client is or give any details about how the system was designed.

So although I think there is a real need for a “stonking great databases” forum, I think there is a real problem in getting a user community of such people/organisations together. And if you did, none of the members would be allowed to say much about how they achieved it, so all you could do would be to sit around and brag about who has the biggest. There is an Oracle community about such things, called the Terabyte Club, but last I knew it was invite-only, and when I managed to get invited it turned out that mine was the biggest by a considerable margin, so I was still not meeting these elusive groups with 500TB databases. Maybe there is an Oracle-supported über database society, but as I never signed the Official Secrets Act I might not have been eligible to play.

If I am wrong and anyone does form such a user group (or is in one!) I would love to be a member and I would strive to present and help.

I’ll finish with what appears to be a contradiction to what I have just written. There already is a UKOUG user group that deals with large systems, and I chair it – the Management and Infrastructure SIG {sorry, the info on the web page could do with some updating}. Part of what we cover is VLDBs. But we also cover Very Many DataBases (companies with thousands of instances) and Very Complex DataBases, plus how you go about the technical and management aspects of working in a massive IT Infrastructure. It might be that we could dedicate a meeting to VLDBs and see how it goes, but I know that whilst many who come along are dealing with databases of a few TB, no one is dealing with hundreds-of-TB or PB databases. Either that or they are keeping quiet about it, which takes us back to my main point. The MI SIG is probably the closest to a VLDB SIG we have in Europe though, and is a great bunch of people, so if you have a VLDB and want to meet some fellow sufferers, we have our next meeting on 23rd September in the Oracle City office.

Friday Philosophy – Madness demands Attention May 28, 2010

Posted by mwidlake in Architecture, Friday Philosophy.

Many years ago I had a good friend who was a psychiatric nurse. We were talking about his job once and he was saying how some patients just took up much more time than others. These were generally the ones who would be deemed “the most mad” in a non-clinical manner {which is pretty much how my friend the psychiatric nurse put it}. These patients’ actions or need for intervention would put demands on the staff far more than other patients. As a result, all the nurses tended to get to know (or know of) such patients better than others.

I thought of this the other day when a few of us were talking about some awful bit of application we were concerned about. This thing inserts rows from an MSSQL database into a table in an Oracle database. Triggers on the initial table fire and populate another table, in a 1-to-many relationship. This second table also has a trigger on it that further inserts into another set of tables. A regular process then aggregates this data – and sticks it back into the MSSQL database it came from.

Said process is done as a single transaction for all rows inserted for the day. Irrespective of the growth in rows. Or the fact that one source “application” has grown to 10 and soon will grow to 50. All rows in one transaction. The intermediate tables are never cleared out and get bigger and bigger. No one else needs any of this data in the Oracle side of the system.

There are several “madnesses” to this process – why put the data into Oracle only to put it back in the source system, why use a busy production system to hold and process transient data, why no clean-up, why are the records not processed in batches, and the cascading triggers magnify transaction data volumes… {a sketch of that last pattern is below}.
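For anyone who has not met the cascading-trigger pattern, here is a minimal sketch of the sort of thing I mean. All the object names are invented; the real system’s schema is, mercifully, not reproduced here.

  -- one row arriving in the inbound table fans out to many rows in an
  -- intermediate table, which itself carries a similar trigger...
  CREATE OR REPLACE TRIGGER trg_inbound_fanout
  AFTER INSERT ON inbound_rows
  FOR EACH ROW
  BEGIN
    INSERT INTO intermediate_rows (inbound_id, rule_id, payload)
    SELECT :NEW.id, r.rule_id, :NEW.payload
    FROM   fanout_rules r;
  END;
  /

Put a second trigger like that on intermediate_rows and you can see how one inserted row balloons into a pile of transaction volume.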

This process is well-known in our group. I’ve been involved. Both the guys I was talking to have been involved. I can see from my desk 4 or 5 other people who have been roped into bullying this process through before now. In fact, I reckon half the department have had to work on this damned thing at some point.

Can you see why I was reminded of my conversation with my old nurse friend?

The application is simply mad. And as a result it is demanding not only on our database but on all of the team, as so many of us have had to get involved working on it. We have all got to know it so well.

I’m glad to say that treatment for the application is planned and, hopefully, it will soon be a lot happier.

So will we.

Friday Philosophy – To Manage or to Not Manage March 26, 2010

Posted by mwidlake in Friday Philosophy, Management, Uncategorized.

Recently a friend of mine Graham Oaks blogged about his decision to step back from management and return to the Technical Coal Face.

I made a similar decision 3 or 4 years back, so I have a lot of empathy for his position and his decision. I found that doing the job of a manager takes up a lot more time, effort, patience and emotional energy than I had realised. Team leading is bad enough, having to coordinate the efforts of half a dozen people and sort out the myriad issues they throw your way. Being in charge of multiple teams, responsible for strategy, dealing with staff development and morale, being a buffer against HR and having to deal with the politics created by people who WANT to be managers and wield power is more than a full-time job. Trying to hold onto a technical element as well, I found I could only manage by doing the technical job as a “hobby”, in my own time. It was just too much to keep going year after year.

I had to choose. Give up the technical work to leave me enough personal resource to remain a manager and get better at it, or stop being a manager and start regaining my technical skills. I chose the latter.

Since I made my decision 3 years ago, I have met several people who have made the same conscious decision to step away from management and return to a more technical role. You may stand to earn more as a manager {which is something I objected to before being a manager and still object to having been one – it should be possible to earn the same doing either}, but for some of us the extra money is not enough to make losing the hands-on work a sacrifice worth making.

One of the points Graham makes in his blog is that his spell as a manager has given him an appreciation of the challenges of management and the particular hells and stresses of the role. I think this is something that people who have never been managers have trouble really understanding.

I was working with a guy a couple of years ago and he was telling me how much of “a Moron” his boss was. In fact, he felt his current boss was even more of a moron than his previous boss. He then confessed that all of his bosses had been morons. “What, every single one of them?” I asked. Yes, absolutely all of them. That struck me as incredibly unfortunate: every single one of these managers (and he’d had a lot, as he moved between teams and positions on a regular basis), most of whom had come up through the technical ranks, were all Morons. I pointed out this unfortunate coincidence and wondered if there might actually be a common factor with all of these managers. He told me there was; they were all Morons.

He himself had never been a manager. He said he was too smart. Not smart enough to get what I was hinting at with the common factor suggestion though.

Obviously, some managers are poor at what they do; there are poor people in every job. But something I took away from my time being a manager is a lack of empathy for anyone saying all managers are a waste of time when they have never done the job themselves.

After all, I doubt there is any job where just doing it means you are an idiot.

Except Sys Admins – They are all idiots :-) (ducks behind server).

Command Line or GUI – Which is Best? February 18, 2010

Posted by mwidlake in performance.

At present I am suffering ever so slightly from “split personality disorder”* in respect of my liking for Command Line and GUI interfaces.

On the one hand, much to my colleagues’ mild reproach, I use SQL*Plus and not PL/SQL Developer for my day-to-day work. Even worse, I got sick of using Notepad to hack around scripts {I am working in a Windows client environment and you simply can’t use MS Word with SQL files!} so I have retrograded recently and downloaded a Windows-compliant ‘vi’ interface and it is soooooo nice to be able to use powerful ‘vi’ commands on my files once more. “:.,$s/^ /,/” {swap the leading space for a comma on every line from here to the end of the file}. Ahhh, it is so much easier. I can do stuff in 3 seconds in ‘vi’ that would take me 10 minutes in Notepad in a large, complex file. That and, I’m sorry, but Notepad seems to be unable to manage a 100MB file, despite me having 2GB of real memory and a decent machine, whereas ‘vi’ has no problem with it at all.
Even more retrograde, I have direct telnet access to my linux servers and I am getting back to that as it makes me so much more productive. “ls -alrt | grep ^d” for all directories anyone? “df -k .” to see how many data files I can add? Yep, it’s all arcane and means so little to many modern IT “Java/Struts/CDE” people, but boy it is direct and fast. I might even dig out that book on sed and awk.

On the other hand, I have finally (after much very painful discussion back and forth) got agreement that my site probably has access to AWR, ASH and all that good performance repository stuff. So I am hacking around with the OEM screens that focus on performance and snapshots and stuff. Now, I am traditionally not a big fan of GUI DBA tools. Partly it is because I am a bit old and stuck in my ways, and partly it is because GUIs are really just “menus of options”. You are limited to what options are available in your DBA GUI tool, and you have a harder time learning all the options available or what is really going on “under the covers”.

But with AWR and all those graphs, links and utilities, you can drill down into real problems, in real time or in the past, so effectively that, well, once they start using this tool properly they will not need me at all. It is a fantastic boon to performance management and problem resolution, as well as proactive performance management.
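Of course, the data behind those pretty screens is just a set of queryable repository views {licensed as part of the Diagnostics Pack, of which more below}. A minimal sketch of peeking under the covers is below; the snapshot range is arbitrary:

  -- top SQL by elapsed time across a range of AWR snapshots,
  -- roughly what the OEM top-activity style screens summarise
  SELECT st.sql_id
  ,      SUM(st.elapsed_time_delta) / 1e6 AS elapsed_secs
  ,      SUM(st.executions_delta)         AS execs
  FROM   dba_hist_sqlstat st
  WHERE  st.snap_id BETWEEN 1000 AND 1010   -- arbitrary snapshot range
  GROUP  BY st.sql_id
  ORDER  BY elapsed_secs DESC;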

So there you are, I am with Doug Burns on this one, in that I have Learned to Love Pictures – when the Pictures are well thought out, well connected and simple enough to help make sense of a complex system {and Oh Boy has Oracle performance become sooo Complex!!!!}.

So right now, I spend half my day in vi/linux/command line world and half of it in pretty picture/GUI world. I think what really makes me happy is to leave behind the half-way-house of text-like Windows World {Windows SQL*Plus, Notepad}.

Just to finish, you can’t mention AWR without someone raising the ugly issue of licence cost and how Evil Oracle Corp were to charge for it. Well, I think it has been established that the guys and gals who developed AWR/ASH did not expect it to become a cost option, but it did. And I suspect that what keeps it a cost option, despite the community’s OutRage, is its very popularity – anything that popular, hey, a commercial company is going to charge for. I still reckon Oracle Corp ballsed up, as making it free and helping people use it a bit would have made 90% of customers’ lives easier and would have translated into user happiness and a certain % of sales for training courses to learn more, but heck my day job is to make things work, not maintain sales percentages, so my opinion counts for nowt. *sigh*

(*apologies to real sufferers of Dissociative Identity Disorder, I am using the term in the loose, non-scientific, “common usage” sense of “not sure of my opinion” rather than having truly disparate personalities and memories.** And note, I certainly do not mean schizophrenia which, despite the on-going misunderstanding in public opinion, is rarely anything to do with multiple personality disorders or “split minds” AT ALL, and is more to do with a difficulty in determining between reality and hallucination.)

The Frustrated User’s perspective. November 28, 2009

Posted by mwidlake in Perceptions.

I got the below email from a friend this evening. Said friend does not work in IT. He works in a large organisation with a large and active IT department that might just be forgetting they provide a service as opposed to laying down the law…

****************************************************************
Hi Martin

For the last few weeks since {an edited out software virus disaster} we have been bombarded with unsolicited security policies from I.T. They pop up during the 10-15 minutes it takes to logon to our computers. You then have to download the policy and sign at the bottom to say whether you accept or decline the policy. When I scanned through the 10th policy I was struck by the fact that none of it applied to my area of responsibility except for one small part that had been covered in excruciating detail in one of their previous pathetic attempts at communicating what is expected of us. And all said missives use what looks like a variation of the English language. Having skipped the policy during a number of recent logons I was now being informed that it is “mandatory” to accept the policy or decline it giving a reason. I declined, giving the above observation on the lack of relevance to my role as a reason.

I have now been informed that it is not possible to issue only the relevant policies to individuals (and presumably, having identified this is not possible, they have not bothered trying in the first place?) and in any case there might come a time when I “might” be given a task where the latest I.T. policy applies and therefore I have to be aware of the existence of the policy. I think this latest one was something to do with purchasing software packages from suppliers – although this isn’t entirely clear. There is no way that I would be allowed to purchase software packages, which is a shame as there are off-the-shelf products that do what we require, whereas the in-house system foisted upon us simply does not provide any reliable or useful information whatsoever.

The following scenario occurs to me. I write a policy on controlling legionella – not unreasonable given that we have swimming pools, showers, air con etc. in our premises. I then send a copy to every employee requiring them to open it, expecting them to read it, understand it and accept it, “just-in-case” they get asked to go and run a sports centre. What response do you think I would get?

Although the risk of catching legionella is low, people have died as a result, yet we do not require everyone to sign a policy for this or any of the other more serious hazards they face at work. I am not aware of any software-purchasing-related deaths of late. For dangerous stuff employees sign one policy when they join the organisation. If they have to deal with a hazard we make them aware by warning them about it and if necessary give them additional training, guidance and support so that they can manage the risk in accordance with the overall policy.

Perhaps we have got this wrong. Maybe we should require all computer users (just for example) to complete a workstation assessment online every day when they start work – and if they don’t, their computer should blow up in their face and a guillotine then drop from the ceiling removing their hands so they can’t sue for RSI or eyestrain.

That’ll teach them
************************************************************

I hope I have never been responsible for inflicting enough inconvenience on my users to make them as aggrieved and angry as my friend. Thing is, I now worry that I might have…

Buffer Cache Hit Ratio – my “guilty” Confession November 1, 2009

Posted by mwidlake in Perceptions, performance.

My Friday Philosophy this week was on Rules of Thumb on buffer gets per row returned.

Piet de Visser responded with a nice posting of his own, confessing to using ratios to help tuning {We seem to be playing some sort of blog-comment tag team game at the moment}.

Well, I have a confession so “guilty” or “dirty” that I feel I cannot inflict it on someone else’s blog as a comment.

I use the Buffer Cache Hit Ratio.

And the Library Cache Hit Ratio and the other Ratios.

As has been blogged and forum’d extensively, using these ratios is bad and stupid, and anyone doing so does not know what they are doing as they do not help you solve performance problems. I mean, hell, you can download Connor McDonald’s/Jonathan Lewis’s script to set it to what you want, so it must be rubbish {go to the link and choose “tuning” and pick “Custom Hit Ratio” – it’s a rather neat little script}.

The point I am trying to make is that the Buffer Cache Hit Ratio (BCHR) was once wrongly elevated to the level of being regarded as a vital piece of key information, but the reaction against this silly situation has been that it is now viewed by many (I feel) as the worst piece of misleading rubbish. Again a silly situation.

I think of the BCHR as similar to a heart rate. Is a heart rate of 120 good or bad? It’s bad if it is an adult’s resting heart rate, but pretty good if it is a kitten’s resting heart rate. It’s also probably pretty good if it is your heart rate as you walk briskly. Like the BCHR it can be fudged. I can go for a run to get mine higher, I can drain a couple of pints of my blood from my body and it will go up {I reserve the right not to prove that last one}. I can go to sleep and it will drop. Comparing my resting heart rate to yours (so like comparing BCHRs between systems) is pretty pointless, as I am a different size, age and metabolism to you {probably} but looking at mine over a year of dieting and exercising is very useful. If only I could keep up dieting and exercising for a year…

So what do I think the much-maligned Buffer Cache Hit Ratio gives me? It gives me the percentage of SQL block access, across the whole database activity, that is satisfied from memory as opposed to disc. Or, turned around, its complement is the percentage of occasions a block has to be got from the I/O subsystem. Not how many blocks are read from storage or memory, though, but you can get that information easily enough. As physical IO is several orders of magnitude slower than memory access {ignoring I/O caches, I should add}, it gives me an immediate feel for where I can and can’t look for things to improve.
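For reference, the classic derivation of the BCHR is from v$sysstat; a minimal sketch is below {note the figures are cumulative since instance startup, so deltas between two samples are more meaningful than a single spot value}:

  -- instance-wide Buffer Cache Hit Ratio, as a percentage.
  -- values are cumulative since startup; sample twice and
  -- difference the values for an interval-based figure.
  SELECT ROUND(100 * (1 - phy.value / (db.value + con.value)), 2) AS bchr_pct
  FROM   v$sysstat phy
  ,      v$sysstat db
  ,      v$sysstat con
  WHERE  phy.name = 'physical reads'
  AND    db.name  = 'db block gets'
  AND    con.name = 'consistent gets';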

If I am looking at a system that is overall very slow (eg high process wait queues under linux/unix, or the client has said the system is generally slow) and I see that the BCHR is low, say below 90%, this tells me I probably can get some performance increase by reducing physical access. I’ll go and look for those statements with the highest physical IO and the hottest tablespaces/objects in the DB.
If the BCHR is already up at the 99% level, I need to look at other things, such as tuning sorts, looking at removing activity from the database, and being very mindful of nested loop access where maybe it is not the best access method (very likely due to old stats on tables).

When I have got to know a system and what its BCHR generally sits at, a sudden change, especially a drop, means there is some unusual physical IO going on. If the phones start going and someone is complaining “it’s all slow”, the BCHR is one of the first things to look at – especially as it is available from so many places.

Another thing the BCHR gives me is that, if I am looking at a given SQL statement or part of an application, its specific BCHR can be compared to the system BCHR. This does not help me tune the statement itself, but I know that if its specific BCHR is low then it has unusually high IO demands compared to the rest of the system. Further, reducing that IO might help the whole system, so I might want to keep an eye on overall system throughput. If I reduce the statement’s execution time by 75% and the whole system IO by 1%, the client is likely to be more happy, especially if that 1% equates to other programs running a little faster “for free”.
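A minimal sketch of getting statement-level figures out of v$sql is below {the buffer_gets floor is an arbitrary threshold to skip trivial statements}:

  -- per-statement hit ratio and physical IO, worst physical IO first.
  -- NULLIF guards against dividing by zero for unexecuted cursors.
  SELECT sql_id
  ,      executions
  ,      buffer_gets
  ,      disk_reads
  ,      ROUND(100 * (1 - disk_reads / NULLIF(buffer_gets, 0)), 2) AS stmt_bchr
  FROM   v$sql
  WHERE  buffer_gets > 100000   -- arbitrary floor, adjust to taste
  ORDER  BY disk_reads DESC;

The same query, ordered by disk_reads, is also a quick way to find those statements with the highest physical IO that I mentioned above.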

So, I don’t use the BCHR to tune individual statements but I feel confident using it to track the general health of my database, so long as I am mindful of the impact of new functionality or upgrades. It’s a rule of thumb. It’s a database heart rate. (and so is redo generation and half a dozen other things).

A Tale of Two Meetings – 11GR2 and MI SIG October 5, 2009

Posted by mwidlake in Meeting notes, Perceptions.

Last week I attended two Oracle events, each very different from the other.

The first was an Oracle Corp event, giving details of the new 11GR2 release and what it was introducing. It was in a nice hotel in London with maybe 250, 300 attendees and all quite swish.

The other was a UK Oracle User Group meeting, the last Management and Infrastructure SIG for 2009. 30 people in the Oracle City office and far more unassuming {And note, as I chair the MI SIG, anything I say about the day is liable to bias…}.

Both events were useful to attend and I learnt things at both, but I also found the difference between the two quite interesting.

Oracle 11G Release 2

The official Oracle 11GR2 presentation was where you went for the definitive information on what Oracle Corp feel are the new features of 11G R2 that are of interest (though some of it was not R2-specific but general 11G).

Chris Baker started off by telling us “there has never been a better time” to move to the latest technology or a greater need to gain business advantage through using said latest technology. You know, it would be really nice, just once, to go to such a corporate event and not be given this same thread of pointless posturing. I know it is probably just me being old and grumpy and contrary, but after 20 years in the business I am sick to the hind teeth of Keynotes or Announcements that say the same empty “Raa-Raa” stuff as the previous 19 years – the need “now” to get the best out of your technology has been the same need since the first computers were sold to businesses, so give it a rest. Just tell us about the damned technology; we are smart enough to make our own decision as to whether it is a big enough improvement to warrant the investment in time and effort to take on. If we are not smart enough to know this, we will probably not be in business too long.

Sorry, I had not realised how much the Corporate Fluff about constantly claiming “Now is the time”, “Now things are critical” gets to me these days. Anyway, after that there were some good overviews of the latest bits of technology and, following on from them, some dedicated sessions in two streams on specific areas, split between semi-technical and management-oriented talks, which was nice.

There was plenty of talk about the Oracle Database Machine, which appears to be Exadata version 2 and sits on top of Sun hardware, which is no surprise given the latest Oracle acquisition. I have to say, it looks good: all the hardware components have taken a step up (so now 40Gb InfiniBand interconnect, more powerful processors, even more memory), plus a great chunk of flash memory as Sun’s “FlashFire” technology to help cache data and thus help OLTP work. More importantly, you can get a 1/4 machine now, which will probably make it of interest to more sites with less money to splash out on a dedicated Oracle system. I’ll save further details for another post, as this is getting too long.

The other interesting thing about the new Oracle Database Machine was the striking absence of the two letters ‘H’ and ‘P’. HP was not mentioned once. I cannot but wonder how those who bought into the original Exadata on HP hardware feel about their investment, given that V2 seems only available on Sun kit. If you want the latest V2 features, such as the much-touted two-level disc compression, is Oracle porting them over to the older HP systems? Are Oracle offering a mighty nice deal to upgrade to the Sun systems, or are there some customers with the HP kit currently sticking needles into a clay model of top Oracle personnel?

The other new feature I’ll mention is RAT – Real Application Testing. You can google for the details but, in a nutshell, you can record the activity on the live database and play it back against an 11g copy of the database. The target needs to be logically identical to the source {so same tables, data, users etc} but you can alter initialisation parameters, physical implementation, patch set, OS, RAC… RAT will tell you what will change.
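As a flavour of what is involved, a minimal sketch of starting a capture with the DBMS_WORKLOAD_CAPTURE package is below {the capture name, directory object and duration are made up for illustration, and I am skipping the prerequisites around restart points and filters}:

  -- record the live workload into a directory object for later replay.
  -- RAT_CAP_DIR must be an existing directory object on the server.
  BEGIN
    DBMS_WORKLOAD_CAPTURE.START_CAPTURE(
      name     => 'PEAK_HOUR_CAPTURE',
      dir      => 'RAT_CAP_DIR',
      duration => 3600);           -- seconds; stop after an hour
  END;
  /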

For me as a tuning/architecture guy this is very, very interesting. I might want to see the impact of implementing a system-wide change, but currently this would involve either only partial testing and releasing on a wing and a prayer, or a full regression test on an expensive and invariably over-utilised full test stack, which often does not exist. There was no dedicated talk on RAT though; it was mentioned in parts of more general “all the great new stuff” presentations.

Management and Infrastructure SIG

RAT leads me on to the MI SIG meeting. We had a talk on RAT by Chris Jones from Oracle, which made it clearer that there are two elements to Real Application Testing. One is Database Replay and the other is the SQL Performance Analyzer, SPA. Check out this Oracle datasheet for details.

SPA captures the SQL from a source system but then simply replays the SELECT statements only, one by one, against a target database. The idea is that you can detect plan changes or performance variations in just the Select SQL. Obviously, if the SELECTs are against data created by other statements that are not replayed then the figures will be different, but I can see this being of use in regression testing and giving some level of assurance. SPA has another advantage in that it can be run against a 10g database, as opposed to RAT which can only be run against 11g (though the workload can be captured from a terminal-release 10g or 9i system – that is a new trick).
There are no plans at all to backport RAT to 10, it just ain’t gonna happen guys.
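For the curious, a minimal sketch of driving SPA from PL/SQL is below, assuming the captured statements have already been loaded into a SQL Tuning Set {all the names are hypothetical}:

  -- create a SPA task from a SQL Tuning Set, then test-execute
  -- the statements to get a baseline before making any change.
  DECLARE
    l_task VARCHAR2(64);
  BEGIN
    l_task := DBMS_SQLPA.CREATE_ANALYSIS_TASK(
                sqlset_name => 'CAPTURED_SELECTS',
                task_name   => 'SPA_UPGRADE_TEST');
    DBMS_SQLPA.EXECUTE_ANALYSIS_TASK(
      task_name      => l_task,
      execution_type => 'TEST EXECUTE',
      execution_name => 'before_change');
  END;
  /

A second execution after the change, plus a ‘COMPARE PERFORMANCE’ run, gives you the before/after report.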

The SIG also had an excellent presentation on GRID for large sites (that is, many Oracle instances) and how to manage it all. The presentation came about as a result of requests for a talk on this topic by people who come to this SIG, and Oracle {in the form of Andrew Bulloch} were good enough to oblige.

The two Oracle Corp talks were balanced by technical talks by James Ball and Doug Burns, on flexible GRID architectures and using OEM/ASH/AWR respectively. These were User presentations, mentioning Warts as well as Wins. Not that many Warts though; some issues with licence daftness were about it, as the technology had been found to work and do its job well. Both talks were excellent.

The fifth talk was actually an open-forum discussion, on Hiring Staff, chaired by Gordon Brown {No, not THAT Gordon Brown, as Gordon points out}. Many people joined in and shared opinions on or methods used in getting new technical staff. I found it useful, as I think did many. These open sessions are not to everyone’s taste and they can go wrong, but Gordon kept it flowing and all went very well.

 

The difference between the two meetings was striking. Both had strong support from Oracle {which I really appreciate}. Both included talks about the latest technology. However, the smaller, less swish event gave more information and better access to ask questions and get honest answers. There was also almost no Fluff at the SIG; it was all information or discussion, no “Raa-Raa”. But then, the lunch was very nice and there were free drinks after the Corporate event {we shared rounds at a local pub after the SIG event – maybe one round too much}.

I guess I am saying that whilst I appreciate the Big Corporate event, I get a lot more out of the smaller, user group event. Less fluff, more info. Thankfully, Oracle support both, so I am not complaining {except about the “there has never been a better time” bit, I really AM sick of that :-( ).

So if you don’t support your local Oracle user group, I’d suggest you consider doing so. And if, like so many sites seem to, you have membership but don’t go along to the smaller events, heck, get down there! Some of the best stuff is at these SIG meetings.

Friday Philosophy – Cats and Dogs October 2, 2009

Posted by mwidlake in Perceptions.

I like cats. Cats are great. I don’t like dogs. I’ve been attacked by a nasty bitie dog and that is my reason. And dogs growl at you. And woof.

This is of course unfair, I have been bitten by cats lots more than dogs (seeing as I own cats and have never owned a dog, this is to be expected), cats scratch, cats hiss at you and yowl and they have been known to leave “presents” in my slippers.

My animal preference comes down to personal, even personality, reasons as opposed to logic. A dog needs attention and a walk twice a day; they follow you around, always want attention and tend to be unquestioning in their affection. Cats can often take you or leave you, will come when called only if they had already decided to come over, and the issue of who owns whom is certainly not clear. If you do not keep your cat happy, there is always Mrs Willams down the road who Tiddles can up and go and live with instead.

These same illogical preferences riddle IT, I think. People make decisions for what I sometimes term “religious” reasons. As an example, I’ve worked with a lot of people who are either strongly for or against Open Source. There are logical and business reasons for and against Open Source, but it seems to me that many people have decided which they prefer for personal reasons {often, Open Source people tend towards anti-establishment and anti-corporation views; Open Source detractors tend towards supporting business and personal wealth}. They will then argue their corner with the various pros and cons, but you know there is no swaying their opinion as it was not derived from logic.

In the same way I will not stop preferring cats to dogs. And I know I personally have a couple of Religious decisions about IT that are not based on cold logic {And I am not changing them, OK!}.

I think it helps to realise that people do make decisions this way (some make most of them this way, most make some decisions this way) and it’s not worth getting that angry or annoyed when someone seems to be intractable in their stance against your ideas. After all, you might have made a “religious” decision which side you are on and they can’t understand why you don’t agree with them :-)

Opinions formed in this manner are difficult to change. They can and do change, but usually only over time and in a gradual way, certainly not from someone telling them they are an idiot for preferring Sybase to Ingres and verbally berating them with various arguments for and against.

So, if it is only a work thing {and heck, computers and software really are not that important} be passionate, but try and be a little flexible too.

This post was, of course, just a shallow excuse to include a link to a Cat thing – my favourite cat animation. Sorry Dog lovers {It’s your own fault for liking nasty, smelly dogs}.

Testing is Not Just for Code. September 16, 2009

Posted by mwidlake in Architecture, VLDB.

Someone I am currently working with has a wonderful tag line in her emails:

Next time we want to release untested, why don’t we just release undeveloped?

Testing is not limited to testing code, of course. I have recently posted about how a backup is not a backup until you have tested it with a practice recovery. How you think the database will work by looking at the data dictionary is just a nice theory until you run some actual tests to see how the database responds, as I have been doing with Histograms lately. Sadly, you could even say an Oracle feature is not an Oracle feature until you have tested it.

In my experience, this is particularly true when you test the edges of Oracle, when you are working on VLDBs {Very Large DataBases}.

Last month Jonathan Lewis posted about a 2TB ASM disc size bug, where if you allocated a disc over 2TB to ASM, it would fill it up, wrap around and write over the beginning of the file. This week I heard from some past colleagues of mine that they hit this very same bug.
With these very same colleagues we hit a bug in 10.1 where you could not back up a tablespace over 8TB in size with RMAN {I can’t give you a bug number for it as we were working with HP/Oracle direct at the time and they “handled it internally”. But when I mentioned it to him, Jonathan found a similar one, bug 5448714, which stated a 4TB limit on backups. It could be the same bug}.

Yet another VLDB issue was when we wanted to move just under one thousand tablespaces from one database to another {again, 10.1}, using transportable tablespaces. We tried to use the utility for checking you are working on a consistent set of tablespaces, but it could not cope with that many. And to plug them into the new database you have to export the metadata, where we found a 4000 character limit on the variable stating the tablespaces to transport. That’s only a few characters per tablespace name, once you allow for the commas to delimit them… Yes, you could manage if you renamed all tablespaces to AA, AB, AC… BA, BB, BC etc. If memory serves, the problem was with Data Pump export and we reverted to old-style export, which did not have the problem.
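For what it is worth, the consistency check itself {DBMS_TTS.TRANSPORT_SET_CHECK, unless my memory fails me} is simple enough to invoke. The tablespace names below are placeholders, and that comma-delimited list is exactly where the length problem bites once you have hundreds of entries:

  -- check the listed tablespaces form a self-contained set
  BEGIN
    DBMS_TTS.TRANSPORT_SET_CHECK('TS_DATA_01,TS_DATA_02,TS_DATA_03', TRUE);
  END;
  /
  -- objects with dependencies outside the set are reported here
  SELECT * FROM transport_set_violations;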

Another limit I’ve blogged on is that the automated stats job chokes on very large objects.

Some data dictionary views can become very slow if you have several tens of thousands of tables/extents/tablespaces/indexes.

I can appreciate the issues and problems Oracle has with testing their code base; it is vast, people use the software in odd ways and it has to run on many platforms. You might also feel I am being picky by saying Oracle breaks a little when you have 8TB tablespaces or a thousand tablespaces. But:

  • Oracle will say in big, glossy presentations, you can build Petabyte and Exabyte databases with Oracle {and have a product called Exadata, don’t forget}.
  • More and more customers are reaching these sizes as data continues to grow, for many sites faster than Moore’s law.
  • Some of these limits appear with databases well below a Petabyte (say a tiddly small 50TB one :-) ).

I’ve been running into these issues with VLDBs since Oracle 7 and they are often with pretty fundamental parts of the system, like creating and backing up tablespaces! I think it is a poor show that it is so obvious that Oracle has been weak in testing at VLDB sizes before release.

I wonder whether, with 11gR2, Oracle actually tested some petabyte data sizes to see if it all works? After all, as is often said, disk is cheap now, I’m sure they could knock one up quite quickly…
