
VLDB Backups October 13, 2009

Posted by mwidlake in Architecture, VLDB.
9 comments

One of the indications that your database classes as a VLDB is that your backups are giving you problems simply due to the size of the database.

As some of you will know from previous posts of mine, like this one about how vital it is to prove your backup and this one about how you maybe need to back up more than just the database, I have a thing about backups. Or, more specifically, recovery.

My focus on backups may surprise some people who have worked with me as I often state “I am not a production DBA”. {And no Dave, this is not me saying I am too good to be a proddy DBA, it is me saying I do not have current, strong skills in many of the daily proddy DBA tasks}. However, I am an architect. Whenever I work on a new system or heavily modify an existing system, I try and keep the need for backup and recovery at the front of my mind.

The most common issue encountered with backing up a VLDB is the time it takes to run the backup: it can’t be done in the time between backup cycles, usually a day.

The second most common issue is the impact on the live system of running the backup. Sometimes this impact is overstated, after all if the backup is one thread running on the server it is only going to consume as much resource as one thread can, which may well leave enough over for the daily processing requirements, but usually for large systems steps have been taken to run the backup in parallel, thus creating considerable load on the system.

A third issue, which is related to the first, is that the backup takes so long and uses so many tapes (or space) that it rarely completes – a network glitch, a failure of the backup suite, running out of media, all stop the backup finishing. I’ve been in the situation of attempting 4 or 5 backups for each one that succeeds as something crops up in the 2 or 3 days it takes to run the backup. {In our case it was the flaky backup software, grrrrr}.

The final issue I’ll mention is one that is often overlooked. You can’t afford the time to recover the backup if it was ever needed. I’ve seen this especially with export or expdp-based backups – some sites still use export and it has its place with smaller systems – often it seems to be used with OLTP systems that have more than 75% of the database volume as indexes. The export runs fine overnight, as it is only processing the 25% of the system that is data. But when you ask the client if they can wait 5 days to import the export they go pale. This time-to-recover can also be a problem with RMAN backups, as you need to read in everything you wrote out.

I’ve said it before but I’m going to say it again – a backup is not a backup until you have done a successful test full recovery. This would certainly highlight how long your recovery takes.

So, how do you solve the problem of backing up a VLDB?

Well, one solution is to not bother. I know of a couple of sites that have two physical copies of the database, at different locations, and write all data to both. If they lose one copy, they can keep running on the other copy whilst the lost version is rebuilt. Your swap-over could be almost instant.
Drawbacks here are:

  • If you lose one copy you have no redundancy until the second system is rebuilt. This is like losing a disk out of a RAID5 array, another failure is disaster. As databases get bigger, this period of zero redundancy gets longer and thus the chance of a second failure increases (which again is just like the RAID5 array – yet another argument against massive discs).
  • As you write to both systems, if the damage is caused by the feed (eg accidentally deleting data) then both are damaged, unless you have a delay on one system, in which case you now have issues with catching up on that delay if you have to swap to the second system. Flashback may save you from damage caused by the feed.
  • The cost of the second system and the complexity of the double-writes can both be issues.

Another solution is physical DataGuard. I see this as slightly different from the double-system approach above as you have more options, such as replicating to more than one other system, opening and reading the DataGuard copy, opening and using the copy before flashing it back and re-recovering, even Active DataGuard, where you can have the standby database open and being used, even whilst it is kept up-to-date. Again, you can set things up so that the gap between primary system failure and bringing up a new production system is small. A few issues to be mindful of are:

  • You have to ensure that your primary database is running in forced logging mode or you are extremely, and I mean extremely, careful about what you do against the database that is unrecoverable. The latter option is just asking for trouble actually. Which is a shame, as all those performance tricks of doing direct IO, append operations and nologging activities to help you process all the data in your VLDB are no longer available to you. This might be a show-stopper.
  • You have to take care in setting it all up and may need an extra licence.
  • You still have the issue of damage being transmitted to your “backup” before you spot it.
  • The main issue? Someone will get clever and use your DataGuard systems for other things {such as opening the standby to change it and then flashing the data back, or using Active DataGuard for reporting which becomes critical to your business} and now you actually have a production critical system split across the DataGuard architecture. It has stopped being a backup, or at least a dedicated backup. Ooops.
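
To make the forced logging point concrete, here is a minimal sketch of the checks I would run on the primary. It assumes you have access to the v$ views and the privilege to alter the database; nothing here is specific to any one site.

```sql
-- Is the primary in forced logging mode? YES means it is.
select force_logging from v$database;

-- Switch forced logging on. From then on NOLOGGING operations
-- still generate redo, so the standby sees them.
alter database force logging;

-- Datafiles that have already had unrecoverable (nologging) changes,
-- and when the most recent one happened. Anything listed here that is
-- more recent than your last backup or standby refresh is suspect.
select file#, unrecoverable_change#, unrecoverable_time
from   v$datafile
where  unrecoverable_time is not null
order  by unrecoverable_time;
```

If the last query returns rows dated after the standby was built, the standby may hold blocks it cannot recover, which is exactly the trouble described above.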

There is actually no need to back up the whole database every night, or even every week, though some sites seem fixated on achieving this. There is nothing wrong in having an RMAN level 0 {zero} backup that is a copy of everything and then just backing up the archived redo logs for eg 2 weeks before doing another level 0, so long as you thoroughly test the recovery and ensure you can recover the level 0, get hold of all those redo logs and apply them in a manner timely enough to support your business. I’ve recovered a level 0 backup over a month old and then run through all the archived redo logs to recreate the system; it worked fine as the volume of redo was pretty small compared to the database. Some considerations with this method are:

  • If you ever have trouble getting files from different days out of your backup area, or occasionally find files from your backup system are corrupt, do not even think of using this method. One missed archive redo file from 13 days back and you are in serious trouble.
  • You need to do those level zero backups and they take a while. Remember what I said about issues during a long backup?
  • It can get complex.
  • There is going to be a fairly significant delay in recovering your system.
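
Whether this approach is viable depends largely on how much redo you would have to roll through compared to the size of the database. A rough sketch for sizing that, assuming you can query v$archived_log and dba_data_files and that destination 1 is your local archive destination (that last part is an assumption, check your own log_archive_dest_n settings):

```sql
-- Redo archived per day over the last 14 days, in GB.
-- dest_id = 1 is an assumption; filtering on one destination avoids
-- double counting when logs are archived to several places.
select trunc(completion_time)                             as log_day,
       count(*)                                           as logs,
       round(sum(blocks * block_size) / power(1024,3), 1) as redo_gb
from   v$archived_log
where  dest_id = 1
and    completion_time >= trunc(sysdate) - 14
group  by trunc(completion_time)
order  by 1;

-- Total size of the datafiles, in GB, for comparison.
select round(sum(bytes) / power(1024,3), 1) as db_gb
from   dba_data_files;
```

If two weeks of redo is a small fraction of the datafile total, recovering a level 0 and applying the logs is at least plausible. You still have to prove it with a real test recovery.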

There are several options with RMAN of doing incremental and cumulative incremental level 1 backups against a level 0 baseline backup. They have the same pros and cons as above, often trading more complexity for shorter recovery times. All good so long as you practice the recovery.

Physical copy at the storage level. These solutions seem to come and go every few years, but the principle is usually either (a) splitting mirrors – you have eg 3 copies of the data duplicated across the storage, you can un-couple one copy and do to it what you want, like copy it to tape, and then reintroduce the copy and catch up on changes, ie “resilver” the mirror; or (b) using fancy logging within the storage layer to create a logical copy of the whole live DB at a point in time by tracking and storing changes. You can then take your time copying that logical version to your backup destination. Taking the initial copy is usually instantaneous and with (b) can take up a surprisingly small amount of space. Disadvantages?

  • Cost. These clever IO units that can do this seem to be eye-wateringly expensive
  • Tie-in. You move storage provider, you need to re-plan and implement a new backup strategy
  • Probably personal this one, but can you trust it?  I saw it go horribly wrong in about 1998 and even now I kind of wince internally thinking about it. 

Export and Import. OK, I kind of rubbished this approach earlier and who in their right minds would try and export/import a VLDB of 20TB? You don’t. You export the critical few percent of the database that you need to recreate some sort of temporary production-capable system. Many applications can actually get by with all the reference/lookup data and the latest month or two of active business data. It gets a workable system up and running to keep your business process ticking over whilst you sort out recovering the rest of the system. The beauty of an export is that it can be imported to any working Oracle database of a high enough release level.

3 months ago I would have said this consideration needed to have been designed into your system architecture from the start, to stand any reasonable chance of working, but I know of one site that managed just this technique recently. Only because they absolutely had to, but they managed it.

My final backup methodology I’m going to mention here is – you do not need to back up all of your database in the same way. If you can move a large percentage of your database into readonly tablespaces, you can back up that section of the database once {disclaimer, by once I mean two or three times to two or three places and check you can read what you wrote and padlock the door to the vault it is in, and repeat said once-only backup every 6-12 months} and drop that section out of your backup. Now you only need to back up the remaining, hopefully small, active section of the database with whatever method you want. You can tie in the previous approach of only needing to recover a critical subset of the system to get going again, ie what is not readonly; the two approaches complement each other. A few issues:

  • It only really works when you design this into the system from the start.
  • Potentially complex recovery spread over a lot of tapes. Can you rely on being able to get at them in a timely manner?
  • People have a habit of wanting to update some of the stuff you made readonly. Sometimes only a few records but spread over a lot of readonly tablespaces.
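
The mechanics of the readonly approach are simple enough. A minimal sketch, where hist_2008 is a hypothetical tablespace name standing in for whatever holds your data that no longer changes:

```sql
-- hist_2008 is a made-up name for a tablespace of static history data.
alter tablespace hist_2008 read only;

-- Which tablespaces are readonly, and so candidates for the
-- "back up once, properly, then leave alone" treatment?
select tablespace_name, status
from   dba_tablespaces
where  status = 'READ ONLY'
order  by tablespace_name;

-- When someone insists on changing a few records:
alter tablespace hist_2008 read write;
-- Once it is made readonly again, that tablespace needs a fresh
-- "once-only" backup, as the old one no longer matches.
```

That last step is where the third issue above bites: a handful of updates spread over many readonly tablespaces means repeating the once-only backup for every one of them.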

All the above can be mixed and matched to come up with an overall solution. Personally I like having a physical standby database for immediate failover and an incremental backup off site for recovery beyond some situation that gets both primary and standby database.

Friday Philosophy – Do I think Oracle is Rubbish? October 8, 2009

Posted by mwidlake in Blogging, Perceptions.
1 comment so far

This should be a “Friday Philosophy” posting really, but heck it’s my blog, I can do what I want {quiet smile}. Besides, by the time I finish this, it might well BE Friday. Oh, heck, I’ll just change the title now to a Friday Philosophy one…

I’ve been reviewing some of my blog this week {it is coming up to 6 months since I started so I was looking back at how it has gone}. Something struck me, which is I can be pretty negative about Oracle software and even Oracle Corp at times.

I mostly seem to pick up on oddities, things that do not work as they first seem, even outright bugs. I do not often post about “this is how this cool Oracle feature works” or “I used this part of Oracle to solve this problem”. Partly the reason is that there are a lot of blogs and web pages about “how this feature works”, so the need is generally already met. Partly it is that I, like most people, am more interested in exceptions, gotchas and things going wrong. If it works, heck you just need to read the manual don’t you?

So, do I like Oracle?

Yes. Overall I really like working with Oracle. This is because:

  • I can store and work with pretty much whatever data I have ever needed to with Oracle. It is rare for me to be utterly stumped how to achieve something, though it could take time and maybe be a tad slow or a little inelegant, but it can be done.
  • Despite my recent complaints, you can chuck a hell of a lot of data at Oracle. Back in 2002 I was asked if I could put 7 or 8 Terabytes of data into an Oracle database. I did not even pause before saying “Yes!” – though I knew it would be a big job to do so in a way that was maintainable. I’d now feel the same about a couple of hundred TB.
  • The core technology works really well. We all complain about bits and pieces admittedly, but if I have a complex SQL statement with 15 tables and 25 where clauses, I don’t worry about the database giving me the wrong answer, I worry about the developer having written it wrongly {or Oracle running it slowly, but that keeps me in work, hehe.}. I can back up Oracle in many ways and, once I have proven my recovery, I know I can rely on the backup continuing to work, at least from an Oracle perspective. I’ve never yet lost any production data. Do I worry about transactional consistency? Never. Maybe I should, I’ve seen a couple of blogs showing how it can happen, but in my real-work life, I never even think about it.
  • Oracle does continue to improve the core products and they will listen to the community. It might not seem like it at times, I know, but they do. It can just take a long time for things to come through. As an example, I worked with the Oracle InterMedia developers back with the Oracle 10 beta program in 2003. They {well, to be specific, a very clever lady Melli Annamalia} were adding stuff back then that we and others needed that did not get to see the light of day in 10GR1, but was there as  a load of PL/SQL to do it in 10GR2. Melli said she was adding it into the code base as ‘C’ as well but it would take a while. It did, I think it was part of the 11G release.

Will this stop me complaining and whining on about bits of Oracle I don’t like or that do not work as they should? Absolutely not. As Piet de Visser said on a comment to one of my recent blogs, it is beholden on us Users to keep Oracle Corp honest. But I thought I ought to mention, at least once, that I do actually like Oracle.

I Like Oracle, OK?

Grudgingly :-)

Big Discs are Bad September 27, 2009

Posted by mwidlake in development, performance, Uncategorized.
8 comments

I recently came across this article on large discs for database by Paul Vallee. The article is over 3 years old but is still incredibly valid. It’s a very good description of why big discs are a problem for Oracle Database Performance. {Paul also introduces the BAHD-DB campaign – Battle Against Huge Disks for Databases, which I quite like}.

To summarise the article, and the problem in general, IT managers will buy big discs as they provide more GB per pound sterling. It saves money.
However, fewer discs is Bad For Performance. As an extreme example, you can now buy a single disc that is a TB in size, so you could put a 1TB Oracle database on one such disc. This one disc can only transfer so much data per second and it takes this one disc say 10ms to search for any piece of data. If you want the index entry from one place and the table row from another, that is at least two seeks. This will not be a fast database {and I am not even touching on the consideration of disc resilience}.

Now spread the data over 10 discs. In theory these 10 discs can transfer 10 times the total data volume and one disc can be looking for information while the others are satisfying IO requests {This is a gross over-simplification, but it is the general idea}.
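
To put rough numbers on the one-disc versus ten-disc argument, here is a back-of-envelope calculation using the 10ms seek figure from above. It assumes every IO costs one full seek and that work spreads perfectly across the discs, which is the same gross over-simplification already confessed to:

```sql
-- 10ms per seek, two seeks (index entry + table row) per row fetched.
select 1000 / 10            as seeks_per_sec_one_disc,  -- 100
       1000 / (2 * 10)      as rows_per_sec_one_disc,   -- 50
       10 * 1000 / (2 * 10) as rows_per_sec_ten_discs   -- 500
from   dual;
```

Fifty indexed row fetches a second from a 1TB disc is the sort of figure that tends to get an IT manager's attention.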

IT Managers will understand this 1-to-10 argument when you go through it.

Kind of.

But then discussions about how many modern “fast” discs are needed to replace the old “slow” discs ensue. It can be very, very hard to get the message through that modern discs are not much faster. A 1TB disc compared to a 4-year-old 100GB disc will not have a transfer speed 10 times faster and it will certainly not have a seek time ten times less; chances are the seek time is the same. And then there is the discussion of how much impact the larger memory caches of modern storage units have. Answer: (a) quite a lot so long as it is caching what you want and (b) even if it is perfectly caching what you want, as soon as you have read a cache-sized set of data, you are back to disc IO speed.

Bottom line. Disc Drives are now slower in proportion to the disc acreage than they used to be.

Anyway, I am certainly not the only person to have had these discussions, though I have had them for longer than most {due to my accidental work history of having worked on VLDBs for so long}. There are certainly practitioners of Oracle Arts who understand all of this far better than I and one of them, James Morle, recently started blogging. It’s looking good so far. If he keeps it up for a month, I’ll put him on my blog roll :-)

There is, however, one aspect of the Big Disc Performance issue that does not seem to get much mention but is something I have suffered from more than a couple of times.

As a Database Performance person you have had the argument about needing spindles not disc acreage and won. The IT manager buys enough spindles to provide the I/O performance your system needs. Success.

However, the success has left a situation behind. You need 10 spindles over a couple of RAID 10 arrays to give you the IO you need. 250GB discs were the smallest you could buy. So you have 1.25TB of available storage (RAID 10 halves the storage) and have a 500GB database sitting on it. There is 750GB of empty storage there…
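
For anyone who wants to check the sums in that example, they work out as follows (all figures in GB, taken straight from the paragraph above):

```sql
select 10 * 250           as raw_gb,     -- 2500
       10 * 250 / 2       as usable_gb,  -- 1250, RAID 10 mirrors everything
       10 * 250 / 2 - 500 as spare_gb    -- 750
from   dual;
```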

That 750GB of empty storage will not be left inviolate. Someone will use it. Someone will need “a bit of temporary storage” and that nice chunk of fast storage will be too inviting. Especially if it IS fast storage. It will be used.

Now your database, whose storage you specified to support said database, is sharing its storage with another app. An app that steals some of your IO and potentially {heck, let’s say this straight, WILL} impact your database performance. And the galling thing? Twice, I had no idea my storage had become shared until I started getting odd IO latency issues on the database.

You may be able to make a logical argument for the spindles you need at design time. But you have almost no chance of protecting those spindles in the future. But who said working life was easy? :-)

Friday Philosophy – A Comment on Comments September 25, 2009

Posted by mwidlake in development, internals.
3 comments

This blog is not about blog comments. It’s about table and column comments in the data dictionary.

Some of you may well be going “huh?”. Others are probably going “Oh yes, I remember them?”. Table and column comments appear to be suffering the same fate as ERDs, whose slow demise I bemoaned a couple of weeks ago. They are becoming a feature not known about or used by those with “less personal experience of history” {ie younger}.

It’s a simple principle: you can add a comment against a table or a column, up to 4000 characters. {You can also add comments against index types, materialized views and operators (huh?), at least in 10.2}.

comment on table widlakem.person is
'Test table of fake people for training purposes, approx 50k records';
Comment created.

select * from dba_tab_comments
where owner='WIDLAKEM'
and table_name = 'PERSON';

OWNER      TABLE_NAME                     TABLE_TYPE
---------- ------------------------------ -----------
COMMENTS
--------------------------------------------------------------------------
WIDLAKEM   PERSON                         TABLE
Test table of fake people for training purposes, approx 50k records

--

comment on column widlakem.person.second_forename is
'null allowed, second or middle name. If more than one, delimited by / character';

select * from dba_col_comments
where owner='WIDLAKEM' and table_name = 'PERSON'
and column_name='SECOND_FORENAME';

OWNER      TABLE_NAME                     COLUMN_NAME
---------- ------------------------------ ------------------------------
COMMENTS
-------------------------------------------------------------------------------
WIDLAKEM   PERSON                         SECOND_FORENAME
null allowed, second or middle name. If more than one, delimited by / character

So you can stick basic information into the data dictionary where it will remain with the object until the object is dropped or the comment is updated. You can’t drop a comment, you can only update it to '' (an empty string):

comment on table widlakem.person is '';

It’s simple, it’s sensible, it’s solid.

And it seems to be dying out. In fact, I had stopped adding comments to my tables and columns as no one else seemed to bother. Probably as a consequence of them not being added, no one ever seemed to think to look at them to get hints about the database structure and table/column use.

But Raj sitting next to me is as old a hand at this game as myself and I “caught” him adding comments to the little schema we are working on together. Well, if he is bothering, so will I!

How about Oracle Corp? How do they manage on this front? After all, the Oracle Reference manual has all these short descriptions of tables and columns in the data dictionary {some helpful, some utterly unhelpful}:

select owner,count(*) from dba_tab_comments
where owner in ('SYS','SYSTEM','SYSMAN')
group by owner;
OWNER        COUNT(*)
---------- ----------
SYSTEM            151
SYSMAN            472
SYS              3894

3 rows selected.

Heyyyy, nice Oracle.

select owner,table_name,comments
from dba_tab_comments
where owner in ('SYS','SYSTEM','SYSMAN');

OWNER      TABLE_NAME
---------- ------------------------------
COMMENTS
-----------------------------------------------------
SYS        ICOL$

SYS        CON$

SYS        FILE$

SYS        UET$

SYS        IND$
SYSTEM     MVIEW_RECOMMENDATIONS
This view gives DBA access to summary recommendations
SYSTEM     MVIEW_EXCEPTIONS
This view gives DBA access to dimension validation results
SYSTEM     AQ$_QUEUE_TABLES
SYS        SEG$

SYS        COL$

SYS        CLU$
SYSTEM     SQLPLUS_PRODUCT_PROFILE

SYSTEM     PRODUCT_PRIVS

SYSTEM     HELP

SYSMAN     MGMT_NOTIFY_QTABLE

SYSMAN     AQ$MGMT_NOTIFY_QTABLE_S

Oh. Lots of blanks. Not so nice Oracle. No, scrub that, several lines are not blank, so Not a bad attempt Oracle.

Why all the blanks? Why have Oracle set blank comments? That’s because a blank table comment gets created when you create a table, and a blank column comment is created per column.

create table mdw_temp (col1 number);
Table created.

select * from dba_tab_comments where table_name = 'MDW_TEMP';
OWNER      TABLE_NAME                     TABLE_TYPE
---------- ------------------------------ -----------
COMMENTS
----------------------------------------------------------------------
WIDLAKEM   MDW_TEMP                       TABLE

1 row selected.

select * from dba_col_comments where table_name='MDW_TEMP';
OWNER      TABLE_NAME                     COLUMN_NAME
---------- ------------------------------ ---------------
COMMENTS
-------------------------------------------------------------------------
WIDLAKEM   MDW_TEMP                       COL1

1 row selected.

So what populated system-like comments do we have?

select owner,count(*) from dba_tab_comments
where owner in ('SYS','SYSTEM','SYSMAN')
and comments is not null
group by owner;

OWNER        COUNT(*)
---------- ----------
SYSTEM             73
SYSMAN             15
SYS               944

OK, there are some, and as you can see below, some are more useful than others…

OWNER      TABLE_NAME
---------- ---------------
COMMENTS
----------------------------------------------------------------------------------------------------
SYS        SYSTEM_PRIVILEG
           E_MAP
Description table for privilege type codes.  Maps privilege  type numbers to type names
SYS        TABLE_PRIVILEGE
           _MAP
Description table for privilege (auditing option) type codes.  Maps privilege (auditing option) type
numbers to type names
SYS        STMT_AUDIT_OPTI
           ON_MAP
Description table for auditing option type codes.  Maps auditing option type numbers to type names
SYS        RESOURCE_MAP
Description table for resources.  Maps resource name to number
SYS        SESSION_PRIVS
Privileges which the user currently has set
SYS        SESSION_ROLES
Roles which the user currently has enabled.
SYS        ROLE_SYS_PRIVS
System privileges granted to roles
SYS        ROLE_TAB_PRIVS
Table privileges granted to roles
SYS        ROLE_ROLE_PRIVS
Roles which are granted to roles
Oracle_DatabaseInstance contains one entry for each Oracle Instance that is
centrally managed.  A Real Application Cluster has one entry for each of the
instances that manipulate it.  Instances of Oracle_DatabaseInstance are created
using the database instances that are known to the Oracle Enterprise Manager
repository.
SYS        DBA_AUDIT_OBJECT
Audit trail records for statements concerning objects, specifically: table, cluster, view, index, se
quence, [public] database link, [public] synonym, procedure, trigger, rollback segment, tablespace, r
ole, user
SYS        USER_AUDIT_OBJECT
Audit trail records for statements concerning objects, specifically: table, cluster, view, index, se
quence, [public] database link, [public] synonym, procedure, trigger, rollback segment, tablespace, r
ole, user
SYSMAN     ORACLE_DATABASESTATISTICS
Oracle_DatabaseStatistics provides current information about the statistics for
a database.  Database statistics pertain to the database and have the same
value regardless of the database instance that is used.
SYSMAN     ORACLE_DBINSTANCESTATISTICS
Oracle_DBInstanceStatistics contains statistics for a database instance.  These
are retrieved from the Oracle Managment Repository that is managing the
database upon request from a managment client.

If you don’t add comments to tables and columns, you will just have blank entries for them in the data dictionary.

So why not pop a few real comments in there, especially for any tables or columns where the name is not really telling you what that column or table is for? It’s free and easy, and it might just prove useful. And if you add them to your tables, I’ll add them to mine.
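
If you want to see how much commenting you have left to do, a simple sketch along these lines will list the gaps in your own schema. It relies on the blank comments described above coming back as null, and uses the USER_ versions of the views so no special privileges are needed:

```sql
-- Tables in your own schema with no comment.
select table_name
from   user_tab_comments
where  table_type = 'TABLE'
and    comments is null
order  by table_name;

-- Columns (of your tables and views) with no comment.
select table_name, column_name
from   user_col_comments
where  comments is null
order  by table_name, column_name;
```

Start with the columns whose names give nothing away; those are the ones where a comment earns its keep.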

What is a VLDB? September 18, 2009

Posted by mwidlake in Architecture, VLDB.
10 comments

In a post just a couple of days ago on testing, I complained that VLDBs break at the edges. Coskan posted a comment asking that I blog on what I consider to be a VLDB and thus what a VLDB DBA is, and I am more than happy to oblige, especially as Coskan not only comments a lot but provides a really good summary of blogs on his own blog. {In fact, I need to add Coskan to my blog roll, something I have been meaning to do for ages}.

Hopefully, this will link to the comments section of that post as Piet de Visser added a wonderful comment answering the question for me. Go and read, I suspect it is better than my attempt here!

VLDB stands for Very Large DataBase. It is not an acronym I like as it sounds suspiciously like a sexually transmitted disease, but maybe that is just a problem with my mind. The term ULDB appeared for a while but seems to have failed to gain traction. U stands for “Ultra” of course.

So what is a Very Large DataBase?

A VLDB is a database whose very size gives you, as a DBA or database architect, extra work.

Maybe a simpler rule that you can apply is “you can’t back up the database in 24 hours using standard tools”. You can chuck more tape drives and IO channels at a DB but you will quickly hit a limit where your infrastructure or budget can’t cope.

Another, and one I pinch outrageously from Piet is “you can’t afford to duplicate the database for test/QA purposes”. That leads to a whole raft of issues.

I put forward another definition of a VLDB in a comment on the original blog. “Any database whose size makes 8 out of 10 DBAs comment ‘that is a big database’.” That definition takes into account whether a database is generally beyond the experience of most DBAs/Designers. Why do I think that is significant? Because it means most DBAs/Designers will not have worked with a database that size and thus dealt with the associated problems. The database engine may {or may NOT, as I have complained about} cope with the database size, but you need staff to design it and look after it who know how to do so.

The definitive size of a VLDB, of course, goes up year by year. A few weeks ago I found a document I have mentioned in presentations a couple of times, an internal White Paper by Oracle Corp on what a VLDB is, written around 1994. Next time I am at home I’ll scan it. If I remember correctly, at that time 30GB and beyond on a VMS or Mainframe system was considered a VLDB and, in Unix (the new kid on the block back then), 20GB was the threshold.

Right now, as of September 2009, I would judge any database over 10TB of used space is a VLDB. In 12 months, that will be 20TB. In another 12 months, 40 or maybe 50TB.

“Moore’s Law” traditionally states that compute power doubles every 18 months, but I have just suggested that the VLDB limit doubles every 12 months. I say that as, over the last 10 years, I have worked on several systems, systems most DBAs would consider as “challengingly large”, which double in 12 months or less. Data is exploding. More and more of us are dealing with VLDBs.
This “doubling in 12 months” was not the case (I think) back in 1995; it started in 2000 or so. Before then, database size was doubling in line with Moore’s law or more slowly, I would say, but that is only my opinion.

What changed? People swapped from thinking you could store only what you really needed to thinking you could store “everything”. Which is “everything” your Moore’s-law expanding CPUs can process PLUS all the summary and metadata you extract from that data.

I could be wrong in my figures though. If you took size as the determining factor and doubled 20GB every 18 months from 1994, you would now class a VLDB, in 2009, as approx 20TB.
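
That 20TB figure is easy enough to check. Taking the 20GB Unix threshold from 1994 and doubling it every 18 months up to 2009:

```sql
-- 15 years at one doubling per 18 months is 10 doublings.
select (2009 - 1994) * 12 / 18                       as doublings,  -- 10
       20 * power(2, (2009 - 1994) * 12 / 18)        as vldb_gb,    -- 20480
       20 * power(2, (2009 - 1994) * 12 / 18) / 1024 as vldb_tb     -- 20
from   dual;
```

So straight Moore’s-law growth from the 1994 threshold lands on 20TB today, against my own estimate of 10TB. Same ballpark, which is about as much as a rule of thumb can claim.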

What main issues do you face with a VLDB?

  • Backing up the database. With a VLDB, a daily backup of everything via RMAN or Hot Backup is simply not possible, as you can’t run the backup in 24 hours. You need to: back up less often; back up only part of the DB; use hardware such as mirror splitting or deltas; or use some other trick like, say, never backing it up but having 3 standbys. I’ve seen it done.
  • Performance. You need to consider radical changes such as removing RI or designing around full table scans and ignoring the block buffer cache for the largest tables.
  • The number or size of objects starts causing bits of Oracle to break or work less efficiently (so many tables it takes 2 minutes to select them all or you hit an unexpected limit like the 2TB disk size in ASM, because you need to use bigger disc sizes as otherwise you need more discs than you can easily manage).
  • Maintenance tasks become a challenge in their own right. This could be stats gathering, it could be adding columns to a table, it could be recreating global indexes, all of which now take more time than you can schedule in the maintenance windows {so part of the definition of a VLDB could be down to how active a database is and how small your maintenance windows are – 1TB could be a VLDB if you can never spend more than an hour doing anything!}
  • GUIs are no use to you. Listing all the tablespaces in your database with OEM is a pain in the proverbial when you have 962 tablespaces. You can’t keep track of all of them, visually.
  • You can’t properly test or prototype as you cannot afford to create a full sized test system

I’d like to pick up that last point. With a VLDB, you often end up doing things on the live system that you have been unable to test or prove because you simply lack a test system that is even within an order of magnitude of the size of your live system. RAC is a particular issue: it seems many sites are happy to have the live system as a RAC system but not the test or development systems. When you raise the issue, the response is often “well, there is not that much difference between RAC and non-RAC systems is there?”. You are not allowed to get violent with the client, or even deeply sarcastic. Yes, there is a huge difference.

A VLDB DBA is someone who has had to consider the above for more than a few months, or on more than one system. Or who simply cries when you mention Oracle breaking when it exceeds size limits.

How do you know when you are dealing with a ULDB? When you can find no one else who will speak publicly about a database bigger than yours. When I was working with the Sanger Institute on my pet “it is really quite huge” database I would often have these frustrating conversations with Oracle Corp:

“X is giving me trouble as the database is just so large”

“It is not that large, we have customers with bigger databases”

“Can you introduce me, so we can talk about these issues?”

“Errr, no, they don’t like to talk about their systems”.

Thanks. Thanks a bunch.

Remember, there is always someone with a bigger DB than you. But they probably won’t talk about it.

 

Enough for tonight….

Testing is Not Just for Code. September 16, 2009

Posted by mwidlake in Architecture, VLDB.
Tags: , , ,
7 comments

Someone I am currently working with has a wonderful tag line in her emails:

Next time we want to release untested, why don’t we just release undeveloped?

Testing is not limited to testing code of course. I have recently posted about how a backup is not a backup until you have tested it with a practice recovery.  How you think the database will work by looking at the data dictionary is just a nice theory until you run some actual tests to see how the database responds, as I have been doing with Histograms lately. Sadly, you could even say an Oracle feature is not an Oracle feature until you have tested it.

In my experience, this is particularly true when you test the edges of Oracle, when you are working on VLDBs {Very Large DataBases}.

Last month Jonathan Lewis posted about a 2TB ASM disc size bug, where if you allocated a disc over 2TB to ASM, it would fill it up, wrap around and write over the beginning of the file. This week I heard from some past colleagues of mine that they hit this very same bug.

With these very same colleagues we hit a bug in 10.1 where you could not back up a tablespace over 8TB in size with RMAN {I can’t give you a bug number for it as we were working with HP/Oracle direct at the time and they “handled it internally”, but when I mentioned it to him, Jonathan found a similar one, bug 5448714, which stated a 4TB limit on backups. It could be the same bug}.

Yet another VLDB issue was we wanted to move just under one thousand tablespaces from one database to another {again, 10.1}, using transportable tablespaces. We tried to use the utility for checking you are working on a consistent set of tablespaces, but it could not cope with that many. But to plug them into the new database you have to export the metadata and we found a 4000 character limit on the variable stating the tablespaces to transport. That’s 2.3 characters per tablespace, as you need commas to delimit them… Yes, you could manage if you renamed all tablespaces to AA, AB, AC… BA, BB, BC etc. If memory serves, the problem was with data pump export and we reverted to old style export which did not have the problem.
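To illustrate the kind of limit described above, here is a minimal Python sketch that checks whether a comma-separated list of tablespace names fits in a fixed-size variable and, if not, splits it greedily into batches that do. The 4000-character figure is the one mentioned above; the tablespace names and the function names are made up for illustration, and whether a transport can really be split into batches depends on each batch being a self-contained set, which this sketch does not check.

```python
def joined_length(names):
    """Length of the comma-separated list, e.g. 'TS1,TS2,TS3'."""
    return len(",".join(names))


def batches_under_limit(names, limit=4000):
    """Greedily group names so each comma-joined batch is at most `limit` characters."""
    batches = []
    current = []
    current_len = 0
    for name in names:
        if len(name) > limit:
            raise ValueError(f"{name!r} alone exceeds the limit of {limit}")
        # A name costs its own length plus one comma if it is not first in the batch.
        extra = len(name) + (1 if current else 0)
        if current and current_len + extra > limit:
            batches.append(current)
            current, current_len = [name], len(name)
        else:
            current.append(name)
            current_len += extra
    if current:
        batches.append(current)
    return batches


# Hypothetical names: 1000 tablespaces, 12 characters each.
tablespaces = [f"TS_DATA_{i:04d}" for i in range(1000)]
print(joined_length(tablespaces))             # total length if passed as one list
print(len(batches_under_limit(tablespaces)))  # number of batches needed
```

Running a check like this before the export at least tells you up front that the list will not fit, rather than finding out from a failed job.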

Another limit I’ve blogged on is that the automated stats job chokes on very large objects.

Some data dictionary views can become very slow if you have several tens of thousands of tables/extents/tablespaces/indexes.

I can appreciate the issues and problems Oracle has with testing their code base, it is vast and people use the software in odd ways and it has to run on many platforms. You might also feel I am being picky by saying Oracle breaks a little when you have 8TB tablespaces or a thousand tablespaces. But

  • Oracle will say in big, glossy presentations, you can build Petabyte and Exabyte databases with Oracle {and have a product called Exadata, don’t forget}.
  • More and more customers are reaching these sizes as data continues to grow, for many sites, faster than Moore’s law.
  • Some of these limits appear with databases well below a Petabyte (say a tiddly small 50TB one :-) ).

I’ve been running into these issues with VLDBs since Oracle 7 and they are often with pretty fundamental parts of the system, like creating and backing up tablespaces! I think it is a poor show that it is so obvious that Oracle has been weak in testing with VLDB-sized databases before release.

I wonder whether, with 11gR2, Oracle actually tested some petabyte data sizes to see if it all works? After all, as is often said, disk is cheap now, I’m sure they could knock one up quite quickly…

Your Backups Are Probably Too Simple September 10, 2009

Posted by mwidlake in Architecture.
Tags:
5 comments

Following on from my post a few days back on When a Backup is Not a Backup I want to pick up on the topic of Simplicity, something Piet de Visser, Neil Chandler and I regularly discuss.

It’s very good to ask yourself the question “how to keep oracle backups simple“. A simple backup may not be the quickest to run or the fastest to recover from but it is a lot more likely to work. Which is the one critical defining feature of a backup!

However, the backing up of the Oracle database itself is not what I have in mind, after all any DBA who is beyond the trainee stage should know how to do it and there are many options with Oracle to achieve a good, simple backup (cold backup, full export, manual hot backups, simple RMAN).

{As an aside, something that really bugs me about Oracle at the moment is their utter focus on RMAN in the backup documentation – for example, the 10gR2 “Backup and Recovery Basics Manual” is almost dedicated to RMAN. It should not be; it is not called the “Recovery Manager Basics Manual”. If you go to the Advanced guide it is all pretty much RMAN until you get to chapter 17, where you eventually find a reasonable set of information on creating a simple, user-controlled hot backup. Or use Google on “End backup Oracle” and there is endless information and help. Such a simple backup method should be made more of by Oracle Corp, I believe}.

My concern is that the Oracle Database is very often only one component of the whole system.  Note  the “the.”

First up – do you use Database Links to read data between the database you are considering and others? More to the point, do you do two-phase commits between your database of concern and other Oracle databases? If so, you may well need to keep the data in-synch between them. {Don’t forget – your applications may independently update multiple databases without using Database Links}. How exactly do you manage that? Have a policy that if database (a) has to be recovered, you also do database (b), both to a point-in-time?

Second, and probably most common, do you have files external to the database that are referenced, say as BFILES or via the application? If you do a recovery, do you need to make sure those files are in-synch with the database? You may well have the situation where files on the server/file store are backed up at one time and the database at another. Loss of the whole server will need a full recovery of both backups, which were taken at different times.

Three, are there other dependencies, such as password authentication, account creation, audit maintenance?

Four, if your system crashes and was half way through a batch process, how do you cleanly resume the batch or clear it down and start again (having made sure the consideration of external files above has been resolved)?

 I’m sure you get the idea now. Think about it.

Test recoveries are good at identifying the above issues, but only if the test recovery includes testing the application(s) that use the database {as opposed to the usual “select sysdate from dual – yep, it works” test}.
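As one example of what an application-level recovery test might include for the external-file case above, the following Python sketch compares the file paths the database refers to against what is actually on disk after the restore. How you extract the referenced paths (from BFILE locators or application tables) is left out; `referenced` is assumed to be a plain list of paths relative to a restored directory, and the function name is hypothetical.

```python
from pathlib import Path


def compare_references(referenced, root):
    """Compare paths the database refers to with files found under `root`.

    Returns (missing, orphaned):
      missing  - referenced by the database but not found on disk
      orphaned - found on disk but not referenced by the database
    """
    root = Path(root)
    on_disk = {
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()
    }
    wanted = set(referenced)
    return sorted(wanted - on_disk), sorted(on_disk - wanted)
```

A non-empty `missing` list after a test recovery may suggest the file backup is older than the database backup, and a non-empty `orphaned` list may suggest the reverse. Either way, it is something you would rather learn in a practice run.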

The above considerations are part of why I am a fan of keeping all the things to do with a business application in the Oracle database, even if it might not be the ideal from other perspectives (eg images make the database a lot larger and may not be processed as fast as they can be as external files).

All of the above are probably the first steps towards creating your Disaster Recovery Plan. “Disaster Recovery” is a whole topic in its own right. Actually, it’s a whole business and one people have very lucrative careers in. But if you are the DBA and have the backups nailed, then Disaster Recovery (what to do if the building burns down) is probably next on your list of things to worry about.

When a Backup is Not a Backup September 8, 2009

Posted by mwidlake in Architecture.
Tags: ,
13 comments

I had a call from an old client today who had found themselves in a bit of a pickle. They were facing what I’d judge is the third most common “avoidable disaster” for Oracle databases. And probably the most devastating.

They had a damaged database, just one of those things. Media failure had corrupted a couple of datafiles. This is not the disaster, most DBAs who have been around for a few years have experienced this. Disks are mechanical devices, they go wrong, controllers develop faults, people accidentally pull the plugs out of switches.

The disaster was that they had never tested their backup. Guess what? It turned out that their backup was not a valid backup. It was just a bunch of data on tape {or disk or wherever they had put the backup}.

A backup is not a backup until you have proven you can recover from it.

If your backup has not been tested, and by that I mean used to recreate the working system which you then test, then it is just a bunch of data and a hopeful wish. You will not know if it works until you are forced to resort to it, and that means (a) you already have a broken system and an impact on your business (b) time is going to be short and pressure is high and (c) if you are the DBA, you could be about to fail in your job in a most visible and spectacular way.

Oddly enough, only two or three weeks ago I had another client in exactly the same position. Database is corrupted, the backup had not been tested, the backup turned out to be a waste of storage. In fact, in this case, I think the backup had not been taken for a considerable period of time as a standby database was being relied on as the backup. Which would have been fine if the Standby had been tested as fulfilling the purpose of “working backup”.

The standby had not, as far as I could deduce, ever been opened and tested as a working and complete database since the last major upgrade to the host system. When tested for real, it proved not to be a working and complete database. It was an expensive “hopeful wish”.

The client from a few weeks ago finally got their system back, but it took a lot of effort for the DBAs and the developers to sort it out and they were a tad lucky. The jury is out on the client who called me today.

I can’t think of anything at all that the DBA function does that is more important than ensuring backups are taken and work. {Maybe an argument could be made that creating the system is more important as nothing can be done about that, but then you could argue that the most important thing you do in the morning is get out of bed}. Admittedly, setting up, running and testing backups is not a very exciting job. In fact it often seems to be a task passed on to a less experienced member of the team {just like creating databases in the first place}. But most of us are not paid to have fun, we are paid to do a job.

I’ll just make a handful more comments before giving up for today.

  • The database backup nearly always needs to be part of a more comprehensive backup solution. You need a way to recreate the Oracle install, either a backup of the binaries and auxiliary files (sql*net, initialization files, password files etc) or at least installation media and all patches so you can recreate the installation. You need a method to recover or recreate your application and also your monitoring. You might need to be able to recreate your batch control. O/S? Have you covered all components of your system, can you delete any given file off your system and fix it?
  • Despite the potential complexity resulting from the previous point, you should keep your backup as simple as you can. For example, if you only have a handful of small databases, you might just need old-style hot backups via a shell script, rather than RMAN. Only get complex {like backing up from a standby} if you have a very, very compelling business need to do so.
  • Testing the backup once is a whole world better than never testing it. However, regular, repeated recovery tests not only allow the DBA/Sys Admin teams to become very comfortable with the recovery process and help ensure it is swift and painless, but by trying different scenarios, you may well discover issues that come under the first two points.

I’ve not even touched on the whole nightmare of Disaster Recovery :-)

Oracle Exadata – does it work? July 26, 2009

Posted by mwidlake in development, VLDB.
Tags: ,
12 comments

Does Oracle Exadata work? 

That is a tricky question as, unlike most Oracle database features, you can’t download it and give it a test.

You can try out partitioning, bigfiles, Oracle Text, InterMedia {sorry, Multimedia}, all sorts of things by downloading the software. You can even try out RAC pretty cheaply, using either VMware or a couple of old machines and Linux, and many hundreds of Oracle techies have. The conclusion is that it works. The expert conclusion is “yes it works, but is it a good idea? It depends {my fees are reasonable}” :-).

I digress, this ability to download and play allows Oracle technophiles to get some grounding in these things, even if their employer is not currently looking to implement them {BTW how often do you look at something in your own private time that your company will not give you bandwidth for – only to have them so very interested once you have gained a bit of knowledge? Answers on a postcard please…}.

Exadata is another beast, as it is hardware. I think this is an issue.

I was lucky enough to get John Nangle to come and present on Exadata at the last UKOUG Management and Infrastructure meeting, having seen his talk at a prior meeting. John gave a very good presentation and interest was high. I have also seen Joel Goodman talk {another top presenter}, so I understand the theory. I have to say, it looks very interesting, especially in respect of what is, perhaps, my key area of personal expertise, VLDB. Databases of 10’s of terabytes.

I don’t plan to expand here on the concepts or physical attributes of Exadata too much, it is enough to say that it appears to gain its advantage via two main aspects:-

  • Intelligence is sited at the “disc controller” level {which in this case is a cheap 4-cpu HP server, not really the disc controller} which basically pre-filters the data coming off storage so only the data that is of interest is passed back to the database.  This means that only blocks of interest are chucked across the network to the database.
  • The whole system is balanced. Oracle have looked at the CPU-to-IO requirements of data warehouses and decided what seems to be a good balance, they have implemented fast, low latency IO via InfiniBand and made sure there are a lot of network pipes from the storage up the stages to the database servers. That’s good.

The end result is that there is lots of fast, balanced IO from the storage layer to the database and only data that is “of interest” is passed up to the database.

It all sounds great in theory and Oracle Corp bandy around figures of up to 100 times (not 100%, 100 times) speedup for datawarehouse activity, with no need to re-design your implementation. At the last M&I UKOUG meeting there was also someone who had tried it in anger and they said it was 70 times faster. Unless this was a clever plant by Oracle, that is an impressive independent stated increase.

I am still very interested in the technology, but still sceptical. After all, RAC can be powerful, but in my experience it is highly unlikely that by dropping an existing system onto RAC you will get any performance (or high availability) increase. In fact, you are more likely to just make life very, very difficult for yourself. RAC works well when you design your system up-front with the intention of working on the same data on the same nodes. {Please note, this is NOT the oft-cited example of doing different work types on different nodes, ie data load on one node, OLTP on another and batch work on the third. If all three are working on the same working set, you could well be in trouble. You are better off having all load, OLTP and Batch for one set of data on one node, OLTP-load-batch  for another set of data on another node etc, etc, etc. If your RAC system is not working well, this might be why}.  Similarly, partitioning is an absolutely brilliant feature – IF you designed it up-front into your system. I managed to implement a database that has scaled to 100 TB with 95% of the database read-only {so greatly reducing the backup and recovery woes} as it was designed in from the start.

Where was I? Oh yes, I remain unconvinced about Exadata. It sounds great, it sounds like it will work for datawarehouse systems where full table scans are used to get all the data and the oracle instance then filters most of the data out. Now the storage servers will do that for you.  You can imagine how instead of reading 500GB of table off disc, across the network and into Oracle memory and then filtering it, the  eight disc servers will do the filtering and send a GB of data each up to the database. It has to be faster.
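The idea in that last paragraph can be shown with a toy Python model. It is only a sketch of the principle (filter where the data lives, ship only the matches) and makes no claim about how Exadata actually implements it; the cell count, row counts and function names are invented for the illustration.

```python
def scan_at_database(storage_cells, predicate):
    """Conventional full scan: every row crosses the network, then gets filtered."""
    shipped = 0
    result = []
    for cell in storage_cells:
        for row in cell:
            shipped += 1              # the row is sent to the database server
            if predicate(row):        # filtering happens on the database server
                result.append(row)
    return result, shipped


def scan_with_storage_filter(storage_cells, predicate):
    """Storage-side filtering: each cell applies the predicate before sending."""
    shipped = 0
    result = []
    for cell in storage_cells:
        matching = [row for row in cell if predicate(row)]  # filtered on the cell
        shipped += len(matching)      # only matching rows cross the network
        result.extend(matching)
    return result, shipped


# Eight hypothetical storage cells holding 1000 rows each.
cells = [list(range(i * 1000, (i + 1) * 1000)) for i in range(8)]


def wanted(row):
    return row % 500 == 0


rows_a, shipped_a = scan_at_database(cells, wanted)
rows_b, shipped_b = scan_with_storage_filter(cells, wanted)
print(rows_a == rows_b, shipped_a, shipped_b)
```

Both scans return the same rows; the only difference is how much data has to travel to get them, which is the whole argument for the speedup.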

BUT.

What if you have some OLTP activity and some of the data is in the SGA? That is what stops full-table-scans working at Multi-Block-Read_Count levels of efficiency.

What happens if some of the table is being updated by a load process at the same time?

What happens if you want some of the data hosted under ASM and full Exadata performance brilliance but you have several 10’s of TB of less-active data you just want to store on cheap SATA RAID 5 discs as well? How does Exadata integrate then?

You can’t test any of this out. I did email and ask John about this inability to play with and discover stuff about a solution that is hardware and very expensive. And he was good enough to respond, but I think he missed the point of my question {I should ask again, he is a nice chap and will help if he can}. He just said that the DBA does not have to worry about the technology, it just works. There are no special considerations.

Well, there are. And I can’t play with it as I would need to buy a shed load of hardware to do so. I can’t do that, I have a wife and cat to feed.

So even though Exadata sounds great, it is too expensive for anyone but large, seriously interested companies to look into.

And I see that as a problem. Exadata experts will only come out of organisations that have invested in the technology or Oracle itself. And I’m sorry, I’ve worked for Oracle and as an employee you are always going to put the best face forward. So, skills in this area are going to stay scarce unless it takes off and I struggle to see how it will take off unless it is not just as good as Oracle says, but better than Netezza and Teradata by a large margin.

Does anyone have an Exadata system I can play on? I’d love to have a try on it.

Friday Philosophy – Simply Complex July 24, 2009

Posted by mwidlake in development, Management.
Tags: , ,
3 comments

Piet de Visser is an ardent champion of simple solutions within the Oracle arena – and I completely agree with his outlook on this. {Almost}.

Simple solutions usually win long-term as they are easy to understand, easy to look after and easy to change. OK, you may not be getting the absolute best performance you can, you may not be doing the job as completely as you can, but if it is a simple solution then you probably implemented it easily and quickly. This probably also means it cost you not-a-lot in person time and brain power, so you can personally afford to blow it away and do it again. In fact, with all that saved time, money and brain power you can probably afford to create a couple more simple solutions to other problems too.

Perhaps the only thing you are probably losing out on is the kudos of having been smart enough to come up with something very cunning and complicated, to impress everyone with. You’ll get over it, people don’t stay impressed for very long, especially when your mind-bendingly cunning and complicated solution melts into a heap {as a couple of mine have, ho-hum}.

Is your chosen solution simple? I give you a simple test – Explain it to someone.

  • If you can explain it to your colleagues, it is probably quite simple. If you can’t, either the solution is not so simple or your colleagues are.
  • If you can explain it to your boss then it is probably an excellent example of simplicity.
  • If you can explain it to your mum, you have blindingly clever simplicity and your work here is done.

Oh, you remembered that I said I almost agreed with Piet.

I can think of four reasons for not implementing a simple solution. I have listed them in ascending order of being a good reason (best last). And, unfortunately, also descending order of likelihood (least likely last).

  • You were sold a complex solution as complex solutions earn the vendor more money.
  • You needed to impress someone with your incredible technical skills {this could be your peers, quite often it is your boss, but for most of us it is usually ourselves, let’s be honest :-) }
  • You really do need to do something complex for a very valid business reason, like 99.999% availability {eg for that system people can ring up as they have a cough but are convinced they are dying of “swine flu”}.
  • You are creating a generic solution.

What do I mean by the last point? I mean that your one solution has to work for a lot of different situations or usages. The system has to work for a large range of inputs or do a lot of things.

The whole Oracle database is {in my opinion} a generic solution. A vast and complex one to be sure, but it is intended to work for a little database on a desktop keeping track of the parts in your workshop, an integrated system in eg medical or scientific robots keeping track of thousands of samples, vast data stores of telephone calls so you can do your bills, huge on-line web-based shopping systems, a front end to image or video stores… the list is a big one. You might need a little Oracle database to hold the list in.

With version 10 Oracle Corp made a big thing about the database looking after itself. The database was a generic, self-managing, handle-anything solution and you would not need those pesky DBA types to keep the beast working for much longer.

That is why it is so complex and, no matter how much some of us complain {as I myself often do}, it has to be and it is getting more complex with every version. I’ll take my current favorite topic, stats gathering, as an example.

Back with the rule based optimiser, you had a set of rules. 15-16 I think {I’ll let you lot check – google “rule based optimizer oracle -burleson”}. You learnt the rules, understood them and you were home and dry. Except that the RBO could not cope with data-specific oddities, how different access methods were better for different table sizes and index specificity.

So Oracle added statistics and the cost based optimiser. To make use of the cost based logic a load of mathematical calculations and considerations had to be added (and continue to be added), based on statistics you had to collect at the right time and the right level and many sites did not. People complained the CBO “just did not work”, which it did not if you didn’t collect the stats {and sometimes even when you had} but it was doing a lot to cope with a wider range of systems automatically. Histogram stats now allowed skewed data to be coped with, in most situations.

So they introduced a job to do it for you but it had to detect the right level and type of statistics to gather on all objects, be they tiny tables, massive tables, tables with skewed data, indexes, global indexes on partitioned tables… And yes, once again, it is another complexity you have to get to grips with if it does not fit your particular system demands.
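To make the skewed-data point concrete, here is a toy Python comparison of two ways to estimate how many rows match one value of a column: assuming rows are spread evenly across the distinct values, versus using per-value counts as a frequency histogram would. This is a simplified model for illustration only, not Oracle’s actual optimiser arithmetic; the column contents and function names are invented.

```python
from collections import Counter


def uniform_estimate(values):
    """Estimate rows per value assuming an even spread: rows / distinct values."""
    return len(values) / len(set(values))


def histogram_estimate(values, value):
    """Estimate rows for `value` from per-value counts (a frequency histogram)."""
    return Counter(values)[value]


# A skewed, hypothetical status column: 10 open orders among 1000 rows.
status = ["OPEN"] * 10 + ["CLOSED"] * 990

print(uniform_estimate(status))              # 500.0 for either value
print(histogram_estimate(status, "OPEN"))    # 10
print(histogram_estimate(status, "CLOSED"))  # 990
```

With the even-spread assumption both values look like 500 rows, so an index lookup for the rare value and a full scan for the common one cannot be told apart. The per-value counts are what let the two cases be costed differently, and they are also one more thing that has to be gathered at the right time.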

I’m sure you can argue with me over the details, but I think I’m making a valid point that every time a system {be it Oracle or an O/S} is modified to cope automatically with more scenarios, it becomes a more complex system. You need a DBA with several manuals welded to their brains to get the best out of this thing now, not less as claimed back at 10’s release {did they make the same claims for 11? I was too busy keeping systems running to really notice at the time}.

Maybe the answer is to stop using generic systems like Oracle and swap them out for a mixture of spreadsheets, MySQL-type simplistic databases, Netezza-type systems for datawarehouses, hand cut ‘C’ applications for handling image stores, Java apps and flat files for web sites… Only you are going to have to learn how to use all those things to create all the systems you need to create.

You are only going to manage this if you create those dozens of systems as simple ones.

Fuggles was very simple. The lights were on but nobody was home. But that was fine as all she wanted was to sit on you and be scratched.

