Friday, February 19, 2010

Cool but unknown RMAN feature

Unknown to me anyway until just this week.

Some time ago I read a post about RMAN on Oracle-L that detailed what seemed like a very good idea.

The poster's RMAN scripts were written so that the only connection while making backups was a local one using the control file only for the RMAN repository.
rman target sys/manager nocatalog

After the backups were made, a connection was made to the RMAN catalog and a RESYNC CATALOG command was issued.
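A minimal sketch of that two-step approach, as I understood it from the post. The function names, the RMAN_BIN variable and the bare BACKUP DATABASE command are my own placeholders, not the poster's actual script, and the sketch assumes rman returns a non-zero exit status when a command fails.

```shell
#!/bin/sh
# Sketch only: back up using the control file as the RMAN repository, then
# resync the recovery catalog as a separate, non-fatal step.
# RMAN_BIN, TARGET and CATALOG are placeholders, not values from a real system.
RMAN_BIN=${RMAN_BIN:-rman}
TARGET=${TARGET:-sys/manager}
CATALOG=${CATALOG:-rman/password@rcat}

backup_nocatalog() {
    # Phase 1: no catalog connection, so a catalog outage cannot affect the backup.
    "$RMAN_BIN" target "$TARGET" nocatalog <<EOF
BACKUP DATABASE;
EOF
}

resync_catalog() {
    # Phase 2: copy the control file's backup records into the catalog.
    "$RMAN_BIN" target "$TARGET" catalog "$CATALOG" <<EOF
RESYNC CATALOG;
EOF
}

backup_then_resync() {
    # A failed backup fails the job.
    backup_nocatalog || return 1
    # A failed resync is reported on stderr but does not fail the job.
    resync_catalog || echo "WARNING: catalog resync failed; backup is recorded in the control file only" >&2
    return 0
}
```

Calling backup_then_resync from the backup job would then succeed whenever the backup itself does, whether or not the catalog database can be reached.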

The reason for this was that if the catalog was unavailable for some reason, the backups would still succeed, which would not be the case with this command:


rman target sys/manager catalog rman/password@rcat

This week I found out this is not true.

Possibly this is news to no one but me, but I'm sharing anyway. :)

Last week I cloned an apps system and created a new OID database on a server. I remembered to do nearly everything, but I did forget to set up TNS so that the catalog database could be found.

After setting up the backups via NetBackup, the logs showed that there was an error condition, but the backup obviously succeeded:

archive log filename=/u01/oracle/oradata/oiddev/archive/oiddev_arch_1_294_709899427.dbf recid=232 stamp=710999909
deleted archive log
archive log filename=/u01/oracle/oradata/oiddev/archive/oiddev_arch_1_295_709899427.dbf recid=233 stamp=710999910
Deleted 11 objects


Starting backup at 16-FEB-10
released channel: ORA_DISK_1
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: sid=369 devtype=SBT_TAPE
channel ORA_SBT_TAPE_1: VERITAS NetBackup for Oracle - Release 6.0 (2008081305)
channel ORA_SBT_TAPE_1: starting full datafile backupset
channel ORA_SBT_TAPE_1: specifying datafile(s) in backupset
including current controlfile in backupset
channel ORA_SBT_TAPE_1: starting piece 1 at 16-FEB-10
channel ORA_SBT_TAPE_1: finished piece 1 at 16-FEB-10
piece handle=OIDDEV_T20100216_ctl_s73_p1_t711086776 comment=API Version 2.0,MMS Version 5.0.0.0
channel ORA_SBT_TAPE_1: backup set complete, elapsed time: 00:00:45
Finished backup at 16-FEB-10

Starting Control File and SPFILE Autobackup at 16-FEB-10
piece handle=c-3982952863-20100216-02 comment=API Version 2.0,MMS Version 5.0.0.0
Finished Control File and SPFILE Autobackup at 16-FEB-10

RMAN> RMAN>

Recovery Manager complete.

Script /usr/openv/netbackup/scripts/oiddev/oracle_db_rman.sh
==== ended in error on  Tue Feb 16 04:07:59 PST 2010  ====


That seemed rather strange, and it was happening in both of the new databases.
The key to this was to look at the top of the log file, where I found the following:

BACKUP_MODE: lvl_0
BACKUP_TYPE: INCREMENTAL LEVEL=0
ORACLE_SID : oiddev
PWD_SID    : oiddev
ORACLE_HOME: /u01/oracle/oas
PATH: /sbin:/usr/sbin:/bin:/usr/bin:/usr/X11R6/bin

Recovery Manager: Release 10.1.0.5.0 - Production

Copyright (c) 1995, 2004, Oracle.  All rights reserved.


RMAN>
connected to target database: OIDDEV (DBID=3982952863)

RMAN>
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-04004: error from recovery catalog database: ORA-12154: TNS:could not resolve the connect identifier specified

RMAN>
Starting backup at 16-FEB-10
using target database controlfile instead of recovery catalog
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: sid=369 devtype=SBT_TAPE
channel ORA_SBT_TAPE_1: VERITAS NetBackup for Oracle - Release 6.0 (2008081305)
channel ORA_SBT_TAPE_1: starting incremental level 0 datafile backupset

Notice the line near the bottom of the displayed output?

The one that says "using target database controlfile instead of recovery catalog"?

RMAN will go ahead with the backup of the database even though the connection to the catalog database failed. This apparently works only when the commands are fed to RMAN by a script: when I tried the same connection interactively on the command line, RMAN simply exited when the catalog could not be reached.
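Since the backup quietly carries on, it may be worth scanning the log for this fallback so that a catalog problem does not go unnoticed. A rough sketch, using only the two messages that appear in the log above; the function names are hypothetical:

```shell
# Sketch: flag RMAN logs where the catalog connection failed but the
# backup carried on using the control file. Function names are hypothetical.

# True if the log records a recovery catalog connection error.
catalog_connect_failed() {
    grep -q 'RMAN-04004' "$1"
}

# True if RMAN reported that it was using the control file as its repository.
used_controlfile_only() {
    grep -q 'using target database controlfile instead of recovery catalog' "$1"
}

# Prints a warning and returns 2 when both conditions hold, otherwise returns 0.
check_rman_log() {
    if catalog_connect_failed "$1" && used_controlfile_only "$1"; then
        echo "WARNING: $1: catalog unreachable, backup recorded in control file only"
        return 2
    fi
    return 0
}
```

Both messages are checked together because the "using target database controlfile" line on its own also shows up when nocatalog is used deliberately.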

The RMAN scripts are being run on a Linux server in the following format:

$OH/bin/rman target sys/manager catalog rman/password@rcat <<-EOF >> $LOGFILE

rman commands go here

EOF


This was quite interesting to discover, and may be old news to many of you, but it was new to me.

This is not exactly a new feature either - one of the databases being backed up is 9.2.0.6. And of course there is now no need to update the backup scripts.

9 comments:

Unknown said...

We "discovered" that feature some time ago, but were concerned about how to resync our backup catalog with the backups. What if we needed to restore to the date that it used the control file? Restoring via controlfile can be tricky. You must know your DBID. And another concern is that by default, the controlfile only keeps backup records for 7 days; after seven days the controlfile entries can be overwritten.

Jared said...

Doing a resync is fairly simple, just fix whatever issue is preventing the connection to the catalog. Subsequent backups will cause a resync.

You could also connect manually to the catalog and the target via RMAN and run the RESYNC CATALOG command.

Regarding the default value for control_file_record_keep_time, I set that to 60 on production databases, or any database that I may have to someday recover.

Another thing I think is a good idea to set:

CONFIGURE CONTROLFILE AUTOBACKUP ON;

This does somewhat simplify restoring a control file, though that topic could take a whole 'nuther (lengthy) blog entry.

Keeping a record of your database DBIDs is probably a good idea. I do that, but haven't checked lately to see if the record needs updating.

Thanks for the reminder. :)

Howard said...

Yes, my worry exactly: can controlfile catalogs (for it is there that they go) be synced with RMAN database catalogs?

Voldy said...

I found out about this feature during a rather large outage that took down, among other servers, the RMAN catalog server. I expected every DB Backup to fail, but was gladly surprised to find that they hadn't. It was not fun to resync all the databases later, but we did not lose any backups... which during a major incident is a good thing.

Also, I found out that controlfile autobackup is very important... the few DBs that didn't have it set to ON were a pain to recover.

One interesting side note, TSM was actually of much help in locating the proper backups from where to restore the controlfiles. After that, recovery went smoothly.

Log Buffer said...

[...]Jared Still sheds some light on a cool but unknown RMAN feature. [...]

Amit Verma said...

Would you be able to post your script here? I moved my rman repository to a new server, updated all but one of the scripts and it failed. Looking at your post I would assume that it would continue the backup, but it didn't. Maybe it is a version issue.

Jared said...

Here is the template used to generate the RMAN commands; it should be enough to show how the RMAN commands are set up.


#
BACKUP $BACKUP_TYPE FORMAT "'${FORMAT_PREFIX}_db_${FORMAT_SUFFIX}'" DATABASE
SQL "'ALTER SYSTEM ARCHIVE LOG CURRENT'"
BACKUP FORMAT "'${FORMAT_PREFIX}_arch_${FORMAT_SUFFIX}'" ARCHIVELOG ALL NOT BACKED UP 2 TIMES
DELETE NOPROMPT ARCHIVELOG ALL BACKED UP 2 TIMES TO DEVICE TYPE sbt
BACKUP FORMAT "'${FORMAT_PREFIX}_ctl_${FORMAT_SUFFIX}'" CURRENT CONTROLFILE

Amit Verma said...

Thanks. I am not sure (no place to test) but there could be a difference in the way rman behaves as it pertains to this, if you use shell to create the whole rman script OR create a cmdfile and pass it to rman executable. We use the latter. I have to find a test machine to test it.

Jared said...

The script I posted is just the RMAN portion that is executed by a script.
In our case the RMAN command line and RMAN script are generated dynamically, stuffed into a shell variable and then executed.
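Roughly, that idea could look like the sketch below. It is not the production script. The default values for BACKUP_TYPE, FORMAT_PREFIX and FORMAT_SUFFIX are guesses based on the log output in the post, the trailing semicolons are added here for illustration, and render_rman_script, run_backup and RMAN_BIN are hypothetical names.

```shell
# Sketch: build the RMAN command text in a shell variable, then feed it to rman.
# The default values below are guesses for illustration only.
BACKUP_TYPE=${BACKUP_TYPE:-"INCREMENTAL LEVEL=0"}
FORMAT_PREFIX=${FORMAT_PREFIX:-OIDDEV_T20100216}
FORMAT_SUFFIX=${FORMAT_SUFFIX:-s%s_p%p_t%t}

# Expand the shell variables into plain RMAN commands on stdout.
render_rman_script() {
    cat <<EOF
BACKUP $BACKUP_TYPE FORMAT '${FORMAT_PREFIX}_db_${FORMAT_SUFFIX}' DATABASE;
SQL 'ALTER SYSTEM ARCHIVE LOG CURRENT';
BACKUP FORMAT '${FORMAT_PREFIX}_arch_${FORMAT_SUFFIX}' ARCHIVELOG ALL NOT BACKED UP 2 TIMES;
DELETE NOPROMPT ARCHIVELOG ALL BACKED UP 2 TIMES TO DEVICE TYPE sbt;
BACKUP FORMAT '${FORMAT_PREFIX}_ctl_${FORMAT_SUFFIX}' CURRENT CONTROLFILE;
EOF
}

# $1 = target connect string, $2 = catalog connect string.
run_backup() {
    # Hold the generated commands in a variable, then pipe them to rman on stdin.
    RMAN_SCRIPT=$(render_rman_script)
    printf '%s\n' "$RMAN_SCRIPT" | "${RMAN_BIN:-rman}" target "$1" catalog "$2"
}
```

Because the commands reach RMAN on standard input, this is the same style of invocation as the here-document in the post, not a cmdfile.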