raccheck 2.2 beta released

v. 2.2.0 Beta

Support for Single Instance Configurations (i.e., no longer limited to RAC configurations)
High Availability (HA) Best Practices
New checks and bug fixes

This new version was produced in collaboration with the Oracle HA team in Support to add their Top 20 checks for common problems related to backup and recovery and Data Guard. These new checks augment the MAA Scorecard and can be run with the following syntax:

./raccheck -r – includes the standard health checks as well as HA checks.
./raccheck -c hacheck -o -v – abbreviates the report output to ONLY the HA related checks

raccheck can be downloaded from RACcheck – RAC Configuration Audit Tool (Doc ID 1268927.1)

CREDIT: Bob Caldwell

November 2, 2012

Posted In: Data Guard, Installs, RAC, RMAN, Scripts


11gR2 clients connect to the database using SCANs

If you’ve ever extended your RAC cluster with a set of new nodes, you already know how painful it is to go through the list of your clients and make sure their SQL*Net configuration is up to date. 11gR2 solves this problem with the Single Client Access Name (SCAN).

The single client access name (SCAN) is a hostname used to provide service access for clients to the cluster. Because the SCAN is associated with the cluster as a whole, rather than to a particular node, the SCAN makes it possible to add or remove nodes from the cluster without needing to reconfigure clients. It also adds location independence for the databases, so that client configuration does not have to depend on which nodes are running a particular database. Clients can continue to access the cluster in the same way as with previous releases, but Oracle recommends that clients accessing the cluster use the SCAN.

Reference: 1.3.2.2 IP Address Requirements

How is SCAN implemented?

For high availability purposes the SCAN name should be associated with at least three IP addresses using DNS round-robin resolution. If you opt to use Grid Naming Service then GNS can also be used to manage the SCAN name.

SCAN is configured at the cluster level, not at the node level, and that’s what makes it so flexible: no matter how many nodes your cluster consists of, your clients can continue to use SCAN to access the services of your cluster, utilizing all nodes even if you add or delete them:

The SCAN is a virtual IP name, similar to the names used for virtual IP addresses, such as node1-vip. However, unlike a virtual IP, the SCAN is associated with the entire cluster, rather than an individual node, and associated with multiple IP addresses, not just one address.

SCAN works as an independent handler for the entire cluster: it acts on the client’s behalf during a connection request, since it knows all cluster services and its available, least-loaded nodes:

The SCAN works by being able to resolve to multiple IP addresses reflecting multiple listeners in the cluster handling public client connections. When a client submits a request, the SCAN listener listening on a SCAN IP address and the SCAN port is contacted on a client’s behalf. Because all services on the cluster are registered with the SCAN listener, the SCAN listener replies with the address of the local listener on the least-loaded node where the service is currently being offered. Finally, the client establishes connection to the service through the listener on the node where service is offered. All of these actions take place transparently to the client without any explicit configuration required in the client.
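
As a rough illustration of what this looks like from the client side, here is a minimal SQL*Plus sketch. The SCAN name rac-scan.example.com, the port 1521, the service sales_svc and the user app_user are all made-up placeholders, not values from this post, and the query assumes the account can read v$instance:

```sql
-- Hypothetical names: rac-scan.example.com (SCAN), sales_svc (service), app_user (account).
-- The client names only the SCAN, never an individual node or node VIP.
CONNECT app_user@"//rac-scan.example.com:1521/sales_svc"

-- Show which instance and node this session was redirected to.
SELECT instance_name, host_name FROM v$instance;
```

Reconnecting a few times should land sessions on different nodes, depending on load and on where the service is offered, without the connect string ever changing.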

Bottom line – use SCAN – it simplifies cluster management:

Because the SCAN addresses resolve to the cluster, rather than to a node address in the cluster, nodes can be added to or removed from the cluster without affecting the SCAN address configuration.

Reference: D.1.3.5 About the SCAN


September 10, 2009

Posted In: RAC


11gR2 – raw and block devices – no longer supported

I was just reading up on the 11gR2 documentation for Grid Infrastructure Installation, and we finally have closure on the topic of RAW and BLOCK devices for OCR and VOTING disks:

With this release, OUI no longer supports installation of Oracle Clusterware files on block or raw devices. Install Oracle Clusterware files either on Automatic Storage Management diskgroups, or in a supported shared file system.

For new installations, OCR and voting disk files can be placed either on ASM, or on a cluster file system or NFS system. Installing Oracle Clusterware files on raw or block devices is no longer supported, unless an existing system is being upgraded.

REFERENCE: What’s New in Oracle Grid Infrastructure Installation and Configuration?

Perfect timing! I was just mulling over what to do with OCR/VOTING on my upcoming SAN-based RAC install — now it’s clear — use 11gR2 and store them on ASM.

September 10, 2009

Posted In: RAC


Using Flashback Database to strengthen Data Guard Setup

MAA recommends enabling Flashback Database on the primary prior to a failover:

Enable Flashback Database to reinstate failed production databases after a
failover operation has completed. Flashback Database provides a second
very significant function, enabling fast point in time recovery if needed.

See: MAA_WP_10gR2_SwitchoverFailoverBestPractices.pdf

Enabling Flashback Database involves setting up a flash recovery area and setting a flashback retention target, which specifies how far back you want to be able to restore your database using the Flashback Database feature.
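
A minimal sketch of those two steps, assuming the database is already in ARCHIVELOG mode and a flash recovery area is configured. The retention value of 1440 minutes is only an illustrative choice:

```sql
-- Assumes ARCHIVELOG mode and an FRA already configured
-- (db_recovery_file_dest and db_recovery_file_dest_size are set).
-- In 10g, run this with the database mounted, not open.

-- The retention target is expressed in minutes; 1440 (24 hours) is an illustrative value.
ALTER SYSTEM SET db_flashback_retention_target=1440 SCOPE=BOTH SID='*';

ALTER DATABASE FLASHBACK ON;

-- Confirm the setting.
SELECT flashback_on FROM v$database;
```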

Once Flashback Database is set up, the database starts to copy images of each altered block into the flashback logs; this works for all datafiles. When it’s time to flash back the database, the copies of the blocks from the flashback logs are used to reconstruct the datafiles to a state just prior to the desired flashback time, and the redo/archive logs are then used to bring the datafiles to a consistent state.

WARNING:
Redo logs must be available for the entire time period spanned by the
flashback logs, whether on tape or on disk. (In practice, however, redo
logs are generally needed much longer than the flashback retention target
to support point-in-time recovery.)

There are also a number of operations you can perform on your database, such
as dropping a tablespace or shrinking a datafile, which cannot be reversed
with Flashback Database. After such an operation, the flashback database
window begins at the time immediately following that operation.
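
To see how far back the flashback window currently reaches, a query along these lines can help. It is a sketch that assumes Flashback Database is enabled, since the view is only populated in that case:

```sql
-- Current flashback window and space figures.
SELECT oldest_flashback_scn,
       oldest_flashback_time,
       retention_target,          -- minutes
       flashback_size,            -- bytes currently used by flashback logs
       estimated_flashback_size   -- bytes estimated to meet the retention target
  FROM v$flashback_database_log;
```

If oldest_flashback_time is more recent than expected, one of the operations described above may have reset the window.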

One thing to consider is that the only way to guarantee a database can be returned to a specific point in time is to use guaranteed restore points. In other words, don’t use “normal” restore points for this purpose and don’t rely on Flashback Database alone. The only constraint on how far back you can go is the amount of disk space in the flash recovery area.

Again, WARNING:
Limitations that apply to Flashback Database also apply to guaranteed
restore points. For example, shrinking a datafile or dropping a tablespace
can prevent flashing back the affected datafiles to the guaranteed restore
point.

Creating a guaranteed restore point without sufficient free space in the flash recovery area (FRA) can cause the FRA to fill completely, because “No file in the flash recovery area is eligible for deletion if it is required to satisfy the guarantee [restore point]”. In many circumstances, this will cause your database to halt.

To save space in the FRA you can leave Flashback Database disabled and still create a guaranteed restore point. In this case, the first time a datafile block is modified, an image of the block before the modification is stored in the flashback logs. This saves space because only a one-time copy of every changed data block is stored there; subsequent modifications to the same block do not cause the block contents to be logged again. This method works really well, and it’s more efficient, as long as your primary only needs to be able to return to the specific point in time at which the guaranteed restore point was created, such as the before-state of a failed application upgrade that might have made changes to the database.

See: 5.1.1 About Flashback Database
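
For the failed-upgrade case, rewinding to a guaranteed restore point might look like the sketch below. The restore point name before_upgrade is hypothetical, and on RAC all other instances are assumed to be shut down first:

```sql
-- Hypothetical restore point name: before_upgrade (created before the upgrade started).
SHUTDOWN IMMEDIATE
STARTUP MOUNT

-- Rewind the datafiles to the state captured by the restore point.
FLASHBACK DATABASE TO RESTORE POINT before_upgrade;

-- Opening after a flashback requires RESETLOGS.
ALTER DATABASE OPEN RESETLOGS;

-- Drop the restore point once it is no longer needed, so the FRA space can be reclaimed.
DROP RESTORE POINT before_upgrade;
```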

That’s the theory; how about some practice? Follow along. First, we verify our setup (in this case no FRA [db_recovery_file_dest] is set up):

rac1.XRACP1-> sqlplus /nolog

SQL*Plus: Release 10.2.0.4.0 - Production on Fri Jun 12 19:25:07 2009

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.

SQL> connect / as sysdba
Connected.
SQL>    select * from v$recovery_file_dest;

no rows selected

SQL> show parameter recovery

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest                string
db_recovery_file_dest_size           big integer 0
recovery_parallelism                 integer     0
SQL> select * from v$flash_recovery_area_usage;

no rows selected

SQL>

Let’s create the FRA (NOTE: since this is a RAC database, I am creating the FRA on a clustered FS (OCFS2), /u02):

[root@rac1 log]# mkdir -p /u02/oradata/rcv_area
[root@rac1 log]# chown -R oracle:dba /u02/oradata

rac1.XRACP1-> df -k /u02
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/2000b080023002235p1
                      37899136   1971200  35927936   6% /u02
rac1.XRACP1->


SQL> alter system set db_recovery_file_dest_size=28g scope=both sid='*';

System altered.

SQL> alter system set db_recovery_file_dest='/u02/oradata/rcv_area' scope=both sid='*';

System altered.

SQL>

SQL> set lines 132
SQL> col name format a35
SQL> set trims on
SQL> select * from v$recovery_file_dest;

NAME                                SPACE_LIMIT SPACE_USED SPACE_RECLAIMABLE NUMBER_OF_FILES
----------------------------------- ----------- ---------- ----------------- ---------------
/u02/oradata/rcv_area                3.0065E+10          0                 0               0

SQL> select * from v$flash_recovery_area_usage;

FILE_TYPE    PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
------------ ------------------ ------------------------- ---------------
CONTROLFILE                   0                         0               0
ONLINELOG                     0                         0               0
ARCHIVELOG                    0                         0               0
BACKUPPIECE                   0                         0               0
IMAGECOPY                     0                         0               0
FLASHBACKLOG                  0                         0               0

6 rows selected.

SQL>
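
The SPACE_LIMIT of 3.0065E+10 above is simply the 28g setting expressed in bytes. A small sketch to make that easier to read (the column aliases are my own):

```sql
-- 28g in bytes; TO_CHAR avoids the scientific notation SQL*Plus uses for wide numbers.
-- Evaluates to 30064771072.
SELECT TO_CHAR(28*1024*1024*1024) AS space_limit_bytes FROM dual;

-- The same view as above, with sizes converted to gigabytes.
SELECT name,
       space_limit/1024/1024/1024 AS limit_gb,
       space_used/1024/1024/1024  AS used_gb,
       number_of_files
  FROM v$recovery_file_dest;
```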

So far we have set up the FRA; now let’s try to create a guaranteed restore point. One thing to remember here, though: if Flashback Database is not enabled (to save on space), then the database must be mounted, not open, when creating the first guaranteed restore point (or if all previously created guaranteed restore points have been dropped). If you attempt to create a guaranteed restore point while the database is open, you get an ORA-38787 error:

SQL> CREATE RESTORE POINT gr_01 GUARANTEE FLASHBACK DATABASE;
CREATE RESTORE POINT gr_01 GUARANTEE FLASHBACK DATABASE
*
ERROR at line 1:
ORA-38784: Cannot create restore point 'GR_01'.
ORA-38787: Creating the first guaranteed restore point requires mount mode when flashback database is off.

A normal restore point, however, can be created, but it does not cause anything to be written to the FRA since Flashback Database is off. Here’s an example:

SQL> !ls -l /u02/oradata/rcv_area
total 0

SQL> CREATE RESTORE POINT nr_01;

Restore point created.

SQL> !ls -l /u02/oradata/rcv_area
total 0

SQL> create table xyz(t number) tablespace tools;

Table created.

SQL> !ls -l /u02/oradata/rcv_area
total 0


SQL> insert into xyz values (1000);

1 row created.

SQL> commit;

Commit complete.

SQL> !ls -l /u02/oradata/rcv_area
total 0

SQL>

As you can see, nothing happened, even though a normal restore point was created. Let’s drop it (this syntax works for both NORMAL and GUARANTEED restore points):

SQL> drop restore point nr_01;

Restore point dropped.


SQL> SELECT NAME, SCN, TIME, DATABASE_INCARNATION#,
  2         GUARANTEE_FLASHBACK_DATABASE, STORAGE_SIZE
  3    FROM V$RESTORE_POINT;

no rows selected

Now let’s create a GUARANTEED restore point WITHOUT enabling FLASHBACK DATABASE. This requires the database to be mounted but not open, and since this is a RAC database, all other instances must be shut down first as well:

rac1.XRACP1-> srvctl stop database -d XRACP
rac1.XRACP1-> sqlplus /nolog

SQL*Plus: Release 10.2.0.4.0 - Production on Fri Jun 12 20:01:53 2009

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.

SQL> connect / as sysdba
Connected to an idle instance.
SQL> startup mount;
ORACLE instance started.

Total System Global Area 1610612736 bytes
Fixed Size                  2084296 bytes
Variable Size             822084152 bytes
Database Buffers          771751936 bytes
Redo Buffers               14692352 bytes
Database mounted.
SQL> CREATE RESTORE POINT gr_01 GUARANTEE FLASHBACK DATABASE;

Restore point created.

SQL> shutdown immediate;
ORA-01109: database not open


Database dismounted.
ORACLE instance shut down.
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
rac1.XRACP1-> srvctl start database -d XRACP
rac1.XRACP1-> srvctl start service -d XRACP
rac1.XRACP1->

Now let’s see what we can gather from V$RESTORE_POINT:

SQL> col name format a15
SQL> col time format a35
SQL> set lines 132
SQL> set trims on
SQL> SELECT NAME, SCN, TIME, DATABASE_INCARNATION#,
  2         GUARANTEE_FLASHBACK_DATABASE, STORAGE_SIZE
  3    FROM V$RESTORE_POINT;

NAME                   SCN TIME                                DATABASE_INCARNATION# GUA STORAGE_SIZE
--------------- ---------- ----------------------------------- --------------------- --- ------------
GR_01           9.0317E+12 12-JUN-09 08.02.19.000000000 PM                         1 YES     63766528

SQL>
   

rac1.XRACP1-> find /u02/oradata/rcv_area -ls
5161314    4 drwxr-xr-x   3 oracle   dba          4096 Jun 12 20:02 /u02/oradata/rcv_area
5161316    4 drwxr-x---   3 oracle   oinstall     4096 Jun 12 20:02 /u02/oradata/rcv_area/XRACP
5161317    4 drwxr-x---   2 oracle   oinstall     4096 Jun 12 20:03 /u02/oradata/rcv_area/XRACP/flashback
5161318 15576 -rw-r-----   1 oracle   oinstall 15949824 Jun 12 20:04 /u02/oradata/rcv_area/XRACP/flashback/o1_mf_5365ov16_.flb
237846 15576 -rw-rw----   1 oracle   oinstall 15949824 Jun 12 20:03 /u02/oradata/rcv_area/XRACP/flashback/o1_mf_5365qm48_.flb
4128929 15576 -rw-rw----   1 oracle   oinstall 15949824 Jun 12 20:03 /u02/oradata/rcv_area/XRACP/flashback/o1_mf_5365qtxy_.flb
7227425 15576 -rw-rw----   1 oracle   oinstall 15949824 Jun 12 20:03 /u02/oradata/rcv_area/XRACP/flashback/o1_mf_5365qtyd_.flb
rac1.XRACP1->

Much better now: it is working. And a few days later:

SQL> !date;ls -l /u02/oradata/rcv_area/XRACP/flashback
Mon Jun 15 15:02:17 PDT 2009
total 358248
-rw-r-----  1 oracle oinstall 15949824 Jun 13 04:00 o1_mf_5365ov16_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 12 20:03 o1_mf_5365qm48_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 12 20:03 o1_mf_5365qtxy_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 12 20:03 o1_mf_5365qtyd_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 12 22:00 o1_mf_536dm0jm_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 12 22:01 o1_mf_536dmnxj_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 13 02:00 o1_mf_536dntxs_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 12 23:03 o1_mf_536j9b8b_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 12 23:28 o1_mf_536j9jvj_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 13 06:00 o1_mf_536kqnmf_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 13 15:10 o1_mf_536tnpky_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 13 21:00 o1_mf_5371or32_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 13 23:03 o1_mf_5378qrx2_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 14 12:00 o1_mf_537m8tox_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 14 07:33 o1_mf_5388yvsd_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 14 19:00 o1_mf_538xgm1v_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 14 02:00 o1_mf_5394pdx7_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 14 23:04 o1_mf_539h0tcs_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 15 04:00 o1_mf_53b2lorr_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 15 15:00 o1_mf_53bl5v8y_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 15 15:00 o1_mf_53cbs5hs_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 15 15:00 o1_mf_53cs3y00_.flb
-rw-rw----  1 oracle oinstall 15949824 Jun 15 15:00 o1_mf_53dbfmx5_.flb

SQL> SELECT NAME, SCN, TIME, DATABASE_INCARNATION#,
  2         GUARANTEE_FLASHBACK_DATABASE, STORAGE_SIZE
  3    FROM V$RESTORE_POINT;

NAME                   SCN TIME                                DATABASE_INCARNATION# GUA STORAGE_SIZE
--------------- ---------- ----------------------------------- --------------------- --- ------------
GR_01           9.0317E+12 12-JUN-09 08.02.19.000000000 PM                         1 YES    366657536

SQL> select * from v$flash_recovery_area_usage;

FILE_TYPE    PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
------------ ------------------ ------------------------- ---------------
CONTROLFILE                   0                         0               0
ONLINELOG                     0                         0               0
ARCHIVELOG                    0                         0               0
BACKUPPIECE                   0                         0               0
IMAGECOPY                     0                         0               0
FLASHBACKLOG               1.22                         0              23

6 rows selected.

SQL>
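
The 1.22 reported for FLASHBACKLOG lines up with the STORAGE_SIZE of GR_01 relative to the 28g FRA limit. A quick sketch of that arithmetic, plus a live version against the FRA view (the alias pct_used is my own):

```sql
-- STORAGE_SIZE of GR_01 (366657536 bytes) as a share of the 28g FRA limit.
-- Evaluates to 1.22.
SELECT ROUND(366657536 / (28*1024*1024*1024) * 100, 2) AS pct_used FROM dual;

-- The same percentage computed live for the whole FRA.
SELECT name, ROUND(space_used / space_limit * 100, 2) AS pct_used
  FROM v$recovery_file_dest;
```

Watching this figure over time is a simple way to catch a guaranteed restore point that is slowly filling the FRA.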

And that’s how you create a guaranteed restore point. But what do you use it for with regard to a STANDBY FAILOVER? You really shouldn’t, because you don’t know when a FAILOVER will occur, so you can’t create a restore point right before it. You should, however, use a guaranteed restore point during a SWITCHOVER, because during a switchover you are in full control of both the primary and standby databases. In a FAILOVER scenario, enabling Flashback Database [ ALTER DATABASE FLASHBACK ON; ] is more effective than trying to create guaranteed restore points.
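
For the switchover case, the bracketing could be sketched as follows. The name before_switchover is hypothetical, and if Flashback Database is off, the first guaranteed restore point has to be created in mount mode:

```sql
-- Hypothetical restore point name: before_switchover.
-- With Flashback Database off, create the first guaranteed restore point in mount mode.
CREATE RESTORE POINT before_switchover GUARANTEE FLASHBACK DATABASE;

-- ... perform and verify the switchover here ...

-- Check what the guarantee is costing in FRA space.
SELECT name, scn, guarantee_flashback_database, storage_size
  FROM v$restore_point;

-- Drop it once the switchover has been verified.
DROP RESTORE POINT before_switchover;
```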

Still have questions? Here are pointers to additional reading with excerpts:

From Note:565535.1 titled “Flashback Database Best Practices & Performance”:

Manual primary database reinstate- if Data Guard is being used without the
fast-start failover feature and a Data Guard failover is necessary, then
flashback database can be used to manually reinstate the failed primary
database. This is documented in the Data Guard Administration and Concepts
guide – see “12.4 Using Flashback Database After a Failover” here:

And from 12.4 Using Flashback Database After a Failover:

After a failover occurs, the original primary database can no longer
participate in the Data Guard configuration until it is repaired and
established as a standby database in the new configuration. To do
this, you can use the Flashback Database feature to recover the
failed primary database to a point in time before the failover
occurred, and then convert it into a physical or logical standby
database in the new configuration.
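
A rough sketch of that reinstatement for a physical standby, following the outline of the documented procedure as I understand it. It assumes Flashback Database was enabled on the old primary before the failover, and the substitution variable name is my own; check the Data Guard guide for your release before relying on it:

```sql
-- On the NEW primary: find the SCN at which it became the primary.
SELECT TO_CHAR(standby_became_primary_scn) FROM v$database;

-- On the FAILED (old) primary: mount it and rewind to that SCN.
-- &standby_became_primary_scn is a SQL*Plus substitution variable;
-- supply the value returned by the query above when prompted.
STARTUP MOUNT
FLASHBACK DATABASE TO SCN &standby_became_primary_scn;

-- Convert the old primary into a physical standby and restart it in mount mode.
ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
SHUTDOWN IMMEDIATE
STARTUP MOUNT

-- Start redo apply so it catches up with the new primary.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```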

August 22, 2009

Posted In: Data Guard, RAC


Oracle RAC’s share everything vs share nothing …

Google’s share nothing approach to application development has led to the #1 search engine solution in both performance and functionality. Notice that I said “application development”, because for the share nothing approach to work it needs to be built into the application from day one, not as an afterthought.

On the other end of the spectrum we have ERP APPS, where designs with thousands of tables per module are the norm and the UNION ALL joins span a multi-page printout. In these types of applications Oracle RAC’s “share everything” approach is clearly superior, scratch that, it’s the only solution, period.

For an interesting read on this specific issue take a look at Kevin Closson’s post titled “Nearly Free or Not, GridSQL for EnterpriseDB is Simply Better Than Real Application Clusters. It is Shared-Nothing Architecture After All!” and a sort of reply to it by a blogger called “bonglonglong” titled “All in the assumptions”.

June 15, 2009

Posted In: RAC
