The Blog from the DBA Classroom

By: Joel Goodman

More on Oracle Restart

Posted by Joel Goodman on 06/09/2010

In my last post entitled “Getting Started with Oracle Restart” I discussed the architecture of Oracle Restart, which is a component of Grid Infrastructure for a Standalone Server, and contrasted it with that of Oracle Clusterware, which is a component of Grid Infrastructure for a Cluster. I am pleased to note that Uwe Hesse,  my friend and colleague at Oracle University,  has written a post called “Data Guard & Oracle Restart in 11gR2” where he discusses in detail the use of Oracle Restart for Role Based Services in Data Guard. It is an excellent article on this feature of Data Guard,  and requires the use of Oracle Restart.

I think this is essential reading and follows on nicely from my article, so please visit Uwe’s blog and have a look.

Joel

September 2010

Posted in Oracle | Leave a Comment »

Get Started With Oracle Restart

Posted by Joel Goodman on 13/08/2010

In my previous post called “How Does Your Infrastructure Stack Up” , I discussed the Architecture of Oracle Grid Infrastructure for a cluster, and explained the concepts of the lower and upper software stacks, as well as those components managed by the two stacks.  This software, which is a combination of Oracle Clusterware and Automatic Storage Management (ASM), is required for running Oracle Databases with the Real Application Clusters (RAC) option, and may be used to provide high availability for other Oracle and third party software too.

Grid Infrastructure is also available for installation on a stand-alone server and consists of the following:

  1. A lower stack only,  managed by the Oracle High Availability Serviced daemon OHASD
  2. A set of agents to manage root-owned and non root-owned resources.
  3. A subset of the clusterware to manage resources on a standalone server
  4. A subset of the clusterware to facilitate ASM communication with Oracle Database (RDBMS) Instances

As there is only a single stack in Grid Infrastructure for a Standalone Server, OHASD is used to manage it and there is no CRSD process required. The components under control of OHASD are:

  1. CSSD – This is used for Group Services as it was in previous releases (when it was installed using “localconfig add“)
  2. ASM Instance - if Automatic Storage Management is used.
  3. ASM Disk Groups - if Automatic Storage Management is used.
  4. Listeners
  5. Database Instances
  6. Database Services
  7. ONS/EONS – Used for automatic failover of connections  using Fast Application Notification (FAN) in a Data Guard environment

Grid Infrastructure for a standalone server is a combination of a Clusterware subset known as “Oracle Restart” and ASM.  The OHASD is a daemon which starts and monitors the Oracle Restart daemons themselves. It is started by init using the /etc/init.d/ohasd script and starts the ohasd.bin executable as root. It is described in the documentation in the Administrators guide in a section called “About Oracle Restart“. No OCR exists for this stack but there is an OLR containing the local resource definitions and this is used by the ohasd daemons.

There are many benefits to using Oracle Restart:

  1. Automatic resource startup at boot time without using shell scripts or the Oracle supplied dbstart and dbshut scripts.
  2. Resources are started in the correct sequence based on dependencies in the OLR.
  3. Resources are also monitored by ohasd for availability and may be restarted in place if they fail.
  4. Role managed services for Data Guard.
  5. Consistency of command line interfaced tools using crsctl and srvctl as is done with clusters.

A list of commands used to manage the Oracle Restart stack are  listed in the documentation here.  The CRSCTL utility is used but instead of referring to CRS, the commands refer to HAS. These commands are only for use in an  Oracle Restart environment:

  1. crsctl start has – to manually start the Oracle Restart stack when running disabled or after manually stopping it
  2. crsctl stop has [-f] – to manually stop the Oracle Restart stack. The -f option
  3. crsctl enable has – to enable the stack for automatic startup at server reboot
  4. crsctl disable has – to disable the stack for automatic startup at server reboot
  5. crsctl config has - to display the configuration of Oracle Restart
  6. crsctl check has – to check the current status of Restart

Some of the crsctl commands used for clusters may also be used for Oracle Restart. A typical example follows but there are many more:

  • crsctl stat res -t – to check the status of the resources in the Oracle Restart stack managed by OHASD

Note also that “crsctl stat res -init -t” does not work for Oracle Restart as there is only one stack, and therefore no need to indicate which stack is to be displayed.

Here is a typical output of the OHASD stack for Oracle Restart with ASM, some Disk Groups, a single database instance and a service:

—————————————————————————————————————————-

NAME                   TARGET  STATE           SERVER                   STATE_DETAILS

—————————————————————————————————————————-

Local Resources

—————————————————————————————————————————–

ora.CELL_DG.dg

ONLINE  ONLINE       prutser2

ora.DATA.dg

ONLINE  ONLINE       prutser2

ora.FRA.dg

ONLINE  ONLINE       prutser2

ora.LISTENER.lsnr

ONLINE  ONLINE       prutser2

ora.asm

ONLINE  ONLINE       prutser2                     Started

—————————————————————————————————————————-

Cluster Resources

—————————————————————————————————————————-

ora.cssd

1                   ONLINE  ONLINE       prutser2

ora.diskmon

1                   ONLINE  ONLINE       prutser2

ora.ora11g.db

1                   ONLINE  ONLINE       prutser2                     Open

ora.ora11g.ora11g.svc

1                   ONLINE  ONLINE       prutser2

In addition srvctl is used to manage the resources in the various Oracle Home directories that are controlled by Oracle Restart and these are in the same section of the documentation here. The commands used are the same as for a cluster except that resources such as database instances, will be configured on only the one node.

Even if ASM is not used, Oracle Restart is a useful addition to Oracle Database and resource management. So get started!

Joel

August 2010

Posted in Oracle | 2 Comments »

How Does Your Infrastructure Stack Up?

Posted by Joel Goodman on 02/06/2010

The 11g Release 2 version of the Oracle RAC and Grid Infrastructure course went live earlier this year, and has generated much discussion concerning several aspects of the release. In this post, I would like to share some observations about the software based on  my research and teaching experience during the past five months. The new release of the Grid Infrastructure consists of:

  1. A New “Local” resource management layer, known as the OHASD
  2. A new set of Agents which replace the RACG Layer
  3. Support for new features: Grid Plug and Play, Grid Naming Service, Grid IPC and Cluster Time Synchronisation Service
  4. Integration of ASM and the Clusterware to form the Grid Infrastructure
  5. A reworked Cluster Ready Services Daemon (CRSD)
  6. Automatically managed Server Pools
  7. Support for Intelligent Platform Management Interface (IPMI), for node fencing and node termination

This is an extensive change to the clusterware from previous releases, and is a very large topic so in this blog post, I will restrict the discussion to the New Local Resource management layer, called the “Lower Stack” and how it relates to the “Upper Stack“.

The Lower Stack – Managed by OHASD

The 11gR2 Grid Infrastructure consists of a set of daemon processes which execute on each cluster node;  the voting and OCR files, and protocols used to communicate across the interconnect. Prior to 11gR2, there were various scripts run by the init process to start and monitor the health of the clusterware daemons. From 11gR2, the Oracle High Availability Services Daemon (OHASD) replaces these. The OHASD starts, stops and checks the status of all the other daemon processes that are part of the clusterware using new agent processes listed here:

  1. CSSDAGENT – used to start,stop and check status of  the CSSD resource
  2. ORAROOTAGENT – used to start “Lower Stack” daemons that must run as root: ora.crsd, ora.ctssd, ora.diskmon, ora.drivers.acfs, ora.crf
  3. ORAAGENT – used to start “Lower Stack” daemons that run as the grid owner: ora.asm, ora.evmd, ora.gipcd, ora.gpnpd, ora.mdnsd
  4. CSSDMONITOR - used to monitor the CSSDAGENT

The OHASD is essentially a daemon which starts and monitors the clusterware daemons themselves. It is started by init using the /etc/init.d/ohasd script and starts the ohasd.bin executable as root. The Oracle documentation lists the “Lower Stack” daemons where they are referred to as the “The Oracle High Availability Services Stack” and notes which agent is responsible for starting and monitoring each specific daemon. It also explains the purpose of each of the stack components. (Discussions of some of these components will feature in future blog posts.) If the grid infrastructure is enabled on a node, then OHASD starts the “Lower Stack” on that node at boot time. If disabled, then the “Lower Stack” is started manually. The following commands are used for these operations:

  • crsctl enable crs – enables autostart at boot time
  • crsctl disable crs – disables autostart at boot time
  • crsctl start crs - manually starts crs on the local node

The “Lower Stack” consists of daemons which communicate with their counterparts on other cluster nodes. These daemons must be started in the correct sequence, as some of them depend on others. For example, the Cluster Ready Services Daemon (CRSD), may depend on ASM being available if the OCR file is stored in ASM. Clustered ASM in turn, depends on the Cluster Synchronisation Services Daemon(CSSD),  as the CSSD must be started in order for  clustered ASM to start up. This dependency tree is similar to that which already existed for the resources managed by the CRSD itself,  known as the “Upper Stack“, which will be discussed later in this post.

To define the dependency tree for the “Lower Stack“, a local repository called the OLR is used. This contains the metadata required by OHASD to join the cluster and configuration details for the local software. As a result,  OHASD can start the “Lower Stack” daemons without reference to the OCR. To examine the OLR use the following command, and then examine the dump file produced:

  • ocrdump -local <FILENAME>

Another benefit of the OHASD, is that there is a daemon running on each cluster node whether or not the “Lower Stack” is started. As long as the OHASD daemon is running, then the following commands may be used in 11gR2:

  1. crsctl check has – check the status of the OHASD
  2. crsctl check crs - check the status of the OHASD, CRSD, CSSD and EVMD
  3. crsctl check cluster – all – this checks the “Lower Stack” on all the nodes
  4. crsctl start cluster – this attempts to start the “Lower Stack” on all the nodes
  5. crsctl stop cluster - this attempts to stop the “Lower Stack” on all the nodes

Here are some examples:

# crsctl check has
CRS-4638: Oracle High Availability Services is online

# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

# crsctl check cluster -all
**************************************************************
racn1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
racn2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

**************************************************************

racn3:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************

To check the status of the “Lower Stack“  resources use the following:

  • crsctl stat res -init -t

An example is shown here:

# crsctl stat res -init -t

NAME                TARGET STATE  SERVER  STATE_DETAILS
————————————————————————————-

————————————————————————————-
Cluster Resources
————————————————————————————-
ora.asm              ONLINE ONLINE racn1    Started

ora.crsd              ONLINE ONLINE racn1

ora.cssd              ONLINE ONLINE racn1

ora.cssdmonitor   ONLINE ONLINE racn1

ora.ctssd             ONLINE ONLINE racn1    OBSERVER

ora.diskmon        ONLINE ONLINE racn1

ora.drivers.acfs    ONLINE ONLINE racn1

ora.evmd             ONLINE ONLINE racn1

ora.gipcd             ONLINE ONLINE racn1

ora.gpnpd            ONLINE ONLINE racn1

ora.mdnsd           ONLINE ONLINE racn1

To start the CSSD Daemon requires access to the Voting Files which may be stored in ASM. But a clustered ASM Instance may not start until the node has joined the cluster which requires that CSSD be up. To get around this problem, ASM Diskgroups are flagged to indicate that they contain Voting Files. The ASM Discovery string is contained in the OLR and used to scan for the ASM Disks when CSSD starts. The scan locates the flags indicating the presence of Voting Files which are stored at a fixed location in the ASM Disks. This process does not require the ASM instance to be up. Once the Voting Files are found by this scanning process, CSSD can access them, join the cluster and then the ORAAGENT can start the Clustered ASM Instance.

To check the location of the Voting Files use the following:

# crsctl query css votedisk
## STATE File Universal Id                               File Name                                Disk group
– —– ——————————————————————————————————
1. ONLINE 5b91aad0a2184f3dbfa8f970e8ae4d49 (/dev/oracleasm/disks/VOTE1) [VOTEDG]
2. ONLINE 53b1b40b73164f9ebf3f498f6d460187 (/dev/oracleasm/disks/VOTE2) [VOTEDG]
3. ONLINE 82dfd04b96f14f6dbf36f5a62b118f61 (/dev/oracleasm/disks/VOTE3) [VOTEDG]


The Upper Stack – Managed by CRSD

The “Upper Stack” consists of the daemons and resources managed by the Grid Infrastructure, once it is up and running. It uses the same architecture as OHASD, but CRSD uses its own threads of the agents to start up, stop and check the status of  the daemons and resources as follows:

  1. ORAROOTAGENT – used to start “Upper Stack” daemons that must run as root: GNS, VIP, SCAN VIP and network resources
  2. ORAAGENT – used to start “Upper Stack” daemons that run as grid owner: ora.asm, ora.eons, ora.LISTENER.lsnr, SCAN listeners, ora.ons, ASM  Diskgroups, Database Instances, Database Services. It is also used to publish High Availability events to interested clients and manages Cluster Ready Service changes of state.

The resources managed by the CRSD for the “Upper Stack” are also listed in the Oracle Documentation where they are referred to as “The Cluster Ready Services Stack” and consist of familiar resources such as Database Instances, Database Services and NodeApps such as Node Listeners. There are also some new resources such as the Single Client Access Name (SCAN), SCAN Vips, Grid Naming Service (GNS), GNS Vips and Network Resources. Some of these will be the subject of future Blog posts.

The resources managed by the “Upper Stack” are in the OCR file which may be stored in ASM. Since the Clustered ASM Instance is started by OHASD after CSSD is started but before CRSD is started, access to the OCR by CRSD is done as a normal client of ASM. The OCR file may be seen as a file in ASM, unlike the Voting Files which are not “visible” when looking at the ASM directory contents using either Enterprise Manager or the ASMCMD utility.

To check the location of the OCR do the following:

# cat /etc/oracle/ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE

CRSD Resource Categories

CRSD resources are categorised as “Local Resources” or”Cluster Resources“. Local Resources are activated on a specific node and never fail over to another node. For example, an ASM Instance exists on each node, so if a node fails, then the ASM Instance that was on that node will not fail over to a surviving node. Likewise, a Node Listener exists for each node and does not fail over. These two resource types are therefore “Local Resources“.  SCAN Listeners however, may fail over as may Database Instances or the GNS (if used), so these are “Cluster Resources

Finally to check the status of the “Upper Stack” resources and daemons, do the following:

# ./crsctl status resource -t
——————————————————————————–
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
——————————————————————————–
Local Resources
——————————————————————————–
ora.DATADG.dg
ONLINE  ONLINE       racn1
ONLINE  ONLINE       racn2
ora.LISTENER.lsnr
ONLINE  ONLINE       racn1
ONLINE  ONLINE       racn2
ora.FRADG.dg
ONLINE  ONLINE       racn1
ONLINE  ONLINE       racn2
ora.asm
ONLINE  ONLINE       racn1                  Started
ONLINE  ONLINE       racn2                  Started
ora.eons
ONLINE  ONLINE       racn1
ONLINE  ONLINE       racn2
ora.gsd
OFFLINE OFFLINE      racn1
OFFLINE OFFLINE      racn2
ora.net1.network
ONLINE  ONLINE       racn1
ONLINE  ONLINE       racn2
ora.ons
ONLINE  ONLINE       racn1
ONLINE  ONLINE       racn2
ora.registry.acfs
ONLINE  ONLINE       racn1
ONLINE  ONLINE       racn2
——————————————————————————–
Cluster Resources
——————————————————————————–
ora.LISTENER_SCAN1.lsnr
1        ONLINE  ONLINE       racn1
ora.LISTENER_SCAN2.lsnr
1        ONLINE  ONLINE       racn2
ora.LISTENER_SCAN3.lsnr
1        ONLINE  ONLINE       racn2
ora.oc4j
1        OFFLINE OFFLINE
ora.rac.db
1        ONLINE  ONLINE       racn1                  Open
2        ONLINE  ONLINE       racn2                  Open
ora.racn1.vip
1        ONLINE  ONLINE       racn1
ora.racn2.vip
1        ONLINE  ONLINE       racn2
ora.scan1.vip
1        ONLINE  ONLINE       racn1
ora.scan2.vip
1        ONLINE  ONLINE       racn2
ora.scan3.vip
1        ONLINE  ONLINE       racn2

There is much more to know about the Grid Infrastructure, so watch this space in future months.

Joel

02/06/2010

Posted in Oracle | 5 Comments »

Be Foxy; Try Proxy

Posted by Joel Goodman on 22/04/2010

I was speaking at the Norwegian Oracle User Group Last week, having been invited  to present talks on RMAN New Features for Oracle 11g and Partitioning New Features for Oracle 11g. The Partitioning talk had much of the typical questions and answers concerning the new partitioning methods. The RMAN session however, was somewhat unusual in that one of the most interesting discussions arising from the talk was “off topic” in that it did not concern a New 11g RMAN feature. During the RMAN talk I mentioned that the “Active Data Guard” option permits one to perform “Fast Incremental Backups” with Data Guard on a physical standby database. This initiated a discussion of how “Block Change Tracking” works in Oracle generally and additionally how Exadata Storage Cells improve the granularity by tracking at the individual block level rather than at the range of blocks level.

The point of the discussion was that the backups are made less of an overhead by:

  1. Tracking the blocks that have changed, thereby reducing the read overheads when doing backups.
  2. Offloading the entire process to a physical standby, thereby removing the entire overhead of the backup from the primary.
  3. Reducing the overheads further if using Exadata by having more granular identification of which blocks had changed.

We discussed this for a while and then I informed the 50 or so delegates that there is still the overhead of reading and writing the blocks that get backed up from the storage servers into the database server RMAN Channel processes. It was then that the subject of RMAN “Proxy Copy” came up and to my surprise few if any delegates knew about it although this has occurred in the past in some Oracle University classes. So I went a bit off topic and described how this works and then followed this up with some offline discussions after the talk ended.

  1. The database, or specific tablespaces may be backup up by having the Media Manager do it by “proxy” on behalf of RMAN
  2. RMAN requests that each file  in the database or tablespace is copied by the Media Manager
  3. The Media Manager then copies the file using whichever method it supports, such as breaking mirrors and resilvering
  4. When RMAN receives a reply that a file has been copied it sends an “info” request to the Media Manager asking for the file  name of the backup just completed
  5. RMAN then adds the information about the copy to the catalog where it is listed as a “Proxy Copy

The benefit of “Proxy Copy” is that the Media Manager may make the copy from the data files or archive logs directly to a backup location without necessarily passing the data from the storage network to the database servers and this reduces the traffic in the storage network. But balanced against this must be the elapsed time and backup volume. Lets compare Proxy Copy to incremental backups to see the pros and cons:

Incremental Backups:

  • Level Zero Incremental backups back up all formatted blocks and blocks that must be backed up for all the files covered by the backup command. This can lead to substantial network traffic on the storage network whenever the level zero backup is taken.
  • Level one Incremental backups back up only those blocks in the same range as blocks modified since the last backup.
  • Level one Incremental backups done in an Exadata Database Machine environment backup only those blocks that have changed since the last backup.
  • The degree of network traffic on a level one backup is proportional to the “delta” , ie what percentage of blocks in the database change between each backup

Proxy Copies:

  • Proxy Copies back up all the blocks of the files included in the backup command. For Database backups all the blocks are read and written as occurs when performing an RMAN Image Copy.
  • Proxy Copies however can reduce the network traffic in the storage network
  • The Media Manager products that use the Oracle System Backup to Tape (SBT) Application Programming Interface do not necessarily all support Proxy Copy as it is not mandatory for this to be included. When the RMAN channel process connects to the media manager it uses the “sbtinit” call to shake hands with the media manager and determine the capabilities provided by the vendor.
  • One may use the “sbttest” utility for example to see this and here is an example of the handshake between RMAN in 11.1.07 and the oracle.disksbt simulator supplied by Oracle which had remained static since Oracle 8i. Note that for this to work the environment variable BACKUP_DIR must be set to a directory where the simulator will write the “media” including ordinary backupsets to media and Proxy Copies.

tutor5d> sbttest sbt_test.file -trace sbtio.log -dbname tutor5d -libname oracle.disksbt -remove_before -no_remove_after -block_size 4096 -block_count 1000

The sbt function pointers are loaded from oracle.disksbt library.
– sbtinit succeeded
– sbtinit (2nd time) succeeded
sbtinit: vendor description string=WARNING: Oracle Test Disk API
sbtinit: Media manager is version 8.1.3.0
sbtinit: Media manager supports SBT API version 2.0
sbtinit: allocated sbt context area of 4160 bytes
sbtinit: proxy copy is supported
– sbtinit2 succeeded

The key part of the handshake is the second to last line showing that Proxy Copy is supported by this “Media Manager”.

To take a Proxy Copy one uses the following syntax:

backup proxy [only] [database | tablespace | datafile | datafilecopy | archivelog | controlfilecopy and some others followed by possible options. The “only” keyword means that id the media manager does not support Proxy Copy then an error is returned. Without the “only” RMAN would request an ordinary backupset in the event that Proxy Copy was not supported.

Here is an example of a proxy copy for the database:

RMAN> backup proxy only database;

Starting backup at 22-APR-10
using channel ORA_SBT_TAPE_1
current control file cannot be backed up by proxy.
SPFILE cannot be backed up by proxy.
channel ORA_SBT_TAPE_1: starting full proxy datafile backup at 22-APR-10
channel ORA_SBT_TAPE_1: specifying datafile(s) for proxy backup
input datafile file number=00001 name=/home/tutor5d/data/TUTOR5D_NLEDU02/datafile/o1_mf_system_3cxr3bkq_.dbf
proxy file handle=08lbpqs6_1_1
input datafile file number=00002 name=/home/tutor5d/data/TUTOR5D_NLEDU02/datafile/o1_mf_sysaux_3cxr3tv8_.dbf
proxy file handle=08lbpqs6_2_1
input datafile file number=00004 name=/home/tutor5d/data/TUTOR5D_NLEDU02/datafile/o1_mf_example_3cxs05ds_.dbf
proxy file handle=08lbpqs6_3_1
input datafile file number=00003 name=/home/tutor5d/data/TUTOR5D_NLEDU02/datafile/o1_mf_undo_3cxr45jp_.dbf
proxy file handle=08lbpqs6_4_1
channel ORA_SBT_TAPE_1: proxy copy complete, elapsed time: 00:00:56
channel ORA_SBT_TAPE_1: starting full datafile backup set
channel ORA_SBT_TAPE_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_SBT_TAPE_1: starting piece 1 at 22-APR-10
channel ORA_SBT_TAPE_1: finished piece 1 at 22-APR-10
piece handle=09lbpqtu_1_1 tag=TAG20100422T212214 comment=API Version 2.0,MMS Version 8.1.3.0
channel ORA_SBT_TAPE_1: backup set complete, elapsed time: 00:00:01
Finished backup at 22-APR-10

RMAN>

To view the Proxy Copies already taken use the RMAN List Backup command as follows:

RMAN> list backup;

List of Backup Sets
===================

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
——- —- — ———- ———– ———— —————
4       Full    7.75M      SBT_TAPE    00:00:02     22-APR-10
BP Key: 4   Status: AVAILABLE  Compressed: NO  Tag: TAG20100422T212214
Handle: 09lbpqtu_1_1   Media:
SPFILE Included: Modification time: 01-FEB-10
SPFILE db_unique_name: TUTOR5D_NLEDU02
Control File Included: Ckp SCN: 2143889      Ckp time: 22-APR-10

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
——- —- — ———- ———– ———— —————
5       Full    200.50M    SBT_TAPE    00:00:15     22-APR-10
BP Key: 5   Status: AVAILABLE  Compressed: NO  Tag: TAG20100422T212721
Handle: 0albpr5q_1_1   Media:
List of Datafiles in backup set 5
File LV Type Ckp SCN    Ckp Time  Name
—- — —- ———- ——— —-
1       Full 2144034    22-APR-10 /home/tutor5d/data/TUTOR5D_NLEDU02/datafile/o1_mf_system_3cxr3bkq_.dbf

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
——- —- — ———- ———– ———— —————
6       Full    7.75M      SBT_TAPE    00:00:02     22-APR-10
BP Key: 6   Status: AVAILABLE  Compressed: NO  Tag: TAG20100422T212721
Handle: 0blbpr6j_1_1   Media:
SPFILE Included: Modification time: 01-FEB-10
SPFILE db_unique_name: TUTOR5D_NLEDU02
Control File Included: Ckp SCN: 2144058      Ckp time: 22-APR-10

List of Proxy Copies
====================

PC Key  File Status      Completion Time Ckp SCN    Ckp Time
——- —- ———– ————— ———- —————
10      1    AVAILABLE   22-APR-10       2143862    22-APR-10
Datafile name: /home/tutor5d/data/TUTOR5D_NLEDU02/datafile/o1_mf_system_3cxr3bkq_.dbf
Handle: 08lbpqs6_1_1   Media: 3380-3
Tag: TAG20100422T212214

11      2    AVAILABLE   22-APR-10       2143862    22-APR-10
Datafile name: /home/tutor5d/data/TUTOR5D_NLEDU02/datafile/o1_mf_sysaux_3cxr3tv8_.dbf
Handle: 08lbpqs6_2_1   Media: 3380-3
Tag: TAG20100422T212214

13      3    AVAILABLE   22-APR-10       2143862    22-APR-10
Datafile name: /home/tutor5d/data/TUTOR5D_NLEDU02/datafile/o1_mf_undo_3cxr45jp_.dbf
Handle: 08lbpqs6_4_1   Media: 3380-3
Tag: TAG20100422T212214

12      4    AVAILABLE   22-APR-10       2143862    22-APR-10
Datafile name: /home/tutor5d/data/TUTOR5D_NLEDU02/datafile/o1_mf_example_3cxs05ds_.dbf
Handle: 08lbpqs6_3_1   Media: 3380-3
Tag: TAG20100422T212214

PC Key  Thrd Seq     Status      Completion Time
——- —- ——- ———– —————
6       1    242     AVAILABLE   22-APR-10
Handle: 04lbpipf_1_1   Media: 3380-3
Tag: TAG20100422T190415

7       1    243     AVAILABLE   22-APR-10
Handle: 04lbpipf_2_1   Media: 3380-3
Tag: TAG20100422T190415

8       1    244     AVAILABLE   22-APR-10
Handle: 04lbpipf_3_1   Media: 3380-3
Tag: TAG20100422T190415

RMAN>

Notice that the Proxy Copies are listed after the backup sets and are not listed as “copies” as seen here:

RMAN> list copy of database;

RMAN>

But if an RMAN Image copy is taken then this will be listed as follows:

RMAN> CONFIGURE DEFAULT DEVICE TYPE TO disk;

old RMAN configuration parameters:
CONFIGURE DEFAULT DEVICE TYPE TO ‘SBT_TAPE’;
new RMAN configuration parameters:
CONFIGURE DEFAULT DEVICE TYPE TO DISK;
new RMAN configuration parameters are successfully stored
released channel: ORA_SBT_TAPE_1

RMAN> RMAN> backup as copy database;

Starting backup at 22-APR-10
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=74 device type=DISK
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=/home/tutor5d/data/TUTOR5D_NLEDU02/datafile/o1_mf_system_3cxr3bkq_.dbf
output file name=/home/tutor5d/flash/TUTOR5D_NLEDU02/datafile/o1_mf_system_5x196vfy_.dbf tag=TAG20100422T213451 RECID=1 STAMP=717024917
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:35
channel ORA_DISK_1: starting datafile copy
input datafile file number=00002 name=/home/tutor5d/data/TUTOR5D_NLEDU02/datafile/o1_mf_sysaux_3cxr3tv8_.dbf
output file name=/home/tutor5d/flash/TUTOR5D_NLEDU02/datafile/o1_mf_sysaux_5x197ypb_.dbf tag=TAG20100422T213451 RECID=2 STAMP=717024945
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=/home/tutor5d/data/TUTOR5D_NLEDU02/datafile/o1_mf_example_3cxs05ds_.dbf
output file name=/home/tutor5d/flash/TUTOR5D_NLEDU02/datafile/o1_mf_example_5x198qvp_.dbf tag=TAG20100422T213451 RECID=3 STAMP=717024961
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=/home/tutor5d/data/TUTOR5D_NLEDU02/datafile/o1_mf_undo_3cxr45jp_.dbf
output file name=/home/tutor5d/flash/TUTOR5D_NLEDU02/datafile/o1_mf_undo_5x19972q_.dbf tag=TAG20100422T213451 RECID=4 STAMP=717024970
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile copy
copying current control file
output file name=/home/tutor5d/flash/TUTOR5D_NLEDU02/controlfile/o1_mf_TAG20100422T213451_5x199g8c_.ctl tag=TAG20100422T213451 RECID=5 STAMP=717024974
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 22-APR-10
channel ORA_DISK_1: finished piece 1 at 22-APR-10
piece handle=/home/tutor5d/flash/TUTOR5D_NLEDU02/backupset/2010_04_22/o1_mf_nnsnf_TAG20100422T213451_5x199hh4_.bkp tag=TAG20100422T213451 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 22-APR-10

RMAN> list copy of database;

List of Datafile Copies
=======================

Key     File S Completion Time Ckp SCN    Ckp Time
——- —- – ————— ———- —————
1       1    A 22-APR-10       2144262    22-APR-10
Name: /home/tutor5d/flash/TUTOR5D_NLEDU02/datafile/o1_mf_system_5x196vfy_.dbf
Tag: TAG20100422T213451

2       2    A 22-APR-10       2144276    22-APR-10
Name: /home/tutor5d/flash/TUTOR5D_NLEDU02/datafile/o1_mf_sysaux_5x197ypb_.dbf
Tag: TAG20100422T213451

4       3    A 22-APR-10       2144292    22-APR-10
Name: /home/tutor5d/flash/TUTOR5D_NLEDU02/datafile/o1_mf_undo_5x19972q_.dbf
Tag: TAG20100422T213451

3       4    A 22-APR-10       2144286    22-APR-10
Name: /home/tutor5d/flash/TUTOR5D_NLEDU02/datafile/o1_mf_example_5x198qvp_.dbf
Tag: TAG20100422T213451

RMAN>

Finally one may list the Proxy Copies by using the v$proxy_datafile view in the target database as follows:

SYS@tutor5d SQL> select file#,device_type, media,tag,status,
2  start_time,completion_time
3  from v$proxy_datafile
4* order by 1

FILE#            DEVICE_TYPE       MEDIA  TAG      S     START_TIME                            COMPLETION_TIME
———- —————– —— ——————– – —————————– —————————–
1 SBT_TAPE          3380-3  TAG20100422T212214   A 22-apr-2010 21:22:15           22-apr-2010 21:23:02
1 SBT_TAPE          3380-3  TAG20100422T185703   D 22-apr-2010 18:57:06         22-apr-2010 18:58:46
1 SBT_TAPE          3380-3  TAG20100422T211624   D 22-apr-2010 21:16:25            22-apr-2010 21:16:44
2 SBT_TAPE          3380-3 TAG20100422T185703   D 22-apr-2010 18:57:06          22-apr-2010 18:58:48
2 SBT_TAPE          3380-3 TAG20100422T212214   A 22-apr-2010 21:22:15            22-apr-2010 21:23:02
2 SBT_TAPE          3380-3 TAG20100401T171026   D 01-apr-2010 17:10:26           01-apr-2010 17:10:42
3 SBT_TAPE          3380-3 TAG20100422T212214   A 22-apr-2010 21:22:15            22-apr-2010 21:23:02
3 SBT_TAPE          3380-3 TAG20100422T185703   D 22-apr-2010 18:57:06          22-apr-2010 18:58:52
4 SBT_TAPE          3380-3 TAG20100422T212214   A 22-apr-2010 21:22:15          22-apr-2010 21:23:02
4 SBT_TAPE          3380-3 TAG20100422T185703   D 22-apr-2010 18:57:06          22-apr-2010 18:58:50

10 rows selected.

Note that proxy copies of archivelogs are listed in v$proxy_archivedlog.

Further reading is available in the documentation here .

So have fun with your proxy and see if it can help you cut some storage network overheads based on your backup volume, I/O workloads and database change delta.

Joel

22/04/2010

Posted in Oracle | Leave a Comment »

SCAN you LISTEN to this?

Posted by Joel Goodman on 18/03/2010

My good friend and colleague Harald van Breederode recently posted an article on How to Set Up a Private DNS for Your Virtual Cluster where he discussed DNS and the steps he took to create both master and slave DNS servers on his virtual RAC cluster. The article then showed how to statically define IP addresses in the DNS configuration files and how to generate the configuration and zone files. One of the “hostnames” listed had 3 IP addresses; that of the “Single Client Access Name” or SCAN as it is known.

I taught the 11gR2 Grid Infrastructure and RAC course for the first time in EMEA last week and many questions arose about SCANs and SCAN vips as many of the delegates on the course were experienced DBAs who had been using either Oracle 10g or 11g Clusterware and RAC. Some of them were conceptual and some interesting discussions arose about the SCAN vips and SCAN listeners in contrast to vips and listeners in 10g and 11g, so I thought it worth sharing.

1. Architecture for vips and listeners in 10g and 11gr1

In Oracle 10g and 11gR1 each node in the cluster has a Virtual IP Address (VIP) which is activated on the public adaptor and is used by the listener on that node. The vip is also a resource registered in the Oracle Clusterware Repository (OCR) file and the listener usually runs from the ASM home if ASM is used although this can be changed with SRVCTL.

Here is some output from the LSNRCTL utility:

LSNRCTL for Linux: Version 10.2.0.4.0 – Production on 30-SEP-
2009 15:37:15
Copyright (c) 1991, 2008, Oracle. All rights reserved.
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
————————
Alias                                                            LISTENER

Version                                                      TNSLSNR for Linux:

Version                                                      10.2.0.4.0 – Production
Start Date                                                  30-SEP-2009 14:42:04
Uptime                                                       0 days 0 hr. 55 min. 11 sec
Trace Level                                                 off
Security                                                     ON: Local OS Authentication
SNMP                                                          OFF
Listener Parameter File                               /u01/app/10.2.0/asm_1/network/admin/listener.ora
Listener Log File                                         /u01/app/10.2.0/network/log/lsnr.log
Listening Endpoints Summary…
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.226.201)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.226.130)(PORT=1521)))
Services Summary…
Service “+ASM” has 1 instance(s).
Instance “+ASM1″, status READY, has 1 handler(s) for this service…
Service “ora10g.mynode.com” has 1 instance(s).
Instance “ora10g1″, status READY, has 1 handler(s) for this service…
Service “ora10gXDB.mynode.com” has 1 instance(s).
Instance “ora10g1″, status READY, has 1 handler(s) for this service…
The command completed successfully

10g and 11gR1 VIP Usage and Implications

In the output above, IP address 192.168.226.201 is a vip normally used by this listener on this node say node 1 for example. But if node 1 fails, then the vip is failed over to a surviving cluster node which activates the vip. If adaptor ETH0 is the public interface then on the surviving node say node 3, then ETH0:1 will be the virtual adaptor for the vip used by the listener on node 3 and ETH0:2 will be the failed over vip from node 1.

This failover of vips facilitates connect time failover so that a client or middle tier trying to connect to that listener using a load balanced connection will get an error returned from the node to which the vip failed over as no listener is using that vip. In my example node 3 has ETH0:2 for IP address 192.168.226.201 but no listener is listening on that address on node 3 so an error is returned to the client which then tries another listener address at random from the tnsnames.ora entry.

Once the failed node 1 is restarted, the vip is deactivated on node 3, the adaptor ETH0:2 is no longer active and the vip is reactivated on Node 1 as adaptor ETH0:1 and a listener on node 1 once again listens on that vip.

So a vip is used differently when failed over than it is when on its normal “home” node.

Note: This is true for Database vips. Application vips behave the same on any node where they are activated but that is beyond the scope of this discussion.

10g and 11gR1 Listener Usage and Implications

RAC Listeners from Oracle 8i onwards have had several jobs to do in supporting connection requests made by clients and middle tiers to RAC Databases:

1. Managing incoming connection requests including normal and failover connections situations

Connection requests may arrive at any time but can peak if many login requests occur simultaneously. This occurs at instance failure for example if Transparent Application Failover (TAF) is used with BASIC rather than pre-connected sessions. At such a time listeners must handle many requests depending on how many connections existed to the failed instance, and how many surviving nodes and listeners exist. For each connection request the listener must perform a load balance decision (see 2 below) and then either spawn and bequeath or redirect the request (see 3 and 4 below). Occasionally listeners may reject requests if too many are queued and this may be seen on a per service basis using “lsnrctl services” command. This is why TAF parameters exist for RETRY and DELAY. The implications for this are delays to reconnection which affects availability to the user.

2. Making Load Balancing decisions

The Listener makes “Connection Time” load balancing decisions as to which instance will be used for the connection. This decision is made by whichever listener the client or middle tier connects to as all the listeners on all the nodes should be referenced in the REMOTE_LISTENER parameter of each database instance. Since Oracle 10g, the load balancing decision may be made on a “per service” basis as each service can choose from amongst different load balancing methods:

  • Session count for long duration sessions
  • Node run queue length for short sessions with no metrics
  • Service metrics  based on either ELAPSED TIME PER CALL or CPU TIME PER CALL

3. Performing a “Spawn and Bequeath” request if the connection is to be made to an instance on the same cluster node

If the listener decides to connect the client to an instance on the same node due to a load balancing decision or because the only instance accepting logins for the service is on the same node, the the traditional spawn and bequeath method may be used. The listener spawns an oracle process from the Oracle home associated with the instance hosting the service and bequeaths the transport connection used between the client and the listener over to the new process which then communicates with the client.

4. Performing a “Redirect” request if the connection is to be made to an instance on another cluster node

Since Spawn and Bequeath is not possible for cases where the connection is made to an instance on a remote node, the listener making the load balancing decision, must employ the help of the listener on the node hosting the chosen instance. The listener first contacted by the client returns the address of the listener on the chosen node to the client which in turn contacts that listener specifying that it must connect to the specific instance on that node.

The behaviour of the vips and the listeners prior to 11gR2 has the following implications:

  • Database vips do not behave as normal vips. They fail over only to prevent the TCP timeout delay but are not used in the same way as they are when not failed over.
  • The listeners perform many different functions and when a “login storm” occurs after a failure there may be delays and retries because the same listeners are making load balancing decisions and performing spawn and bequeath connection processing
  • The code path and time required to establish a connection to an instance varies depending on whether the chosen instance for the service is hosted on the same node as the chosen listener.

Enter SCAN VIPS and SCANs in 11gR2

From 11gR2 the behaviour of the vips and the listeners is modified with the introduction of SCAN Technology.  This includes:

  • Single Client Access Name
  • SCAN VIPS
  • SCAN Listeners
  • Node or Local Listeners

Part of this architecture follows on from the implications listed above. Essentially there are now two layers of listeners:

  • Up to Three SCAN listeners – which manage the original client connection requests and which make load balancing decisions. Client connect requests always go to the SCAN listeners (except for the upgrade situation mentioned below). Much has already been written about the SCAN addresses but to summarise there can be:
  1. A single SCAN address, vip resource and listener if the server side hosts file is used to resolve the scan name but in this case the SCAN LISTENER is not highly available if the hosting node fails or the listener goes down.
  2. Three SCAN addresses statically defined in a DNS configuration with the same SCAN name as was demonstrated in Harald’s article.  Note: It is also possible to have the three static SCAN addresses managed in the corporate DNS and not in a qualified subdomain or zone. This will work perfectly for rather static clusters and is probably a good choice for many customers.
  3. Three SCAN addresses automatically acquired from DHCP and managed in the clusterware using GNS and mDNS and requiring a delegated subdomain in the corporate DNS server configuration as the IP addresses and host names will not be known outside the cluster until resolved at least once and cached in the DNS Servers for their TTL duration. This choice facilitates Grid Plug and Play (gPNP) whereby new nodes may be added to the cluster without requiring any changes to the corporate DNS configuration as DHCP assigns the IP Addresses and the host names to IP address mapping is updated dynamically by using mDNS protocol so that GNS is then able to resolve the new host names properly. It requires only a one off setup of the delegated subdomain in the corporate DNS setup and DHCP setup as well. Note that this requires the “Advanced” configuration be chosen when installing the Grid Infrastructure.
  4. Each SCAN listener has a SCAN VIP and both are so called “CLUSTER RESOURCES” in the OCR and which may be displayed using the CRSCTL command.
  5. Three SCANs and SCAN Listeners was chosen as the optimal number to make sure that at least two SCANs and SCAN listeners should be active to handle login storms if a node hosting a SCAN fails.
  • Node listeners on each node which connect clients to the chosen instance – these listeners are not normally contacted by the clients at the initial connection request and not referenced by the REMOTE_LISTENER parameter except for upgraded RAC databases from prior releases. Each local listener also has a vip and these are examples of “LOCAL RESOURCES” in the OCR.

The behaviour of these components differs from the description of listeners and vips prior to 11gR2.

  1. Clients always connect to SCAN Listeners (except for the upgrade situation mentioned above) and the DNS resolution achieves a round robin client side load balance. This relieves the client from requiring a tnsnames.ora entry with an address list for the listeners and permits load balancing using simple connect strings such as “EZ CONNECT” style or Java style syntax.
  2. SCAN Listeners always use the REDIRECT method for passing the connection request on to a Node listener even if the Node Listener is on the same node as the SCAN Listener. This has the effect of evening out the code path for connection requests.
  3. If a Node fails, then the SCAN Vip fails over to another node as in the pre-11gr2 case but the SCAN Listener fails over as well. This means that the SCAN vip behaves the same on all nodes IE it has a listener listening on the vip and is in marked contrast to the case prior to 11gR2.
  4. Node Listeners only connect clients or middle tiers to the instance hosting the service chosen by the SCAN listener performing load balancing. The separation of load balancing and initial connection handling from the “spawn and bequeath” processing means that SCAN listeners have less  to do when a login storm occurs since all they do is send out redirect packets. By having two layers of listeners, the system can be more responsive than with one layer of listeners by effectively parallelising all the connection requests.

A final word about “CLUSTER RESOURCES” and “LOCAL RESOURCES”.

when doing “crsctl stat res -t” the output is divided into these two categories.

  1. CLUSTER RESOURCES are those which can run on any node of a cluster and may fail over. Examples are SCAN vips, SCAN Listeners, GNS vip, GNS, Database instances, Database services and more. There is no requirement for multiple occurrences of a resource for it to be considered a CLUSTER resource. For example there is only ever one GNS and one GNS Vip but they both can be anywhere on the cluster. Or there are three SCAN vips and SCAN Listeners regardless of the number of cluster nodes and they may run anywhere. Or the database instances for a database may run on any servers in the server pool to which it is assigned.
  2. LOCAL RESOURCES are those which either run on a specific node or do not run at all. Examples are Local listeners which either run or don’t run, but which don’t fail over as do SCAN listeners. Or ASM Instances which run one per node or don’t run if the node is down but which do not fail over. An Example of a Local resource on only one node may be when a node is down or a resource is down on another node.

So there are both Local and Cluster resources running on multiple nodes some may run on only one node.

We certainly had fun last week discussing and debating the SCAN Architecture and the changes from previous releases and it is hoped that readers will enjoy some of these observations.

Joel

18/03/2010

Posted in Oracle | 5 Comments »

Tell RAC to Leave Your Leaves Alone

Posted by Joel Goodman on 17/02/2010

Last year I published the slides from my presentation Managing Sequences in a RAC Environment which discussed the administrative and performance management skills related to the use of Sequence Number generation in RAC using both Oracle and user defined sequences.  One use of sequences about which most DBAs and developers are aware is that of generating primary keys for a table and naturally the key data for the underlying index.

But the choice of sequence parameters can have an effect on performance when using high volume insert applications in a RAC database environment. The problem is one of Index Leaf Block contention whose symptoms are indicated by the following wait events:

  1. enq: TX – index contention
  2. gc buffer busy waits on Index Branch Blocks
  3. gc buffer busy waits on Index Leaf       Blocks
  4. gc current block busy on Remote Undo Headers
  5. gc current split

There may also be increasing numbers in the following System Statistics:

  1. Leaf node splits
  2. Branch node splits
  3. Exchange deadlocks
  4. gcs refuse xid
  5. gcs ast xid
  6. Service ITL waits

Finally there may be an increase in the following statistics in the V$SEGMENT_STATISTICS view for indexes on tables with high volume inserts:

  1. gc buffer busy
  2. gc current blocks received

I have had many delegates in the classroom on my RAC courses who have reported such symptoms which suggests that they may have this problem. We can understand the causes by looking at the implications of the [ORDER | NOORDER] options on index performance. Note that all the other options are discussed in detail in the presentation mentioned above.

If a sequence is created with the ORDER option then the order of numbers allocated is guaranteed even though multiple instances may run the application requesting the next sequence number. As pointed out in my presentation, RAC databases manage this by caching the same range of numbers in the row cache of all instances having used the sequence, and uses the SV instance lock to indicate which instance’s row cache  contains the most recent next value for the sequence. If a request for the next number comes from the instance owning the SV instance lock, then the sequence in incremented and the SV lock continues to be held by the same instance’s LCKO process. If another instance requests the next value then the LCKO process on the requesting instance acquires the SV lock from the owning instance  which also passes the nextval value in the same message to the requesting instance which becomes the new owner of the SV lock for that sequence, updates its row cache and assigns the number to the requesting server process in the requesting instance.

There is of course some overhead in acquiring the SV instance lock but the bigger problem is the overheads for an index that uses the sequence number as a key. If high volume inserts occur then the right side leaf block will get the usual buffer busy waits occurring, due to multiple concurrent inserts from a sequence based key but there may be requests from other instances for inserts into the same index. This will require that the index leaf is served over the interconnect, once it is not busy, hence the additional Cache Fusion overheads mentioned earlier.

Now consider what occurs if the sequence uses the NOORDER option. In this case each instance using the sequence caches a separate range of numbers. For example if the CACHE option is set to 50,000, and there are four instances which used the sequence initially in the order 1,3,4,2, then instance one will cache the first 50,000 numbers, instance three the next 50,000, instance four the next 50,000 and the next 50,000 by instance two. No SV lock will be used to coordinate the row caches to determine which row cache has the correct “next value” since each instance caches separate ranges of numbers.

If the table and index are new then initially all inserts into the index from all instances will use the same leaf block as there will be only one. But after a number of block splits occur to the index, discrete ranges of leaf blocks will develop holding the keys in the four ranges of numbers. For example, assuming a roughly equal number of inserts occur from all instances, then the first N leaves will contain keys for the first range of cached numbers from instance one, the second group of N leaves will contain keys for the second range of cached numbers from instance three and so on.  Effectively we now have four separate key and leaf ranges and each range has their own “right handed” leaf for that range.

Since each  “right handed” leaf will  be used only by the instance that has cached the range of values corresponding to the range of leaves, Cache Fusion overheads for the high volume insert will be nearly eliminated since effectively, each instance has their own “right hand” leaf block in which to insert keys.

The benefit of using CACHE with NOORDER on indexes is a reduction in overheads for high volume inserts but at a cost of having the keys generated out of order. So the first 10 values generated assuming the sequence starts with 1 and increments with 1 could be as follows:

  1. 50,001
  2. 1
  3. 2
  4. 50,002
  5. 150,001
  6. 3
  7. 4
  8. 50,003
  9. 100,001
  10. 100,002

If the order of key generation is important then this technique is not available but if the requirement is for unique key values and the order is unimportant then using CACHE and NOORDER is the best way to reduce the overheads of the both the sequence generation and the overheads of high volume insert activity.

In addition one may reduce  hot leaf block contention problems by using REVERSE KEY indexes if no range scans are required, by using HASH PARTITIONED GLOBAL indexes if already using the partitioning option or by a combination of both. These techniques can also help reduce buffer busy wait and cache fusion overheads but in many cases REVERSE KEY indexes can not be used since they prevent the optimiser from generating plans using an INDEX RANGE SCAN access method. HASH PARTITIONED GLOBAL indexes require the partitioning license and prevents this from being used by all Oracle DBAs.

As with tuning tip, the degree to which this will improve performance depends on many variables so try it and benchmark the difference it might make in your RAC environment.

Posted in Oracle | 8 Comments »

The Benefits of Oracle Certification

Posted by Joel Goodman on 29/01/2010

Last year I posted three videos produced by my colleague Harold Green, from a talk I gave outlining the development process for Oracle DBA Certification exams including OCA, OCP, OCE and OCM. They described the process from design to completion of creating the exams and discussed the philosophy of the exam design. If you have not seen those videos before then you can find them here.

Harold had now produced another video, the first in a two part series on the benefits of Oracle Certification. In the first part I discuss the benefits to employees and explain what motivates DBAs and other to get certified. In the second part which will be released soon the benefits to employers are discussed.

Here is Part 1 covering Benefits to Employees

Here is Part 2 covering Benefits to Employers

Posted in Oracle | 7 Comments »

What Is Your View of Your DBA Views?

Posted by Joel Goodman on 18/01/2010

A User Group delegate asked me recently if I ever post topics for new Oracle DBAs who may not yet be administering any advanced options such as RAC, ASM, Exadata etc.. Then I had a question during an Oracle Live Virtual Class (LVC) which I presented late last year also from a new DBA who wished to know if there is a quick way to examine all the possible values for certain status columns in Data Dictionary Views or in Dynamic Performance (V$ and GV$) views.

Many of these views contain various columns some of which are of datatype date, some are number, occasionally there are long or clob columns and for the V$ views there are raw columns representing pointers in from one memory structure to another. But there are many important columns that are varchar or maybe char and whose values are important in determining the current situation with regard to the resource in question.

Whilst the Oracle Database Reference contains information on Data Dictionary views in Part II (sections 2 through 6) and Dynamic Performance views in Part III (sections 7 through 9) and contains the various possible values and their meanings one must still find the documentation and navigate to the appropriate view. This may occur when one is creating a filter for example typically “WHERE <COLUMN_NAME> <operator> <VALUE>”  and wish to know the literal for “VALUE>”.

I have often encountered this when querying a view that I seldom use and altough I may know that there is a specific status in a column or a view, I may not remember the value or what other possible values are.

Lets start by using the Static Data Dictionary View “DBA_TABLESPACES“  as an example.  The columns “STATUS”, “LOGGING”, “FORCE_LOGGING” and all the other columns of “varchar2″ data type contain two or more possible values which may be used in the select list or the where clause or both in a DBA query.

Data Dictionary views are stand-alone queries or join queries that access the underlying base tables of the data dictionary and which occasionally join to dynamic performance views as well. Those objects contain metadata in encoded columns and these are decoded by the views. By querying the text of the views, one may see not only the underlying tables and views that are accessed but also the decode functions which contain all the possible values for the intersting status columns.

One may use the following script to access the text of Data Dictionary Views:

set long 20000
col text for a100
col view_name for a20
undefine viewname
select view_name,text from dba_views
where view_name like  upper (‘%&&viewname%’);
undefine viewname

Lets see what we get when looking at DBA_TABLESPACES:

SYS@tutor5d_nledu02 SQL> @sviewtext
SYS@tutor5d_nledu02 SQL> set long 20000
SYS@tutor5d_nledu02 SQL> col text for a100
SYS@tutor5d_nledu02 SQL> col view_name for a20
SYS@tutor5d_nledu02 SQL> undefine viewname
SYS@tutor5d_nledu02 SQL> select view_name,text from dba_views
2  where view_name like  upper (‘%&&viewname%’);
Enter value for viewname: DBA_TABLESPACES
old   2: where view_name like  upper (‘%&&viewname%’)
new   2: where view_name like  upper (‘%DBA_TABLESPACES%’)
Hit Any Key to Continue

VIEW_NAME
——————–
TEXT
——————————————————————————-
DBA_TABLESPACES
select ts.name, ts.blocksize, ts.blocksize * ts.dflinit,
decode(bitand(ts.flags, 3), 1, to_number(NULL),
ts.blocksize * ts.dflincr),
ts.dflminext,
decode(ts.contents$, 1, to_number(NULL), ts.dflmaxext),
decode(bitand(ts.flags, 4096), 4096, ts.affstrength, NULL),
decode(bitand(ts.flags, 3), 1, to_number(NULL), ts.dflextpct),
ts.blocksize * ts.dflminlen,
decode(ts.online$, 1, ‘ONLINE’, 2, ‘OFFLINE’,
4, ‘READ ONLY’, ‘UNDEFINED’),
decode(ts.contents$, 0, (decode(bitand(ts.flags, 16), 16, ‘UNDO’,
‘PERMANENT’)), 1, ‘TEMPORARY’),
decode(bitand(ts.dflogging, 1), 0, ‘NOLOGGING’, 1, ‘LOGGING’),
decode(bitand(ts.dflogging, 2), 0, ‘NO’, 2, ‘YES’),
decode(ts.bitmapped, 0, ‘DICTIONARY’, ‘LOCAL’),
decode(bitand(ts.flags, 3), 0, ‘USER’, 1, ‘SYSTEM’, 2, ‘UNIFORM’,
‘UNDEFINED’),
decode(ts.plugged, 0, ‘NO’, ‘YES’),
decode(bitand(ts.flags,32), 32,’AUTO’, ‘MANUAL’),
decode(bitand(ts.flags,64), 64,’ENABLED’, ‘DISABLED’),
decode(bitand(ts.flags,16), 16, (decode(bitand(ts.flags, 512), 512,
‘GUARANTEE’, ‘NOGUARANTEE’)), ‘NOT APPLY’),
decode(bitand(ts.flags,256), 256, ‘YES’, ‘NO’),
decode(tsattr.storattr, 1, ‘STORAGE’, ‘HOST’),
decode(bitand(ts.flags,16384), 16384, ‘YES’, ‘NO’),
decode(bitand(ts.flags,64), 0, null,
decode(bitand(ts.flags,65536), 65536,’FOR ALL OPERATIONS’,
‘DIRECT LOAD ONLY’))
from sys.ts$ ts, sys.x$kcfistsa tsattr
where ts.online$ != 3
and bitand(flags,2048) != 2048
and ts.ts# = tsattr.tsid

1 row selected.

SYS@tutor5d_nledu02 SQL> undefine viewname
SYS@tutor5d_nledu02 SQL>



We can see that the view is based on SYS.TS$ and SYS.KCFISTSA which is a join between a Data Dictionary Base table and an X$ struct and we can also see the decode functions for various columns with all the values.

Another example is DBA_DATA_FILES:

SYS@tutor5d_nledu02 SQL> @sviewtext
SYS@tutor5d_nledu02 SQL> set long 20000
SYS@tutor5d_nledu02 SQL> col text for a100
SYS@tutor5d_nledu02 SQL> col view_name for a20
SYS@tutor5d_nledu02 SQL> undefine viewname
SYS@tutor5d_nledu02 SQL> select view_name,text from dba_views
2  where view_name like  upper (‘%&&viewname%’);
Enter value for viewname: DBA_DATA_FILES
old   2: where view_name like  upper (‘%&&viewname%’)
new   2: where view_name like  upper (‘%DBA_DATA_FILES%’)
Hit Any Key to Continue

VIEW_NAME
——————–
TEXT
——————————————————————————-
DBA_DATA_FILES
select v.name, f.file#, ts.name,
ts.blocksize * f.blocks, f.blocks,
decode(f.status$, 1, ‘INVALID’, 2, ‘AVAILABLE’, ‘UNDEFINED’),
f.relfile#, decode(f.inc, 0, ‘NO’, ‘YES’),
ts.blocksize * f.maxextend, f.maxextend, f.inc,
ts.blocksize * (f.blocks – 1), f.blocks – 1,
decode(fe.fetsn, 0, decode(bitand(fe.festa, 2), 0, ‘SYSOFF’, ‘SYSTEM’),
decode(bitand(fe.festa, 18), 0, ‘OFFLINE’, 2, ‘ONLINE’, ‘RECOVER’))
from sys.file$ f, sys.ts$ ts, sys.v$dbfile v, x$kccfe fe
where v.file# = f.file#
and f.spare1 is NULL
and f.ts# = ts.ts#
and fe.fenum = f.file#
union all
select
v.name,f.file#, ts.name,
decode(hc.ktfbhccval, 0, ts.blocksize * hc.ktfbhcsz, NULL),
decode(hc.ktfbhccval, 0, hc.ktfbhcsz, NULL),
decode(f.status$, 1, ‘INVALID’, 2, ‘AVAILABLE’, ‘UNDEFINED’),
f.relfile#,
decode(hc.ktfbhccval, 0, decode(hc.ktfbhcinc, 0, ‘NO’, ‘YES’), NULL),
decode(hc.ktfbhccval, 0, ts.blocksize * hc.ktfbhcmaxsz, NULL),
decode(hc.ktfbhccval, 0, hc.ktfbhcmaxsz, NULL),
decode(hc.ktfbhccval, 0, hc.ktfbhcinc, NULL),
decode(hc.ktfbhccval, 0, hc.ktfbhcusz * ts.blocksize, NULL),
decode(hc.ktfbhccval, 0, hc.ktfbhcusz, NULL),
decode(fe.fetsn, 0, decode(bitand(fe.festa, 2), 0, ‘SYSOFF’, ‘SYSTEM’),
decode(bitand(fe.festa, 18), 0, ‘OFFLINE’, 2, ‘ONLINE’, ‘RECOVER’))
from sys.v$dbfile v, sys.file$ f, sys.x$ktfbhc hc, sys.ts$ ts, x$kccfe fe
where v.file# = f.file#
and f.spare1 is NOT NULL
and v.file# = hc.ktfbhcafno
and hc.ktfbhctsn = ts.ts#
and fe.fenum = f.file#

1 row selected.

SYS@tutor5d_nledu02 SQL> undefine viewname
SYS@tutor5d_nledu02 SQL>

Here we see that the view does a Union All Query to support multiple blocksizes but that each part of the query also refers to the base tables, X$ structs, and decode functions. Note also that in this view V$DBFILE is joined to FILE$ permitting the query to obtain the full path name of each data file is it appears in the controlfile. This explains why an “ALTER DATABASE RENAME FILE” command may be done with the database in MOUNT state when trhere is no access to the Data Dictionary. Only the CONTROL FILE is modified but once the database is opened DBA_DATA_FILES has the correct path names precisely because it is doing the join between FILE$ and V$DBFILE and thus getting the up to date path name from the control file.

A similar query may be done for Dynamic Performance views as follows:

set long 5000
col view_definition for a50
select view_name, view_definition
from v$fixed_view_definition
where view_name like upper(‘%&viewname%’);
undefine viewname

Lets now see what we get by looking at simple V$ view:

SYS@tutor5d_nledu02 SQL> /
Enter value for viewname: V$OPTION
old   3: where view_name like  ‘%&viewname%’
new   3: where view_name like  ‘%V$OPTION%’
Hit Any Key to Continue

VIEW_NAME            VIEW_DEFINITION
——————– ————————————————–
GV$OPTION            select inst_id,parameter, value from x$option
V$OPTION              select  PARAMETER , VALUE from GV$OPTION where inst_id = USERENV(‘Instance’)

2 rows selected.

SYS@tutor5d_nledu02 SQL>

First note how the V$ view queries the GV$ view of the same name for the current instance. This is generally true for all the V$ views. And note that we can see the X$ structs accessed by the GV$ View as well. Now lets try this with a slightly more complex V$ view:

SYS@tutor5d_nledu02 SQL> select view_name, view_definition
2  from v$fixed_view_definition
3  where view_name like upper(‘%&viewname%’);
Enter value for viewname: v$logfile
old   3: where view_name like upper(‘%&viewname%’)
new   3: where view_name like upper(‘%v$logfile%’)
Hit Any Key to Continue

VIEW_NAME                      VIEW_DEFINITION
—————————— ————————————————–

GV$LOGFILE                     select inst_id,fnfno, decode(fnflg,0,”,   decode(bitand(fnflg,1),1,’INVALID’,   decode(bitand(fnflg,2),2,’STALE’,   decode(bitand(fnflg,4),4,’DELETED’, decode(bitand(fnflg,8+32),8,”,32,”,40,”,’UNKNOWN’))))), decode(bitand(fnflg,8),0,’ONLINE’,'STANDBY’), fnnam, decode(bitand(fnflg, 32),0,’NO’,'YES’) from x$kccfn where fnnam is not null and fntyp=3

V$LOGFILE                      select  GROUP# , STATUS , TYPE , MEMBER, IS_RECOVERY_DEST_FILE from GV$LOGFILE where inst_id = USERENV(‘Instance’)

2 rows selected.

SYS@tutor5d_nledu02 SQL> undefine viewname

Once again the decodes are visible as well as the underlying structs.

In some cases one may also learn additional things by examining the queries in the views.

I find this a useful way to help people learn about important Oracle DBA views and the values one may use for filtering queries and I use this as a teaching tool for basic DBA courses at Oracle University. If you are somewhat new to Oracle DBA work, then have a go at using these and try some of the Data Dictionary or Dynamic Performance views that you commonly use.

Posted in Oracle | 2 Comments »

Database Links Masterclass Part 2

Posted by Joel Goodman on 30/11/2009

Today I presented the second in my Database Links Masterclass series on Distributed transactions at the UKOUG Annual Conference. You can see details on my “Articles and Presentations” Page

Posted in Oracle | Leave a Comment »

RMAN is Piping Hot

Posted by Joel Goodman on 24/11/2009

The RMAN “PIPE” Interface

Tips and Tricks for the Oracle DBA #1

RMAN is traditionally managed using the command line interface RMAN executable or by using Enterprise Manager. But there is a “third way” known as the “pipes” interface to RMAN, which uses the “DBMS_PIPE” Oracle, supplied package to provide PL/SQL programmes with an interface to RMAN.

The Oracle Documentation contains an example bit of code, which I have wrapped inside a demo to demonstrate the pipes interface. Essentially the RMAN command line executable must be started with the PIPE option and must specify the pipe name. Optionally a timeout value may be specified to terminate the RMAN executable if no command is sent down the named pipe within the timeout period. In my SQL*PLUS demo script RMAN is invoked automatically by shelling out to the host and running RMAN disconnected from the main script so that control returns to the SQL*PLUS script. One could of course drive this all from a shell script as well by running RMAN in the background and then running SQL*PLUS.

Here is an example of invoking RMAN to accept input from a pipe:

SQL> Host rman pipe demo_pipe timeout 600 target / \&

Two pipes will be created called ORA$RMAN_DEMO_PIPE_IN and ORA$RMAN_DEMO_PIPE_OUT and used for inbound commands to RMAN and output from the commands.

The demo is commented and does the following:

  1. Starts the RMAN Session in the background by hosting out
  2. Checks that a session is waiting on a “pipe get” event
  3. Uses DBMS_PIPE.PURGE to clean the two pipes
  4. Uses the code from the documentation to create a stored procedure called RMAN_CMD
  5. Checks for compilation errors
  6. Uses the V$DB_PIPES view to show the two pipes’ attributes
  7. Invokes the RMAN_CMD procedure several times with various RMAN commands
  8. Invokes the RMAN_CMD procedure, passing EXIT as the message to terminate the RMAN executable that is running in the background.

Possible used for the pipes interface are:

  1. Applications coded to automatically invoke one or more backup and recovery functions to achieve their design objectives
  2. Third party tools which whish to provide their own interface to RMAN
  3. DBA tools that wish to incorporate some RMAN functionality

The demo and spooled output are:

  1. rman_pipes_sql – this contains the script
  2. rman_pipes_spool – this is the spooled output of the script running on Linux.

Posted in Oracle | 2 Comments »