The 11g Release 2 version of the Oracle RAC and Grid Infrastructure course went live earlier this year, and has generated much discussion concerning several aspects of the release. In this post, I would like to share some observations about the software based on my research and teaching experience during the past five months. The new release of the Grid Infrastructure consists of:
- A new “local” resource management layer, known as the OHASD
- A new set of agents, which replace the RACG layer
- Support for new features: Grid Plug and Play, Grid Naming Service, Grid IPC and Cluster Time Synchronisation Service
- Integration of ASM and the Clusterware to form the Grid Infrastructure
- A reworked Cluster Ready Services Daemon (CRSD)
- Automatically managed Server Pools
- Support for Intelligent Platform Management Interface (IPMI), for node fencing and node termination
This is an extensive change to the clusterware from previous releases and a very large topic, so in this blog post I will restrict the discussion to the new local resource management layer, called the “Lower Stack”, and how it relates to the “Upper Stack”.
The Lower Stack – Managed by OHASD
The 11gR2 Grid Infrastructure consists of a set of daemon processes which execute on each cluster node, the voting and OCR files, and the protocols used to communicate across the interconnect. Prior to 11gR2, various scripts run by the init process started and monitored the health of the clusterware daemons. From 11gR2, the Oracle High Availability Services Daemon (OHASD) replaces these. The OHASD starts, stops and checks the status of all the other daemon processes that are part of the clusterware, using the new agent processes listed here:
- CSSDAGENT – used to start, stop and check the status of the CSSD resource
- ORAROOTAGENT – used to start “Lower Stack” daemons that must run as root: ora.crsd, ora.ctssd, ora.diskmon, ora.drivers.acfs, ora.crf
- ORAAGENT – used to start “Lower Stack” daemons that run as the grid owner: ora.asm, ora.evmd, ora.gipcd, ora.gpnpd, ora.mdnsd
- CSSDMONITOR – used to monitor the CSSDAGENT
The OHASD is essentially a daemon which starts and monitors the clusterware daemons themselves. It is started by init using the /etc/init.d/ohasd script, which runs the ohasd.bin executable as root. The Oracle documentation lists the “Lower Stack” daemons, where they are referred to as “The Oracle High Availability Services Stack”, and notes which agent is responsible for starting and monitoring each specific daemon. It also explains the purpose of each of the stack components. (Discussions of some of these components will feature in future blog posts.) If the grid infrastructure is enabled on a node, then OHASD starts the “Lower Stack” on that node at boot time. If it is disabled, then the “Lower Stack” must be started manually. The following commands are used for these operations:
- crsctl enable crs – enables autostart at boot time
- crsctl disable crs – disables autostart at boot time
- crsctl start crs – manually starts crs on the local node
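Before running any of these commands from a script, it is worth confirming that OHASD itself is up. A minimal sketch is shown below; the function name and the parsing approach are my own (hypothetical), and the sample message is the CRS-4638 line that crsctl prints when OHASD is running:

```shell
# Sketch: decide whether the local OHASD is up by parsing the message
# text that "crsctl check has" prints. A real script would capture it
# with: has_output=$(crsctl check has)
has_output="CRS-4638: Oracle High Availability Services is online"

# Hypothetical helper: succeed (exit 0) only if the message says online
is_ohasd_online() {
  case "$1" in
    *"is online"*) return 0 ;;
    *)             return 1 ;;
  esac
}

if is_ohasd_online "$has_output"; then
  echo "OHASD up - safe to run further crsctl commands"
else
  echo "OHASD down - start it with: crsctl start crs"
fi
```

The same pattern works for any of the CRS-4xxx status messages, since they all end in “is online” or “is offline”.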
The “Lower Stack” consists of daemons which communicate with their counterparts on other cluster nodes. These daemons must be started in the correct sequence, as some of them depend on others. For example, the Cluster Ready Services Daemon (CRSD) may depend on ASM being available if the OCR file is stored in ASM. Clustered ASM in turn depends on the Cluster Synchronisation Services Daemon (CSSD), as the CSSD must be started in order for clustered ASM to start up. This dependency tree is similar to that which already existed for the resources managed by the CRSD itself, known as the “Upper Stack”, which will be discussed later in this post.
To define the dependency tree for the “Lower Stack”, a local repository called the Oracle Local Registry (OLR) is used. This contains the metadata required by OHASD to join the cluster, plus configuration details for the local software. As a result, OHASD can start the “Lower Stack” daemons without reference to the OCR. To examine the OLR, use the following command and then examine the dump file produced:
- ocrdump -local <FILENAME>
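The dump is a plain text file in which each key appears on its own line in square brackets, followed by its value. The sketch below works on a short illustrative excerpt written to /tmp; the key names shown are hypothetical examples of the typical ocrdump key format, not a definitive list of OLR contents:

```shell
# Sketch: list the key names in an OLR text dump, as produced by
# "ocrdump -local /tmp/olr.txt". The excerpt below is illustrative;
# a real dump contains many more keys.
cat > /tmp/olr_sample.txt <<'EOF'
[SYSTEM]
UNDEF :
[SYSTEM.version]
ORATEXT : 11.2.0.1.0
[SYSTEM.OCR]
UNDEF :
EOF

# Keys are the bracketed lines; strip the brackets to list them
grep '^\[' /tmp/olr_sample.txt | tr -d '[]'
```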
Another benefit of the OHASD is that there is a daemon running on each cluster node whether or not the “Lower Stack” is started. As long as the OHASD daemon is running, the following commands may be used in 11gR2:
- crsctl check has – check the status of the OHASD
- crsctl check crs – check the status of the OHASD, CRSD, CSSD and EVMD
- crsctl check cluster -all – this checks the “Lower Stack” on all the nodes
- crsctl start cluster – this attempts to start the “Lower Stack” on all the nodes
- crsctl stop cluster – this attempts to stop the “Lower Stack” on all the nodes
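The output of the cluster-wide check is easy to post-process, because each node's section starts with a “nodename:” header followed by one CRS-4xxx message per daemon. A minimal sketch, reusing the message lines shown in this post (the “***” separator lines are omitted for brevity; a real script would capture the output with `crsctl check cluster -all`):

```shell
# Sketch: summarise "crsctl check cluster -all" output per node by
# counting the "is online" messages under each node header.
out='racn1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
racn2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online'

echo "$out" | awk '
  /:$/        { node = $0; next }   # lines ending in ":" name a node
  /is online/ { up[node]++ }        # tally online daemons per node
  END { for (n in up) printf "%s %d daemons online\n", n, up[n] }'
```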
Here are some examples:
# crsctl check has
CRS-4638: Oracle High Availability Services is online
# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
# crsctl check cluster -all
**************************************************************
racn1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
racn2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
racn3:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
To check the status of the “Lower Stack” resources use the following:
- crsctl stat res -init -t
An example is shown here:
# crsctl stat res -init -t
NAME             TARGET  STATE   SERVER  STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm ONLINE ONLINE racn1 Started
ora.crsd ONLINE ONLINE racn1
ora.cssd ONLINE ONLINE racn1
ora.cssdmonitor ONLINE ONLINE racn1
ora.ctssd ONLINE ONLINE racn1 OBSERVER
ora.diskmon ONLINE ONLINE racn1
ora.drivers.acfs ONLINE ONLINE racn1
ora.evmd ONLINE ONLINE racn1
ora.gipcd ONLINE ONLINE racn1
ora.gpnpd ONLINE ONLINE racn1
ora.mdnsd ONLINE ONLINE racn1
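Because every data line in this listing has the same NAME TARGET STATE SERVER layout, it is simple to flag any “Lower Stack” resource that is not ONLINE. The sketch below operates on a few sample data lines; the OFFLINE ora.cssd line is hypothetical, added purely to give the filter something to report (in the healthy output above, every resource is ONLINE):

```shell
# Sketch: flag any "Lower Stack" resource whose STATE is not ONLINE.
# A real script would feed in the data lines of: crsctl stat res -init -t
stat='ora.asm ONLINE ONLINE racn1 Started
ora.crsd ONLINE ONLINE racn1
ora.cssd OFFLINE OFFLINE racn1'

# Field 3 is STATE; print name and state for anything not ONLINE
echo "$stat" | awk '$3 != "ONLINE" { print $1 " is " $3 }'
```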
Starting the CSSD daemon requires access to the Voting Files, which may be stored in ASM. But a clustered ASM instance cannot start until the node has joined the cluster, which requires that CSSD be up. To get around this chicken-and-egg problem, ASM Diskgroups are flagged to indicate that they contain Voting Files. The ASM discovery string is contained in the OLR and is used to scan for the ASM disks when CSSD starts. The scan locates the flags indicating the presence of Voting Files, which are stored at a fixed location in the ASM disks. This process does not require the ASM instance to be up. Once the Voting Files are found by this scanning process, CSSD can access them and join the cluster, and then the ORAAGENT can start the clustered ASM instance.
To check the location of the Voting Files use the following:
# crsctl query css votedisk
## STATE    File Universal Id                 File Name                      Disk group
-- -----    -----------------                 ---------                      ----------
1. ONLINE 5b91aad0a2184f3dbfa8f970e8ae4d49 (/dev/oracleasm/disks/VOTE1) [VOTEDG]
2. ONLINE 53b1b40b73164f9ebf3f498f6d460187 (/dev/oracleasm/disks/VOTE2) [VOTEDG]
3. ONLINE 82dfd04b96f14f6dbf36f5a62b118f61 (/dev/oracleasm/disks/VOTE3) [VOTEDG]
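The device path of each voting file appears in parentheses in this output, so a short sed expression is enough to extract just the paths, for example to cross-check them against the ASM discovery string. A minimal sketch, reusing two of the lines shown above (a real script would pipe in `crsctl query css votedisk`):

```shell
# Sketch: pull the voting file device paths out of
# "crsctl query css votedisk" output; paths appear in parentheses.
vote='1. ONLINE 5b91aad0a2184f3dbfa8f970e8ae4d49 (/dev/oracleasm/disks/VOTE1) [VOTEDG]
2. ONLINE 53b1b40b73164f9ebf3f498f6d460187 (/dev/oracleasm/disks/VOTE2) [VOTEDG]'

# Print only the text between the parentheses on each matching line
echo "$vote" | sed -n 's/.*(\(.*\)).*/\1/p'
```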
The Upper Stack – Managed by CRSD
The “Upper Stack” consists of the daemons and resources managed by the Grid Infrastructure once it is up and running. It uses the same agent architecture as the OHASD, but CRSD runs its own copies of the agents to start, stop and check the status of the daemons and resources, as follows:
- ORAROOTAGENT – used to start “Upper Stack” resources that must run as root: the GNS, VIPs, SCAN VIPs and network resources
- ORAAGENT – used to start “Upper Stack” resources that run as the grid owner: ora.asm, ora.eons, ora.LISTENER.lsnr, the SCAN listeners, ora.ons, ASM Diskgroups, Database Instances and Database Services. It is also used to publish High Availability events to interested clients and manages Cluster Ready Services state changes.
The resources managed by the CRSD for the “Upper Stack” are also listed in the Oracle Documentation where they are referred to as “The Cluster Ready Services Stack” and consist of familiar resources such as Database Instances, Database Services and NodeApps such as Node Listeners. There are also some new resources such as the Single Client Access Name (SCAN), SCAN Vips, Grid Naming Service (GNS), GNS Vips and Network Resources. Some of these will be the subject of future Blog posts.
The resources managed by the “Upper Stack” are recorded in the OCR file, which may be stored in ASM. Since the clustered ASM instance is started by OHASD after CSSD but before CRSD, CRSD accesses the OCR as a normal ASM client. The OCR may be seen as a file in ASM, unlike the Voting Files, which are not “visible” when looking at the ASM directory contents using either Enterprise Manager or the ASMCMD utility.
To check the location of the OCR do the following:
# cat /etc/oracle/ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE
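Since ocr.loc is a simple key=value file, scripts can read the OCR location directly from it. A minimal sketch, writing a copy of the contents shown above to /tmp so it is self-contained (a real script would read /etc/oracle/ocr.loc itself):

```shell
# Sketch: extract the OCR location from an ocr.loc-style file.
cat > /tmp/ocr.loc <<'EOF'
ocrconfig_loc=+DATA
local_only=FALSE
EOF

# Keep only the value after "=" on the ocrconfig_loc line
ocr_loc=$(sed -n 's/^ocrconfig_loc=//p' /tmp/ocr.loc)
echo "OCR is stored in: $ocr_loc"
```

A leading “+” in the value, as here, indicates an ASM Diskgroup rather than a filesystem path.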
CRSD Resource Categories
CRSD resources are categorised as “Local Resources” or “Cluster Resources”. Local Resources are activated on a specific node and never fail over to another node. For example, an ASM instance exists on each node, so if a node fails, the ASM instance that was on that node will not fail over to a surviving node. Likewise, a Node Listener exists for each node and does not fail over. These two resource types are therefore “Local Resources”. SCAN Listeners, however, may fail over, as may Database Instances or the GNS (if used), so these are “Cluster Resources”.
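The two categories are easy to tell apart in the `crsctl status resource -t` output, because the listing is split by the “Local Resources” and “Cluster Resources” section headers. The sketch below counts resources in each section; the sample is a simplified excerpt using resource names from the listing further down, with the state lines omitted (the awk filter only matches names beginning “ora.”, so state lines would be ignored anyway):

```shell
# Sketch: count resources per section of "crsctl status resource -t"
# output, using the section headers to track the current category.
res='Local Resources
ora.asm
ora.ons
Cluster Resources
ora.rac.db
ora.scan1.vip'

echo "$res" | awk '
  /^Local Resources$/   { sect = "local";   next }
  /^Cluster Resources$/ { sect = "cluster"; next }
  /^ora\./              { count[sect]++ }
  END { printf "local=%d cluster=%d\n", count["local"], count["cluster"] }'
```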
Finally, to check the status of the “Upper Stack” resources and daemons, do the following:
# ./crsctl status resource -t
——————————————————————————–
NAME TARGET STATE SERVER STATE_DETAILS
——————————————————————————–
Local Resources
——————————————————————————–
ora.DATADG.dg
ONLINE ONLINE racn1
ONLINE ONLINE racn2
ora.LISTENER.lsnr
ONLINE ONLINE racn1
ONLINE ONLINE racn2
ora.FRADG.dg
ONLINE ONLINE racn1
ONLINE ONLINE racn2
ora.asm
ONLINE ONLINE racn1 Started
ONLINE ONLINE racn2 Started
ora.eons
ONLINE ONLINE racn1
ONLINE ONLINE racn2
ora.gsd
OFFLINE OFFLINE racn1
OFFLINE OFFLINE racn2
ora.net1.network
ONLINE ONLINE racn1
ONLINE ONLINE racn2
ora.ons
ONLINE ONLINE racn1
ONLINE ONLINE racn2
ora.registry.acfs
ONLINE ONLINE racn1
ONLINE ONLINE racn2
——————————————————————————–
Cluster Resources
——————————————————————————–
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE racn1
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE racn2
ora.LISTENER_SCAN3.lsnr
1 ONLINE ONLINE racn2
ora.oc4j
1 OFFLINE OFFLINE
ora.rac.db
1 ONLINE ONLINE racn1 Open
2 ONLINE ONLINE racn2 Open
ora.racn1.vip
1 ONLINE ONLINE racn1
ora.racn2.vip
1 ONLINE ONLINE racn2
ora.scan1.vip
1 ONLINE ONLINE racn1
ora.scan2.vip
1 ONLINE ONLINE racn2
ora.scan3.vip
1 ONLINE ONLINE racn2
There is much more to know about the Grid Infrastructure, so watch this space in future months.
Joel
02/06/2010