Last year I wrote a post about mirroring in ASM in the context of Exadata which discussed aspects of ASM mirroring, including failgroups, and the ability of ASM to cope with different failure scenarios. The post dealt with cases such as:
- Single disk failure in a single cell
- Multiple disk failure in a single cell
- Single cell failure
- Overlapping disk failures in multiple cells
It also covered the management of free space in ASM.
But the assumption was that the diskgroup(s) were already mounted, when the failures occurred. If an ASM instance attempts to mount a diskgroup however, then things are very different.
I have been working closely with members of the Oracle Advanced Customer Support Services team over the past 18 months, to train EMEA staff in Grid Infrastructure, RAC, Exadata and Enterprise Linux, to help skill up for supporting the Oracle Database Machine. Recently, this collaboration resulted in a joint project between Oracle University and ACS to triage some problems at an EMEA customer.
The customer’s configuration used Grid Infrastructure for a standalone server on a two node Solaris stretch cluster. This was done to utilise ASM on each node, to support a live environment on one node, and a test environment on the other node in a different location. But the test node was also part of the HA strategy, in case of problems on the live node. Oracle Restart was not used, so essentially, it was two separate nodes each with Oracle and ASM.
The ASM disks, were provided by SAN LUNs, with arrays in both the live and test locations. Each array was connected to both the live and test servers.
The live system diskgroups each had two failgroups to support ASM normal redundancy
- First failgroup on LUNs provided by the storage array at the live location
- Second failgroup on LUNs provided by the storage array at the test location
Normally, the ASM instance on the live system would have two diskgroups mounted, each with two failgroups. But to handle live location failures, scripts were written, that would:
- Shut the test database instances on the test server
- Unmount the test database diskgroups in the ASM instance on the test server
- Mount the live database diskgroups in the ASM instance on the test server
- Start the live database instance on the test server
In effect, this was a cold failover of the database. But without Grid Infrastructure for a cluster and clustered ASM, the diskgroups could only be mounted by a single ASM instance at one time. Thus the live disk groups could be mounted on the test server, only if not currently mounted on the live one.
Test were done by the customer, demonstrating that the scripts worked properly when the live database node failed. Eventually after all testing was completed the system for which this database was created, became operational.
I was called in, partly to triage a failure of the same recovery scripts that had worked during testing, and also to do a health check on the database and ASM environments. What I discovered from the ASM and database logs, was that one of the failgroups, on the storage array at the live data centre, had gone offline. ASM continued to read mirror copies from the other failgroup, as it is designed to do, whenever I/O for primary copies of ASM allocation units (AUs) on the offline failgroup were done. Eventually, the OS on the live server failed, possibly related to access to swap space or other system files on the array.
When the failover to the test system was invoked and the scripts executed:
- The ASM instance failed to mount the live database diskgroups
- Startup of the live database instance on the test server failed as it could not access the database files stored in the ASM diskgroups
An attempt was eventually made to fail back to the live server, once the live system OS was back up, but the same problem arose. When I examined the scripts, and the logs, it showed that the MOUNT command for the ASM diskgroups was missing the FORCE option. This is what caused the mount failures in both cases.
Mirrored ASM diskgroups, may survive the loss of ASM disks or even a complete failgroup, when already mounted, but to mount a mirrored ASM disk group requires that each AU have at least two copies discovered for normal redundancy diskgroups. If one of more ASM disks, is not accessible, then the diskgroup will not mount, because the stated level of redundancy will not have been met. The FORCE option overcomes this problem, by requesting that the diskgroup be mounted, without providing the stated level of redundancy, should a subsequent failure of an ASM disk occur.
Contributing to the problem, was the lack of testing for storage array failure. The tests had only been done on database node failure. No tests were done for storage array failure, storage network failure or for failure of any of the components upon which the storage depends. This was why the failure to mount the diskgroups in this situation, was not discovered during testing.
An alternative approach to the software architecture, would have been to use Grid Infrastructure for a cluster, rather than Grid Infrastructure for a standalone server. This would have meant the following:
- The live database diskgroups could be mounted simultaneously from the ASM instances on both the live and test servers, because clustered ASM can do this.
- No need to write custom scripts to mount the diskgroups on the test server would be needed.
- The Grid Infrastructure could be used to control the “cold failover” of the non-RAC database instance from the live server to the test server
- Since the diskgroups could be mounted already they would be accessible and the live database could start up normally on the test server
It was an interesting bit of troubleshooting, and served to re-enforce my classroom message about testing all failure scenarios when planning for High Availability. Instead of suffering mounting failures, aim for success.