I am often asked about monitoring and management when teaching the Exadata Database Machine administration course, both to customers and to Oracle internal staff. This is a popular topic due to the nature of the Database Machine, the monitoring of which requires knowledge and skill in several areas. These include:
- database administration
- storage administration
- operating system administration
- network administration
- hardware knowledge
Some monitoring and performance management may be done using command line tools. Some of these may be familiar to DBAs, whilst others are more familiar to OS and network administrators. The course discusses these only briefly, as it focuses primarily on concepts, architecture, and cell administration. The tools include:
- the cellcli utility to monitor the Exadata cells and to set thresholds for alerts.
- the Automatic Diagnostic Repository (ADR) and the adrci command line tool for managing and monitoring traces, dumps and alert logs on database servers and on cells.
- SNMP traps to receive alerts from, and to configure device parameters on the database server
- SNMP traps to receive alerts from, and to configure device parameters on the Exadata cells
- SNMP traps to receive alerts from, and to configure device parameters on the Ethernet switch
- SNMP traps to receive alerts from, and to configure device parameters on the Infiniband switch
- SNMP traps to receive alerts from, and to configure device parameters on the Keyboard, Video and Mouse (KVM) switch
- SNMP traps to receive alerts from, and to configure device parameters on the Power Distribution Unit (PDU)
- ILOM command line and GUI interfaces to monitor hardware, control power state and provide remote console for Exadata cells and DB servers
Monitoring may be made easier, however, by using some or all of the Enterprise Manager plug-ins for the Database Machine. Each plug-in contains logic for a specific DBM component and extends the GUI of the Enterprise Manager Oracle Management Service (OMS) and the Enterprise Management Agent to manage the custom targets contained in a Database Machine.
The plug-ins available are:
- Exadata Storage Server or Cell plug-in
- Database Server ILOM plug-in
- Infiniband Switch System Monitoring plug-in
- Cisco Switch System Monitoring plug-in
- Avocent MergePoint (KVM) plug-in
- Power Distribution Unit (PDU) plug-in
Enterprise Manager is well suited to this type of monitoring, as one may define an Aggregate Service comprising the various targets in a Database Machine. In an X2-2 full rack there are more than 60 targets, and having a single “dashboard” to represent the entire Database Machine simplifies monitoring. Each type of component may be grouped into an abstract collection called a System, and then the Systems are aggregated together into the Aggregate Service. Finally, a monitoring “dashboard” based on the Aggregate Service is created to facilitate monitoring from a single screen in Enterprise Manager.
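The roll-up idea can be illustrated with a small conceptual model. This is only a sketch of the grouping logic, not the Enterprise Manager API: the class names, the status levels and the target names are all assumptions made for illustration, and the rule that a group is as healthy as its worst member is a simplification of whatever availability rules one actually configures.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical status levels, ordered so that a higher value is worse."""
    UP = 0
    WARNING = 1
    CRITICAL = 2
    DOWN = 3


@dataclass
class System:
    """A named collection of targets of one component type."""
    name: str
    targets: dict[str, Status] = field(default_factory=dict)

    def status(self) -> Status:
        # Simplifying assumption: a System is as healthy as its worst member.
        return max(self.targets.values(), default=Status.UP)


@dataclass
class AggregateService:
    """Rolls several Systems up into one dashboard-level status."""
    name: str
    systems: list[System] = field(default_factory=list)

    def status(self) -> Status:
        return max((s.status() for s in self.systems), default=Status.UP)

    def target_count(self) -> int:
        return sum(len(s.targets) for s in self.systems)


def build_example() -> AggregateService:
    # Target names are invented; only three of the component types are shown.
    cells = System("Exadata cells", {f"cell{i:02d}": Status.UP for i in range(1, 15)})
    switches = System(
        "Infiniband switches",
        {"ib-sw1": Status.UP, "ib-sw2": Status.WARNING, "ib-sw3": Status.UP},
    )
    pdus = System("PDUs", {"pdu-a": Status.UP, "pdu-b": Status.UP})
    return AggregateService("DBM rack 1", [cells, switches, pdus])


if __name__ == "__main__":
    service = build_example()
    print(service.name, service.status().name, service.target_count())
```

A single warning on one switch is enough to change the status of the whole service in this model, which is the behaviour one wants from a single-screen dashboard: the operator sees that something needs attention and then drills down into the System concerned.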
Here is a brief summary of each plug-in, discussing the purpose and general use.
1. Exadata Storage Server plug-in
This extends the monitoring of Exadata cells in addition to providing a GUI. The plug-in uses an SSH connection to the cellmonitor user on the cells and uses list commands only. This is for interactive monitoring. One may also set thresholds using the plug-in; these are distinct from any thresholds set using the cellcli utility as the celladmin user. For alerts to be sent to the plug-in, SNMP traps are used as follows:
- Cell ILOM alerts are sent to the cell Management Server (MS) via an SNMP trap. The MS then sends SNMP notifications onward to the plug-in.
- Cell alerts flagged by MS itself, such as cell thresholds being exceeded, or ADR software alerts, are sent to the plug-in using SNMP.
Note: cells may also send SNMP notifications to multiple locations including third party tools. This may be configured by the celladmin user using the cellcli utility, or by running the distributed command line interface (dcli) utility.
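As an illustration of the note above, here is a minimal sketch that builds the cellcli statement for adding SNMP subscribers and wraps it in a dcli call so that it runs on every cell. The attribute names (snmpSubscriber, notificationMethod, notificationPolicy) and the dcli options (-g, -l) are written as I understand them and should be checked against the documentation for your release. The host name, community string and group file name are placeholders, and the helper functions themselves are hypothetical.

```python
import shlex


def snmp_subscriber_clause(subscribers):
    """Format (host, port, community) tuples as a snmpSubscriber attribute value."""
    entries = [
        f"(host='{host}',port={port},community='{community}')"
        for host, port, community in subscribers
    ]
    return "snmpSubscriber=(" + ",".join(entries) + ")"


def alter_cell_command(subscribers):
    """Build one ALTER CELL statement that enables SNMP notifications."""
    return (
        "ALTER CELL "
        + snmp_subscriber_clause(subscribers)
        + ",notificationMethod='snmp'"
        + ",notificationPolicy='critical,warning,clear'"
    )


def dcli_command(cellcli_statement, group_file="cell_group", user="celladmin"):
    """Return an argv list that runs the statement on every cell in group_file.

    group_file is assumed to be a text file listing one cell host name per line.
    """
    remote = "cellcli -e " + shlex.quote(cellcli_statement)
    return ["dcli", "-g", group_file, "-l", user, remote]


if __name__ == "__main__":
    # Placeholder subscriber: replace with the real trap receivers.
    statement = alter_cell_command([("nms.example.com", 162, "public")])
    print(statement)
    print(dcli_command(statement))
```

The sketch only builds the command; it does not execute anything. As far as I am aware, setting snmpSubscriber replaces the existing list rather than appending to it, so any subscriber already configured (for example the Enterprise Manager agent) should be included in the list passed in.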
2. Database Server ILOM plug-in
Monitoring of databases and their instances, ASM environments, the Grid Infrastructure, and the host software environment is done by Enterprise Manager in the usual way, as these are standard targets. Monitoring the hardware of the database servers, however, requires the ILOM plug-in, as there is no Management Server (MS) on the database servers to receive SNMP traps from the ILOM. The plug-in receives sensor state and availability data from the ILOM, including alerts based on pre-set ILOM thresholds.
3. Infiniband Switch System Monitoring plug-in
Infiniband switches running software from version 1.1.3 onwards can send SNMP data and the plug-in will show if the switches are up and if any alerts have been sent. If necessary one may then ssh to the switch ILOM and use native command line tools to diagnose problems.
Note: There are also numerous command line tools on the database servers and cells to check the infiniband ports, which are beyond the scope of this overview.
4. Cisco Switch System Monitoring plug-in
Clients and middle tiers connect to the database instances on the Database nodes over the Client Access Network. The Database nodes use the Infiniband network for the Cluster Interconnect, and Database nodes and Exadata Storage servers use the Infiniband network as the storage network. As the Cisco switch is used on the DBM management network, it is not a critical component in the same way as the components already listed, but the switch is needed for admin and monitoring traffic in the DBM. All the ILOM SNMP traffic, for example, uses this network.
Monitoring of the switch using the plug-in is for switch availability and hardware sensor thresholds. The default values are set in the plug-in to monitor for switch availability. SNMP traps are sent by the switch for asynchronous notification of alerts.
Note: Customers may replace the Cisco switch if they prefer, as it is not considered a critical component, but they would then be responsible for monitoring it.
5. Avocent MergePoint (KVM) plug-in
All models of DBM have the KVM switch, except for the X2-8, which has no room for it in the rack. Console access to the machines on the X2-8 is done using the ILOM. The plug-in simply monitors for availability. Alerts are sent to the plug-in via SNMP traps.
Note: Customers may replace the KVM if they prefer but then they would be responsible for monitoring.
6. Power Distribution Unit (PDU) plug-in
There are two PDUs per rack. Monitoring them using the plug-in is optional, but if one wishes to do so, each PDU will require an Ethernet cable connection. On the X2-2 full rack, all 48 ports on the Cisco switch are used, so a network drop from an external switch will be required to connect. It is good practice to do the same on the X2-8 despite the availability of ports, because the failure of a PDU could also affect the Cisco switch in the DBM, preventing monitoring data and alerts from being sent on the management network. Remote monitoring using the plug-in allows one to observe the electric power used in the DBM, as well as any alerts. The factors that affect power consumption are:
- Type of DBM – quarter, half or full rack
- Version of hardware – V2, X2-2 or full rack X2-8
- Type of power supply – low or high voltage; one or three phase power
Note: My Oracle Support note 1299851.1 covers the threshold settings for the PDUs depending on the variations in model and power.
Installing and configuring all these plug-ins can be time consuming, especially for people who are new to Enterprise Manager, the Database Machine or both. As of release 188.8.131.52.x of the Exadata software, Enterprise Manager setup scripts are provided to help customers install and configure the monitoring tools and plug-ins. They include:
- A script to set up EM agents on the Database Machine
- A script to set up a new Grid Control OMS and OMR for customers new to EM
These scripts will save time by:
- Installing the latest Exadata DBM plug-ins
- Discovering all Exadata DBM system targets
- Configuring all the Exadata DBM targets
The Plug-in bundle may be downloaded from http://www.oracle.com/technetwork/oem/grid-control/downloads/devlic-188770.html
Oracle University will soon offer a seminar on Exadata Database Machine Monitoring where the plug-ins mentioned here and various command line monitoring tools will be examined in detail.