Create Alarms and Actions
2. Select Monitoring in the drop-down menu. The Monitoring Manager panel displays.
3. Select the Create button in the Alarms tab.
4. In the Create panel, create an alarm by performing the following substeps:
- Name: choose an appropriate name.
- Server: select a virtual instance.
- Metric: select the relevant metric.
- Select Total for absolute values (for example, CPU utilization).
- Total is not supported for other metrics unless you use Delta aggregation.
- Select a per-time rate in all other cases.
- Threshold Type: define how the metric value is compared with the threshold.
- Threshold: enter the threshold value.
- Range: define the time range over which the metric must be outside the configured threshold before an alarm event is triggered. The minimum time allowed is 120 seconds.
- Range Aggregation: time range aggregation types are Average or Delta.
- Average calculates the average of the defined metrics for all samples collected within the defined range. The average metric must be outside the configured parameter for the defined range to trigger an action event. This range aggregation is useful to detect constant load patterns.
- Delta compares the change of the defined metric during the defined range. This range aggregation helps detect significant positive or negative spikes within the load patterns.
- Alarm Delay: define the duration between the monitoring metric exceeding the configured threshold parameters and sending an alarm notification. The minimum time allowed is zero seconds, which will trigger an action event immediately once the defined criteria are fulfilled. You can also delay the alarm notification by setting a higher value. Setting a higher alarm delay will consider further metric samples in the calculation and continue shifting the range as time passes. As a result, the system may return to its regular functioning mode after a brief spike in the load pattern.
- Actions: select one or more actions to execute when the alarm triggers.
- Select Create to save the alarm.
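The two range aggregation types can be illustrated with a minimal Python sketch. It is not the service's implementation: the function name aggregate is hypothetical, and Delta is assumed here to mean the newest sample minus the oldest sample in the range, since the text above only states that it measures the change during the range.

```python
from statistics import mean


def aggregate(samples, aggregation):
    """Reduce the samples collected within one range to a single value.

    `samples` is a list of metric values ordered from oldest to newest.
    The function name and signature are hypothetical.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if aggregation == "average":
        # Average: mean of all samples collected within the range.
        return mean(samples)
    if aggregation == "delta":
        # Delta (assumed definition): change between oldest and newest sample.
        return samples[-1] - samples[0]
    raise ValueError(f"unknown aggregation: {aggregation}")


window = [20, 30, 50, 70]
print(aggregate(window, "average"))  # 42.5
print(aggregate(window, "delta"))    # 50
```

For the same four samples, the average stays moderate while the delta is large, which is why Average suits constant load patterns and Delta suits spikes.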
Once an alarm has been created, you can either edit the alarm configuration or delete it if it is no longer needed.
The details view provides an option to list the status and history of an alarm.
When an alarm triggers, the alarm icon blinks to indicate that the configured threshold has been exceeded. The blinking stops automatically once the system is back within the threshold, which is evaluated over the defined range. The alarm status is OK while the system runs within the boundaries of the threshold. Once the system is outside the boundaries and meets the criteria for triggering an action, the status changes from OK to FIRING. This transition executes the defined action. When the system returns to the defined threshold boundaries, the status changes from FIRING back to OK. This status change is visible in the alarm manager but does not trigger an action.
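The status transitions described above can be sketched as a small state machine. This is an illustrative sketch only: the class AlarmState, its update method, and the on_fire callback are hypothetical names, and it assumes that an alarm which remains in FIRING does not execute its action again.

```python
class AlarmState:
    """Tracks the alarm status and runs the action on OK -> FIRING only."""

    def __init__(self, on_fire):
        self.status = "OK"
        self.on_fire = on_fire  # hypothetical callback, e.g. send an email

    def update(self, criteria_met):
        previous = self.status
        self.status = "FIRING" if criteria_met else "OK"
        # Only the OK -> FIRING transition executes the action.
        # FIRING -> OK is a visible status change without an action.
        if previous == "OK" and self.status == "FIRING":
            self.on_fire()
        return self.status


executed = []
alarm = AlarmState(on_fire=lambda: executed.append("email"))
for criteria_met in [False, True, True, False, True]:
    alarm.update(criteria_met)
print(len(executed))  # 2
```

The action runs twice in this sequence: once for each transition from OK to FIRING, and not while the alarm stays in FIRING or returns to OK.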
In the following example, the alarm configuration defines an average CPU utilization of 70% or higher as critical, with a range of two minutes and an alarm delay of zero seconds. For simplicity, the example assumes that the system collects four samples in two minutes, although it actually collects metrics more often.
- Sample01: 20% CPU utilization
- Sample02: 30% CPU utilization
- Sample03: 50% CPU utilization
- Sample04: 70% CPU utilization
The alarm does not trigger for this two-minute range because the average of the samples in the range is 42.5%, which is below 70%. Consider the subsequent samples:
- Sample05: 71% CPU utilization
- Sample06: 72% CPU utilization
- Sample07: 75% CPU utilization
- Sample08: 30% CPU utilization
When it reaches Sample07, the monitoring service calculates an average CPU utilization of 72% for the last four samples, which is above 70%. As the alarm delay is set to zero seconds, it triggers an action to send an alarm immediately.
Assume the same example but with an alarm delay of 30 seconds. For Sample07, the monitoring service again calculates an average CPU utilization above 70%. This time, however, the system does not trigger an action immediately, because an alarm delay of another 30 seconds is configured, during which the monitoring system continues to collect metrics. When it receives Sample08, the monitoring service calculates the average CPU utilization for the previous 120 seconds, which is 62% and therefore below 70%. The system does not trigger an action because the alarm threshold criteria are no longer fulfilled.
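The arithmetic of both scenarios can be checked with a short Python sketch. It is a simplified model under stated assumptions, not the monitoring service's actual algorithm: samples arrive every 30 seconds (derived from four samples in two minutes), the function name first_trigger is hypothetical, and the alarm is assumed to fire only if the average has stayed at or above the threshold for the whole alarm delay.

```python
SAMPLE_INTERVAL = 30  # seconds between samples (assumed: four samples in two minutes)


def first_trigger(samples, threshold, range_seconds, delay_seconds):
    """Return the 1-based number of the sample at which the alarm fires, or None."""
    window_size = range_seconds // SAMPLE_INTERVAL
    breach_started = None  # index of the sample where the current breach began
    for i in range(window_size - 1, len(samples)):
        # Sliding window: the most recent samples covering the range.
        window = samples[i - window_size + 1 : i + 1]
        average = sum(window) / window_size
        if average >= threshold:
            if breach_started is None:
                breach_started = i
            # Fire only once the breach has lasted for the full alarm delay.
            if (i - breach_started) * SAMPLE_INTERVAL >= delay_seconds:
                return i + 1
        else:
            # The average is back within the threshold: reset the breach.
            breach_started = None
    return None


cpu = [20, 30, 50, 70, 71, 72, 75, 30]
print(first_trigger(cpu, threshold=70, range_seconds=120, delay_seconds=0))   # 7
print(first_trigger(cpu, threshold=70, range_seconds=120, delay_seconds=30))  # None
```

With a delay of zero seconds the alarm fires at Sample07. With a delay of 30 seconds, the window has shifted to Sample05 through Sample08 by the time the delay expires, the average drops to 62%, and no action is triggered.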
The examples below show possible configurations for the expression property of an alarm.
Current CPU Load for all Cores
Set a trigger when the average load of all cores over the last hour exceeds 90%.
Increase in Received Bytes
Set a trigger when more than 1 MB is received within the last ten minutes.
Sent Packets per Second (lower bound)
Set a trigger when there is less than one outgoing packet per second.
Storage Writes per Second
Set a trigger when there are more than 100 write operations per second.
In the Actions section, you can configure an action that will be executed when an alarm is activated. Currently, MaaS supports email notifications.
1. Open the Monitoring Manager and select the Actions tab. The Actions tab displays.
2. Create an Action by performing the following sub-steps:
- Define a name for the action.
- Select the action type (currently, only sending email is supported).
- Provide an Email address.
- Select Create.
After creation, you can edit the action configuration or delete it if it is no longer needed.
The details view of a created action provides an option to list the execution history of the action.
You can only delete an action when it is not in use by an alarm.