Epoch AOC: Alerts
Epoch Alerts notify you before critical issues occur. This section walks through setting up and managing Alerts, and through the options for routing notifications to PagerDuty and webhooks.
Creating An Alert
Create an alert by clicking the New Alert button in the Manage Alerts tab.
Creating new alerts
An alert definition comprises multiple sections, as described below.
Create query statements for an alert. A query statement includes a metric, aggregate datapoint and filter selection. Query Builder enables you to build multiple query statements both for the required main query and the optional subquery. You can also create an expression statement supporting simple arithmetic operations, rolling aggregates and top-N ranking. Read more in the Expressions documentation.
Set a threshold for critical and warning alerts. The defined average, min or max value is compared against critical and warning thresholds in the selected time frame. You can also create a "no data" alert when data is missing for a certain time.
Notifications use triggered alert descriptions in their payload.
You can specify a message subject and body, as shown in the example below, using the available template variables:
- alertName
- alertStatus
- triggeredGroup
- alertThreshold
- alertOperator
- alertAggregation
- aggregationDuration
- evaluatedValue
- triggeredTime
- triggeredTimeUnix_ms
- triggeredTimeUnix_sec
- triggeredTimeUTC
- triggeredTimeISO
- alertLink
If multiple groupbys have been selected for the alert (for example,
triggeredGroup will evaluate, in the alert notification, to:
http.uri: /login, server.host_name: hostname
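As a rough illustration of the output format above, the following sketch joins groupby key/value pairs into a single string. The function name `format_triggered_group` and the dictionary representation of a group are assumptions made for this example; they are not part of Epoch.

```python
def format_triggered_group(group: dict) -> str:
    """Join groupby key/value pairs as 'key: value, key: value'.

    Hypothetical helper: Epoch performs this substitution itself when it
    renders the triggeredGroup template variable.
    """
    return ", ".join(f"{key}: {value}" for key, value in group.items())


# Example group with two groupbys, using the values shown above.
group = {"http.uri": "/login", "server.host_name": "hostname"}
print(format_triggered_group(group))
```

Python dictionaries preserve insertion order, so the pairs appear in the order in which they were added.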
You can also specify a key to get the corresponding value, using the following format:
Specify a notification recipient. See more in the Alert Notification Settings section below.
Set a global filter for an alert. This global filter is ANDed with the query filters. Read more in the Query Builder documentation.
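The AND semantics can be sketched as follows. This is a simplified model that assumes filters are plain equality matches on record fields; the function names and the dictionary-based filter representation are hypothetical, and the actual filter syntax is described in the Query Builder documentation.

```python
def matches(record: dict, flt: dict) -> bool:
    """True when every key in the filter equals the record's value (assumed equality-only filters)."""
    return all(record.get(key) == value for key, value in flt.items())


def passes(record: dict, query_filter: dict, global_filter: dict) -> bool:
    """A record is kept only if it satisfies the query filter AND the global filter."""
    return matches(record, query_filter) and matches(record, global_filter)


record = {"http.uri": "/login", "server.host_name": "web-1"}
print(passes(record, {"http.uri": "/login"}, {"server.host_name": "web-1"}))
print(passes(record, {"http.uri": "/login"}, {"server.host_name": "web-2"}))
```

The first call matches both filters; the second fails the global filter, so the record is excluded even though the query filter matches.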
An alert is always associated with a status. The following are the four alert status values with their respective color-codes:
- Red: critical threshold violation
- Orange: warning threshold violation
- Green: no violations
- Grey: “no data” alert shown when data is missing for a metric - only applicable to metrics that have been seen in the past but are missing data in the evaluation window of the alert
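The mapping from thresholds to status colors can be sketched roughly as below. This is not Epoch's implementation: the function name, the set of comparison operators, the rule that the critical threshold is checked before the warning threshold, and the assumption that an empty window belongs to a previously seen metric are all illustrative.

```python
import operator
from statistics import mean

# Aggregations named in the threshold section: average, min, max.
AGGREGATES = {"average": mean, "min": min, "max": max}
# Assumed set of comparison operators for this sketch.
OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def evaluate_status(values, aggregation, op, critical, warning):
    """Return a status color for the datapoints in one evaluation window."""
    if not values:
        # Assumes the metric was seen in the past but has no data in this window.
        return "grey"
    evaluated = AGGREGATES[aggregation](values)
    compare = OPERATORS[op]
    if compare(evaluated, critical):
        return "red"
    if compare(evaluated, warning):
        return "orange"
    return "green"


window = [120.0, 180.0, 240.0]
print(evaluate_status(window, "average", ">", critical=200, warning=150))
print(evaluate_status(window, "max", ">", critical=200, warning=150))
print(evaluate_status([], "average", ">", critical=200, warning=150))
```

With these sample values the average is 180, which exceeds only the warning threshold, while the max of 240 exceeds the critical threshold.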
Alert Notification Settings
Alert notification recipients can be configured under Settings > Notifications in the menu. You can receive notifications for an alert through webhooks or PagerDuty.
Manage Alerts tab
Clicking any alert in the Manage Alerts list takes you to the alert page, where you can see the groups (selected in GroupBy) that have been triggered by that alert and modify the alert settings.
Additionally, from the list view you can:
- mute any alert, clone or delete it
- use out-of-the-box templates for creating new alerts
Viewing Alerted Groups
On an alert page, under the Status tab, the All Groups tab shows all the groups associated with the alert. You can also see Triggered Groups, which contains the groups that have been triggered from among those you selected in GroupBy (e.g., http.uri) when creating the alert. You can add, remove, and search for groups of interest from the filter bar.
View triggered groups for an alert