API Documentation | Rule Set

Rule sets let you define a collection of rules to apply to a given metric (or to metrics that match a given pattern). The rules are processed in order, and the first one found to be in violation generates an alert and stops further processing. Each rule has a severity, which links to the contact group or groups to be notified about the alert.

Fields

_cid string
The primary key
"/rule_set/194602"
A string containing a rule_set cid
_host string
The hostname or IP that the check for this ruleset gets its data from
"acme.circonus.com"
A string containing freeform text
check string
The ID of the check which collects the metric this ruleset is defined for.
"/check/11641"
A string containing a check cid
contact_groups object
A collection of contact_groups to contact at each severity level if this ruleset faults.
{"5":[],"1":["/contact_group/426","/contact_group/428"],"4":[],"3":[],"2":["/contact_group/428"]}
An object
1 array
A list of contact groups to contact when this ruleset generates a severity level 1 fault
An array
string (zero or more times)
A contact group to contact
A string containing a contact_group cid
2 array
A list of contact groups to contact when this ruleset generates a severity level 2 fault
An array
string (zero or more times)
A contact group to contact
A string containing a contact_group cid
3 array
A list of contact groups to contact when this ruleset generates a severity level 3 fault
An array
string (zero or more times)
A contact group to contact
A string containing a contact_group cid
4 array
A list of contact groups to contact when this ruleset generates a severity level 4 fault
An array
string (zero or more times)
A contact group to contact
A string containing a contact_group cid
5 array
A list of contact groups to contact when this ruleset generates a severity level 5 fault
An array
string (zero or more times)
A contact group to contact
A string containing a contact_group cid
derive string
A summary of the windowing_function used in each rule. If you pass this field, it provides a default windowing_function for every rule you pass that doesn't specify a windowing_function in its own rule definition. When read, this field contains the name of the windowing_function used in the ruleset if and only if all rules use the same windowing_function. If more than one windowing function is used, this field will read 'mixed', and if no rules are defined in this ruleset it will have a null value.
null
A string containing either null, 'average', 'stddev', 'derive', 'derive_stddev', 'counter', 'counter_stddev', 'derive_2', 'derive_2_stddev', 'counter_2' or 'counter_2_stddev'. May also contain the string 'mixed' when read.
filter string
A tag filter expression. See https://www.irondb.io/docs/tag-queries.html
A string containing freeform text
link string
A link to external documentation (or anything else you feel is important) for this metric that will show up in email alerts and in the Circonus UI.
"http://example.com/how2fix/webserver_down.html"
A string containing a URL
metric_name string
The name of the metric this ruleset is defined for.
"tt_firstbyte"
A string containing freeform text
metric_pattern string
A pattern to match on rather than using metric_name
A string containing freeform text
metric_tags array

An array of tags. The tags in the array are automatically sorted, deduplicated and transformed into their lowercase canonical form.
string (zero or more times)
An associated tag
A tag is just a string, with or without a colon, such as 'foo', 'bar', 'datacenter:london', or 'os:linux'. The part of the string before the colon is considered the category the tag is in; Tag strings without a colon will place the string in the 'uncategorized' category. Circonus will lowercase the contents of the string before storing it.
metric_type string
The type of values the metric returns
"numeric"
A string containing either 'numeric' or 'text'
name string
Optional name to display instead of check and metric / pattern+filter. Short and unique names work best.
"webserver response"
A string containing freeform text
notes string
Any notes on how to recover alerts for this metric, or anything you feel is important to convey to any responders to alerts. These notes show up in the Circonus UI.
"Determine if the HTTP request is taking too long to start (or is down.) Don't fire if ping is already alerting"
A string containing freeform text
parent string
The metric id of the parent for this ruleset. If the parent is in a severity 1 status, alerts for its children will be silenced until it clears.
"11640_ping"
A string containing a fully qualified metric name in the format <digits>_<string>.
rules array
The rules used to evaluate the metric. Rules are assessed in order until the first matching rule is determined.
[{"windowing_duration":300,"windowing_function":null,"severity":"1","wait":5,"value":"300","criteria":"on absence"},{"value":"1000","criteria":"max value","windowing_duration":300,"windowing_function":null,"severity":"2","wait":5}]
An array
object (zero or more times)

An object
criteria string
The criteria Circonus should evaluate the metric on to determine if the ruleset is faulting
A string containing either 'match', 'does not match', 'contains' or 'does not contain' for a text metric, 'min value' or 'max value' for a numeric metric, or 'on change' or 'on absence' for either a numeric or text metric
severity number
The severity of the alert this rule should raise (and by implication, which of the contact groups in contact_groups should be contacted).
A number containing an integer between 0 and 5 inclusive
value string
The value the criteria will compare the metric value against.
A string or number to compare against. For an 'on absence' rule this is the number of seconds the metric must not have been collected for, and should not be lower than the period and timeout of the metric being collected.
wait number
The length of time we should wait before contacting the contact groups after this ruleset has faulted.
A number containing an integer number of minutes to wait
windowing_duration number
The time, in seconds, over which windowing is applied (or null when no windowing)
A number containing an integer greater than or equal to zero
windowing_function string
The type of windowing to perform (or null when no windowing)
A string containing either null, 'average', 'stddev', 'derive', 'derive_stddev', 'counter', 'counter_stddev', 'derive_2', 'derive_2_stddev', 'counter_2' or 'counter_2_stddev'
tags array

An array of tags. The tags in the array are automatically sorted, deduplicated and transformed into their lowercase canonical form.
string (zero or more times)
An associated tag
A tag is just a string, with or without a colon, such as 'foo', 'bar', 'datacenter:london', or 'os:linux'. The part of the string before the colon is considered the category the tag is in; Tag strings without a colon will place the string in the 'uncategorized' category. Circonus will lowercase the contents of the string before storing it.
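The tag handling described for metric_tags and tags (lowercasing, deduplication, sorting, and the category before the colon) can be illustrated with a minimal Python sketch. The function names are hypothetical, and plain string ordering is an assumption; the exact sort order Circonus applies is not stated here.

```python
def normalize_tags(tags):
    """Lowercase, deduplicate and sort a list of tag strings.

    Assumption: plain lexicographic ordering stands in for whatever
    canonical ordering the service applies.
    """
    return sorted({tag.lower() for tag in tags})


def tag_category(tag):
    """Return the part before the colon, or 'uncategorized' if there is no colon."""
    category, separator, _ = tag.partition(":")
    return category if separator else "uncategorized"
```

For example, normalize_tags(["OS:Linux", "foo", "os:linux"]) collapses the two spellings of the same tag into one lowercase entry.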

Example

Fetching a Rule Set

Fetching details for a rule set is as simple as performing a GET on the rule set cid:

GET /rule_set/194602
{"_cid":"/rule_set/194602","check":"/check/11641","name":"webserver response","notes":"Determine if the HTTP request is taking too long to start (or is down.) Don't fire if ping is already alerting","contact_groups":{"1":["/contact_group/426","/contact_group/428"],"4":[],"5":[],"3":[],"2":["/contact_group/428"]},"rules":[{"severity":"1","windowing_duration":300,"windowing_function":null,"wait":5,"value":"300","criteria":"on absence"},{"value":"1000","criteria":"max value","severity":"2","wait":5}],"derive":null,"parent":"11640_ping","metric_name":"tt_firstbyte","_host":"acme.circonus.com","link":"http://example.com/how2fix/webserver_down.html","metric_type":"numeric"}

From this we can see some important things:

  • We're monitoring the "tt_firstbyte" (time till first byte) metric of the check "/check/11641".
  • Since the rules are evaluated in order, the rule set first checks whether no results have been gathered in the last five minutes. If so, it raises a severity 1 alert; if this hasn't been resolved after a further five minutes, it contacts all the contacts in the severity 1 contact groups (/contact_group/426 and /contact_group/428).
  • If the on absence rule didn't match, the rule set next checks the max value of the metric (i.e. the time in milliseconds it took to get the first byte back). If this is more than 1000 (i.e. more than a second), it raises a severity 2 alert; if this alert hasn't been resolved after five minutes, the contacts listed for the severity 2 contact groups are contacted (just /contact_group/428).
  • We can also see that this ruleset won't generate an alert at all if the parent rule set is already faulting; i.e., if the parent ping test is failing then we assume the whole server is offline and we don't need to also worry about informing anyone about the webserver breakage on top of that.
  • When something goes wrong, the API has been provided with information in the form of notes and a link that will be easily accessible to people when they are notified, hopefully giving them the procedure they need to follow to fix it.
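The in-order, first-match evaluation walked through above can be sketched in Python. This is only an illustration under stated assumptions, not how Circonus implements it: it handles numeric 'max value', 'min value' and 'on absence' rules only, ignores windowing, the wait delay and parent silencing, and the comparison operators for 'min value' and 'on absence' are guesses. The function name is hypothetical.

```python
def first_matching_rule(rules, value, seconds_since_last_seen):
    """Return the first rule in list order that is violated, or None.

    value is the latest numeric metric value, or None if nothing was collected.
    seconds_since_last_seen is how long ago the metric was last collected.
    """
    for rule in rules:
        criteria = rule["criteria"]
        threshold = float(rule["value"])
        if criteria == "on absence":
            # Assumption: absent once the gap reaches the configured seconds.
            if seconds_since_last_seen >= threshold:
                return rule
        elif value is None:
            # No value to compare against; only absence rules can match.
            continue
        elif criteria == "max value" and value > threshold:
            return rule
        elif criteria == "min value" and value < threshold:
            return rule
    return None


# The two rules from the fetched rule set above.
EXAMPLE_RULES = [
    {"severity": "1", "wait": 5, "value": "300", "criteria": "on absence"},
    {"severity": "2", "wait": 5, "value": "1000", "criteria": "max value"},
]
```

With these rules, a first-byte time of 1500 ms seen 10 seconds ago matches the severity 2 rule, while a metric that has been missing for 400 seconds matches the severity 1 rule and the max value rule is never consulted.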

Creating a New Rule Set

Creating a new rule set is as simple as making an HTTP POST request to /rule_set and passing JSON representing the new rule set:

POST /rule_set
{"metric_type":"numeric","metric_name":"orders`canceled","check":"/check/14214","rules":[{"value":"20","criteria":"max value","severity":"1","wait":10}]}

As is usual, the POST request returns the full JSON in the response (including the new rule set's cid).

Note that without a contact group the alert will still fire, but no one will be informed unless they're looking at the Circonus dashboard.
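As a rough sketch, the same POST could be prepared from Python with the standard library. The base URL and the two X-Circonus-* header names are assumptions that do not appear in this page, so check them against your account's API settings; the function name and the contact group chosen are purely illustrative. The payload adds a contact_groups entry so that someone is actually notified.

```python
import json
import urllib.request

# Assumption: base URL of the API; not stated in this document.
API_BASE = "https://api.circonus.com/v2"

# The rule set from the example above, plus a severity 1 contact group.
EXAMPLE_PAYLOAD = {
    "metric_type": "numeric",
    "metric_name": "orders`canceled",
    "check": "/check/14214",
    "rules": [
        {"value": "20", "criteria": "max value", "severity": "1", "wait": 10}
    ],
    "contact_groups": {"1": ["/contact_group/426"], "2": [], "3": [], "4": [], "5": []},
}


def build_rule_set_request(payload, token, app_name):
    """Build (but do not send) a POST request that creates a rule set."""
    return urllib.request.Request(
        API_BASE + "/rule_set",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            # Assumption: authentication header names.
            "X-Circonus-Auth-Token": token,
            "X-Circonus-App-Name": app_name,
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it; the response body should then contain the full rule set, including its new cid.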

Updating a Rule Set

Rule sets can be updated by using the PUT HTTP method with new field values. Most of the values for a rule set cannot be changed once it has been created. The "metric_name", "metric_type" and "check" fields together form the core identity of the rule set.

You can however alter who the alert contacts for the various severities at any point:

PUT /rule_set/194603
{"contact_groups":{"2":["/contact_group/911"],"3":["/contact_group/112"],"5":[],"4":[],"1":["/contact_group/999"]}}

Or alter the conditions that trigger the alert (and the severities):

PUT /rule_set/194603
{"rules":[{"criteria":"max value","value":"50","wait":10,"severity":"1"}]}

Care should be taken to ensure that the rules make sense. In particular you should ensure:

  • If you care about being alerted when the metric can't be collected for any reason, you should ensure an "on absence" rule is properly added; without this, a time till first byte alert for an HTTP check wouldn't be triggered by a "max value" rule alone.
  • That the rules make sense when evaluated in the order they're listed. The first matching rule stops all further evaluation. Consider a case where you add a low priority rule ahead of a high priority rule, and they both would be triggered by a particular event. The high priority rule will never trigger in that situation because the low priority rule is first in the list.
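Both checks in the list above can be partly automated before sending a PUT. The following is a hedged sketch with hypothetical function names: it only detects the simplest kind of shadowing (two 'max value' or two 'min value' rules with identical windowing settings), and it assumes 'max value' matches when the metric exceeds the threshold and 'min value' when it falls below it.

```python
def _shadows(earlier, later):
    """True if every value matching `later` would already match `earlier`."""
    criteria = earlier["criteria"]
    if criteria != later["criteria"] or criteria not in ("max value", "min value"):
        return False
    for key in ("windowing_function", "windowing_duration"):
        if earlier.get(key) != later.get(key):
            return False
    first, second = float(earlier["value"]), float(later["value"])
    # A lower max threshold (or higher min threshold) listed first always wins.
    return first <= second if criteria == "max value" else first >= second


def lint_rules(rules):
    """Return a list of human-readable warnings about a rules array."""
    warnings = []
    if not any(rule["criteria"] == "on absence" for rule in rules):
        warnings.append("no 'on absence' rule: missing data will not alert")
    for j, later in enumerate(rules):
        for i, earlier in enumerate(rules[:j]):
            if _shadows(earlier, later):
                warnings.append(f"rule {j} can never fire: rule {i} always matches first")
    return warnings
```

An empty result does not prove the rules are sensible; it only means neither of these two specific problems was found.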

Using a Windowing Function

Sometimes you don't want to alert when something changes, but only if it changes too quickly. Circonus supports a selection of "windowing functions" that can be used to monitor rate of change over a given duration of time.

For example, to check that the number of canceled orders doesn't change by more than five over a five minute period:

PUT /rule_set/194603
{"rules":[{"wait":0,"windowing_function":"derive","windowing_duration":300,"severity":"1","criteria":"max value","value":"5"}]}
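The relationship between each rule's windowing_function and the top-level derive field, as described in the Fields section, can be sketched as two small Python helpers. The function names are hypothetical, and treating "no windowing_function passed" as "key missing from the rule object" is an assumption.

```python
def summarize_derive(rules):
    """Mimic the read-side value of the top-level derive field.

    Returns None when there are no rules, the shared windowing_function
    when every rule uses the same one (which may itself be None), and
    'mixed' otherwise.
    """
    if not rules:
        return None
    functions = {rule.get("windowing_function") for rule in rules}
    if len(functions) == 1:
        return functions.pop()
    return "mixed"


def apply_default_derive(rules, derive):
    """Mimic the write-side default: fill in windowing_function where it is missing."""
    return [
        {**rule, "windowing_function": rule.get("windowing_function", derive)}
        for rule in rules
    ]
```

For the single-rule PUT above, summarize_derive would return 'derive', since the only rule uses that windowing function.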

Listing and Searching Rule Sets

Rule sets can be listed by performing an HTTP GET request on /rule_set. You can search in the usual way, for example to list only rule sets monitoring check /check/14214:

GET /rule_set?search=(check_id%3A14214)
[{"parent":null,"derive":null,"metric_name":"duration","contact_groups":{"2":["/contact_group/911"],"3":["/contact_group/112"],"4":[],"1":["/contact_group/999"],"5":[]},"rules":[{"windowing_duration":300,"severity":1,"windowing_function":null,"wait":10,"value":"50","criteria":"min value"}],"link":null,"metric_type":"numeric","_host":"acme.circonus.com","_cid":"/rule_set/194603","check":"/check/14214","notes":null}]
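The percent-encoding in the search URL above (the colon becomes %3A while the parentheses are left alone) can be reproduced with Python's standard library. This is a small sketch; the function name is hypothetical, and only the check_id search term shown in the example is covered.

```python
from urllib.parse import quote


def rule_set_search_path(check_id):
    """Build the request path that lists rule sets for one check id."""
    query = f"(check_id:{check_id})"
    # Keep the parentheses literal, as in the example; encode everything else.
    return "/rule_set?search=" + quote(query, safe="()")
```

Calling rule_set_search_path(14214) yields the same path as the GET request shown above.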

Removing Rule Sets

Rule sets can be removed by using the DELETE HTTP method on the cid:

DELETE /rule_set/194603