opEvents provides the Event Action Policy as a flexible mechanism for reacting to events. This document briefly describes how to configure the service, the policy language and the currently supported actions.

Action Policy Language

The action policy is configured in conf/EventActions.nmis, primarily in the section named policy. The policy consists of any number of nested if-this-then-that clauses, which specify the conditions an event must conform to and what actions to take in case of a match. Further configuration sections specific to particular actions can be present in the same file.

...

Code Block

'policy' => {
	'1' => {
		IF => 'node.configuration.customer eq "important"',
		THEN => { # a sub-policy
			'10' => {
				IF => 'node.roleType eq "core" and event.event =~ qr{Down}',
				THEN => ['log.disaster()', 'escalate.twentyfourseven()'], # the recommended format for specifying actions is a list
				BREAK => 'true'
			},
			'20' => {
				IF => 'node.roleType eq "distribution" and event.event =~ qr{Down}',
				THEN => 'priority(+2) AND email(admin)', # do avoid this legacy single-string format!
				BREAK => 'false'
			},
		}, # end of the sub-policy
		BREAK => 'false'
	},
	'2' => ...
},

...

No policy actions are performed for events with the property action_checked set to 1 or for events that are (already) acknowledged.
The former can be controlled by custom parser rules, the latter is mostly affected by the configuration options opevents_auto_acknowledge and opevents_auto_acknowledge_up:
With auto-acknowledge enabled, a stateful down event is automatically acknowledged when the corresponding up event arrives. In that case, the up event itself is also automatically acknowledged if and only if opevents_auto_acknowledge_up is set.
If the configuration option opevents_no_action_on_flap is set to true in conf/opCommon.nmis, then no actions are performed on the down event that is involved in a flap event, and the down event is acknowledged. This is the default behaviour.
Policy action handling is delayed by state_flap_window seconds for all stateful events, so that state flaps can be detected before any actions are performed.
Policy action handling is delayed for synthetic events, if the event creation rule sets the property delayedaction.

Supported Policy Actions

...

Action Name	Description
log.logtype()	Log the event to a file, as plain text or in JSON format
script.scriptname()	Execute a user-defined script, possibly capturing the output
escalate.policyname()	Mark this event for escalation using a particular escalation policy
email(contactname)	Email the event details to a particular contact
syslog.targetserver(prio)	Send the event as Syslog message to a Syslog server, optionally overriding the event priority
nmissyslog.targetserver(prio)	Send the event as Syslog message to an NMIS Syslog server, in the format expected by NMIS
priority(adjustment)	Change the priority of the event. Adjustment can be a number between 0 and 10 for fixed assignment, or +number or -number for relative adjustment.
tag.tagname(value)	Set a custom event property's value for static enrichment. Tagname is the name of the property to modify and must be a single string without spaces. Values are not restricted. (In the database the custom tag will be stored as "tag_tagname", hence you cannot overwrite opEvents-internal properties with this action. As a consequence, if your policy has IFs that need a tag's value, then these need to reference the tag with the 'long form' "tag_tagname".) In opEvents 2.0.2 and newer the tagname "kb_topic" is special and controls linking to external data sources.
acknowledge()	Acknowledges the event in question (which stops all escalation activity for the event). Supported in opEvents 2.0.3 and newer.
watchdog.set(waittime), watchdog.disable()	Creates or updates a watchdog timer for the node associated with the current event. The timer is set to expire in waittime seconds from now. If the timer is not disabled or updated before the expiration time, then a synthetic event named "Watchdog Timer expired" is generated. Note that all four watchdog actions are disabled if the current event itself is a watchdog expiration event.
element_watchdog.set(waittime), element_watchdog.disable()	Similar to the previous, but for watchdog timers that are specific to both the node and the element (e.g. an interface) of the current event. Element watchdog timers are independent of node watchdogs and of each other: Updating or disabling an element watchdog for say, eth1 doesn't affect a timer for lo0 for the same node.
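
As a rough illustration of how several of the actions listed above can be combined, here is a policy fragment. It is a sketch only: the rule numbers, the tag name site_class and its value, the wait time of 600 seconds and the event name patterns are all hypothetical. It also assumes that a tag is addressed in an IF like any other event property (event.tag_site_class), and that a tag set by an earlier rule is visible to later rules for the same event.

```perl
'policy' => {
	'1' => {
		# hypothetical enrichment rule: mark events from core nodes
		IF => 'node.roleType eq "core"',
		THEN => [ 'tag.site_class(backbone)' ],	# stored as tag_site_class
		BREAK => 'false'
	},
	'2' => {
		# IFs must reference the tag in its long form, tag_tagname
		IF => 'event.tag_site_class eq "backbone" and event.event =~ qr{Down}',
		THEN => [
			'priority(+2)',		# relative adjustment
			'watchdog.set(600)',	# node watchdog expires in 600 seconds unless updated or disabled
		],
		BREAK => 'false'
	},
	'3' => {
		IF => 'event.event =~ qr{Up}',
		THEN => [ 'watchdog.disable()', 'priority(3)' ],	# fixed assignment between 0 and 10
		BREAK => 'true'
	},
},
```

If the node watchdog set by rule 2 is neither updated nor disabled within the wait time, a "Watchdog Timer expired" event would be generated, and none of the watchdog actions would run for that expiration event.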

Notes for watchdog and element_watchdog

...

The script action lets you execute a program of your choice, and optionally captures and saves that program's output with the event. As usual, the section script of conf/EventActions.nmis contains the required configuration directives:

...

Code Block

'script' => {
	'traceroute_node' => {
		arguments => '--max-hops=20 node.host',
		exec => '/bin/traceroute',
		output => 'save'
	},
	'ping_node' => {
		arguments => '-c 5 node.host',
		exec => '/bin/ping',
		output => 'save'
	},
	# supported since the 2016-11-01 rerelease of version 2.0.6
	'future_proof' => {
		exec => [ "/usr/local/bin/someprogram", "--first-fixed-arg", "no substitution happens here" ],
		arguments => [ "event.node", "event.event", "--extra", "event.details" ],
		output => "save",
		stderr => "save",
		exitcode => "save",
		max_tries => 2,
	},
}

The path to the program file must be given in the exec option. To pass arguments to the program, add them to the arguments option. Any tokens of the form event.name or node.name will be replaced by the named event or node property, respectively. If the option output is set to save, then the output of the program execution is captured and saved with the event in question; otherwise the output is discarded.
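
A script defined this way is invoked from the policy with the script.scriptname() action. The fragment below is a minimal sketch that reuses the script names ping_node and traceroute_node from the configuration above; the rule number and the IF condition are hypothetical.

```perl
'policy' => {
	'1' => {
		# hypothetical condition: run diagnostics for any event whose name contains "Down"
		IF => 'event.event =~ qr{Down}',
		# each action refers to an entry in the 'script' section by its name
		THEN => [ 'script.ping_node()', 'script.traceroute_node()' ],
		BREAK => 'false'
	},
},
```

Because both scripts set output to save, their captured output would be stored with the matching event.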

Please Note:

opEvents versions up to 2.0.3 do not support long-running programs in script actions, and opeventsd blocks until the action program terminates.
From version 2.0.4 onwards, script action handling is asynchronous and parallel, and the event status gets updated whenever processing of a script action completes.
Because of the asynchronous processing your action policy does not have access to any script.<scriptname> event properties.
Up to version 2.0.6, script actions are executed using the system shell.
- As a consequence you have to ensure your script arguments are shell-safe, i.e. that spaces are escaped or suitably quoted, that quotes line up and that the arguments do not contain unescaped shell metacharacters (",',`,!, &...).
- The exit code of the external program is not captured, only its output on STDOUT (and STDERR, unless the exec argument disposes of STDERR explicitly with a "2>..." construct).
- Argument substitution for event.name and node.name may need to be disabled (if your arguments need to contain a verbatim "event.sometext" string).
  This can be done by escaping the "." with an (escaped) backslash. For example
  Code Block
  arguments => 'node.host event\\.event ...and other stuff to feed the program'
  will cause the argument to contain the unsubstituted text 'event.event'. Note the use of single quotes.
Since the refresh of opEvents 2.0.6 on 2016-11-01, script actions are no longer executed using a shell, but directly by opeventsd instead.
This is much safer from a security perspective, and also generally faster.
- It is recommended (but not required) that you change your script configuration to use the new list format for arguments (and exec), as shown in the example above (see "future_proof").
  If you use the list format, then each token is analysed for potential property substitution and then passed on to your program, separate from all other tokens.
  Spaces, backticks or other shell metacharacters are thus no longer problematic in an event or node property.
- You can continue using the single-string arguments or exec, but then opEvents will perform the necessary word-splitting and minimal amendments for backwards-compatibility only:
  If your arguments string contains quoted tokens like "--some_program_arg=event.event", the surrounding double (or single) quotes are stripped.
  Please note that this is not performed for quotes anywhere else in your arguments string.
  I.e. with an arguments string like --weird_argument=don't, the single quote will be passed through to your program as-is.
- If you need to disable substitution (to pass in strings like "event.sometext" verbatim), escape the "." with a backslash.
  As a much better alternative you can also put verbatim arguments in the exec list, because only the arguments list is subject to substitution.
- It is now possible to select whether the script exitcode should be captured and saved with the event.
  Capturing is enabled by default; add exitcode => 'false' to your script configuration to disable it.
- It is now also possible to select which combination of STDOUT and STDERR output of a script should be captured and saved.
  The config property output covers STDOUT, the property stderr covers STDERR. If stderr is not given explicitly, it defaults to the value of output.
  Adding "2>&1" to your script arguments is no longer supported.
- Should you absolutely require shell features in your script action, simply use /bin/sh as the exec and set the arguments to your liking, but
  please note that this is substantially less secure than direct execution if event.X or node.Y substitutions are involved.
opEvents version 2.2.2 and newer also supports the max_tries parameter which determines how often a failed script action may be retried; if max_tries is not set, then the default value 3 is used, i.e. up to three attempts to perform the action. Please note that action failure in this context means a script exceeding the maximum configured runtime or opEvents encountering a problem with starting the script, but not a script returning a nonzero exit code.
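
The note above about using /bin/sh for shell features can be sketched as follows. This is an unverified example: the script name shell_pipeline and the pipeline itself are hypothetical, and it assumes that the elements of the exec list are passed to the program ahead of the substituted arguments. The command string sits in exec, where no substitution happens, and the node property is handed to the shell as a separate positional parameter ($0 for sh -c) instead of being spliced into the command text.

```perl
'script' => {
	'shell_pipeline' => {	# hypothetical script name
		# verbatim tokens: only the arguments list is subject to substitution
		exec => [ '/bin/sh', '-c', 'ping -c 5 "$0" | tail -n 3' ],
		# substituted, then passed as its own argument; sh -c exposes it as $0
		arguments => [ 'node.host' ],
		output => 'save',
		stderr => 'save',
		exitcode => 'save',
	},
},
```

Keeping the substituted value out of the command string limits the exposure that the note warns about, since a node property containing shell metacharacters is then never parsed as shell code.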

...
