
Feature Overview

opEvents can forward events, based on filters, to another server running opEvents (or to other service desk systems, like ServiceNow and ConnectWise). This is used as part of the Opmantek multi-server architecture with distributed pollers and centralised primary servers. It is also extremely useful in situations such as one opEvents instance not being reachable across the internet while another central instance of opEvents is reachable.

Events are forwarded using HTTP or HTTPS, which is set up separately on your server, independent of opEvents. A typical Apache SSL configuration works just fine.

Related Wiki Articles

Configuring opEvents

opEvents Configuration Settings

Create remote event

Configuring SSL on Apache for NMIS and OMK

PreConditions

We're assuming you already have a poller running NMIS and opEvents, along with another instance of opEvents set up, configured and working.

This functionality is intended for users with advanced knowledge of NMIS and opEvents.

If you need HTTPS security, this should already have been configured (see link above).

The details here are related to opEvents 4.x and higher which integrates with NMIS9.

Configuration

To enable this functionality, you must edit the opEvents "Event Actions" JSON file. We recommend using the web GUI to do this, as it has a very handy "Validate" button that checks your configuration changes are valid JSON and won't break opEvents. If you must, you can also edit the JSON file directly at /usr/local/omk/conf/EventActions.json. Beware that changes which leave the file as invalid JSON will stop your actions from functioning as intended.


Validating a JSON file on the command line can be done using the command:

Code Block
python -mjson.tool /usr/local/omk/conf/EventActions.json
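
If you would rather see exactly where a file breaks, the same check can be sketched in a few lines of Python. This is only an illustration using the standard json module; the helper names validate_json_text and validate_json_file are made up for this example and are not part of opEvents.

```python
import json


def validate_json_text(text):
    """Return (True, None) if text parses as JSON, else (False, reason)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno point at the place where parsing failed
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, None


def validate_json_file(path):
    """Read a file such as EventActions.json and validate its contents."""
    with open(path, encoding="utf-8") as handle:
        return validate_json_text(handle.read())


# Small in-memory demonstration: a trailing comma is a common editing slip.
good = '{"policy": {"10": {"IF": "event.any"}}}'
bad = '{"policy": {"10": {"IF": "event.any"},}}'
print(validate_json_text(good))
print(validate_json_text(bad))
```

To check the real file, you would call validate_json_file("/usr/local/omk/conf/EventActions.json") as a user who is allowed to read it.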


opCommon.json

Certain node properties must be defined in opCommon.json in order for opEvents to make use of them. 

Disable Auto Acknowledge of Up Events

The opEvents engine will normally auto acknowledge up events, as they are clearing a down event. To make this solution work you will need to disable this feature.

Verify and, if necessary, modify the configuration file /usr/local/omk/conf/opCommon.json and change the setting opevents_auto_acknowledge_up to false.

Setup Macros

Code Block
   "macro" : {
      "authority" : "YOUR_SERVER_NAME"
   },

Copy Node Properties

opEvents can copy node properties to the event so that they persist with the event and can be used in actions. In earlier versions, for example, group and location were at the root level. These are now under configuration, so they should be defined as below.

Code Block
      "opevents_event_copy_node_properties" : [
         "configuration.group",
         "configuration.location"
      ],

Not like below.

Code Block
      "opevents_event_copy_node_properties" : [
         "group",
         "location"
      ],

This is mostly of relevance to customers who have upgraded from opEvents 2.x. If you already have opEvents 4.x running successfully, it is likely just something to double check.
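
As a rough aid for that double check, here is a small Python sketch that looks for old-style entries in opevents_event_copy_node_properties. It assumes nothing about where the setting sits inside opCommon.json and simply searches the whole structure. The function names and the list of "moved" properties (just the two named above) are assumptions for this example.

```python
def find_key(data, key):
    """Yield every value stored under `key` anywhere in nested dicts/lists."""
    if isinstance(data, dict):
        for name, value in data.items():
            if name == key:
                yield value
            yield from find_key(value, key)
    elif isinstance(data, list):
        for item in data:
            yield from find_key(item, key)


def legacy_properties(config, moved=("group", "location")):
    """Return entries that still use a root-level name listed in `moved`."""
    found = []
    for props in find_key(config, "opevents_event_copy_node_properties"):
        found.extend(p for p in props if p in moved)
    return found


# Hypothetical excerpt of a parsed opCommon.json, for demonstration only.
sample = {
    "opevents": {
        "opevents_event_copy_node_properties": [
            "group",
            "configuration.location",
        ]
    }
}
print(legacy_properties(sample))
```

On a real system you would load /usr/local/omk/conf/opCommon.json with json.load and pass the result to legacy_properties; an empty list means none of the listed old names were found.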

Verify the opCommon.json file is valid JSON

Validating a JSON file on the command line can be done using the command:

Code Block
python -mjson.tool /usr/local/omk/conf/opCommon.json

Restart the opEvents Daemon

Once you have made these changes you will need to restart the opEvents Daemon (opeventsd).

Code Block
sudo service opeventsd restart


EventRules.json

Also check for similar naming in conf/EventRules.json. For example, node.serviceStatus is incorrect; it should be node.configuration.serviceStatus.

Again, this is mostly for those users setting up opEvents after upgrading from an earlier version.

Just something to double check and be aware of.

Event Policy

There are a few sections in EventActions.json, but the one we're concerned with first is the policy section. Entries in a given section are numbered for determining the order in which they are processed. Usually we make forwarding the last entry (the entry with the highest number).

Here is the full policy in one piece. The sections that follow explain how it flows.

Code Block
   "policy": {
        "10":
        {
            "BREAK": "false",
            "IF": "event.any",
            "THEN":
            {
                "10":
                {
                    "IF": "event.state =~ /up|closed/",
                    "THEN": "tag.escalateToCentral(FALSE) and tag.sendToCentral(TRUE)",
                    "BREAK": "true"
                },
                "20":
                {
                    "IF": "node.configuration.serviceStatus eq 'Production'",
                    "THEN": "tag.escalateToCentral(TRUE) and tag.sendToCentral(FALSE)",
                    "BREAK": "true"
                }
            }
        },
        "20":
        {
            "BREAK": "false",
            "IF": "event.any",
            "THEN":
            {
                "10":
                {
                    "IF": "event.tag_escalateToCentral eq 'TRUE'",
                    "THEN": "escalate.central()",
                    "BREAK": "true"
                },
                "20":
                {
                    "IF": "event.tag_sendToCentral eq 'TRUE'",
                    "THEN": "script.sendToCentral()",
                    "BREAK": "true"
                }
            }
        }
    },
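
To make the ordering easier to picture, the following Python sketch walks a policy structure and lists the rules in the order they would be visited. It assumes the numbered keys are compared numerically, which is an assumption of this example and not something confirmed here. It does not evaluate any IF condition or BREAK. The walk_policy function is purely illustrative.

```python
def walk_policy(policy, prefix="policy"):
    """Yield (path, IF, THEN) for each rule, lowest number first."""
    for key in sorted(policy, key=int):
        rule = policy[key]
        path = f"{prefix}.{key}"
        then = rule["THEN"]
        if isinstance(then, dict):
            # A nested THEN holds its own numbered rules.
            yield path, rule["IF"], "(nested rules)"
            yield from walk_policy(then, path)
        else:
            yield path, rule["IF"], then


# The policy from this article, written as a Python dict (BREAK omitted).
policy = {
    "20": {"IF": "event.any", "THEN": {
        "20": {"IF": "event.tag_sendToCentral eq 'TRUE'",
               "THEN": "script.sendToCentral()"},
        "10": {"IF": "event.tag_escalateToCentral eq 'TRUE'",
               "THEN": "escalate.central()"},
    }},
    "10": {"IF": "event.any", "THEN": {
        "10": {"IF": "event.state =~ /up|closed/",
               "THEN": "tag.escalateToCentral(FALSE) and tag.sendToCentral(TRUE)"},
        "20": {"IF": "node.configuration.serviceStatus eq 'Production'",
               "THEN": "tag.escalateToCentral(TRUE) and tag.sendToCentral(FALSE)"},
    }},
}

for path, condition, action in walk_policy(policy):
    print(f"{path}: IF {condition} THEN {action}")
```

The paths it prints (policy.10, policy.10.10 and so on) are the same labels used in the rest of this article to refer to individual rules.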

Check All Events

The first policy block says to process all events, i.e. IF event.any.

First we want to check all events and filter them in sub-sections. This just makes things easier for a human to read and understand. So our IF section at the first level, policy.10, is a simple event.any. For any event, the THEN section, which is again numbered for order, will then be processed.

Code Block
        "10":
        {
            "BREAK": "false",
            "IF": "event.any",
            "THEN":

Tag any up events

We should send all 'Node Up' or 'Closed' events to the central instance, regardless of the node, so that they cancel out any escalation. It does not matter that a 'down' event might not have been sent previously. We set our event tags to be escalateToCentral = FALSE and sendToCentral = TRUE.

Code Block
                "10":
                {
                    "IF": "event.state =~ /up|closed/",
                    "THEN": "tag.escalateToCentral(FALSE) and tag.sendToCentral(TRUE)",
                    "BREAK": "true"
                },


You may also have noticed the 'up|closed' entry. These entries take Perl regular expressions, so in this case if up or closed is present in the event state, it will match and trigger the THEN.
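
The "is present in" part matters: the pattern is not anchored, so it matches anywhere in the string. The short Python sketch below uses Python's re module as a stand-in for Perl (the two behave the same for a simple alternation like this one). The state values other than up and closed are invented for the example.

```python
import re

# Same pattern as in the policy: matches if "up" OR "closed" appears anywhere.
UNANCHORED = re.compile(r"up|closed")
# Anchored variant: matches only when the whole value is "up" or "closed".
ANCHORED = re.compile(r"^(up|closed)$")


def matches(pattern, state):
    """Return True if the pattern is found in the given state string."""
    return pattern.search(state) is not None


for state in ["up", "closed", "down", "backup"]:
    print(state, matches(UNANCHORED, state), matches(ANCHORED, state))
```

With real event states this difference is unlikely to bite, but if you write your own conditions against free-text fields, anchoring the expression avoids accidental matches on substrings.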

Tag any Production node events to escalate important events

We're going to escalate (for the purposes of this example) all events with a node that has a serviceStatus of 'Production'. This could be some other criterion, such as the criticality of the event, e.g. "event.priority > 3".

These events go into a second list, inside the top level. We set our event tags to be escalateToCentral = TRUE and sendToCentral = FALSE.

Code Block
                "20":
                {
                    "IF": "node.configuration.serviceStatus eq 'Production'",
                    "THEN": "tag.escalateToCentral(TRUE) and tag.sendToCentral(FALSE)",
                    "BREAK": "true"
                }

Now you may have noticed we have two items to action: escalateToCentral and sendToCentral. escalateToCentral sends an escalation, whereas sendToCentral doesn't do anything other than simply send the event, with no escalation required.

Check tags

Next we need to check the tags and, if required, escalate the event or send it to the central instance.

Code Block
        "20":
        {
            "BREAK": "false",
            "IF": "event.any",
            "THEN":
            {
                "10":
                {
                    "IF": "event.tag_escalateToCentral eq 'TRUE'",
                    "THEN": "escalate.central()",
                    "BREAK": "true"
                },
                "20":
                {
                    "IF": "event.tag_sendToCentral eq 'TRUE'",
                    "THEN": "script.sendToCentral()",
                    "BREAK": "true"
                }
            }
        }

Escalate

As you can see, we reference two different routines in the THEN sections for policy.20.10 and policy.20.20.

For the THEN action escalate.central in policy.20.10, we are simply checking whether the event has a priority high enough to send and whether it falls within our hours of operation. If it does, send it. This is defined in the escalate section of the JSON. The 360 means the escalation takes place after 360 seconds have passed. Implementing it like this prevents flaps and noisy events from going to the central server, which greatly reduces the noise of transient events in the environment. You can use a lower time here if you want events forwarded faster. You can also perform other actions here.

Code Block
"escalate": {
    "central": {
        "name": "central",
        "IF": {
            "priority": ">= 1",
            "days": "Monday,Tuesday,Wednesday,Thursday,Friday,Saturday,Sunday",
            "begin": "0:00",
            "end": "24:00"
        },
        "360": "script.sendToCentral()"
    }
}

Script

Any 'script.X' THEN items are defined in the script section of EventActions.json - hey, big surprise, I know.

For policy.20.20, we don't bother testing for a priority, we just send them. This is because events such as 'up' or 'closed' have low priority, but obviously we want to send them to cancel the 'down' type events.

So our script block looks as below.

Code Block
"script": {
    "sendToCentral": {
        "arguments": [
            "-s",
            "https://your_central_opevents_server/omk",
            "-u",
            "your_opevents_user",
            "-p",
            "your_opevents_password",
            "authority=macro.authority",
            "location=https://your_local_opevents_server/omk/opEvents/events/event._id/event_context",
            "node=event.node",
            "event=event.event",
            "details=event.details",
            "time=event.time",
            "date=event.date",
            "element=event.element",
            "interface_description=event.interface_description",
            "type=event.type",
            "priority=event.priority",
            "level=event.level",
            "state=event.state",
            "stateful=event.stateful"
        ],
        "exec": "/usr/local/omk/bin/create_remote_event.exe",
        "output": "save",
        "stderr": "save",
        "exitcode": "save"
    }
}

Secure Wrapper

If you do not wish to include your username and password in the Event Actions file, you can use a wrapper script.

Create a script, e.g. /usr/local/omk/bin/create_remote_event.sh, and set its ownership and permissions so that only root can access it.

Code Block
sudo touch /usr/local/omk/bin/create_remote_event.sh
sudo chown root:root /usr/local/omk/bin/create_remote_event.sh
sudo chmod 700 /usr/local/omk/bin/create_remote_event.sh

The contents of this script would be:

Code Block
#!/usr/bin/env bash

# Username and password are set below; all other arguments are passed through.
# "$@" is quoted so arguments containing spaces are passed through intact.
/usr/local/omk/bin/create_remote_event.pl -u "your_opevents_user" -p "your_opevents_password" "$@"

Note: If your user's password has special characters (such as $) you may need to escape them in the script. For example a password of "opeventsPas$w0rd" would be written as:

Code Block
-p "opeventsPas\$w0rd"

The script section of Event Actions would then be updated as follows, leaving out the username and password.

Code Block
"script": {
    "sendToCentral": {
        "arguments": [
            "-s",
            "https://your_central_opevents_server/omk",
            -- snip same as above in here --
            "stateful=event.stateful"
        ],
        "exec": "/usr/local/omk/bin/create_remote_event.sh",
        "output": "save",
        "stderr": "save",
        "exitcode": "save"
    }
}

Handling a High Volume of Events Being Sent to Primary Event Server

A faster CLI tool, written in Go, was developed; it executes in less than half the time. You can find details at Create remote event → Fast create remote event.

You can download the latest version from the link in the page above, copy the binary into /usr/local/omk/bin, and rename the file or make a symbolic link so you can work with a shorter name.

Code Block
ln -s fast-remote-event-1.x.x-LinuxX86_64.bin fast-remote-event

Then update your Event Actions or your shell wrapper /usr/local/omk/bin/create_remote_event.sh to use this binary instead of create_remote_event.pl or create_remote_event.exe.

e.g.

Code Block
"exec": "/usr/local/omk/bin/fast-remote-event",

To have it return the resulting event id from the primary, include "-q=0" in the arguments.

Finishing Up

And that's it!

You have selected events to be forwarded and tagged them, tested those tags for actions and, depending on the tag and priority, forwarded the events to your central instance.

So for future versions, you might define an event on a node with a serviceStatus of "Testing" and choose only to forward it during office hours (for example). Give it a different tag, check that tag and use escalate with an additional item (say "testing_office_hours_central"). The ways to configure opEvents really are limited only by your imagination.

Hopefully this article has given you some ideas and pointed you in the right direction.