Troubleshooting opEvents

General

Grep for the following in opEvents.log:

  • Event ID
  • State Object ID
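
When grep is not available, or the search needs to be scripted, the same lookup can be sketched in Python. This is a minimal sketch, not part of opEvents: the function name find_lines, the sample log lines, and the identifiers EVT-0001 and STATE-0042 are all made up for illustration, and real opEvents.log entries will look different.

```python
from typing import Iterable, List


def find_lines(lines: Iterable[str], *needles: str) -> List[str]:
    """Return the lines that contain at least one of the given identifiers."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(needle in line for needle in needles)
    ]


# Made-up sample lines (assumption); real log entries use a different layout.
sample = [
    "processing event id=EVT-0001 node=router1\n",
    "heartbeat ok\n",
    "updating state object id=STATE-0042 state=down\n",
]

# Search for a hypothetical Event ID and State Object ID in one pass.
for hit in find_lines(sample, "EVT-0001", "STATE-0042"):
    print(hit)
```

To search a real log, pass an open file handle for opEvents.log in place of sample; the location of that file depends on the installation.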

Event not found

Look in the raw log.

If an event was skipped due to old age but its timestamp looks correct, check whether opeventsd was running when the event was received.

Event Processing

When troubleshooting event processing, it helps to understand the order in which the opEvents configuration files are processed and the general function of each one.

State

When troubleshooting state, keep in mind that event.event and event.stateful are two completely different things. event.stateful is referred to as 'State Type' in the node context view. State is tracked based on event.stateful only; the state status is generally up or down and is found in the value of event.state.

EventParserRules.nmis gives the user full control over which event.stateful and event.state values are presented to opEvents. For example, event.event can be a completely different value than event.stateful:

  • event.event=Apple; event.stateful=Banana; event.state=up
  • event.event=Orange; event.stateful=Banana; event.state=down

With this in mind, always confirm event.stateful when troubleshooting state inconsistencies.
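
The Apple/Orange/Banana example can be modelled in a few lines of Python to show why event.stateful matters more than event.event. This is a toy model only, assuming state is keyed by event.stateful alone as described above; the function name latest_state and the dictionary layout are hypothetical, and the actual state tracking in opEvents may use additional fields.

```python
from typing import Dict, Iterable


def latest_state(events: Iterable[Dict[str, str]]) -> Dict[str, str]:
    """Map each event.stateful value to the most recent event.state seen."""
    current: Dict[str, str] = {}
    for ev in events:
        # event.event is ignored on purpose: only stateful and state matter.
        current[ev["stateful"]] = ev["state"]
    return current


# The two events from the example above, in arrival order.
events = [
    {"event": "Apple", "stateful": "Banana", "state": "up"},
    {"event": "Orange", "stateful": "Banana", "state": "down"},
]

print(latest_state(events))  # {'Banana': 'down'}
```

Although the two events have different names, they share one state entry, so the Orange event moves the Banana state to down. Comparing only event.event values would miss this.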

Poller/Primary State Mismatch

If state has been lost between the poller and Primary servers, check whether a correlation rule has fired and suppressed the more specific event.

If the issue is not related to a correlation rule, look for the corresponding event on the poller. In the event context, check the 'Actions taken for event' section. Was a script executed that would have sent the event to the Primary? Was it successful? What was the exit code?