Troubleshooting opEvents
General
Grep for the following in opEvents.log:
- Event ID
- State Object ID
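The search above can also be scripted when several identifiers need checking at once. The following is a minimal Python sketch and is not part of opEvents. It does a plain substring match and assumes nothing about the log line format. The function name find_log_lines, the sample lines and the identifiers are all made up for illustration.

```python
from typing import Iterable, List


def find_log_lines(lines: Iterable[str], *identifiers: str) -> List[str]:
    """Return every line that contains at least one of the identifiers.

    Plain substring matching, similar in spirit to `grep -F -e ID1 -e ID2`.
    """
    wanted = [i for i in identifiers if i]  # ignore empty identifiers
    return [
        line.rstrip("\n")
        for line in lines
        if any(i in line for i in wanted)
    ]


# Hypothetical sample lines; real opEvents.log entries will look different.
sample = [
    "line mentioning event-id-123\n",
    "unrelated line\n",
    "line mentioning state-object-456\n",
]
matches = find_log_lines(sample, "event-id-123", "state-object-456")
print(matches)
```

To run it against a real log, pass an open file handle instead of the sample list, using whatever path opEvents.log has on your installation.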
Event not found
Look in the raw log.
If an event was skipped because of its age but its timestamp looks correct, check whether opeventsd was running when the event was received.
Event Processing
When troubleshooting event processing, it is useful to understand the order in which the opEvents configuration files are processed and the general function of each one.
State
When troubleshooting state, it is important to realize that event.event and event.stateful are two completely different things. event.stateful is referred to as 'State Type' in the node context view. State is tracked based on event.stateful only. The state status is generally up or down and may be found in the value of event.state.
EventParserRules.nmis gives the user full flexibility to dictate which event.stateful and event.state values are presented to opEvents. For example, event.event can have a completely different value than event.stateful:
- event.event=Apple; event.stateful=Banana; event.state=up
- event.event=Orange; event.stateful=Banana; event.state=down
With this in mind, always confirm event.stateful when troubleshooting state inconsistencies.
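The Apple/Orange example can be sketched as a toy model in Python. This is only an illustration of the idea that state follows event.stateful and not event.event. It is not how opEvents is implemented, and the function name apply_events and the dictionary layout are assumptions made for the example.

```python
from typing import Dict, List


def apply_events(events: List[Dict[str, str]]) -> Dict[str, str]:
    """Replay events in order and return the last state per event.stateful."""
    state: Dict[str, str] = {}
    for ev in events:
        # event.event is deliberately ignored: only event.stateful keys the state.
        state[ev["stateful"]] = ev["state"]
    return state


events = [
    {"event": "Apple", "stateful": "Banana", "state": "up"},
    {"event": "Orange", "stateful": "Banana", "state": "down"},
]
print(apply_events(events))  # {'Banana': 'down'}
```

Although the two events have different names, they update the same state entry, so searching by event name alone would miss the event that changed the state.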
Poller/Primary State Mismatch
If state has been lost between the poller and Primary servers, check whether a correlation rule has fired and suppressed the more specific event.
If the issue is not related to a correlation rule, look for the corresponding event on the poller. In the event context, check the 'Actions taken for event' section. Was a script executed that would have sent the event to the Primary? If so, was it successful, and what was the exit code?