opCharts Node Dependency Management (Root Cause Analysis)

Overview

The opCharts 3.3.5 and 4.2.2 releases included a feature to automate the management of node dependencies in NMIS. This feature greatly reduces the noise generated by a large network outage.

Currently the feature works by identifying all the routers in your network and all the subnets connected to those routers. It then uses the opCharts topology information to find all nodes on a given subnet, including servers and switches, and makes them dependent on the routers connected to that subnet.

For example, when a remote office has a power problem, you would get many Node Down events in NMIS. Because all these nodes are now dependent on the main router, NMIS will escalate ONLY the router's Node Down event and suppress the others.

Events will still be logged, but not escalated. To get the benefit of this feature you need to be using the NMIS escalation system. To change opEvents to use the NMIS escalation system, follow the instructions at Leveraging NMIS Dependancy in opEvents.

Using the Feature

First, make a backup of your NMIS Nodes.nmis file. In a regular setup, automatic configuration backups already run every 24 hours.
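If you want an extra copy before changing anything, a small helper like the one below is enough. It is a minimal sketch: backup_file is a hypothetical name, and you must supply the path to your own Nodes.nmis, since its location depends on your NMIS installation.

```shell
#!/bin/sh
# Copy a file to a timestamped sibling, e.g. Nodes.nmis.20240101-120000.bak,
# and print the name of the copy. The path passed in is yours to supply.
backup_file() {
    src="$1"
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    dest="$src.$(date +%Y%m%d-%H%M%S).bak"
    cp -p "$src" "$dest" && echo "$dest"
}
```

For example, backup_file /path/to/nmis/conf/Nodes.nmis (path assumed) leaves the original untouched and writes the copy next to it.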

Clear Existing Dependencies

Optionally, clear all existing dependency settings. You should do this if you have "N/A" as dependencies. If you have been setting dependencies manually, you can skip this step and let the system add to them.

/usr/local/omk/bin/opcharts-cli.pl act=clear-all-node-depend

Auto Discovery of Subnet Dependencies

/usr/local/omk/bin/opcharts-cli.pl act=update-subnet-dependancy

That's it. You can see what is happening by adding "debug=true" to the end of the command.
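Debug output can be long, so it may help to keep a copy for later review. The wrapper below is a sketch under stated assumptions: the function name update_dependencies_debug, the OPCHARTS_CLI variable, and the default log path are all hypothetical. Only the command itself and the debug=true argument come from this page.

```shell
#!/bin/sh
# Path to the CLI as given on this page; override OPCHARTS_CLI if yours differs.
OPCHARTS_CLI="${OPCHARTS_CLI:-/usr/local/omk/bin/opcharts-cli.pl}"

# Run the subnet dependency update with debug=true, showing the output
# on screen and saving a copy to a log file (default path is an assumption).
update_dependencies_debug() {
    log="${1:-/tmp/opcharts-dependency.log}"
    "$OPCHARTS_CLI" act=update-subnet-dependancy debug=true 2>&1 | tee "$log"
}
```

Run update_dependencies_debug with no arguments to use the default log path, or pass a file name to write the log elsewhere.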

Configuration Options - WIP

It would not be an Opmantek solution if it did not include configuration options. Currently there are three fairly straightforward options, which are configured in omk/conf/opCommon.nmis.

Configuration Option   Default Value                            Description
opcharts_subnet_skip   qr/(^127\.0\.0|^169.254\.|^192.168\.)/   Do not include these subnets in dependency analysis; this might include your core network, for example.
opcharts_router_names  qr/^RTR/                                 These devices are DEFINITELY routers and will be treated as such.
opcharts_router_skip   qr/-router1$/                            These devices might be routers, but you do not want to process anything from them.
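Before changing opcharts_subnet_skip, it can be useful to check which subnets a pattern would exclude. The sketch below tests the default pattern with grep -E, which works here because the pattern uses only syntax shared by Perl regular expressions and POSIX extended regular expressions. It assumes the pattern is matched against the subnet address as a plain string; how opCharts applies it internally is not documented on this page. The names SKIP_PATTERN and subnet_is_skipped are hypothetical.

```shell
#!/bin/sh
# Default opcharts_subnet_skip pattern from the table, without the qr/.../ wrapper.
SKIP_PATTERN='(^127\.0\.0|^169.254\.|^192.168\.)'

# Succeeds (exit status 0) when the given subnet matches the skip pattern.
subnet_is_skipped() {
    printf '%s\n' "$1" | grep -Eq "$SKIP_PATTERN"
}

# Report how each subnet passed as an argument would be treated.
report_subnets() {
    for subnet in "$@"; do
        if subnet_is_skipped "$subnet"; then
            echo "$subnet skipped"
        else
            echo "$subnet analysed"
        fi
    done
}
```

For example, report_subnets 127.0.0.0 192.168.10.0 10.20.30.0 reports the first two as skipped and the third as analysed. To try a different pattern, change SKIP_PATTERN and rerun the check before editing opCommon.nmis.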