SNMP Troubleshooting


With NMIS, and network management in general, you will occasionally meet odd products: weird devices that do not conform to SNMP standards and best practices. They can be a pain to troubleshoot. Here are some tips based on problems we have found.

SNMP Working, but not finding Interfaces with ifIndex

When running an NMIS update, e.g. nmis.pl type=update node=NODENAME debug=true, the update may stop with an "SNMP ERROR" line, as shown below:

01:35:35 getIntfInfo, Get Interface Info of node STRANGENODENAME, model TELDAT
01:36:23 checkResult, SNMP ERROR (STRANGENODENAME) (ifIndex) No response from remote host "STRANGENODENAME"
01:36:23 getIntfInfo, ERROR (STRANGENODENAME) on get interface index table
01:36:23 notify, Start of Notify
01:36:23 eventAdd, event added, node=STRANGENODENAME, event=SNMP Down, level=Critical, element=, details=SNMP error

This looks odd because SNMP is otherwise working, yet this very important operation fails. The likely cause is that the device cannot handle the size of the SNMP response being requested. That size is controlled by max repetitions, which sets how many values (variable bindings) the agent is asked to pack into a single GetBulk response.

To troubleshoot the above, you might run an snmpwalk like this:

NMIS# snmpwalk -v 2c -c COMMUNITYSTRING STRANGENODENAME ifIndex
IF-MIB::ifIndex.1 = INTEGER: 1
IF-MIB::ifIndex.2 = INTEGER: 2
IF-MIB::ifIndex.3 = INTEGER: 3
IF-MIB::ifIndex.4 = INTEGER: 4
IF-MIB::ifIndex.5 = INTEGER: 5
IF-MIB::ifIndex.6 = INTEGER: 6
IF-MIB::ifIndex.7 = INTEGER: 7
IF-MIB::ifIndex.8 = INTEGER: 8
IF-MIB::ifIndex.9 = INTEGER: 9
IF-MIB::ifIndex.10 = INTEGER: 10
IF-MIB::ifIndex.11 = INTEGER: 11
IF-MIB::ifIndex.12 = INTEGER: 12


You can capture the exchange with tcpdump using the command below. Make sure you run tcpdump on the interface the packets are sent out of; check the route table on the server if it has multiple interfaces:

tcpdump -i INTERFACE host 2.3.4.5


You would see this:

01:31:37.093751 IP 1.2.3.4.48560 > 2.3.4.5.snmp:  C=COMMUNITYSTRING GetBulk(29)  N=0 M=10 interfaces.ifTable.ifEntry.ifIndex
01:31:37.115557 IP 2.3.4.5.snmp > 1.2.3.4.48560:  C=COMMUNITYSTRING GetResponse(185)  interfaces.ifTable.ifEntry.ifIndex.1=1 interfaces.ifTable.ifEntry.ifIndex.2=2 interfaces.ifTable.ifEntry.ifIndex.3=3 interfaces.ifTable.ifEntry.ifIndex.4=4 interfaces.ifTable.ifEntry.ifIndex.5=5 interfaces.ifTable.ifEntry.ifIndex.6=6 interfaces.ifTable.ifEntry.ifIndex.7=7 interfaces.ifTable.ifEntry.ifIndex.8=8 interfaces.ifTable.ifEntry.ifIndex.9=9 interfaces.ifTable.ifEntry.ifIndex.10=10
01:31:37.116194 IP 1.2.3.4.48560 > 2.3.4.5.snmp:  C=COMMUNITYSTRING GetBulk(30)  N=0 M=10 interfaces.ifTable.ifEntry.ifIndex.10
01:31:37.139792 IP 2.3.4.5.snmp > 1.2.3.4.48560:  C=COMMUNITYSTRING GetResponse(241)  interfaces.ifTable.ifEntry.ifIndex.11=11 interfaces.ifTable.ifEntry.ifIndex.12=12 interfaces.ifTable.ifEntry.ifDescr.1="ethernet0/0" interfaces.ifTable.ifEntry.ifDescr.2="ethernet0/1" interfaces.ifTable.ifEntry.ifDescr.3="serial0/0" interfaces.ifTable.ifEntry.ifDescr.4="bri0/0" interfaces.ifTable.ifEntry.ifDescr.5="x25-node" interfaces.ifTable.ifEntry.ifDescr.6="voip1/0" interfaces.ifTable.ifEntry.ifDescr.7="serial2/0" interfaces.ifTable.ifEntry.ifDescr.8="fr2"


What is interesting here is the request GetBulk(29) N=0 M=10 interfaces.ifTable.ifEntry.ifIndex: it asks for a maximum of 10 values per response (M is max repetitions). NET-SNMP on the command line appears to use 10 as a default, or not to use bulk walks at all.
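
The exchange in the capture can be reproduced with a toy model. The Python below is only a sketch: it is not NMIS or NET-SNMP code, and the names build_agent_mib, get_bulk and bulk_walk are made up for illustration. It assumes a well-behaved agent that always returns as many values as it is asked for, and a MIB that contains only an ifIndex column followed by an ifDescr column.

```python
from typing import List, Tuple

Oid = Tuple[int, ...]
Entry = Tuple[Oid, object]

IF_INDEX: Oid = (1, 3, 6, 1, 2, 1, 2, 2, 1, 1)  # IF-MIB::ifIndex
IF_DESCR: Oid = (1, 3, 6, 1, 2, 1, 2, 2, 1, 2)  # IF-MIB::ifDescr


def build_agent_mib(rows: int) -> List[Entry]:
    """Hypothetical agent MIB: ifIndex column then ifDescr column, in OID order."""
    mib: List[Entry] = [(IF_INDEX + (i,), i) for i in range(1, rows + 1)]
    mib += [(IF_DESCR + (i,), f"if{i}") for i in range(1, rows + 1)]
    return sorted(mib, key=lambda entry: entry[0])


def get_bulk(mib: List[Entry], start: Oid, max_repetitions: int) -> List[Entry]:
    """Return up to max_repetitions entries that follow `start` in OID order."""
    following = [entry for entry in mib if entry[0] > start]
    return following[:max_repetitions]


def bulk_walk(mib: List[Entry], column: Oid, max_repetitions: int):
    """Walk one column with repeated GetBulk requests.

    Returns (values found in the column, list of raw responses).
    """
    values: List[Entry] = []
    responses: List[List[Entry]] = []
    cursor = column
    while True:
        response = get_bulk(mib, cursor, max_repetitions)
        responses.append(response)
        in_column = [e for e in response if e[0][:len(column)] == column]
        values.extend(in_column)
        # Stop when the agent has no more data or answered past the column end.
        if not response or len(in_column) < len(response):
            break
        cursor = response[-1][0]
    return values, responses


mib = build_agent_mib(rows=12)
for m in (1, 10, 25):
    values, responses = bulk_walk(mib, IF_INDEX, m)
    largest = max(len(r) for r in responses)
    print(f"M={m}: {len(responses)} request(s), largest response {largest} values")
```

With 12 interfaces and M=10 the model needs two requests, and the second response spills over into ifDescr, just like the capture. With M=25 a single, much larger response is requested, which is the kind of response a fragile agent may fail to send.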

If you have not configured max repetitions in NMIS, you would see this:

01:41:37.093751 IP 1.2.3.4.48560 > 2.3.4.5.snmp:  C=COMMUNITYSTRING GetBulk(29)  N=0 M=25 interfaces.ifTable.ifEntry.ifIndex
01:51:37.093751 IP 1.2.3.4.48560 > 2.3.4.5.snmp:  C=COMMUNITYSTRING GetBulk(29)  N=0 M=25 interfaces.ifTable.ifEntry.ifIndex

Here the requests go unanswered, and NMIS gives you the errors above. This uses a default of M=25, which is set in the Perl NET-SNMP libraries or somewhere even more obscure.

The net result is that you will need to configure your NMIS node with:

'max_repetitions' => '10',

You can find more details about SNMP settings on the SNMP Tuning page.


snmpd returns "invalid(4)" process state (hrSWRunStatus) for process names containing spaces

net-snmp version 5.7.2 is known to be affected:
https://bugzilla.redhat.com/show_bug.cgi?id=1782180

When the hrSWRunStatus table is queried from snmpd, it should generally return 1 (running) or 2 (runnable) for processes that are running or runnable.
However, if the process name contains a space, snmpd returns 4 (invalid) for the process state.

This appears to be because it's reading /proc/$PID/stat and simply splitting on space and then grabbing the third element,
which would normally be the process status, but when the process name contains a space, this is no longer true.


NMIS# systemctl status omkd
● omkd.service - Opmantek Webserver
   Loaded: loaded (/etc/systemd/system/omkd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2020-09-24 11:02:54 UTC; 10h ago
 Main PID: 3137 (opmantek.pl web)
   CGroup: /system.slice/omkd.service
           ├─3137 opmantek.pl webserver                                                        -f -p /var/run/opmantek.exe.pid -r
           ├─3253 opmantek.pl webserver                                                        -f -p /var/run/opmantek.exe.pid -r
           ├─3254 opmantek.pl webserver                                                        -f -p /var/run/opmantek.exe.pid -r
           ├─3255 opmantek.pl webserver                                                        -f -p /var/run/opmantek.exe.pid -r
           ├─3256 opmantek.pl webserver                                                        -f -p /var/run/opmantek.exe.pid -r
           └─3257 opmantek.pl webserver                                                        -f -p /var/run/opmantek.exe.pid -r

Sep 24 11:02:22 omk-vm9-centos7 systemd[1]: Starting Opmantek Webserver...
Sep 24 11:02:54 omk-vm9-centos7 systemd[1]: Started Opmantek Webserver.

NMIS# snmpd --version
NET-SNMP version:  5.7.2

NMIS# cat /proc/3253/stat
3253 (opmantek.pl web) S 3137 3137 3137 0 -1 4202816 5749 0 0 0 390 43 0 0 20 0 1 0 4442 543064064 64976 18446744073709551615 1 1 0 0 0 0 0 4224 5 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0

NMIS# snmpwalk -v 2c -c COMMUNITYSTRING 127.0.0.1 1.3.6.1.2.1.25.4.2.1.7.3253
HOST-RESOURCES-MIB::hrSWRunStatus.3253 = INTEGER: invalid(4)


A consequence of this issue is that when an affected version of net-snmp is installed,
NMIS will report a monitored service as 'down' when it is actually 'running' if the process name contains a space.
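
The suspected parsing problem is easy to demonstrate. The Python below is only an illustration of the failure mode described above, not the actual net-snmp source; the function names are hypothetical, and the sample line is a shortened copy of the /proc/3253/stat output shown earlier.

```python
# Shortened copy of the /proc/3253/stat line from the session above.
STAT_LINE = "3253 (opmantek.pl web) S 3137 3137 3137 0 -1 4202816 5749 0 0 0 390 43"


def naive_state(stat_line: str) -> str:
    """Split on whitespace and take the third field.

    This breaks when the process name (the part in parentheses) has a space.
    """
    return stat_line.split()[2]


def robust_state(stat_line: str) -> str:
    """Skip past the last ')' that closes the process name, then read the state."""
    after_name = stat_line[stat_line.rindex(")") + 1:]
    return after_name.split()[0]


def process_name(stat_line: str) -> str:
    """Return the text between the first '(' and the last ')'."""
    return stat_line[stat_line.index("(") + 1:stat_line.rindex(")")]


print(repr(process_name(STAT_LINE)))  # 'opmantek.pl web'
print(repr(naive_state(STAT_LINE)))   # 'web)'  (not a valid process state)
print(repr(robust_state(STAT_LINE)))  # 'S'     (sleeping)
```

A parser that ends up with "web)" instead of "S" has no valid state to map, which would be consistent with the invalid(4) value that snmpd returns.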

Max Message size too small/large

snmp_max_msg_size

Go to /usr/local/nmis9/Conf/

vim Conf.nmis


The primary tunable NMIS configuration setting for SNMP is snmp_max_msg_size, which controls how large a single SNMP packet may be.

This can be set as a system-wide default (in the System menu, under System Configuration), or as a per-host setting (in the Edit Node menu, under Advanced Options).

The default for snmp_max_msg_size is 1472 bytes, just below the 1500-byte packet limit for normal Ethernet (1472 bytes plus the 20-byte IPv4 header and the 8-byte UDP header is exactly 1500). In LAN-only scenarios it is possible to increase this past 1500 bytes: this causes IP fragmentation and packet reassembly, but unless your LAN is saturated and starving for bandwidth, fragmentation is not a problem. The benefit of a larger SNMP packet is that the data to be collected fits into fewer packets.
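
The fragmentation arithmetic can be sketched in a few lines of Python. This is a rough estimate under stated assumptions: snmp_max_msg_size is taken to be the UDP payload size, the IPv4 header carries no options (20 bytes), and the MTU is 1500 bytes. The function name and constants are illustrative and not part of NMIS.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def ip_fragments(snmp_msg_size: int, mtu: int = 1500) -> int:
    """Estimate how many IP fragments one SNMP message of this size needs."""
    datagram = snmp_msg_size + UDP_HEADER
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(datagram / per_fragment)


for size in (1472, 1473, 2800):
    print(f"max_msg_size={size}: {ip_fragments(size)} fragment(s)")
```

Under these assumptions 1472 bytes is the largest message that still fits in one unfragmented packet, and a value of 2800 results in two fragments per full-size message.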

To quickly adjust this setting for a single node, you can run the following command using the node_admin.pl tool that ships with NMIS. The max_msg_size value can of course be increased or decreased as desired.

/usr/local/nmis9/admin/node_admin.pl act=set node=nodename entry.configuration.max_msg_size=2800

SNMP Partially Working in NMIS but SNMPWALK works no problem

Many SNMP agents do not comply with the SNMPv2c specifications and so do not support multiple values (variable bindings) in a single packet. This feature can be disabled by setting max_repetitions to 1, which means one value per SNMP packet.

/usr/local/nmis9/admin/node_admin.pl act=set node=nodename entry.configuration.max_repetitions=1
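
Setting max_repetitions to 1 trades packet size for round trips. As a rough guide, the Python below estimates how many requests are needed to walk one table column. It assumes the agent always returns as many values as requested and that other objects follow the column in the MIB; the function name is hypothetical.

```python
def requests_for_column(rows: int, max_repetitions: int) -> int:
    """Requests needed until a response reaches past the end of the column."""
    if rows < 0 or max_repetitions < 1:
        raise ValueError("rows must be >= 0 and max_repetitions >= 1")
    # The walk only ends once a response contains a value from outside the
    # column, so one extra value beyond `rows` has to be fetched.
    return rows // max_repetitions + 1


for m in (1, 10, 25):
    print(f"12 interfaces, max_repetitions={m}: {requests_for_column(12, m)} request(s)")
```

For a device with 12 interfaces this gives 13 requests per column at max_repetitions=1 compared with 2 at max_repetitions=10, so expect collection to take longer on nodes where the setting has to be lowered.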