Overview

Opmantek gets many questions on how to scale NMIS8 for very large networks. Many factors affect polling performance, and the upper limit of polling is really only set by the number of processor cores, the available memory and disk IO performance. We have customers managing tens of thousands of nodes using NMIS.

Server Specifications

The following server specifications are guidelines for NMIS installations.

Storage performance is one of the greatest factors in scaling polling, so you should consider using high-performance storage such as SSD.

The ideal way to determine specifications for your server is to baseline some nodes on the server and determine what resources are required.

	Small	Medium	Large	Massive
OS Storage	20GB	20GB	20GB	20GB
Data Storage	40GB	60GB	140GB	280GB
Memory	4GB	8GB	16GB	32GB+
CPU	2 x vCPU	2 to 4 x vCPU	8 x vCPU	16+ vCPU
Device Count	< 500 devices	< 1500 devices	< 2500 devices	A very large number of devices
Element Count	2000 elements	8000 elements	14000 elements	A very large number of elements

Elements are additional data being collected. For example, an interface is an element, and a CBQoS class is an element.

An element requires additional SNMP polling to collect the values and then storage on the disk to save the data.
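As a rough illustration of reading the sizing table, the sketch below maps a device count and an element count to a tier. The thresholds are taken from the table; the function name and the rule that the larger of the two tiers wins are assumptions made for this example, not part of NMIS.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with NMIS): pick a sizing tier from
# the device and element counts, using the thresholds in the table.
# Assumption: if either count exceeds a tier, move up to the next tier.
nmis_size_tier() {
    local devices=$1 elements=$2
    if [ "$devices" -lt 500 ] && [ "$elements" -le 2000 ]; then
        echo "Small"
    elif [ "$devices" -lt 1500 ] && [ "$elements" -le 8000 ]; then
        echo "Medium"
    elif [ "$devices" -lt 2500 ] && [ "$elements" -le 14000 ]; then
        echo "Large"
    else
        echo "Massive"
    fi
}

# Example: 400 devices but 5000 elements lands in the Medium tier.
nmis_size_tier 400 5000
```

Treat the result as a starting point only; baselining some nodes on the actual server remains the better guide.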

...

If you try to add thousands of nodes to NMIS at once, it is going to take a while to process the nodes for the first time. You have two main choices: add nodes in smaller batches, or stop polling while adding large numbers of nodes.
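If you choose the batch approach, one way to prepare the batches is to split a node list into fixed-size files. This is a minimal sketch that assumes a plain text file with one node per line; the file name, batch size and prefix are hypothetical, and how each batch is then added to NMIS is left to whatever method you already use.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a node list (one node per line) into
# batch files of SIZE lines each, named PREFIXaa, PREFIXab, ...
# Usage: split_nodes_into_batches FILE SIZE PREFIX
split_nodes_into_batches() {
    local file=$1 size=$2 prefix=$3
    split -l "$size" "$file" "$prefix"
}

# Example (assumed file name): batches of 200 nodes each.
# split_nodes_into_batches nodes.txt 200 batch_
```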

The most likely root cause of problems here is simply too much going on when you add, for example, all 2000 devices at once. When you add nodes and then run an update cycle followed by a collect cycle, NMIS creates LOTS of RRD files, so disk IO is at a peak during this process.

Configuration Considerations

We have been working with our commercial customers using NMIS8 at this sort of scale and it works. Their servers use between 12 and 16GB of memory and that handles it. What has been a problem is disk IO performance.

If you want to add a large number of nodes at once, the best thing to do is to stop the polling, add the nodes, run an update cycle and then a collect cycle manually, then start the polling again.

To stop polling, either set the configuration option global_collect to false, or stop polling in the crontab by commenting out this line:

Code Block
#*/5 * * * * /usr/local/nmis8/bin/nmis.pl type=collect mthread=true maxthreads=8
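Commenting out the collect entry can also be scripted. The sketch below is a filter that reads crontab text on standard input and comments out any active line that runs nmis.pl with type=collect. The function name is hypothetical, and it assumes the collect job lives in a crontab you can pipe through the filter; review the output before installing it.

```shell
#!/usr/bin/env bash
# Hypothetical filter: comment out active NMIS collect entries.
# Lines that are already commented, and all other lines, pass through
# unchanged.
disable_collect_cron() {
    sed -E '/nmis\.pl type=collect/ s/^[[:space:]]*[^#[:space:]]/#&/'
}

# Possible usage (assumes the job is in the current user's crontab):
#   crontab -l | disable_collect_cron          # preview the result
#   crontab -l | disable_collect_cron | crontab -
```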

Then add nodes, all of them at once or in batches. Restart fpingd so that it reloads all the nodes you added.

Code Block
/usr/local/nmis8/bin/fpingd.pl restart=true

Then run an update manually with nohup. If you have 12GB of memory you can give it lots of threads: 20 should probably do it, and you can probably get to 30, but watch your memory usage.

Code Block
cd ~

...


nohup /usr/local/nmis8/bin/nmis.pl type=update mthread=true maxthreads=20&

This will take a while to run the first time. When it finishes, run a collect cycle the same way:

Code Block
nohup /usr/local/nmis8/bin/nmis.pl type=collect mthread=true maxthreads=20&
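While the manual update and collect cycles run with a high thread count, it helps to keep an eye on memory. The sketch below reads the MemAvailable figure from /proc/meminfo on Linux and prints it in megabytes. The function name is hypothetical, and it assumes a kernel that reports MemAvailable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print available memory in MB.
# Reads /proc/meminfo by default; an alternative file can be passed
# as the first argument (useful for testing).
mem_available_mb() {
    local meminfo=${1:-/proc/meminfo}
    awk '/^MemAvailable:/ { print int($2 / 1024) }' "$meminfo"
}

# Possible usage: print the figure every 10 seconds while nmis.pl runs.
#   while sleep 10; do mem_available_mb; done
```

If the figure keeps falling towards zero, lowering maxthreads on the next run is the obvious adjustment.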

Now all the heavy disk activity is done, and you should be able to start NMIS polling again by re-enabling the poller in the crontab:

Code Block
*/5 * * * * /usr/local/nmis8/bin/nmis.pl type=collect mthread=true maxthreads=<MAX THREADS BASED ON YOUR BASELINE>

Configuration Considerations

You can also control how NMIS runs by moving the summary and thresholding operations to cron. I would suggest this as good practice for larger installations.

In CRON:

Code Block
*/2 * * * * /usr/local/nmis8/bin/nmis.pl type=summary


4-59/5 * * * * /usr/local/nmis8/bin/nmis.pl type=threshold

In Config.nmis:

Code Block
'threshold_poll_cycle' => 'false',

...


'nmis_summary_poll_cycle' => 'false',
'disable_interfaces_summary' => 'true',
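After editing the configuration, it can be worth confirming that the settings are really in place. This is a minimal sketch that greps a config file for a key and its expected value in the `'key' => 'value'` form shown above. The function name is hypothetical, and the path in the usage comment is an assumption about where Config.nmis lives on your server.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if FILE contains 'KEY' => 'EXPECTED'.
# Usage: nmis_setting_is FILE KEY EXPECTED
nmis_setting_is() {
    local file=$1 key=$2 expected=$3
    grep -Eq "^[[:space:]]*'$key'[[:space:]]*=>[[:space:]]*'$expected'" "$file"
}

# Possible usage (path is an assumption, adjust to your installation):
#   cfg=/usr/local/nmis8/conf/Config.nmis
#   nmis_setting_is "$cfg" threshold_poll_cycle false || echo "threshold_poll_cycle not set"
#   nmis_setting_is "$cfg" nmis_summary_poll_cycle false || echo "nmis_summary_poll_cycle not set"
#   nmis_setting_is "$cfg" disable_interfaces_summary true || echo "disable_interfaces_summary not set"
```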

The other BIG consideration is your polling policy: the more interfaces you collect on, the more disk, CPU and memory you will consume. Just collecting more data may not help you operationally; collect the right data, which is how NMIS has been configured.

If you are having problems scaling your NMIS installation, you can contact Opmantek for assistance.

Using JSON for NMIS Database

To optimise how NMIS files are saved, you can use the JSON database, which requires NMIS 8.4.8g or greater. The following script needs to be run on every Primary and poller server in an NMIS cluster, and the runs should be co-ordinated to happen very close together.

Code Block
/usr/local/nmis8/admin/convert_nmis_db.pl

This script will stop NMIS polling, convert the database files, update the NMIS configuration to use the new database format, then start the polling again.
