Many factors determine the health of a server. Hardware capabilities (CPU, memory and disk) are important, but so is the server load: the number of devices (nodes to be polled, updated, audited, synchronised), the number of products (NMIS, OAE, opCharts, opHA, each running different processes), and the number of concurrent users.
We all want the best performance from a server, and to make the most of its physical resources the configuration has to be fine-tuned. This guide lists recommended parameters; they may not suit every case, because server performance depends on many factors.
Related Articles
- Scaling NMIS Polling
- Scaling NMIS polling - how NMIS handles long running processes
- Recommended NMIS 9 - Configuration Options for Server Performance Tuning
NMIS 9
Before You Start
The first thing to do is gather information about our system:
- System Information: The NMIS and OMK support tool gives us all the information needed.
- Monitor services: NMIS can monitor the processes involved (apache2, nmis9d, omkd and mongod) and provide useful information about CPU and memory usage, among other metrics.
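For a quick look outside NMIS, the per-process CPU and memory figures can also be summed from `ps` output. The following is a minimal sketch, not part of NMIS or the support tool: the helper names `run_ps` and `summarise_ps` are hypothetical, and it assumes a Linux `ps` that accepts `-eo comm,pcpu,rss`. The command names shown by `ps` on your server may differ from the service names listed above, so adjust `WATCHED` accordingly.

```python
import subprocess
from collections import defaultdict

# Service names taken from the list above; the names that ps reports
# on a given installation may differ (assumption).
WATCHED = ("apache2", "nmis9d", "omkd", "mongod")


def run_ps():
    """Return the raw text of `ps -eo comm,pcpu,rss` (RSS is in KiB)."""
    return subprocess.run(
        ["ps", "-eo", "comm,pcpu,rss"],
        capture_output=True, text=True, check=True,
    ).stdout


def summarise_ps(ps_output, watched=WATCHED):
    """Sum %CPU and resident memory (MiB) per watched command name."""
    totals = defaultdict(lambda: {"count": 0, "cpu": 0.0, "rss_mib": 0.0})
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) < 3:
            continue
        # The command name may contain spaces, so read the two numeric
        # columns from the right-hand side.
        name = " ".join(parts[:-2])
        if name not in watched:
            continue
        entry = totals[name]
        entry["count"] += 1
        entry["cpu"] += float(parts[-2])
        entry["rss_mib"] += int(parts[-1]) / 1024
    return dict(totals)
```

Calling `summarise_ps(run_ps())` returns one entry per watched name that is currently running, with the process count, total %CPU and total resident memory in MiB.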
Number of processes
NMIS runs a daemon that periodically collects information from the nodes.
...
Code Block |
---|
omkd_max_requests |
MongoDB memory usage
MongoDB, in its default configuration, will use the larger of either 256 MB or ½ of (RAM - 1 GB) for its cache size.
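To get a feel for what this default means on a given server, the rule can be written as a small calculation. This is only a sketch of the arithmetic stated above; the function name `default_cache_gb` is hypothetical and is not a MongoDB API.

```python
def default_cache_gb(ram_gb):
    """Default cache size in GB: the larger of 256 MB or half of (RAM - 1 GB)."""
    minimum_gb = 0.25  # 256 MB expressed in GB
    return max(minimum_gb, 0.5 * (ram_gb - 1.0))


if __name__ == "__main__":
    for ram in (1, 2, 8, 16):
        print(f"{ram} GB RAM -> {default_cache_gb(ram)} GB cache")
```

On a server with 8 GB of RAM, for example, the default cache is 3.5 GB, which is worth keeping in mind when the same host also runs the NMIS and OMK daemons.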
The MongoDB cache size can be changed by adding the cacheSizeGB setting to the /etc/mongod.conf configuration file, as shown below.
```yaml
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1
```
More information about how MongoDB reserves memory for its internal cache and for WiredTiger, the underlying storage engine, along with some adjustments that can be made, is available here: https://dba.stackexchange.com/questions/148395/mongodb-using-too-much-memory
Server examples
Two servers are compared in this section.
- The master has only one node of its own, but more than 400 poller nodes. The opHA process is what requires the most CPU and memory.
- The poller has more than 500 nodes. The nmis process requires more CPU and memory, because it polls the information for all the nodes.
Stressed system
System information:
Name | Value |
---|---|
nmisd_max_workers | 10 |
omkd_workers | 4 |
omkd_max_requests | 500 |
Nodes | 406 |
Active Nodes | 507 |
OS | Ubuntu 18.04.3 LTS |
role | poller |
...
Check the processes once nmis9d has been restarted:
Code Block |
---|
top |
Healthy system
System information:
Name | Value |
---|---|
nmisd_max_workers | 5 |
omkd_workers | 10 |
omkd_max_requests | undef |
Nodes | 2 |
Poller Nodes | 536 |
OS | Ubuntu 18.04.3 LTS |
role | master |
...
Daemons graphs:
omk:
mongo:
NMIS 8
The main NMIS 8 process is called from different cron jobs to run different operations: collect, update, summary, master, ...
For a collect or an update, the main thread creates forks to perform the operation requested.
Configurations that affect performance
Some important configuration options affect performance:
...
- sort_due_nodes: By default, NMIS polls the nodes that are due in a pseudo-random order. On an overloaded server this is likely to leave some nodes never getting polled. For heavily loaded servers, enable sort_due_nodes by adding it to the NMIS configuration with the value set to 1.
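The effect of sort_due_nodes can be illustrated with a toy model. This is not NMIS code: the functions `pick_nodes` and `worst_wait` are hypothetical, and the sketch assumes that sorting means "most overdue node first", which the text above does not spell out. It only shows why a fixed ordering bounds the wait when there is not enough capacity to poll every node in every cycle.

```python
import random


def pick_nodes(due, capacity, sort_due_nodes=False, rng=None):
    """Choose up to `capacity` nodes to poll in one cycle.

    `due` maps a node name to the seconds since it was last polled.
    """
    rng = rng or random.Random()
    names = list(due)
    if sort_due_nodes:
        # Assumed ordering: the node that has waited longest goes first.
        names.sort(key=lambda name: due[name], reverse=True)
    else:
        # Default behaviour in this model: pseudo-random order.
        rng.shuffle(names)
    return names[:capacity]


def worst_wait(node_count, capacity, cycles, interval=300,
               sort_due_nodes=False, seed=0):
    """Simulate polling cycles and return the longest wait seen (seconds)."""
    rng = random.Random(seed)
    age = {f"node{i}": 0 for i in range(node_count)}
    worst = 0
    for _ in range(cycles):
        for name in age:
            age[name] += interval  # every node waits one more interval
        worst = max(worst, max(age.values()))
        for name in pick_nodes(age, capacity, sort_due_nodes, rng):
            age[name] = 0  # polled nodes start waiting again
    return worst


if __name__ == "__main__":
    print("random order:", worst_wait(10, 5, 50), "seconds")
    print("sorted order:", worst_wait(10, 5, 50, sort_due_nodes=True), "seconds")
```

With 10 nodes and capacity for 5 per cycle, the sorted order never lets a node wait longer than two intervals, while the random order gives no such guarantee.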
Gaps in Graphs
If the server takes a long time to collect and cannot complete any operation, a useful tool is nmis8/admin/polling_summary. It shows how many nodes have a late collect, along with a summary of the nodes that are and are not being collected.
...