NMIS9 opstatus data
Introduction
The opstatus data records the jobs performed on the scheduler queue and the result of each operation. These jobs are internal to NMIS9, relate to system health and maintenance, and are not always tied to a node. opstatus data differs from status data, which holds status information for a node.
opstatus data can be visualised using the NMIS9 Ops Status page:
NMIS → System → Host Diagnostics → NMIS Ops Status
e.g. http://volla.opmantek.com/cgi-nmis9/opstatus.pl
Schema
This is the schema for the opstatus collection:
- Activity: the operation performed, e.g. collect, escalations, metrics, selftest
- Context:
  - node_uuid
  - queue_id
  - queue_tag
  - worker_process
- Details: detail of the operation, e.g. "completed successfully", "failed: loading failed…", "Can’t call method…"
- expire_at
- Stats:
  - Time
- Status: ok or error
- Time
- Type: completed or exception
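The schema can also be written down as a typed structure. The following is a minimal Python sketch, not part of NMIS9: the key names are lowercase and worker_process is spelled as in the example record, the value types are inferred from that one example, and missing_fields is a hypothetical helper.

```python
from typing import Optional, TypedDict


class Context(TypedDict, total=False):
    node_uuid: str             # may be absent, e.g. for jobs not tied to a node
    queue_id: str
    queue_tag: Optional[str]
    worker_process: int        # spelled as in the example record


class Stats(TypedDict):
    time: float


class OpStatus(TypedDict, total=False):
    activity: str
    context: Context
    details: str
    expire_at: str             # ISO 8601 string in the JSON example
    stats: Stats
    status: str                # "ok" or "error"
    time: float
    type: str                  # "completed" or "exception"


def missing_fields(record: dict) -> list:
    """Return the schema's top-level fields that are absent from a record."""
    return sorted(set(OpStatus.__annotations__) - set(record))


print(missing_fields({"activity": "collect", "time": 1.0}))
```

A check like this is only a convenience when inspecting exported records; whether every field is mandatory in NMIS9 is not stated here.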
Example
{
  "_id": "5cbdb0fd2b1813502c414c52",
  "time": 1555935485.269778,
  "stats": {
    "time": 0.1784820556640625
  },
  "context": {
    "queue_tag": null,
    "worker_process": 20524,
    "queue_id": "5cbdb0fd2b1813500c414988"
  },
  "type": "completed",
  "expire_at": "2019-06-21T12:18:05.269Z",
  "details": "completed successfully, backup saved as /usr/local/nmis9/backups/nmis-config-backup-2019-04-22-2218.tar.gz",
  "status": "ok",
  "activity": "configbackup"
}
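As a small illustration of reading such a record, the sketch below parses a trimmed copy of the example with Python's standard library and converts the two timestamps. It assumes that time is a Unix epoch in seconds and that expire_at is an ISO 8601 UTC string, as they appear in the example; the variable names are illustrative only.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the example record (only the fields used here).
raw = """{"time": 1555935485.269778,
          "stats": {"time": 0.1784820556640625},
          "expire_at": "2019-06-21T12:18:05.269Z",
          "status": "ok",
          "activity": "configbackup"}"""

record = json.loads(raw)

# "time" is assumed to be seconds since the Unix epoch.
started = datetime.fromtimestamp(record["time"], tz=timezone.utc)

# Replace the trailing "Z" so datetime.fromisoformat also accepts the
# string on older Python versions.
expires = datetime.fromisoformat(record["expire_at"].replace("Z", "+00:00"))

# expire_at carries millisecond precision only, so round to whole days
# instead of truncating.
retention_days = round((expires - started).total_seconds() / 86400)

print(started.isoformat(timespec="seconds"))
print(retention_days)
```

For this record the gap between time and expire_at works out to 60 days.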
How NMIS9 handles opstatus data
- NMISNG saves or updates the status of an operation when called by the nmisd daemon.
- An opstatus record is created or updated:
- with the result of a queue job
- when the worker starts/performs/stops an operation
- when a job finishes with an error or completes successfully.
- By default a record expires 60 days (about 2 months) after its time (purge_opstatus_after is 5184000 seconds):
$expire_at = $statusrec->{time} + ( $self->config->{purge_opstatus_after} || 60 * 86400 );
- Used by selftest to check when the most recent update and collect operations ran.
- Used by nmisd to check when some operations were last tried/run/performed.
- When a node is deleted, its related opstatus data is deleted too.
- A node dump includes the node's opstatus data.
- supports.pl exports the last 500 opstatus records. supports.pl exports collect data.
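The expiry rule and the "most recent operation" lookups above can be sketched in a few lines. This is a Python re-expression for illustration, not NMIS9 code (NMIS9 itself is Perl): expire_at mirrors the Perl expression shown above, while latest_for and the in-memory records list are hypothetical stand-ins for whatever query NMIS9 actually runs against the opstatus collection.

```python
from typing import Optional

DEFAULT_PURGE_AFTER = 60 * 86400  # 5184000 seconds, i.e. 60 days


def expire_at(record: dict, config: dict) -> float:
    """Epoch time at which a record expires, mirroring the Perl expression."""
    # Like Perl's `||`, `or` falls back to the default when the setting
    # is missing or zero.
    purge_after = config.get("purge_opstatus_after") or DEFAULT_PURGE_AFTER
    return record["time"] + purge_after


def latest_for(records: list, activity: str) -> Optional[dict]:
    """Hypothetical helper: newest record for one activity, or None."""
    matching = [r for r in records if r.get("activity") == activity]
    return max(matching, key=lambda r: r["time"], default=None)


# Illustrative records; only the first timestamp comes from the example.
records = [
    {"activity": "configbackup", "time": 1555935485.269778, "status": "ok"},
    {"activity": "collect", "time": 1555935500.0, "status": "ok"},
    {"activity": "collect", "time": 1555935800.0, "status": "error"},
]

print(expire_at(records[0], {}))
print(latest_for(records, "collect")["status"])
```

Keeping the default next to the lookup makes the fallback explicit: an unset or zero purge_opstatus_after still yields a 60-day lifetime.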
How OMK handles opstatus data
- opCharts records/updates the status of an operation in its own database.
- opFlow records the status of an operation.
- opAddress saves an opstatus on the document (as oper_status).
- opHA records/updates the status of an operation in its own database.
- The default max age values (opCommon.nmis) are the following:
opaddress_opstatus_maxage => 604800, # seconds, 1 week
opflow_opstatus_maxage => 1209600, # seconds, 2 weeks
opconfig_opstatus_maxage => 604800, # seconds, 1 week
purge_opstatus_after => 86400*90, # 3 months (opHA).
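To make the defaults easier to compare, the sketch below converts each max age to days and shows one way an age check could work. The values are copied from the list above; the is_older_than helper, and the idea that a record becomes eligible for purging once its age exceeds the max age, are assumptions for illustration rather than a description of how each OMK product implements purging.

```python
SECONDS_PER_DAY = 86400

# Default max ages in seconds, as listed above (opCommon.nmis).
max_age = {
    "opaddress_opstatus_maxage": 604800,
    "opflow_opstatus_maxage": 1209600,
    "opconfig_opstatus_maxage": 604800,
    "purge_opstatus_after": 86400 * 90,
}

days = {name: seconds // SECONDS_PER_DAY for name, seconds in max_age.items()}


def is_older_than(record_time: float, max_age_seconds: int, now: float) -> bool:
    """Hypothetical check: True when the record's age exceeds the max age."""
    return now - record_time > max_age_seconds


for name, value in days.items():
    print(f"{name}: {value} days")
```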