NMIS9 opstatus data
Introduction
opStatus data records information about the jobs run from the queue and the result of each operation. These jobs are internal to NMIS9, relate to health and system maintenance, and are not always associated with a node. Status data, by contrast, holds information related to a node.
Schema
This is the schema for the opstatus collection:
- Activity: for example collect, escalations, metrics, selftest
- Context:
  - node_uuid
  - queue_id
  - queue_tag
  - worker_process
- Details: detail of the operation, for example "completed successfully", "failed: loading failed…" or "Can’t call method…"
- expire_at
- Stats:
  - Time
- Status: ok or error
- Time
- Type: completed or exception
Example
{"_id":"5cbdb0fd2b1813502c414c52", "time":1555935485.269778, "stats": {"time":0.1784820556640625}, "context": {"queue_tag":null, "worker_process":20524, "queue_id":"5cbdb0fd2b1813500c414988"}, "type":"completed", "expire_at":"2019-06-21T12:18:05.269Z", "details":"completed successfully, backup saved as /usr/local/nmis9/backups/nmis-config-backup-2019-04-22-2218.tar.gz", "status":"ok", "activity":"configbackup"}
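To show how the fields fit together, the following Python sketch parses a trimmed copy of the example record and converts `time` and `expire_at` to UTC datetimes. It assumes `time` is a Unix epoch timestamp in seconds and that `expire_at` is exported as an ISO 8601 UTC string, as in the example. The variable names are illustrative and not part of NMIS9.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the example record above (only the fields used below).
record = json.loads("""
{"_id": "5cbdb0fd2b1813502c414c52",
 "time": 1555935485.269778,
 "stats": {"time": 0.1784820556640625},
 "type": "completed",
 "expire_at": "2019-06-21T12:18:05.269Z",
 "status": "ok",
 "activity": "configbackup"}
""")

# Assumption: "time" is a Unix epoch timestamp in seconds.
created = datetime.fromtimestamp(record["time"], tz=timezone.utc)

# Assumption: "expire_at" is an ISO 8601 UTC string, as in this export.
expires = datetime.strptime(
    record["expire_at"], "%Y-%m-%dT%H:%M:%S.%fZ"
).replace(tzinfo=timezone.utc)

# Retention period between creation and expiry, rounded to whole days.
retention_days = round((expires - created).total_seconds() / 86400)

print(record["activity"], record["status"], record["type"])
print("created:", created.isoformat())
print("expires:", expires.isoformat())
print("retention (days):", retention_days)
```

For this record the two timestamps are 60 days apart.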
opStatus data can be visualised on the NMIS9 interface:
http://volla.opmantek.com/cgi-nmis9/opstatus.pl
How NMIS9 handles opStatus data
- Nmisng saves or updates the status of an operation when called by the nmisd daemon.
- An opstatus record is created or updated:
  - with the result of a queue job
  - when the worker starts, performs or stops an operation
  - when a job finishes with an error or completes successfully.
- By default, records expire 60 days after their timestamp (purge_opstatus_after is 5184000 seconds):
$expire_at = $statusrec->{time} + ( $self->config->{purge_opstatus_after} || 60 * 86400 );
- Used by selftest to check when the most recent update and collect operations ran.
- Used by nmisd to check when some operations were last tried/run/performed.
- When a node is deleted, its related opstatus data is deleted as well.
- A node dump includes the node's opstatus data.
- supports.pl exports the last 500 opstatus records. supports.pl exports collect data.
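The expiry rule and the "most recent operation" check from the list above can be sketched in Python as follows. This is an illustration under stated assumptions, not the NMIS9 implementation: `expire_at` mirrors the Perl expression quoted above, while `latest_for_activity` and the sample records are hypothetical.

```python
DEFAULT_PURGE_AFTER = 60 * 86400  # 60 days in seconds (5184000)


def expire_at(record_time, config):
    """Mirror of the Perl expression: time + (purge_opstatus_after || 60 * 86400).

    Like Perl's ||, Python's `or` falls back to the default when the
    setting is missing or falsy (None, 0, "").
    """
    return record_time + (config.get("purge_opstatus_after") or DEFAULT_PURGE_AFTER)


def latest_for_activity(records, activity):
    """Return the newest record for an activity, or None if there is none.

    Hypothetical helper: it illustrates the "most recent operation"
    check, not the actual selftest or nmisd code.
    """
    matching = [r for r in records if r.get("activity") == activity]
    return max(matching, key=lambda r: r["time"], default=None)


# Invented sample records, for illustration only.
records = [
    {"activity": "collect", "time": 1555935000.0, "status": "ok"},
    {"activity": "update", "time": 1555935100.0, "status": "ok"},
    {"activity": "collect", "time": 1555935485.0, "status": "error"},
]

print(expire_at(1555935485.0, {}))                               # default retention
print(expire_at(1555935485.0, {"purge_opstatus_after": 86400}))  # one-day retention
print(latest_for_activity(records, "collect"))
```

Because the fallback uses `or`, a setting of 0 is treated the same as an unset value, which matches how the Perl `||` operator behaves.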
How OMK handles opStatus data
- opCharts records/updates the status of an operation in its own database.
- opFlow records the status of an operation.
- opAddress saves an opstatus on the document (as oper_status).
- opHA records/updates the status of an operation in its own database.
- The default maximum ages (opCommon.nmis) are:
opaddress_opstatus_maxage => 604800, # seconds, 1 week
opflow_opstatus_maxage => 1209600, # seconds, 2 weeks
opconfig_opstatus_maxage => 604800, # seconds, 1 week
purge_opstatus_after => 86400*90, # 3 months (opHA).
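To make the arithmetic behind these defaults explicit, the Python sketch below converts each value to days and applies a simple age check. The setting names and values are copied from the list above. The `is_expired` helper and the reference time are hypothetical, and the products' own purge logic may differ.

```python
# Default maximum ages in seconds, copied from the values listed above.
MAX_AGE_SECONDS = {
    "opaddress_opstatus_maxage": 604800,
    "opflow_opstatus_maxage": 1209600,
    "opconfig_opstatus_maxage": 604800,
    "purge_opstatus_after": 86400 * 90,  # opHA
}


def is_expired(record_time, now, setting):
    """True when a record is older than the maximum age for `setting`.

    Hypothetical helper for illustration only.
    """
    return (now - record_time) > MAX_AGE_SECONDS[setting]


# Show each default as a whole number of days.
for name, seconds in MAX_AGE_SECONDS.items():
    print(f"{name}: {seconds // 86400} days")

now = 1_000_000_000  # arbitrary reference time for the example
eight_days_ago = now - 8 * 86400
print(is_expired(eight_days_ago, now, "opaddress_opstatus_maxage"))
print(is_expired(eight_days_ago, now, "opflow_opstatus_maxage"))
```

An eight-day-old record is past the one-week opAddress limit but still within the two-week opFlow limit.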