...
To see when the last update and collect ran, use the GUI menu "System > Configuration Check > Node Admin Summary".
Running Collect and Update Jobs Manually
You may need to schedule a collect or update to run immediately, typically while you are doing modelling work.
Run an update job on a node called "sol" with debug output logged to a file:
Code Block |
---|
/usr/local/nmis9/bin/nmis-cli act=schedule job.type=update job.node=sol job.verbosity=9 job.force=true job.output=/tmp/sol |
The result will be something like:
Code Block |
---|
Job 6142a01930437a20d2084c91 created for node sol (05575270-a4ed-4c79-b992-18218c70ce42) and type update. |
If you get the following error, no node matched the name you gave; check the node name and try again:
Code Block |
---|
No nodes found matching your selectors! |
The debug log is written to a file whose name starts with /tmp/sol, for example:
Code Block |
---|
keith@kaos:~$ ls -lrt /tmp/sol*
-rw-r--r-- 1 root root 315334 Sep 16 11:38 /tmp/sol-1631756322.04417.log |
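Because each run writes a new timestamped file, it can be handy to pick out the most recent one. The following is a minimal sketch, not part of NMIS: the function name `latest_job_log` is hypothetical, and the `<prefix>-*.log` naming pattern is inferred only from the single example listing above.

```shell
# Print the newest log file for a given job.output prefix (e.g. /tmp/sol).
# Assumption: log files are named <prefix>-<timestamp>.log, as in the example listing.
latest_job_log() {
    prefix="$1"
    # ls -t sorts by modification time, newest first; errors are hidden so
    # that "no matching file" simply prints nothing.
    ls -t "$prefix"-*.log 2>/dev/null | head -n 1
}

# Usage, on a server where the job has already run:
# less "$(latest_job_log /tmp/sol)"
```

Matching on `<prefix>-` rather than the bare prefix avoids picking up logs for other nodes whose names merely start with the same letters.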
To run an update on all nodes once you have finished with your new model:
Code Block |
---|
/usr/local/nmis9/bin/nmis-cli act=schedule job.type=update job.force=true |
The result will be a list of nodes with jobs scheduled:
Code Block |
---|
keith@kaos:~$ /usr/local/nmis9/bin/nmis-cli act=schedule job.type=update job.force=true
Job 6142a1be9eb635425dd1c211 created for node excalibur (f8653511-9cb5-45a0-a1aa-bef81f4e34b8) and type update.
Job 6142a1be9eb635425dd1c213 created for node sif (46b8e7d2-e2d6-4ea4-8599-349fba105556) and type update.
Job 6142a1be9eb635425dd1c215 created for node sol (05575270-a4ed-4c79-b992-18218c70ce42) and type update. |
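If only a handful of nodes use the new model, a full update of every node may be more than is needed. As a rough sketch, the single-node command shown earlier can be wrapped in a loop. The `schedule_updates` function and the `NMIS_CLI` variable are hypothetical helpers introduced here for illustration; only the `nmis-cli` arguments come from the examples above.

```shell
# Path taken from the examples above; it can be overridden (e.g. for a dry run).
NMIS_CLI="${NMIS_CLI:-/usr/local/nmis9/bin/nmis-cli}"

# Schedule a forced update for each node name given as an argument.
schedule_updates() {
    for node in "$@"; do
        "$NMIS_CLI" act=schedule job.type=update job.node="$node" job.force=true
    done
}

# Usage:
# schedule_updates sol sif excalibur
```

Setting `NMIS_CLI=echo` before calling the function prints the commands instead of running them, which is a cheap way to check the node list first.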
To view the scheduler's active and queued jobs:
Code Block |
---|
keith@kaos:~$ /usr/local/nmis9/bin/nmis-cli act=list-schedules verbose=t
Active Jobs:
Id When Status What Parameters
6142a2236301fbc46bb58ee1 Thu Sep 16 11:47:15 2021 In Progress since Thu Sep 16 11:47:16 2021 (Worker 16471) collect {'uuid'='afaea97b-d72d-4ffe-bd09-80df44a8295b','wantsnmp'=1,'wantwmi'=1}
6142a229b45a12c0c4863b87 Thu Sep 16 11:47:21 2021 In Progress since Thu Sep 16 11:47:26 2021 (Worker 16563) update {'force'=1,'uuid'='9cfed9b9-5395-43a9-a52e-f339e1c69c21'}
6142a229b45a12c0c4863b8b Thu Sep 16 11:47:21 2021 In Progress since Thu Sep 16 11:47:26 2021 (Worker 16663) update {'force'=1,'uuid'='42bed16d-8029-401e-bf54-fbe6c074c072'}
6142a229b45a12c0c4863b8f Thu Sep 16 11:47:21 2021 In Progress since Thu Sep 16 11:47:28 2021 (Worker 16303) update {'force'=1,'uuid'='3b1a2c57-97e7-449e-bfad-c30c2d0d645a'}
Queued Jobs:
Id When Priority What Parameters
6142a229b45a12c0c4863b90 Thu Sep 16 11:47:21 2021 1 update {'force'=1,'uuid'='7c197b17-2d50-434c-a9d2-b8f685afe75a'}
6142a229b45a12c0c4863b91 Thu Sep 16 11:47:21 2021 1 update {'force'=1,'uuid'='f8653511-9cb5-45a0-a1aa-bef81f4e34b8'}
6142a229b45a12c0c4863b92 Thu Sep 16 11:47:21 2021 1 update {'force'=1,'uuid'='46b8e7d2-e2d6-4ea4-8599-349fba105556'}
6142a229b45a12c0c4863b93 Thu Sep 16 11:47:21 2021 1 update {'force'=1,'uuid'='d51dab62-2d6e-4dba-be31-eff1f496cfcb'}
6142a229b45a12c0c4863b94 Thu Sep 16 11:47:21 2021 1 update {'force'=1,'uuid'='801c9c70-0c06-47e3-a830-76bcabf07e8a'}
6142a229b45a12c0c4863b95 Thu Sep 16 11:47:21 2021 1 update {'force'=1,'uuid'='fab72303-93dd-4eb0-a917-02c6c3f20efd'}
6142a229b45a12c0c4863b96 Thu Sep 16 11:47:21 2021 1 update {'force'=1,'uuid'='05575270-a4ed-4c79-b992-18218c70ce42'}
6142a229b45a12c0c4863b97 Thu Sep 16 11:47:21 2021 1 update {'force'=1,'uuid'='4550361e-26a8-43d6-b48d-339b986b9534'} |
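After scheduling updates for many nodes, it can be useful to watch the queue drain. The sketch below counts the entries under "Queued Jobs:" by reading the listing from standard input. It assumes the output layout matches the sample above (section headers on their own lines, each job line starting with a 24-character hexadecimal id); `count_queued_jobs` is a hypothetical helper, not an NMIS command.

```shell
# Count jobs in the "Queued Jobs:" section of list-schedules output (read from stdin).
count_queued_jobs() {
    awk '
        /^Active Jobs:/ { in_queue = 0; next }
        /^Queued Jobs:/ { in_queue = 1; next }
        # Job lines start with a 24-character lowercase hexadecimal id;
        # the "Id When ..." column header does not match and is skipped.
        in_queue && $1 ~ /^[0-9a-f]+$/ && length($1) == 24 { n++ }
        END { print n + 0 }
    '
}

# Usage:
# /usr/local/nmis9/bin/nmis-cli act=list-schedules verbose=t | count_queued_jobs
```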
Fault-recovery
If a job remains stuck as an active job for too long, the nmis daemon will abort it and reschedule a suitable new job. Such stuck jobs can appear in the queue if you terminate the nmis daemon with act=abort or service nmis9d stop, because these actions immediately kill the relevant processes and don't take active operations into account.
...
- There was no default abort_plugins_after option in the configuration. This value can be added in Config.nmis:
'overtime_schedule' => { 'abort_plugins_after' => 7200, # Seconds ... }
- The scheduler keeps adding these jobs to the queue. Workers can discard them if the configuration option postpone_clashing_schedule is set to 0:
'postpone_clashing_schedule' => 0,
After these two changes, the nmis9d daemon needs to be restarted.
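Before restarting, it may be worth confirming that both options actually made it into the configuration file. The following is a minimal sketch: `check_schedule_options` is a hypothetical helper, and the file path in the usage line is an assumption about a default install. It is only a text search for the quoted option names, so it does not verify the values, the nesting under 'overtime_schedule', or whether a line is commented out.

```shell
# Report whether each option name appears (as a quoted key) in the given file.
check_schedule_options() {
    conf="$1"
    for opt in abort_plugins_after postpone_clashing_schedule; do
        if grep -q "'$opt'" "$conf"; then
            echo "$opt: present"
        else
            echo "$opt: missing"
        fi
    done
}

# Usage (path is an assumption; adjust to where your Config.nmis lives):
# check_schedule_options /usr/local/nmis9/conf/Config.nmis
```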
...