Scenario 1: Using opHA4 'Pull' on the Primary to synchronize nmisng collections.
Move the desired Poller to opHA4 so that it syncs up to the latest (opha5 => opha4).
On the desired Peer:
/usr/local/omk/bin/ophad cmd producer pause
Move the desired Poller back to opHA5 (opha4 => opha5), then run:
/usr/local/omk/bin/ophad cmd producer start
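The pause/start pair above can be wrapped in a small helper so that only those two actions are accepted and the command can be previewed before it is run. This is a minimal sketch, assuming only the two `ophad cmd producer` subcommands shown above; `OPHAD`, `DRY_RUN` and `producer_cmd` are hypothetical names introduced for illustration and are not part of opHA.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the two producer commands from the steps above.
# OPHAD, DRY_RUN and producer_cmd are illustrative names, not opHA features.
OPHAD="${OPHAD:-/usr/local/omk/bin/ophad}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command, anything else = run it

producer_cmd() {
    local action="$1"
    case "$action" in
        pause|start) ;;
        *) echo "usage: producer_cmd pause|start" >&2; return 2 ;;
    esac
    if [ "$DRY_RUN" = "1" ]; then
        # Preview mode: show what would be executed
        echo "$OPHAD cmd producer $action"
    else
        "$OPHAD" cmd producer "$action"
    fi
}
```

On the Peer, `producer_cmd pause` would be called before moving the Poller and `producer_cmd start` afterwards, with `DRY_RUN=0` set once the printed commands look right.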
Scenario 2: opHA-MB failover and failback commands.
State: The state of the Peers can be retrieved on the Main Primary using the CLI:
| Code Block |
|---|
|
sudo /usr/local/omk/bin/ophad cmd consumer state |
...
Failback: There is a CLI command to accomplish the same, which needs to be run on the Main Primary (and Primary):
| Code Block |
|---|
|
sudo /usr/local/omk/bin/ophad cmd consumer failback <Poller Cluster ID> |
There is also a way to force a Failover, which again needs to be run on the Main Primary (and Primary):
| Code Block |
|---|
|
sudo /usr/local/omk/bin/ophad cmd consumer failover <Poller Cluster ID> |
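Because failover and failback differ by a single word and both need a Poller Cluster ID, it can help to build the command line through a helper that rejects anything else. This is a minimal sketch, assuming only the two `consumer` subcommands shown above; `consumer_cmdline` is a hypothetical helper name, and it only prints the command for review rather than running it.

```shell
#!/usr/bin/env bash
# Sketch only: builds the consumer command line shown above.
# consumer_cmdline is an illustrative helper, not an opHA tool.
consumer_cmdline() {
    local action="$1" cluster_id="$2"
    case "$action" in
        failover|failback) ;;
        *) echo "action must be failover or failback" >&2; return 2 ;;
    esac
    if [ -z "$cluster_id" ]; then
        echo "missing Poller Cluster ID" >&2
        return 2
    fi
    # Print the command so the operator can review it before running it
    printf 'sudo /usr/local/omk/bin/ophad cmd consumer %s %s\n' "$action" "$cluster_id"
}
```

Calling `consumer_cmdline failback <Poller Cluster ID>` on the Main Primary prints the exact command to copy and run.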
Scenario 3: (Replication mode) If the Main Primary were to go down in replication mode.
Switching Main and Secondary Primary Servers
...
Run as the root user:
| Code Block |
|---|
|
systemctl restart nmis9d opchartsd opeventsd omkd ophad |
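After restarting several services at once, it is easy to miss one that failed to come back. The check can be sketched as below, assuming the unit names from the restart command above; `inactive_units` is a hypothetical helper name introduced for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: prints every unit passed as an argument that is not active.
# inactive_units is an illustrative helper name.
inactive_units() {
    local unit
    for unit in "$@"; do
        # systemctl is-active --quiet exits 0 only when the unit is active
        if ! systemctl is-active --quiet "$unit"; then
            echo "$unit"
        fi
    done
}
```

Running `inactive_units nmis9d opchartsd opeventsd omkd ophad` prints nothing when all five services are active, and otherwise lists the ones to investigate.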
...
Check the ophad journal with sudo journalctl -f -u ophad:
| Code Block |
|---|
|
shankarn@opha-dev2:/usr/local/omk/log$ sudo journalctl -f -u ophad
-- Journal begins at Fri 2024-09-06 16:23:19 AEST. --
Aug 01 10:15:59 opha-dev2 ophad[46242]: ophad v0.0.0: agent
Aug 01 10:16:01 opha-dev2 ophad[46242]: cannot init logger: cannot create logfile open /usr/local/omk/log/ophad.log: permission denied
Aug 01 10:16:01 opha-dev2 systemd[1]: ophad.service: Main process exited, code=exited, status=1/FAILURE
Aug 01 10:16:01 opha-dev2 systemd[1]: ophad.service: Failed with result 'exit-code'. |
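The failure above comes down to one line: ophad cannot create its logfile. A small filter can pull the affected path out of the journal text so that its ownership and permissions can be inspected. This is a minimal sketch that assumes the message wording matches the output shown above; `denied_logfile` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: reads journal text on stdin and prints the path from any
# "cannot create logfile open <path>: permission denied" message.
# denied_logfile is an illustrative helper name.
denied_logfile() {
    sed -n 's/.*cannot create logfile open \(.*\): permission denied.*/\1/p' | sort -u
}
```

For example, `sudo journalctl -u ophad --no-pager | denied_logfile` prints the path, which can then be checked with `ls -l`.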
Edit /etc/systemd/system/ophad.service to remove the lines below:
| Code Block |
|---|
|
Type=simple
User=root
Group=root |
| Code Block |
|---|
|
cat /etc/systemd/system/ophad.service.bkup
[Unit]
Description=opHA daemon
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=root
Group=root
#on failure try to restart every RestartSec, upto StartLimitBurst times within StartLimitInterval
Restart=on-failure
RestartSec=10
StartLimitInterval=300
StartLimitBurst=10
WorkingDirectory=/usr/local/omk
ExecStart=/usr/local/omk/bin/ophad agent --streaming-type=nats
[Install] |
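The unit-file edit can also be scripted so that a backup is always taken first. This is a minimal sketch, assuming the three lines to remove are exactly `Type=simple`, `User=root` and `Group=root` as listed above, and reusing the `.bkup` suffix from the backup file shown above; `strip_unit_lines` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: backs up a unit file, then rewrites it without the three lines
# listed above. strip_unit_lines is an illustrative helper name.
strip_unit_lines() {
    local unit_file="$1"
    # Keep a copy of the original next to it, preserving mode and timestamps
    cp -p "$unit_file" "$unit_file.bkup" || return 1
    # -x matches whole lines only, so other Type=/User=/Group= values survive
    grep -vxE 'Type=simple|User=root|Group=root' "$unit_file.bkup" > "$unit_file"
}
```

Run as root, `strip_unit_lines /etc/systemd/system/ophad.service` leaves the original in `ophad.service.bkup`, so the change can be reverted by copying the backup over the edited file.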
Reload systemd and restart ophad:
| Code Block |
|---|
|
sudo systemctl daemon-reload
sudo systemctl restart ophad |