Scenario 1 : Using opHA4 ‘Pull’ on Primary to synchronize nmisng collections.

If the Poller has been running for a while, it is better to move it to opHA4 and then do a “Pull” to sync the data. Once the sync has completed, it's easy to move it back to opHA-MB.
Move the desired Poller to opha4 to sync up to the latest data (opha5 => opha4).

Peer: Pause the message bus on the Peer

/usr/local/omk/bin/ophad cmd producer pause

Primary: On the opHA-MB peer portal, use “Pull” to sync data from the Peer that has been paused.


Move the desired Poller back to opha5 (opha4 => opha5)

The following command sets opHA to start using the message bus again.

/usr/local/omk/bin/ophad cmd producer start
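
The pause, Pull, and start sequence above can be wrapped in a small shell helper so the producer is not left paused by accident. This is a minimal sketch, not part of opHA: the function name pause_pull_resume and the overridable OPHAD variable are illustrative assumptions, and only the two ophad cmd producer commands come from this page. The Pull itself remains a manual step in the portal.

```shell
# Sketch only: pause -> manual Pull -> start, run on the Peer.
# OPHAD defaults to the path used on this page; override it for dry runs.
OPHAD="${OPHAD:-/usr/local/omk/bin/ophad}"

pause_pull_resume() {
    # Step 1: stop publishing to the message bus.
    "$OPHAD" cmd producer pause || { echo "pause failed" >&2; return 1; }

    # Step 2: the Pull is done by hand on the Primary's peer portal.
    echo "Producer paused. Run the Pull on the Primary, then press Enter."
    read -r ack

    # Step 3: resume publishing to the message bus.
    "$OPHAD" cmd producer start
}
```

Calling pause_pull_resume on the Peer waits at the prompt until the Pull has been confirmed, then restarts the producer.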


Scenario 2 : opHA-MB failover and failback commands.

State: The state of the Peers can be checked on the Main Primary using the CLI

sudo /usr/local/omk/bin/ophad cmd consumer state

Failover: If the Poller goes down, the Mirror takes over automatically. However, once the Poller comes back online, the switch back from Mirror to Poller is not automatic.

Failback: The switch back is done with a CLI command, which needs to be run on the Main Primary (and Primary)

sudo /usr/local/omk/bin/ophad cmd consumer failback <Poller Cluster ID>

A failover can also be forced; this command again needs to be run on the Main Primary (and Primary)

sudo /usr/local/omk/bin/ophad cmd consumer failover <Poller Cluster ID>
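
Because failback is manual, it can be convenient to check the state before and after the switch in one step. The following is a minimal sketch using only the consumer state and consumer failback commands shown above; the function name failback_poller and the OPHAD variable are hypothetical, and the output format of the state command is not assumed.

```shell
# Sketch only: state -> failback -> state, run on the Main Primary.
# OPHAD defaults to the path used on this page; override it for dry runs.
OPHAD="${OPHAD:-/usr/local/omk/bin/ophad}"

failback_poller() {
    cluster_id="$1"
    if [ -z "$cluster_id" ]; then
        echo "usage: failback_poller <Poller Cluster ID>" >&2
        return 2
    fi
    # Show the peer state before changing anything.
    "$OPHAD" cmd consumer state || return 1
    # Hand control back from the Mirror to the Poller.
    "$OPHAD" cmd consumer failback "$cluster_id" || return 1
    # Show the peer state again so the change can be confirmed.
    "$OPHAD" cmd consumer state
}
```

The commands above are run with sudo, so this helper is expected to be run from a root shell.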

Scenario 3 : (Replication mode) If the main-primary were to go down in replication mode.

Switching Main and Secondary Primary Servers

In the unforeseen event that the main-primary server goes down, the second-primary takes over as the primary server and keeps the system running. Once the main-primary server has been recovered, restart all the services on it with the following command.

Run as root user

systemctl restart nmis9d opchartsd opeventsd omkd ophad
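
After the restart, it may be worth confirming that each service actually came up. A minimal sketch, assuming the five unit names from the command above; the function name check_services is illustrative.

```shell
# Sketch only: report which of the restarted services are active.
check_services() {
    failed=0
    for svc in nmis9d opchartsd opeventsd omkd ophad; do
        # is-active --quiet exits 0 only when the unit is running.
        if systemctl is-active --quiet "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
            failed=1
        fi
    done
    return "$failed"
}
```

check_services returns non-zero if any of the services is not active, so it can be used as a gate before switching the primary back.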

To switch from the Secondary Primary back to the Main-Primary, so that the main-primary is the master again, follow these steps:

  1. Connect to MongoDB on the current master server (in this case, the second-primary):

    mongosh --username opUserRW --password op42flow42 admin
  2. Update member priorities:

    cfg = rs.conf()
    cfg.members[0].priority = 0.6
    cfg.members[1].priority = 0.5
    rs.reconfig(cfg)
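
After the reconfiguration, it may help to confirm which member is now PRIMARY. The following is a minimal sketch, assuming the same opUserRW account as above; the function name show_replica_roles and the MONGOSH and MONGO_PASSWORD variables are hypothetical. The election can take a few seconds, so the roles may not change immediately.

```shell
# Sketch only: print each replica set member with its current role.
# MONGOSH and MONGO_PASSWORD are illustrative variables, not opHA settings.
MONGOSH="${MONGOSH:-mongosh}"

show_replica_roles() {
    if [ -z "$MONGO_PASSWORD" ]; then
        echo "set MONGO_PASSWORD first" >&2
        return 2
    fi
    # rs.status() lists every member; stateStr is PRIMARY, SECONDARY, etc.
    "$MONGOSH" --quiet --username opUserRW --password "$MONGO_PASSWORD" \
        --eval 'rs.status().members.forEach(m => print(m.name, m.stateStr))' \
        admin
}
```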

Enable logging:

  1. ophad logging: in /usr/local/omk/conf/opCommon.json, under “opha”, add the line
    "ophad_logfile" : "/usr/local/omk/log/ophad.log",

     "opha" : {
          "opha_role" : "Main Primary",
          "ophad_logfile" : "/usr/local/omk/log/ophad.log",
          "ophad_streaming_apps" : [
             "nmis",
             "opevents"
          ],
  2. nats-server logging: add the following line to /etc/nats-server.conf
    log_file: "/var/log/nats-server.log"

    shankarn@opha-dev4:~$ cat /etc/nats-server.conf
    server_name: "opha-dev4.opmantek.net"
    http_port: 8222
    listen: 4222
    jetstream: enabled
    
    #tls {
    #    cert_file: "<path>"
    #    key_file:  "<path>"
    #    #ca_file:   "<path>"
    #    verify: true
    #}
    
    log_file: "/var/log/nats-server.log"
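
A quick way to confirm that both logging settings are in place is to search the two files for them. This is a minimal sketch using grep only; the function name check_logging_config is illustrative, and it checks that the keys are present, not that the JSON or NATS configuration is otherwise valid.

```shell
# Sketch only: verify the two logging settings described above are present.
check_logging_config() {
    opcommon="${1:-/usr/local/omk/conf/opCommon.json}"
    natsconf="${2:-/etc/nats-server.conf}"
    status=0
    # opCommon.json must contain the ophad_logfile key.
    if ! grep -q '"ophad_logfile"' "$opcommon"; then
        echo "ophad_logfile missing in $opcommon"
        status=1
    fi
    # nats-server.conf must have an uncommented log_file line.
    if ! grep -Eq '^[[:space:]]*log_file:' "$natsconf"; then
        echo "log_file missing in $natsconf"
        status=1
    fi
    return "$status"
}
```

Running check_logging_config with no arguments checks the default paths; two file paths can be passed to check copies elsewhere.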

Debugging guide:

Scenario 1 : ophad doesn’t come up

Check the service journal with sudo journalctl -f -u ophad:

shankarn@opha-dev2:/usr/local/omk/log$ sudo journalctl -f -u ophad
-- Journal begins at Fri 2024-09-06 16:23:19 AEST. --
Aug 01 10:15:59 opha-dev2 ophad[46242]: ophad v0.0.0: agent
Aug 01 10:16:01 opha-dev2 ophad[46242]: cannot init logger: cannot create logfile open /usr/local/omk/log/ophad.log: permission denied
Aug 01 10:16:01 opha-dev2 systemd[1]: ophad.service: Main process exited, code=exited, status=1/FAILURE
Aug 01 10:16:01 opha-dev2 systemd[1]: ophad.service: Failed with result 'exit-code'.  

Edit /etc/systemd/system/ophad.service and remove the following lines:

Type=simple
User=root
Group=root

For reference, the backup copy of the unit file shows these lines in context:

cat /etc/systemd/system/ophad.service.bkup
[Unit]
Description=opHA daemon
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=root
Group=root

#on failure try to restart every RestartSec, upto StartLimitBurst times within StartLimitInterval

Restart=on-failure
RestartSec=10
StartLimitInterval=300
StartLimitBurst=10

WorkingDirectory=/usr/local/omk
ExecStart=/usr/local/omk/bin/ophad agent --streaming-type=nats

[Install]

Reload systemd and restart ophad:

sudo systemctl daemon-reload
sudo systemctl restart ophad
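
To confirm that the fix worked, the service state can be checked after the restart. A minimal sketch; the function name verify_ophad is illustrative, and it prints the most recent journal entries only when ophad is not running.

```shell
# Sketch only: confirm ophad is running, otherwise show recent journal lines.
verify_ophad() {
    if systemctl is-active --quiet ophad; then
        echo "ophad is active"
    else
        echo "ophad is not active; recent journal entries:"
        # -n 20 limits the output to the last 20 entries.
        journalctl -u ophad -n 20 --no-pager
        return 1
    fi
}
```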