Config PreChecks

  1. Nats cluster config check
    Check /usr/local/omk/conf/opCommon.json on all the VMs (including Main Primary, Secondary Primary, Pollers, and Mirrors) for nats_cluster and verify it lists all three server DNS addresses: Main Primary, Secondary Primary, and the Arbiter Poller. A scripted spot-check is sketched after the code block below.

    Code Block
    omkadmin@lab-ophamb-mp01:/usr/local/omk/conf$ grep -a4 nats_cluster /usr/local/omk/conf/opCommon.json
          "db_use_v26_features" : 1,
          "redis_port" : 6379,
          "redis_server" : "localhost",
          "db_port" : "27017",
          "nats_cluster" : [
         "Main Primary",
         "Sec Primary",
         "New Arbiter Poller"
          ],
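
    The same check can be scripted rather than eyeballed. A minimal spot-check, assuming jq is installed on the host (nats_cluster may sit inside a nested section of opCommon.json, so the query searches recursively):

    Code Block
    # Print the nats_cluster array wherever it appears in opCommon.json.
    jq '.. | .nats_cluster? // empty' /usr/local/omk/conf/opCommon.json
    # Count its entries; a replicated setup should print 3.
    jq '[.. | .nats_cluster? // empty | .[]] | length' /usr/local/omk/conf/opCommon.json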

     

  2. Nats number of replicas check
    Check /usr/local/omk/conf/opCommon.json on all the VMs (including Main Primary, Secondary Primary, Pollers, and Mirrors) and verify nats_num_replicas is set to 3 (for a replicated setup). A loop to check every host in one pass is sketched below.

    Code Block
    omkadmin@lab-ophamb-mp01:/usr/local/omk/conf$ grep nats_num_replicas /usr/local/omk/conf/opCommon.json
          "nats_num_replicas" : 3,

  3. Nats stream info check (only for a replicated setup with 3 Nats servers)

    Code Block
    nats stream info --user omkadmin --password op42opha42
  • The cluster group needs to include all 3 servers in the replica set: the DNS addresses of Main Primary, Secondary Primary, and the Arbiter Poller.

  • Nats replicas needs to be set to 3. Both conditions can be checked non-interactively, as sketched below.
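
    A sketch of the non-interactive version, assuming the stream name is known (shown as <stream>) and that the natscli --json output follows the usual JetStream StreamInfo layout:

    Code Block
    # Replicas configured for the stream; should print 3 in a replicated setup.
    nats stream info <stream> --user omkadmin --password op42opha42 --json | jq '.config.num_replicas'
    # Cluster placement; the leader plus replicas should cover Main Primary,
    # Secondary Primary, and the Arbiter Poller.
    nats stream info <stream> --user omkadmin --password op42opha42 --json | jq '.cluster'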

...

  1. Mongo cluster heartbeat check: verify 'uptime' on Main Primary

    Code Block
    shankarn@opha-dev4:/usr/local/omk/conf$ mongosh --username opUserRW --password op42flow42 admin --port 27017
    rs1 [direct: primary] admin> rs.status()
    {
       ...
      members: [
        {
          _id: 0,
          name: 'opha-dev4.opmantek.net:27017',
          health: 1,
          state: 1,
          stateStr: 'PRIMARY',
          uptime: 17503,
          optime: { ts: Timestamp({ t: 1763526818, i: 9 }), t: Long('7') },
          optimeDate: ISODate('2025-11-19T04:33:38.000Z'),
          lastAppliedWallTime: ISODate('2025-11-19T04:33:38.225Z'),
          lastDurableWallTime: ISODate('2025-11-19T04:33:38.190Z'),
        },
        {
          _id: 1,
          name: 'opha-dev7.opmantek.net:27017',
          health: 1,
          state: 2,
          stateStr: 'SECONDARY',
          uptime: 17496,
          optime: { ts: Timestamp({ t: 1763526814, i: 1 }), t: Long('7') },
          optimeDurable: { ts: Timestamp({ t: 1763526814, i: 1 }), t: Long('7') },
          optimeDate: ISODate('2025-11-19T04:33:34.000Z'),
          optimeDurableDate: ISODate('2025-11-19T04:33:34.000Z'),
          lastAppliedWallTime: ISODate('2025-11-19T04:33:38.225Z'),
          lastDurableWallTime: ISODate('2025-11-19T04:33:38.225Z'),
          lastHeartbeat: ISODate('2025-11-19T04:33:36.300Z'),
          lastHeartbeatRecv: ISODate('2025-11-19T04:33:37.493Z'),
        },
        {
          _id: 2,
          name: 'opha-dev6.opmantek.net:27018',
          health: 1,
          state: 7,
          stateStr: 'ARBITER',
          uptime: 17496,
          lastHeartbeat: ISODate('2025-11-19T04:33:36.301Z'),
          lastHeartbeatRecv: ISODate('2025-11-19T04:33:36.290Z'),
        }
      ],
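
    The same heartbeat check can be run non-interactively. A sketch using mongosh --eval with the credentials shown above, printing one line per replica-set member:

    Code Block
    mongosh --username opUserRW --password op42flow42 admin --port 27017 --quiet \
      --eval 'rs.status().members.forEach(m => print(m.name, m.stateStr, "health:" + m.health, "uptime:" + m.uptime))'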
    

  2. Run the command sudo /usr/local/omk/bin/ophad verify on all the Peers/Primaries.

    The last line, "ophad.verify: ready for liftoff 🚀", indicates the configuration is good.

    Code Block
    shankarn@opha-dev5:~$ sudo /usr/local/omk/bin/ophad verify
    [sudo] password for shankarn:
    ophad v0.0.52: agent
    Appending to file "/usr/local/omk/log/ophad.log"
    Settings -----------------------------------------
      * ClusterId: 783d7b91-6c64-4db9-a28f-6364a54b8505
      * OMKDatabase:
        * ConnectionTimeout: 5h33m20s
        * RetryTimeout: 3m0s
        * PingTimeout: 33m20s
        * QueryTimeout: 1h23m20s
        * Port: 27017
        * Server: localhost
        * MongoCluster: []
        * ReplicaSet: (blank)
        * Name: omk_shared
        * Username: opUserRW
        * Password: ******
        * WriteConcern: 1
        * Uri: (blank)
        * BatchSize: 0
        * BatchTimeout: 0
      * NMISDatabase:
        * ConnectionTimeout: 2m0s
        * RetryTimeout: 3m0s
        * PingTimeout: 20s
        * QueryTimeout: 1h23m20s
        * Port: 27017
        * Server: localhost
        * MongoCluster: []
        * ReplicaSet: (blank)
        * Name: nmisng
        * Username: opUserRW
        * Password: ******
        * WriteConcern: 1
        * Uri: (blank)
        * BatchSize: 50
        * BatchTimeout: 500
      * OpEventsDatabase:
        * ConnectionTimeout: 2m0s
        * RetryTimeout: 3m0s
        * PingTimeout: 20s
        * QueryTimeout: 5m0s
        * Port: 27017
        * Server: localhost
        * MongoCluster: []
        * ReplicaSet: (blank)
        * Name: opevents
        * Username: opUserRW
        * Password: ******
        * WriteConcern: 1
        * Uri: (blank)
        * BatchSize: 50
        * BatchTimeout: 500
      * OMK:
        * LogLevel: info
        * BindAddr: *
      * Directories:
        * Base: /usr/local/omk
        * Conf: /usr/local/omk/conf
        * Logs: /usr/local/omk/log
        * Var: /usr/local/omk/var
      * OPHA:
        * DBName: opha
        * StreamingApps: [nmis opevents]
        * Logfile: /usr/local/omk/log/ophad.log
        * MongoWatchFilters: []
        * StreamType: nats
        * AgentPort: 6000
        * NonActiveTimeout: 8m0s
        * ResumeTokenCollection: resume_token
        * OpHACliPath: /usr/local/omk/bin/opha-cli.pl
        * Compression: true
        * Role: Poller
      * Consumer: false
      * Producer: false
      * ConsumerPollerSet: (blank)
      * DebugEnabled: false
      * Redis:
        * RedisServer: localhost
        * RedisPort: 6379
        * RedisPassword: ******
        * RetryTimeout: 3m0s
        * RedisStreamLenCheckPeriod: 5
        * RedisProducerMaxStreamLength: 10000
        * MaxRetries: 180
        * RedisTLSEnabled: false
        * RedisTLSSkipVerify: false
        * RedisProducerDegradeTimeout: 10
        * RedisProducerFullDegradeTimeout: 10
      * Kafka:
        * Seeds: localhost:63616,localhost:63627,localhost:63629
        * RetryTimeout: 3m0s
        * MaxRetries: 180
      * Nats:
        * NatsServer: opha-dev4.opmantek.net
        * NatsCluster: []
        * NatsPort: 4222
        * NatsNumReplicas: 1
        * NatsUsername: omkadmin
        * NatsPassword: ******
        * RetryTimeout: 3m0s
        * NatsStreamLenCheckPeriod: 5
        * NatsProducerMaxMsgPerSubject: 1000000
        * NatsMaxAge: 604800
        * MaxRetries: 180
        * NatsTLSEnabled: false
        * NatsTLSCert: <path>
        * NatsTLSKey: <path>
        * NatsTLSSkipVerify: false
        * NatsProducerDegradeTimeout: 10
        * NatsProducerFullDegradeTimeout: 10
      * Authentication:
        * AuthTokenKeys: ******
    --------------------------------------------------
    2025-10-22T08:01:46.329+1100 [INFO]  ophad.verify: verify nmis9 mongodb connection with database: name=nmisng
    2025-10-22T08:01:46.451+1100 [INFO]  ophad.verify: MongoDB NMIS connect: maybe="found nodes collection in nmis9 ✅"
    2025-10-22T08:01:46.451+1100 [INFO]  ophad.verify: verify omk mongodb connection with database: name=opha
    2025-10-22T08:01:46.551+1100 [INFO]  ophad.verify: MongoDB OMK connect: maybe="found opstatus collection in omk database ✅"
    2025-10-22T08:01:46.575+1100 [INFO]  ophad.verify: Nats connect:
      result=
      | can connect to nats-server: opha-dev4.opmantek.net version: 2.11.9 ✅
      | we can connect to Nats-server ✅
    2025-10-22T08:01:46.575+1100 [INFO]  ophad.verify: ready for liftoff 🚀
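
    To run the check across every peer in one pass, a wrapper loop can grep for the liftoff line. A sketch, assuming passwordless sudo over ssh and hypothetical hostnames (substitute your own Peers/Primaries):

    Code Block
    # Hypothetical host list; replace with your Peers/Primaries.
    for h in opha-dev4 opha-dev5 opha-dev7; do
      if ssh "$h" sudo /usr/local/omk/bin/ophad verify 2>&1 | grep -q 'ready for liftoff'; then
        echo "$h: OK"
      else
        echo "$h: verify FAILED - inspect /usr/local/omk/log/ophad.log"
      fi
    done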

     

...