...
Besides that, there are a few potential causes of problems to examine.
...
Has the IP address of the opFlow server or virtual machine changed?
If so, update the NetFlow configuration on your network devices so that they send their flows to the new IP address.
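Before reconfiguring the exporters, confirm what the server's current address actually is; a minimal check on a typical Linux host:

```
# show the server's current IP addresses
ip addr show
# or, more compactly on most Linux systems
hostname -I
```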
...
Verify that the flow collection daemon is running
In opFlow 3 you'll be warned of daemon problems on the main dashboard page, similar to the screenshot below:
...
If no nfcapd is alive, run sudo service nfdump start.
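To check whether a collector is alive, something along these lines should work on the SysV-style init systems this page assumes:

```
# look for running nfcapd collector processes (the [n] stops grep matching itself)
ps -ef | grep '[n]fcapd'
# or ask the init system for the service status
sudo service nfdump status
```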
...
Verify that opFlow's main daemon is running
opFlow requires that its opflowd daemon is running to periodically retrieve and process new flow data from the respective flow collector tool.
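To check whether opflowd is currently up (same SysV-style init assumption as above):

```
# look for a running opflowd process
ps -ef | grep '[o]pflowd'
# or query the service status
sudo service opflowd status
```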
...
```
sudo service opflowd start
```
...
Verify that MongoDB is running
Without a working MongoDB, opFlow can't operate; in all likelihood you will be using a local MongoDB server on the same machine as opFlow.
...
Like above, starting a missing mongod instance is easy: sudo service mongod start is the command to use. Please note that mongod may refuse to start for a number of reasons (e.g. misconfiguration, lack of disk space, etc.); if the service start indicates failure, you'll have to investigate using the MongoDB logs (which are usually in /var/log/mongodb/).
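To confirm that MongoDB is not just running but also answering queries, a quick ping helps; this sketch assumes the legacy mongo shell is installed:

```
# check the service state
sudo service mongod status
# confirm the server answers a ping (legacy mongo shell)
mongo --quiet --eval "printjson(db.adminCommand('ping'))"
```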
...
Check that the data source folder configuration is consistent
opflowd needs to know where to look for new flow data, and the flow collector tool needs to know where to save its data so that consumers can find it.
...
```
grep opflow_dir /usr/local/omk/conf/opCommon.nmis
'<opflow_dir>' => '/var/lib/nfdump',

cat /etc/default/nfdump /etc/sysconfig/nfdump
# ...at most one of these files exists; if not, the default in /etc/init.d/nfdump will be used
# in all cases the relevant line looks like this:
DATA_BASE_DIR="/var/lib/nfdump"
```
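As a quick consistency check you can compare the two settings side by side; a small sketch using the same paths as above:

```
# the directory opFlow reads from
grep opflow_dir /usr/local/omk/conf/opCommon.nmis
# the directory nfdump writes to (whichever defaults file exists)
grep -h DATA_BASE_DIR /etc/default/nfdump /etc/sysconfig/nfdump 2>/dev/null
# the two directory values must match
```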
...
Check your disk space (mainly opFlow 2)
Make sure that wherever you are putting the flow data and the MongoDB database you have plenty of disk space; flow data is very voluminous.
...
```
df -h /data
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/vg_data-lv_data 247G   86G  148G  37% /data
```
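To see how much of that space the flow files and the MongoDB data files themselves consume, a du along these lines helps; the flow directory is the default used on this page, and the MongoDB data directory varies between distributions:

```
# disk usage of raw flow files and MongoDB data files
du -sh /var/lib/nfdump /var/lib/mongo* 2>/dev/null
```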
...
Check Log Files
Review the log files in /usr/local/omk/log; an example of following them live is shown after this list.
- opFlow.log
- common.log
- opDaemon.log
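For example, to watch all three at once while reproducing the problem:

```
# follow the opFlow logs for fresh errors
tail -f /usr/local/omk/log/opFlow.log /usr/local/omk/log/common.log /usr/local/omk/log/opDaemon.log
```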
Run a purge manually (only opFlow 2)
Purge the raw binary flow data and the older database data. The commands below assume you want to keep 7 days of binary flow data and that it is located in /var/opflow.
```
/usr/local/opmantek/bin/opflow_purge_raw_files.sh /var/opflow 7
/usr/local/opmantek/bin/opflowd.pl type=purge
```
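If you want this purge to happen regularly, a nightly cron entry is one option. This is a sketch only, with a hypothetical file name and the same paths and retention as above:

```
# /etc/cron.d/opflow-purge (hypothetical): purge daily at 02:30
30 2 * * * root /usr/local/opmantek/bin/opflow_purge_raw_files.sh /var/opflow 7 && /usr/local/opmantek/bin/opflowd.pl type=purge
```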
...
Are NetFlow packets arriving at the server?
You have verified that flowd/nfcapd and opflowd are running, but still you have no data on your dashboard. There are several things to check:
...
```
/usr/local/omk/bin/nfdump -o raw -r nfcapd.201606090829
# prints every flow record in that file, followed by a short statistics section:
Summary: total flows: 1562, total bytes: 1858493, total packets: 7904, avg bps: 7556, avg pps: 4, avg bpp: 235
Time window: 2016-06-09 08:28:23 - 2016-06-09 09:01:10
Total flows processed: 1562, Blocks skipped: 0, Bytes read: 131400
Sys: 0.052s flows/second: 29477.3
Wall: 0.219s flows/second: 7113.2
```
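To pick a recent capture file to inspect, list the newest nfcapd files first (default data directory assumed):

```
# newest capture files first; names end in a datestamp
ls -lt /var/lib/nfdump/nfcapd.* | head
```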
...
Verify Flow Data is Received
Using tcpdump, we can verify that flow data is being received by the server. This example uses the default opFlow UDP port of 9995; specify the host that needs to be verified.
```
[root@poller001 nfdump]# tcpdump -nn -i eth2 host 10.10.1.1 and port 9995
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
13:24:55.767037 IP 10.10.1.1.62757 > 10.215.1.7.9995: UDP, length 168
13:25:07.827152 IP 10.10.1.1.62757 > 10.215.1.7.9995: UDP, length 168
```
When we see output such as the example above, we know this server is receiving flow data from the network device.
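If no packets show up here, check that a local firewall isn't dropping them and that the collector is actually listening; for instance, on iptables/net-tools-based systems (use ss -ulpn on newer ones):

```
# any firewall rules mentioning the flow port?
sudo iptables -L -n | grep 9995
# is something listening on the collector port?
sudo netstat -ulpn | grep 9995
```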
Check the Flow Data
The next step is to ensure the host in question is providing valid data that nfdump can process. Move to the /var/lib/nfdump directory and look for nfcapd files that end in a datestamp. The datestamp denotes the time the capture file was started. Select a file that is likely to contain samples from the host we wish to verify and execute the following command.
```
[root@poller001 nfdump]# nfdump -r nfcapd.201707111327 -o raw > ~/raw.txt
```
Now view the new text file with less or a text editor. It will provide flow records such as the following. The 'ip router' field denotes the source router for this flow sample.
```
Flow Record:
Flags = 0x00 FLOW, Unsampled
export sysid = 1
size = 76
first = 1499779596 [2017-07-11 22:26:36]
last = 1499779596 [2017-07-11 22:26:36]
msec_first = 447
msec_last = 447
src addr = 10.10.1.4
dst addr = 10.10.1.1
src port = 23232
dst port = 179
fwd status = 0
tcp flags = 0x02 ....S.
proto = 6 TCP
(src)tos = 192
(in)packets = 1
(in)bytes = 44
input = 4
output = 0
src as = 0
dst as = 0
src mask = 32 10.10.1.4/32
dst mask = 32 10.10.1.1/32
dst tos = 0
direction = 0
ip next hop = 0.0.0.0
ip router = 10.10.1.1
engine type = 0
engine ID = 0
received at = 1499747221750 [2017-07-11 13:27:01.750]
```
Look for things that are not correct in the flow record. The following issues have been found in past support cases.
- input/output: These fields should contain the SNMP index numbers of the input and output interfaces.
- first/last: This is a timestamp that the router assigns. It's important that the router time is in sync with the opFlow server's time, as opFlow uses this time to calculate statistics. For example, if the router time is an hour earlier than the server time, opFlow will not display the data until the server time catches up with the router time. A quick way to check the server-side clock is shown below.
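Checking the opFlow server's own clock and NTP state is straightforward; the router's clock has to be checked on the device itself:

```
# server time and time zone
date
# NTP synchronisation state (on systemd-based systems)
timedatectl status
```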
Are NetFlow packets sent where they are expected?
There is no strict standard for which (UDP) port NetFlow exporters and collectors should use.
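To see which port your collector is actually listening on, its command line shows the -p argument it was started with; this sketch assumes a reasonably recent procps where pgrep -a lists full command lines:

```
# nfcapd's command line includes the port it listens on (-p)
pgrep -af nfcapd
```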
...
In the example above you can see which agents have supplied data, for which interfaces, and when. The CLI tool also lets you disable agents or particular agent-input-output combinations.
...
Ignoring flow sources
When configurations are copied from one device to another, flow configuration can come along with them; this can lead to more flows being sent to opFlow than expected. The best solution to this problem is to stop the device from sending flows, but this cannot always be done (or done in a timely manner).
...
with the desired agent IP address and in and out interface indices. If you omit the in_if and out_if arguments, all flow data from this agent is ignored; otherwise only flows that pass the specified interfaces in the given direction are filtered out. Please note that deactivating an agent does not affect flows that have already been processed; only future inputs are filtered.
...
opFlow and opFlowSP are both
...
set under opCommon.nmis 'omkd' => 'load_applications'.
Either opFlow or opFlowSP should be set, not both.
Otherwise opFlow will, for example, use the incorrect database in MongoDB ('flowsp' rather than 'flows').
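You can confirm which application is currently loaded with a quick grep (config path as used elsewhere on this page):

```
# exactly one of opFlow or opFlowSP should appear here
grep -A3 load_applications /usr/local/omk/conf/opCommon.nmis
```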
When changing this setting from opFlow to opFlowSP, or vice versa, restart the opflowd and omkd services:
```
sudo service opflowd restart
sudo service omkd restart
```