...
Ping test: the ping tool tests whether a particular host is reachable across an IP network. It measures the round-trip time for packets sent from the local host to a destination host and back.
Code Block ping x.x.x.x # replace x.x.x.x with the IP address you need to reach
Traceroute: a network diagnostic tool that traces, in real time, the path a packet takes across an IP network from source to destination, reporting the IP address of each router that responded along the way.
Code Block traceroute <ip_Node> # replace <ip_Node> with the IP address you need to reach
MTR: mtr (My traceroute) is a command-line network diagnostic tool that combines the functionality of the ping and traceroute commands.
Code Block
sudo mtr -r 8.8.8.8
[sample results below]
HOST: endor                       Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. 69.28.84.2                    0.0%    10    0.4   0.4   0.3   0.6   0.1
  2. 38.104.37.141                 0.0%    10    1.2   1.4   1.0   3.2   0.7
  3. te0-3-1-1.rcr21.dfw02.atlas.  0.0%    10    0.8   0.9   0.8   1.0   0.1
  4. be2285.ccr21.dfw01.atlas.cog  0.0%    10    1.1   1.1   0.9   1.4   0.1
  5. be2432.ccr21.mci01.atlas.cog  0.0%    10   10.8  11.1  10.8  11.5   0.2
  6. be2156.ccr41.ord01.atlas.cog  0.0%    10   22.9  23.1  22.9  23.3   0.1
  7. be2765.ccr41.ord03.atlas.cog  0.0%    10   22.8  22.9  22.8  23.1   0.1
  8. 38.88.204.78                  0.0%    10   22.9  23.0  22.8  23.9   0.4
  9. 209.85.143.186                0.0%    10   22.7  23.7  22.7  31.7   2.8
 10. 72.14.238.89                  0.0%    10   23.0  23.9  22.9  32.0   2.9
 11. 216.239.47.103                0.0%    10   50.4  61.9  50.4  92.0  11.9
 12. 216.239.46.191                0.0%    10   32.7  32.7  32.7  32.8   0.1
 13. ???                          100.0    10    0.0   0.0   0.0   0.0   0.0
 14. google-public-dns-a.google.c  0.0%    10   32.7  32.7  32.7  32.8   0.0
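When reading an mtr report, it can help to separate hops that merely report loss from loss that reaches the destination. The sketch below is one possible way to do that in Python; it assumes the report layout shown in the sample above (hop number, host, Loss%, Snt, Last, Avg), and the function names and the 1% threshold are hypothetical choices for illustration, not part of mtr.

```python
import re

# One hop line of an `mtr -r` report: hop number, host, Loss%, Snt, Last, Avg.
# The optional "|--" covers mtr versions that print it after the hop number.
HOP_RE = re.compile(
    r"^\s*(\d+)\.(?:\|--)?\s*(\S+)\s+([\d.]+)%?\s+(\d+)\s+([\d.]+)\s+([\d.]+)"
)

# Three hop lines copied from the sample report above.
SAMPLE = """\
 12. 216.239.46.191                0.0%    10   32.7  32.7  32.7  32.8   0.1
 13. ???                          100.0    10    0.0   0.0   0.0   0.0   0.0
 14. google-public-dns-a.google.c  0.0%    10   32.7  32.7  32.7  32.8   0.0
"""


def parse_hops(report):
    """Return one dict per hop line found in the report text."""
    hops = []
    for line in report.splitlines():
        match = HOP_RE.match(line)
        if match:
            hops.append({
                "hop": int(match.group(1)),
                "host": match.group(2),
                "loss_pct": float(match.group(3)),
                "avg_ms": float(match.group(6)),
            })
    return hops


def lossy_hops(hops, threshold_pct=1.0):
    """Hops whose reported loss is at or above the (hypothetical) threshold."""
    return [h for h in hops if h["loss_pct"] >= threshold_pct]


def end_to_end_loss(hops):
    """Loss reported for the final hop, i.e. the destination itself."""
    return hops[-1]["loss_pct"]


hops = parse_hops(SAMPLE)
print([h["hop"] for h in lossy_hops(hops)], end_to_end_loss(hops))
```

In the sample, hop 13 reports 100% loss while the destination at hop 14 reports none. A pattern like this typically means the intermediate router is not answering probes, rather than that traffic is being dropped on the path.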
- snmpwalk: a Simple Network Management Protocol (SNMP) application available on the Security Management System (SMS) CLI. It uses SNMP GETNEXT requests to query a network device for information. An object identifier (OID) may be given on the command line.
Code Block
The following example CLI command returns the IPS temperature information:

Command:
snmpwalk -v 2c -c tinapc <IP address> 1.3.6.1.4.1.10734.3.5.2.5.5

Command Explanation:
In this case the CLI command breaks down as follows:
snmpwalk = SNMP application
-v 2c = specifies which SNMP version to use (1, 2c, 3)
-c tinapc = specifies the community string. Note: The IPS has the SNMP read-only community string of "tinapc"
<IP address> = specifies the IP address of the IPS device
1.3.6.1.4.1.10734.3.5.2.5.5 = OID parameter for the IPS temperature information

Results:
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.1.0 = INTEGER: 27
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.2.0 = INTEGER: 50
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.3.0 = INTEGER: 55
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.4.0 = INTEGER: 0
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.5.0 = INTEGER: 85

Results Explanation:
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.1.0 = INTEGER: 27 = The chassis temperature (27° Celsius / 80.6° Fahrenheit)
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.2.0 = INTEGER: 50 = The major threshold value for chassis temperature (50° Celsius / 122° Fahrenheit)
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.3.0 = INTEGER: 55 = The critical threshold value for chassis temperature (55° Celsius / 131° Fahrenheit)
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.4.0 = INTEGER: 0 = The minimum value of the chassis temperature range (0° Celsius / 32° Fahrenheit)
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.5.0 = INTEGER: 85 = The maximum value of the chassis temperature range (85° Celsius / 185° Fahrenheit)
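The results explanation above can be turned into a simple check that compares the chassis temperature with its two thresholds. The following Python sketch assumes the output lines look exactly like the results shown above; the field names in `FIELDS` and the helper names are hypothetical labels chosen for this illustration.

```python
import re

# Matches lines such as
# "SNMPv2-SMI::enterprises.10734.3.5.2.5.5.1.0 = INTEGER: 27"
LINE_RE = re.compile(r"^(\S+)\s*=\s*INTEGER:\s*(-?\d+)\s*$")

# Meaning of the sub-identifier that follows ...3.5.2.5.5, taken from the
# results explanation above. The label names themselves are hypothetical.
FIELDS = {
    1: "temperature",
    2: "major",
    3: "critical",
    4: "range_min",
    5: "range_max",
}

SAMPLE = """\
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.1.0 = INTEGER: 27
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.2.0 = INTEGER: 50
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.3.0 = INTEGER: 55
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.4.0 = INTEGER: 0
SNMPv2-SMI::enterprises.10734.3.5.2.5.5.5.0 = INTEGER: 85
"""


def parse_temperature(output):
    """Map each known temperature field to its integer value in Celsius."""
    values = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        parts = match.group(1).split(".")
        # The OID ends in "<field>.0", so the field is the second-to-last part.
        if len(parts) < 2 or not parts[-2].isdigit():
            continue
        field = FIELDS.get(int(parts[-2]))
        if field:
            values[field] = int(match.group(2))
    return values


def temperature_status(values):
    """Classify the reading against the device's own thresholds."""
    reading = values["temperature"]
    if reading >= values["critical"]:
        return "critical"
    if reading >= values["major"]:
        return "major"
    return "ok"


values = parse_temperature(SAMPLE)
print(values["temperature"], temperature_status(values))
```

Reading the thresholds from the device, instead of hard-coding them, keeps the check valid if a different model reports different limits.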
It is important to confirm that the device is pingable, that it shows no significant latency or packet loss, and that SNMP data is being collected.
Server hardware requirements.
Introduction
Top
Open a console and simply run the command:
Code Block |
---|
top |
A text-mode interface appears and refreshes every 3 seconds. It shows a summary of the system status and the list of running processes.
...
This section is crucial for identifying and resolving device issues. Review the following considerations: the number of nodes you will manage, the number of users who will access the GUIs, and how often your data needs to be updated. If updates are required every 5 minutes, the hardware must be able to complete each update cycle within that interval. The OS requirements also need to be well defined; a good rule of thumb is to reserve 1 GB of RAM for the OS. Use high-speed drives for the data (SAN is ideal), with separate storage for the MongoDB database and temp files. A server with 4-8 cores on high-performing processor(s) and 16-64 GB of RAM should perform well for 1k+ nodes.
Using the top/htop command
The top command shows the processes running on the server. It displays system information such as uptime, load average, running tasks, number of users logged in, CPU utilisation and RAM utilisation, and it lists the processes being run by users on the server.
Code Block |
---|
top |
Code Block |
---|
top - 12:50:01 up 62 days, 22:56, 5 users, load average: 4.76, 8.03, 4.34 |
Tasks: 412 total, 1 running, 411 sleeping, 0 stopped, 15 zombie |
Cpu(s): 6.8%us, 3.8%sy, 0.2%ni, 74.4%id, 28.2%wa, 0.1%hi, 0.5%si, 0.0%st |
Mem: 20599548k total, 18622368k used, 1977180k free, 375212k buffers |
Swap: 6669720k total, 3536428k used, 3133292k free, 10767256k cached |
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND |
26306 root 20 0 478m 257m 1900 S 3.9 1.3 0:08.21 nmis.pl |
15522 root 20 0 626m 373m 2776 S 2.0 1.9 71:45.09 opeventsd.pl |
27285 root 20 0 15280 1444 884 R 2.0 0.0 0:00.01 top |
1 root 20 0 19356 308 136 S 0.0 0.0 1:07.65 init |
2 root 20 0 0 0 0 S 0.0 0.0 0:02.14 kthreadd |
3 root RT 0 0 0 0 S 0.0 0.0 17359:19 migration/0 |
4 root 20 0 0 0 0 S 0.0 0.0 252:25.86 ksoftirqd/0 |
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 stopper/0 |
6 root RT 0 0 0 0 S 0.0 0.0 2233:33 watchdog/0 |
7 root RT 0 0 0 0 S 0.0 0.0 340:35.60 migration/1 |
8 root RT 0 0 0 0 S 0.0 0.0 0:00.00 stopper/1 |
9 root 20 0 0 0 0 S 0.0 0.0 5:23.87 ksoftirqd/1 |
10 root RT 0 0 0 0 S 0.0 0.0 214:57.35 watchdog/1 |
It is important to understand the output of the command. The analysis goes line by line and then through each displayed column.
...
1. First line
The first line of the top output shows the following, in order:
Code Block |
---|
top - 12:50:01 up 62 days, 22:56, 5 users, load average: 4.76, 8.03, 4.34 |
The first line shows:
- Current time.
- Time the system has been up.
- Number of users (root).
- Load average over the last 1, 5 and 15 minutes, respectively.
2. Tasks
- current time (12:50:01)
- uptime of the machine (up 62 days, 22:56)
- user sessions logged in (5 users)
- average load on the system (load average: 4.76, 8.03, 4.34); the three values refer to the last 1, 5 and 15 minutes
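Load averages are easier to judge when divided by the number of CPU cores, since a load of 4.76 means something different on 2 cores than on 16. The sketch below extracts the three values from the first line of top; the core count of 4 is an assumption made only for this example, because the sample output does not state how many cores the server has.

```python
import re

FIRST_LINE = "top - 12:50:01 up 62 days, 22:56, 5 users, load average: 4.76, 8.03, 4.34"


def parse_load_average(line):
    """Return the 1, 5 and 15 minute load averages as floats."""
    match = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", line)
    if match is None:
        raise ValueError("no load average found in line")
    return tuple(float(value) for value in match.groups())


def load_per_core(load, cores):
    """Divide each load value by the number of cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return tuple(value / cores for value in load)


load = parse_load_average(FIRST_LINE)
# Assumption for illustration only: a 4-core server.
per_core = load_per_core(load, cores=4)
print(load, per_core)
```

A per-core value that stays above 1.0 suggests that, on average, more work is queued than the cores can run at once.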
2. Second row: tasks
The second row provides the following information.
Code Block |
---|
Tasks: 412 total, 1 running, 411 sleeping, 0 stopped, 15 zombie |
The second line shows the total number of tasks and processes, which can be in different states.
...
- Total processes (412 total)
- Running processes (1 running)
- Sleeping processes (411 sleeping)
- Stopped processes (0 stopped)
- Finished processes waiting for their parent process to collect their exit status (15 zombie)
Zombie process: a process that has completed execution but still has an entry in the process table. The entry is kept so that the parent process can read its child's exit status.
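If this row is checked regularly, the counts can be extracted with a short script. The following is a minimal Python sketch, assuming the "Tasks:" layout shown above; the zombie limit of 10 is a hypothetical value chosen for the example, not a recommendation from this guide.

```python
import re

TASKS_LINE = "Tasks: 412 total, 1 running, 411 sleeping, 0 stopped, 15 zombie"


def parse_tasks(line):
    """Map each state name (total, running, ...) to its process count."""
    return {name: int(count) for count, name in re.findall(r"(\d+)\s+(\w+)", line)}


def too_many_zombies(tasks, limit=10):
    """True when the zombie count exceeds the (hypothetical) limit."""
    return tasks.get("zombie", 0) > limit


tasks = parse_tasks(TASKS_LINE)
print(tasks["zombie"], too_many_zombies(tasks))
```

A growing zombie count usually points at a parent process that is not collecting the exit status of its children.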
3. CPU states
Code Block |
---|
Cpu(s): 6.8%us, 3.8%sy, 0.2%ni, 74.4%id, 28.2%wa, 0.1%hi, 0.5%si, 0.0%st |
This line shows the percentages of processor usage, broken down by type of use.
- us (user): CPU time spent in user processes.
- sy (system): CPU time spent in the kernel.
- id (idle): CPU time spent idle.
- wa (I/O wait): CPU time spent waiting for I/O. In this example the value is very high, which is a cause for concern.
- hi (hardware interrupts): CPU time spent servicing hardware interrupts.
- si (software interrupts): CPU time spent servicing software interrupts.
4. Physical memory
Code Block |
---|
Mem: 20599548k total, 18622368k used, 1977180k free, 375212k buffers |
- Total memory.
- Used memory.
- Free memory.
- Memory used by buffers.
5. Virtual memory
Code Block |
---|
Swap: 6669720k total, 3536428k used, 3133292k free, 10767256k cached |
- Total memory.
- Used memory.
- Free memory.
- Cached memory.
6. Columns
Now let's look at the different columns shown when running the command.
...
- Percentage of CPU used by user processes (6.8%us)
- Percentage of CPU used by system (kernel) processes (3.8%sy)
- Percentage of CPU used by processes with an adjusted nice priority (0.2%ni)
- Percentage of the CPU not used (74.4%id)
- Percentage of CPU time spent waiting for I/O operations (28.2%wa). A value this high is bad for server performance.
- Percentage of the CPU serving hardware interrupts (0.1%hi, hardware IRQ)
- Percentage of the CPU serving software interrupts (0.5%si, software interrupts)
- Percentage of CPU "stolen" from this virtual machine by the hypervisor for other tasks, such as running another virtual machine; this is 0 on desktops and servers that are not virtual machines (0.0%st, steal time)
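The I/O wait value is the one most worth watching on this line. The following Python sketch parses the CPU states and flags a high `wa` value; the 10% threshold is an assumption for illustration, and the pattern accepts both the `6.8%us` layout shown here and the `6.8 us` layout printed by newer versions of top.

```python
import re

CPU_LINE = "Cpu(s): 6.8%us, 3.8%sy, 0.2%ni, 74.4%id, 28.2%wa, 0.1%hi, 0.5%si, 0.0%st"


def parse_cpu_states(line):
    """Map each state abbreviation (us, sy, ...) to its percentage."""
    pairs = re.findall(r"([\d.]+)%?\s*([a-z]+)", line)
    return {name: float(value) for value, name in pairs}


def high_iowait(states, threshold_pct=10.0):
    """True when I/O wait is at or above the (hypothetical) threshold."""
    return states.get("wa", 0.0) >= threshold_pct


states = parse_cpu_states(CPU_LINE)
print(states["wa"], high_iowait(states))
```

When `wa` is high while `us` and `sy` are low, as in this sample, the CPUs are mostly waiting on disk or network I/O, so faster storage tends to help more than additional cores.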
4. Memory
These rows provide information about RAM and swap usage: total, used and free memory, plus the memory held in buffers and cache.
Code Block |
---|
Mem: 20599548k total, 18622368k used, 1977180k free, 375212k buffers |
Code Block |
---|
Swap: 6669720k total, 3536428k used, 3133292k free, 10767256k cached |
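In this older top layout the "used" figure includes buffers and page cache, which the kernel can reclaim when applications need memory, and the "cached" figure printed on the Swap row refers to that page cache. The sketch below estimates how much memory applications are really using; the function and key names are hypothetical and it assumes the two rows look like the sample above.

```python
import re

MEM_LINE = "Mem: 20599548k total, 18622368k used, 1977180k free, 375212k buffers"
SWAP_LINE = "Swap: 6669720k total, 3536428k used, 3133292k free, 10767256k cached"


def parse_kb_fields(line):
    """Map each label (total, used, ...) to its value in kilobytes."""
    return {name: int(value) for value, name in re.findall(r"(\d+)k\s+(\w+)", line)}


def memory_summary(mem_line, swap_line):
    """Estimate application memory use and swap use as percentages."""
    mem = parse_kb_fields(mem_line)
    swap = parse_kb_fields(swap_line)
    # Buffers and page cache are counted in "used" but are reclaimable.
    reclaimable_kb = mem["buffers"] + swap["cached"]
    app_used_kb = mem["used"] - reclaimable_kb
    swap_pct = 100.0 * swap["used"] / swap["total"] if swap["total"] else 0.0
    return {
        "app_used_kb": app_used_kb,
        "app_used_pct": 100.0 * app_used_kb / mem["total"],
        "swap_used_pct": swap_pct,
    }


summary = memory_summary(MEM_LINE, SWAP_LINE)
print(summary)
```

For the sample, RAM looks almost full at first glance, but a large share of it is cache. The swap usage is the more relevant warning sign, especially combined with a high I/O wait.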
5. Process List
The last section lists the processes that are currently running, together with the CPU and memory each one uses.
Code Block |
---|
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND |
26306 root 20 0 478m 257m 1900 S 3.9 1.3 0:08.21 nmis.pl |
8 root RT 0 0 0 0 S 0.0 0.0 0:00.00 stopper/1 |
15522 root 20 0 626m 373m 2776 S 2.0 1.9 71:45.09 opeventsd.pl |
9 root 20 0 0 0 0 S 0.0 0.0 5:23.87 ksoftirqd/1 |
27285 root 20 0 15280 1444 884 R 2.0 0.0 0:00.01 top |
10 root RT 0 0 0 0 S 0.0 0.0 214:57.35 watchdog/1 |
- PID – ID of the process (26306)
- USER – The user that owns the process (root)
- PR – Priority of the process (20)
- NI – The "nice" value of the process (0)
- VIRT – Virtual memory used by the process (478m)
- RES – Physical memory used by the process (257m)
- SHR – Shared memory of the process (1900)
- S – Status of the process: S=sleeping, R=running, Z=zombie (S)
- %CPU – Percentage of CPU used by the process (3.9)
- %MEM – Percentage of RAM used by the process (1.3)
- TIME+ – Total CPU time used by the process (0:08.21)
- COMMAND – Name of the process (nmis.pl)
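The column list above maps directly onto a small parser. The following Python sketch, using hypothetical helper names, pairs each value in a process row with its column header and picks the process using the most CPU; the rows are copied from the sample output.

```python
HEADER = "PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND"
ROWS = [
    "26306 root 20 0 478m 257m 1900 S 3.9 1.3 0:08.21 nmis.pl",
    "15522 root 20 0 626m 373m 2776 S 2.0 1.9 71:45.09 opeventsd.pl",
    "27285 root 20 0 15280 1444 884 R 2.0 0.0 0:00.01 top",
]


def parse_process_rows(header, rows):
    """Return one dict per row, keyed by the column names in the header."""
    columns = header.split()
    processes = []
    for row in rows:
        # COMMAND may contain spaces, so split at most len(columns) - 1 times.
        fields = row.split(None, len(columns) - 1)
        if len(fields) == len(columns):
            processes.append(dict(zip(columns, fields)))
    return processes


def busiest_process(processes):
    """Process with the highest %CPU value."""
    return max(processes, key=lambda proc: float(proc["%CPU"]))


processes = parse_process_rows(HEADER, ROWS)
top_proc = busiest_process(processes)
print(top_proc["PID"], top_proc["COMMAND"], top_proc["%CPU"])
```

Values are kept as strings because columns such as VIRT and RES mix plain numbers with suffixed ones like 478m.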
It is important to monitor the output of this command to verify that the server is working properly and running all the internal processes it needs.
Server configuration option.
...