Monitoring Linux Systems with Nagios

Feb 17, 2016 | Barred Owl Web, Technical


Nagios is extremely versatile and can monitor just about anything. I first got a taste of Nagios when I worked as an Operations Intern for a Drupal services company.

In today’s post, I’m going to share some of my accumulated knowledge in using Nagios to monitor the infrastructure we manage through Barred Owl Web. In December 2015, I gave a presentation to the ChaDevOps Meetup Group on a Basic Introduction to Nagios. You can view all of my workshops & presentations at https://barredowlweb.com/knowledge-base/

Until recently, I only used Nagios to monitor public services (namely, whether a URL loads properly and whether the server responds to ICMP pings). Within the last two months, I’ve expanded my basic Nagios implementation to use NRPE for monitoring server load, memory usage, and Postfix mail queues on various servers.

The Setup

As of this blog post, I run all of my infrastructure on CentOS. Most of the servers I manage run either CentOS 6 or 7, although I still have a couple of legacy CentOS 5 machines under my control. Instead of compiling Nagios from source (who wants to maintain that?), I’ve opted to install the packages from the EPEL repository.

Here’s my setup:

  • EPEL Repo (For CentOS 7, you can install it with `rpm -iUvh http://ftp.linux.ncsu.edu/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm`)
  • After you do a `yum install nagios nagios-plugins-all nagios-nrpe`, you can find the relevant Nagios files as follows:
    • Main config and conf.d directory is in /etc/nagios/
    • Plugins are located in /usr/lib64/nagios/plugins
    • NRPE config is at /etc/nagios/nrpe.cfg
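
If you want to confirm that the packages landed where the list above says, a quick check along these lines can help. This is only a sketch: `missing_paths` and `EXPECTED_PATHS` are names I made up for illustration, and the paths assume the 64-bit CentOS/EPEL layout described above.

```python
from pathlib import Path

# Directories taken from the list above; adjust them if your layout differs.
EXPECTED_PATHS = [
    "/etc/nagios",
    "/usr/lib64/nagios/plugins",
]


def missing_paths(paths):
    """Return the subset of paths that do not exist on this machine."""
    return [p for p in paths if not Path(p).exists()]


def report(paths=EXPECTED_PATHS):
    """Print one line per expected path, marking the ones that are absent."""
    absent = set(missing_paths(paths))
    for p in paths:
        print(f"{'MISSING' if p in absent else 'found':8} {p}")
```

Calling `report()` on the monitoring server prints a short found/MISSING summary; an empty result from `missing_paths(EXPECTED_PATHS)` means everything is in place.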

The Monitoring

Here are some of the things that I’m monitoring:

  • Checking for correct DNS values on various hosts
    • Usage: check_dns -H host [-s server] [-a expected-address] [-A] [-t timeout] [-w warn] [-c crit] (documentation: http://nagios-plugins.org/doc/man/check_dns.html)
    • This doesn’t require NRPE, and is a simple check from the monitoring server. Here’s my service definition:

      define service{
          host_name               ns1.developcents.com
          service_description     DNS Check
          check_command           check_dns!ns1.developcents.com
          contact_groups          admins
          max_check_attempts      3
          check_interval          10
          retry_interval          5
          check_period            24x7
          notification_interval   30
          notification_period     24x7
      }

  • Checking to see if server load is reasonable
    • Usage: check_load [-r] -w WLOAD1,WLOAD5,WLOAD15 -c CLOAD1,CLOAD5,CLOAD15 (documentation: http://nagios-plugins.org/doc/man/check_load.html)
    • This does require NRPE. Here’s my service definition on the monitoring server:

      define service{
          host_name               mail.developcents.com
          service_description     Server Load
          contact_groups          admins
          check_command           check_nrpe!check_load
          check_interval          4
          retry_interval          1
          max_check_attempts      3
          check_period            24x7
          notification_period     24x7
      }

    • And here’s my NRPE command (found in nrpe.cfg) on the server that is being monitored:

      command[check_load]=/usr/lib64/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
  • Checking the Mail Queue to make sure it’s not clogged
    • This is a third-party plugin that is not included in the nagios-plugins-all package provided by EPEL. The plugin information is at https://exchange.nagios.org/directory/Plugins/Email-and-Groupware/Postfix/check_postfix_queue/details
    • Here’s my service definition on the monitoring server:

      define service{
          host_name               mail.developcents.com
          service_description     Mail Queue
          contact_groups          admins
          check_command           check_nrpe!check_queue
          check_interval          4
          retry_interval          1
          max_check_attempts      3
          check_period            24x7
          notification_period     24x7
      }

    • And here’s my NRPE command (again, note that this goes into nrpe.cfg on the server that is actually being monitored):

      command[check_queue]=/usr/lib64/nagios/plugins/check_postfix_queue -w 15 -c 30
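
Both NRPE commands above follow the same pattern: the plugin measures something, compares it against the -w and -c thresholds, and reports the result through its exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN, per the Nagios plugin convention). Here’s a rough Python sketch of that logic. It is not how check_load or check_postfix_queue are actually implemented. The function names are mine, and I’m assuming a value has to exceed a threshold to trip it and that the worst of the three load averages decides the overall status.

```python
import os

# Exit codes from the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def parse_triplet(text):
    """Turn a threshold string such as "15,10,5" into three floats."""
    values = [float(part) for part in text.split(",")]
    if len(values) != 3:
        raise ValueError("expected three comma-separated values")
    return values


def classify(value, warn, crit):
    """Compare one measurement against its warning and critical thresholds."""
    if value > crit:
        return CRITICAL
    if value > warn:
        return WARNING
    return OK


def classify_load(loads, warn_text, crit_text):
    """Return the worst status across the 1-, 5- and 15-minute load averages."""
    warn = parse_triplet(warn_text)
    crit = parse_triplet(crit_text)
    # OK < WARNING < CRITICAL numerically, so max() picks the worst state.
    return max(classify(v, w, c) for v, w, c in zip(loads, warn, crit))


def main():
    """Check the local load with the same thresholds as the NRPE command above."""
    loads = os.getloadavg()
    status = classify_load(loads, "15,10,5", "30,25,20")
    print(f"{STATUS_NAMES[status]} - load average: "
          f"{loads[0]:.2f}, {loads[1]:.2f}, {loads[2]:.2f}")
    return status
```

A real plugin would finish with `sys.exit(main())` so that NRPE sees the exit code. The mail queue check is the single-value case: with `-w 15 -c 30`, `classify(20, 15, 30)` comes out as WARNING.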

I hope that this information is useful to someone! You can also find some of my Nagios-related questions & answers on ServerFault and StackOverflow:

  • My question and answer on how to monitor URLs: http://stackoverflow.com/questions/9246557/monitoring-urls-with-nagios/
  • My question and answer on how to monitor hosts with check_ping: http://stackoverflow.com/questions/26746404/nagios-monitoring-hosts-with-check-ping
  • My answer on how to manually run a Nagios check from the command line: http://serverfault.com/questions/339968/how-can-i-manually-run-a-nagios-check-from-the-command-line/339969#339969
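
The last link is about running a check by hand. That works because a plugin is just an ordinary executable: it prints a line of status text and signals its state through the exit code. As a rough illustration, here is a small Python wrapper that runs a plugin command and translates the exit code. `run_check` is a hypothetical helper of my own, not part of Nagios, and it assumes the standard 0/1/2/3 exit-code convention.

```python
import subprocess

STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(argv, timeout=30):
    """Run a plugin command and return (status_name, first_line_of_output)."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # The plugin is missing, not executable, or took too long.
        return "UNKNOWN", f"could not run check: {exc}"
    # Any exit code outside 0-3 is treated as UNKNOWN.
    status = STATUS_NAMES.get(result.returncode, "UNKNOWN")
    lines = result.stdout.strip().splitlines()
    return status, (lines[0] if lines else "")


# Example call, using the plugin path and host from this post:
# run_check(["/usr/lib64/nagios/plugins/check_dns", "-H", "ns1.developcents.com"])
```

Running the plugin directly from a shell and then checking `echo $?` gets you the same information; the wrapper is only handy if you want to script several checks at once.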

Want to share some of your Nagios knowledge? Leave a comment.

Want me to help you with your Nagios – or other sysadmin – needs? Contact us today.
