Nagios: Difference between revisions

From CLONWiki
Boiarino (talk | contribs)

Latest revision as of 13:50, 28 September 2012

Nagios is the main monitoring tool for the CLON cluster.

Download the following files from the web and place them in '/usr/local/downloads':

 nagios-2.6.tar.gz
 nagios-images_0.3.tar.gz
 nagios-plugins-1.4.5.tar.gz
 nagiosmib-1.0.0.tar.gz (?????????)

Create user 'nagios' with private group 'nagios', then create the installation directory:

 mkdir /www/nagios2.6
 chown nagios:nagios /www/nagios2.6

Add the command file group and put the appropriate users into it (we assume that apache is running as user 'apache'):

 /usr/sbin/groupadd nagcmd
 # -a appends; without it, -G replaces the user's other supplementary groups
 /usr/sbin/usermod -a -G nagcmd apache
 /usr/sbin/usermod -a -G nagcmd nagios

To check, see file /etc/group.
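
The membership check can also be scripted rather than done by eye. The following is a minimal sketch, not part of the original procedure; the function name in_group is hypothetical, and it only inspects the explicit member list in the group file.

```shell
# Hypothetical helper: succeeds if user $1 is a listed member of group $2
# in group file $3 (defaults to /etc/group). Only the explicit member
# list (fourth field) is checked, not the user's primary group.
in_group() {
    awk -F: -v u="$1" -v g="$2" '
        $1 == g {
            n = split($4, members, ",")
            for (i = 1; i <= n; i++) if (members[i] == u) found = 1
        }
        END { exit !found }
    ' "${3:-/etc/group}"
}
```

For example, "in_group apache nagcmd && in_group nagios nagcmd" should succeed once both usermod commands have been run.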

Build and install:

 cp /usr/local/downloads/nagios-2.6.tar.gz /usr/local/src
 cd /usr/local/src
 gunzip nagios-2.6.tar.gz
 tar xvf nagios-2.6.tar
 cd '/usr/local/src/nagios-2.6'
 # do NOT 'su nagios' at this point
 ./configure --prefix=/www/nagios2.6 --with-cgiurl=/nagios/cgi-bin --with-htmurl=/nagios --with-nagios-user=nagios --with-nagios-group=nagios --with-command-group=nagcmd
 # (?) on RHEL6 the following was used instead:
 #   ./configure --prefix=/www/nagios-3.4.1 --with-cgiurl=/nagios/cgi-bin --with-htmurl=/nagios --with-nagios-user=nagios --with-nagios-group=nagcmd --with-command-group=nagcmd
 make all
 make install

Install the init script /etc/init.d/nagios (this must be done as 'root'):

 make install-init

Modify the /etc/init.d/nagios script as follows (possibly a bug in the script):

 ###NagiosRunFile=${prefix}/var/nagios.lock
 NagiosRunFile=${prefix}/var/run/nagios.pid
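
The same edit can be applied non-interactively. This is a sketch under the assumption that the init script contains the NagiosRunFile line exactly as shown above; the function name fix_runfile is hypothetical.

```shell
# Hypothetical helper: rewrite the NagiosRunFile assignment in init script $1.
# The old line is kept as a '###' comment, mirroring the manual edit.
# Assumes the original line matches exactly; otherwise nothing is changed.
fix_runfile() {
    script=$1
    tmp=$(mktemp) || return 1
    # Backslash-newline in the replacement inserts the new assignment
    # on its own line right after the commented-out one.
    sed 's|^NagiosRunFile=[$]{prefix}/var/nagios\.lock$|###&\
NagiosRunFile=${prefix}/var/run/nagios.pid|' "$script" > "$tmp" &&
        cat "$tmp" > "$script"   # cat keeps the script's owner and mode
    rm -f "$tmp"
}
```

Running "fix_runfile /etc/init.d/nagios" as root a second time is harmless, because the commented-out line no longer matches the pattern.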


 # install the sample 'etc' directory
 make install-config
 # (?) this installs and configures permissions on the
 # directory for holding the external command file
 make install-commandmode


Install the plugins and NRPE; see the corresponding sections.


Fix the apache configuration file:

 add contents of /usr/local/src/nagios-2.6/sample-config/httpd.conf
 to /www/apache2.2.3/conf/httpd.conf

Copy the 'etc' directory from the old Nagios installation (if any) to /www/nagios2.6. Go through the files cgi.cfg, nagios.cfg and private/* and fix the paths so that they point to /www/nagios2.6.
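
Fixing the paths by hand is error-prone, so a bulk replacement may help. The sketch below is an assumption-laden illustration: the function name repoint_prefix is hypothetical, and the old prefix must be supplied by you because this page does not say where the old installation lived.

```shell
# Hypothetical helper: replace every occurrence of old prefix $1 with new
# prefix $2 in the given files, keeping a .bak copy of each original.
# The old prefix is used as a sed pattern, so it must not contain '|',
# and a '.' in it matches any character.
repoint_prefix() {
    old=$1; new=$2; shift 2
    for f in "$@"; do
        cp -p "$f" "$f.bak" || return 1
        sed "s|$old|$new|g" "$f.bak" > "$f" || return 1
    done
}
```

A call might look like "repoint_prefix OLDPREFIX /www/nagios2.6 /www/nagios2.6/etc/cgi.cfg /www/nagios2.6/etc/nagios.cfg /www/nagios2.6/etc/private/*", with OLDPREFIX replaced by the real old location. Afterwards it is worth grepping the files for the old prefix to confirm nothing was missed.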

Add several directories for output files:

 mkdir /www/nagios2.6/var/log
 mkdir /www/nagios2.6/var/run
 mkdir /www/nagios2.6/var/rw (?????)

Install icons:

 cd /usr/local/src/nagios-images-0.3/base
 cp * /www/nagios2.6/share/images/logos

To check the configuration, run the following command:

 /www/nagios2.6/bin/nagios -v /www/nagios2.6/etc/nagios.cfg

To start or stop (restart still needs to be fixed):

 /etc/init.d/nagios start
 /etc/init.d/nagios stop

Add Nagios to services:

 chkconfig --add nagios
 chkconfig --level 3 nagios off
 chkconfig --level 4 nagios off
 chkconfig --list nagios

NOTE: the following command was executed to allow host checks to be disabled from the browser; this should be investigated further:

 chown nagios:nagcmd /www/nagios2.6/var/log/nagios.cmd

ADDITIONAL INFO (COPIED FROM http://klickitat.ee.washington.edu/medg/software/nagios-install-notes.txt):

Some notes on installing nagios. These are a supplement to the basic nagios documentation that comes with the software:

Download and install the nagios and nagios-plugin packages. For whatever reason www.nagios.org seems to be hosed now (7/23/2004), but look on google. There is also sourceforge.nagios.net, which seems to be another nagios homepage.

Create a nagios user. Compile and install the packages as described in the documentation. Redhat has all necessary libraries already installed. I just went with the defaults in the compilation. The default is for nagios to install itself in /usr/local/nagios; everything should be owned by user nagios.

To enable the web interface, edit httpd.conf to add the Alias and ScriptAlias directives as described in the nagios documentation. This works for both apache 1.3 and apache 2.0. Restart apache. At this point you should be able to go to http://www.whatever/nagios/ and see the nagios page and access the documentation. CGIs probably won't work.

You need to set up the config files; this is the real heart of installing nagios and unfortunately is much easier to show than to describe. The first step is to copy the *.cfg-sample files that should be in /usr/local/nagios/etc to *.cfg. Then you need to edit these files to describe your setup.

Basically, hosts.cfg describes the hosts you want to monitor; services.cfg describes the services you want to monitor on each host; checkcommands.cfg holds the check commands used by services.cfg to check the services (if you want to check a new service, you probably have to add a command for it); contacts.cfg lists the people who will be contacted in case of a problem; contactgroups.cfg defines the groups of people; and hostgroups.cfg defines the groups of hosts (rrsl-machines, for example). nagios.cfg is the master config file. You can probably get by with copying and pasting what is already in these files and tweaking it.

To add a new machine you will need to edit hosts.cfg (add the machine), hostgroups.cfg (put it in a hostgroup), services.cfg (add the services to be checked on the machine).
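
To illustrate the hosts.cfg part of adding a machine, the sketch below prints a host definition in the Nagios object syntax. It is an illustration only: the function name host_stanza is hypothetical, and the check command, contact group and timing values are placeholder assumptions that should be matched to the definitions already present in your files.

```shell
# Hypothetical helper: print a host definition for host $1 with address $2.
# check-host-alive, admins, 24x7 and the numeric values are placeholders;
# use whatever your existing hosts.cfg entries use.
host_stanza() {
    cat <<EOF
define host{
        host_name               $1
        alias                   $1
        address                 $2
        check_command           check-host-alive
        max_check_attempts      10
        notification_interval   120
        notification_period     24x7
        notification_options    d,u,r
        contact_groups          admins
        }
EOF
}
```

The output can be appended to hosts.cfg, for instance with "host_stanza newhost 192.0.2.10 >> hosts.cfg" (host name and address are made up). The host still has to be added to a hostgroup and given services, and the result checked with the 'nagios -v' command.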

To add a new administrator, you will need to edit contacts.cfg (add the new person) and contactgroups.cfg (put them in a contact group or create one).

In misccommands.cfg I needed to change /usr/bin/mail to /bin/mail on redhat -- but not on slackware! Otherwise it was not able to mail messages.

At this point you can check your config using the '/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg' command, which does a check of the configuration and will warn if there are errors. Fix the errors and repeat until it is happy.

Now you need to enable authorization so that the cgi scripts will work. To do this first create the .htaccess file in the nagios/sbin directory as described in the documentation - it must be world-readable.

Next create the htpasswd.users file in the nagios/etc directory as described in the documentation - it must be world-readable! I added only one user: rrsl

At this point you should be able to start nagios using: /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg

After nagios is started you should be able to go to the web page and use the cgis to display info. The final step is enabling the external command cgis, which let you change the behavior of nagios from the web. To do this you have to follow the steps in the documentation to enable external commands. This involves enabling external commands in the config file and specifying an external commands file.

The permissions on the file seem to be a source of problems. To enable external commands you first have to create a group containing the nagios user and the user httpd runs as (apache for us). Then you create the directory /usr/local/nagios/var/rw with permissions:

 drwxrwsr-x 2 nagios nagioscmd 4096 Aug 3 13:55 rw

That's rwx permissions for the user and rws permissions for the group:

 chown nagios:nagioscmd rw
 chmod gu+rwx rw
 chmod g+s rw

Then you have to restart both apache and nagios, or it doesn't work!
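
The directory setup described above can be sketched as one shell function. This is an assumption-based illustration, not the original author's script; the function name make_cmd_dir is hypothetical. Mode 2775 corresponds to the drwxrwsr-x listing.

```shell
# Hypothetical helper: create command directory $1 with mode 2775
# (rwx for user, rwx plus setgid for group, r-x for others).
# If a group name is given as $2, the owner is first set to nagios and
# that group, which normally requires root.
make_cmd_dir() {
    dir=$1; group=${2:-}
    mkdir -p "$dir" || return 1
    if [ -n "$group" ]; then
        # chown first, because changing ownership can clear special bits
        chown "nagios:$group" "$dir" || return 1
    fi
    chmod 2775 "$dir"
}
```

As root, "make_cmd_dir /usr/local/nagios/var/rw nagioscmd" would produce the layout shown in the listing; apache and nagios still need to be restarted afterwards.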

The documentation describes some other gotchas..

To run checks for services on remote machines, you need to set up ssh to log in without a password. To do this, run

 $ ssh-keygen -t rsa

to create a public/private key pair. Copy the public key into the ~/.ssh/authorized_keys file of the user nagios on the remote host. This file must be readable and writable only by nagios, or ssh will not work. The directory .ssh must likewise be rwx only by nagios. Then you should be able to ssh to the remote host as nagios without giving a password.
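
The permission requirements on the remote host can be captured in a short function. This is a minimal sketch; the function name secure_ssh_dir is hypothetical, and it is meant to be run as user nagios on the remote host before the public key is appended.

```shell
# Hypothetical helper: make sure .ssh under home directory $1 exists with
# mode 700 and that authorized_keys exists with mode 600, which are the
# "only nagios can access" permissions ssh expects.
secure_ssh_dir() {
    sshdir=$1/.ssh
    mkdir -p "$sshdir" || return 1
    chmod 700 "$sshdir" || return 1
    touch "$sshdir/authorized_keys" || return 1
    chmod 600 "$sshdir/authorized_keys"
}
```

After running "secure_ssh_dir ~nagios" on the remote host, the public key generated on the monitoring host can be appended to ~nagios/.ssh/authorized_keys, and "ssh nagios@REMOTE true" (REMOTE being a placeholder for the remote host name) should then return without prompting for a password.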

Copy the plugins you want to run on the remote host to the remote host, and then set up checks in checkcommands.cfg, and services.cfg. See the check-host-radar command for an example. The plugins can be any kind of program.

Many plugins were timing out after ten seconds when checking on heimdal and umtanum. Although there claimed to be a command line option to change this, in practice there was not. Therefore, I changed the source to set the timeout to 30sec and recompiled the plugins. This appears to work. (7/26/04)

nagios comes with the file /etc/rc.d/init.d/nagios, which is a script for starting nagios from rc.local or from the command line as root (or by sudo). This seems to be the best way of starting or stopping the program.

 sudo /etc/rc.d/init.d/nagios start
 sudo /etc/rc.d/init.d/nagios stop

etc.

Nagios has the ability to acknowledge a host condition: if a host is down, you can "acknowledge" it through the web interface, and it will stop sending email unless the host changes state. This is useful. You can also disable notification for a service, which is handy.


Adding a user to the web interface:

1) edit cgi.cfg to add the user to the actions that you want them to do
2) edit sbin/.htaccess to add the user to the list of ok users, eg:
   require user rrsl radar
   for users rrsl and radar being able to access the web interface
   Keep in mind this file must be world readable...
3) issue htpasswd /usr/local/nagios/etc/htpasswd.users <new user> as root, to
   create the new user and password
4) stop nagios
5) restart the web server
6) start nagios

It should work, you should be able to log in and do stuff.