Nagios
Latest revision as of 13:50, 28 September 2012
Nagios is the main monitoring tool for the CLON cluster.
Download the following files from the web and place them in '/usr/local/downloads':
 nagios-2.6.tar.gz
 nagios-images_0.3.tar.gz
 nagios-plugins-1.4.5.tar.gz
 nagiosmib-1.0.0.tar.gz (?????????)
Create user 'nagios' with private group 'nagios', then create the installation directory:
 mkdir /www/nagios2.6
 chown nagios:nagios /www/nagios2.6
Add the command-file group and put the appropriate users in it (we assume that Apache runs as user 'apache'). Note that -a appends; without it, -G replaces the user's existing supplementary groups:
 /usr/sbin/groupadd nagcmd
 /usr/sbin/usermod -a -G nagcmd apache
 /usr/sbin/usermod -a -G nagcmd nagios
To check, see the file /etc/group.
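Checking /etc/group by eye works, but the check can also be scripted. The following is only a sketch: the helper name in_group is made up here, and it looks only at the member list of a group-format file, so a user's primary group (recorded in /etc/passwd) is not considered.

```shell
#!/bin/sh
# in_group USER GROUP [FILE]: succeed if USER is listed as a member of GROUP
# in a file using the /etc/group format (name:passwd:gid:member,member,...).
# Hypothetical helper; primary groups (from /etc/passwd) are not considered.
in_group() {
    user=$1
    group=$2
    file=${3:-/etc/group}
    awk -F: -v u="$user" -v g="$group" '
        $1 == g {
            n = split($4, members, ",")
            for (i = 1; i <= n; i++)
                if (members[i] == u) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$file"
}

# Example against a throwaway file instead of the real /etc/group.
sample=$(mktemp)
printf '%s\n' 'nagios:x:500:' 'nagcmd:x:501:apache,nagios' > "$sample"
in_group apache nagcmd "$sample" && echo "apache is in nagcmd"
in_group root nagcmd "$sample" || echo "root is not in nagcmd"
```

On a real system the third argument can be omitted, e.g. in_group apache nagcmd.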
Build and install:
 cp /usr/local/downloads/nagios-2.6.tar.gz /usr/local/src
 cd /usr/local/src
 gunzip nagios-2.6.tar.gz
 tar xvf nagios-2.6.tar
 cd '/usr/local/src/nagios-2.6'
 # NO !!! do not 'su nagios' here
 ./configure --prefix=/www/nagios2.6 --with-cgiurl=/nagios/cgi-bin --with-htmurl=/nagios --with-nagios-user=nagios --with-nagios-group=nagios --with-command-group=nagcmd
 # ??did on RHEL6:
 # ./configure --prefix=/www/nagios-3.4.1 --with-cgiurl=/nagios/cgi-bin --with-htmurl=/nagios --with-nagios-user=nagios --with-nagios-group=nagcmd --with-command-group=nagcmd
 make all
 make install
Install init script /etc/init.d/nagios:
make install-init (as 'root' !!!)
Modify the /etc/init.d/nagios script as follows (bug ?):
 ###NagiosRunFile=${prefix}/var/nagios.lock
 NagiosRunFile=${prefix}/var/run/nagios.pid
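The same edit can be applied with sed instead of by hand. This is a sketch run against a two-line stand-in file, not the real init script; it assumes the original line appears exactly as shown above, with no leading whitespace.

```shell
#!/bin/sh
# Stand-in for /etc/init.d/nagios: only the line of interest is reproduced.
script=$(mktemp)
cat > "$script" <<'EOF'
prefix=/www/nagios2.6
NagiosRunFile=${prefix}/var/nagios.lock
EOF

# Comment out the old assignment and add the new one right after it.
# The temp-file-and-move pattern avoids relying on sed -i, whose syntax
# differs between GNU and BSD sed.
tmp=$(mktemp)
sed 's|^NagiosRunFile=\${prefix}/var/nagios\.lock$|###&\
NagiosRunFile=${prefix}/var/run/nagios.pid|' "$script" > "$tmp" && mv "$tmp" "$script"

cat "$script"
```

Keeping the old line as a comment mirrors the manual edit and makes the change easy to revert.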
 #to install sample /etc directory
 make install-config
 #???This installs and configures permissions on the
 #???directory for holding the external command file
 make install-commandmode
Install plugins and nrpe - see corresponding sections.
Fix the Apache configuration file:
Add the contents of /usr/local/src/nagios-2.6/sample-config/httpd.conf to /www/apache2.2.3/conf/httpd.conf
Copy the 'etc' directory from the old Nagios installation (if any) to /www/nagios2.6. Go through the files cgi.cfg, nagios.cfg and private/* and fix the paths so that they point to /www/nagios2.6.
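If the old installation used one consistent prefix, the path fix can be done with sed and then verified with grep. The sketch below works on a scratch directory; the old prefix /www/nagios-old and the three sample lines are invented for illustration, and only the file layout (cgi.cfg, nagios.cfg, private/*) comes from this page.

```shell
#!/bin/sh
# OLD is a made-up example prefix; NEW is the prefix used on this page.
OLD=/www/nagios-old
NEW=/www/nagios2.6

# Work on a scratch copy so nothing real is touched.
etc=$(mktemp -d)
mkdir "$etc/private"
echo "log_file=$OLD/var/log/nagios.log" > "$etc/nagios.cfg"
echo "main_config_file=$OLD/etc/nagios.cfg" > "$etc/cgi.cfg"
echo "\$USER1\$=$OLD/libexec" > "$etc/private/resource.cfg"

# Rewrite the prefix in every file, then list any lines still mentioning it.
for f in "$etc/cgi.cfg" "$etc/nagios.cfg" "$etc"/private/*; do
    sed "s|$OLD/|$NEW/|g" "$f" > "$f.new" && mv "$f.new" "$f"
done
grep -rn "$OLD" "$etc" || echo "no references to $OLD remain"
```

The final grep is the useful part: any line it prints still needs to be fixed by hand.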
Add several directories for output files:
 mkdir /www/nagios2.6/var/log
 mkdir /www/nagios2.6/var/run
 mkdir /www/nagios2.6/var/rw (?????)
Install icons:
 cd /usr/local/src/nagios-images-0.3/base
 cp * /www/nagios2.6/share/images/logos
To check the configuration, run the following command:
/www/nagios2.6/bin/nagios -v /www/nagios2.6/etc/nagios.cfg
To start/stop (restart needs to be fixed ..):
 /etc/init.d/nagios start
 /etc/init.d/nagios stop
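The check and the start can be combined so that Nagios is never started with a broken configuration. This is a sketch only: the function name safe_start and the NAGIOS_* override variables are invented here, and it assumes that the -v check exits with a non-zero status when it finds errors.

```shell
#!/bin/sh
# safe_start: run the pre-flight check and start Nagios only if it passes.
# NAGIOS_BIN, NAGIOS_CFG and NAGIOS_INIT default to the paths used on this
# page; the function name and variable names are illustrative.
safe_start() {
    bin=${NAGIOS_BIN:-/www/nagios2.6/bin/nagios}
    cfg=${NAGIOS_CFG:-/www/nagios2.6/etc/nagios.cfg}
    init=${NAGIOS_INIT:-/etc/init.d/nagios}

    if "$bin" -v "$cfg" > /dev/null 2>&1; then
        "$init" start
    else
        echo "configuration check failed; not starting" >&2
        return 1
    fi
}
```

The check output is discarded here; when the function fails, rerun the -v command by hand to see the actual errors.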
Add Nagios to services:
 chkconfig --add nagios
 chkconfig --level 3 nagios off
 chkconfig --level 4 nagios off
 chkconfig --list nagios
NOTE: the following command was executed to allow host checks to be disabled from the browser; this should be investigated ...
chown nagios:nagcmd /www/nagios2.6/var/log/nagios.cmd
ADDITIONAL INFO (COPIED FROM http://klickitat.ee.washington.edu/medg/software/nagios-install-notes.txt):
Some notes on installing nagios. These are a supplement to the basic nagios documentation that comes with the software:
Download and install the nagios and nagios-plugin packages. For whatever reason www.nagios.org seems to be hosed now (7/23/2004), but look on google. There is also sourceforge.nagios.net, which seems to be another nagios homepage.
Create a nagios user. Compile and install the packages as described in the documentation. Redhat has all necessary libraries already installed. I just went with the defaults in the compilation. The default is for nagios to install itself in /usr/local/nagios; everything should be owned by user nagios.
To enable the web interface, edit httpd.conf to add the Alias and ScriptAlias directives as described in the nagios documentation. This works for both apache 1.3 and apache 2.0. Restart apache. At this point you should be able to go to http://www.whatever/nagios/ and see the nagios page and access the documentation. CGIs probably won't work.
You need to set up the config files; this is the real heart of installing nagios and unfortunately is much easier to show than to describe. The first step is to copy the *.cfg-sample files that should be in /usr/local/nagios/etc to *.cfg. Then you need to edit these files to describe your setup.
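The copy step can be done in one loop. A minimal sketch, assuming the samples are named NAME.cfg-sample as described above; the function name copy_samples is made up, and the demonstration runs in a scratch directory rather than /usr/local/nagios/etc.

```shell
#!/bin/sh
# copy_samples DIR: for each DIR/NAME.cfg-sample create DIR/NAME.cfg,
# unless NAME.cfg already exists (existing files are left alone).
copy_samples() {
    dir=$1
    for sample in "$dir"/*.cfg-sample; do
        [ -e "$sample" ] || continue      # glob matched nothing
        target=${sample%-sample}
        [ -e "$target" ] || cp "$sample" "$target"
    done
}

# Demonstration in a scratch directory rather than /usr/local/nagios/etc.
demo=$(mktemp -d)
touch "$demo/hosts.cfg-sample" "$demo/services.cfg-sample"
echo "keep me" > "$demo/services.cfg"
copy_samples "$demo"
ls "$demo"
```

Skipping files that already exist means the loop can be rerun after an upgrade without overwriting edited configs.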
Basically, hosts.cfg describes the hosts you want to monitor, and services.cfg describes the services you want to monitor on each host. checkcommands.cfg holds the check commands used by services.cfg to check the services; if you want to check a new service you probably have to add a command to do so. contacts.cfg lists the people who will be contacted in case of a problem, contactgroups.cfg the groups of people, and hostgroups.cfg the groups of hosts (rrsl-machines, for example). nagios.cfg is the master config file. You can probably get by just by copying and pasting the stuff already in these files and tweaking it.
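To make the relationship between these files concrete, here is a sketch that writes one small fragment per file into a scratch directory. The host name, address and group name are invented, and the fragments are deliberately incomplete: real definitions need further directives (notification settings, contact groups, check intervals) that depend on the Nagios version. check-host-alive is assumed to be defined in checkcommands.cfg, as it is in the sample files.

```shell
#!/bin/sh
# Writes illustrative fragments to a scratch directory. Host name, address
# and group name are invented; real definitions need further directives
# (notification settings, contact groups, check intervals) or a template.
out=$(mktemp -d)

cat > "$out/hosts.cfg" <<'EOF'
define host{
    host_name       examplehost
    alias           Example host
    address         192.0.2.10
    check_command   check-host-alive
}
EOF

cat > "$out/hostgroups.cfg" <<'EOF'
define hostgroup{
    hostgroup_name  example-machines
    alias           Example machines
    members         examplehost
}
EOF

cat > "$out/services.cfg" <<'EOF'
define service{
    host_name            examplehost
    service_description  PING
    check_command        check_ping!100.0,20%!500.0,60%
}
EOF

cat > "$out/checkcommands.cfg" <<'EOF'
define command{
    command_name  check_ping
    command_line  $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}
EOF
```

The chain to follow is: the service names a host and a check command, the text after each "!" fills $ARG1$ and $ARG2$ in the command line, and the hostgroup refers to the host by its host_name.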
To add a new machine you will need to edit hosts.cfg (add the machine), hostgroups.cfg (put it in a hostgroup), services.cfg (add the services to be checked on the machine).
To add a new administrator, you will need to edit contacts.cfg (add the new person) and contactgroups.cfg (put them in a contact group or create one).
In misccommands.cfg I needed to change /usr/bin/mail to /bin/mail on redhat -- but not on slackware! Otherwise it was not able to mail messages.
At this point you can check your config using the '/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg' command, which does a check of the configuration and will warn if there are errors. Fix the errors and repeat until it is happy.
Now you need to enable authorization so that the cgi scripts will work. To do this first create the .htaccess file in the nagios/sbin directory as described in the documentation - it must be world-readable.
Next create the htpasswd.users file in the nagios/etc directory as described in the documentation - it must be world-readable! I added only one user: rrsl
At this point you should be able to start nagios using:
 /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
After nagios is started you should be able to go to the web page and use the cgis to display info. The final step is enabling the external command cgis, which let you change the behavior of nagios from the web. To do this, follow the steps in the documentation to enable external commands. This involves enabling external commands in the config file and specifying an external command file.
The permissions on the file seem to be a source of problems. To enable external commands you have to first create a group containing the nagios user and the user httpd runs as (apache for us). Then you create the directory /usr/local/nagios/var/rw with permissions:
 drwxrwsr-x 2 nagios nagioscmd 4096 Aug 3 13:55 rw
That's rwx permissions for the user and rws permissions for the group:
 chown nagios:nagioscmd rw
 chmod ug+rwx rw
 chmod g+s rw
Then you have to restart both apache and nagios, or it doesn't work!
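The permission bits can be tried out safely on a scratch directory before touching the real one. This is a sketch: the chown step is left as a comment because it needs root and an existing nagioscmd group, and the setgid bit is only kept if the user running the script belongs to the directory's group.

```shell
#!/bin/sh
# Scratch stand-in for /usr/local/nagios/var/rw. The chown step is left as a
# comment because it needs root and an existing nagioscmd group.
rw=$(mktemp -d)/rw
mkdir "$rw"
# chown nagios:nagioscmd "$rw"
chmod ug+rwx,o=rx "$rw"   # rwx for user and group, r-x for others
chmod g+s "$rw"           # setgid: files created inside inherit the group
ls -ld "$rw"
```

The setgid bit is what makes the command file created inside the directory belong to the shared group, so that both nagios and the web server user can reach it.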
The documentation describes some other gotchas..
To run checks for services on remote machines, you need to set up ssh to log in without a password. To do this run
 $ ssh-keygen -t rsa
to create a public/private key pair. Copy the public key into the ~/.ssh/authorized_keys file of the user nagios on the remote host. This file must be readable and writable only by nagios, or ssh will not work. The directory .ssh must also be rwx only by nagios. Then you should be able to ssh to the remote host as nagios without giving a password.
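The permission requirements on the remote side translate to modes 700 and 600. A sketch, using a scratch directory in place of the nagios user's home and a placeholder string in place of a real public key:

```shell
#!/bin/sh
# HOME_DIR stands in for the nagios user's home directory on the remote
# host; a scratch directory is used here so the example is harmless.
HOME_DIR=$(mktemp -d)
PUBKEY='ssh-rsa AAAA...example nagios@monitor'   # placeholder, not a real key

mkdir -p "$HOME_DIR/.ssh"
printf '%s\n' "$PUBKEY" >> "$HOME_DIR/.ssh/authorized_keys"
chmod 700 "$HOME_DIR/.ssh"                  # rwx for the owner only
chmod 600 "$HOME_DIR/.ssh/authorized_keys"  # rw for the owner only
ls -ld "$HOME_DIR/.ssh" "$HOME_DIR/.ssh/authorized_keys"
```

Appending with >> rather than overwriting keeps any keys that are already authorized.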
Copy the plugins you want to run on the remote host to the remote host, and then set up checks in checkcommands.cfg, and services.cfg. See the check-host-radar command for an example. The plugins can be any kind of program.
Many plugins were timing out after ten seconds when checking on heimdal and umtanum. Although there was supposedly a command line option to change this, in practice there was not. Therefore, I changed the source to set the timeout to 30 sec and recompiled the plugins. This appears to work. (7/26/04)
nagios comes with the file /etc/rc.d/init.d/nagios, which is a script for starting nagios from rc.local or from the command line as root (or by sudo). This seems to be the best way of starting or stopping the program.
 sudo /etc/rc.d/init.d/nagios start
 sudo /etc/rc.d/init.d/nagios stop
etc.
nagios has the ability to acknowledge a host condition, so if a host is down, you can "acknowledge" through the web interface, and it will stop sending email unless the host changes state. This is useful. You can also disable notification for a service, which is handy.
Adding a user to the web interface:
1) edit cgi.cfg to add the user to the actions that you want them to do
2) edit sbin/.htaccess to add the user to the list of ok users, eg:
 require user rrsl radar
for users rrsl and radar being able to access the web interface. Keep in mind this file must be world readable...
3) issue htpasswd /usr/local/nagios/etc/htpasswd.users <new user> as root, to create the new user and password
4) stop nagios
5) restart the web server
6) start nagios
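Steps 3 to 6 are always the same commands, so they can be generated for a given user name. The sketch below only prints the commands (run them as root after reviewing them). The function name add_web_user is made up, and the Apache init script path /etc/rc.d/init.d/httpd is an assumption, since the notes do not say how the web server is restarted.

```shell
#!/bin/sh
# add_web_user USER: print the commands for steps 3 to 6 above.
# The Apache init script path is an assumption; it is not given in the notes.
add_web_user() {
    user=$1
    echo "htpasswd /usr/local/nagios/etc/htpasswd.users $user"
    echo "/etc/rc.d/init.d/nagios stop"
    echo "/etc/rc.d/init.d/httpd restart"
    echo "/etc/rc.d/init.d/nagios start"
}

add_web_user radar
```

Steps 1 and 2 (cgi.cfg and sbin/.htaccess) still have to be edited by hand, because what to add depends on which actions the user should be allowed.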
It should work, you should be able to log in and do stuff.