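Nagios Installation Guide

For operations staff, knowing the state of their servers is essential. Among monitoring tools, Cacti and Nagios are two of the better choices. Cacti is simpler to use and is mainly a data collector: its CPU, memory, and traffic figures are more detailed. Nagios is mainly an alerting tool. It is harder to configure than Cacti and collects less complete data, but it has the alerting that Cacti lacks: when an alert condition is triggered, Nagios sends an email or a text message. Combining the two is the best option.

This article covers installing the Nagios monitoring platform and adding monitored hosts (Linux and Windows), which is enough for day-to-day alerting. A more detailed Cacti installation guide, and the integration of Nagios with Cacti, will follow in later articles.

The basic packages to install are:

    nagios-3.2.1.tar.gz
    nrpe-2.12.tar.gz
    httpd-2.2.21.tar.bz2
    nagios-plugins-1.4.14.tar.gz

Step 1: Install Apache

Unpack the source:

    tar jxvf httpd-2.2.21.tar.bz2

Configure and build Apache with the options you need. CGI support is required:

    ./configure --prefix=/usr/local/apache --enable-so --enable-track-vars --enable-rewrite --with-zlib --enable-mods-shared=most --enable-cgi --enable-cgid --with-suexec-caller=apache

Once configure finishes without errors:

    make && make install

Do not start Apache yet; httpd.conf still needs to be edited.

Step 2: Install Nagios

Installing Nagios itself is simple. Most of the work is in its many plugins: installing and configuring them is fiddly and needs some care.

1. Create the nagios user

    useradd nagios

Create a group that allows external Nagios commands to be run from the web interface, and add the nagios and apache users to it:

    groupadd nagcmd
    usermod -G nagcmd nagios
    usermod -G nagcmd apache

2. Install Nagios

Unpack the source:

    tar zxvf nagios-3.2.1.tar.gz

Configure, build, and install:

    ./configure --prefix=/usr/local/nagios --with-command-group=nagcmd --with-httpd-conf=/usr/local/apache/conf
    make all
    make install
    make install-init
    make install-commandmode
    make install-config
    make install-webconf

List the contents of /usr/local/nagios:

    bin  etc  libexec  sbin  share  var

If these directories are present, Nagios is installed.

The Apache conf directory now contains a new file, nagios.conf, which ties Apache and Nagios together. Append it to httpd.conf:

    cat nagios.conf >> httpd.conf

That file contains the line

    AuthUserFile /usr/local/nagios/etc/htpasswd.users

which names the file holding the user names and passwords required to log in to Nagios. Add a user named test, which will be used for web access:

    [root@nagios-server conf]# /usr/local/apache/bin/htpasswd -c /usr/local/nagios/etc/htpasswd.users test
    New password:
    Re-type new password:
    Adding password for user test

Change the owner and group of both installation trees:

    chown -R apache:apache /usr/local/apache/
    chown -R nagios:nagcmd /usr/local/nagios/

3. Install nagios-plugins

    tar zxvf nagios-plugins-1.4.14.tar.gz

Configure, build, and install the plugin collection:

    ./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios
    make && make install

Afterwards, the libexec directory under /usr/local/nagios contains the Nagios plugin programs.

Now start Apache and Nagios. Start Apache first:

    /usr/local/apache/bin/apachectl -k start

Before starting Nagios, check the configuration for warnings and errors:

    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Result:

    Total Warnings: 0
    Total Errors:   0

Then start Nagios and open the web interface to see whether the main page loads:

    [root@nagios-server conf]# service nagios start
    Starting nagios: done.

The web interface is at http://IP/nagios. Log in with the test user created above.

If you get an error page, or the browser shows PHP source code instead of the page, PHP support is missing. The Nagios web directory contains both CGI files and PHP files, and without PHP support the pages are served as raw source. The fix is to install PHP and then adjust the Apache configuration.

Unpack and configure PHP:

    tar zxvf php-5.3.6.tar.gz
    ./configure --prefix=/usr/local/php --with-apxs2=/usr/local/apache/bin/apxs --with-libxml-dir --with-png-dir --with-jpeg-dir --with-zlib --with-freetype-dir --with-gd-dir --enable-mbstring=all
    make
    make test

make test prints its progress as it runs, for example:

    TEST 6175/8798 [ext/standard/tests/file/005_variation.phpt]

Then:

    make install

If your libtool versions do not match, this step fails with an error such as:

    /usr/local/apache/modules/libphp5.so": No such file or directory

To fix it, remove the system's current libtool:

    [root@nagios-server build]# rpm -qa | grep libtool
    libtool-ltdl-2.2.6-15.5.el6.x86_64
    libtool-2.2.6-15.5.el6.x86_64

    rpm -e libtool-2.2.6-15.5.el6.x86_64 --nodeps

Copy the libtool from the build directory under the Apache installation directory into the PHP source directory (the directory created when the tarball was unpacked):

    cp -rf libtool /home/software/php-5.3.6

Then run make clean, rerun ./configure with the same options as before, and build again:

    make clean
    ./configure
    make
    make install

Once that succeeds, edit the Apache configuration file so that it contains:

    User apache
    Group apache
    DirectoryIndex index.php index.html
    AddType application/x-httpd-php .php .phtml
    AddHandler cgi-script .cgi

Restart Apache:

    /usr/local/apache/bin/apachectl -k restart

Reload the page. It should now display correctly, which means Nagios and Apache are working together. What remains is the Nagios plugins and the monitoring configuration.

Step 3: Configure the monitoring definitions on the Nagios host

1. Edit the Nagios configuration file

    cfg_file=/usr/local/nagios/etc/objects/localhost.cfg

This is the default configuration for the Nagios host itself. Every host you add has to be configured here.

    cfg_dir=/usr/local/nagios/etc/servers

This is the directory for the configuration files of monitored hosts.

The remaining entries (switch, routers, printer) are not used for now, so changing these two is enough.

2. Edit the contacts file, contacts.cfg

    define contact{
        contact_name                    test
        alias                           admin
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,u,r
        service_notification_commands   notify-service-by-email
        host_notification_commands      notify-host-by-email
        email                           frank@51coolbar.com
        }

For additional contacts, copy this block and change the contact name and email.

    define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 test
        }

This defines the group. members can list several contacts, separated by commas.

3. Edit localhost.cfg

    define host{
        host_name               192.168.20.221
        alias                   local-service
        address                 192.168.20.221
        contact_groups          admins
        max_check_attempts      5
        notification_interval   200
        notification_options    d,u,r
        }

Leave the rest of the file unchanged.

4. Time periods: the defaults are fine

    define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

5. Edit the command definitions

Add the following entry and leave everything else unchanged:

    # check nrpe
    define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

With these changes the basic Nagios platform is in place. Run

    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

to check whether any configuration file produces errors or warnings:

    Total Warnings: 0
    Total Errors:   0

Resolve any warnings or errors that are reported. Once the check is clean, start Nagios:

    service nagios start
    Starting nagios: done.

Step 4: Install NRPE

Here NRPE is being installed on the Nagios server itself, so nagios-plugins does not need to be installed again. When installing NRPE on a monitored host, first create the nagios user and install nagios-plugins.

From the NRPE source directory, configure, build, and install:

    ./configure --prefix=/usr/local/nagios/
    make all
    make install-plugin
    make install-daemon
    make install-daemon-config

Edit the NRPE configuration file, nrpe.cfg. allowed_hosts must be set to the IP address of the Nagios server:

    allowed_hosts=127.0.0.1

    command[check_users]=/usr/local/nagios/libexec/check_users -w 3 -c 6
    command[check_load]=/usr/local/nagios/libexec/check_load -w 9,7,6 -c 20,15,10
    command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 15% -p / -u GB
    command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 1 -c 3 -s Z
    command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
    command[check_ping]=/usr/local/nagios/libexec/check_ping -w 100,20% -c 200,50%
    command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%

Start the NRPE daemon:

    /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d

Check that it is listening:

    [root@server221 etc]# netstat -anpt | grep nrpe
    tcp        0      0 0.0.0.0:5666        0.0.0.0:*        LISTEN      8049/nrpe

Test that NRPE answers on the local machine:

    [root@server221 etc]# /usr/local/nagios/libexec/check_nrpe -H localhost
    NRPE v2.12

The information monitored by the Nagios server is now visible in the web interface.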
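
This article runs the nagios -v configuration check by hand before each start and reads the two summary lines by eye. As a minimal sketch, that check can be turned into a gate in a shell script. The function name verify_summary_ok is hypothetical, and the sketch assumes the summary lines look like the "Total Warnings: 0" and "Total Errors: 0" output shown in this article.

```shell
#!/bin/sh
# Sketch only: verify_summary_ok is a hypothetical helper, not part of Nagios.
# It reads the output of "nagios -v" on stdin and succeeds only when both
# summary counters are present and equal to zero.
#
# Exit status: 0 = clean, 1 = warnings or errors reported,
#              2 = summary lines not found (treated as a failure).
verify_summary_ok() {
    awk '
        /^Total Warnings:/ { warnings = $3; seen_w = 1 }
        /^Total Errors:/   { errors   = $3; seen_e = 1 }
        END {
            if (!seen_w || !seen_e) exit 2
            if (warnings + 0 != 0 || errors + 0 != 0) exit 1
            exit 0
        }
    '
}

# Example use with the paths from this article (left commented out so that
# sourcing this file has no side effects):
#
#   if /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | verify_summary_ok
#   then
#       service nagios start
#   else
#       echo "configuration check failed; Nagios not started" >&2
#   fi
```

Treating a missing summary as a failure is a deliberate choice in this sketch: if the nagios binary cannot be run at all, the script should not go on to start the service.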