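Nagios is a free, open-source, enterprise-grade monitoring tool. Its focus is keeping services running normally and raising alerts when a service develops a problem.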

1. Lab environment

Nagios server: 10.20.2.233

Monitored hosts: web1 (10.20.2.235), web2 (10.20.2.236)

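2. Nagios server deployment

1) Install the Nagios dependencies

Use yum to install the packages that Nagios depends on.

    yum -y install gd gd-devel openssl openssl-devel httpd php gcc glibc glibc-common make net-snmp wget

2) Create the nagios account and group

At configure time, --with-nagios-user and --with-nagios-group specify the account and group that Nagios runs as. On yum-based systems, useradd creates a matching nagios group by default.

    useradd nagios

3) Download locations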

Nagios:

http://superb-sea2.dl.sourceforge.net/project/nagios/nagios-4.x/nagios-4.2.1/nagios-4.2.1.tar.gz

Nagios-plugin:

https://nagios-plugins.org/download/nagios-plugins-2.1.2.tar.gz

Nrpe:

http://pilotfiber.dl.sourceforge.net/project/nagios/nrpe-3.x/nrpe-3.0.1.tar.gz

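4) Install Nagios

    tar -zxf nagios-4.2.1.tar.gz -C /usr/local
    cd /usr/local
    cd nagios-4.2.1/
    ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    make all
    make install              # install the main program, CGIs, and HTML files
    make install-init         # install the init script /etc/init.d/nagios
    make install-commandmode  # install and configure directory permissions
    make install-config       # install the sample configuration files
    # Nagios is ultimately managed and monitored through a web interface;
    # make install-webconf generates the additional Apache configuration
    # file /etc/httpd/conf.d/nagios.conf
    make install-webconf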

5) Install the Nagios plugins and NRPE

    tar -zxf nagios-plugins-2.1.2.tar.gz -C /usr/local
    cd /usr/local/nagios-plugins-2.1.2/
    ./configure --prefix=/usr/local/nagios
    make
    make install

    tar -zxf nrpe-3.0.1.tar.gz -C /usr/local/
    cd /usr/local
    cd nrpe-3.0.1/
    ./configure --prefix=/usr/local/nagios
    make all
    make install-plugin
    make install-daemon
    make install-daemon-config
    chown nagios:nagios -R /usr/local/nagios

6) Disable SELinux and stop the firewall

    setenforce 0
    service iptables stop

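7) Create the web access account

The password file must be the one named by AuthUserFile in /etc/httpd/conf.d/nagios.conf; with the default install prefix that is /usr/local/nagios/etc/htpasswd.users.

    htpasswd -c /usr/local/nagios/etc/htpasswd.users tomcat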

8) Start Nagios

    /etc/init.d/httpd start
    /etc/init.d/nagios start

9) Edit the Nagios configuration files

Main configuration file: nagios.cfg

The main configuration file loads other configuration files through the cfg_file directive. To keep things manageable, each monitored host gets its own file: web1.cfg for 10.20.2.235 and web2.cfg for 10.20.2.236.

    vi /usr/local/nagios/etc/nagios.cfg

    cfg_file=/usr/local/nagios/etc/objects/commands.cfg
    cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
    cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
    cfg_file=/usr/local/nagios/etc/objects/templates.cfg
    # Definitions for monitoring the local (Linux) host
    cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
    # The two files below must be created by hand; they monitor the two web servers
    cfg_file=/usr/local/nagios/etc/web1.cfg
    cfg_file=/usr/local/nagios/etc/web2.cfg
    ……
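Because web1.cfg and web2.cfg have to be created by hand, it is easy to reference a file in nagios.cfg that does not exist yet. The sketch below lists every uncommented cfg_file entry whose target is missing. The function name missing_cfg_files is a hypothetical helper, not part of Nagios.

```shell
#!/bin/sh
# Print each cfg_file target in a Nagios main config that does not exist.
# missing_cfg_files is a hypothetical helper name used for illustration.
missing_cfg_files() {
    main_cfg=$1
    # sed keeps only lines that start with "cfg_file=" (commented lines
    # start with "#", so they are skipped) and strips the key.
    sed -n 's/^cfg_file=//p' "$main_cfg" | while IFS= read -r path; do
        [ -f "$path" ] || echo "$path"
    done
}
```

It could be called as missing_cfg_files /usr/local/nagios/etc/nagios.cfg; empty output means every referenced file is present.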

Edit the CGI configuration file (cgi.cfg) and add the account used to access the web pages.

    vi /usr/local/nagios/etc/cgi.cfg

    use_authentication=1
    authorized_for_system_information=nagiosadmin,tomcat
    authorized_for_configuration_information=nagiosadmin,tomcat
    authorized_for_system_commands=nagiosadmin,tomcat
    authorized_for_all_services=nagiosadmin,tomcat
    authorized_for_all_hosts=nagiosadmin,tomcat
    authorized_for_all_service_commands=nagiosadmin,tomcat
    authorized_for_all_host_commands=nagiosadmin,tomcat
    ……

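Edit the command configuration file (commands.cfg). This file defines how each command is actually carried out, for example which tool sends alert e-mails and how the message body is formatted.

    vi /usr/local/nagios/etc/objects/commands.cfg

    ……
    define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
    }
    ……
    # Add the following by hand; it is used for remote host monitoring
    # and requires the nrpe package
    define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    }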

Edit the NRPE configuration file (nrpe.cfg), which defines the commands used to monitor remote hosts.

    vi /usr/local/nagios/etc/nrpe.cfg

    command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
    command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
    command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
    command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
    command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
    # The next line is added by hand
    command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10%
    ……

Edit the host monitoring configuration file (localhost.cfg), which controls how the resources of the local server are monitored.

    vi /usr/local/nagios/etc/objects/localhost.cfg

    ……
    define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       localhost
        alias           localhost
        address         127.0.0.1
    }
    ……
    define hostgroup{
        hostgroup_name  linux-servers   ; The name of the hostgroup
        alias           Linux Servers   ; Long name of the group
        members         localhost       ; Comma separated list of hosts that belong to this group
    }
    ……

Create the remote monitoring configuration files web1.cfg and web2.cfg, which monitor the system resources and services of the remote servers; localhost.cfg can serve as a template. The full contents of web1.cfg follow. For web2.cfg, copy web1.cfg and change the host name and IP address.

    define host{
        use             linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
        host_name       web1
        alias           test.com
        address         10.20.2.235
    }

    define hostgroup{
        hostgroup_name  webs            ; The name of the hostgroup
        alias           Linux Servers   ; Long name of the group
        members         web1            ; Comma separated list of hosts that belong to this group
    }

    define service{
        use                     generic-service     ; Name of service template to use
        host_name               web1
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        notifications_enabled   1
    }

    define service{
        use                     generic-service     ; Name of service template to use
        host_name               web1
        service_description     Sys_Load
        check_command           check_nrpe!check_load
        notifications_enabled   1
    }

    define service{
        use                     generic-service     ; Name of service template to use
        host_name               web1
        service_description     Current Users
        check_command           check_nrpe!check_users
        notifications_enabled   1
    }

    define service{
        use                     generic-service     ; Name of service template to use
        host_name               web1
        service_description     Total Processes
        check_command           check_nrpe!check_total_procs
        notifications_enabled   1
    }

    define service{
        use                     generic-service     ; Name of service template to use
        host_name               web1
        service_description     SSH
        check_command           check_ssh
        notifications_enabled   1
    }

    define service{
        use                     generic-service     ; Name of service template to use
        host_name               web1
        service_description     HTTP
        check_command           check_http
        notifications_enabled   1
    }
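Since web2.cfg differs from web1.cfg only in host name and IP address, the copy can be produced mechanically. The following is a minimal sketch, assuming the host name appears in the file only where it should be replaced; clone_host_cfg is a hypothetical helper, not a Nagios tool.

```shell
#!/bin/sh
# Write a copy of an existing host config to stdout, with the host name
# and IP address swapped. clone_host_cfg is a hypothetical helper name.
clone_host_cfg() {
    src=$1; old_name=$2; old_ip=$3; new_name=$4; new_ip=$5
    # Escape the dots so sed matches the old IP literally.
    old_ip_re=$(printf '%s' "$old_ip" | sed 's/\./\\./g')
    sed -e "s/$old_ip_re/$new_ip/g" \
        -e "s/$old_name/$new_name/g" "$src"
}
```

For the two hosts in this article the call would be clone_host_cfg web1.cfg web1 10.20.2.235 web2 10.20.2.236 > web2.cfg, run from /usr/local/nagios/etc. The webs hostgroup block is copied as well, so it is worth reviewing the result by hand; Nagios may complain if the same hostgroup is defined in both files.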

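10) Reload the Nagios configuration

The remaining configuration files need no changes and can be used as they are. Restart Nagios to reload all of the configuration.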
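Before restarting, it can be worth running the pre-flight check that Nagios provides through its -v option, which parses the configuration and reports errors without starting the daemon. The sketch below assumes the default install paths used in this article; verify_nagios_config and the NAGIOS_BIN and NAGIOS_CFG variables are hypothetical names.

```shell
#!/bin/sh
# Run the Nagios pre-flight check and report whether a restart is safe.
# verify_nagios_config, NAGIOS_BIN and NAGIOS_CFG are hypothetical names;
# the defaults assume the /usr/local/nagios prefix used in this article.
verify_nagios_config() {
    bin=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
    cfg=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
    if "$bin" -v "$cfg"; then
        echo "configuration OK"
    else
        echo "configuration check failed; fix the errors before restarting" >&2
        return 1
    fi
}
```

If the function returns non-zero, the output of nagios -v names the file and object at fault.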

    /etc/init.d/nagios restart

3. Deploying the monitored hosts

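The steps below use web1 as the example; web2 is set up the same way.

1) Install the packages the Nagios plugins depend on with yum

    yum -y install openssl openssl-devel

2) Create the nagios user and group

    useradd -s /sbin/nologin nagios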

3) Install the Nagios plugins

    tar -zxf nagios-plugins-2.1.2.tar.gz -C /usr/local
    cd /usr/local/
    cd nagios-plugins-2.1.2/
    ./configure
    make
    make install

4) Install NRPE

    tar -zxf nrpe-3.0.1.tar.gz -C /usr/local
    cd /usr/local/nrpe-3.0.1/
    ./configure
    make all
    make install-plugin
    make install-daemon
    make install-daemon-config
    chown -R nagios:nagios /usr/local/nagios

5) Edit the NRPE configuration file

The allowed_hosts entry must include the address of the Nagios server (10.20.2.233); otherwise NRPE rejects its requests.

    cp /usr/local/nrpe-3.0.1/sample-config/nrpe.cfg /usr/local/nagios/etc/
    vi /usr/local/nagios/etc/nrpe.cfg

    ……
    allowed_hosts=127.0.0.1,10.20.2.233
    ……
    command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
    command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
    command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
    command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
    command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
    # The next line is added by hand
    command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10%
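When the same change has to be made on several monitored hosts, the allowed_hosts line can be set non-interactively instead of through vi. This is a minimal sketch, assuming the file already contains an uncommented allowed_hosts= line as the sample configuration does; set_allowed_hosts is a hypothetical helper name.

```shell
#!/bin/sh
# Replace the allowed_hosts line of an nrpe.cfg in place.
# set_allowed_hosts is a hypothetical helper name used for illustration.
set_allowed_hosts() {
    cfg=$1; hosts=$2
    tmp=$(mktemp) || return 1
    # Rewrite to a temporary file first, then copy the content back so the
    # original file keeps its owner and permissions.
    if sed "s/^allowed_hosts=.*/allowed_hosts=$hosts/" "$cfg" > "$tmp"; then
        cat "$tmp" > "$cfg"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

For this article's setup the call would be set_allowed_hosts /usr/local/nagios/etc/nrpe.cfg 127.0.0.1,10.20.2.233.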

6) Disable SELinux and stop the firewall

    setenforce 0
    service iptables stop

7) Start NRPE

    /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d

4. Verification and monitoring

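1) Verify NRPE on the monitored hosts

From the Nagios server, the administrator uses check_nrpe to query performance data on the monitored hosts. Run with only the -H option, check_nrpe reports the NRPE version of the monitored host. An address with no NRPE daemon listening (10.20.2.237 here) returns "Connection refused".

    [root@test etc]# /usr/local/nagios/libexec/check_nrpe -H 10.20.2.235
    NRPE v3.0.1
    [root@test etc]# /usr/local/nagios/libexec/check_nrpe -H 10.20.2.236
    NRPE v3.0.1
    [root@test etc]# /usr/local/nagios/libexec/check_nrpe -H 10.20.2.237
    connect to address 10.20.2.237 port 5666: Connection refused
    connect to host 10.20.2.237 port 5666: Connection refused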
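The manual checks above can be wrapped in a small loop when there are more than a couple of hosts. The sketch below relies only on the exit status of check_nrpe -H; probe_nrpe and the CHECK_NRPE variable are hypothetical names, and the default path assumes the install prefix used in this article.

```shell
#!/bin/sh
# Probe each host given as an argument and report whether NRPE answers.
# probe_nrpe and CHECK_NRPE are hypothetical names used for illustration.
probe_nrpe() {
    check=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}
    failed=0
    for host in "$@"; do
        # On success the plugin prints the NRPE version string.
        if version=$("$check" -H "$host" 2>&1); then
            echo "$host OK $version"
        else
            echo "$host FAILED"
            failed=1
        fi
    done
    # Non-zero if at least one host did not answer.
    return "$failed"
}
```

It could be invoked as probe_nrpe 10.20.2.235 10.20.2.236, printing one line per host.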

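2) Open the web interface to monitor

The output above shows that the server can reach NRPE on the monitored hosts. The monitoring pages can now be opened in a browser: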

http://10.20.2.233/nagios