Tuesday, October 06, 2015

Nagios with openSUSE 13.1 - Monitoring Management Ports and Setting Up Parent Relationships

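Goal: monitor the equipment of the current project as well as my own equipment

(Screenshot: 2015-09-25, 6:44:31 PM)

This mainly monitors the servers' Management Ports. It fits the case where I am the registered asset custodian of the hardware, but the system on it is operated by another colleague.
In addition, creating switch hosts and setting parents (parent objects) makes the network map clearer.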

OS: openSUSE 13.1

# zypper   install   nagios   nagios-plugins

# htpasswd2   -c   /etc/nagios/htpasswd.users   nagiosadmin

# chkconfig  nagios  --list
# chkconfig  nagios  on
# chkconfig  nagios  --list

Restart apache2 and enable it at boot
#rcapache2  restart
#rcapache2  status
#systemctl    is-enabled   apache2.service
#systemctl    enable   apache2.service   
#systemctl    is-enabled   apache2.service

Start Nagios
#rcnagios   start

#yast2 firewall
Allow the http service through the firewall

#vi   /etc/nagios/objects/localhost.cfg
Comment out the HTTP service and the linux-servers hostgroup, and raise the Total Processes thresholds
# 2014/1/8 edit by sakana, temp disable HTTP monitor
#define service{
#        use         local-service         ; Name of service template to use
#        host_name                       localhost
#        service_description             HTTP
#       check_command                   check_http
#       notifications_enabled           0
#        }

# Define an optional hostgroup for Linux machines
#
#define hostgroup{
#        hostgroup_name  linux-servers ; The name of the hostgroup
#        alias           Linux Servers ; Long name of the group
#        members         localhost     ; Comma separated list of hosts that belong to this group
#        }


# 2014/1/8 edit by sakana change check_local_procs from 250 to 400, 400 to 800
define service{
       use                             local-service         ; Name of service template to use
       host_name                       localhost
       service_description             Total Processes
       check_command                   check_local_procs!400!800!RSZDT
       }

This clears the HTTP warning state

Check the Nagios configuration for problems
#nagios  -v   /etc/nagios/nagios.cfg

Restart Nagios
#rcnagios   restart
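
The check-then-restart pair above can be chained so that a broken configuration never reaches the running daemon. This is only a minimal sketch, assuming `nagios -v` exits non-zero when it finds errors (warnings alone still pass). `restart_if_valid` is a hypothetical helper name, not something shipped with Nagios.

```shell
#!/bin/sh
# restart_if_valid <check command> <restart command>
# Runs the check command; runs the restart command only when the check succeeds.
# Both commands are passed in as strings so the helper can be tried on a
# machine that does not have Nagios installed.
restart_if_valid() {
    if output=$(sh -c "$1" 2>&1); then
        sh -c "$2"
    else
        # Show what the check reported, then refuse to restart
        printf '%s\n' "$output" >&2
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}

# Intended use on the Nagios server (run as root):
# restart_if_valid 'nagios -v /etc/nagios/nagios.cfg' 'rcnagios restart'
```

Passing the commands as arguments keeps the helper independent of the init system, so the same function would work with `systemctl restart nagios` as the second argument.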

Install the nagios-nrpe packages
#yast  -i  nagios-nrpe  nagios-plugins-nrpe

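Fetch the template files I created earlier ( the center's firewall blocks downloads, so I copied them over with scp )

Move the templates file into place, overwriting the original configuration
# mv   server-templates20131227.cfg    /etc/nagios/objects/templates.cfg

Do the same for the commands file
# mv   server-commands.cfg   /etc/nagios/objects/commands.cfg

Create the directories that will hold the configuration files for servers, ordinary workstations, and racks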
#mkdir   /etc/nagios/servers
#mkdir   /etc/nagios/pcs
#mkdir   /etc/nagios/racks

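Fetch the previously written linuxPublic.cfg ( again with scp, because of the center's firewall ). It is kept in /etc/nagios/objects as the template for public-facing services; per-host copies of it go into /etc/nagios/servers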
#mv   linuxPublic.cfg   /etc/nagios/objects/

Edit /etc/nagios/nagios.cfg so that the *.cfg files in these directories are loaded, which lets Nagios monitor its clients

#vi   /etc/nagios/nagios.cfg
Add
cfg_dir=/etc/nagios/servers
cfg_dir=/etc/nagios/pcs
cfg_dir=/etc/nagios/racks
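
If the cfg_dir lines are added by a script instead of vi, it helps to make the step safe to repeat. The following is a rough sketch, not the method used in this post; `add_cfg_dir` is a hypothetical helper, and it assumes the config file already ends with a newline.

```shell
#!/bin/sh
# add_cfg_dir <path to nagios.cfg> <directory>
# Appends "cfg_dir=<directory>" only if that exact line is not already present.
add_cfg_dir() {
    cfg_file=$1
    dir=$2
    # -x matches whole lines only, -F treats the pattern as a fixed string
    if ! grep -qxF "cfg_dir=$dir" "$cfg_file"; then
        printf 'cfg_dir=%s\n' "$dir" >> "$cfg_file"
    fi
}

# Intended use on the Nagios server (run as root):
# for d in servers pcs racks; do
#     add_cfg_dir /etc/nagios/nagios.cfg "/etc/nagios/$d"
# done
```

Running it twice leaves a single line per directory, so it can sit in a provisioning script without duplicating entries.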

Comment out the windows-servers hostgroup in /etc/nagios/objects/windows.cfg, because it is already defined in templates.cfg

# vi   /etc/nagios/objects/windows.cfg
#define hostgroup{
#        hostgroup_name  windows-servers ; The name of the hostgroup
#       alias           Windows Servers ; Long name of the group
#        }

Edit templates.cfg to add the rack-host host template and its hostgroup
# vi   /etc/nagios/objects/templates.cfg


# Rack host definition template - This is NOT a real host, just a template!
# This host template is for machines where I only want to monitor the physical host's management IP and the system administrator is someone else
define host{
       name                            rack-host       ; The name of this host template
       use                             generic-host    ; This template inherits other values from the generic-host template
       check_period                    24x7            ; By default, Linux hosts are checked round the clock
       check_interval                  5               ; Actively check the host every 5 minutes
       retry_interval                  1               ; Schedule host check retries at 1 minute intervals
       max_check_attempts              10              ; Check each Linux host 10 times (max)
       check_command                   check-host-alive ; Default command to check Linux hosts
       notification_period             workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                       ; Note that the notification_period variable is being overridden from
                                                       ; the value that is inherited from the generic-host template!
       notification_interval           120             ; Resend notifications every 2 hours
       notification_options            d,u,r           ; Only send notifications for specific host states
       contact_groups                  admins          ; Notifications get sent to the admins by default
       hostgroups                      rack-group      ; Host groups that Rack server should be a member of (added for this template)
       register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
       }

# Newly added section for the Rack Host group ---- start-------------------------------------------------------------

# Define host group for Rack hosts
define hostgroup{
       hostgroup_name  rack-group      ; The name of the hostgroup
       alias           Rack Hosts Group      ; Long name of the group
       }

# Newly added section for the Rack Host group ---- end  -------------------------------------------------------------

Copy the template file and turn it into rackHost.cfg
# cp  /etc/nagios/objects/linuxPublic.cfg   /etc/nagios/objects/rackHost.cfg

Edit the file
# vi  /etc/nagios/objects/rackHost.cfg
Comment out all services
Change the template that the host uses
define host{
       use             rack-host       ; Inherit default values from a template (must match a name defined in /etc/nagios/objects/templates.cfg)
       host_name       S_b11_9U        ; The name we're giving to this host, short form (room_rack_starting U)
       alias           B11 Rack at 9 ~ 10U; A longer name associated with the host
       address         192.168.1.3     ; IP address of the host (change this to the real IP)
       }


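Copy the template file to create the machine we want to monitor (name the xxx.cfg file yourself)
# cp  /etc/nagios/objects/rackHost.cfg   /etc/nagios/racks/S_C13_12U.cfg

Edit it
# vi  /etc/nagios/racks/S_C13_12U.cfg

Change the host name to match the real machine
Change the IP after address to one that the Nagios server can reach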

address     x.y.z.w
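
With many racks, copying and hand-editing one file per machine gets tedious. As a sketch of an alternative, the host block can be printed by a small function. `make_rack_host` is a hypothetical helper, the 192.0.2.x addresses are placeholders, and the optional fourth argument emits the parents directive that this post introduces further down.

```shell
#!/bin/sh
# make_rack_host <host_name> <alias> <address> [parent]
# Prints a host definition that inherits from the rack-host template.
make_rack_host() {
    name=$1
    alias_text=$2
    address=$3
    parent=${4:-}
    printf 'define host{\n'
    printf '       use             rack-host\n'
    printf '       host_name       %s\n' "$name"
    printf '       alias           %s\n' "$alias_text"
    printf '       address         %s\n' "$address"
    # Only emit a parents line when a parent host was given
    if [ -n "$parent" ]; then
        printf '       parents         %s\n' "$parent"
    fi
    printf '       }\n'
}

# Intended use (placeholder address, run as root):
# make_rack_host S_C13_12U "C13 Rack at 12U" 192.0.2.10 > /etc/nagios/racks/S_C13_12U.cfg
```

The output still goes through `nagios -v` like any hand-written file, so a typo in an argument is caught before the restart.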

Check the Nagios configuration for problems
#nagios  -v   /etc/nagios/nagios.cfg

Warnings may appear here, because for rack hosts we only check the host itself and define no services

Restart Nagios
#rcnagios   restart

** Nagios Map icon **

Use the default icons as they are ( under /usr/share/nagios/images/logos )

Edit templates.cfg
# vi   /etc/nagios/objects/templates.cfg
Add the icon settings, and do the same for rack-host
define host{
       name                            linux-server    ; The name of this host template
       use                             generic-host    ; This template inherits other values from the generic-host template
       check_period                    24x7            ; By default, Linux hosts are checked round the clock
       check_interval                  5               ; Actively check the host every 5 minutes
       retry_interval                  1               ; Schedule host check retries at 1 minute intervals
       max_check_attempts              10              ; Check each Linux host 10 times (max)
       check_command                   check-host-alive ; Default command to check Linux hosts
       notification_period             workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                       ; Note that the notification_period variable is being overridden from
                                                       ; the value that is inherited from the generic-host template!
       notification_interval           120             ; Resend notifications every 2 hours
       notification_options            d,u,r           ; Only send notifications for specific host states
       contact_groups                  admins          ; Notifications get sent to the admins by default
       hostgroups                      linux-servergroup ; Host groups that Linux servers should be a member of
       icon_image                      linux40.gif
       icon_image_alt                  linux
       statusmap_image                 linux40.gd2
       register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
       }
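
A misspelled image name just shows up as a broken icon on the map, so a quick existence check can save a round trip. This is a rough sketch under the assumption that image names are resolved relative to the logos directory mentioned above; `check_icons` is a hypothetical helper.

```shell
#!/bin/sh
# check_icons <logos directory> <config file>
# Lists every icon_image / statusmap_image value that has no matching file.
# Returns 1 if anything is missing, 0 otherwise.
check_icons() {
    logo_dir=$1
    cfg_file=$2
    status=0
    images=$(awk '
        { sub(/;.*/, "") }          # drop trailing ; comments
        /^[ \t]*#/ { next }         # skip commented-out lines
        $1 == "icon_image" || $1 == "statusmap_image" { print $2 }
    ' "$cfg_file" | sort -u)
    for img in $images; do
        if [ ! -f "$logo_dir/$img" ]; then
            echo "missing image: $logo_dir/$img"
            status=1
        fi
    done
    return "$status"
}

# Intended use on the Nagios server:
# check_icons /usr/share/nagios/images/logos /etc/nagios/objects/templates.cfg
```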

** switch **
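I have not touched SNMP yet and do not want the switch setup to get complicated; I only want the map to show the hierarchy

Copy the stock sample and modify it, since I only want to know whether the switch is alive
# cp  /etc/nagios/objects/switch.cfg   /etc/nagios/objects/switchSimple.cfg

# vi   /etc/nagios/objects/switchSimple.cfg
Comment out all services
Comment out the switches hostgroup; its replacement goes into templates.cfg
Edit the host definition so that it joins the switch-group hostgroup automatically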
#define hostgroup{
#       hostgroup_name  switches                ; The name of the hostgroup
#       alias           Network Switches        ; Long name of the group
#       }

define host{
       use             generic-switch          ; Inherit default values from a template
       host_name       linksys-srw224p         ; The name we're giving to this switch (room_rack_starting U)
       alias           Linksys SRW224P Switch  ; A longer name associated with the switch
       address         192.168.1.253           ; IP address of the switch
       hostgroups      switch-group            ; Host groups this switch is associated with
       }


#vi   /etc/nagios/objects/templates.cfg
Add the switch-group hostgroup
# Newly added section for the switch group ---- start-------------------------------------------------------------

define hostgroup{
      hostgroup_name  switch-group               ; The name of the hostgroup
      alias           Network Switches        ; Long name of the group
      }

# Newly added section for the switch group ---- end  -------------------------------------------------------------

Create the directory
# mkdir  /etc/nagios/switches

# vi   /etc/nagios/nagios.cfg
Add
cfg_dir=/etc/nagios/switches

Copy the template into the switches directory
# cp  /etc/nagios/objects/switchSimple.cfg   /etc/nagios/switches/S_B11_32U.cfg

# vi  /etc/nagios/switches/S_B11_32U.cfg
Change the host name and address
define host{
       use             generic-switch          ; Inherit default values from a template
       host_name       S_B11_32U               ; The name we're giving to this switch (room_rack_starting U)
       alias           Sinchu B11 32U - Cisco Catalyst 2960x   ; A longer name associated with the switch
       address         192.168.1.124           ; IP address of the switch
       hostgroups      switch-group            ; Host groups this switch is associated with
       }

Next, edit the rack configuration files
Add a parents directive to the host definition to set its parent object
For example
define host{
       use             rack-host       ; Inherit default values from a template (must match a name defined in /etc/nagios/objects/templates.cfg)
       host_name       S_B11_11U       ; The name we're giving to this host, short form (room_rack_starting U)
       alias           Shinchu B11 11U - Cisco UCS C220M4; A longer name associated with the host
       address         192.168.1.111   ; IP address of the host (change this to the real IP)
       parents         S_B11_32U
       }
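
Because parents refers to another host by its host_name, a typo there only surfaces when Nagios checks the configuration. The sketch below looks for the same problem across a set of files beforehand. `check_parents` is a hypothetical helper, and it assumes multiple parents are written comma-separated without spaces.

```shell
#!/bin/sh
# check_parents <config file>...
# Reports every parents value that has no matching host_name in the given files.
# Returns 1 if any parent is undefined, 0 otherwise.
check_parents() {
    awk '
        { sub(/;.*/, "") }                  # drop trailing ; comments
        /^[ \t]*#/ { next }                 # skip commented-out lines
        $1 == "host_name" { defined[$2] = 1 }
        $1 == "parents" {
            n = split($2, list, ",")
            for (i = 1; i <= n; i++) wanted[list[i]] = FILENAME
        }
        END {
            missing = 0
            for (p in wanted)
                if (!(p in defined)) {
                    print "undefined parent: " p " (" wanted[p] ")"
                    missing = 1
                }
            exit missing
        }
    ' "$@"
}

# Intended use on the Nagios server:
# check_parents /etc/nagios/racks/*.cfg /etc/nagios/switches/*.cfg
```

It is only a convenience in front of `nagios -v`, which remains the authoritative check.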

The result looks like this

(Screenshot: 2015-09-25, 6:44:31 PM)
