Nagios XI
Best Practices
By Troy Lea
tlea@nagios.com
About Me
•Tech Support Contractor for Nagios
Enterprises
•Based in Australia
•Typically cover UTC+10 from 9am to 5pm
•Nagios & XI Dev (Box293)
•Nagios MVP3
What's Covered In This Talk
•Getting the most from Nagios XI
•Time saving information
•Configuration practices
•Object definitions
•Backend setup
•Performance enhancements
Nagios XI
Server Internals
Nagios XI License Entitlements
•XI license entitles 3 instances:
•Production
•Test & Dev (T&D)
•Disaster Recovery (DR)
•License activation is tied to IP Address
of each XI host
What's Monitoring Nagios XI?
•How would you know your XI server died?
•“Nagios XI Server” Monitoring Wizard
•DR instance monitors production instance
•Production instance is UP & HEALTHY
•Production Instance monitors DR instance
•DR instance is UP & HEALTHY
localhost services
•Do you know how your XI server is
performing?
•Basic local services are included in XI base
•You should ideally be monitoring:
•Service Status (check_init_service)
•crond, httpd, mysql, ndo2db, npcd, ntpd,
postgresql, snmptrapd, snmptt
localhost services
•File Counts (check_file_count)
•NPCD Perfdata spool directory
•xidpe spool directory
•Check results folder
•snmptt spool folder
•nagios user account has not expired
•(check_pass_expire.pl)
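The file-count checks on this slide can be sketched as a small Nagios-style plugin. This is an illustrative sketch only, not the implementation of the real check_file_count plugin; the function name, the thresholds and the temporary demo directory are assumptions. It relies on the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import os
import tempfile

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_file_count(directory, warn, crit):
    """Return (exit_code, message) for the number of regular files in directory."""
    try:
        count = sum(1 for entry in os.scandir(directory) if entry.is_file())
    except OSError as exc:
        return UNKNOWN, f"UNKNOWN - cannot read {directory}: {exc}"
    # Performance data: label=value;warn;crit;min
    perfdata = f"files={count};{warn};{crit};0"
    if count >= crit:
        return CRITICAL, f"CRITICAL - {count} files in {directory} | {perfdata}"
    if count >= warn:
        return WARNING, f"WARNING - {count} files in {directory} | {perfdata}"
    return OK, f"OK - {count} files in {directory} | {perfdata}"


if __name__ == "__main__":
    # Demo on a throwaway directory standing in for a spool folder (hypothetical).
    with tempfile.TemporaryDirectory() as spool:
        for i in range(3):
            open(os.path.join(spool, f"result{i}"), "w").close()
        code, message = check_file_count(spool, warn=2, crit=5)
        print(code, message)
```

A steadily growing count in a spool directory usually means the process that should be draining it has stalled, which is why these folders are worth watching.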
localhost services
•root mailbox size
•(box293_check_mbox)
•MySQL / MariaDB
•Database tables crashed?
•(box293_check_mysql_table_status)
•Date/Time correct?
•(box293_check_mysql_date)
localhost services
•Overall Load (check_load)
•Memory Free – Physical (check_memory)
•Swap Usage (check_swap)
•Disk Free (check_disk)
Date and Timezone!
•Configure Timezone
•Admin > Manage System Config
•Sync with trusted time source
•VM? Don’t sync with hypervisor!
•Can be the source of confusing
problems
CPU
•CPU Cores vs Speed!
•Not everything is multi-threaded
•3.4 GHz vs 2.2 GHz
•Number of cores is still important
•Refer to XI hardware requirements
Memory
•Enough memory to cope in a major outage
•Event handlers consume memory quickly (+GB
in a matter of minutes in a major outage)
•Have at least 50% more memory than needed
•Refer to XI hardware requirements
RAM Disk
•Lots of little files created/deleted/updated
•Using a RAM Disk:
•Reduces disk I/O & load
•Speeds up processing of performance data
•Speeds up processing of spooled check results
•Speeds up nagios restarts
•Refer to official procedure
Solid State Disk (SSD)
•Greatly improves overall performance
•Complements RAM Disk
•Helps read/writes with:
•Logs
•Database
•Performance Graphs
•Reports
SSD vs RAID ?
•SSD beats* a spinning disk RAID set
•*Depends on how much money you have
•Still need to RAID1 SSD for redundancy!
•SSD may not give you the required capacity
•3.8TB SAS SSD now available!!!
rrdcached
•Enabling rrdcached accumulates the
spooled performance data, after x amount of
time it is processed into backend RRD files
•Reduces Disk I/O
•Can be a delay in data appearing in graphs
•Refer to official procedure
Offloaded MySQL / MariaDB
•Data constantly written to databases
•Historical and Configuration
•Offload to separate server to reduce load
•Don't forget to monitor offloaded server!!!
•Disk/CPU/Memory/Tables/Service
•Refer to earlier slides
•Refer to official procedure
Mod-Gearman
•Used for offloading plugins to workers
•Plugins need to be installed on all workers
•Be aware of plugins that use /tmp files!
•XI 2014 onwards uses Core 4
•Core 4 has its own workers (only local
workers)
•nagios.cfg “check_workers” option
•Refer to official procedure
Disaster Recovery
•Failover and High Availability Solutions for
Nagios XI
•Andy Brist - NWC2014 – Failover & HA
•What is really important in disaster?
•Plan and test
Backups!!!
•Admin > System Backups
•Schedule backups of XI
•Location can be local, FTP, SSH
•Remote location recommended
•Manual Backups
•Local Backup Archives via Admin menu
•/usr/local/nagiosxi/scripts/backup_xi.sh
Restoring Backups
•Official Backup and Restore procedure
•Brings system back online with ease
•Great for migrating from old XI to new XI
•Also good for:
•DR
•Test & Dev
Configuration
Intervals - Host vs Services
•Host down HARD = service notifications
suppressed
•What happens when host and services
use the same check intervals?
•Unnecessary Notifications get sent :(
•Make host go down HARD quicker than
its services!
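A rough way to compare how quickly a host and its services reach a HARD state is to count the retries after the first failed check. The sketch below assumes intervals are in minutes (the default interval_length of 60 seconds), that host and service fail at the same moment, and it ignores on-demand host checks and scheduling jitter. The attempt and retry values are hypothetical.

```python
def minutes_to_hard(max_check_attempts, retry_interval):
    """Minutes from the first failed check until the HARD state is reached."""
    # The first failure is attempt 1; each remaining attempt waits one retry interval.
    return (max_check_attempts - 1) * retry_interval


host = minutes_to_hard(max_check_attempts=3, retry_interval=1)
service = minutes_to_hard(max_check_attempts=5, retry_interval=1)

# The host should reach HARD first so service notifications are suppressed.
print(host, service, host < service)  # 2 4 True
```

If both objects used the same values, the service could go HARD at the same time as the host and send a notification before suppression kicks in.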
Service Dependencies
•When a master service goes down:
•Prevents notifications from being sent
•Prevents service checks from executing
•Make master service go down HARD
quicker than dependent services!
•Otherwise dependencies are pointless
•Master service e.g. - Ping or NRPE Version
Disable Service Checks ?
•host_down_disable_service_checks
•Nagios Core 4.1.x feature (XI 5)
•System wide setting
•Reduces load on XI host
•Think of it as automatic service dependencies
on their own hosts
•Service dependencies ignored if host is down
Check Intervals - Be Realistic
•Does it need to be checked every 5
minutes?
•Disk Free Space – every 60 minutes perhaps?
•Too long = no performance data
•Different intervals to spread the load
•3, 5, 7 minute intervals
•58, 60, 62 minute intervals
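To see why staggered intervals spread the load, consider how often checks with different intervals fall due at the same moment. The sketch below is an idealised model: it assumes all checks start aligned and run exactly on schedule, which real Nagios scheduling does not guarantee. The function name is illustrative.

```python
from functools import reduce
from math import gcd


def coincide_every(intervals):
    """Minutes between moments when checks with these intervals all fall due together."""
    # Least common multiple of all intervals.
    return reduce(lambda a, b: a * b // gcd(a, b), intervals)


print(coincide_every([5, 5, 5]))     # 5
print(coincide_every([3, 5, 7]))     # 105
print(coincide_every([58, 60, 62]))  # 53940
```

With identical 5 minute intervals every check lines up on every cycle; with 3, 5 and 7 minute intervals they only all line up once every 105 minutes.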
Notification & Check Intervals
•Nagios determines if it is allowed to send a
notification at every service check in a HARD state
•e.g. 15 minute check and 60 minute notification
•Internal scheduling may cause 14min 55sec
to pass, 4 x 14:55 = 59min 40sec … it’s <
60min!
•Notification not sent until 75min!
•Scheduling is geared +/- to reduce load!
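The arithmetic on this slide can be checked with a short simulation. It is a simplified model of the behaviour described above, not Nagios' actual notification logic: it assumes a notification is only evaluated when a check result arrives and is sent once the notification interval has fully elapsed.

```python
def next_notification(check_gap_s, notification_interval_s):
    """Seconds after the last notification until the next one can go out."""
    elapsed = 0
    while elapsed < notification_interval_s:
        # Notifications are only evaluated when a check result arrives.
        elapsed += check_gap_s
    return elapsed


exact = next_notification(15 * 60, 60 * 60)       # checks exactly 15:00 apart
early = next_notification(14 * 60 + 55, 60 * 60)  # checks 14:55 apart

print(exact // 60)        # 60
print(divmod(early, 60))  # (74, 35)
```

With checks arriving 5 seconds early, the fourth check lands at 59min 40sec, so the re-notification waits for the fifth check at roughly 75 minutes.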
Use Hostgroups!
•Assign ONE service to a hostgroup of
common servers
•Windows Servers
•Linux Servers
•Consistent monitoring, standards enforced!
•Directive changes - all hosts get updated
•Reduces management overhead
Use Contact Groups!
•Use contact groups in all definitions
•Makes it easy when staff join/leave
•Just add/remove the contact from groups
•Reduces administrative overhead
•Enforces your company policy
•Similar principle to host groups
Configuration Wizards
•Pros
•Great for getting up and running quickly
•No need to learn how a plugin works
•Cons
•Creates individual services
•More work later when enforcing “standards”
Templates
•Common settings applied to objects
•Helps enforce standards
•Reduces administrative overhead
•Layer multiple templates
•Can be additive or ignore inheritance
•XI Config Wizard objects use templates
•Example of common icmp check
User Macros – resource.cfg
•$USERx$ macros are good for common
items like a username or password
•Allows passwords with a ! exclamation
mark
•Values not visible in object definitions
•$USER1$
•/usr/local/nagios/libexec
Custom Object Variables
•Allows you to create your own variables
•Can be defined in host or service objects
•e.g. hosts have their own check_nt
password
•Define _CHECK_NT_PASSWORD in host object
•In command definitions reference it as:
•$_HOSTCHECK_NT_PASSWORD$
•VERY POWERFUL!
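The naming rule behind the example above (a custom variable _CHECK_NT_PASSWORD on a host becomes the macro $_HOSTCHECK_NT_PASSWORD$) can be sketched as follows. This only illustrates the naming convention and a naive substitution; it is not how Nagios itself expands macros, and the password value, address and command line are hypothetical.

```python
def custom_macro(object_type, variable_name):
    """Build the macro name Nagios exposes for a custom object variable."""
    prefixes = {"host": "_HOST", "service": "_SERVICE", "contact": "_CONTACT"}
    if not variable_name.startswith("_"):
        raise ValueError("custom variable names must start with an underscore")
    # Drop the leading underscore, upper-case the name, add the object prefix.
    return f"${prefixes[object_type]}{variable_name[1:].upper()}$"


def expand(command_line, host_vars):
    """Naively substitute host custom variable macros in a command line."""
    for name, value in host_vars.items():
        command_line = command_line.replace(custom_macro("host", name), value)
    return command_line


host_vars = {"_CHECK_NT_PASSWORD": "s3cret"}  # hypothetical value
print(custom_macro("host", "_CHECK_NT_PASSWORD"))
# $_HOSTCHECK_NT_PASSWORD$
print(expand("check_nt -H 192.0.2.10 -s $_HOSTCHECK_NT_PASSWORD$ -v UPTIME", host_vars))
# check_nt -H 192.0.2.10 -s s3cret -v UPTIME
```

Because the value lives on the host object, one command definition can serve hosts that each have a different password.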
Other
MRTG Clean Configs
•Your MRTG configs may be collecting
more than what you think
•/etc/mrtg/conf.d/*.cfg files
•Created by Network Switch / Router Wizard
•Comment out unused ports
•About 37 lines per port
•Comment out unused non-interfaces
(VLANs)
Plugins – Compiled vs Scripts
•Compiled runs quicker
•Official nagios-plugins are compiled
•“Custom modifications” require re-compiling
•Scripts run slower, consume more
resources
•Perl plugins known to consume +CPU +RAM
•“nice” can reduce impact of plugins
•Check Profiler component by box293
Backend API - Read Only User
•API provides you with URLs for use in third
party products without needing user/pass
•Requires a user account to be created
•Account should be READ ONLY
Performance Data Tool
•Component developed by box293
•Allows you to manipulate RRD files
•Great for merging RRD data
•Can also delete old RRD files for old services
•View raw data in tables
•Find it in the Nagios Exchange
Thank you!
What Is Your Best Practice?
Any Questions?
end
done
fi esac
)
}
;
od
until
.

Contenu connexe

Tendances

Oracle-DB: Performance Analysis with Panorama
Oracle-DB: Performance Analysis with PanoramaOracle-DB: Performance Analysis with Panorama
Oracle-DB: Performance Analysis with PanoramaPeter Ramm
 
Multiple Shared Processor Pools In Power Systems
Multiple Shared Processor Pools In Power SystemsMultiple Shared Processor Pools In Power Systems
Multiple Shared Processor Pools In Power SystemsAndrey Klyachkin
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016Brendan Gregg
 
Linux LVM Logical Volume Management
Linux LVM Logical Volume ManagementLinux LVM Logical Volume Management
Linux LVM Logical Volume ManagementManolis Kartsonakis
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixBrendan Gregg
 
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...ShapeBlue
 
tow nodes Oracle 12c RAC on virtualbox
tow nodes Oracle 12c RAC on virtualboxtow nodes Oracle 12c RAC on virtualbox
tow nodes Oracle 12c RAC on virtualboxjustinit
 
Red Hat Enterprise Virtualization
Red Hat Enterprise VirtualizationRed Hat Enterprise Virtualization
Red Hat Enterprise Virtualizationhipark
 
Linux performance tuning & stabilization tips (mysqlconf2010)
Linux performance tuning & stabilization tips (mysqlconf2010)Linux performance tuning & stabilization tips (mysqlconf2010)
Linux performance tuning & stabilization tips (mysqlconf2010)Yoshinori Matsunobu
 
High Availability Content Caching with NGINX
High Availability Content Caching with NGINXHigh Availability Content Caching with NGINX
High Availability Content Caching with NGINXNGINX, Inc.
 
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and Barebox
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and BareboxEmbedded Recipes 2019 - Remote update adventures with RAUC, Yocto and Barebox
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and BareboxAnne Nicolas
 
Oracle db architecture
Oracle db architectureOracle db architecture
Oracle db architectureSimon Huang
 
Introduction to Linux Kernel by Quontra Solutions
Introduction to Linux Kernel by Quontra SolutionsIntroduction to Linux Kernel by Quontra Solutions
Introduction to Linux Kernel by Quontra SolutionsQUONTRASOLUTIONS
 
Domino Server Health - Monitoring and Managing
 Domino Server Health - Monitoring and Managing Domino Server Health - Monitoring and Managing
Domino Server Health - Monitoring and ManagingGabriella Davis
 
How IBM's Massive POWER9 UNIX Servers Benefit from InfluxDB and Grafana Techn...
How IBM's Massive POWER9 UNIX Servers Benefit from InfluxDB and Grafana Techn...How IBM's Massive POWER9 UNIX Servers Benefit from InfluxDB and Grafana Techn...
How IBM's Massive POWER9 UNIX Servers Benefit from InfluxDB and Grafana Techn...DevOps.com
 
Linux Servidor Proxy(squid)
Linux Servidor Proxy(squid)Linux Servidor Proxy(squid)
Linux Servidor Proxy(squid)elliando dias
 
Tuning kafka pipelines
Tuning kafka pipelinesTuning kafka pipelines
Tuning kafka pipelinesSumant Tambe
 
NGINX: High Performance Load Balancing
NGINX: High Performance Load BalancingNGINX: High Performance Load Balancing
NGINX: High Performance Load BalancingNGINX, Inc.
 

Tendances (20)

Oracle-DB: Performance Analysis with Panorama
Oracle-DB: Performance Analysis with PanoramaOracle-DB: Performance Analysis with Panorama
Oracle-DB: Performance Analysis with Panorama
 
Multiple Shared Processor Pools In Power Systems
Multiple Shared Processor Pools In Power SystemsMultiple Shared Processor Pools In Power Systems
Multiple Shared Processor Pools In Power Systems
 
60 Admin Tips
60 Admin Tips60 Admin Tips
60 Admin Tips
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016
 
Linux LVM Logical Volume Management
Linux LVM Logical Volume ManagementLinux LVM Logical Volume Management
Linux LVM Logical Volume Management
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
 
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
 
tow nodes Oracle 12c RAC on virtualbox
tow nodes Oracle 12c RAC on virtualboxtow nodes Oracle 12c RAC on virtualbox
tow nodes Oracle 12c RAC on virtualbox
 
Red Hat Enterprise Virtualization
Red Hat Enterprise VirtualizationRed Hat Enterprise Virtualization
Red Hat Enterprise Virtualization
 
Linux performance tuning & stabilization tips (mysqlconf2010)
Linux performance tuning & stabilization tips (mysqlconf2010)Linux performance tuning & stabilization tips (mysqlconf2010)
Linux performance tuning & stabilization tips (mysqlconf2010)
 
High Availability Content Caching with NGINX
High Availability Content Caching with NGINXHigh Availability Content Caching with NGINX
High Availability Content Caching with NGINX
 
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and Barebox
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and BareboxEmbedded Recipes 2019 - Remote update adventures with RAUC, Yocto and Barebox
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and Barebox
 
Oracle db architecture
Oracle db architectureOracle db architecture
Oracle db architecture
 
Nfs
NfsNfs
Nfs
 
Introduction to Linux Kernel by Quontra Solutions
Introduction to Linux Kernel by Quontra SolutionsIntroduction to Linux Kernel by Quontra Solutions
Introduction to Linux Kernel by Quontra Solutions
 
Domino Server Health - Monitoring and Managing
 Domino Server Health - Monitoring and Managing Domino Server Health - Monitoring and Managing
Domino Server Health - Monitoring and Managing
 
How IBM's Massive POWER9 UNIX Servers Benefit from InfluxDB and Grafana Techn...
How IBM's Massive POWER9 UNIX Servers Benefit from InfluxDB and Grafana Techn...How IBM's Massive POWER9 UNIX Servers Benefit from InfluxDB and Grafana Techn...
How IBM's Massive POWER9 UNIX Servers Benefit from InfluxDB and Grafana Techn...
 
Linux Servidor Proxy(squid)
Linux Servidor Proxy(squid)Linux Servidor Proxy(squid)
Linux Servidor Proxy(squid)
 
Tuning kafka pipelines
Tuning kafka pipelinesTuning kafka pipelines
Tuning kafka pipelines
 
NGINX: High Performance Load Balancing
NGINX: High Performance Load BalancingNGINX: High Performance Load Balancing
NGINX: High Performance Load Balancing
 

Similaire à Nagios XI Best Practices

be the captain of your connections deployment
be the captain of your connections deploymentbe the captain of your connections deployment
be the captain of your connections deploymentSharon James
 
Got Problems? Let's Do a Health Check
Got Problems? Let's Do a Health CheckGot Problems? Let's Do a Health Check
Got Problems? Let's Do a Health CheckLuis Guirigay
 
Soccnx10: Best and worst practices deploying IBM Connections
Soccnx10: Best and worst practices deploying IBM ConnectionsSoccnx10: Best and worst practices deploying IBM Connections
Soccnx10: Best and worst practices deploying IBM Connectionspanagenda
 
Best And Worst Practices Deploying IBM Connections
Best And Worst Practices Deploying IBM ConnectionsBest And Worst Practices Deploying IBM Connections
Best And Worst Practices Deploying IBM ConnectionsLetsConnect
 
Fixing Domino Server Sickness
Fixing Domino Server SicknessFixing Domino Server Sickness
Fixing Domino Server SicknessGabriella Davis
 
Moving Windows Applications to the Cloud
Moving Windows Applications to the CloudMoving Windows Applications to the Cloud
Moving Windows Applications to the CloudRightScale
 
Pre and post tips to installing sql server correctly
Pre and post tips to installing sql server correctlyPre and post tips to installing sql server correctly
Pre and post tips to installing sql server correctlyAntonios Chatzipavlis
 
SharePoint 2013 Performance Analysis - Robi Vončina
SharePoint 2013 Performance Analysis - Robi VončinaSharePoint 2013 Performance Analysis - Robi Vončina
SharePoint 2013 Performance Analysis - Robi VončinaSPC Adriatics
 
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"Fwdays
 
Pascal benois performance_troubleshooting-spsbe18
Pascal benois performance_troubleshooting-spsbe18Pascal benois performance_troubleshooting-spsbe18
Pascal benois performance_troubleshooting-spsbe18BIWUG
 
Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...
Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...
Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...Kim Greene
 
Citrix Synergy 2014: Going the CloudPlatform Way
Citrix Synergy 2014: Going the CloudPlatform WayCitrix Synergy 2014: Going the CloudPlatform Way
Citrix Synergy 2014: Going the CloudPlatform WayIliyas Shirol
 
April, 2021 OpenNTF Webinar - Domino Administration Best Practices
April, 2021 OpenNTF Webinar - Domino Administration Best PracticesApril, 2021 OpenNTF Webinar - Domino Administration Best Practices
April, 2021 OpenNTF Webinar - Domino Administration Best PracticesHoward Greenberg
 
Still All on One Server: Perforce at Scale
Still All on One Server: Perforce at Scale Still All on One Server: Perforce at Scale
Still All on One Server: Perforce at Scale Perforce
 
Apache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutApache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutSander Temme
 
Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...
Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...
Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...Citrix
 
Nagios Conference 2014 - Mike Merideth - The Art and Zen of Managing Nagios w...
Nagios Conference 2014 - Mike Merideth - The Art and Zen of Managing Nagios w...Nagios Conference 2014 - Mike Merideth - The Art and Zen of Managing Nagios w...
Nagios Conference 2014 - Mike Merideth - The Art and Zen of Managing Nagios w...Nagios
 
SQL Explore 2012: P&T Part 1
SQL Explore 2012: P&T Part 1SQL Explore 2012: P&T Part 1
SQL Explore 2012: P&T Part 1sqlserver.co.il
 
ECMDay2015 - Kent Agerlund – Configuration Manager 2012 – A Site Review
ECMDay2015 - Kent Agerlund – Configuration Manager 2012 – A Site ReviewECMDay2015 - Kent Agerlund – Configuration Manager 2012 – A Site Review
ECMDay2015 - Kent Agerlund – Configuration Manager 2012 – A Site ReviewKenny Buntinx
 
MySQL Performance Tuning at COSCUP 2014
MySQL Performance Tuning at COSCUP 2014MySQL Performance Tuning at COSCUP 2014
MySQL Performance Tuning at COSCUP 2014Ryusuke Kajiyama
 

Similaire à Nagios XI Best Practices (20)

be the captain of your connections deployment
be the captain of your connections deploymentbe the captain of your connections deployment
be the captain of your connections deployment
 
Got Problems? Let's Do a Health Check
Got Problems? Let's Do a Health CheckGot Problems? Let's Do a Health Check
Got Problems? Let's Do a Health Check
 
Soccnx10: Best and worst practices deploying IBM Connections
Soccnx10: Best and worst practices deploying IBM ConnectionsSoccnx10: Best and worst practices deploying IBM Connections
Soccnx10: Best and worst practices deploying IBM Connections
 
Best And Worst Practices Deploying IBM Connections
Best And Worst Practices Deploying IBM ConnectionsBest And Worst Practices Deploying IBM Connections
Best And Worst Practices Deploying IBM Connections
 
Fixing Domino Server Sickness
Fixing Domino Server SicknessFixing Domino Server Sickness
Fixing Domino Server Sickness
 
Moving Windows Applications to the Cloud
Moving Windows Applications to the CloudMoving Windows Applications to the Cloud
Moving Windows Applications to the Cloud
 
Pre and post tips to installing sql server correctly
Pre and post tips to installing sql server correctlyPre and post tips to installing sql server correctly
Pre and post tips to installing sql server correctly
 
SharePoint 2013 Performance Analysis - Robi Vončina
SharePoint 2013 Performance Analysis - Robi VončinaSharePoint 2013 Performance Analysis - Robi Vončina
SharePoint 2013 Performance Analysis - Robi Vončina
 
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
 
Pascal benois performance_troubleshooting-spsbe18
Pascal benois performance_troubleshooting-spsbe18Pascal benois performance_troubleshooting-spsbe18
Pascal benois performance_troubleshooting-spsbe18
 
Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...
Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...
Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...
 
Citrix Synergy 2014: Going the CloudPlatform Way
Citrix Synergy 2014: Going the CloudPlatform WayCitrix Synergy 2014: Going the CloudPlatform Way
Citrix Synergy 2014: Going the CloudPlatform Way
 
April, 2021 OpenNTF Webinar - Domino Administration Best Practices
April, 2021 OpenNTF Webinar - Domino Administration Best PracticesApril, 2021 OpenNTF Webinar - Domino Administration Best Practices
April, 2021 OpenNTF Webinar - Domino Administration Best Practices
 
Still All on One Server: Perforce at Scale
Still All on One Server: Perforce at Scale Still All on One Server: Perforce at Scale
Still All on One Server: Perforce at Scale
 
Apache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutApache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling Out
 
Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...
Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...
Citrix Synergy 2014 - Syn233 Building and operating a Dev Ops cloud: best pra...
 
Nagios Conference 2014 - Mike Merideth - The Art and Zen of Managing Nagios w...
Nagios Conference 2014 - Mike Merideth - The Art and Zen of Managing Nagios w...Nagios Conference 2014 - Mike Merideth - The Art and Zen of Managing Nagios w...
Nagios Conference 2014 - Mike Merideth - The Art and Zen of Managing Nagios w...
 
SQL Explore 2012: P&T Part 1
SQL Explore 2012: P&T Part 1SQL Explore 2012: P&T Part 1
SQL Explore 2012: P&T Part 1
 
ECMDay2015 - Kent Agerlund – Configuration Manager 2012 – A Site Review
ECMDay2015 - Kent Agerlund – Configuration Manager 2012 – A Site ReviewECMDay2015 - Kent Agerlund – Configuration Manager 2012 – A Site Review
ECMDay2015 - Kent Agerlund – Configuration Manager 2012 – A Site Review
 
MySQL Performance Tuning at COSCUP 2014
MySQL Performance Tuning at COSCUP 2014MySQL Performance Tuning at COSCUP 2014
MySQL Performance Tuning at COSCUP 2014
 

Plus de Nagios

Jesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture OverviewJesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture OverviewNagios
 
Trevor McDonald - Nagios XI Under The Hood
Trevor McDonald  - Nagios XI Under The HoodTrevor McDonald  - Nagios XI Under The Hood
Trevor McDonald - Nagios XI Under The HoodNagios
 
Sean Falzon - Nagios - Resilient Notifications
Sean Falzon - Nagios - Resilient NotificationsSean Falzon - Nagios - Resilient Notifications
Sean Falzon - Nagios - Resilient NotificationsNagios
 
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise EditionMarcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise EditionNagios
 
Janice Singh - Writing Custom Nagios Plugins
Janice Singh - Writing Custom Nagios PluginsJanice Singh - Writing Custom Nagios Plugins
Janice Singh - Writing Custom Nagios PluginsNagios
 
Mike Weber - Nagios and Group Deployment of Service Checks
Mike Weber - Nagios and Group Deployment of Service ChecksMike Weber - Nagios and Group Deployment of Service Checks
Mike Weber - Nagios and Group Deployment of Service ChecksNagios
 
Mike Guthrie - Revamping Your 10 Year Old Nagios Installation
Mike Guthrie - Revamping Your 10 Year Old Nagios InstallationMike Guthrie - Revamping Your 10 Year Old Nagios Installation
Mike Guthrie - Revamping Your 10 Year Old Nagios InstallationNagios
 
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...Nagios
 
Matt Bruzek - Monitoring Your Public Cloud With Nagios
Matt Bruzek - Monitoring Your Public Cloud With NagiosMatt Bruzek - Monitoring Your Public Cloud With Nagios
Matt Bruzek - Monitoring Your Public Cloud With NagiosNagios
 
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.
Lee Myers - What To Do When Nagios Notification Don't Meet Your Needs.Nagios
 
Eric Loyd - Fractal Nagios
Eric Loyd - Fractal NagiosEric Loyd - Fractal Nagios
Eric Loyd - Fractal NagiosNagios
 
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...Nagios
 
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...
Thomas Schmainda - Tracking Boeing Satellites With Nagios - Nagios World Conf...Nagios
 
Nagios World Conference 2015 - Scott Wilkerson Opening
Nagios World Conference 2015 - Scott Wilkerson OpeningNagios World Conference 2015 - Scott Wilkerson Opening
Nagios World Conference 2015 - Scott Wilkerson OpeningNagios
 
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios Core
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios CoreNrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios Core
Nrpe - Nagios Remote Plugin Executor. NRPE plugin for Nagios CoreNagios
 
Nagios Log Server - Features
Nagios Log Server - FeaturesNagios Log Server - Features
Nagios Log Server - FeaturesNagios
 
Nagios Network Analyzer - Features
Nagios Network Analyzer - FeaturesNagios Network Analyzer - Features
Nagios Network Analyzer - FeaturesNagios
 
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing Nagios
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing NagiosNagios Conference 2014 - Dorance Martinez Cortes - Customizing Nagios
Nagios Conference 2014 - Dorance Martinez Cortes - Customizing NagiosNagios
 
Nagios Conference 2014 - Mike Weber - Nagios Rapid Deployment Options
Nagios Conference 2014 - Mike Weber - Nagios Rapid Deployment OptionsNagios Conference 2014 - Mike Weber - Nagios Rapid Deployment Options
Nagios Conference 2014 - Mike Weber - Nagios Rapid Deployment OptionsNagios
 
Nagios Conference 2014 - Eric Mislivec - Getting Started With Nagios Core
Nagios Conference 2014 - Eric Mislivec - Getting Started With Nagios CoreNagios Conference 2014 - Eric Mislivec - Getting Started With Nagios Core
Nagios Conference 2014 - Eric Mislivec - Getting Started With Nagios CoreNagios
 

Plus de Nagios (20)

Jesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture OverviewJesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture Overview
 
Trevor McDonald - Nagios XI Under The Hood
Trevor McDonald  - Nagios XI Under The HoodTrevor McDonald  - Nagios XI Under The Hood
Trevor McDonald - Nagios XI Under The Hood
 
Sean Falzon - Nagios - Resilient Notifications
Sean Falzon - Nagios - Resilient NotificationsSean Falzon - Nagios - Resilient Notifications
Sean Falzon - Nagios - Resilient Notifications
 
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise EditionMarcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
 
Janice Singh - Writing Custom Nagios Plugins
Janice Singh - Writing Custom Nagios PluginsJanice Singh - Writing Custom Nagios Plugins
Janice Singh - Writing Custom Nagios Plugins
 
Mike Weber - Nagios and Group Deployment of Service Checks


Nagios XI Best Practices

  • 1. Nagios XI Best Practices By Troy Lea tlea@nagios.com
  • 2. About Me •Tech Support Contractor for Nagios Enterprises •Based in Australia •Typically cover UTC+10 from 9am to 5pm •Nagios & XI Dev (Box293) •Nagios MVP3
  • 3. What's Covered In This Talk •Getting the most from Nagios XI •Time saving information •Configuration practices •Object definitions •Backend setup •Performance enhancements
  • 5. Nagios XI License Entitlements •XI license entitles 3 instances: •Production •Test & Dev (T&D) •Disaster Recovery (DR) •License activation is tied to IP Address of each XI host
  • 6. What's Monitoring Nagios XI? •How would you know your XI server died? •“Nagios XI Server” Monitoring Wizard •DR instance monitors production instance •Production instance is UP & HEALTHY •Production Instance monitors DR instance •DR instance is UP & HEALTHY
  • 7. localhost services •Do you know how your XI server is performing? •Basic local services are included in XI base •You should ideally be monitoring: •Service Status (check_init_service) •crond, httpd, mysql, ndo2db, npcd, ntpd, postgresql, snmptrapd, snmptt
  • 8. localhost services •File Counts (check_file_count) •NPCD Perfdata spool directory •xidpe spool directory •Check results folder •snmptt spool folder •nagios user account has not expired •(check_pass_expire.pl)
  • 9. localhost services •root mailbox size •(box293_check_mbox) •MySQL / MariaDB •Database tables crashed? •(box293_check_mysql_table_status) •Date/Time correct? •(box293_check_mysql_date)
  • 10. localhost services •Overall Load (check_load) •Memory Free – Physical (check_memory) •Swap Usage (check_swap) •Disk Free (check_disk)
  • 11. Date and Timezone! •Configure Timezone •Admin > Manage System Config •Sync with trusted time source •VM? Don’t sync with hypervisor! •Can be the source of confusing problems
  • 12. CPU •CPU Cores vs Speed! •Not everything is multi-threaded •3.4 GHz vs 2.2 GHz •Number of cores is still important •Refer to XI hardware requirements
  • 13. Memory •Enough memory to cope in a major outage •Event handlers consume memory quickly (+GB in a matter of minutes in a major outage) •Have at least 50% more memory than needed •Refer to XI hardware requirements
  • 14. RAM Disk •Lots of little files created/deleted/updated •Using a RAM Disk: •Reduces disk I/O & load •Speeds up processing of performance data •Speeds up processing of spooled check results •Speeds up nagios restarts •Refer to official procedure
  • 15. Solid State Disk (SSD) •Greatly improves overall performance •Complements RAM Disk •Helps read/writes with: •Logs •Database •Performance Graphs •Reports
  • 16. SSD vs RAID ? •SSD beats* a spinning disk RAID set •*Depends on how much money you have •Still need to RAID1 SSD for redundancy! •SSD may not give you the required capacity •3.8TB SAS SSD now available !!!
  • 17. rrdcached •Enabling rrdcached accumulates the spooled performance data, after x amount of time it is processed into backend RRD files •Reduces Disk I/O •Can be a delay in data appearing in graphs •Refer to official procedure
  • 18. Offloaded MySQL / MariaDB •Data constantly written to databases •Historical and Configuration •Offload to separate server to reduce load •Don't forget to monitor offloaded server!!! •Disk/CPU/Memory/Tables/Service •Refer to earlier slides •Refer to official procedure
  • 19. Mod-Gearman •Used for offloading plugins to workers •Plugins need to be installed on all workers •Be aware of plugins that use /tmp files! •XI 2014 onwards uses Core 4 •Core 4 has its own workers (only local workers) •nagios.cfg “check_workers” option •Refer to official procedure
  • 20. Disaster Recovery •Failover and High Availability Solutions for Nagios XI •Andy Brist - NWC2014 – Failover & HA •What is really important in disaster? •Plan and test
  • 21. Backups!!! •Admin > System Backups •Schedule backups of XI •Location can be local, FTP, SSH •Remote location recommended •Manual Backups •Local Backup Archives via Admin menu •/usr/local/nagiosxi/scripts/backup_xi.sh
  • 22. Restoring Backups •Official Backup and Restore procedure •Brings system back online with ease •Great for migrating from old XI to new XI •Also good for: •DR •Test & Dev
  • 24. Intervals - Host vs Services •Host down HARD = service notifications suppressed •What happens when host and services use the same check intervals? •Unnecessary Notifications get sent :( •Make host go down HARD quicker than its services!
  • 25. Service Dependencies •When a master service goes down: •Prevents notifications from being sent •Prevents service checks from execution •Make master service go down HARD quicker than dependent services! •Otherwise dependencies are pointless •Master service e.g. - Ping or NRPE Version
  • 26. Disable Service Checks ? •host_down_disable_service_checks •Nagios Core 4.1.x feature (XI 5) •System wide setting •Reduces load on XI host •Think of it as automatic service dependencies on their own hosts •Service dependencies ignored if host is down
  • 27. Check Intervals - Be Realistic •Does it need to be checked every 5 minutes? •Disk Free Space – every 60 minutes perhaps? •Too long = no performance data •Different intervals to spread the load •3, 5, 7 minute intervals •58, 60, 62 minute intervals
  • 28. Notification & Check Intervals •Nagios determines if it is allowed to send a notification every service HARD state •e.g. 15 minute check and 60 minute notification •Internal scheduling may cause 14min 55sec to pass, 4 x 14:55 = 59min 40sec … it’s < 60min! •Notification not sent until 75min! •Scheduling is geared +/- to reduce load!
  • 29. Use Hostgroups! •Assign ONE service to a hostgroup of common servers •Windows Servers •Linux Servers •Consistent monitoring, standards enforced! •Directive changes - all hosts get updated •Reduces management overhead
  • 30. Use Contact Groups! •Use contact groups in all definitions •Makes it easy when staff join/leave •Just add/remove the contact from groups •Reduces administrative overhead •Enforces your company policy •Similar principle to host groups
  • 31. Configuration Wizards •Pros •Great for getting up and running quickly •No need to learn how a plugin works •Cons •Creates individual services •More work later when enforcing “standards”
  • 32. Templates •Common settings applied to objects •Helps enforce standards •Reduces administrative overhead •Layer multiple templates •Can be additive or ignore inheritance •XI Config Wizard objects use templates •Example of common icmp check
  • 33. User Macros – resource.cfg •$USERx$ macros are good for common items like a username or password •Allows passwords with a ! exclamation mark •Values not visible in object definitions •$USER1$ •/usr/local/nagios/libexec
  • 34. Custom Object Variables •Allows you to create your own variables •Can be defined in host or service objects •e.g. hosts have their own check_nt password •Define _CHECK_NT_PASSWORD in host object •In command definitions reference it as: •$_HOSTCHECK_NT_PASSWORD$ •VERY POWERFUL!
  • 35. Other
  • 36. MRTG Clean Configs •Your MRTG configs may be collecting more than what you think •/etc/mrtg/conf.d/*.cfg files •Created by Network Switch / Router Wizard •Comment out unused ports •About 37 lines per port •Comment out unused non-interfaces (VLANs)
  • 37. Plugins – Compiled vs Scripts •Compiled runs quicker •Official nagios-plugins are compiled •“Custom modifications” require re-compiling •Scripts run slower, consume more resources •Perl plugins known to consume +CPU +RAM •“nice” can reduce impact of plugins •Check Profiler component by box293
  • 38. Backend API - Read Only User •API provides you with URLs for use in third party products without needing user/pass •Requires a user account to be created •Account should be READ ONLY
  • 39. Performance Data Tool •Component developed by box293 •Allows you to manipulate RRD files •Great for merging RRD data •Can also delete old RRD files for old services •View raw data in tables •Find it in the Nagios Exchange
  • 40. Thank you! What Is Your Best Practice? Any Questions?

Editor's Notes

  1. Hello everyone and welcome to my talk on Nagios XI Best Practices. First a little about me. I’ve been an independent contractor for Nagios Enterprises since June 2014. I provide tech support for our customers through our support system, and in the forums. Generally speaking when the USA techs are finishing their day, I’m starting mine. I’ve been using Nagios since 2009 when Nagios XI was released. I develop various Nagios and Nagios XI related projects in my spare time, you will see them published under the box293 handle. Since then I’ve spoken at several conferences and have been lucky enough to have received the MVP award three times.
  2. This talk is going to be about best practices. I will cover a range of topics such as how to get the most out of XI, things you wish you knew, and configuration practices. Strictly speaking, a best practice is a flexible statement. Depending on your environment and your needs some topics won’t apply and other topics may be exactly what you are after. The information in this talk is a reflection of Nagios XI deployments in the wild. Additionally, monitoring is not all about metrics and thresholds, it can also be helpful to ensure standards are enforced. If a setting gets changed you can be notified about it instead of having to track it down during one of your troubleshooting adventures. This presentation will be available after the conference to download, so don’t feel like you have to write down everything on the screen … however I do admit to taking photos of slides in presentations as sometimes I just can’t wait that long. Hopefully you should be able to walk away from this talk learning something … if not you should come and work for Nagios Enterprises!
  3. We’ll start off with looking at the Nagios XI server and how it is configured in the back end.
  4. To get the ball rolling, what does a license of XI entail? Three running instances of Nagios XI are allowed. The caveat to this is that only your Production system can be used for actual monitoring. We are flexible and understand your needs to actively have a test system that runs alongside production. This allows you to implement new checks into production with the confidence that they will work. Disaster recovery is another important factor for customers, once again you can have this instance up and running as part of your license.
  5. Probably the most important part of a monitoring system is knowing that it’s actually working. Nagios XI comes with a “Nagios XI Server” monitoring wizard to ensure everything is A-OK. However there is no point in monitoring itself, I mean if it’s down you’re not going to hear about it. Utilize your DR instance to monitor the production instance. This way, if the production instance goes down, you’ll receive alerts about it. The same applies for monitoring the DR instance, make sure production monitors the DR instance to make sure it’s healthy. By using the “Nagios XI Server” wizard to monitor the other instance, you can have confidence knowing that when something goes wrong, you’ll really hear about it.
  6. Nagios XI comes bundled with a handful of localhost services, however there are some additional localhost services we think you should add. There are a lot of moving parts to an XI server, so we think these are the most important items to be monitoring. The services in this list should be monitored to ensure they are running. The snmptrapd and snmptt services are not installed on an XI server by default, but they are present once the official SNMP Trap procedure has been followed.
  7. Nagios XI generates a LOT of small files, constantly. These files are created/deleted/updated every millisecond. The more you are monitoring, the more disk I/O this generates. Sometimes a service can stop and once this happens certain folders start to spool these files and before you know it there could be 100,000 unprocessed files on your system. This can quickly exhaust the free inodes on your disk. These directories should be monitored to make sure the files don’t increase past a certain number. If they do, you’ll be on top of it before it becomes a major problem. In some customer installations, it’s possible that the nagios user account expires. This isn’t always that obvious to troubleshoot, so checking that it hasn’t expired is a good precautionary measure.
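As an illustration of the directory check described in the note above, here is a minimal Python sketch of a file-count plugin. It is not the `check_file_count` plugin named on the slide; the function name `count_spool_files` and the thresholds are hypothetical. The only convention it relies on is the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the optional performance data after the pipe character.

```python
import os
import tempfile

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def count_spool_files(directory, warn, crit):
    """Return (exit_code, message) based on the number of files in a directory."""
    try:
        with os.scandir(directory) as entries:
            count = sum(1 for entry in entries if entry.is_file())
    except OSError as exc:
        # An unreadable or missing directory is reported as UNKNOWN.
        return UNKNOWN, f"UNKNOWN - cannot read {directory}: {exc}"

    # Performance data after the pipe allows the count to be graphed.
    perfdata = f"files={count};{warn};{crit};0"
    if count >= crit:
        return CRITICAL, f"CRITICAL - {count} files in {directory} | {perfdata}"
    if count >= warn:
        return WARNING, f"WARNING - {count} files in {directory} | {perfdata}"
    return OK, f"OK - {count} files in {directory} | {perfdata}"


if __name__ == "__main__":
    # Demonstration against a throwaway directory, not a real spool path.
    with tempfile.TemporaryDirectory() as spool:
        for i in range(5):
            open(os.path.join(spool, f"result{i}"), "w").close()
        code, message = count_spool_files(spool, warn=3, crit=10)
        print(code, message)
```

In a real deployment the directory and thresholds would come from the command definition, and the script would exit with the returned code so that Nagios can interpret the result.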
  8. If you’re not a Linux person then you probably don’t know about the system mailbox. This is a local mail system on the Linux server where messages are sometimes sent. Certain components used in Nagios XI, such as MRTG, will send messages to this mailbox when they have a problem. An incorrect MRTG configuration can cause a message to be sent every five minutes, as this is how often MRTG runs. That’s about 288 messages a day. Over time the root mailbox can grow to gigabytes in size, causing issues. I wrote a plugin which can report on this and let you know when it gets too big. The MySQL or MariaDB databases are important to the system. The change in name has to do with a split in the open source community, and MariaDB is present in CentOS and Red Hat 7. If tables crash and this goes undetected, it can have a severe impact on the system: you may not be storing important data, and it may cause strange problems. It just so happens that I wrote a plugin that will alert you if this happens. Another problem can occur if the database engine runs in a different timezone to the local system. Once again, here’s a plugin you can use to monitor this circumstance, which in this case is more of an auditing check.
  9. Load is important to keep tabs on, some services like NPCD will stop running when the system load exceeds a defined threshold. Physical Memory free is important and is tied to swap usage. If the system runs out of physical memory and starts swapping to disk, the system performance will be greatly impacted. Disk free space is very important. If you have different volumes mounted then you should be monitoring each one of these.
  10. Make sure your timezone is configured correctly AND the clock is synced with a trusted time source. An incorrect time can cause issues with databases, log files and performance data.
  11. Having the right amount of CPU cores is important but so too is the speed of those cores. Not all plugins and processes are multi-threaded, so a higher speed CPU is going to benefit. A 3.4GHz CPU will do a lot more than a 2.2GHz one.
  12. How much memory do you need on an XI system? When all the hosts and services in XI are healthy, the amount of memory used is far less compared to a major system outage. When XI fires off event handlers they consume memory, if there is a major outage and a lot of event handlers are being executed, a lot of memory is being consumed. It doesn’t take long for 6GB of memory to be used. Generally speaking you should have at least 50% more memory than needed.
  13. While on the topic of memory, configuring Nagios XI with a RAM Disk is highly recommended as the number of monitored objects increases. The more things you are monitoring, the more disk I/O occurs. By directing this traffic to a RAM Disk, each I/O operation completes drastically faster. We have an official procedure for you to follow to implement this.
  14. A solid state disk will also provide a dramatic performance boost to your Nagios XI server. Keep in mind, a RAM Disk is still recommended as you want to minimize the amount of writes to the SSD.
  15. RAID allows for much larger disk capacities than SSD can provide, however it would be very hard for a spinning disk RAID set to beat the performance of SSD. Keep in mind if you implement SSD you should implement RAID1 sets for redundancy purposes.
  16. rrdcached is a way of accumulating the received performance data and then processing it in a batch job. It helps with larger installations and can reduce I/O, however it can also result in performance graphs lagging behind the realtime results. We have an official procedure for you to follow to implement this.
  17. On larger installations there can be a lot more data being written to the databases, which in turn can result in a lot of CPU usage directed away from actual monitoring. Offloading to a separate server will remove this CPU usage from your monitoring server. Of course, make sure you monitor the offloaded server! We have an official procedure for you to follow to implement this.
  18. Mod-Gearman is a way of offloading the plugin execution to separate workers instead of the monitoring engine doing it. A worker can be on the XI server itself OR on external hosts. External hosts need all the plugins that are going to be executed installed on them. Also, be aware of plugins that create temporary files; these don’t work well if the plugins are moving about the workers. You can also use host groups for directing checks to only be executed by specific workers, which is handy for multi-site setups. XI 2014 uses Core 4.x, which now has its own workers. Using Mod-Gearman on an XI 2014 server just for the purpose of a local worker is not required, however if you need external workers then Mod-Gearman is the solution for you. We have an official procedure for you to follow to implement this.
  19. I won’t go into this topic in detail as Andy Brist did a great talk on it at last year’s conference. What you need to define is what is important to you in a disaster. Once you have clearly defined goals and outcomes you can plan appropriately and test.
  20. Have you scheduled your backups in Nagios XI? Storing them on storage that is not local to the XI file system is important; make sure you can get to your backups if your XI server dies.
  21. The backup and restore procedure is very straightforward and allows for a full recovery of your Nagios XI system. Another good use of it is to migrate XI from one server to another.
  22. Now let’s look at Nagios configuration practices.
  23. When a host goes down HARD, it will prevent service notifications from being sent, saving unnecessary alerts. A common mistake when setting up your monitoring intervals is to leave the host intervals the same as the service intervals. What this can lead to is the host’s services going into a HARD critical state before the host does. Making sure the host goes into a HARD down state before its services ensures the service notifications will be suppressed.
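The timing argument in the note above can be sketched in Python. This is a deliberately simplified model, not how Nagios Core schedules checks internally: it assumes the host and service both fail at the same moment, and that each object reaches a HARD state after `max_check_attempts` failed checks spaced `retry_interval` minutes apart. The example values are hypothetical.

```python
def minutes_to_hard(max_check_attempts, retry_interval):
    """Minutes from the first failed check until the HARD state is reached.

    The first failure counts as attempt 1, so only the remaining
    attempts have to wait for a retry interval each.
    """
    return (max_check_attempts - 1) * retry_interval


def host_goes_hard_first(host, service):
    """True when the host reaches HARD strictly before the service does."""
    return minutes_to_hard(**host) < minutes_to_hard(**service)


# Hypothetical settings: identical host and service timing versus a host
# that uses fewer attempts than its services.
same = {"max_check_attempts": 5, "retry_interval": 1}
fast_host = {"max_check_attempts": 3, "retry_interval": 1}

print(minutes_to_hard(**same))                # 4
print(host_goes_hard_first(same, same))       # False: a race, alerts may leak
print(host_goes_hard_first(fast_host, same))  # True: host is HARD after 2 minutes
```

With identical settings neither object is guaranteed to win the race, which is the situation the note warns about; reducing the host's attempts or retry interval gives it a clear head start.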
  24. When a host goes down, the services still get executed and can result in services in an unknown or critical state. Nagios suppresses any notifications, however they still appear as critical in the interface. Sometimes a host can be up but the monitoring agent can be down. An example of this is an NRPE agent. By using service dependencies, if the master service goes down, you prevent notifications from being sent OR prevent checks from being executed. Either option simply pushes the next check or notification to the next interval. However, very similar to the previous slide, make sure your master service goes down HARD before the dependent services by using different check intervals or retries.
  25. An upcoming feature of Nagios Core 4.1.0 is a configuration directive called host_down_disable_service_checks. It’s best described as automatic service dependencies on their own hosts. This applies across the board, it is not granular. It can reduce the load on the XI host, as plugins will not be executed if the host is down. Keep in mind that if the host is down then any defined service dependencies will be ignored.
  26. It can be very easy to set up your monitoring with the same intervals across the board. This can lead to peaks and troughs in load on the XI server, as a lot of checks can occur in the same time windows. Have a think about what you are monitoring and how often you really need to check it. Something like disk usage rarely runs out quickly, so you can monitor this every hour and be confident you’ll be notified about the free disk space running low in a reasonable time. However if you are going to make it every hour, why not every 58 minutes or 61 minutes? Try to spread the load out a bit.
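To see why slightly different intervals smooth out the load, here is a small Python sketch. It is a toy model under stated assumptions: every check starts at minute zero and runs exactly on schedule, which ignores the spreading that the monitoring engine already applies when it first schedules checks. The check counts (300 checks, split three ways) are made up for illustration.

```python
from collections import Counter


def busiest_minute(intervals, horizon=24 * 60):
    """Return the largest number of checks that fall in any single minute.

    `intervals` holds one check interval in minutes per service, and
    `horizon` is the number of minutes simulated (one day by default).
    """
    per_minute = Counter()
    for interval in intervals:
        for minute in range(interval, horizon + 1, interval):
            per_minute[minute] += 1
    return max(per_minute.values())


# 300 hourly checks, either all identical or split across 58/60/62 minutes.
identical = [60] * 300
staggered = [58] * 100 + [60] * 100 + [62] * 100

print(busiest_minute(identical))  # 300 checks land in the same minute
print(busiest_minute(staggered))  # 100, the three groups never coincide in a day
```

In this model the three staggered groups never line up within a day, so the worst minute carries a third of the checks instead of all of them.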
  27. Sometimes larger check intervals can have an adverse effect on notification intervals. The monitoring engine determines if it should send a notification every time a check result is received. Due to how the internal scheduling works, you might fall short of the notification window by a small time period like 20 seconds. This means it might be another 15 minutes until the next check is run, and that’s when the notification will be sent.
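The arithmetic from the corresponding slide (a 15 minute check, a 60 minute notification interval, and checks that actually land 14 minutes 55 seconds apart) can be reproduced with a short Python sketch. It assumes, as the note describes, that a repeat notification can only go out when a check result arrives and the full notification interval has passed since the last one; the function name is hypothetical.

```python
def next_notification(check_gap_s, notification_interval_s):
    """Seconds after a notification until the first check that may renotify.

    Steps forward one check at a time until the notification interval
    has fully elapsed.
    """
    elapsed = 0
    while elapsed < notification_interval_s:
        elapsed += check_gap_s
    return elapsed


gap = 14 * 60 + 55      # checks land 14 min 55 s apart
interval = 60 * 60      # 60 minute notification interval

delay = next_notification(gap, interval)
print(divmod(delay, 60))                             # (74, 35)
print(divmod(next_notification(15 * 60, interval), 60))  # (60, 0)
```

The fourth check arrives at 59 min 40 s, which is 20 seconds short of the window, so the repeat notification waits for the fifth check at roughly 75 minutes. With checks exactly 15 minutes apart, the same function returns 60 minutes.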
  28. Using hostgroups in your service definitions is one of the most powerful features of Nagios. Common services generally have the same threshold for all hosts. Instead creating individual services for each host you monitor, a service can be assigned to multiple hosts using a hostgroup. What this means is that you only need to have the service defined once, and when you want to tweak the thresholds, you only need to change it in one location and all hosts will receive the updated thresholds. So if you have a host group called windows_servers, whenever you add a new windows server it’s just a matter of adding that server to the hostgroup and voila, that host gets that bunch of common checks. This is great for consistent monitoring and it ensures standards get applied.
  29. One of the most common support questions we get asked is how to add a contact to, or remove a contact from, a bunch of objects. If you don’t have the Enterprise Edition license then you don’t get access to the bulk modification tool that allows you to do this. However, assigning individual contacts to objects is a flawed approach. It’s very easy to make mistakes, and before you know it a notification was not sent to the correct people. Using a contactgroup is a much better method. It’s so much easier to go in and add or remove a contact from a contactgroup, and instantly all the objects that use this group are updated. Even if a contactgroup has only one member, it still makes administration so much easier. If you’ve not activated the trial of Enterprise Edition, this is a great way to make use of the bulk modification tool: implement contactgroups and remove the individual contacts. Once your standard is in place, administration will be so much easier.
  30. Configuration wizards are how I first really got involved in Nagios XI. I really liked how you could step through monitoring a particular device and at the end of it all the configurations were created for you. I saw that as an ice breaker for people who are new to Nagios: you didn’t need to learn how a plugin worked or how to create a command definition followed by service definitions, you just pointed and clicked. The downside to wizards is that they create a lot of services. In a large scale monitoring environment, you might use services that are applied to hostgroups, which reduces administrative overhead. Wizards don’t really suit these environments; however, they are a great primer for setting up initial services, and from there you can modify them in CCM.
  31. Templates are very powerful when used for the right purposes. A really good example is how the XI Configuration Wizards use templates for the host objects. The host object template has a standard ICMP up/down check. This means if you ever wanted to change the thresholds, you could change the template and all hosts using that template would get the updated check. You can use multiple templates in a layering fashion. As Nagios Core reads the object definition, it works through the templates in the order they are listed and combines their settings to build the final object; where two templates set the same directive, the template listed first wins. Object directives can be set to inherit a setting from a template, or ignore it. Other settings can be additive, like hosts, hostgroups, contacts and contactgroups. For example, you might have a master template that defines the base settings all services should use. However, you have a bunch of service checks that require a specific time period. Create a separate template that uses this time period and put that template at the top of the chain. The final service object that is created will use the specific time period. You can even create an empty template that uses a combination of other templates; this way you can use the master template across all your objects and easily add or remove other templates from it, in turn reducing your administrative overhead. Be careful not to add more administrative overhead though.
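The layering can be sketched roughly as below. This is a simplified model, assuming that the first template listed takes precedence (which matches the advice to put the overriding template at the top of the chain) and ignoring additive inheritance. The template names and the service are hypothetical.

```python
def resolve(obj, templates):
    """Build the final object from its template chain; earlier templates win."""
    resolved = {}
    # Apply the chain from last to first so that earlier templates overwrite later ones.
    for name in reversed(obj.get("use", [])):
        resolved.update(resolve(templates[name], templates))
    # Directives set directly on the object override anything inherited.
    resolved.update({key: value for key, value in obj.items() if key != "use"})
    return resolved


# Hypothetical templates: a master template and a time period override.
templates = {
    "master-service": {"check_period": "24x7", "max_check_attempts": 3},
    "business-hours": {"check_period": "workhours"},
}

service = {"use": ["business-hours", "master-service"], "service_description": "Backup Job"}

final = resolve(service, templates)
print(final)
```

The final object takes its check period from the template at the top of the chain and everything else from the master template.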
  32. User macros are a way of storing and referencing common items such as usernames and passwords. Because you reference the value as a macro, the actual value is not visible in the object definitions. It also allows special characters such as an exclamation mark to be used. Normally, when an exclamation mark appears in a check_command directive, its purpose is to separate the arguments, so storing the value in a user macro works around the problem.
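The exclamation mark problem can be sketched as follows. This is a simplified model that assumes the arguments are split on `!` before macros are expanded; the command name `check_example`, the macro number and the password are hypothetical.

```python
def split_check_command(check_command):
    """Split a check_command value into the command name and its arguments."""
    name, *arguments = check_command.split("!")
    return name, arguments


def expand_user_macros(arguments, user_macros):
    """Replace user macros in arguments that have already been split."""
    expanded = []
    for argument in arguments:
        for macro, value in user_macros.items():
            argument = argument.replace(macro, value)
        expanded.append(argument)
    return expanded


# Hypothetical password containing an exclamation mark.
user_macros = {"$USER5$": "pa!ss"}

# Written literally, the password is split into two arguments.
name_literal, args_literal = split_check_command("check_example!admin!pa!ss")

# Stored in a user macro, the password stays in one piece.
name_macro, raw_args = split_check_command("check_example!admin!$USER5$")
args_macro = expand_user_macros(raw_args, user_macros)

print(args_literal, args_macro)
```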
  33. Custom object variables are one of the lesser known features of Nagios. They allow you to define your own variables to use in your object definitions, which makes Nagios very flexible. A good example is if each Windows host had its own check_nt password. You can store that password in the host object and then reference it from your service objects. It also means that you can still have just one command that can be used by many hosts, reducing administrative overhead.
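A rough sketch of the idea, assuming a host custom variable named `_NTPASSWORD` that is referenced as `$_HOSTNTPASSWORD$`. The host, address and password are hypothetical, and the substitution below is a simplified model rather than the engine's own macro processing.

```python
def host_macros(host):
    """Build the macros available for a host, including its custom variables."""
    macros = {"$HOSTADDRESS$": host["address"]}
    for key, value in host.items():
        # Custom variables start with an underscore and are referenced as $_HOST<NAME>$.
        if key.startswith("_"):
            macros["$_HOST" + key[1:].upper() + "$"] = value
    return macros


def render(command_line, host):
    """Substitute host macros into a command line."""
    for macro, value in host_macros(host).items():
        command_line = command_line.replace(macro, value)
    return command_line


# Hypothetical host carrying its own check_nt password.
host = {"host_name": "win01", "address": "192.0.2.10", "_NTPASSWORD": "s3cret"}

# One shared command line serves every host; only the macro values differ.
command_line = "check_nt -H $HOSTADDRESS$ -s $_HOSTNTPASSWORD$ -v UPTIME"
rendered = render(command_line, host)
print(rendered)
```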
  34. Finally here are some other items which you may be interested in.
  35. In Nagios XI, the Network Switch / Router wizard uses MRTG to collect the monitoring data from the network device. The configuration files for MRTG are created with the program cfgmaker. While you may have selected only a handful of ports to monitor on your network device, MRTG will collect data from all the interfaces. This creates extra network traffic and I/O. You can edit these MRTG config files and comment out the ports for which you do not need data to be collected. Each port consists of about 37 lines in total. You’ll also find non-interfaces like VLANs; these can be commented out too, unless of course you want to monitor them.
  36. Plugins can be either compiled or scripts, or perhaps a combination of both. The plugins that are installed as part of the nagios-plugins package are compiled. This means that if you want to modify them you need to modify the source code and recompile them, which can be tedious. However, these plugins generally consume fewer resources. A lot of plugins you can download from the Nagios Exchange are script based, written in bash, Perl, Python and so on. These plugins can be modified on the fly, but can consume a lot more resources, which in turn can increase the load on your XI server. One method of reducing the impact of plugins is to prepend them with the nice command. Using nice will run a plugin at a lower priority; it will still consume the same amount of resources, but not at the expense of other more important processes. Using Mod-Gearman is another option to offload resource intensive plugins. Try the Check Profiler component I created; it gives a quick report of how long plugins take to run and also the latency.
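Prepending nice amounts to putting it in front of the plugin's command line. A minimal sketch, assuming a hypothetical script plugin `check_example.sh` and an arbitrary niceness adjustment of 10:

```python
def with_nice(plugin_command, adjustment=10):
    """Return the plugin command prefixed with nice so it runs at a lower priority."""
    return ["nice", "-n", str(adjustment), *plugin_command]


# Hypothetical script plugin and thresholds, for illustration only.
plugin_command = ["/usr/local/nagios/libexec/check_example.sh", "-w", "80", "-c", "90"]
command = with_nice(plugin_command)
print(" ".join(command))
```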
  37. The Nagios XI backend API allows you to generate URLs to use in third party products to access Nagios without requiring a username or password. This requires a user account to be created to generate the URLs. It’s important to create this user account as a read only account to prevent any unintended access to Nagios XI.
  38. The Performance Data Tool is a component I developed that allows you to manipulate and interrogate RRD files. It’s particularly useful for merging performance data from one service to another. Perhaps a service was renamed and the data is now stored in a different RRD file; this lets you merge the old data into the new file. Another great feature is that you can view the performance data in a table format, which is sometimes more useful than graphs. Finally, it can be handy for finding old RRD files which could be deleted from the system to reclaim disk space. You can download it from the Nagios Exchange.
  39. That’s about it. How about some questions, or perhaps tell us what your best practice is?