This document covers monitoring best practices and examples. It recommends monitoring systems and services proactively rather than reacting to issues after they occur, which improves customer satisfaction and supplies the metrics needed to justify changes. It looks at how to choose a monitoring system, the constraints to weigh (time, money, and resources), and examples of state and process monitoring using tools like Nagios, PowerShell, and vRealize. Best practices discussed include automating monitoring configuration, setting sane thresholds, and generating only actionable alerts.
2. WHY MONITOR?
• Proactive IT vs. Reactive IT
• Metrics collection
• Change justification
• Increased internal/external customer satisfaction
3. WHAT DO WE CHOOSE?
https://en.wikipedia.org/wiki/Comparison_of_network_monitoring_systems
4. WHAT DO WE CHOOSE?
• Many shapes, sizes and costs
• What are you planning to monitor?
• Do I need collated historical data?
• How much time? Money? Resources?
  • Often inversely related
• One size fits all or multiple systems?
• Be prepared for mediocrity and workarounds
6. TWO TYPES OF MONITORING
• State Monitoring
  • Where are we right now?
  • CPU usage, memory usage, disk space, etc.
• Process Monitoring
  • Logical chain of steps to complete a task
  • Student registration
  • Website content updates
  • Do I have an internet connection?
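A process check can be sketched as an ordered chain of tests that stops at the first failure, so the alert names the step that broke rather than just reporting "down". The block below is a minimal illustration, not part of the original slides: the function name Test-ProcessChain, the step names, and the placeholder step bodies are all assumptions.

```powershell
# Hypothetical helper: run an ordered chain of checks and stop at the first failure.
Function Test-ProcessChain {
    Param(
        # Each step is a hashtable with a Name and a Test scriptblock returning $true or $false
        [Parameter(Mandatory)] [hashtable[]]$Steps
    )
    ForEach ( $Step in $Steps ) {
        $Passed = $false
        Try {
            $Passed = [bool]( & $Step.Test )
        }
        Catch {
            # A step that throws counts as a failure
            $Passed = $false
        }
        If ( -not $Passed ) {
            Return [pscustomobject]@{ Healthy = $false; FailedStep = $Step.Name }
        }
    }
    Return [pscustomobject]@{ Healthy = $true; FailedStep = $null }
}

# Illustrative chain for "Do I have an internet connection?".
# The step bodies are placeholders (assumptions); swap in real probes for your environment.
$InternetChain = @(
    @{ Name = "Network adapter is up";   Test = { $true } },
    @{ Name = "Default gateway answers"; Test = { $true } },
    @{ Name = "DNS resolves a name";     Test = { $false } },
    @{ Name = "External site responds";  Test = { $true } }
)

$Result = Test-ProcessChain -Steps $InternetChain
"Healthy: $($Result.Healthy); failed step: $($Result.FailedStep)"
```

A state check, by contrast, is a single reading compared against a threshold; the chain only adds value when the order of the steps carries meaning.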
7. EXAMPLES
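• Monitor a Windows service with a PowerShell scheduled task
# Email an alert when the named Windows service is not running.
Function Watch-ServiceStatus {
    Param(
        [string]$Name,
        [bool]$Notify = $true
    )
    $From       = "Service Status Notification <noreply@domain.com>"
    $To         = "jsmith@domain.com"
    $SmtpServer = "mail.domain.com"
    # Query the current state of the service
    $ServiceStatus = ( Get-Service -Name $Name ).Status
    If ( $ServiceStatus -ne "Running" -and $Notify ) {
        # The trailing backtick continues the command on the next line
        Send-MailMessage -To $To -From $From -SmtpServer $SmtpServer `
            -Subject "Warning: $Name service is not running"
    }
}
# Run from a scheduled task to check the Netlogon service
Watch-ServiceStatus -Name Netlogon -Notify $true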
11. BEST PRACTICES
• Work with application owners to develop sane thresholds
• Be prepared for thresholds to change
• Automation!
  • Configuring monitoring is a tedious task to complete by hand
  • Configuration management
• Create modular and reusable template systems
  • Example template layers: windows, 2012r2, prod, print-server
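One way to picture modular templates is as layers of settings merged from most general to most specific, with the result feeding a simple threshold check. The sketch below reuses the four names from the slide (windows, 2012r2, prod, print-server) as layer names. Everything else is an assumption made for illustration: the setting names, the threshold numbers, and the functions Merge-MonitoringTemplate and Get-AlertLevel. It does not reflect how any particular monitoring product stores its templates.

```powershell
# Hypothetical template layers, from most general to most specific.
# All setting names and numbers here are illustrative assumptions.
$Templates = [ordered]@{
    "windows"      = @{ CpuWarning = 80; CpuCritical = 90; DiskWarning = 80; DiskCritical = 90 }
    "2012r2"       = @{ }
    "prod"         = @{ CpuWarning = 70; CpuCritical = 85 }
    "print-server" = @{ Services = @( "Spooler" ) }
}

# Merge the named layers in order; later layers override earlier ones.
Function Merge-MonitoringTemplate {
    Param(
        [Parameter(Mandatory)] [string[]]$Layers,
        $Library
    )
    $Effective = @{}
    ForEach ( $Layer in $Layers ) {
        $Settings = $Library[$Layer]
        ForEach ( $Key in $Settings.Keys ) {
            $Effective[$Key] = $Settings[$Key]
        }
    }
    Return $Effective
}

# Classify a measured value against warning and critical thresholds.
Function Get-AlertLevel {
    Param( [double]$Value, [double]$Warning, [double]$Critical )
    If ( $Value -ge $Critical ) { Return "CRITICAL" }
    If ( $Value -ge $Warning )  { Return "WARNING" }
    Return "OK"
}

$Config = Merge-MonitoringTemplate -Layers "windows", "2012r2", "prod", "print-server" -Library $Templates
Get-AlertLevel -Value 75 -Warning $Config.CpuWarning -Critical $Config.CpuCritical
```

Because thresholds live in one layer, a change agreed with an application owner is made once (here, in the "prod" layer) and every host built from that layer picks it up.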
12. BEST PRACTICES CONTINUED
• Generate only actionable alerts
  • Avoid alert fatigue; don't be “The Boy Who Cried Wolf”
• Be accountable for alerts
• Digest raw data into something usable
  • We’re still working on this one!
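Two of the points above lend themselves to a small sketch: alerting only on a sustained breach rather than a single spike, and condensing raw samples into a short summary. The block below is one possible approach under stated assumptions, not the presenters' method. The function name Test-SustainedBreach, the default of three consecutive samples, and the CPU readings are all hypothetical.

```powershell
# Hypothetical helper: alert only when the most recent samples ALL breach the threshold.
Function Test-SustainedBreach {
    Param(
        [Parameter(Mandatory)] [double[]]$Samples,   # oldest first
        [Parameter(Mandatory)] [double]$Threshold,
        [int]$Consecutive = 3                        # assumed default; tune per service
    )
    # Not enough data yet to call it sustained
    If ( $Samples.Count -lt $Consecutive ) { Return $false }
    $Recent   = $Samples | Select-Object -Last $Consecutive
    $Breaches = @( $Recent | Where-Object { $_ -ge $Threshold } )
    Return ( $Breaches.Count -eq $Consecutive )
}

# Illustrative CPU readings in percent, oldest first
$Cpu   = 35, 97, 40, 42, 91, 93, 96
$Early = $Cpu[0..3]

# Digest the raw samples into a one-line summary
$Summary = $Cpu | Measure-Object -Average -Maximum -Minimum
"CPU avg {0:N1}%, min {1}%, max {2}%" -f $Summary.Average, $Summary.Minimum, $Summary.Maximum

# The last three readings (91, 93, 96) all breach 90, so this one is actionable
Test-SustainedBreach -Samples $Cpu -Threshold 90
# The early readings contain a single spike (97), which is not treated as actionable
Test-SustainedBreach -Samples $Early -Threshold 90
```

Requiring several consecutive breaches trades a slightly later alert for fewer false alarms; the right count depends on how often the value is sampled.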