IP SLA with Syslog Alerting
Most network outages are not detected by the router that experiences them — they are detected by an end user who calls the help desk, or by a NOC engineer refreshing a dashboard. IP SLA (Internet Protocol Service Level Agreement) changes this by turning the router itself into an active probe: it continuously sends synthetic test traffic to a target, measures reachability and performance, and declares the target either reachable or unreachable based on those measurements. Object Tracking watches the IP SLA result and translates probe outcomes into a binary state — Up or Down. EEM applets subscribe to that state and fire the moment a transition occurs, generating syslog alerts, capturing diagnostics, and sending email notifications — all before any human has opened a dashboard.
This lab assembles that complete monitoring pipeline on a single router. It covers four progressively more capable probe types: ICMP echo for basic reachability, HTTP for application-layer connectivity, UDP jitter for voice-quality monitoring, and DNS for resolver availability. Each probe is wired to a tracking object and an EEM applet that generates an alert on failure and a recovery notification when the target returns. The result is a lightweight, self-contained WAN monitoring system that requires no external NMS, no SNMP collector, and no subscription.
Before starting, review IP SLA probe types and tracking in IP SLA Configuration & Tracking. For the EEM applet architecture used in this lab, see EEM — Embedded Event Manager Scripting. To forward alerts to a central syslog server, see Syslog Configuration and Syslog Server Configuration. For the syslog severity levels referenced in the EEM actions, see Syslog Severity Levels.
1. IP SLA + Track + EEM — The Full Pipeline
How the Three Components Work Together
┌─────────────────────────────────────────────────────────────┐
│ LAYER 1: IP SLA PROBE                                       │
│ Sends synthetic test traffic to the target on a schedule    │
│ Records: RTT, packet loss, jitter, return code              │
│ Declares: operation success or failure                      │
│                                                             │
│  ip sla 1                                                   │
│   icmp-echo 203.0.113.1 source-interface GigabitEthernet0/0 │
│   frequency 30                                              │
│  ip sla schedule 1 life forever start-time now              │
└───────────────────────┬─────────────────────────────────────┘
                        │ passes result (success/fail)
                        ▼
┌─────────────────────────────────────────────────────────────┐
│ LAYER 2: OBJECT TRACKING                                    │
│ Watches the IP SLA operation result                         │
│ Maintains binary state: Up (probe succeeding) or            │
│ Down (probe failing)                                        │
│ Applies reachability or threshold criteria                  │
│                                                             │
│  track 1 ip sla 1 reachability                              │
└───────────────────────┬─────────────────────────────────────┘
                        │ notifies subscribers on state change
                        ▼
┌─────────────────────────────────────────────────────────────┐
│ LAYER 3: EEM APPLET (event track 1 state down)              │
│ Fires the instant track 1 transitions to Down               │
│ Actions:                                                    │
│  1. Send syslog CRITICAL alert to log buffer + server       │
│  2. Capture show ip sla statistics to flash                 │
│  3. Capture show ip route to flash                          │
│  4. Send email to NOC team                                  │
│ Companion applet on state up:                               │
│  1. Send syslog NOTICE: target recovered                    │
└─────────────────────────────────────────────────────────────┘
                        │
                        ▼
            Syslog server / NOC inbox
            (alert arrives within seconds
            of probe declaring target down)
IP SLA Probe Types Covered in this Lab
| Probe Type | IOS Keyword | What It Measures | Requires Responder? | Typical Use Case |
|---|---|---|---|---|
| ICMP Echo | icmp-echo | Round-trip time (RTT) and reachability to any IP-addressable target | No: target only needs to respond to ICMP (ping) | WAN gateway reachability, ISP monitoring, basic link health |
| UDP Jitter | udp-jitter | RTT, jitter (delay variation), packet loss, and out-of-order delivery (full VoIP quality metrics) | Yes: Cisco IP SLA Responder must be enabled on the target router | Voice quality monitoring, MPLS SLA verification, QoS validation |
| HTTP | http get | HTTP GET response time and HTTP return code (application-layer reachability) | No: any web server | Web application availability, DNS + HTTP end-to-end testing |
| DNS | dns | DNS resolution time and success/failure for a specific hostname | No: standard DNS server | DNS resolver availability monitoring, split-horizon DNS validation |
| TCP Connect | tcp-connect | TCP three-way handshake completion time to a specific IP and port | No: any TCP server (port 80, 443, 22, etc.) | Application port availability: verify SSH, HTTPS, or custom services are accepting connections |
Tracking Object Types
| Track Type | IOS Syntax | State Goes Down When | Use Case |
|---|---|---|---|
| Reachability | track N ip sla N reachability | The IP SLA operation returns anything other than OK or Over threshold (for example Timeout or an unreachable target) | Binary up/down monitoring: did the probe get an answer at all? |
| State | track N ip sla N state | The IP SLA operation returns anything other than OK, including Over threshold | Threshold-based monitoring: did RTT exceed the configured threshold? |
2. Lab Topology & Monitoring Plan
NetsTuts_R1 is a dual-WAN edge router with a primary ISP (Gi0/0) and backup ISP (Gi0/1). Four monitoring probes will be deployed: a WAN gateway ICMP probe on each ISP, a UDP jitter probe to the branch office, and an HTTP probe to a critical internal web server. Each has a dedicated tracking object and a pair of EEM applets (down alert + up recovery).
┌──────────────────────────────────────────────────────────┐
│ NetsTuts_R1                                              │
│ Gi0/0 ── 203.0.113.2  ── ISP-A Gateway: 203.0.113.1      │
│ Gi0/1 ── 198.51.100.2 ── ISP-B Gateway: 198.51.100.1     │
│ Gi0/2 ── 10.0.0.1/24  ── LAN                             │
└──────────────────────────────────────────────────────────┘

IP SLA Monitoring Plan:
┌──────┬──────────────┬──────────────────────────────┬─────────┐
│ SLA# │ Probe Type   │ Target                       │ Track # │
├──────┼──────────────┼──────────────────────────────┼─────────┤
│ 1    │ ICMP echo    │ ISP-A Gateway 203.0.113.1    │ 1       │
│ 2    │ ICMP echo    │ ISP-B Gateway 198.51.100.1   │ 2       │
│ 3    │ UDP jitter   │ Branch Router 10.10.0.1      │ 3       │
│ 4    │ HTTP get     │ Web Server http://10.0.0.50  │ 4       │
└──────┴──────────────┴──────────────────────────────┴─────────┘

EEM Applet Pairs (down alert + up recovery):
┌─────────────────────────────┬────────────────────────────────┐
│ ISPA-GW-DOWN / ISPA-GW-UP   │ Track 1 state transitions      │
│ ISPB-GW-DOWN / ISPB-GW-UP   │ Track 2 state transitions      │
│ BRANCH-JITTER-DOWN / -UP    │ Track 3 state transitions      │
│ WEBSERVER-DOWN / -UP        │ Track 4 state transitions      │
└─────────────────────────────┴────────────────────────────────┘
3. Step 1 — EEM Prerequisites and Environment Variables
Configure the global EEM prerequisites before writing any applets. These settings are shared across all four monitoring pairs.
NetsTuts_R1>en
NetsTuts_R1#conf t
! ══════════════════════════════════════════════════════════
! EEM CLI execution user - required for action cli command
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#username eem-user privilege 15 secret EEM$ecret99
NetsTuts_R1(config)#event manager session cli username eem-user
! ══════════════════════════════════════════════════════════
! EEM environment variables - centralised parameters
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#event manager environment _hostname NetsTuts_R1
NetsTuts_R1(config)#event manager environment _email_server 10.0.0.25
NetsTuts_R1(config)#event manager environment _email_from [email protected]
NetsTuts_R1(config)#event manager environment _email_to [email protected]
NetsTuts_R1(config)#event manager environment _log_dir flash:/sla-logs/
NetsTuts_R1(config)#end
! ── Create the log directory on flash ─────────────────────
NetsTuts_R1#mkdir flash:/sla-logs
Create directory filename [sla-logs]? [Enter]
Created dir flash:/sla-logs
! ── Verify NTP is synchronised - timestamps must be accurate
! ── for syslog correlation during outage post-mortems ─────
NetsTuts_R1#show ntp status | include Clock
Clock is synchronized, stratum 2, reference is 10.0.0.200
Confirm that the clock is synchronised with show ntp status before deploying probes. For NTP configuration, see NTP Synchronisation. For the full EEM prerequisites explanation, see EEM — Embedded Event Manager Scripting.
4. Step 2 — ICMP Echo Probes for WAN Gateway Monitoring
ICMP echo probes send periodic pings to the ISP gateway from a specific source interface. Specifying source-interface ties each probe to the WAN-facing interface's address, so the result is meant to reflect that WAN path rather than whatever route the routing table happens to prefer.
NetsTuts_R1#conf t
! ══════════════════════════════════════════════════════════
! IP SLA 1 - ISP-A gateway reachability (primary WAN)
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#ip sla 1
NetsTuts_R1(config-ip-sla)# icmp-echo 203.0.113.1 source-interface GigabitEthernet0/0
NetsTuts_R1(config-ip-sla-echo)# frequency 30
NetsTuts_R1(config-ip-sla-echo)# timeout 5000
NetsTuts_R1(config-ip-sla-echo)# threshold 2000
NetsTuts_R1(config-ip-sla-echo)# tag ISP-A-GW-MONITOR
NetsTuts_R1(config-ip-sla-echo)#exit
NetsTuts_R1(config)#ip sla schedule 1 life forever start-time now
! ══════════════════════════════════════════════════════════
! IP SLA 2 - ISP-B gateway reachability (backup WAN)
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#ip sla 2
NetsTuts_R1(config-ip-sla)# icmp-echo 198.51.100.1 source-interface GigabitEthernet0/1
NetsTuts_R1(config-ip-sla-echo)# frequency 30
NetsTuts_R1(config-ip-sla-echo)# timeout 5000
NetsTuts_R1(config-ip-sla-echo)# threshold 2000
NetsTuts_R1(config-ip-sla-echo)# tag ISP-B-GW-MONITOR
NetsTuts_R1(config-ip-sla-echo)#exit
NetsTuts_R1(config)#ip sla schedule 2 life forever start-time now
frequency 30 sends one probe every 30 seconds, a reasonable balance between detection speed and ICMP overhead on the WAN link. timeout 5000 declares the probe a failure if no response arrives within 5,000 milliseconds (5 seconds). threshold 2000 marks the RTT as over-threshold when it exceeds 2,000 ms; this feeds the track N ip sla N state tracking type, which can alert on latency degradation before complete packet loss occurs. source-interface matters: without it, IOS sources the probe from whichever exit interface the routing table selects. If the WAN link fails but another route to the gateway still exists (rare but possible), the probe could succeed over that other path and report a false healthy result for the WAN interface. Note that source-interface fixes the source address only; the routing table still chooses the egress interface, so the route to the probe target should point out of the same WAN interface.
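The relationship between the three timers can be checked mechanically. The sketch below assumes the commonly cited guideline that threshold should not exceed timeout and timeout should not exceed the probe frequency; the function name check_sla_timers is illustrative and not part of IOS.

```python
def check_sla_timers(frequency_s: int, timeout_ms: int, threshold_ms: int) -> list[str]:
    """Return a list of problems with an IP SLA timer set (empty list = consistent)."""
    problems = []
    if threshold_ms > timeout_ms:
        # A reply slower than the threshold would already have timed out.
        problems.append("threshold exceeds timeout: over-threshold can never be reported")
    if timeout_ms > frequency_s * 1000:
        # The next probe would start before the previous one has been judged.
        problems.append("timeout exceeds frequency: probes could overlap")
    return problems


# Values used for IP SLA 1 and IP SLA 2 in this lab
print(check_sla_timers(frequency_s=30, timeout_ms=5000, threshold_ms=2000))
```

An empty list means the timers are ordered sensibly; each string in a non-empty list names one ordering violation.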
Tracking Objects for ICMP Probes
! ── Track 1: reachability of ISP-A gateway ────────────────
NetsTuts_R1(config)#track 1 ip sla 1 reachability
NetsTuts_R1(config-track)# delay down 10 up 10
NetsTuts_R1(config-track)#exit
! ── Track 2: reachability of ISP-B gateway ────────────────
NetsTuts_R1(config)#track 2 ip sla 2 reachability
NetsTuts_R1(config-track)# delay down 10 up 10
NetsTuts_R1(config-track)#exit
The delay down 10 up 10 setting introduces a 10-second hold-down in both directions. Without it, the tracking object flips to Down, and the EEM alert fires, the moment a probe fails. With delay down 10, the object transitions to Down only if the probe result is still a failure 10 seconds after the failure was first reported. Because probes run every 30 seconds, the result cannot change inside that 10-second window, so this value delays the alert but does not by itself absorb a single dropped probe (caused, for example, by a momentary burst of congestion or brief ICMP rate-limiting on the ISP router). Riding through one lost probe requires a delay longer than the probe frequency, for example delay down 35. The delay up 10 keeps a flapping link from generating rapid successive Down/Up alert pairs before it has truly stabilised. For detailed IP SLA and tracking configuration, see IP SLA Configuration & Tracking.
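To make the timing concrete, the following sketch estimates how long after a real failure the tracking object goes Down. It uses a simplified model (one probe per frequency interval, a lost probe declared failed after the timeout, then the delay down hold-down); the function name detection_window is an assumption for illustration.

```python
def detection_window(frequency_s: float, timeout_ms: float, delay_down_s: float) -> tuple[float, float]:
    """Best- and worst-case seconds from link failure to the track object going Down."""
    timeout_s = timeout_ms / 1000
    best = timeout_s + delay_down_s                  # link fails just before a probe is sent
    worst = frequency_s + timeout_s + delay_down_s   # link fails just after a probe succeeded
    return best, worst


best, worst = detection_window(frequency_s=30, timeout_ms=5000, delay_down_s=10)
print(f"track 1 goes Down between {best:.0f} s and {worst:.0f} s after the failure")
```

With the lab values the window is 15 to 45 seconds, which is the figure to quote when agreeing alerting expectations with the NOC.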
5. Step 3 — UDP Jitter Probe for Voice Quality Monitoring
UDP jitter probes measure the metrics that matter for voice and video quality: jitter (delay variation), packet loss, and out-of-order delivery. Unlike ICMP echo, UDP jitter requires a Cisco IP SLA Responder on the target device. The responder stamps each probe packet on receipt and on transmission so that its own processing time can be excluded from the RTT. The one-way delay figures are only meaningful when both routers' clocks are synchronised, typically via NTP.
Configure the Responder on the Branch Router
! ── On Branch_Router - enable IP SLA Responder ────────────
Branch_Router>en
Branch_Router#conf t
Branch_Router(config)#ip sla responder
Branch_Router(config)#ip sla responder udp-echo ipaddress 10.10.0.1 port 5000
Branch_Router(config)#end
Branch_Router#wr
Configure the UDP Jitter Probe on NetsTuts_R1
! ══════════════════════════════════════════════════════════
! IP SLA 3 - UDP jitter to branch (voice quality)
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#ip sla 3
NetsTuts_R1(config-ip-sla)# udp-jitter 10.10.0.1 5000 source-ip 10.0.0.1 \
    source-port 5001 num-packets 20 interval 20
NetsTuts_R1(config-ip-sla-jitter)# frequency 60
NetsTuts_R1(config-ip-sla-jitter)# timeout 5000
NetsTuts_R1(config-ip-sla-jitter)# threshold 150
NetsTuts_R1(config-ip-sla-jitter)# rtt-threshold 100
NetsTuts_R1(config-ip-sla-jitter)# mos-threshold 3.60
NetsTuts_R1(config-ip-sla-jitter)# tag BRANCH-VOIP-MONITOR
NetsTuts_R1(config-ip-sla-jitter)#exit
NetsTuts_R1(config)#ip sla schedule 3 life forever start-time now
! ── Track 3: state of jitter probe ───────────────────────
! ── Uses "state" not "reachability" - detects threshold
! ── violations even without complete packet loss ──────────
NetsTuts_R1(config)#track 3 ip sla 3 state
NetsTuts_R1(config-track)# delay down 15 up 30
NetsTuts_R1(config-track)#exit
Each operation sends num-packets 20 test packets spaced interval 20 milliseconds apart, simulating a stream of RTP voice packets. rtt-threshold 100 marks the operation over-threshold if the average RTT exceeds 100 ms. The ITU-T G.114 recommendation for one-way voice delay is 150 ms, so a 100 ms RTT threshold provides early warning before voice quality degrades. mos-threshold 3.60 generates a threshold violation if the calculated MOS (Mean Opinion Score) drops below 3.60; a MOS below about 3.5 is generally considered unacceptable for VoIP. If your IOS release does not accept these two sub-commands, per-metric reactions can be defined with ip sla reaction-configuration instead. For QoS configuration that protects voice traffic on the WAN link, see QoS Overview.

delay up 30 is longer than delay down 15 because voice quality must stabilise for 30 seconds before recovery is declared; a brief improvement followed by another degradation would otherwise generate rapid Down/Up alert pairs.
6. Step 4 — HTTP and DNS Application Probes
ICMP and UDP probes test Layer 3/4 connectivity. HTTP and DNS probes test the application layer — a server can be pingable while its web service or DNS resolver is down. These probes catch application failures that ICMP monitoring misses entirely.
! ══════════════════════════════════════════════════════════
! IP SLA 4 - HTTP GET to internal web server
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#ip sla 4
NetsTuts_R1(config-ip-sla)# http get http://10.0.0.50/health source-ip 10.0.0.1
NetsTuts_R1(config-ip-sla-http)# frequency 60
NetsTuts_R1(config-ip-sla-http)# timeout 10000
NetsTuts_R1(config-ip-sla-http)# threshold 5000
NetsTuts_R1(config-ip-sla-http)# tag WEBSERVER-HTTP-MONITOR
NetsTuts_R1(config-ip-sla-http)#exit
NetsTuts_R1(config)#ip sla schedule 4 life forever start-time now
! ── Track 4: reachability of HTTP probe ───────────────────
NetsTuts_R1(config)#track 4 ip sla 4 reachability
NetsTuts_R1(config-track)# delay down 15 up 15
NetsTuts_R1(config-track)#exit
! ══════════════════════════════════════════════════════════
! IP SLA 5 - DNS resolution test
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#ip sla 5
NetsTuts_R1(config-ip-sla)# dns netstuts.com name-server 10.0.0.53 \
    source-ip 10.0.0.1
NetsTuts_R1(config-ip-sla-dns)# frequency 60
NetsTuts_R1(config-ip-sla-dns)# timeout 5000
NetsTuts_R1(config-ip-sla-dns)# tag DNS-RESOLVER-MONITOR
NetsTuts_R1(config-ip-sla-dns)#exit
NetsTuts_R1(config)#ip sla schedule 5 life forever start-time now
! ── Track 5: reachability of DNS probe ───────────────────
NetsTuts_R1(config)#track 5 ip sla 5 reachability
NetsTuts_R1(config-track)# delay down 15 up 15
NetsTuts_R1(config-track)#exit
The HTTP probe requests /health, a lightweight status endpoint that returns HTTP 200 if the application is running. A full page fetch would work but wastes bandwidth on every probe cycle. If the web server is responding but the application is down (returning HTTP 500), the IP SLA HTTP probe detects this because it checks for a successful HTTP response code, not just TCP connectivity. The DNS probe resolves netstuts.com against the specific internal DNS server 10.0.0.53. It fails if the DNS server is unreachable or if the resolver cannot resolve the name, but succeeds even if external DNS is down, as long as the internal resolver is healthy. For static routing configuration that uses tracking objects for WAN failover, see Static Routing Configuration.
7. Step 5 — EEM Applets: Down Alert and Up Recovery Pairs
Each monitoring target needs two applets — one that fires when the tracking object goes Down (alert) and one that fires when it returns to Up (recovery notification). Without the recovery applet, the NOC team has no automated confirmation that an outage has ended and must manually verify resolution.
ISP-A Gateway — Down and Up Applets
! ══════════════════════════════════════════════════════════
! ISPA-GW-DOWN - fires when track 1 transitions to Down
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#event manager applet ISPA-GW-DOWN
NetsTuts_R1(config-applet)# description "Alert: ISP-A gateway unreachable"
NetsTuts_R1(config-applet)# event track 1 state down
NetsTuts_R1(config-applet)# maxrun 90
! ── ACTION 1: Critical syslog alert ──────────────────────
NetsTuts_R1(config-applet)# action 1.0 syslog priority critical \
    msg "*** OUTAGE *** ISP-A gateway 203.0.113.1 UNREACHABLE on $_hostname - SLA probe failing"
! ── ACTION 2: Capture SLA statistics at moment of failure ─
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
    "show ip sla statistics 1 | redirect flash:/sla-logs/ispa-failure.txt"
! ── ACTION 3: Capture interface state ────────────────────
NetsTuts_R1(config-applet)# action 3.0 cli command \
    "show interfaces GigabitEthernet0/0 | redirect flash:/sla-logs/ispa-intf.txt"
! ── ACTION 4: Capture routing table - confirm failover ────
NetsTuts_R1(config-applet)# action 4.0 cli command \
    "show ip route | redirect flash:/sla-logs/ispa-route.txt"
! ── ACTION 5: Email NOC team ──────────────────────────────
NetsTuts_R1(config-applet)# action 5.0 mail server "$_email_server" \
    to "$_email_to" \
    from "$_email_from" \
    subject "*** OUTAGE: ISP-A Gateway DOWN on $_hostname ***" \
    body "ALERT: IP SLA probe 1 reports ISP-A gateway 203.0.113.1 \
    is UNREACHABLE from $_hostname. \
    Diagnostics saved to flash:/sla-logs/. \
    Verify routing failover to ISP-B is active. \
    Check show ip route on the router."
NetsTuts_R1(config-applet)#exit
! ══════════════════════════════════════════════════════════
! ISPA-GW-UP - fires when track 1 transitions back to Up
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#event manager applet ISPA-GW-UP
NetsTuts_R1(config-applet)# description "Recovery: ISP-A gateway reachable again"
NetsTuts_R1(config-applet)# event track 1 state up
NetsTuts_R1(config-applet)# maxrun 60
! ── ACTION 1: Informational syslog - recovery ─────────────
NetsTuts_R1(config-applet)# action 1.0 syslog priority notice \
    msg "*** RECOVERY *** ISP-A gateway 203.0.113.1 REACHABLE again on $_hostname"
! ── ACTION 2: Capture SLA statistics after recovery ───────
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
    "show ip sla statistics 1 | redirect flash:/sla-logs/ispa-recovery.txt"
! ── ACTION 3: Email recovery notification ─────────────────
NetsTuts_R1(config-applet)# action 3.0 mail server "$_email_server" \
    to "$_email_to" \
    from "$_email_from" \
    subject "RECOVERY: ISP-A Gateway restored on $_hostname" \
    body "RECOVERY: IP SLA probe 1 reports ISP-A gateway 203.0.113.1 \
    is now REACHABLE from $_hostname. \
    Verify primary routing has been restored. \
    Check routing table to confirm ISP-A routes are active."
NetsTuts_R1(config-applet)#exit
ISP-B Gateway — Down and Up Applets
NetsTuts_R1(config)#event manager applet ISPB-GW-DOWN
NetsTuts_R1(config-applet)# description "Alert: ISP-B gateway unreachable"
NetsTuts_R1(config-applet)# event track 2 state down
NetsTuts_R1(config-applet)# maxrun 90
NetsTuts_R1(config-applet)# action 1.0 syslog priority critical \
    msg "*** OUTAGE *** ISP-B gateway 198.51.100.1 UNREACHABLE on $_hostname"
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
    "show ip sla statistics 2 | redirect flash:/sla-logs/ispb-failure.txt"
NetsTuts_R1(config-applet)# action 3.0 cli command \
    "show interfaces GigabitEthernet0/1 | redirect flash:/sla-logs/ispb-intf.txt"
NetsTuts_R1(config-applet)# action 4.0 mail server "$_email_server" \
    to "$_email_to" \
    from "$_email_from" \
    subject "OUTAGE: ISP-B Gateway DOWN on $_hostname" \
    body "IP SLA probe 2 reports ISP-B gateway 198.51.100.1 \
    is UNREACHABLE from $_hostname."
NetsTuts_R1(config-applet)#exit
NetsTuts_R1(config)#event manager applet ISPB-GW-UP
NetsTuts_R1(config-applet)# description "Recovery: ISP-B gateway reachable again"
NetsTuts_R1(config-applet)# event track 2 state up
NetsTuts_R1(config-applet)# maxrun 30
NetsTuts_R1(config-applet)# action 1.0 syslog priority notice \
    msg "*** RECOVERY *** ISP-B gateway 198.51.100.1 REACHABLE again on $_hostname"
NetsTuts_R1(config-applet)# action 2.0 mail server "$_email_server" \
    to "$_email_to" \
    from "$_email_from" \
    subject "RECOVERY: ISP-B Gateway restored on $_hostname" \
    body "IP SLA probe 2 reports ISP-B gateway 198.51.100.1 \
    is REACHABLE from $_hostname."
NetsTuts_R1(config-applet)#exit
Branch Jitter — Down and Up Applets
NetsTuts_R1(config)#event manager applet BRANCH-JITTER-DOWN
NetsTuts_R1(config-applet)# description "Alert: Branch voice quality degraded"
NetsTuts_R1(config-applet)# event track 3 state down
NetsTuts_R1(config-applet)# maxrun 90
NetsTuts_R1(config-applet)# action 1.0 syslog priority critical \
    msg "*** VOIP DEGRADED *** Branch UDP jitter probe failing on $_hostname - check WAN QoS"
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
    "show ip sla statistics 3 details | redirect flash:/sla-logs/branch-jitter.txt"
NetsTuts_R1(config-applet)# action 3.0 cli command \
    "show policy-map interface GigabitEthernet0/0 | \
    redirect flash:/sla-logs/branch-qos.txt"
NetsTuts_R1(config-applet)# action 4.0 mail server "$_email_server" \
    to "$_email_to" \
    from "$_email_from" \
    subject "VOIP QUALITY ALERT: Branch jitter threshold exceeded on $_hostname" \
    body "UDP jitter probe 3 to branch (10.10.0.1) reports threshold \
    violation on $_hostname. VoIP quality may be degraded. \
    Check WAN QoS policy and interface utilisation."
NetsTuts_R1(config-applet)#exit
NetsTuts_R1(config)#event manager applet BRANCH-JITTER-UP
NetsTuts_R1(config-applet)# description "Recovery: Branch voice quality restored"
NetsTuts_R1(config-applet)# event track 3 state up
NetsTuts_R1(config-applet)# maxrun 30
NetsTuts_R1(config-applet)# action 1.0 syslog priority notice \
    msg "*** VOIP RESTORED *** Branch jitter probe back in threshold on $_hostname"
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
    "show ip sla statistics 3 | redirect flash:/sla-logs/branch-jitter-recovery.txt"
NetsTuts_R1(config-applet)#exit
Web Server — Down and Up Applets
NetsTuts_R1(config)#event manager applet WEBSERVER-DOWN
NetsTuts_R1(config-applet)# description "Alert: Internal web server HTTP probe failing"
NetsTuts_R1(config-applet)# event track 4 state down
NetsTuts_R1(config-applet)# maxrun 60
NetsTuts_R1(config-applet)# action 1.0 syslog priority critical \
    msg "*** OUTAGE *** Web server HTTP probe FAILING on $_hostname - http://10.0.0.50"
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
    "show ip sla statistics 4 | redirect flash:/sla-logs/webserver-failure.txt"
NetsTuts_R1(config-applet)# action 3.0 mail server "$_email_server" \
    to "$_email_to" \
    from "$_email_from" \
    subject "OUTAGE: Web server (10.0.0.50) DOWN on $_hostname" \
    body "IP SLA HTTP probe 4 cannot reach http://10.0.0.50/health. \
    Server may be down or the application is not responding. \
    Escalate to the application team."
NetsTuts_R1(config-applet)#exit
NetsTuts_R1(config)#event manager applet WEBSERVER-UP
NetsTuts_R1(config-applet)# description "Recovery: Web server HTTP responding again"
NetsTuts_R1(config-applet)# event track 4 state up
NetsTuts_R1(config-applet)# maxrun 30
NetsTuts_R1(config-applet)# action 1.0 syslog priority notice \
    msg "*** RECOVERY *** Web server HTTP probe SUCCEEDING on $_hostname"
NetsTuts_R1(config-applet)# action 2.0 mail server "$_email_server" \
    to "$_email_to" \
    from "$_email_from" \
    subject "RECOVERY: Web server (10.0.0.50) restored on $_hostname" \
    body "IP SLA HTTP probe 4 reports http://10.0.0.50/health is \
    responding successfully. Application appears to be restored."
NetsTuts_R1(config-applet)#exit
NetsTuts_R1(config)#end
NetsTuts_R1#wr
The event track N state up recovery applet is as important as the down alert. Without it, the NOC team must either manually poll the router for track state or wait for the next monitoring cycle on their NMS to confirm resolution. An automated recovery notification closes the incident loop: the on-call engineer receives the down alert, works the issue, and receives the recovery alert, with no manual verification step needed. For OSPF deployments where the tracking object also controls route injection or redistribution, the recovery notification is especially useful for confirming that the primary route has been re-advertised; see OSPF Single-Area Configuration. For HSRP/FHRP integration with tracking, see FHRP — HSRP, VRRP & GLBP and HSRP.
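Because the four applet pairs differ only in name, track number, and target, the repetitive part can be generated from a small plan table. The sketch below is a hypothetical helper (applet_pair and PLAN are illustrative names, not an IOS feature); it emits only the syslog actions, using the same command syntax as the applets above.

```python
def applet_pair(name: str, track: int, target: str) -> str:
    """Render a minimal Down/Up EEM applet pair for one tracking object.

    Only the syslog actions are generated; CLI captures and mail actions
    from the lab would be appended in the same way.
    """
    lines = [
        f"event manager applet {name}-DOWN",
        f" event track {track} state down",
        f' action 1.0 syslog priority critical msg "*** OUTAGE *** {target} UNREACHABLE on $_hostname"',
        f"event manager applet {name}-UP",
        f" event track {track} state up",
        f' action 1.0 syslog priority notice msg "*** RECOVERY *** {target} REACHABLE again on $_hostname"',
    ]
    return "\n".join(lines)


# Hypothetical plan table: (applet prefix, track number, target description)
PLAN = [
    ("ISPA-GW", 1, "ISP-A gateway 203.0.113.1"),
    ("ISPB-GW", 2, "ISP-B gateway 198.51.100.1"),
]

for prefix, track, target in PLAN:
    print(applet_pair(prefix, track, target))
```

Generating the pairs from one table keeps the Down and Up applets for a target in step, so a recovery applet is never forgotten when a new probe is added.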
8. Step 6 — Advanced: RTT Threshold Alerting (Degradation Before Failure)
Reachability probes alert only when a target becomes completely unreachable. RTT threshold alerting goes further — it generates an alert when the link is still up but latency has degraded to a level that impacts applications. This gives the NOC team early warning before users start complaining.
NetsTuts_R1#conf t
! ══════════════════════════════════════════════════════════
! IP SLA 6 - ISP-A with RTT threshold alerting
! Alerts if RTT exceeds 100ms even if probe still succeeds
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#ip sla 6
NetsTuts_R1(config-ip-sla)# icmp-echo 203.0.113.1 source-interface GigabitEthernet0/0
NetsTuts_R1(config-ip-sla-echo)# frequency 30
NetsTuts_R1(config-ip-sla-echo)# timeout 5000
NetsTuts_R1(config-ip-sla-echo)# threshold 100
NetsTuts_R1(config-ip-sla-echo)# tag ISP-A-LATENCY-MONITOR
NetsTuts_R1(config-ip-sla-echo)#exit
NetsTuts_R1(config)#ip sla schedule 6 life forever start-time now
! ── Track 6 on STATE (not reachability)
! ── "state" fires when probe is over-threshold, even if
! ── the probe technically succeeds (not a timeout)
NetsTuts_R1(config)#track 6 ip sla 6 state
NetsTuts_R1(config-track)# delay down 20 up 30
NetsTuts_R1(config-track)#exit
! ── EEM applet for latency degradation alert ──────────────
NetsTuts_R1(config)#event manager applet ISPA-LATENCY-HIGH
NetsTuts_R1(config-applet)# description "Alert: ISP-A RTT above 100ms threshold"
NetsTuts_R1(config-applet)# event track 6 state down
NetsTuts_R1(config-applet)# maxrun 60
NetsTuts_R1(config-applet)# action 1.0 syslog priority warning \
    msg "*** LATENCY WARNING *** ISP-A RTT exceeded 100ms threshold on $_hostname - link degraded"
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
    "show ip sla statistics 6 details | redirect flash:/sla-logs/ispa-latency.txt"
NetsTuts_R1(config-applet)#exit
NetsTuts_R1(config)#event manager applet ISPA-LATENCY-NORMAL
NetsTuts_R1(config-applet)# description "Recovery: ISP-A RTT back below threshold"
NetsTuts_R1(config-applet)# event track 6 state up
NetsTuts_R1(config-applet)# maxrun 30
NetsTuts_R1(config-applet)# action 1.0 syslog priority notice \
    msg "*** LATENCY NORMAL *** ISP-A RTT back below 100ms threshold on $_hostname"
NetsTuts_R1(config-applet)#exit
NetsTuts_R1(config)#end
NetsTuts_R1#wr
The difference between track N ip sla N reachability and track N ip sla N state is subtle but important for latency alerting. Reachability is binary: the probe either gets a response within the timeout window or it does not. A probe that takes 4,900 ms to respond (still within the 5,000 ms timeout) is considered "reachable", so no alert fires even though the WAN is almost unusable. State incorporates the threshold value: the probe is considered over-threshold when the RTT exceeds the configured threshold, and the tracking object transitions to Down even though the probe is still receiving responses. This pattern, a reachability track for outage alerting and a state track for degradation alerting, gives two distinct alert tiers: Warning (high latency) and Critical (complete outage).
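The two tracking types can be compared side by side with a small model. This is a simplified sketch of the behaviour described above, not IOS code; classify_probe and its defaults (the 100 ms threshold and 5,000 ms timeout of IP SLA 6) are assumptions for illustration.

```python
def classify_probe(rtt_ms, threshold_ms=100, timeout_ms=5000):
    """Map one probe result to (return code, reachability track, state track).

    rtt_ms is None when no reply arrived within the timeout.
    """
    if rtt_ms is None or rtt_ms > timeout_ms:
        return "Timeout", "Down", "Down"          # Critical tier: complete outage
    if rtt_ms > threshold_ms:
        return "Over threshold", "Up", "Down"     # Warning tier: link degraded
    return "OK", "Up", "Up"


for rtt in (8, 4900, None):
    print(rtt, classify_probe(rtt))
```

The 4,900 ms case is the one that separates the two types: the reachability track stays Up while the state track goes Down.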
9. Verification
show ip sla statistics — Per-Probe Results
NetsTuts_R1#show ip sla statistics
IPSLAs Latest Operation Statistics
IPSLA operation id: 1
Latest RTT: 8 milliseconds
Latest operation start time: 14:35:30 UTC Wed Oct 16 2024
Latest operation return code: OK
Number of successes: 142
Number of failures: 0
Operation time to live: Forever
IPSLA operation id: 2
Latest RTT: 12 milliseconds
Latest operation start time: 14:35:33 UTC Wed Oct 16 2024
Latest operation return code: OK
Number of successes: 141
Number of failures: 0
Operation time to live: Forever
IPSLA operation id: 3
Latest RTT: 24 milliseconds
Latest operation start time: 14:35:00 UTC Wed Oct 16 2024
Latest operation return code: OK
Number of successes: 71
Number of failures: 0
Operation time to live: Forever
IPSLA operation id: 4
Latest RTT: 87 milliseconds
Latest operation start time: 14:35:00 UTC Wed Oct 16 2024
Latest operation return code: OK
Number of successes: 70
Number of failures: 0
Operation time to live: Forever
A failing probe shows Latest operation return code: Timeout and incrementing failure counts.
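If the output is collected off-box (for example from the files the applets redirect to flash), failing operations can be picked out with a few lines of parsing. The sketch below assumes the field labels shown in the output above; the SAMPLE text is abbreviated and its second operation is a hypothetical failure added for illustration.

```python
import re

# Abbreviated sample; operation 2 is a hypothetical failing probe
SAMPLE = """\
IPSLA operation id: 1
Latest operation return code: OK
Number of successes: 142
Number of failures: 0
IPSLA operation id: 2
Latest operation return code: Timeout
Number of successes: 141
Number of failures: 3
"""


def failing_operations(text: str) -> list[int]:
    """Return the IDs of operations whose latest return code is not OK."""
    failing = []
    current_id = None
    for line in text.splitlines():
        m = re.match(r"\s*IPSLA operation id:\s*(\d+)", line)
        if m:
            current_id = int(m.group(1))
            continue
        m = re.match(r"\s*Latest operation return code:\s*(.+?)\s*$", line)
        if m and current_id is not None and m.group(1) != "OK":
            failing.append(current_id)
    return failing


print(failing_operations(SAMPLE))
```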
show ip sla statistics details — Rich Jitter Metrics
NetsTuts_R1#show ip sla statistics 3 details
IPSLAs Latest Operation Statistics
IPSLA operation id: 3
Type of operation: UDP Jitter
Latest RTT: 24 ms
Latest operation start time: 14:35:00 UTC Wed Oct 16 2024
Latest operation return code: OK
RTT Values:
Number Of RTT: 20 RTT Min/Avg/Max: 22/24/31 milliseconds
Latency one-way time:
Number of Latency one-way Samples: 20
Source to Destination Latency one way Min/Avg/Max: 9/11/14 ms
Destination to Source Latency one way Min/Avg/Max: 12/13/17 ms
Jitter Time:
Num of SD Jitter Samples: 19
Num of DS Jitter Samples: 19
Source to Destination Jitter Min/Avg/Max: 0/1/4 ms
Destination to Source Jitter Min/Avg/Max: 0/1/3 ms
Packet Loss Values:
Loss Source to Destination: 0 Loss Destination to Source: 0
Out Of Sequence: 0 Tail Drop: 0
Skipped: 0 Late Arrival: 0
Voice Score Values:
Calculated Planning Impairment Factor (ICPIF): 0
MOS score: 4.40
Number of successes: 71
Number of failures: 0
Operation time to live: Forever
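The output reports 19 jitter samples for 20 packets because each sample is derived from a pair of consecutive packets. A simplified illustration follows, treating jitter as the absolute change in transit time between neighbouring packets; the transit values are hypothetical and jitter_samples is an illustrative name.

```python
def jitter_samples(transit_ms: list[float]) -> list[float]:
    """Absolute difference between consecutive packet transit times."""
    return [abs(b - a) for a, b in zip(transit_ms, transit_ms[1:])]


# Hypothetical one-way transit times (ms) for five consecutive packets
transit = [11, 12, 11, 14, 13]
samples = jitter_samples(transit)
print("samples:", samples)
print("min/avg/max:", min(samples), sum(samples) / len(samples), max(samples))
```

Five packets yield four samples here, in the same way that 20 packets yield 19 in the router output.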
show track — Tracking Object States
NetsTuts_R1#show track
Track 1
IP SLA 1 Reachability
Reachability is Up
2 changes, last change 00:47:23
Latest operation return code: OK
Latest RTT (millisecs) 8
Tracked by:
ISPA-GW-DOWN (EEM)
ISPA-GW-UP (EEM)
Track 2
IP SLA 2 Reachability
Reachability is Up
1 change, last change 02:15:44
Track 3
IP SLA 3 State
State is Up
3 changes, last change 00:12:05
Track 4
IP SLA 4 Reachability
Reachability is Up
1 change, last change 04:30:10
Track 5
IP SLA 5 Reachability
Reachability is Up
1 change, last change 04:30:12
Track 6
IP SLA 6 State
State is Up
4 changes, last change 00:05:32
show ip sla statistics aggregated — Historical Performance
NetsTuts_R1#show ip sla statistics aggregated 1
IPSLAs Aggregated Statistics
IPSLA operation id: 1
Start Time Index: 14:00:00 UTC Wed Oct 16 2024
Aggregation interval: 900 seconds (15 minutes)
Round-Trip-Time (RTT) Values
Num of Measurements: 30 Min RTT: 7 ms
Max RTT: 145 ms Avg RTT: 9 ms
Over thresholds: 2 (2 probes exceeded 2000ms threshold)
Number of successes: 28
Number of failures: 2
Completion Time: 14:15:00 UTC Wed Oct 16 2024
show ip sla statistics aggregated shows the last 15-minute (configurable) window of probe results. This reveals intermittent problems that the instantaneous show ip sla statistics misses: in this example, 2 of 30 probes in the last 15 minutes failed (Number of failures: 2) and 2 exceeded the 2,000 ms threshold. The current probe shows OK, but the aggregated history shows the link is experiencing intermittent connectivity issues. This is the difference between point-in-time monitoring and trend analysis.
show logging — Confirm Alert Flow
NetsTuts_R1#show logging | include SLA|OUTAGE|RECOVERY|HA_EM
Oct 16 14:32:01: %TRACK-6-STATE: 1 ip sla 1 reachability Up->Down
Oct 16 14:32:11: %HA_EM-2-LOG: ISPA-GW-DOWN: *** OUTAGE *** ISP-A gateway \
  203.0.113.1 UNREACHABLE on NetsTuts_R1 — SLA probe failing
Oct 16 14:32:12: %HA_EM-6-LOG: ISPA-GW-DOWN: diagnostic files saved to \
  flash:/sla-logs/
Oct 16 14:38:45: %TRACK-6-STATE: 1 ip sla 1 reachability Down->Up
Oct 16 14:38:55: %HA_EM-5-LOG: ISPA-GW-UP: *** RECOVERY *** ISP-A gateway \
  203.0.113.1 REACHABLE again on NetsTuts_R1
At 14:32:01 the tracking object transitions Up->Down (IOS-generated message). Ten seconds later, once the delay down 10 hold-down has elapsed, the EEM applet ISPA-GW-DOWN fires at 14:32:11 and generates the CRITICAL alert. At 14:38:45 the tracking object transitions Down->Up, and 10 seconds later the ISPA-GW-UP recovery applet fires. Measured between the two track transitions, the outage lasted 6 minutes and 44 seconds. This timeline is now recorded in the syslog, and the diagnostic files on flash allow post-mortem analysis of the router's state at the moment of failure. For forwarding these alerts to a central server, see Syslog Server Configuration.
Verification Command Summary
| Command | What It Shows | Primary Use |
|---|---|---|
| show ip sla statistics | All probes: latest RTT, return code (OK/Timeout/Error), success and failure counts since last reset | Instant health check: confirm all probes are returning OK with zero or low failure counts |
| show ip sla statistics [N] details | Single probe: full detail including RTT, per-direction one-way latency and jitter, packet loss, MOS, threshold violations | Deep-dive on a specific probe, especially UDP jitter, to verify voice quality metrics |
| show ip sla statistics aggregated [N] | 15-minute aggregated window: min/avg/max RTT, total successes/failures, over-threshold count | Identify intermittent issues that the current probe misses; reveals patterns over time |
| show ip sla configuration [N] | Full probe configuration: target IP, source, frequency, timeout, threshold, tag, schedule | Verify probe is configured correctly: confirm source interface, frequency, and threshold values |
| show track | All tracking objects: current state (Up/Down), change count, last change time, EEM subscribers | Confirm tracking objects are Up and EEM applets are registered. High change count indicates a flapping probe |
| show track [N] | Single tracking object detail: IP SLA operation, reachability/state type, delay settings | Verify delay down/up settings are correct and confirm the linked IP SLA operation number |
| show event manager policy registered | All EEM applets: name, event type (track), registered track number, registration time | Confirm all down and up applets are registered against the correct track object numbers |
| show event manager history events | EEM execution history: applet name, event type, execution time | Verify applets fired when expected. Cross-reference with show logging timestamps |
| show logging \| include TRACK\|HA_EM | All tracking state changes and EEM syslog actions in the log buffer | Complete timeline: correlate track state transitions with EEM alert timestamps |
| dir flash:/sla-logs/ | Diagnostic files written by EEM actions | Confirm files are being created at failure events. Use more flash:/sla-logs/[file].txt to review captured output |
10. Troubleshooting IP SLA + EEM Monitoring
| Problem | Symptom | Cause | Fix |
|---|---|---|---|
| Probe shows continuous failures but target is reachable | show ip sla statistics 1 shows return code: Timeout and an incrementing failure count even though manual pings to the target succeed | The probe is not using source-interface and is being sourced from a different interface than intended. Alternatively, the target rate-limits ICMP, so the scheduled probes are dropped while an occasional manual ping still gets through, or the ISP gateway filters ICMP by source address and blocks the address the probe uses while permitting the one used by the manual ping | Add source-interface GigabitEthernet0/0 to the probe configuration. This forces the probe to use the WAN interface IP as its source, matching the exact path to the gateway. Verify the source actually in use with show ip sla configuration 1. If rate-limiting is the issue, raise the frequency value to 60 seconds so probes are sent less often, or switch to a tcp-connect probe on a port the gateway accepts |
| EEM applet does not fire when track state changes | show track shows track state is Down, but show event manager history events shows no execution of the ISPA-GW-DOWN applet | The EEM applet is registered against the wrong track object number: the applet says event track 2 state down but the tracking object for ISP-A is track 1. The tracking type is not the cause, because event track N state down works with both reachability and state tracking objects | Run show event manager policy registered and confirm the applet shows the correct track number and state. Run show track 1 and confirm which IP SLA operation and tracking type (Reachability vs State) the object uses. "state down" means the object transitioned to the Down state regardless of tracking type, so re-check that the track number in the applet matches the track number shown in show track |
| Track object flapping: repeated Down/Up transitions | show track 1 shows a very high change count (50+ changes in an hour). EEM applet fires repeatedly, flooding syslog and the NOC inbox with alternating OUTAGE/RECOVERY emails | The track delay values are too short, so a single dropped probe triggers a Down transition, the next successful probe triggers Up, and so on. Or the WAN link is genuinely unstable (physical layer issue, ISP congestion) | Increase the track delay values: under track 1 ip sla 1 reachability, configure delay down 30 up 60. The object must then stay failed for 30 seconds (roughly one probe cycle at frequency 30) before Down is declared, and stay healthy for 60 seconds before recovery is declared. Add ratelimit 600 to the EEM event clause as an additional protection. Investigate the underlying WAN stability separately with show ip sla statistics aggregated 1 to see the failure pattern |
| UDP jitter probe shows return code: Busy or No Connection | show ip sla statistics 3 shows a return code other than OK, specifically Busy, No Connection, or Timeout | No Connection means the probe could not set up a session with the responder, most often because ip sla responder is not enabled on the target router or the responder is not accepting UDP on port 5000. Busy means the operation did not run because a previous operation was still outstanding. Timeout means no response arrived at all, which is possible if the probe can reach the router but the responder is not running | Verify the responder is enabled on the branch router: SSH to Branch_Router and run show ip sla responder. It should show the UDP responder listening on port 5000. If not, re-configure: ip sla responder and ip sla responder udp-echo ipaddress 10.10.0.1 port 5000. Confirm the ACL on the branch router does not block UDP 5000 or the IP SLA control channel (UDP 1967). Also verify the source port configured on the probe does not conflict with other probes (each probe needs a unique source port) |
| IP SLA probe stops running after a reload | After a router reload, show ip sla statistics shows old data but no new measurements: the probe is not generating new results | The ip sla schedule was configured with a specific start time in the past (start-time 14:00:00) rather than start-time now or start-time after 0:0:5. After a reload, IOS sees the start time has already passed and does not restart the schedule. Alternatively, ip sla schedule was configured with a life value that has expired | Reconfigure the schedule: ip sla schedule 1 life forever start-time now. life forever ensures the probe never expires, and start-time now restarts it immediately. Then verify the probe is running: show ip sla statistics 1 should show the Latest operation start time advancing once per frequency interval |
| Alert fires for a target that has a planned maintenance window | A server is being patched and taken offline deliberately. The monitoring system fires OUTAGE alerts and emails the NOC during the maintenance window, producing false positives the team must manually suppress | No maintenance mode mechanism is built into the IP SLA + EEM monitoring by default. The tracking object transitions Down once the probe has failed for the configured delay, regardless of whether the outage is planned or unplanned | For planned maintenance, temporarily suspend the IP SLA schedule: no ip sla schedule 4 before the maintenance window, then ip sla schedule 4 life forever start-time now after. Alternatively, add an EEM environment variable as a maintenance flag (event manager environment _maintenance 1) and add a conditional check at the top of the applet: action 0.5 if $_maintenance eq 1, action 0.6 exit, action 0.7 end. This skips all further actions while maintenance mode is active; set the variable back to 0 when the window closes |
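
As a rough sketch, three of the fixes from the table above might look like this in global configuration mode. The track and probe numbers and the _maintenance flag come from the table. The applet name SRV4-DOWN, its association with track 4, and the syslog text are hypothetical placeholders for whichever applet watches the server under maintenance. The closing end on the if block is included on the assumption that EEM conditionals must be terminated.

```
! 1) Flapping track object: lengthen the hold-down timers
track 1 ip sla 1 reachability
 delay down 30 up 60
!
! 2) Probe not running after a reload: schedule it to start immediately
!    and never expire
ip sla schedule 1 life forever start-time now
!
! 3) Planned maintenance: a flag the down applet checks before alerting
!    (1 = maintenance active, 0 = normal operation)
event manager environment _maintenance 1
!
! SRV4-DOWN and the message text are hypothetical names for illustration
event manager applet SRV4-DOWN
 event track 4 state down
 action 0.5 if $_maintenance eq 1
 action 0.6  exit
 action 0.7 end
 action 1.0 syslog priority critical msg "*** OUTAGE *** server probe 4 failing"
```

After the window, event manager environment _maintenance 0 re-arms alerting without touching the probe or the track object. That may be a reason to prefer the flag over removing and re-adding the schedule.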
Key Points & Exam Tips
- The complete IP SLA alerting pipeline has three layers: IP SLA probe (sends synthetic test traffic and measures results), Object Tracking (translates probe results into a binary Up/Down state with configurable delay), and EEM applet (fires on state transitions and executes alert actions). All three layers must be correctly configured for automated alerting to work.
- Always configure source-interface on WAN gateway probes. Without it, IOS sources the probe from the best available exit interface, so a failed WAN link that still has an alternate path may produce false-healthy results if the probe routes around the failure instead of through it.
- The delay down [seconds] up [seconds] command, configured under the track object, prevents false alerts from single dropped probes. delay down requires the probe to fail continuously for N seconds before declaring Down; delay up requires continuous success for N seconds before declaring recovery. Size these to be slightly longer than one probe cycle at the configured frequency.
- There are two tracking types for IP SLA objects: reachability (Down when the probe times out, that is, complete failure only) and state (Down when the probe result exceeds the configured threshold, so it fires on latency degradation even without packet loss). Use both together for two-tier alerting: Warning on degradation, Critical on outage.
- Always deploy paired applets: one on event track N state down and one on event track N state up. The down applet alerts on outage; the up applet confirms recovery. Without the recovery applet, the NOC team must manually verify resolution, which defeats the purpose of automated monitoring.
- UDP jitter probes require the Cisco IP SLA Responder (ip sla responder) on the target device. The responder timestamps each packet on receipt and on transmission so that its own processing time can be excluded from the measurement; accurate one-way delay figures additionally depend on the clocks of both devices being synchronized (for example with NTP). Without the responder, UDP jitter probes fail with return codes such as No Connection.
- show ip sla statistics shows the current probe result (latest RTT, return code, cumulative success/failure counts). show ip sla statistics aggregated shows the historical 15-minute window including min/avg/max RTT and over-threshold counts, which is essential for identifying intermittent problems that the instantaneous view misses.
- show track is the primary operational command: it shows current state, change count, last change time, and which EEM applets are subscribed. A high change count on a tracking object indicates a flapping probe, so investigate both the underlying path stability and the track delay values.
- IP SLA schedules configured with a past start-time do not restart automatically after a reload. Always use start-time now or start-time after 0:0:5 with life forever to ensure probes survive router reloads.
- On the exam: know the four IP SLA probe types covered here and whether they require a responder (ICMP: no; UDP jitter: yes; HTTP/DNS: no), the two tracking types (reachability vs state), the delay down/up purpose, and the EEM event track N state down/up syntax. For traffic-volume monitoring alongside SLA alerting, see NetFlow Configuration.
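
For revision purposes, the three-layer pipeline summarized in the key points can be condensed into a minimal sketch like the one below. It is not the lab's full configuration. The target address, interface, applet names, and syslog severities are taken from the outputs shown earlier in this section. The frequency of 30 seconds and the delay up value of 10 are assumptions inferred from those outputs, and the message text is abbreviated.

```
! Layer 1: IP SLA probe, sourced from the WAN interface
ip sla 1
 icmp-echo 203.0.113.1 source-interface GigabitEthernet0/0
 threshold 2000
 frequency 30
ip sla schedule 1 life forever start-time now
!
! Layer 2: tracking object with hold-down timers
! (delay up 10 is an assumed value)
track 1 ip sla 1 reachability
 delay down 10 up 10
!
! Layer 3: paired EEM applets, one per transition
event manager applet ISPA-GW-DOWN
 event track 1 state down
 action 1.0 syslog priority critical msg "*** OUTAGE *** ISP-A gateway 203.0.113.1 UNREACHABLE"
!
event manager applet ISPA-GW-UP
 event track 1 state up
 action 1.0 syslog priority notifications msg "*** RECOVERY *** ISP-A gateway 203.0.113.1 REACHABLE again"
```

Each layer refers to the one before it only by number (probe 1, track 1), which is why a mismatched number breaks alerting without producing an error.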