Agentic Runbook Tools Compared: PagerDuty, incident.io, FireHydrant, Rootly, and 8 More (April 2026)
Capability matrix
| Vendor | Config format | K8s | AWS | MCP | Pricing signal | MTTR claim |
|---|---|---|---|---|---|---|
| PagerDuty Runbook Automation (formerly Rundeck) | Jobs (YAML / UI) | Yes (via plugins) | Yes (via plugins) | Partial (2026 roadmap) | $125/user/mo | Up to 95% faster |
| incident.io | Workflow builder (no-code/low-code) | Partial | Partial | No | Custom (contact sales) | Not published |
| FireHydrant | Runbook builder (UI) | Partial | Partial | No | Custom | Not published |
| Rootly | Workflow YAML + UI | Partial | Yes | No | Custom | Multi-faceted approach (no specific number) |
| Shoreline.io | Shoreline language (DSL) | Yes (strong) | Yes | No | Custom | 75% MTTR reduction, 50% auto-remediation |
| Kubiya | YAML + Terraform native | Yes (strong) | Yes | Yes | Custom | Not published |
| Komodor Klaudia | Komodor UI + API | Yes (specialist) | Partial | No | Custom | 95% accuracy, 23-second MTTR on K8s |
| xMatters (Everbridge) | Flow designer + YAML | Partial | Yes | No | Custom (enterprise) | Not published |
| Resolve.ai | Resolve platform | Yes | Yes | No | Custom ($1B valuation) | 80% autonomous resolution target |
| Traversal | Traversal API | Yes | Yes | No | Custom | 38% MTTR reduction at DigitalOcean (36,000 hrs/yr) |
| Datadog Bits AI | Datadog workflows | Yes | Yes | Partial | Add-on to Datadog (contact) | 70-90% faster resolution |
| AWS DevOps Agent (Bedrock AgentCore) | CDK / CloudFormation + MCP tools | Yes (EKS) | Yes (specialist) | Yes (native) | Usage-based (Bedrock model cost + AgentCore) | Hours to minutes (AWS blog) |
Vendor profiles
| Vendor | Focus | Best for | Pricing |
|---|---|---|---|
| PagerDuty Runbook Automation (formerly Rundeck) | Runbook execution + AIOps event correlation | Orgs already on PagerDuty wanting runbook automation alongside AIOps | $125/user/mo |
| incident.io | AI workflows, Slack-native incident management | Slack-first SRE teams with structured incident processes | Custom (contact sales) |
| FireHydrant | AI-assisted runbooks, service catalog-driven | Teams with complex service catalogs needing structured incident runbook management | Custom |
| Rootly | AI postmortem + RCA, Slack-native | Postmortem-heavy organisations needing AI-drafted RCA and knowledge reinforcement | Custom |
| Shoreline.io | Notebooks (interactive runbooks), 120+ pre-built | Kubernetes-heavy teams with high incident volume wanting pre-built remediation playbooks | Custom |
| Kubiya | Meta-agent orchestrating specialised agents | Platform engineering teams building agentic workflows with CI/CD, Terraform, and K8s | Custom |
| Komodor Klaudia | Kubernetes-focused AI SRE | Kubernetes-only shops wanting deep K8s context awareness | Custom |
| xMatters (Everbridge) | Enterprise IT operations, AI Agent | Enterprise IT operations teams with complex escalation trees and on-call management | Custom (enterprise) |
| Resolve.ai | Autonomous incident resolution | Organisations with aggressive automation ambitions and budget for enterprise tooling | Custom ($1B valuation) |
| Traversal | Academic ML-heavy, causal RCA | Complex distributed systems where causal inference RCA is the primary need | Custom |
| Datadog Bits AI | Native Datadog integration, HIPAA compliant | Existing Datadog customers wanting AI-augmented incident response without a new vendor | Add-on to Datadog (contact) |
| AWS DevOps Agent (Bedrock AgentCore) | Always-available AI SRE teammate on AWS | AWS-heavy orgs wanting native cross-account investigation and topology intelligence | Usage-based (Bedrock model cost + AgentCore) |
How to evaluate: 7-question buyer checklist
Does it integrate with your existing stack? (Your observability tool, incident tool, and comms tool are the integration gates. Vendors that do not have native connectors require custom webhook work.)
Does it support your primary infrastructure? (Kubernetes-heavy teams should prioritise K8s-native agents. AWS-heavy teams should evaluate Bedrock AgentCore or Datadog Bits AI first.)
What is the audit trail, and can it meet your compliance requirements? (SOC 2 and HIPAA teams should evaluate three things: deterministic wrappers, CloudTrail immutability, and whether the reasoning trace is exportable.)
What is the action approval model? (Start with require_human on all write actions. Auto-approve should be earned incrementally over 90+ days of accurate recommendations.)
What is the real pricing model? (Most vendors are custom-quote. Get per-seat, per-incident, and usage-based scenarios. Ask specifically about overage and data egress costs.)
What is the exit cost? (If you instrument 400 runbooks in vendor X's DSL, migration to vendor Y costs engineering time. Evaluate vendor lock-in before commitment.)
How stable is the vendor? (In a fast-moving category, vendor acquisitions and pivots are common. Resolve.ai, Traversal, and Neubird are all venture-backed; evaluate financial stability alongside feature maturity.)
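The approval model in the checklist (require_human on every write action, with auto-approve earned over 90+ days of accurate recommendations) can be sketched as a small policy gate. This is a vendor-neutral illustration, not any product's API: the names `ActionTrackRecord` and `approval_for`, the 95% accuracy bar, the 20-sample minimum, and the choice to let read-only actions through without review are all assumptions made for the example. Only the 90-day window and the `require_human` default come from the checklist.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    REQUIRE_HUMAN = "require_human"


@dataclass
class ActionTrackRecord:
    """History for one action type (hypothetical structure)."""
    first_recommended: date   # when the agent first proposed this action
    recommendations: int      # how many times it has been proposed
    judged_correct: int       # how many a human reviewer judged correct


MIN_OBSERVATION = timedelta(days=90)  # from the checklist: 90+ days
MIN_ACCURACY = 0.95                   # assumed threshold
MIN_SAMPLES = 20                      # assumed threshold


def approval_for(is_write: bool,
                 record: Optional[ActionTrackRecord],
                 today: date) -> Decision:
    """Return the approval decision for a proposed agent action."""
    # Assumption: read-only actions (queries, log fetches) are not gated.
    if not is_write:
        return Decision.AUTO_APPROVE
    # Default for write actions: a human must approve.
    if record is None or record.recommendations < MIN_SAMPLES:
        return Decision.REQUIRE_HUMAN
    observed = today - record.first_recommended
    accuracy = record.judged_correct / record.recommendations
    # Auto-approve is earned only with both enough time and enough accuracy.
    if observed >= MIN_OBSERVATION and accuracy >= MIN_ACCURACY:
        return Decision.AUTO_APPROVE
    return Decision.REQUIRE_HUMAN
```

Tracking the record per action type rather than per vendor or per agent means a well-proven action such as a pod restart can graduate to auto-approve while a newer, riskier action stays behind human review.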