See everything. Replay anything.
Every agent action is logged with what happened, what it cost, what context the agent had, and what it decided. You can replay any action to understand why. Debugging is grep, not vibes.
CLI-first observability
Full-stack observability from your terminal. Activity feed, cost breakdown, and full replay with a single command. Every action searchable, filterable, replayable.
$ capx activity --company acme-marketing --since 2h TIME AGENT ACTION COST STATUS 09:14 engineer playbook_run $1.42 completed 09:31 marketer content-write $0.94 completed 09:45 strategist content-review $0.38 approved 10:02 support ticket-triage $0.18 completed 10:15 support ticket-triage $0.18 escalated 10:30 engineer playbook_run $2.81 completed Total: $5.91 | Tasks: 6 | Escalations: 1 $ capx replay 09:14 --verbose Replaying engineer playbook_run at 09:14... Context: 4,200 tokens Tool calls: llm.prompt, code.run, shell.run Tests: 14/14 passing Rubric score: 0.87 (threshold: 0.80) Decision: auto-approved Cost breakdown: inference $1.12, compute $0.30
Cost intelligence
Granular cost tracking per agent, per task, per playbook run. Know exactly what you are spending and where the value is.
Webhook event log
Get notified the moment something needs attention. Six alert types cover failures, spend, escalations, and anomalies. Every event in structured log format.
Max retries exceeded on playbook_run. Last error: context window exceeded. State preserved at checkpoint #3.
Agent at 85% of daily spend cap ($42.50 / $50.00). Current task paused pending review.
Approval for TSK-043 pending > 2h. Escalation timeout is 4h. Second reminder sent to founder.
Hourly spend $4.23 exceeds 2x rolling average ($1.82). Breakdown: inference $3.54, compute $0.69.
Scheduled heartbeat at 08:30 delayed by 14s. Cause: cold start after 6h pause. Resumed normally.
Rubric score 0.61 below threshold 0.80 on content-write. Auto-retry triggered. Attempt 2/3.
See your agents in action.
Every action logged. Every cost tracked. Every decision replayable. Full-stack observability for autonomous agents.
