Hydra · Hydra Support

Analytics & Reporting

AI Deflection dashboard

Measure what your bots are actually doing — how many conversations they resolve without a human, how often they hand off, why they hand off, and how many turns each conversation takes.

The Deflection dashboard answers one question: how much work are your bots actually saving you?

Find it under Analytics → AI Deflection. The page is admin-only and updates in real time — every bot conversation that completes contributes to the metrics on the next page load.

What gets measured

Everything on this page is scoped to conversations created in the selected window (7, 30, or 90 days). A conversation is "in the window" if its start time falls in the range, regardless of when it was resolved or handed off — so a conversation that started yesterday and resolves today still counts toward today's window.
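The windowing rule above boils down to a single comparison on the conversation's start time. A minimal sketch, assuming an illustrative dict schema (not Hydra's actual data model):

```python
from datetime import datetime, timedelta, timezone

def in_window(conversation, days):
    # A conversation is "in the window" if its start time falls in the
    # range; when it resolved or handed off is irrelevant.
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    return conversation["created_at"] >= cutoff

# Started yesterday, resolved today: still counts toward a 7-day window.
convo = {
    "created_at": datetime.now(timezone.utc) - timedelta(days=1),
    "resolved_at": datetime.now(timezone.utc),
}
print(in_window(convo, days=7))  # True
```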

Headline tiles

  • Bot Conversations — every conversation in the window where a bot was assigned. This is the denominator for the percentages that follow.
  • Resolved by Bot — conversations that ended in the resolved status AND the bot never invoked its human-handoff tool. These are the wins: the bot solved the customer's problem on its own.
  • Handoffs — conversations where the bot's handoff tool fired, regardless of what happened next. A handoff is unambiguously bot-initiated; manual reassignment by an agent does not count toward this metric.
  • Avg Bot Turns — average number of assistant messages the bot produced per conversation. A higher number isn't necessarily worse (some problems take more back-and-forth), but a sudden jump can indicate the bot is struggling to find an answer.
  • Claude Spend — total Anthropic API spend in the window across every bot. Counts main chat turns, the post-turn safety-net check, and conversation summaries. Non-bot Claude usage (mini-app generation, help chat, onboarding) is tracked separately and does not appear here.
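Putting the tile definitions together, here is a rough sketch of how they relate. The conversation schema (`status`, `bot_handoff`, `bot_turns`, `spend_usd`) is illustrative, not Hydra's actual one:

```python
def headline_tiles(conversations):
    # "Resolved by Bot" requires BOTH conditions: resolved status and no
    # handoff-tool invocation. A handed-off conversation that an agent
    # later resolves is a handoff, not a bot win.
    total = len(conversations)
    resolved_by_bot = sum(
        1 for c in conversations
        if c["status"] == "resolved" and not c["bot_handoff"]
    )
    handoffs = sum(1 for c in conversations if c["bot_handoff"])
    return {
        "bot_conversations": total,  # denominator for the percentages
        "resolved_by_bot": resolved_by_bot,
        "handoffs": handoffs,
        "avg_bot_turns": (sum(c["bot_turns"] for c in conversations) / total
                          if total else 0.0),
        "claude_spend_usd": sum(c["spend_usd"] for c in conversations),
    }

sample = [
    {"status": "resolved", "bot_handoff": False, "bot_turns": 3, "spend_usd": 0.02},
    {"status": "resolved", "bot_handoff": True,  "bot_turns": 5, "spend_usd": 0.04},
    {"status": "open",     "bot_handoff": False, "bot_turns": 2, "spend_usd": 0.01},
]
print(headline_tiles(sample))
```

Note the second sample conversation: it ended resolved, but because the handoff tool fired it counts as a handoff, not a bot resolution.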

Resolved-by-bot over time

Line chart of the resolved-by-bot percentage by day (or by week, on the 90-day view). Buckets with zero conversations show as gaps in the line — not as zero — so a quiet weekend doesn't make it look like the bot stopped working.
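The gap-versus-zero behavior can be sketched as follows: empty buckets emit `None` (a gap in the line) rather than `0`. Field names are illustrative:

```python
from collections import defaultdict
from datetime import date, timedelta

def resolved_pct_series(conversations, start, end):
    # Tally wins/totals per day, keyed by the conversation's start date.
    buckets = defaultdict(lambda: [0, 0])  # day -> [resolved_by_bot, total]
    for c in conversations:
        day = c["created_at"]
        buckets[day][1] += 1
        if c["status"] == "resolved" and not c["bot_handoff"]:
            buckets[day][0] += 1
    # Emit one point per day; days with no conversations become None,
    # which renders as a gap rather than a dip to zero.
    series, day = [], start
    while day <= end:
        wins, total = buckets.get(day, (0, 0))
        series.append(100.0 * wins / total if total else None)
        day += timedelta(days=1)
    return series

convos = [
    {"created_at": date(2026, 5, 1), "status": "resolved", "bot_handoff": False},
    {"created_at": date(2026, 5, 1), "status": "open", "bot_handoff": False},
    # no conversations on 2026-05-02 (a quiet day)
    {"created_at": date(2026, 5, 3), "status": "resolved", "bot_handoff": False},
]
print(resolved_pct_series(convos, date(2026, 5, 1), date(2026, 5, 3)))
# [50.0, None, 100.0]
```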

A clean upward trend usually means knowledge base updates are landing well. A drop coinciding with a product release usually means the bot needs to learn the new feature.

Per-bot breakdown

The same metrics, broken down by bot, sorted by conversation volume. The Spend column shows that bot's portion of Claude API cost in the window. Click a bot's name to jump to its configuration page if you want to tune its behavior.

A bot with high conversation volume but disproportionately high spend is usually one of two things: a verbose persona producing long replies, or a knowledge base that's forcing the model to read a lot of context every turn. Both are tunable.

Top handoff reasons

Each time a bot hands off, it provides a one-line reason — captured directly from the model's reasoning when it fired the handoff tool. The dashboard groups these reasons by exact text match and shows the top 10.

Heads up: since these are raw text strings, two semantically identical reasons phrased differently ("user asked for a human" vs. "customer wants to speak with someone") will show as two separate buckets. We'll add LLM clustering in a future release once enough volume makes the noise visible.
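The grouping is a plain exact-match tally with no normalization, which is exactly why rephrasings split into separate buckets. A sketch:

```python
from collections import Counter

def top_handoff_reasons(reasons, n=10):
    # Exact string match: no lowercasing, no semantic clustering.
    return Counter(reasons).most_common(n)

reasons = [
    "user asked for a human",
    "user asked for a human",
    "customer wants to speak with someone",  # same intent, separate bucket
]
print(top_handoff_reasons(reasons))
# [('user asked for a human', 2), ('customer wants to speak with someone', 1)]
```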

How to act on what you see

  • Resolved % is low → look at the top handoff reasons. If they cluster around a topic the bot should be able to handle (refunds, password resets, hours), update the bot's knowledge base or directives.
  • Handoff % is climbing week over week → check whether your knowledge base has stale answers, or whether new product changes haven't been reflected in the bot's training material.
  • Avg Bot Turns is unusually high → the bot is going round and round before resolving. Often a sign of a knowledge gap or an ambiguous answer; check the affected conversations for repeated clarifying questions.
  • A specific bot is dragging the average down → use the per-bot table to isolate which bot needs attention; tune its persona or behaviors instead of changing settings tenant-wide.

A note on cost backfill

Claude usage tracking started on 2026-04-28. Conversations and API calls from before that date are not included in the Claude Spend tile or the per-bot Spend column. Everything else on the dashboard (conversation counts, resolved %, handoff %, turns) does include historical data.

If a recently resolved conversation shows zero spend in its bot's row but did produce real API calls, check the conversation's created_at — if it's before 2026-04-28, that's expected.
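The cutoff check amounts to a single date comparison. The date comes from this article; the function itself is a hypothetical sketch, not Hydra's internal code:

```python
from datetime import date

SPEND_TRACKING_START = date(2026, 4, 28)  # Claude usage tracking start

def spend_is_tracked(created_at):
    # Conversations created before the cutoff legitimately show zero
    # spend, even if they produced real API calls.
    return created_at >= SPEND_TRACKING_START

print(spend_is_tracked(date(2026, 4, 27)))  # False -- zero spend is expected
print(spend_is_tracked(date(2026, 4, 28)))  # True  -- included in spend tiles
```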

Why we measure handoffs separately from "escalations"

Hydra tracks two different "I need a human" signals:

  • Handoff — the bot itself decided the conversation needed a human and called its handoff tool. This shows on this dashboard.
  • Escalation — an agent in your inbox clicked "Escalate to Ticket" to convert a conversation into a ticket. This is a human-driven decision and shows on the Reports → Operational Dashboard, not here.

Keeping them separate lets you measure bot quality (handoffs) independently of agent workload (escalations).
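One way to picture the separation is as two independent fields on a conversation. This is a hypothetical model with illustrative names, not Hydra's schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Conversation:
    # Bot-driven signal: set when the bot's handoff tool fires.
    bot_handoff_reason: Optional[str] = None
    # Human-driven signal: set when an agent clicks "Escalate to Ticket".
    escalated_to_ticket: bool = False

# A bot handoff does not imply an escalation, or vice versa, so bot
# quality and agent workload can be measured independently.
c = Conversation(bot_handoff_reason="user asked for a human")
print(c.bot_handoff_reason is not None, c.escalated_to_ticket)  # True False
```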