Issues / #100
Surface agent errors to the user instead of swallowing them in webchat SSE
open
improvement
Project: nimsforestwebchat
Reporter:
1 May 2026 18:47
Description
When a nim's brain returns an error today, the user sees nothing: the webchat SSE stream just goes quiet. Reproduced during issue #95: nova hit `error calling brain: agent returned error` four times for Emilie, nothing rendered in the UI, and she filed a ticket saying "can't talk to the sims". The investigation initially chased the wrong cause (the iamnim cookie) because the failure was indistinguishable from "network slow / agent thinking".
Fix structurally: define a stable error contract on `song.webchat.<session_id>` so the frontend can render a visible "that nim failed — retry" bubble. The leaf should carry:
- A coarse, user-safe reason ("agent unavailable", "timed out", "message too large") — never internal stack traces.
- A correlation id pointing to the `agent.work.ai.*` work id so ops can grep server logs.
- An "is_terminal" flag so the frontend knows whether to stop the spinner.
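As a starting point for discussion, the three fields above could be sketched as a TypeScript type plus a runtime guard. Everything named here is an assumption rather than an agreed schema: the `kind` discriminator, the `reason` codes, the `correlation_id` field name, the bubble wording, and the helpers `isWebchatErrorLeaf` and `toErrorBubble` are all hypothetical.

```typescript
// Sketch only: field names and reason codes are assumptions, not an agreed schema.
type ErrorReason = "agent_unavailable" | "timed_out" | "message_too_large";

interface WebchatErrorLeaf {
  kind: "error";
  reason: ErrorReason;    // coarse and user-safe; never a stack trace
  correlation_id: string; // the agent.work.ai.* work id, for log lookup
  is_terminal: boolean;   // true: no more output is coming for this turn
}

// User-facing copy per reason (wording is illustrative).
const USER_TEXT: Record<ErrorReason, string> = {
  agent_unavailable: "Agent unavailable",
  timed_out: "Timed out",
  message_too_large: "Message too large",
};

// Runtime check, so a malformed payload or an unknown reason is never rendered.
function isWebchatErrorLeaf(value: unknown): value is WebchatErrorLeaf {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const reason = v["reason"];
  return (
    v["kind"] === "error" &&
    typeof reason === "string" &&
    Object.prototype.hasOwnProperty.call(USER_TEXT, reason) &&
    typeof v["correlation_id"] === "string" &&
    typeof v["is_terminal"] === "boolean"
  );
}

// What the bubble component would need in order to render the failure.
interface ErrorBubble {
  text: string;
  stopSpinner: boolean;
  reference: string;
}

function toErrorBubble(leaf: WebchatErrorLeaf): ErrorBubble {
  return {
    text: `That nim failed (${USER_TEXT[leaf.reason]}). Retry?`,
    stopSpinner: leaf.is_terminal,
    reference: leaf.correlation_id,
  };
}
```

Restricting `reason` to a closed set of codes is one way to enforce the "never internal stack traces" rule: a payload carrying free-form error text fails the guard and is not rendered.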
Touches:
- Nim Runner in nimsforest2 (where brain errors are currently logged but not propagated as a song leaf).
- nimsforestwebchat SSE handler + frontend bubble component.
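On the SSE handler side, one possible way to put such a leaf on the wire is a named event in the standard `text/event-stream` framing. This is a minimal sketch, not the existing handler: the event name `nim_error`, the payload field names, and the helpers `formatSseFrame` and `errorFrame` are hypothetical.

```typescript
// Sketch only: event name and payload fields are assumptions.
interface ErrorPayload {
  reason: string;
  correlation_id: string;
  is_terminal: boolean;
}

// Build one SSE frame: an "event:" line, a "data:" line, then a blank line.
// JSON.stringify emits no raw newlines, so a single data line is sufficient.
function formatSseFrame(eventName: string, payload: unknown): string {
  if (/[\r\n]/.test(eventName)) {
    throw new Error("SSE event name must not contain line breaks");
  }
  return `event: ${eventName}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Convenience wrapper for the error case.
function errorFrame(reason: string, workId: string, isTerminal: boolean): string {
  const payload: ErrorPayload = {
    reason,
    correlation_id: workId,
    is_terminal: isTerminal,
  };
  return formatSseFrame("nim_error", payload);
}
```

A distinct event name is used rather than `error` because a browser `EventSource` already fires its own `error` event on connection problems, and sharing the name would make the two cases hard to tell apart in the frontend.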
Relates: #95.