Issues / #99
Bound per-session conversation history with rollover summarization
open
improvement
Project: nimsforest2
Reporter:
1 May 2026 18:47
Description
Conversations stored under `conversations.<nim>.<session>` in Soil grow unbounded (nova for one user is already 179,551 bytes). The assistant prompt is rebuilt by inlining the full transcript on every turn, so growth is linear forever and bumps into every downstream limit eventually:
- Linux execve argv size on tool-less nims (just hit on agentclaudecode v0.9.0 → issue #95)
- NATS default 1MB message ceiling on `agent.work.ai.*` leaves
- Claude context window and per-call cost
Fix structurally: per-session token budget (e.g. 50k–80k input tokens) with a rollover step that summarizes/compresses older turns into a running summary stored alongside the recent-turns tail. Cap on size, not count. Compress in the background when budget is exceeded so the next user turn is not blocked.
Decide:
- Where the cap is enforced (brain composer in nimsforest2 vs. each nim).
- What summarization strategy (LLM call vs. heuristic — LLM is fine if it runs once per overflow, not per turn).
- Whether existing oversized conversations get retroactively compressed or left alone (recommend a one-shot compaction job over Soil keys above the threshold).
Until this lands, every active long-running session is a future incident.
Relates: #95.
- Linux execve argv size on tool-less nims (just hit on agentclaudecode v0.9.0 → issue #95)
- NATS default 1MB message ceiling on `agent.work.ai.*` leaves
- Claude context window and per-call cost
Fix structurally: per-session token budget (e.g. 50k–80k input tokens) with a rollover step that summarizes/compresses older turns into a running summary stored alongside the recent-turns tail. Cap on size, not count. Compress in the background when budget is exceeded so the next user turn is not blocked.
Decide:
- Where the cap is enforced (brain composer in nimsforest2 vs. each nim).
- What summarization strategy (LLM call vs. heuristic — LLM is fine if it runs once per overflow, not per turn).
- Whether existing oversized conversations get retroactively compressed or left alone (recommend a one-shot compaction job over Soil keys above the threshold).
Until this lands, every active long-running session is a future incident.
Relates: #95.