Context Management

OpenViber vibers are stateless between requests — no in-process memory carries across calls. The Board (or caller) owns session history and decides what to include in each request. The strategies below keep context useful and within model limits.

1. What Counts as Context

Context is everything sent to the model for a single run:

| Layer | Examples | Typical Size |
| --- | --- | --- |
| System prompt | Rules, tool definitions, skill instructions, runtime info | 2-6K tokens |
| Personalization | `soul.md`, `user.md`, `memory.md` (see personalization.md) | 1-3K tokens |
| Conversation history | Prior user/assistant turns, compacted summaries | Variable |
| Tool calls & results | Function calls, returned data, attachments | Often the largest contributor |
| Workspace files | Code, docs, configs selected for the task | Variable |

Memory is separate from context. Memory lives on disk or in Board-managed stores (see memory.md) and becomes context only when it is explicitly injected into a request.

2. Context Budget

Each model has a fixed context window. The Board should maintain a budget breakdown:

┌─────────────────────────────────────────────────┐
│ Model context window (e.g., 128K tokens)        │
├─────────────────────────────────────────────────┤
│ System prompt + tools + skills     │  ~4K       │
│ Personalization (soul/user/memory) │  ~2K       │
│ Reserved for response              │  ~4K       │
│ ─────────────────────────────────────────────── │
│ Available for history + workspace  │  ~118K     │
└─────────────────────────────────────────────────┘
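
The arithmetic in the box above can be sketched as a small helper. This is a minimal illustration, not an OpenViber API: the `ContextBudget` shape and its field names are assumptions, and the numbers are the example figures from the diagram.

```typescript
// Hypothetical shape; field names are illustrative, not an OpenViber API.
interface ContextBudget {
  windowTokens: number;          // model context window
  systemTokens: number;          // system prompt + tools + skills
  personalizationTokens: number; // soul/user/memory files
  responseReserveTokens: number; // headroom kept for the model's reply
}

// Tokens left for conversation history and workspace files (never negative).
function availableTokens(b: ContextBudget): number {
  const fixed = b.systemTokens + b.personalizationTokens + b.responseReserveTokens;
  return Math.max(0, b.windowTokens - fixed);
}

const exampleBudget: ContextBudget = {
  windowTokens: 128000,
  systemTokens: 4000,
  personalizationTokens: 2000,
  responseReserveTokens: 4000,
};

console.log(availableTokens(exampleBudget)); // 118000
```

Keeping the response reserve as an explicit field makes it harder to spend that headroom on history by accident.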

Board Diagnostics

The Board should expose simple diagnostics so operators know when compaction is needed:

Context breakdown: tokens used by each layer (system, tools, skills, history, files).
Per-file limits: indication when a workspace file was truncated.
Session gauge: current token usage vs. model limit.
Compaction hint: visual indicator when history exceeds 70% of available budget.
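
One way to derive the session gauge and compaction hint from a per-layer breakdown is sketched below. The type names, the `diagnose` function, and the sample token counts are all hypothetical; only the layer names and the 70% threshold come from the text above.

```typescript
// Layer names follow the breakdown listed above; the types themselves are hypothetical.
type ContextLayer = "system" | "tools" | "skills" | "history" | "files";
const CONTEXT_LAYERS: ContextLayer[] = ["system", "tools", "skills", "history", "files"];

interface ContextDiagnostics {
  totalTokens: number;
  gauge: number;           // fraction of the model limit currently in use
  compactionHint: boolean; // true when history exceeds threshold * available budget
}

function diagnose(
  usage: Record<ContextLayer, number>,
  modelLimit: number,
  availableBudget: number,
  threshold: number = 0.7,
): ContextDiagnostics {
  let totalTokens = 0;
  for (const layer of CONTEXT_LAYERS) totalTokens += usage[layer];
  return {
    totalTokens,
    gauge: totalTokens / modelLimit,
    compactionHint: usage.history > threshold * availableBudget,
  };
}

const exampleReport = diagnose(
  { system: 2000, tools: 1500, skills: 500, history: 90000, files: 10000 },
  128000,
  118000,
);
console.log(exampleReport.totalTokens, exampleReport.gauge, exampleReport.compactionHint);
// 104000 0.8125 true
```

Note that the gauge is measured against the model limit while the hint is measured against the available budget, matching the two diagnostics as described.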

3. Compaction (Summarize Older History)

When history approaches the compaction threshold, the Board summarizes older turns and replaces the originals with that summary.

Strategy

Keep recent turns intact — the last N turns (typically 4-8) stay verbatim for continuity.
Summarize older turns — replace everything before the keep-window with a structured summary.
Persist the summary — store it in Board-managed session history so future requests include it.
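
The keep-window strategy above can be sketched as follows. This is an illustration under assumptions: the `Turn` shape, the `"summary"` role, and the `summarize` callback are hypothetical, and a real summarizer would likely call a model asynchronously. Persisting the result to session history is left to the caller.

```typescript
interface Turn {
  role: "user" | "assistant" | "summary";
  content: string;
}

// `summarize` is a hypothetical callback; a real one would likely call a model and be async.
function compactHistory(
  turns: Turn[],
  keepLast: number,
  summarize: (older: Turn[]) => string,
): Turn[] {
  const keep = Math.max(0, keepLast);
  if (turns.length <= keep) return turns; // nothing old enough to compact
  const cut = turns.length - keep;
  const older = turns.slice(0, cut); // replaced by a single summary entry
  const recent = turns.slice(cut);   // kept verbatim for continuity
  const summary: Turn = { role: "summary", content: summarize(older) };
  return [summary, ...recent];
}

const tenTurns: Turn[] = [];
for (let i = 0; i < 10; i++) {
  tenTurns.push({ role: i % 2 === 0 ? "user" : "assistant", content: `turn ${i}` });
}
const compacted = compactHistory(
  tenTurns,
  4,
  (older) => `## Session Summary (compacted)\n- ${older.length} earlier turns`,
);
console.log(compacted.length); // 5
```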

Summary Format

## Session Summary (compacted)
- User asked to build a landing page with dark theme.
- Created `index.html` with Tailwind CSS setup.
- Added hero section and responsive navigation.
- User approved the design; requested footer changes.
- Footer updated with three-column layout.

Trigger Modes

| Mode | Behavior |
| --- | --- |
| Automatic | Board compacts when history exceeds the compaction threshold (e.g., 70% of available budget) |
| Manual | Operator explicitly requests compaction via Board UI or `/compact` command |
| Pre-flush | Memory flush happens first (see Section 5), then compaction runs |

4. Pruning (Tool-Result Hygiene)

Tool outputs are often the largest token contributor. The Board should aggressively manage them:

Truncate oversized outputs — cap tool results at a configurable max (e.g., 8K tokens). Include a [truncated, {N} tokens omitted] marker.
Drop stale tool results — historical tool outputs beyond the keep-window can be replaced with a one-line summary: [tool: read_file("src/app.ts") → 247 lines, TypeScript].
Preserve verification evidence — keep tool results that serve as proof of task completion (screenshots, test output, build logs).

Example: Before and After Pruning

Before (12K tokens of tool results):

tool_result: read_file("src/components/App.tsx") → [full 500-line file contents]
tool_result: search_web("react hooks best practices") → [full search results]
tool_result: write_file("src/hooks/useAuth.ts") → [file written successfully]

After (200 tokens):

[tool: read_file("src/components/App.tsx") → 500 lines, React component]
[tool: search_web("react hooks best practices") → 8 results summarized]
tool_result: write_file("src/hooks/useAuth.ts") → [file written successfully]
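
The pruning rules in this section could be implemented along these lines. Everything here is a sketch: the `ToolResult` shape, the `evidence` flag, and the `describe` callback are hypothetical, and the 4-characters-per-token estimate is a rough assumption that a real tokenizer should replace.

```typescript
interface ToolResult {
  tool: string;       // e.g. "read_file"
  args: string;       // e.g. '"src/app.ts"'
  output: string;
  evidence?: boolean; // true for test output, build logs, screenshots
}

// Rough assumption: about 4 characters per token. A real tokenizer is more accurate.
const CHARS_PER_TOKEN = 4;
const estimateTokens = (text: string): number => Math.ceil(text.length / CHARS_PER_TOKEN);

// Cap a single result at maxTokens and append the omission marker.
function truncateOutput(output: string, maxTokens: number): string {
  const total = estimateTokens(output);
  if (total <= maxTokens) return output;
  const kept = output.slice(0, maxTokens * CHARS_PER_TOKEN);
  return `${kept}\n[truncated, ${total - maxTokens} tokens omitted]`;
}

// Replace a stale result with a one-line stub unless it is verification evidence.
function pruneStale(result: ToolResult, describe: (r: ToolResult) => string): string {
  if (result.evidence) {
    return `tool_result: ${result.tool}(${result.args}) → ${result.output}`;
  }
  return `[tool: ${result.tool}(${result.args}) → ${describe(result)}]`;
}

const stale: ToolResult = { tool: "read_file", args: '"src/app.ts"', output: "a\nb\nc" };
console.log(pruneStale(stale, (r) => `${r.output.split("\n").length} lines`));
// [tool: read_file("src/app.ts") → 3 lines]
```

Marking evidence explicitly, rather than inferring it from the tool name, leaves the decision about what counts as proof of completion with whoever records the result.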

5. Memory Flushes

Before compaction discards older history, a memory flush can capture durable insights into the memory system (see memory.md).

The flush extracts:

Decisions made — “User chose Tailwind over styled-components.”
Preferences discovered — “User prefers concise responses, no emojis.”
Key artifacts — “Landing page lives at src/routes/+page.svelte.”
Patterns observed — “This repo uses conventional commits.”

These notes are written to memory.md or the daily memory log and may be re-injected in future sessions without keeping the entire history.
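
A minimal sketch of the flush-then-compact ordering is shown below. The note categories mirror the four bullet types above, but the `MemoryNote` shape, the bullet format, and the `extract`, `persist`, and `compact` callbacks are assumptions; the actual layout of memory.md is defined elsewhere.

```typescript
type NoteKind = "decision" | "preference" | "artifact" | "pattern";

interface MemoryNote {
  kind: NoteKind;
  text: string;
}

// Render notes as bullets; this line format is an assumption, not the memory.md spec.
function formatNotes(notes: MemoryNote[]): string {
  return notes.map((n) => `- [${n.kind}] ${n.text}`).join("\n");
}

// Flush durable notes first, then let compaction discard the older turns.
function flushThenCompact(
  extract: () => MemoryNote[],
  persist: (markdown: string) => void,
  compact: () => void,
): void {
  const notes = extract();
  if (notes.length > 0) persist(formatNotes(notes));
  compact(); // runs only after the notes have been handed to persist
}

console.log(
  formatNotes([
    { kind: "decision", text: "User chose Tailwind over styled-components." },
    { kind: "pattern", text: "This repo uses conventional commits." },
  ]),
);
// - [decision] User chose Tailwind over styled-components.
// - [pattern] This repo uses conventional commits.
```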

6. Session Reset

If a session becomes too noisy or context-polluted:

Start a new session ID — clean slate in the Board.
Transfer summary — carry a short summary of the prior session’s outcome.
Re-inject memory — load relevant memory entries from memory.md.
Discard stale tool results — only bring forward conclusions, not raw outputs.

This keeps the daemon stateless while preserving continuity in the Board.
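
The reset steps above might be assembled like this. The `SessionSeed` shape, the `seedNewSession` function, and the `isRelevant` filter are hypothetical; how the Board generates session IDs and selects memory entries is not specified here.

```typescript
interface SessionSeed {
  sessionId: string;
  context: string[]; // blocks injected at the start of the new session
}

function seedNewSession(
  newSessionId: string,
  priorSummary: string,
  memoryEntries: string[],
  isRelevant: (entry: string) => boolean,
): SessionSeed {
  const memory = memoryEntries.filter(isRelevant);
  return {
    sessionId: newSessionId,
    // Only conclusions travel forward; raw tool outputs from the old session are not copied.
    context: [`Prior session summary: ${priorSummary}`, ...memory],
  };
}

const seed = seedNewSession(
  "session-2",
  "Landing page built and approved; footer updated.",
  ["User prefers concise responses, no emojis.", "Unrelated note about another repo."],
  (entry) => entry.indexOf("Unrelated") === -1,
);
console.log(seed.context.length); // 2
```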

7. Handling Context Overflow

When context exceeds the model’s window despite compaction, the Board should follow the recovery path defined in error-handling.md:

Emergency compaction — aggressively summarize all but the last 2 turns.
Tool result purge — replace all historical tool results with one-line summaries.
File content drop — remove injected workspace files (they can be re-read by tools).
Session reset — as a last resort, start a fresh session with a transfer summary.

The daemon reports context overflow via `task:error` with error type `context_overflow`, giving the Board a chance to compact and retry.
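
The escalation order above can be sketched as a retry loop. This is an illustration only: the step names, `recoverFromOverflow`, and both callbacks are hypothetical, the authoritative recovery path is the one in error-handling.md, and a real implementation would be asynchronous.

```typescript
type RecoveryStep =
  | "emergency_compaction"
  | "tool_result_purge"
  | "file_content_drop"
  | "session_reset";

const RECOVERY_ORDER: RecoveryStep[] = [
  "emergency_compaction",
  "tool_result_purge",
  "file_content_drop",
  "session_reset",
];

// `attempt` returns true on success and false on a context_overflow error.
// `apply` performs one recovery step. Both are hypothetical callbacks.
function recoverFromOverflow(
  attempt: () => boolean,
  apply: (step: RecoveryStep) => void,
): RecoveryStep | "not_needed" | "exhausted" {
  if (attempt()) return "not_needed";
  for (const step of RECOVERY_ORDER) {
    apply(step);
    if (attempt()) return step; // the last step applied before the request fit
  }
  return "exhausted";
}
```

Each step is applied on top of the previous ones, so the return value names the most drastic measure that was needed.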