picoclaw

mirror of https://github.com/sipeed/picoclaw.git synced 2026-06-12 18:08:54 +00:00

Author	SHA1	Message	Date
xiaoen	8034ee7be1	fix(agent): correct media token arithmetic and tool call double-counting Two estimation bugs fixed: 1. Media tokens were added to the chars accumulator before the chars2/5 conversion, resulting in 2562/5=102 tokens per item instead of 256. Fix: add media tokens directly to the final token count, bypassing the character-based heuristic. 2. estimateMessageTokens counted both tc.Name and tc.Function.Name for tool calls, but providers only send one (OpenAI-compat uses function.name, Anthropic uses tc.Name). Fix: count tc.Function.Name when Function is present, fall back to tc.Name only otherwise. Also fix i18n hint text: "auto-detect" was misleading — the backend uses a 4x max_tokens heuristic, not actual model detection.	2026-03-16 14:48:34 +08:00
xiaoen	639739cb85	refactor(agent): use Turn as the atomic unit for compression cut-off Introduce parseTurnBoundaries() which identifies each Turn start index in the session history. A Turn is a complete "user input → LLM iterations → final response" cycle (as defined in the agent refactor design #1316). findSafeBoundary now uses Turn boundaries instead of raw role-scanning, making the intent explicit: "find the nearest Turn boundary." forceCompression drops the oldest half of Turns (not arbitrary messages), which is simpler and more intuitive. The Turn-based approach naturally prevents splitting tool-call sequences since each Turn is atomic.	2026-03-16 14:48:34 +08:00
xiaoen	d5fdd5ebd2	fix(agent): include ReasoningContent and Media in token estimation estimateMessageTokens now counts ReasoningContent (extended thinking / chain-of-thought) which can be substantial and is persisted in session history. Media items get a fixed per-item overhead (256 tokens) since actual cost depends on provider-specific image tokenization.	2026-03-16 14:48:33 +08:00
xiaoen	9c82b0baa2	refactor(agent): context boundary detection, proactive budget check, and safe compression Separate context_window from max_tokens — they serve different purposes (input capacity vs output generation limit). The previous conflation caused premature summarization or missed compression triggers. Changes: - Add context_window field to AgentDefaults config (default: 4x max_tokens) - Extract boundary-safe truncation helpers (isSafeBoundary, findSafeBoundary) into context_budget.go — pure functions with no AgentLoop dependency - forceCompression: align split to safe boundary so tool-call sequences (assistant+ToolCalls → tool results) are never torn apart - summarizeSession: use findSafeBoundary instead of hardcoded keep-last-4 - estimateTokens: count ToolCalls arguments and ToolCallID metadata, not just Content — fixes systematic undercounting in tool-heavy sessions - Add proactive context budget check before LLM call in runAgentLoop, preventing 400 context-length errors instead of reacting to them - Add estimateToolDefsTokens for tool definition token cost Closes #556, closes #665 Ref #1439	2026-03-16 14:48:32 +08:00

4 Commits