Mirror of https://github.com/sipeed/picoclaw.git (synced 2026-06-12 18:08:54 +00:00)
refactor(agent): context boundary detection, proactive budget check, and safe compression
Separate context_window from max_tokens: they serve different purposes (input capacity vs. output generation limit). The previous conflation caused premature summarization or missed compression triggers.

Changes:
- Add context_window field to AgentDefaults config (default: 4x max_tokens)
- Extract boundary-safe truncation helpers (isSafeBoundary, findSafeBoundary) into context_budget.go; these are pure functions with no AgentLoop dependency
- forceCompression: align split to safe boundary so tool-call sequences (assistant+ToolCalls → tool results) are never torn apart
- summarizeSession: use findSafeBoundary instead of hardcoded keep-last-4
- estimateTokens: count ToolCalls arguments and ToolCallID metadata, not just Content; fixes systematic undercounting in tool-heavy sessions
- Add proactive context budget check before LLM call in runAgentLoop, preventing 400 context-length errors instead of reacting to them
- Add estimateToolDefsTokens for tool definition token cost

Closes #556, closes #665
Ref #1439
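Reviewer note: the boundary-safe split described in the commit message can be illustrated with a minimal sketch. The helper names isSafeBoundary and findSafeBoundary come from the commit message, but the bodies below are assumptions, not the code in context_budget.go. The Message and ToolCall types are simplified, hypothetical stand-ins, and the search order (backward first, then forward) is one plausible choice.

```go
package main

import "fmt"

// ToolCall and Message are hypothetical, simplified stand-ins for the
// provider message types; the real structs may carry more fields.
type ToolCall struct {
	ID        string
	Arguments string
}

type Message struct {
	Role       string // "user", "assistant", or "tool"
	Content    string
	ToolCalls  []ToolCall
	ToolCallID string
}

// isSafeBoundary reports whether history can be split so that msgs[:i] is
// dropped or summarized and msgs[i:] is kept, without separating an
// assistant tool-call message from its tool results. (Assumed logic.)
func isSafeBoundary(msgs []Message, i int) bool {
	if i <= 0 || i >= len(msgs) {
		return true // nothing is torn apart at either end
	}
	if msgs[i].Role == "tool" {
		return false // the kept part would start with an orphaned tool result
	}
	prev := msgs[i-1]
	// An assistant message with pending tool calls must stay with what follows.
	return !(prev.Role == "assistant" && len(prev.ToolCalls) > 0)
}

// findSafeBoundary returns the safe split index nearest to target, scanning
// backward first (which keeps more history) and then forward. It returns 0
// (keep everything) when no interior split is safe. (Assumed search order.)
func findSafeBoundary(msgs []Message, target int) int {
	if target < 0 {
		target = 0
	}
	if target > len(msgs) {
		target = len(msgs)
	}
	for i := target; i > 0; i-- {
		if isSafeBoundary(msgs, i) {
			return i
		}
	}
	for i := target + 1; i < len(msgs); i++ {
		if isSafeBoundary(msgs, i) {
			return i
		}
	}
	return 0
}

// sampleHistory builds a short session containing one tool-call sequence.
func sampleHistory() []Message {
	return []Message{
		{Role: "user", Content: "list files"},
		{Role: "assistant", ToolCalls: []ToolCall{{ID: "a"}, {ID: "b"}}},
		{Role: "tool", ToolCallID: "a", Content: "main.go"},
		{Role: "tool", ToolCallID: "b", Content: "go.mod"},
		{Role: "assistant", Content: "Two files found."},
		{Role: "user", Content: "thanks"},
		{Role: "assistant", Content: "You're welcome."},
	}
}

func main() {
	msgs := sampleHistory()
	// A naive midpoint split at index 3 would land between two tool results.
	fmt.Println("safe split for target 3:", findSafeBoundary(msgs, 3))
}
```

In this sketch a target that falls inside the tool-call sequence is moved back to the assistant message that issued the calls, so the whole sequence is either kept or summarized together.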
+12
-1
@@ -127,6 +127,17 @@ func NewAgentInstance(
 		maxTokens = 8192
 	}

+	contextWindow := defaults.ContextWindow
+	if contextWindow == 0 {
+		// Default heuristic: 4x the output token limit.
+		// Most models have context windows well above their output limits
+		// (e.g., GPT-4o 128k ctx / 16k out, Claude 200k ctx / 8k out).
+		// 4x is a conservative lower bound that avoids premature
+		// summarization while remaining safe — the reactive
+		// forceCompression handles any overshoot.
+		contextWindow = maxTokens * 4
+	}
+
 	temperature := 0.7
 	if defaults.Temperature != nil {
 		temperature = *defaults.Temperature
@@ -224,7 +235,7 @@ func NewAgentInstance(
 		MaxTokens:                 maxTokens,
 		Temperature:               temperature,
 		ThinkingLevel:             thinkingLevel,
-		ContextWindow:             maxTokens,
+		ContextWindow:             contextWindow,
 		SummarizeMessageThreshold: summarizeMessageThreshold,
 		SummarizeTokenPercent:     summarizeTokenPercent,
 		Provider:                  provider,
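Reviewer note: a minimal sketch of how the resolved context window might feed a proactive budget check. Only the 4x fallback is taken from the diff above. The function names resolveContextWindow and exceedsBudget, the four-characters-per-token estimate, and the choice to reserve max_tokens for output are all assumptions made for illustration; the Message and ToolCall types are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// ToolCall and Message are hypothetical, simplified message types.
type ToolCall struct {
	Name      string
	Arguments string
}

type Message struct {
	Content    string
	ToolCalls  []ToolCall
	ToolCallID string
}

// resolveContextWindow mirrors the fallback in the diff: an unset (zero)
// context window defaults to 4x the output token limit.
func resolveContextWindow(configured, maxTokens int) int {
	if configured > 0 {
		return configured
	}
	return maxTokens * 4
}

// estimateTokens counts message content plus tool-call names, arguments,
// and tool-call IDs. The divisor of 4 characters per token is an assumed
// rough heuristic, not the project's actual ratio.
func estimateTokens(msgs []Message) int {
	chars := 0
	for _, m := range msgs {
		chars += utf8.RuneCountInString(m.Content)
		chars += utf8.RuneCountInString(m.ToolCallID)
		for _, tc := range m.ToolCalls {
			chars += utf8.RuneCountInString(tc.Name)
			chars += utf8.RuneCountInString(tc.Arguments)
		}
	}
	return chars / 4
}

// exceedsBudget reports whether the request would overflow the context
// window once room for the reply (maxTokens) is reserved. (Assumed policy.)
func exceedsBudget(msgTokens, toolDefTokens, maxTokens, contextWindow int) bool {
	return msgTokens+toolDefTokens+maxTokens > contextWindow
}

func main() {
	maxTokens := 8192
	window := resolveContextWindow(0, maxTokens) // context_window unset in config

	history := []Message{
		{Content: "read the config file"},
		{ToolCalls: []ToolCall{{Name: "read_file", Arguments: `{"path":"config.json"}`}}},
	}
	used := estimateTokens(history)

	fmt.Println("context window:", window)
	fmt.Println("estimated input tokens:", used)
	fmt.Println("compress before calling the LLM:", exceedsBudget(used, 0, maxTokens, window))
}
```

Running the check before the request is what lets the loop compress history up front instead of waiting for the provider to reject the call with a context-length error.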