* fix(tools): close resp.Body on retry cancel and cache http.Client instances
Fix resp.Body leak in DoRequestWithRetry where req.Body (request) was
incorrectly closed instead of resp.Body (response) on context cancel.
Cache http.Client on web search/fetch provider structs and channel
adapters (WeCom, LINE) to avoid per-call allocation overhead.
* fix(channels): preserve original http client timeouts for LINE and WeCom
Split LINE single 60s client into infoClient (10s) for bot info lookups
and apiClient (30s) for messaging API calls. Lower WeCom cached client
base timeout from 60s to 30s (matching uploadMedia), and ensure it is
always >= the configured ReplyTimeout so the per-request context
deadline remains the effective limit.
* refactor(tools): extract timeout consts and deduplicate WebFetchTool constructors
Address PR review feedback from xiaket:
- Define searchTimeout, perplexityTimeout, fetchTimeout, defaultMaxChars,
and maxRedirects as package-level consts instead of magic numbers.
- Remove misleading "No proxy" comment in NewWebFetchTool.
- Deduplicate NewWebFetchTool by delegating to NewWebFetchToolWithProxy.
* test(utils): add context cancellation test for DoRequestWithRetry
Verify that resp.Body is properly closed when the context is canceled
during retry sleep, covering the C8 resp.Body leak fix.
* fix(utils): close resp in test to satisfy bodyclose linter
* fix(utils): eliminate flakiness in context cancellation retry test
Synchronize cancellation using an onRoundTrip callback from the
transport wrapper instead of a timing-based context timeout. This
ensures the first client.Do completes before cancel fires, so
cancellation always hits during sleepWithCtx.
- WhatsApp Start(): use deferred cleanup to nil out c.client/c.container
and disconnect/close resources on any error after struct fields are
assigned, preventing stale references and double-close in Stop()
- handleReasoning: treat bus.ErrBusClosed as an expected condition
(DEBUG level) alongside context timeout/cancel, avoiding WARN noise
during normal shutdown
- WhatsApp Send(): detect unpaired state (Store.ID == nil) and return
ErrTemporary instead of attempting to send while QR login is pending
- handleReasoning: check the returned error type (DeadlineExceeded /
Canceled) instead of ctx.Err() to decide log level, so pubCtx
timeouts on a full bus are correctly classified as expected
- Test: fill bus with a short-timeout loop instead of hardcoding the
buffer size (64), making the test resilient to buffer size changes
- Use c.runCtx for GetQRChannel so the QR producer is canceled on Stop()
- Add atomic stopping guard to prevent wg.Add/wg.Wait race in eventHandler
- Make Stop() context-aware: disconnect client before waiting, respect ctx deadline
- Reduce reasoning publish log noise: use debug level for expected ctx errors
- Add test for handleReasoning when outbound bus is full (timeout path)
Add a 5-second timeout to handleReasoning's PublishOutbound call so
fire-and-forget goroutines do not block indefinitely when the outbound
bus channel is full. Reasoning output is best-effort; on timeout the
publish is abandoned with a warning log instead of holding the
goroutine alive.
Fixes goroutine leak introduced in #802.
* fix: remove redundant tools definitions from system prompt
Tools are already provided to the LLM via JSON schema through
ToProviderDefs(), so the text-based tools section in the system
prompt is redundant.
This removes the buildToolsSection() logic and the tools field
from ContextBuilder, reducing system prompt length while maintaining
the "ALWAYS use tools" rule reminder.
Fixes#731
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix: correct spelling 'initialized' (was 'initialised')
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Avoid rebuilding the entire system prompt on every BuildMessages() call
by caching the static portion (identity, bootstrap, skills summary,
memory) and only recomputing it when workspace source files change.
Key changes:
- ContextBuilder caches the static prompt behind an RWMutex with
double-checked locking. Source file changes are detected via cheap
os.Stat mtime checks so no explicit invalidation is needed.
- Track file existence at cache time (existedAtCache map) so that
newly created or deleted bootstrap/memory files also trigger a
rebuild — the old modifiedSince() silently returned false on
os.IsNotExist.
- Walk the skills directory recursively with filepath.WalkDir to
catch content-only edits at any nesting depth; directory mtime
alone misses in-place file modifications on most filesystems.
- ToolRegistry.sortedToolNames() sorts tool names before iteration,
ensuring deterministic tool definition order across calls — a
prerequisite for LLM-side prefix/KV cache reuse.
- Merge all context (static + dynamic + summary) into a single
system message for provider compatibility: the Anthropic adapter
extracts messages[0] as the top-level system parameter, and Codex
reads only the first system message as instructions.
- Fix a data race in BuildMessages() where cachedSystemPrompt was
read without holding the lock in a debug log statement.
- Add tests: single system message invariant, mtime auto-invalidation,
new-file creation detection, skill file content change, explicit
InvalidateCache, cache stability, concurrent access (20 goroutines
x 50 iterations, passes go test -race), and a benchmark.
HTTP timeouts (context deadline exceeded, Client.Timeout) were
incorrectly classified as context window errors, triggering useless
history compression. Replace broad substring checks ("context",
"token", "length") with specific patterns for real context limit
errors and explicitly exclude timeout errors from that path.
Additionally, timeout errors were not retried at all — the retry
loop only handled context window errors. Now timeouts are retried
up to 2 times with exponential backoff (5s, 10s).
Walk backwards over preceding tool messages to find the nearest assistant
with ToolCalls, instead of only checking the immediate predecessor. Add
unit tests for sanitizeHistoryForProvider covering key edge cases.
- Drain buffered messages in MessageBus.Close() so they aren't silently lost
- Replace all context.TODO() with context.WithTimeout(5s) across 7 call sites
- Fix OneBot pending channel leak: send nil sentinel in Stop() and handle
nil response in sendAPIRequest() to unblock waiting goroutines
The configuration field for specifying the model has been renamed from
"model" to "model_name" for better clarity and consistency with the
model_list configuration.
A GetModelName() accessor method has been added to maintain backward
compatibility. Existing configurations using the old "model" field will
continue to work correctly.
This change affects:
- Configuration structure (AgentDefaults struct)
- All references across the codebase
- Documentation in all language variants
- Example configuration files
- MediaStore: use full UUID to prevent ref collisions, preserve and
expose metadata via ResolveWithMeta, include underlying OS errors
- Agent loop: populate MediaPart Type/Filename/ContentType from
MediaStore metadata so channels can dispatch media correctly
- SplitMessage: fix byte-vs-rune index mixup in code block header
parsing, remove dead candidateStr variable
- Pico auth: restrict query-param token behind AllowTokenQuery config
flag (default false) to prevent token leakage via logs/referer
- HandleMessage: replace context.TODO with caller-propagated ctx,
log PublishInbound failures instead of silently discarding
- Gateway shutdown: use fresh 15s timeout context for StopAll so
graceful shutdown is not short-circuited by the cancelled parent ctx
Add outbound media sending capability so the agent can publish media
attachments (images, files, audio, video) through channels via the bus.
- Add MediaPart and OutboundMediaMessage types to bus
- Add PublishOutboundMedia/SubscribeOutboundMedia bus methods
- Add MediaSender interface discovered via type assertion by Manager
- Add media dispatch/worker in Manager with shared retry logic
- Extend ToolResult with Media field and MediaResult constructor
- Publish outbound media from agent loop on tool results
- Implement SendMedia for Telegram, Discord, Slack, LINE, OneBot, WeCom