* feat: add load_image tool for local file vision
* fix: address load_image PR review feedback
- Exclude load_image from sub-agent tools via Unregister after Clone,
since RunToolLoop does not call resolveMediaRefs
- Add ToolRegistry.Unregister() method
- Fix scope collision: use channel:chatID instead of filename
- Add channel/chatID context resolution matching send_file pattern
- Add comment explaining iteration > 1 guard on resolveMediaRefs
- Remove emoji from ForUser for consistency with send_file
- Add load_image_test.go
* feat: enable load_image for subagents via MediaResolver in RunToolLoop
Instead of removing load_image from sub-agent tools (28f69e71), inject a
MediaResolver into the legacy RunToolLoop fallback path so media:// refs
are resolved to base64 before each LLM call — matching the main agent
loop behavior.
- Add MediaResolver field to ToolLoopConfig and call it on iteration > 1
- Add SubagentManager.SetMediaResolver() and wire it through runTask
- Remove ToolRegistry.Unregister() (no longer needed)
- Restore load_image in sub-agent tool set (revert Clone+Unregister)
- Add TestSubagentManager_SetMediaResolver_StoresResolver
* refactor(load_image): remove prompt parameter from tool schema
* test(tools): add success-path test for LoadImageTool
Add TestLoadImage_SuccessPath that creates a real PNG file with valid
magic bytes, calls Execute with WithToolContext, and verifies:
- result.IsError == false
- ToolResult.Media contains a media:// ref
- ToolResult.ForLLM contains the [image: marker
- media ref is resolvable in the store
Add explanatory comment in loop.go for why Media and ArtifactTags
coexist on non-ResponseHandled tool results (e.g. load_image).
* fix: preallocate slice in tests and add ResponseHandled guard in toolloop
Fix prealloc linter failure in load_image_test.go.
Prevent double-resolving media by checking ResponseHandled in toolloop.go.
* Register TTS tool if provider is available
---------
Co-authored-by: Reusu <admin@yumao.name>
Co-authored-by: 美電球 <hoshina@evaz.org>
Treat SystemParts as an alternative representation of message Content
rather than an additive one. This prevents systematic overestimation
of system message tokens which could trigger premature context
pruning or summarization.
- Picks the maximum of Content vs. SystemParts to stay conservative.
- Adds a per-part overhead (20 chars) to account for JSON metadata.
- Streamlines the ReasoningContent counting logic.
Fixes a deficiency where structured blocks for cache-aware adapters
caused overestimated budgets or hidden overflows.
* feat(channels): Channel.Send and MediaSender.SendMedia return delivered message IDs
Change Channel.Send signature from (ctx, msg) error to (ctx, msg) ([]string, error)
and MediaSender.SendMedia similarly, so callers can capture platform message IDs
for threading, reactions, and history annotation.
Adapters that return real IDs: Telegram (per-chunk MessageID), Discord (Message.ID),
Slack Send (ts), QQ (sentMsg.ID), Matrix (EventID). Slack SendMedia returns nil
because UploadFileV2 does not expose the posted message timestamp in its response.
All other adapters return nil IDs.
preSend and sendWithRetry in manager.go updated to propagate ([]string, bool).
README examples updated for both English and Chinese docs.
* style: apply golangci-lint fixes (golines)
* docs: fix Send migration guide — restore old error-only signature in before/after example
- Add `reaction` tool that reacts to a message (defaults to current inbound message via context)
- Extend `message` tool with optional `reply_to_message_id` parameter
- Introduce `WithToolInboundContext` to inject inbound message IDs into tool execution context
- Surface `MessageID` and `ReplyToMessageID` in `processOptions` for tool-surface consumption
Refs #2137
* fix(cron): publish agent response to outbound bus for cron-triggered jobs
When a cron job triggers agent execution via ProcessDirectWithChannel,
the agent response was silently discarded — the code assumed AgentLoop
would auto-publish it, but SendResponse is false on this path.
Delegate to PublishResponseIfNeeded (exported from AgentLoop) so the
response reaches the originating channel (e.g. Telegram) only when the
message tool did not already deliver content in the same round.
Also adds a "directive" message type to CronPayload, allowing cron jobs
to instruct the agent to execute a task rather than echo static text.
* fix(cron): add type validation and directive test coverage
Address reviewer blocking feedback:
1. Server-side whitelist for `type` parameter — the `enum` in
Parameters() is only an LLM schema hint; any string was persisted.
Now `addJob` rejects values other than "message" and "directive".
2. Comprehensive test coverage for the directive code path:
- directive adds prompt prefix to ProcessDirectWithChannel
- deliver=true + directive routes through agent (not direct publish)
- directive prompt content, sessionKey, channel, chatID are correct
- invalid type is rejected; valid types ("", "message", "directive") pass
- deliver=true message type goes directly to bus (regression)
- agent error path does not trigger publish (regression)
Also merge the two UpdateJob calls in addJob into one to avoid
redundant disk I/O (non-blocking suggestion from review).
* fix(cron): remove omitempty from CronPayload.Type for consistent JSON
Empty string and "message" are semantically equivalent defaults;
always serializing the field avoids asymmetric JSON output.
* test(cron): remove redundant test, strengthen error path coverage
- Remove ExecuteJobDirectivePassesCorrectContent: its assertions on
sessionKey/channel/chatID duplicate ExecuteJobPublishesAgentResponse;
its prompt check duplicates DirectiveAddsPromptPrefix.
- Strengthen DirectiveAddsPromptPrefix with exact prompt match and
publish response assertion.
- Fix ReturnsErrorWithoutPublish: set non-empty stub response so the
test verifies the error branch early-return, not the response==""
guard.
* fix(ci): satisfy golines and gosmopolitan in cron code
- loop_test.go: replace undefined WithSecurity/SecurityConfig/ModelSecurityEntry
with direct APIKeys field using SimpleSecureStrings()
- dingtalk_test.go: use ClientSecret.String() and ClientSecret.Set()
instead of non-existent ClientSecret() and SetClientSecret() methods
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add multi-message sending via split marker
* Add marker and length split integration tests
Tests that SplitByMarker and SplitMessage work together correctly, and
that code block boundaries are preserved during marker splitting.
* Simplify message chunking logic in channel worker
Extract splitByLength helper function and remove goto-based control
flow.
The logic now flows more naturally - try marker splitting first, then
fall
back to length-based splitting.
* Update multi-message output instructions in agent context
* Add split_on_marker to config defaults
* Add split_on_marker config option
* Rename 'Multi-Message Sending' setting to 'Chatty Mode'
* Add SplitOnMarker config option
* feat: add ElevenLabs Scribe STT transcriber and Telegram SendVoice support
Add ElevenLabsTranscriber as an alternative speech-to-text provider using
the ElevenLabs Scribe API (scribe_v1). This enables voice message
transcription for users who already have an ElevenLabs API key, without
requiring a separate Groq account.
Changes:
- Add ElevenLabsTranscriber implementing the Transcriber interface
- Update DetectTranscriber to check providers.elevenlabs.api_key first,
falling back to Groq for backward compatibility
- Add ElevenLabs to ProvidersConfig
- Add "voice" media type for OGG files with "voice" in filename
- Add SendVoice support in Telegram channel for voice bubble messages
- Add comprehensive tests for ElevenLabs transcriber
Configuration:
"providers": {
"elevenlabs": {
"api_key": "sk_your_key_here"
}
}
Closes#1503 (partial)
* fix: move voice-bubble detection into Telegram channel to avoid regression in other channels
Address review feedback: keep inferMediaType returning "audio" for all
OGG files. Voice-bubble detection (SendVoice vs SendAudio) is now done
inside the Telegram channel based on filename, so other channels that
map "audio" explicitly are unaffected.
* fix: align VoiceConfig struct tags to pass golines formatter
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(agent): use ModelName in loop test added by upstream
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
LLM
Prevent LLM from seeing its own credentials (API keys, tokens, secrets)
by filtering sensitive values from tool call results before sending to
the
model. Values are collected from .security.yml and replaced with
[FILTERED] using an efficient strings.Replacer (O(n+m)).
- Add FilterSensitiveData and FilterMinLength to ToolsConfig
- Implement SensitiveDataReplacer() with sync.Once caching in
SecurityConfig
- Use reflection to collect all sensitive values (Model API keys,
channel
tokens, web tool API keys, skills tokens)
- Apply filtering in agent loop at 4 tool result locations
- Add comprehensive tests covering all token types
Anthropic API returns 400 when multiple tool_result blocks share the same
tool_use_id, or when consecutive tool results are sent as separate user
messages. This fix:
1. Adds ToolCallID deduplication in sanitizeHistoryForProvider (context.go)
to drop duplicate tool results before sending to any provider.
2. Merges consecutive tool result messages into a single user message with
multiple tool_result content blocks in Anthropic's buildRequestBody,
for both "user" (with ToolCallID) and "tool" role messages.
3. Adds tests for both behaviors.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add BaiduSearchConfig struct and register in WebToolsConfig/defaults
- Insert Baidu Search in priority chain: DuckDuckGo > Baidu > GLM Search
- Use perplexityTimeout (30s) — Qianfan is LLM-based
- Fix response parsing: use references[] field per API spec
- Add baidu_search block to config.example.json
docs: sync configuration.md and README Documentation table across all languages
- Complete truncated configuration.md for fr/ja/pt-br/vi/zh: add Spawn
async flow diagram, Providers table, Model Configuration (all vendors,
examples, load balancing, migration), Provider Architecture, Scheduled
Tasks, and Advanced Topics links
- Add Hooks/Steering/SubTurn entries to Documentation table in all 8
READMEs (en/zh/fr/id/it/ja/pt-br/vi), ordered before Troubleshooting
- Add Baidu Search row to web search table in all 8 READMEs and
tools_configuration.md (en + 5 i18n); zh README reorders search
engines with China-friendly options first
- Add Matrix channel docs translations (fr/ja/pt-br/vi)
- Add Weixin channel to chat-apps.md and all README Channels tables
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>