The reviewer identified two bugs in the original PR:
1. PATCH /api/config leaves session.dimensions stale: LoadConfig()
derives dimensions from the old dm_scope, and the merge carries
those stale dimensions forward. ApplyDmScope() then exits early
because dimensions is already populated, causing a mismatch between
dm_scope (new) and dimensions (old).
2. Legacy/default configs omit dm_scope in GET response: configs with
explicit dimensions but no dm_scope (including DefaultConfig) return
no dm_scope field, causing the frontend to fall back to its default
('per-channel-peer'), which may not match the actual dimensions.
Fix:
- Add DeriveDmScope() to reverse-map known dimensions arrays to
dm_scope when dm_scope is empty.
- Call it in LoadConfig(), PUT handler, PATCH handler, and
ResetToDefaults() for consistent normalization.
- In PATCH handler, clear stale dimensions from the merge result when
the patch contains session.dm_scope but not session.dimensions,
allowing ApplyDmScope() to re-derive from the new scope.
- Add comprehensive unit tests for DeriveDmScope() and scope
transition scenarios.
* fix(launcher): hide console flashes in all Windows child processes
PR #2654 only applied HideWindow to child processes in gateway.go (powershell, tasklist, ps). Several other files still use exec.Command directly, causing visible console windows on Windows.
- startup.go: reg query/add/delete for autostart registry
- version.go: picoclaw version subcommand
- runtime.go: rundll32 for browser launch
- onboard.go: picoclaw onboard subcommand
Add launcherExecCommand to the utils package (matching the api package pattern) and replace all bare exec.Command calls on Windows paths.
* refactor: consolidate launcherExecCommand into utils package
Export LauncherExecCommand and ApplyLauncherProcAttrs from the utils
package as the single source of truth. The api package now imports
and delegates to these exported functions, eliminating code duplication.
Addresses review feedback from imguoguo on PR #3061.
short_retrieval.go: Check the Atoi error even though the regex ensures numeric input.
gateway.go: Log a warning when the gateway config JSON is malformed instead of silently using defaults.
RFC 2544 benchmark addresses (198.18.0.0/15) are not globally routable
but were missing from the isPrivateOrRestrictedIP blocklist, allowing
SSRF bypasses via literal IPv4.
Fixes #3077
The dm_scope field was stored in config but never translated into the
dimensions array that the routing layer actually consumes. This meant
changing the session isolation scope in the UI had no effect at runtime.
Add ApplyDmScope() to SessionConfig which maps the user-facing dm_scope
values (per-channel-peer, per-channel, per-peer, global) to the
corresponding dimension arrays. Call it in LoadConfig post-processing
and in both the PATCH and PUT API handlers.
Includes table-driven tests covering all dm_scope values and the
precedence rule (explicit dimensions > derived from dm_scope).
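A minimal sketch of the mapping and precedence rule described above. The dimension names ("channel", "peer") and the scopeDimensions table are assumptions made for illustration; the commit message does not list the actual arrays.

```go
package main

import "fmt"

// SessionConfig mirrors only the two fields discussed in the commit message.
type SessionConfig struct {
	DmScope    string
	Dimensions []string
}

// scopeDimensions is an assumed mapping; the dimension names are placeholders.
var scopeDimensions = map[string][]string{
	"per-channel-peer": {"channel", "peer"},
	"per-channel":      {"channel"},
	"per-peer":         {"peer"},
	"global":           {},
}

// ApplyDmScope derives Dimensions from DmScope. Explicit dimensions take
// precedence, and unknown scope values leave the config untouched.
func (s *SessionConfig) ApplyDmScope() {
	if len(s.Dimensions) > 0 {
		return
	}
	dims, ok := scopeDimensions[s.DmScope]
	if !ok {
		return
	}
	// Copy so callers cannot mutate the shared table.
	s.Dimensions = append([]string(nil), dims...)
}

func main() {
	derived := SessionConfig{DmScope: "per-peer"}
	derived.ApplyDmScope()
	explicit := SessionConfig{DmScope: "global", Dimensions: []string{"channel"}}
	explicit.ApplyDmScope()
	fmt.Println(derived.Dimensions, explicit.Dimensions)
}
```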
The frontend sends dm_scope as part of the session config, but the
backend SessionConfig struct lacked the corresponding field. Go's
encoding/json silently discards unknown fields, so the value was lost
on every PATCH request. Additionally, MarshalJSON only emitted the
session block when Dimensions or IdentityLinks were set, so even a
stored dm_scope would not appear in GET responses.
- Add DmScope string field with json tag 'dm_scope' to SessionConfig
- Update MarshalJSON condition to include session when DmScope is set
Replace raw log.Printf and fmt.Printf calls in pkg/state, pkg/agent, and pkg/tools with structured logger calls (WarnCF/InfoCF). This ensures warnings and info messages are routed through the configured logging infrastructure instead of raw stderr/stdout.
errutil.go: Change %v to %w in ClassifySendError and ClassifyNetError so callers can use errors.Is/errors.As on the underlying HTTP/network error.
isolated_command_transport.go: Change %v to %w in Close() and Write() error paths for the same reason.
GetStartupInfo returns map[string]any, and type-asserting tools/skills entries without checking ok is fragile. While the current implementation always stores the correct types, a future refactor could cause silent nil dereference. Add ok checks with explicit nil fallback.
When os.Getwd fails, wd is empty and builtinSkillsDir resolves to a relative path, causing confusing downstream errors. Fall back to config.GetHome on error.
singleflight.Group.Do() returns any, which is type-asserted as bool
without an ok check at model_status.go:211. If a non-bool value is
returned (e.g. nil from shared/cache corruption), this panics.
Add ok check and return false (model probe failed) as a safe default.
Add a warning log when the type assertion from sync.Map.LoadAndDelete fails in UnsubscribeEvents, per review suggestion. This makes a mismatched type observable for debugging.
Add 3 tests covering scenarios that previously panicked: 1) missing enabled key in settings 2) enabled field with non-bool type 3) teams_webhook with webhooks using map[string]any from JSON unmarshal
Address remaining review feedback: 1) Add HistoryTokens field to ContextUsage/ContextStats, showing history-only token count in /context and frontend UI alongside SummarizeAtTokens so users can see the actual summarization trigger comparison. 2) Remove .codebuddy/github-contribute/ state files accidentally included in the PR.
sync.Map.LoadAndDelete returns any; unprotected type assertion could panic if an unexpected type were stored. Add ok check to safely handle mismatched types.
Two type assertions in toChannelHashes could panic when channel config values had unexpected types from JSON unmarshal: 1) value["enabled"].(bool) panics if the key is missing or not a bool 2) vv.(map[string]string) panics when JSON unmarshal produces map[string]any. Add ok checks to safely handle both cases.
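The two ok checks could look roughly like this. It is a sketch under assumptions: channelEnabled and stringMap are hypothetical helper names, and the real toChannelHashes may structure this differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// channelEnabled treats a missing or non-bool "enabled" value as false
// instead of panicking on the type assertion.
func channelEnabled(value map[string]any) bool {
	enabled, ok := value["enabled"].(bool)
	return ok && enabled
}

// stringMap accepts both map[string]string and the map[string]any that
// encoding/json produces, and skips entries whose values are not strings.
func stringMap(v any) map[string]string {
	out := map[string]string{}
	switch m := v.(type) {
	case map[string]string:
		for k, s := range m {
			out[k] = s
		}
	case map[string]any:
		for k, raw := range m {
			if s, ok := raw.(string); ok {
				out[k] = s
			}
		}
	}
	return out
}

func main() {
	var cfg map[string]any
	data := []byte(`{"enabled":"yes","webhooks":{"a":"https://example.invalid/hook","b":1}}`)
	if err := json.Unmarshal(data, &cfg); err != nil {
		panic(err)
	}
	fmt.Println(channelEnabled(cfg), stringMap(cfg["webhooks"]))
}
```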
When an incoming group message is received, the inbound context ChatID was set to the raw group number without the group: prefix. This caused the outbound reply to use send_private_msg instead of send_group_msg. Fix by using the prefixed chatID as inbound context ChatID. Closes #3002
The SDK renamed ReceiveIdTypeChatId to CreateMessageV1ReceiveIDTypeChatId
in v3.9.4. Update all 5 usages in feishu_64.go and bump the dependency
version.
This fixes the build failure for Dependabot PR #3005.
isProcessRunning() previously only checked whether a PID existed via signal(0)/OpenProcess, without confirming the process was actually picoclaw. When the PID was reused by an unrelated process (e.g., systemd-resolved after a kill -9), the gateway would refuse to start with 'already running'.
Add isPicoclawProcess() that verifies the process name matches picoclaw:
- Unix: reads /proc/<pid>/comm
- Windows: calls QueryFullProcessImageNameW
If the running process is not picoclaw, treat the PID file as stale and proceed with normal startup. Falls back to trusting the liveness check when identity verification is unavailable (e.g., /proc unreadable, API call fails).
Fixes #2720.
The workspace guard's absolutePathPattern regex matches /Beijing?T in commands like 'curl wttr.in/Beijing'. Since 'wttr.in' is not a recognized web scheme, the path was routed through workspace sandbox validation, which could block legitimate scheme-less URL usage (curl allows bare domains without http://).
Add detection for domain-like tokens preceding /path matches:
- looksLikeDomain: checks for dot-separated tokens that don't end with common file extensions (.py, .go, .exe, etc.)
- localPathExists: verifies the token does not exist as a local filesystem entry
This dual guard prevents the symlink bypass identified in PR #2965 review: if 'foo.bar' exists as a local symlink or directory, the path still undergoes full workspace validation.
Fixes #1042.
go env GOVERSION may return values like go1.25.10 X:nodwarf5 with an embedded space on some toolchain configurations, breaking -ldflags. Use firstword to extract only the first token. Fixes #2976.
Replace 7 instances of ignored json.Marshal errors with proper error handling. Previously, if marshaling an ExecResponse failed, a nil byte slice would be silently converted to an empty string in the LLM response. Now each site returns ErrorResult with the marshal error message.
The /context command previously showed only the hard budget compression
threshold (contextWindow - maxTokens), which confused users who expected
to see the soft summarization trigger from summarize_token_percent.
This commit adds SummarizeAtTokens alongside the existing CompressAtTokens
so that both thresholds are visible:
- Compress at: contextWindow - maxTokens (hard budget, triggers proactive
compression when exceeded)
- Summarize at: contextWindow * summarizeTokenPercent / 100 (soft trigger,
matches maybeSummarize's threshold)
The fix updates the /context command output, the Web UI popover, and the
pico channel WebSocket payload.
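The two formulas above work out as follows. This is illustrative only: thresholds is a hypothetical helper, and the numbers in main are example values rather than project defaults.

```go
package main

import "fmt"

// thresholds returns the hard compression limit and the soft summarization
// trigger, using the two formulas quoted in the commit message.
func thresholds(contextWindow, maxTokens, summarizeTokenPercent int) (compressAt, summarizeAt int) {
	compressAt = contextWindow - maxTokens
	summarizeAt = contextWindow * summarizeTokenPercent / 100
	return compressAt, summarizeAt
}

func main() {
	compressAt, summarizeAt := thresholds(128000, 8000, 75)
	fmt.Printf("Compress at: %d\nSummarize at: %d\n", compressAt, summarizeAt)
}
```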
Fixes #2968
The PromoteAliasHistory method previously promoted the first non-empty alias session into a new canonical session. When a user upgraded, the migrated main session contained old messages that were copied into every new Web UI session because agent:main:main is always the first alias.
Add isMainSessionAlias() to detect and skip the main session alias during promotion. Fixes #2972.
The Stop() method previously used a select/default pattern which was not
safe under concurrent calls — two goroutines could both pass the check
and attempt to close the same channel, causing a panic.
Replace with sync.Once to guarantee exactly-once close semantics,
matching the documented contract of being safe for concurrent use.
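A minimal sketch of the sync.Once pattern, assuming a stop channel owned by the struct (worker, newWorker and stopCh are hypothetical names used only for this example):

```go
package main

import (
	"fmt"
	"sync"
)

// worker stands in for the type whose Stop() the commit changes.
type worker struct {
	stopCh   chan struct{}
	stopOnce sync.Once
}

func newWorker() *worker {
	return &worker{stopCh: make(chan struct{})}
}

// Stop closes stopCh exactly once, however many goroutines call it.
func (w *worker) Stop() {
	w.stopOnce.Do(func() { close(w.stopCh) })
}

func main() {
	w := newWorker()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			w.Stop() // concurrent calls no longer race to close the channel
		}()
	}
	wg.Wait()
	<-w.stopCh // a closed channel never blocks on receive
	fmt.Println("stopped")
}
```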
Review feedback: afjcjsbx
Previously, only timeout and network errors (matched via string
patterns) were retried. HTTP 500 server errors from
OpenRouter/OpenAI-compatible providers would fail the agent turn
immediately when no model fallback candidate was available.
This commit replaces the separate timeout/network retry branches
with a unified transientLLMRetryReason() helper that:
1. Uses providers.ClassifyError() to detect server_error (HTTP >=500),
timeout, network, and rate_limit errors
2. Falls back to the existing string-based detection for errors
not classified by the provider
A regression test (TestPipeline_CallLLM_HTTP5xxRetry) verifies that
HTTP 500 errors are retried and recover successfully.
This is a clean rebase of the approach originally proposed in #2768
by afjcjsbx.
This fixes issue #2943 where WeChat channel image requests to Zhipu
GLM-5-Turbo vision API failed with error code 1210 (parameter error)
without triggering the fallback mechanism.
Changes:
- Added error code 1210 pattern matching to formatPatterns
- This allows the fallback mechanism to recognize Zhipu API parameter
errors and fall back to alternative vision models
Closes #2943
The SessionManager's background cleanup goroutine previously had no
shutdown mechanism. Each call to NewSessionManager() started a ticker
goroutine that ran indefinitely. In tests, where multiple
SessionManagers are created, this caused goroutine leaks.
This commit adds a Stop() method that cleanly shuts down the background
cleanup goroutine via a channel. Stop() is safe to call multiple times.
All existing tests now call t.Cleanup(sm.Stop) to ensure cleanup.
Claude Opus 4.8 on Bedrock rejects the temperature inference parameter
with a ValidationException ("temperature is deprecated for this model").
buildConverseParams now takes the model id and omits temperature for
claude-opus-4-8* (matching both bare model ids and region-prefixed
inference profiles), logging when it does so. max_tokens and all other
models are unaffected.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add GetJob and improved UpdateJob to CronService with proper cloning,
schedule diffing, and next-run recomputation. Expose get/update actions
in the cron tool so agents can inspect and partially update jobs without
losing payloads or needing remove+add cycles. Includes access control
for remote channels and command safety gates.
OpenAI/Codex OAuth streams can return text through response.output_text.delta while the final response.completed payload has response.output set to null. That made PicoClaw report an empty model response even though the backend returned valid content.
Accumulate streamed output_text delta events during the Codex response stream and use them as a fallback when the parsed final response has no content. Add a regression test covering the null final output case from issue #2953.
Add FUNDING.yml file to enable GitHub Sponsors button on the repo.
This makes it easy for users who benefit from PicoClaw to support
the project financially.
Closes #2912
When running PicoClaw inside Termux or termux-chroot, HTTPS
requests fail with X509 certificate errors because the Go TLS
stack does not automatically detect the Termux CA bundle path.
This change adds automatic detection of Termux environments and
sets SSL_CERT_FILE to the correct CA bundle path before any
network operations. The detection checks:
- HOME or PATH contains 'com.termux'
- Common CA bundle locations in Termux prefix
Fixes #2944
CronTool.ExecuteJob was calling ExecTool.Execute without setting
action='run' in the args map. ExecTool.Execute requires the action
field and returns ErrorResult('action is required') immediately when
it's missing. This caused all cron command jobs to silently fail.
Adds a test covering the command execution happy path.
Discord only downloaded audio attachments before passing them to the agent. Non-audio attachments (images, videos, files) were passed as raw Discord CDN URLs, which do not flow through resolveMediaRefs and are not serialized as vision inputs.
Download every attachment, store it in the MediaStore with Discord's filename and content type metadata, and emit a media placeholder tag that matches the attachment kind. This lets resolveMediaRefs replace the placeholder with the local path-bearing tag and encode supported images for vision-capable providers. If a download fails, keep the previous raw URL fallback.
- Persistence layer (jsonl.go addMsg/SetHistory) normalizes CreatedAt
when missing so the invariant is guaranteed at the storage boundary
- API layer (session.go) exposes created_at on all transcript message
types with session.updated fallback for legacy messages
- Frontend uses per-message timestamps when available
- messagesContentEqual ignores CreatedAt for tail-matching after
JSONL roundtrip
Fixes #2787
* feat: add request-scoped context policies
Add named turn profiles under agents.defaults so callers can opt into
per-request context and tool policies without changing default chat behavior.
Profiles can disable history, system context, skill prompts, or tools, and can
limit skills/tools with allow lists. Wire profile selection through Pico message
payloads, agent turn execution, Web chat selection, and Web visual config.
Reject invalid turn profiles before saving config through Web APIs and document
the new request context policy behavior.
* fix: address turn profile review blockers
* feat: simplify request context policy config
* fix: suppress tool prompt when turn tools are disabled
* fix: enforce turn profile tool restrictions
Add mimo-v2.5 (multimodal) and mimo-v2.5-pro to MiMo's CommonModels so
the WebUI recommends vision-capable models by default. mimo-v2.5 supports
image understanding while mimo-v2.5-pro is text-only.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When editing an existing model, the edit form initializes apiKey as
empty for security. This caused "Fetch Available Models" to reject with
"please enter API Key first" even though the key is saved server-side.
Add model_index support: the frontend passes the model's index to the
backend, which looks up the stored key from config. The key never leaves
the backend. Provider and API base are validated to prevent a stored key
from being sent to an unrelated endpoint.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(chat,seahorse): persist and display model_name across history
* test(seahorse): fix lint regressions in repair coverage
* fix(pico): preserve model_name in live updates
* fix(pico): preserve model_name through live stream wrappers
* feat(models): unify provider metadata around backend catalog
- Move shared provider metadata and alias normalization into backend-owned provider catalog
- Expose display, fetch, auth, and default model metadata through /api/models provider_options
- Replace frontend static provider registry with catalog-driven selection, validation, grouping, and fallback rendering
- Treat the provider default api_base as a placeholder and as the effective fetch/test base, while keeping the submitted api_base separate from derived defaults
- Add model page retry handling, touched locale updates, and provider metadata assertions in backend tests
* fix(models): canonicalize backend provider aliases and common models
* fix(models): restore deepseek common model recommendations
* Support streaming
* fix: stream pico reasoning updates
Route Pico reasoning through the active streamer and hide empty thought placeholders.
* fix: harden configured streaming delivery
* fix ci
* fix split issue
Export MakeBackup for external use, add ResetToDefaults function that
backs up current config, creates defaults, and preserves security
credentials. Add `picoclaw config reset` CLI command with --force flag.
* feat(web,api): add fetch models and saved catalog support
Split from PR #2752 (part 2 of 3).
Backend:
- /api/models/catalog endpoint for browsing remote model catalogs
- /api/models/fetch endpoint for fetching available models from providers
- Credential reuse with provider/API base matching for security
- Default API base resolution for providers without explicit base
Frontend:
- FetchModelsDialog for importing models from remote providers
- CatalogDialog for browsing and importing from model catalogs
- Static import for FetchModelsDialog (replaces dynamic import from PR1)
- Dynamic import retained for TestModelDialog (PR3 territory)
* feat(web,api): add test connection with real connectivity verification
Split from PR #2752 (part 3 of 3).
Backend:
- /api/models/{index}/test endpoint for testing saved model configs
- /api/models/test-inline endpoint for testing unsaved form values
- Real network probe (GET /models) for connectivity verification
- Credential reuse with provider/API base matching for security
- Default API base resolution for providers without explicit base
Frontend:
- TestModelDialog for testing model connectivity
- Inline test support for add/edit model sheets
- Static import for TestModelDialog (replaces dynamic import from PR1)
* fix(pico): preserve image media across pico attachments and client
* fix ci
* fix(pico): preserve text when client media parsing fails
- Skip non-inline Pico attachment URLs instead of treating them as invalid inline media
- Preserve pico_client text messages when malformed media payloads are received
- Add regression coverage for media.create, download attachments, and invalid media payloads
* fix lint
* fix(web,api): support bare-array responses in fetchOpenAICompatibleModels
* fix(web,api): tighten maskAPIKeyValue to match maskAPIKey policy
For 9-12 character keys, maskAPIKeyValue exposed first 4 + last 4
chars (only 1 char masked for a 9-char key). Now uses the same
policy as maskAPIKey: first 3 + last 2 for 9-12 chars, first 3 +
last 4 for longer keys. Adds tests covering all key length boundaries.
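The length policy above can be sketched as follows. Several details are assumptions not stated in the commit: maskKey is a hypothetical name, keys of 8 characters or fewer are fully masked, one asterisk is emitted per hidden character, and keys are treated as ASCII.

```go
package main

import (
	"fmt"
	"strings"
)

// maskKey applies the length policy from the commit message:
// 9-12 chars keep the first 3 and last 2, longer keys keep the first 3 and last 4.
func maskKey(key string) string {
	n := len(key)
	switch {
	case n <= 8:
		// Assumption: short keys are hidden entirely.
		return strings.Repeat("*", n)
	case n <= 12:
		return key[:3] + strings.Repeat("*", n-5) + key[n-2:]
	default:
		return key[:3] + strings.Repeat("*", n-7) + key[n-4:]
	}
}

func main() {
	for _, k := range []string{"abcdefgh", "abcdefghi", "abcdefghijkl", "abcdefghijklm"} {
		fmt.Printf("%2d chars: %s\n", len(k), maskKey(k))
	}
}
```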
* add gemini web search provider
* fix(web): prefer free providers before Gemini in auto mode
* fix(web): expose gemini api key and model settings
* fix(web): prefer configured providers before Gemini in auto mode
* fix(web): satisfy gemini lint checks
* fix(web): address gemini provider review feedback
* test(web): align auto-provider expectations
* fix(web): let gemini ignore search range
* feat: improve model configuration workflows
Add model catalog browsing, provider registry with form validation,
model fetch/test dialogs, and enhanced model management UI.
- Add model catalog API and catalog-dialog component for browsing saved models
- Add provider-registry with auto-populated form fields per provider
- Add provider-combobox, fetch-models-dialog, test-model-dialog components
- Add model-validation for provider-aware model ID validation
- Add command and popover UI components
- Enhance edit-model-sheet with tool schema transform support
- Add anthropic to protocolMetaByName for correct default API base
- Apply NormalizeBaseURL to anthropic provider for consistent URL handling
- Add i18n keys for new model management features (en/zh)
* fix(web): prevent auto-fetch when API key is missing in fetch models dialog
When a provider requires an API key but none is set, the dialog now shows
the warning without triggering a doomed fetch attempt. Fetch is deferred
until the user provides a key.
* fix(web): add credential warning for catalog imports from remote providers
When importing models from a catalog entry whose provider requires an API
key, a yellow warning banner now informs users that credentials will need
to be configured after import.
* feat(web,api): test connection with real connectivity verification and unsaved form values
Add POST /api/models/test-inline endpoint that performs actual network
probes (GET /models) instead of just checking config. Frontend Test
Connection now uses current form values (not saved state) and is
available in both Add and Edit model flows.
* style(web): apply linter formatting across model config components
Normalize quote style, import ordering, and class name ordering as
reported by the project linter.
* fix(web,api): fix edit test connection false negative and gate fetch for unsupported providers
- handleTestInlineModel now accepts optional model_index to fall back to stored credentials when api_key is empty, fixing false negatives when testing edited models
- Add supportsFetch to provider registry and FETCHABLE_PROVIDER_KEYS derived set
- Gate Fetch Models button to only show for OpenAI-compatible and Ollama providers
- Add backend guard in handleFetchModels to reject unsupported providers with clear error
* fix: address review feedback on model config workflow
- Send explicit {} for empty extra_body/custom_headers fields so the
backend clears stored values instead of preserving them
- Merge backend provider_options with frontend PROVIDERS registry so
the provider picker reflects backend-supported providers and policy
fields (create_allowed, default_auth_method, auth_method_locked)
- Render provider combobox popover inside the sheet scroll container
to fix wheel events scrolling the sheet instead of the provider list
* feat(web,api): add provider selection, model form foundation, and validation
Split from PR #2752 (part 1 of 3).
Backend:
- CRUD model endpoints (list/add/update/delete/set-default)
- Provider metadata with default API bases and model provider options
- Model ID validation and normalization
- Anthropic default API base normalization
Frontend:
- Provider registry with metadata, labels, icons, and aliases
- Provider combobox with backend option merging
- Model field validation with provider-aware checks
- Redesigned add/edit model sheets with provider selection
- Dynamic imports for fetch/catalog/test dialogs (coming in PR2/PR3)
- i18n support for model configuration UI
* Add MCP section to config UI
* Handle MCP sse and URL-based server mapping
* Validate duplicate MCP server names before save
* Disable MCP discovery options based on mutual exclusivity in config section
Co-authored-by: Copilot <copilot@github.com>
* Clear stale MCP transport fields in patch payload
* Fix MCP config form state preservation and validation
* Avoid MCP form ID collisions for distinct server names
* Validate remote MCP URLs in config UI
* fix(config): correct MCP discovery merge patch behavior
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix(config): align MCP discovery semantics and MCP server editor behavior
* fix(config): validate MCP server fields only when active
---------
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Always route through classifySDKError to ensure resp.Body is
closed even when the API call succeeds.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Store QuoteToken for image, video, and sticker messages (not just text)
- Add webhook.LocationMessageContent case to forward as [location] placeholder
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix bodyclose linter errors by ensuring resp.Body is closed
after all *WithHttpInfo SDK calls.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address review feedback:
- Use *WithHttpInfo SDK variants to get HTTP response status codes
- Map status codes via ClassifySendError (429→ErrRateLimit, 5xx→ErrTemporary, 4xx→ErrSendFailed)
- Fall back to ClassifyNetError for network-level failures
- Configure SDK with 30s timeout HTTP client
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(i18n): add Portuguese (Brazil) locale
Add pt-BR as the third supported language in the Web UI, alongside
English and Chinese. The browser language detector will auto-select
PT-BR for Portuguese-speaking users.
Changes:
- Add web/frontend/src/i18n/locales/pt-br.json with full translation
- Register pt-BR resource and dayjs locale in i18n/index.ts
- Add "Português (Brasil)" option to language selector dropdown
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore(i18n): refresh pt-br locale to match current en.json keys
Add 194 new keys (skills marketplace, tour, launcher login/setup, chat
disabled placeholders, web search tools, dashboard password, etc.) and
remove 15 outdated keys so pt-br.json now mirrors en.json (601/601 keys).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(channels): dismiss tool feedback animation when turn ends via ResponseHandled
When a tool sets ResponseHandled=true (e.g., send_file), the turn ends
without producing a final assistant response. This meant no outbound
message triggered FinalizeToolFeedbackMessage, leaving the animation
goroutine running indefinitely — editing the Feishu card every 3 seconds
with "." / ".." suffixes long after the tool had finished.
Fix: call DismissToolFeedback at "Tool output satisfied delivery" so the
tracker is cleared and the animation goroutine is stopped immediately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(adapters): add DismissToolFeedback to channelManagerAdapter
The adapter must implement the new interface method added in the
previous commit, otherwise the package fails to compile.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(channels): pass InboundContext to DismissToolFeedback for topic-aware keys
Telegram forum topics use scoped tracker keys like "chatID/topicID",
resolved via ToolFeedbackMessageChatID with the InboundContext. The
previous nil context caused the lookup to fall back to the raw chatID,
missing the topic-scoped entry and leaving the animation goroutine
orphaned in forum-topic conversations.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* style: wrap long function signatures for golines
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(feishu): fix image download with API fallback and post image support
- Add Image.Get API fallback when MessageResource.Get fails (different
permission scope: im:resource vs im:message:readonly)
- Extract and download images from post (rich text) messages
- Extract images from interactive card messages
- Deduplicate post image keys across locales
- Add comprehensive tests for new helpers
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(media): add image path tags alongside base64 for LLM file access
Images are still base64-encoded into msg.Media for multimodal LLMs,
but now also get [image:path] tags injected into message content so
the LLM knows the local file path for save/forward operations.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor(media): only auto-inject images for tool results, not user messages
Channel-received images (role=user) now get path tags only, letting
the LLM decide whether to view via load_image or just operate on
the file. Tool result images (role=tool, e.g. load_image) are
base64-encoded into a synthetic user message appended after the tool
message, since many LLM APIs don't support image_url in tool messages.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(media): preserve tool-message ordering for multi-tool-call scenarios
Move synthetic user message (carrying base64 tool images) to after the
entire contiguous tool-message block instead of immediately after each
tool message. This preserves the assistant→tool→tool ordering required
by OpenAI-compatible APIs.
Also fix load_image to use generic [image: photo] placeholder so
injectPathTags can properly replace it with the actual path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(test): update load_image test for [image: photo] placeholder
The test was checking ForLLM for the media:// ref, but load_image now
emits the generic [image: photo] placeholder instead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(media): match all channel image placeholders in injectPathTags
Different channels emit different placeholder formats — Telegram/Feishu
use [image: photo], WeCom/WeChat/Line use bare [image], QQ/Discord use
[image: <filename>]. The previous string-match code only handled
[image: photo], so for the other channels the path tag was appended as
a duplicate, producing content like "[image] [image:/path]".
Switch to per-type regex that matches all generic placeholder shapes
while leaving path tags ([image:/path]) untouched. Also fixes the same
issue for [audio], [video], [file] tags. Added test coverage for the
various placeholder shapes.
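
One way such a per-type regex could look for the image case. The pattern is an assumption: it distinguishes generic placeholders from path tags by requiring a space after the colon, and injectImagePath is a hypothetical stand-in for the real injectPathTags.

```go
package main

import (
	"fmt"
	"regexp"
)

// genericImage matches "[image]" and "[image: <label>]" (colon plus space),
// but not path tags such as "[image:/path]".
var genericImage = regexp.MustCompile(`\[image(?:: [^\]]*)?\]`)

// injectImagePath replaces the first generic placeholder with a path tag,
// or appends the tag when the content has no placeholder.
func injectImagePath(content, path string) string {
	tag := "[image:" + path + "]"
	if loc := genericImage.FindStringIndex(content); loc != nil {
		return content[:loc[0]] + tag + content[loc[1]:]
	}
	if content == "" {
		return tag
	}
	return content + " " + tag
}

func main() {
	for _, c := range []string{"[image: photo]", "[image] hi", "[image: cat.png]", "see [image:/old.png]"} {
		fmt.Printf("%q -> %q\n", c, injectImagePath(c, "/tmp/a.png"))
	}
}
```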
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(media): skip path tag append for JSON content (Feishu cards/posts)
When content is structured JSON (interactive cards, post messages),
injectPathTags now skips the fallback append — only placeholder
replacement is attempted. This prevents corrupting JSON payloads
like {"schema":"2.0",...} with appended [image:/path] tags.
Adds looksLikeJSON() helper and three test cases covering JSON
objects, arrays, and an end-to-end resolveMediaRefs scenario with
Feishu card content.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(media): prepend path tags for JSON content, narrow looksLikeJSON
Two fixes from code review:
1. looksLikeJSON now only checks for '{' prefix (not '['), avoiding
false positives on regular text like "[update] see attached".
2. For JSON content (Feishu cards/posts), path tags are prepended
before the JSON instead of being silently dropped. This ensures
the LLM can discover attached images via the path tag while the
JSON payload stays valid for downstream parsing.
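The two review fixes above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the helper name attachPathTag, the whitespace trimming, and the newline/space separators are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeJSON reports whether content appears to be a JSON object.
// Only a leading '{' counts; a leading '[' is deliberately ignored so that
// plain text such as "[update] see attached" is not mistaken for JSON.
// Trimming surrounding whitespace first is an assumption of this sketch.
func looksLikeJSON(content string) bool {
	return strings.HasPrefix(strings.TrimSpace(content), "{")
}

// attachPathTag is a hypothetical helper. For JSON content the tag is
// prepended so the payload itself stays untouched; for plain text the tag
// is appended. The separators used here are assumptions.
func attachPathTag(content, tag string) string {
	if looksLikeJSON(content) {
		return tag + "\n" + content
	}
	return content + " " + tag
}

func main() {
	fmt.Println(attachPathTag(`{"schema":"2.0"}`, "[image:/tmp/a.png]"))
	fmt.Println(attachPathTag("[update] see attached", "[image:/tmp/a.png]"))
}
```

Prepending rather than appending keeps everything from the first '{' onward byte-for-byte identical to the original payload.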
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Tests cover: text-only streaming with chunk accumulation, tool call
parsing with fragmented JSON, mixed text+tool responses, context
cancellation, invalid JSON fallback to raw payload, nil stream guard,
default finish reason, and all stop reason mappings.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds ConverseStream API support to the Bedrock provider, implementing
the StreamingProvider interface. Tokens flow via onChunk callback for
real-time delivery to streaming-capable channels.
- Extract buildConverseParams to share request logic between Chat and ChatStream
- Add converseStreamReader interface for testability
- Preserve raw payload in Arguments on JSON parse failure
- Ensure Function.Arguments is always valid JSON
- Streaming timeout only applied when explicitly configured
- Capture stream Close() errors for diagnostics
- Consistent "bedrock conversestream" / "bedrock:" log prefixes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add detection for 'unknown variant' + 'image_url' error pattern used by
DeepSeek and other strict providers when vision is not supported.
These providers reject the image_url field at the JSON schema level
rather than returning a semantic 'not supported' message.
* feat(model): add `picoclaw model add` for custom OpenAI-compatible endpoints
Onboards a model from a user-supplied API base + key by hitting
GET <base>/models, prompting the user to pick one, and writing the entry
into model_list[] (with api_keys) plus setting it as the default model.
This was previously only available in the TUI launcher (issue #2208) and
is now accessible from the CLI:
picoclaw model add -b URL -k KEY [-m MODEL] [-n ALIAS]
* chore: remove deprecated picoclaw-launcher-tui
Per RFC #2208, the TUI launcher is deprecated in favor of the CLI; its
"online model picker" feature has been ported to `picoclaw model add` in
the previous commit. This drops the binary and all build/release/docs
references:
- delete cmd/picoclaw-launcher-tui/ and assets/launcher-tui.jpg
- Makefile: remove the `build-launcher-tui` target
- .goreleaser.yaml: drop the build entry plus the `picoclaw-launcher-tui`
ids from the launcher docker image, macOS notarize list, and nfpms
contents
- docker/Dockerfile.goreleaser.launcher: drop the COPY for the TUI binary
- READMEs (root + 8 locales): remove the "TUI Launcher" section and
screenshot link
- docs/guides/docker.*: update the "launcher image includes …" sentence
to reflect the two remaining binaries
`make build` still succeeds; `go build ./web/backend` (the launcher
target) still succeeds. `picoclaw-launcher` (web console) is unaffected.
* fix(docker): restore `make docker-build` by adding build directives and fixing Go version
docker-compose.yml only had `image:` references with no `build:` sections,
so `docker compose build` had nothing to build. Also fixed golang:1.26.0-alpine
(nonexistent) to golang:1.25-alpine in Dockerfile.full/heavy, and removed
LICENSE from .dockerignore since scripts/copydir.go needs it as a repo-root anchor
during `go generate`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(docker): inject version metadata ldflags in Dockerfile.launcher
Mirror the ldflags from web/Makefile (Version, GitCommit, BuildTime,
GoVersion) into the picoclaw-launcher go build command so Docker-built
launcher images include proper version/build metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(tools): add cross-platform serial hardware tool
* feat(config): wire serial tool into runtime and dashboard
* hardware/serial: tighten validation and error handling
* hardware/serial: improve unix cancellation and timeout polling
* hardware/serial: improve windows I/O handling
* hardware/serial: fix darwin cross-compilation build
* docs(design): summarize hardware support and serial limits
* build: keep go generate on host during cross builds
* onboard: drop unrelated go generate change from serial work
* style(tools): wrap serial lines for golines
* refactor(pico): unify message kind handling of tool_calls and thought
* fix(pico): add legacy compatibility for thought payload in Send method
Co-authored-by: Copilot <copilot@github.com>
---------
Co-authored-by: Copilot <copilot@github.com>
- Return fmt.Sprintf fallback instead of {} on encoding errors to preserve visibility
- Normalize nil to empty map in FormatArgsJSON for consistent output
- Remove redundant nil check in toolFeedbackArgsPreview wrapper
- Update test expectation: nil args now return {} not null
- Use fmt.Sprintf fallback instead of {} on encoding errors
- Normalize nil args to {} in FormatArgsJSON for consistent output
- Update tests to expect {} instead of null for nil args
Based on PR #2670 review feedback from afjcjsbx
Two-phase strategy: 7 days inactive → stale warning, 7 more days → close.
Exempt labels: pinned, keep-open, wip, do-not-close, type: roadmap.
Draft PRs are also exempt. Runs daily at 03:00 JST.
Scan oldest items first (ascending: true) with 500 ops budget to avoid
backlog starvation.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add a non-blocking runtime publish path and switch hot-path publishers to it.
Enforce subscription timeout boundaries, keep ordered subscriber snapshots up to date on subscribe changes, expose all runtime kinds to process hooks, add safe log attrs for non-agent events, and close the gateway message bus on full shutdown.
Change incorrect object format model.primary/fallbacks to correct
flat format model_name/model_fallbacks in Agent defaults example.
The AgentDefaults struct does not support the object format used
in AgentConfig, so the documentation example was misleading.
Replace leftover SubTurnOrphanResultEvent and short subturn event references with runtime event kinds in comments, tests, and hook design notes.
Validation: GOCACHE=/tmp/picoclaw-go-cache go test ./pkg/agent -run TestSpawnSubTurn_OrphanResultRouting; make lint
Emit gateway.start, gateway.ready, and gateway.shutdown on the shared runtime event bus, while keeping reload events on the same helper path.
Update subturn architecture docs to refer to runtime event kinds instead of the removed agent EventBus names.
Validation: GOCACHE=/tmp/picoclaw-go-cache go test ./pkg/gateway ./pkg/events; GOCACHE=/tmp/picoclaw-go-cache go test ./pkg/bus ./pkg/channels ./pkg/mcp ./pkg/tools/integration ./pkg/events ./pkg/gateway; make lint
Remove the legacy EventKind/Event envelope mapping and let agent event emission build pkg/events.Event values directly.
Keep HookMeta as the shared hook metadata shape and preserve legacy observe string aliases by mapping them to runtime event kinds.
Validation: GOCACHE=/tmp/picoclaw-go-cache go test ./pkg/agent; make lint
Drop the old agent EventBus, SubscribeEvents/EventDrops public surface, legacy hook observer dispatch, and hook.event process notification path. Agent observations now flow through pkg/events runtime events.
Validation: go test ./pkg/agent; make lint
Mark the original hook design as an early record and update observer examples to pkg/events runtime events and hook.runtime_event.
Validation: make lint
Move agent domain event payload structs out of the legacy event envelope file so the remaining EventKind/Event/EventMeta compatibility layer can be removed independently later.
Validation: go test ./pkg/agent; make lint
Use RuntimeEventObserver for the normal in-process hook observer path and make the process-hook helper assert hook.runtime_event notifications.
Validation: go test ./pkg/agent; make lint
Move AgentLoop event assertions to the runtime event stream and keep the legacy SubscribeEvents test only for dual-publish compatibility.
Validation: go test ./pkg/agent; make lint
Deprecate the legacy agent event APIs and add a runtime event test helper, then migrate the follow-up queued test to the runtime event stream.
Validation: go test ./pkg/agent; make lint
Migrate hook observation to runtime events and update the process hook notification protocol. Add runtime event publication for message bus failures, channel lifecycle/outbound flow, gateway reloads, MCP server state, and MCP tool calls.
Validation: go test ./pkg/events/... ./pkg/bus ./pkg/agent ./pkg/channels ./pkg/mcp ./pkg/tools/integration ./pkg/gateway; make lint
Introduce pkg/events with filtered channels, subscription policies, backpressure, and stats. Wire AgentLoop to dual-publish legacy agent events into runtime events while preserving old event APIs.
Validation: go test ./pkg/events/... ./pkg/agent; go test -race ./pkg/events/...; make lint
- Add PrettyPrint and DisableEscapeHTML config options to ToolFeedbackConfig
- Add FormatArgsJSON helper function with configurable pretty printing and HTML escaping
- Add toolFeedbackArgsPreviewWithOptions to pass formatting options
- Update pipeline_execute.go to use new formatting options for tool feedback
This fixes the issue where '&&' would be displayed as '\u0026' in tool
feedback messages and provides optional pretty-printing for better
readability.
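A rough sketch of how such a helper could behave, using the standard encoding/json encoder. The function name and signature below are assumptions for illustration and may differ from the real FormatArgsJSON; the nil-to-{} normalization and the fmt.Sprintf fallback follow the behavior described in the related tool-feedback commits.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// formatArgsJSON is a hypothetical stand-in for FormatArgsJSON.
// It renders tool arguments as JSON, optionally indented and optionally
// without HTML escaping (so "&&" is not shown as "\u0026\u0026").
func formatArgsJSON(args map[string]any, prettyPrint, disableEscapeHTML bool) string {
	if args == nil {
		// Normalize nil to an empty object so the output is {} and not null.
		args = map[string]any{}
	}
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(!disableEscapeHTML)
	if prettyPrint {
		enc.SetIndent("", "  ")
	}
	if err := enc.Encode(args); err != nil {
		// Fall back to a plain rendering so the arguments stay visible.
		return fmt.Sprintf("%v", args)
	}
	// Encode appends a trailing newline; drop it for inline display.
	return string(bytes.TrimRight(buf.Bytes(), "\n"))
}

func main() {
	args := map[string]any{"cmd": "make lint && make test"}
	fmt.Println(formatArgsJSON(args, false, false)) // escaped form
	fmt.Println(formatArgsJSON(args, false, true))  // literal "&&"
	fmt.Println(formatArgsJSON(nil, false, true))   // {}
}
```

json.Marshal always escapes '&', '<', and '>', which is why the sketch goes through json.Encoder and SetEscapeHTML instead.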
- Add isNetworkError detection for connection reset, broken pipe, read/write tcp, EOF
- Add retry logic with configurable exponential backoff for network errors
- Add config options max_llm_retries and llm_retry_backoff_secs in agents.defaults
- Network errors now retry with backoff (was previously not retried)
- Timeout errors now use configurable backoff instead of hardcoded 5s
- Default: 2 retries with 2s backoff (3 total attempts)
Resolves conflicts after the agent loop refactor on main:
- pkg/agent/loop.go was deleted upstream (logic split into agent.go,
agent_init.go, pipeline.go, etc.); accepted the deletion.
- Moved the delegate tool registration block from the old loop.go
into registerSharedTools in pkg/agent/agent_init.go, immediately
after the spawn/spawn_status block. Logic and gating
(len(registry.ListAgentIDs()) > 1) are unchanged.
- pkg/agent/subturn.go and pkg/agent/subturn_test.go merged cleanly
on their own; TargetAgentID field, validation, registry lookup,
and tests all preserved.
Verified locally:
- go build ./pkg/agent/... ./pkg/tools/... clean
- go vet clean
- TestDelegateTool* (17 cases) pass
- TestSpawnSubTurn_TargetAgentID_* (3 cases) pass
- TestDelegateToolRegistered_MultiAgent / _SingleAgent pass
- full pkg/agent + pkg/tools test suites green
Address latest review comments from sky5454 in PR #2654.
scripts/copydir.go:
- Improve repository root detection in a safer, more deterministic way.
- Prefer locating repo root from the script source path via runtime.Caller(), then fallback to upward search from current working directory.
- Replace .git-only root detection with repository anchor validation: go.sum, LICENSE, and .github must exist.
- Keep \ placeholder expansion and existing in-repo path guards.
- Preserve destination safety check to prevent deleting/copying to repo root.
web/backend/api:
- Rename applyLauncherWindowsProcAttrs() to applyLauncherProcAttrs() to expose a platform-independent interface name.
- Keep platform-specific behavior split by build tags: windows keeps HideWindow SysProcAttr setup, non-windows remains no-op.
- Update gateway startup path to call the renamed helper.
Why:
- Follow reviewer feedback to avoid relying on .git detection alone and prefer runtime/file-anchor based repository location.
- Improve naming clarity by making cross-platform interfaces generic while preserving OS-specific implementation details internally.
Validation:
- go test ./cmd/picoclaw/internal/onboard
- go test ./web/backend/api
* Fix Windows build flow
* build(makefile): make windows recipes shell-safe
- avoid backslash line-continuation in Windows build-launcher recipe
- replace cmd-specific if-not-exist with PowerShell check in web build-frontend
* build(web): avoid shell-expanding powershell vars in windows recipe
- rewrite build-frontend Windows command without PowerShell local vars
- keep install-stamp hash check logic
- pid: When a container stops and leaves behind a PID file with PID 1
on a shared volume, the host's init process (PID 1) passes the
isProcessRunning check, blocking new gateway starts. Treat recorded
PID 1 as always stale in both WritePidFile and ReadPidFileWithCheck.
Added unit tests covering the PID=1 container leftover scenario.
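The PID 1 rule can be illustrated with a small sketch. The function name pidIsLive and the injected running callback are hypothetical (the real checks live in WritePidFile and ReadPidFileWithCheck), and treating non-positive PIDs as stale is an extra assumption of this example.

```go
package main

import "fmt"

// pidIsLive decides whether a PID recorded in a PID file should block a
// new gateway start. running stands in for the real isProcessRunning check.
func pidIsLive(pid int, running func(int) bool) bool {
	// PID 1 is the init process on any host, so a PID file left behind by a
	// container (where the gateway itself ran as PID 1) would always look
	// alive. Treat it as stale. Non-positive PIDs are rejected as invalid
	// (an assumption of this sketch).
	if pid <= 1 {
		return false
	}
	return running(pid)
}

func main() {
	alwaysRunning := func(int) bool { return true }
	fmt.Println(pidIsLive(1, alwaysRunning))    // false: stale container leftover
	fmt.Println(pidIsLive(4242, alwaysRunning)) // true
}
```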
- isolation: Fix govet shadow warning on platform_windows.go line 105
where := shadows the outer err variable. Changed to = assignment.
- gitattributes: Enforce LF line endings for shell scripts to prevent
CRLF issues when checking out on Windows (breaks Docker entrypoint).
Co-authored-by: BeaconCat <BeaconCat@users.noreply.github.com>
The launcher wired UI language changes into a process-global backend
switch that changed auto web-search provider selection and the
reported current service for every handler in the same process.
This narrows the fix to the validated leak: remove backend sync from
frontend locale changes, drop the now-unused UI endpoint, and make
auto selection fall back to a stable default when the query itself
does not contain a script hint.
Constraint: Keep the patch small and mergeable without redesigning per-user preference storage
Rejected: Add per-user backend language state | larger scope than the validated bug and unclear maintainer preference
Rejected: Persist preferred language in config | still shares mutable state across clients of the same instance
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If locale-aware provider routing is reintroduced later, scope it to explicit config or request context instead of package-global state
Tested: go test ./web/backend/api ./pkg/tools -count=1; pnpm lint; pnpm build
Not-tested: Full make check; live multi-browser manual launcher run after the backend endpoint removal
- Add build-macos-launcher job (runs on macOS, parallel with GoReleaser)
that builds the CGO-enabled launcher with systray support, signs and
notarizes it via rcodesign, then uploads as artifact.
- Add patch-macos-archives job (runs on cheap Linux runner, needs both
GoReleaser and build-macos-launcher) that downloads the launcher
artifact and darwin release archives, replaces the launcher binary,
and re-uploads the patched archives.
- Fix Docker image tag errors: GITHUB_REPOSITORY_OWNER is immutable in
GitHub Actions. Introduce REPO_OWNER (lowercase) in workflows and
reference it in .goreleaser.yaml for GHCR image names and nfpms
homepage.
- Make Docker Hub login conditional on DOCKERHUB_USERNAME secret being
set, so forks without Docker Hub credentials don't fail.
- Make Docker Hub image in goreleaser conditional on DOCKERHUB_IMAGE_NAME
being non-empty (empty image names are ignored by GoReleaser).
Verified on fork: both nightly and release workflows pass all jobs.
Nightly: https://github.com/BeaconCat/picoclaw/actions/runs/24848808843
Release: https://github.com/BeaconCat/picoclaw/actions/runs/24849753787
Co-authored-by: BeaconCat <BeaconCat@users.noreply.github.com>
- centralize web search provider readiness and resolution logic
- fall back when the configured provider is unavailable or invalid
- allow native-search-capable models to use built-in search without the client tool
- simplify the tools page and add direct access to web search settings
- add backend, agent, and integration tests for the new selection behavior
* refactor: support explicit model list providers
* fix(web): preserve explicit model providers
* fix(web): preserve legacy provider prefixes on model updates
fix(models): normalize explicit provider-prefixed ids
fix(api): preserve legacy model updates across providers
fix(agent): preserve config identity for explicit provider refs
* fix ci
* feat(web): download attachments in frontend
* fix: proxy pico media and force svg downloads
* feat(web): hide ephemeral media refs from persisted session history
- Add `create-tag.yml`: creates annotated tag at a specified commit or
latest main HEAD, with duplicate tag and commit validation
- Simplify `release.yml`: only accepts existing tags, removes create_tag
toggle, validates tag via GitHub API before checkout
- Always check out the main branch (fetch-depth: 0 fetches full history),
then create the tag at the specified commit
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add a context window usage indicator to the web chat UI and a /context
slash command that works across all channels.
Backend:
- Add computeContextUsage() estimating history + system + tool tokens
- Attach ContextUsage to outbound messages via the pico WebSocket protocol
- Add /context command showing context stats as formatted text
- Add EstimateSystemTokens() on ContextBuilder for system prompt estimation
Frontend:
- Add ContextUsageRing component (SVG ring + hover/tap popover)
- Show usage percentage, token counts, and compression threshold
- Hover on desktop (150ms leave delay), tap on mobile
- "View Details" sends /context with 1s cooldown
- i18n support (en/zh) for popover labels
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add reusable channel array list controls and parsing utilities for channel forms.
Normalize channel string-array payloads in the backend, including pasted values,
numeric IDs, hidden characters, duplicates, and empty clears.
Also allow FlexibleStringSlice to unmarshal null values and cover the new behavior
with backend and config tests.
Filter raw tool messages from session history and avoid duplicate summaries for visible message-tool output. Preserve final assistant replies after tool delivery and add coverage for visible transcript counts.
Also refine the chat UI with collapsible reasoning blocks, send shortcut hints, command-style user messages, stable scroll gutters, and updated i18n strings.
- stop exposing the raw Pico token to the frontend
- add /api/pico/info for non-secret Pico connection metadata
- proxy /pico/ws through the launcher with same-origin and dashboard auth checks
- inject the upstream Pico websocket protocol server-side
- update frontend chat connection flow and Vite websocket proxy path
- refresh related docs and tests
* refactor(docs): reorganize docs by type and locale
* chore(docs): add docs layout lint target and contributor guidance
Introduce a lint-docs script and Makefile target for common
documentation naming and placement checks. Expand docs/README.md
with layout and translation conventions, and update CONTRIBUTING.md
to point contributors to the new docs guidance and validation step.
* docs: add section index pages and fix localized doc links
- add reader navigation to docs/README.md
- add index pages for guides, reference, operations, security, architecture, and migration
- update localized project README links to prefer existing translated docs
* docs: fix broken wecom link in Malay README
Add loop-split.md explaining the 12-file split of the original
4384-line loop.go, covering the file map, extraction method,
and future phase 2 plans.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Propagate the configured HTTP client and proxy settings to the
SearXNG search provider.
Allow web_fetch to connect to the configured proxy as the first hop
without bypassing the existing private-host checks for redirect
targets and fetched URLs.
Add tests for loopback proxy fetches and SearXNG proxy propagation.
- split the tools page into focused components and a shared hook
- add separate Tool Library and Web Search tabs
- refresh web search settings layout and localized copy
- make provider expansion keyboard accessible
- restore wrapping for long tool names in library cards
- allow custom styling for KeyInput
Persist channel settings through the current channel_list schema, keeping common
channel fields at the top level and channel-specific fields under settings.
Return common fields and default config shapes from channel config endpoints, and
add coverage for nested patches, missing channel defaults, and secret handling.
Replace generic mockProvider with modelRecordingProvider that captures
the model parameter passed to Chat(). After delegation from alpha to
beta, assert the recorded model is "model-beta" — proving the child
turn actually ran with the target agent's configuration, not the
caller's.
Also add wiring tests:
- TestDelegateToolNotRegistered_SingleAgent: single-agent has no
delegate in its tool registry
- TestDelegateToolRegistered_MultiAgent: both agents in a two-agent
setup have the delegate tool
Ref: #2148
Remove the IsToolEnabled("delegate") check — there is no "delegate"
entry in ToolsConfig, so the check was always true. The only real
gate is len(agents) > 1, which is the intended behavior: delegate
is auto-registered in multi-agent setups.
Ref: #2148
Add table-driven test with case and whitespace variants (ALPHA,
" Alpha ", " alpha ") that should all be caught by the self-check
after normalization.
Ref: #2148
Apply routing.NormalizeAgentID to the raw agent_id input before any
logic runs. This prevents case/whitespace variants like "ALPHA" or
" alpha " from bypassing the self-delegation guard while still
resolving to the same agent in the registry.
The normalized value is used consistently for self-check, allowlist,
SpawnSubTurn, and result attribution.
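A minimal sketch of the guard, assuming that normalization amounts to trimming whitespace and lowercasing. The real routing.NormalizeAgentID may apply further rules that are not shown in this log, so normalizeAgentID below is only a stand-in.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeAgentID is a hypothetical stand-in for routing.NormalizeAgentID.
// Assumption: normalization trims whitespace and lowercases the ID.
func normalizeAgentID(id string) string {
	return strings.ToLower(strings.TrimSpace(id))
}

// isSelfDelegation compares the normalized target against the normalized
// current agent ID, so "ALPHA" or " alpha " cannot slip past the guard.
func isSelfDelegation(currentID, rawTarget string) bool {
	return normalizeAgentID(rawTarget) == normalizeAgentID(currentID)
}

func main() {
	for _, target := range []string{"ALPHA", " Alpha ", " alpha ", "beta"} {
		fmt.Printf("%q -> self=%v\n", target, isSelfDelegation("alpha", target))
	}
}
```

Normalizing once, up front, means the same value feeds every later step instead of each step re-deriving its own variant.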
Ref: #2148
Register the delegate tool in registerSharedTools when multiple agents
are configured. Gated independently from the subagent tool — delegate
uses SubTurn directly and does not depend on SubagentManager.
Self-delegation is prevented by injecting the current agent ID.
Permission is enforced via CanSpawnSubagent (reuses allow_agents config).
Single-agent setups are unaffected: the tool is not registered when
only one agent exists in the registry.
Ref: #2148
12 test cases covering:
- success path with result attribution
- agent_id validation (missing, empty, whitespace, wrong type)
- task validation (missing, empty, whitespace)
- permission denied / allowed via allowlist checker
- self-delegation blocked
- nil spawner, spawner error, nil result from spawner
- open access when no allowlist checker is set
Ref: #2148
delegate(agent_id, task) hands off a task to a named agent and blocks
until the result is ready. The target agent runs with its own config
via the TargetAgentID mechanism in SubTurnConfig.
Key behaviors:
- Self-delegation explicitly rejected
- Permission gated by subagents.allow_agents (D4)
- Spawner errors preserve the underlying error via WithError
- Nil result from spawner handled gracefully
- Response attributed with target agent ID
Ref: #2148
Add multi-agent test setup (newMultiAgentLoop) with two agents using
distinct models (model-alpha, model-beta).
Three new tests:
- UsesTargetAgent: parent=alpha delegates to beta, event log confirms
child runs as agent_id=beta with model=model-beta
- NotFound: TargetAgentID pointing to nonexistent agent returns error
- EmptyModelAccepted: empty Model field accepted when TargetAgentID
provides the model implicitly
Ref: #2148
When TargetAgentID is set, spawnSubTurn resolves the target AgentInstance
from the registry and uses it as the base for the child turn. This gives
the child turn the target's workspace, model, tools, and system prompt
instead of inheriting from the caller.
Model validation is relaxed: empty Model is accepted when TargetAgentID
provides the model implicitly via the resolved agent instance.
Ref: #2148
* membench: add LLM-as-Judge evaluation mode
Add --eval-mode=llm to membench for LLM-based answer generation and
semantic scoring via an OpenAI-compatible API endpoint.
New files:
- llm_client.go: generic OpenAI-compatible chat completion client
with support for API key, configurable timeout, and optional
chat_template_kwargs (for llama.cpp thinking models)
- eval_llm.go: LLM answer generation + LLM-as-Judge scoring for
both legacy and seahorse retrieval modes
Changes to main.go:
- --eval-mode flag (token|llm) to select evaluation strategy
- --api-base, --api-key, --model flags with env var fallback
(MEMBENCH_API_BASE, MEMBENCH_API_KEY, MEMBENCH_MODEL)
- --no-thinking flag for llama.cpp + Qwen thinking models
- --limit flag to cap QA questions per sample for quick testing
* style: fix golangci-lint formatting (gofmt + golines)
* fix: address Copilot review feedback
- Validate --model is required for LLM eval mode
- Use rune-based truncation to preserve valid UTF-8
- Precompute totalQA count outside inner loop
- Log SearchMessages errors instead of silently skipping
* fix: address Copilot review round 2
- Validate --eval-mode accepts only 'token' or 'llm'
- Normalize base URL to avoid /v1/v1 duplication
- Separate token/LLM results for correct PrintComparison labeling
- Log ExpandMessages errors instead of silently ignoring
- Short-circuit with 0 scores when no context retrieved (match token eval)
- Add --timeout flag wired to LLMClientOptions.Timeout
* fix: address review P1+P2 — sort alignment, failure sentinel, score parser
- P1: Replace hand-rolled sortByRank with sort.Slice (ascending, best
first) matching eval.go's EvalSeahorse — ensures BudgetTruncate keeps
best-ranked messages when truncation occurs
- P2: Use -1.0 sentinel for LLM API failures and parse errors, distinct
from genuine 0.0 score; aggregateMetrics skips -1.0 entries for F1
averaging while still counting HitRate
- P2: Use regexp \b([1-5])\b for judge score extraction instead of
first-digit scan — avoids misparses on '5/5', 'Score: 3' etc.
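The score parser and failure sentinel described above could be sketched like this. The function name parseJudgeScore and the choice to return the raw 1-5 value (rather than a rescaled score) are assumptions; only the regular expression and the -1.0 sentinel come from the commit text.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// failedScore marks API failures and parse errors, distinct from a real score.
const failedScore = -1.0

// judgeScoreRe matches a standalone digit 1-5. The word boundaries prevent
// matching a digit that is part of a longer number such as "10" or "42".
var judgeScoreRe = regexp.MustCompile(`\b([1-5])\b`)

// parseJudgeScore extracts the first standalone 1-5 digit from the judge's
// reply, or returns the sentinel when none is found.
func parseJudgeScore(reply string) float64 {
	m := judgeScoreRe.FindStringSubmatch(reply)
	if m == nil {
		return failedScore
	}
	n, err := strconv.Atoi(m[1])
	if err != nil {
		return failedScore
	}
	return float64(n)
}

func main() {
	for _, reply := range []string{"Score: 3", "5/5", "10", "no score given"} {
		fmt.Printf("%q -> %v\n", reply, parseJudgeScore(reply))
	}
}
```

An aggregation step can then skip entries equal to the sentinel when averaging, as the commit describes for aggregateMetrics.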
* fix: address Copilot review round 2
- Fix F1/HitRate weighted aggregation: track ValidF1Count separately so
computeModeAgg weights F1 by valid scores only, not TotalQuestions
- No-context retrieval failure uses 0.0 (genuine bad score) instead of
-1.0 sentinel (reserved for API/parse failures)
- Validate --timeout > 0 to prevent disabling HTTP timeouts
* fix: remove hardcoded /v1 from API base URL
Users now provide the full versioned path in --api-base (e.g. /v1, /v4).
Code only appends /chat/completions. Default changed to
http://127.0.0.1:8080/v1 for backward compatibility.
* fix: address Copilot review round 3
- ValidF1Count=0 when all scores are sentinel (no forced =1)
- Backward compat: old eval JSON without ValidF1Count falls back to
TotalQuestions in computeModeAgg
- Skip empty section in PrintComparison when tokenResults is empty
- Update --api-base flag help to document /v1 default and version path
- Add sentinel aggregation unit tests (partial, all, weighted)
* feat: add --retries flag with exponential backoff for transient LLM errors
Retry on timeout, 5xx, and 429 (rate limit) with 1s/2s/4s backoff.
Default 3 retries, configurable via --retries. Context cancellation
is respected between retries.
* fix: address Copilot review round 4
- runReport splits results by mode suffix into token/llm for PrintComparison
- backward compat fallback (ValidF1Count=0 -> TotalQuestions) only for
non-LLM modes; LLM modes keep ValidF1Count=0 when all scores sentinel
- MaxRetries==0 means no retry; only negative falls back to default 3
- truncateStr uses []rune to avoid cutting multi-byte UTF-8 characters
- Complete() returns error on empty LLM response (vs silent empty string)
* feat: --no-thinking adapts to llama.cpp, Ollama, and GLM backends
Send all three disable-thinking fields simultaneously:
- chat_template_kwargs.enable_thinking=false (llama.cpp, GLM)
- think=false (Ollama 0.9+)
- thinking.type=disabled (GLM/Zhipu)
Each backend picks the field it recognizes and ignores the rest.
Also bumps max_tokens from 512 to 2048 for thinking models.
* feat: mixed model eval + concurrent QA workers
- Add --judge-model, --judge-api-base, --judge-api-key flags for separate judge model
- Add --concurrency flag (default 1) with semaphore-based goroutine pool
- Add reasoning_content fallback for GLM/DeepSeek style responses
- Prepend /no_think to system prompt for Ollama /v1 compatibility
- Reduce default MaxTokens from 2048 to 512 (answers are 1-3 sentences)
- Extract evalQAWorker and buildSeahorseContext for shared concurrent logic
---------
Co-authored-by: BeaconCat <BeaconCat@users.noreply.github.com>
Scope tool result deduplication to each assistant tool-call block so providers
that reuse call IDs across separate turns do not lose valid tool results. Also
drop invalid empty tool call IDs and orphaned tool messages after validation.
Default the sample web search provider to auto, route Sogou vs DuckDuckGo dynamically based on query/UI language, and sync frontend language changes back to the backend so Current Service and runtime selection stay aligned.
* feat(web): show disabled reasons in tooltips when buttons are disabled
- Add disabled reason tooltips for model card actions (set default, delete)
- Add disabled reason tooltips for marketplace skill card install button
- Add disabled reason display for chat input when disabled
- Add internationalization support for all disabled reasons (en/zh)
- Model card: Show specific reasons when set-default or delete buttons are disabled
- Marketplace skill card: Show specific reasons when install button is disabled
- Chat composer: Show reason text below input when input is disabled
* fix: show disabled action reasons via tooltips
* fix(web): restore accessible labels for model action tooltips
* ci(workflows): use pnpm/action-setup in build and release pipelines
Replace the corepack-based pnpm setup with pnpm/action-setup
and pin pnpm to v10.33.0 in the create_dmg, nightly, and
release GitHub Actions workflows.
* docs(readme): update pnpm setup instructions across translated READMEs
Fix hiddenValues in manager_channel.go — use comma-ok type assertions to avoid panics
Add GetDecoded() error handling in weixin.go saveWeixinConfig for consistency with wecom.go
Fix stray quotes in docs/configuration.md JSON examples
Add V2→V3 migration section to docs/config-versioning.md
Fix Feishu init on 32-bit builds, where a wrong function signature caused the build to fail
- build the Android universal bundle from GoReleaser hooks
- attach the bundle as a release asset
- remove the separate post-release upload step
- simplify Make targets around cross-platform builds
Verifies that databases created with the old buggy FTS5 DELETE trigger
body are correctly migrated by runSchema: the old trigger causes DELETE
to fail, and after re-running runSchema (which drops and recreates the
triggers with the corrected body) DELETE works normally.
`CREATE TRIGGER IF NOT EXISTS` does not replace an existing trigger body.
On databases created with the old (buggy) DELETE-FROM-FTS syntax, the
bad trigger body persisted after code updates. Now we explicitly DROP
each trigger before CREATE, so any existing DB gets the corrected body
on next startup — no manual DB deletion required.
- add a dedicated build-release-artifacts target for Android bundle packaging
- switch CI and release workflows to Corepack-managed pnpm with cache support
- pin the frontend pnpm version and make dependency installs deterministic
- inject version metadata into launcher binaries in GoReleaser
- update build documentation to reflect the new workflow
- Add Clear(ctx, sessionKey) to ContextManager interface
- Implement Clear for legacy (JSONL) and seahorse (DB + JSONL)
- Add Engine.ClearSession + Store.ClearConversation
- Fix FTS5 DELETE trigger syntax in schema (was using wrong
external-content FTS5 syntax; now uses standard DELETE FROM)
- Fix ClearSession to skip sessions never ingested (was creating
blank conversations record via GetOrCreateConversation)
- Simplify summary_parents DELETE into single OR statement
- Add TestStoreClearConversation unit test
- CONTRIBUTING.md: change link from zh-hans to en locale
- CONTRIBUTING.zh.md: fix NBSP causing surrounding text to be absorbed into the link
- Both files now use proper markdown link syntax
- Add build-android-arm64, build-launcher-android-arm64, build-all-android
targets to Makefile and web/Makefile
- Use -tags stdjson (no goolm) for Android; CGO_ENABLED=0 throughout
- Output staged as build/android-staging/arm64-v8a/libpicoclaw{,-web}.so
for JNI consumption; zip packaging handled by CI
- Exclude Matrix channel from android builds (channel_matrix.go) to avoid
modernc.org/sqlite CGO dependency
- Exclude systray from android builds; use headless stub instead
(systray.go / systray_stub_nocgo.go)
Cron session keys "agent:cron-{id}-{uuid}" were being silently ignored by
resolveScopeKey, which only recognizes keys prefixed with "agent:". This
caused multiple executions of the same job to share a session. Also
switch from timestamp to UUID to avoid collisions in concurrent scenarios.
Previously all executions of the same cron job reused the session key
"cron-{jobID}", causing conversation history to accumulate across runs.
Now each run gets a unique key "cron-{jobID}-{timestamp}", preventing
cross-execution interference.
User input containing FTS5 operators (-, +, *, OR, NOT, :, quotes,
parentheses) could cause query errors or unexpected search results.
Wrap each token in double quotes to force literal matching while
preserving user-quoted phrases.
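The quoting rule described above can be sketched in Go as follows. This is an illustration under stated assumptions, not the project's implementation: the function name quoteFTS5Query is hypothetical, tokens are split on whitespace, and stray double quotes are dropped rather than escaped.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteFTS5Query rewrites free-form user input so that every token reaches
// FTS5 as a quoted string. Phrases the user already wrapped in double quotes
// stay together as one phrase; everything else is split on whitespace.
func quoteFTS5Query(input string) string {
	var out []string
	var cur strings.Builder
	inPhrase := false

	// flush emits the current token (if any) wrapped in double quotes.
	flush := func() {
		if cur.Len() > 0 {
			out = append(out, `"`+cur.String()+`"`)
			cur.Reset()
		}
	}

	for _, r := range input {
		switch {
		case r == '"':
			// A quote opens or closes a user phrase; either way it ends a token.
			flush()
			inPhrase = !inPhrase
		case (r == ' ' || r == '\t' || r == '\n') && !inPhrase:
			flush()
		default:
			cur.WriteRune(r)
		}
	}
	flush() // also covers an unterminated phrase
	return strings.Join(out, " ")
}

func main() {
	fmt.Println(quoteFTS5Query(`foo -bar OR "exact phrase" baz*`))
}
```

Because operators such as OR and - end up inside quotes, FTS5 treats them as literal text instead of query syntax.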
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Pin react and react-dom to 19.2.5 to avoid runtime crashes caused by a version mismatch.
Refresh the pnpm lockfile to keep frontend dependencies in sync.
Handle platforms where the dashboard password store is unavailable
by treating legacy token auth as initialized, rejecting password
setup, and adding platform-specific store stubs and tests.
The self-built docker/Dockerfile and docker/Dockerfile.heavy created a
dedicated picoclaw user (uid 1000) and stored config at
/home/picoclaw/.picoclaw, while the released images from
Dockerfile.goreleaser (and Dockerfile.full) run as root at
/root/.picoclaw. Both docker-compose files mount ./data:/root/.picoclaw,
so self-built images silently broke when used with the shared compose.
Drop the picoclaw user switch and align both Dockerfiles on root +
/root/.picoclaw. Dockerfile also adopts the release entrypoint.sh so
first-run behavior matches between self-built and release tags.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(launcher): replace token-in-logs auth with standard HTTP login flow
## Problem
Previously users had to find the one-time token from console logs or
log files to access the dashboard - a non-standard, error-prone workflow
with no clear path for changing credentials.
## Solution: standard HTTP API login with bcrypt-backed password store
### Auth flow (new)
1. First run: browser opens, session guard detects uninitialized state,
redirects to /launcher-setup
2. User sets a password (min 8 chars) via POST /api/auth/setup {password, confirm},
bcrypt(cost=12) hash stored in ~/.picoclaw/launcher-auth.db (SQLite)
3. Subsequent logins: POST /api/auth/login {password}, HttpOnly cookie
picoclaw_launcher_auth (HMAC-SHA256 signed, 7-day expiry)
4. 401 on any API call, frontend redirects to /launcher-login
5. Logout: POST /api/auth/logout, cookie cleared, redirect to login
### Backend changes
- web/backend/api/auth.go: renamed Token to Password; added handleSetup;
launcherAuthStatusResponse now includes Initialized bool; PasswordStore
interface wires bcrypt store into handlers
- web/backend/dashboardauth/: new package - Store with New(dir) / Open(path);
SetPassword (bcrypt cost=12), VerifyPassword, IsInitialized
- sql.go: all DB-layer constants (DBFilename, sqliteDriver, bcryptCost,
four SQL query strings) - compile-time constants, zero runtime overhead
- web/backend/middleware/launcher_dashboard_auth.go: /launcher-setup and
/api/auth/setup added to public paths
- web/backend/main.go:
- dashboardauth.New(picoHome) replaces manual path construction
- maskSecret(): suffix only revealed when >=5 chars hidden (length >= 12),
preventing 8-char minimum passwords from leaking their tail
- web/backend/main_test.go: TestMaskSecret updated with boundary cases
### Forward-compatibility: pkg/credential integration
If the dashboard password is later reused as the enc:// passphrase,
the bcrypt hash in launcher-auth.db becomes an offline oracle.
Recommended mitigation (not yet implemented): derive two independent
subkeys via HKDF before use:
bcrypt(HKDF(password, info="picoclaw-dashboard-login-v1")) stored in DB
HKDF(password, info="picoclaw-credential-enc-v1") passed to PassphraseProvider
This isolates the two domains: cracking the bcrypt hash yields only the
login subkey, which is computationally independent of the enc:// subkey.
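A minimal sketch of the subkey derivation, assuming HKDF with SHA-256 and no salt. HKDF is written out by hand here (RFC 5869, single expand block) only to keep the example standard-library-only; the helper name hkdfSHA256 is hypothetical, and the bcrypt step applied to the login subkey is omitted.

```go
package main

import (
	"bytes"
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hkdfSHA256 derives a 32-byte subkey following RFC 5869 (extract, then expand).
// With SHA-256 and a 32-byte output, the expand step needs exactly one block.
func hkdfSHA256(secret, salt []byte, info string) []byte {
	// Extract: PRK = HMAC-SHA256(salt, secret). An absent salt is replaced
	// by a string of zero bytes of hash length, as the RFC specifies.
	if len(salt) == 0 {
		salt = make([]byte, sha256.Size)
	}
	extract := hmac.New(sha256.New, salt)
	extract.Write(secret)
	prk := extract.Sum(nil)

	// Expand: T(1) = HMAC-SHA256(PRK, info || 0x01).
	expand := hmac.New(sha256.New, prk)
	expand.Write([]byte(info))
	expand.Write([]byte{0x01})
	return expand.Sum(nil)
}

func main() {
	password := []byte("correct horse battery staple")

	// The two info strings are the ones proposed in the commit message.
	loginKey := hkdfSHA256(password, nil, "picoclaw-dashboard-login-v1")
	encKey := hkdfSHA256(password, nil, "picoclaw-credential-enc-v1")

	fmt.Println("login subkey:", hex.EncodeToString(loginKey))
	fmt.Println("enc subkey:  ", hex.EncodeToString(encKey))
	fmt.Println("distinct:    ", !bytes.Equal(loginKey, encKey))
}
```

Only the info label differs between the two calls, which is what separates the login domain from the enc:// domain.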
* fix(auth): replace wastedassign ok := false with var ok bool
* refactor(tray): remove copy-token clipboard feature
Dashboard login now uses standard web auth (bcrypt + session cookie).
The system tray 'Copy dashboard token' menu item is no longer needed.
- Delete tray_offers_copy.go and tray_offers_copy_stub.go
- Remove mCopyTok menu item and clipboard handler from systray.go
- Remove launcherDashboardTokenForClipboard var from main.go
- Remove MenuCopyToken/MenuCopyTokenHint keys from i18n.go
* feat(launcher-ui): standard HTTP login/setup/logout flow for dashboard
Replaces the previous "find token in logs" workflow with a proper
browser-based authentication UI backed by the new /api/auth/* endpoints.
### New pages
- /launcher-setup: first-run password initialization form (password +
confirm, min 8 chars); calls POST /api/auth/setup; redirects to login
on success
- /launcher-login: standard password login form; calls POST /api/auth/login;
sets HttpOnly session cookie on success
### Session guard (src/routes/__root.tsx)
A useEffect on every non-auth page load calls GET /api/auth/status:
- initialized=false -> redirect to /launcher-setup
- authenticated=false -> redirect to /launcher-login
This ensures the setup/login UI is shown even when the ?token= URL
mechanism auto-logs in (first-run case).
### Logout button (src/components/app-header.tsx)
IconLogout button added to the header with a confirm AlertDialog;
calls POST /api/auth/logout then redirects to /launcher-login.
### API layer
- src/api/launcher-auth.ts: LauncherAuthStatus gains initialized bool;
postLauncherDashboardSetup() added; LauncherAuthTokenHelp removed
- src/api/http.ts: 401 guard uses isLauncherAuthPathname() (covers both
/launcher-login and /launcher-setup) to prevent redirect loops
- src/lib/launcher-login-path.ts: isLauncherSetupPathname() and
isLauncherAuthPathname() added
### Routing
- src/routeTree.gen.ts: /launcher-setup route registered throughout
- src/routes/launcher-login.tsx: tokenHelp UI removed; useEffect added
to redirect to setup when initialized=false
### i18n
- en.json / zh.json: launcherSetup block added; launcherLogin keys
updated to use passwordLabel/passwordPlaceholder
* fix(lint): ts lint fixed 1
* fix(auth): add detailed auth error handling
* fix(login): handle web auth errors in the frontend
* fix(frontend): handle 5xx responses in the auth error handler
When the message tool sent to a different chat (e.g., a group), the
agent's final response to the originating chat was incorrectly skipped
because HasSentInRound() was a simple bool that didn't distinguish
targets. Replace with HasSentTo(channel, chatID) that tracks all
send targets per round and only suppresses when the target matches.
Fixes cross-conversation message causing "Processing..." to hang.
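The per-target tracking can be sketched like this. Only HasSentTo(channel, chatID) is named in the commit; the roundTracker type, MarkSent, and Reset are hypothetical names, and the sketch is not safe for concurrent use.

```go
package main

import "fmt"

// sendTarget identifies one destination chat on one channel.
type sendTarget struct {
	channel string
	chatID  string
}

// roundTracker records every target the message tool sent to in the
// current round. (Hypothetical type; not taken from the source.)
type roundTracker struct {
	sent map[sendTarget]struct{}
}

// MarkSent remembers that a message went to channel/chatID this round.
func (t *roundTracker) MarkSent(channel, chatID string) {
	if t.sent == nil {
		t.sent = make(map[sendTarget]struct{})
	}
	t.sent[sendTarget{channel, chatID}] = struct{}{}
}

// HasSentTo reports whether this exact target already received a message.
func (t *roundTracker) HasSentTo(channel, chatID string) bool {
	_, ok := t.sent[sendTarget{channel, chatID}]
	return ok
}

// Reset clears the record at the start of a new round.
func (t *roundTracker) Reset() { t.sent = nil }

func main() {
	var tracker roundTracker
	tracker.MarkSent("telegram", "group-1") // the tool messaged a different chat

	// The originating chat was not a target, so its final reply is kept.
	fmt.Println("suppress reply to user-42:", tracker.HasSentTo("telegram", "user-42"))
	fmt.Println("suppress reply to group-1:", tracker.HasSentTo("telegram", "group-1"))
}
```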
* * completed
* * optimize
* * fix format
* * fix pr check
* try to fix ci
* * Document that Windows does not support expos_paths and add more mount paths for the Linux platform.
* fix isolation startup lifecycle and MCP transport wrapping
* fix isolation startup cleanup and optional Linux mounts
* fix isolation path handling for relative hooks
Preserve relative command and working-directory semantics when Linux isolation wraps subprocesses, and restore absolute argv path exposure to avoid startup regressions. Add hook coverage and docs updates so isolation-enabled process hooks keep working as configured.
* * fix ci
* fix(feishu): enrich reply context for card and file replies
* refactor(feishu): extract reply functions to feishu_reply.go
- Move reply-related functions to new feishu_reply.go
- Move corresponding tests to feishu_reply_test.go
- Extract magic number 600 to maxReplyContextLen constant
- Unify replyTargetID/replyTargetFromMessage (prefer parent_id, fallback root_id)
- Add source comment for containsFeishuUpgradePlaceholder
* fix(feishu): skip API fallback for non-thread messages, prepend replied media refs
- resolveReplyTargetMessageID: only call fetchMessageByID fallback when
ThreadId is set, avoiding unnecessary API calls for non-reply messages
- prependReplyContext: prepend replied media refs before current media refs
to maintain correct ordering
* fix(feishu): add message cache for fetchMessageByID to avoid repeated downloads
- Add messageCache (sync.Map) to FeishuChannel struct
- Cache fetched messages with 30s TTL to avoid re-downloading attachments
when multiple users reply to the same parent message in a thread
- Cleanup expired entries on read access (no background goroutine needed)
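A sketch of a sync.Map cache with a TTL and cleanup on read, under assumptions: the cached value is reduced to a string, and the type names, the injectable clock, and the manualClock helper are hypothetical additions for the demo.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cacheEntry pairs a cached payload with its expiry time.
type cacheEntry struct {
	payload   string
	expiresAt time.Time
}

// messageCache keeps fetched messages for a short TTL without a background
// goroutine: expired entries are removed when they are next read.
type messageCache struct {
	ttl     time.Duration
	now     func() time.Time // injectable clock, defaults to time.Now
	entries sync.Map         // message ID (string) -> cacheEntry
}

func newMessageCache(ttl time.Duration) *messageCache {
	return &messageCache{ttl: ttl, now: time.Now}
}

// Put stores payload under id, valid for one TTL from now.
func (c *messageCache) Put(id, payload string) {
	c.entries.Store(id, cacheEntry{payload: payload, expiresAt: c.now().Add(c.ttl)})
}

// Get returns the payload if present and not expired.
func (c *messageCache) Get(id string) (string, bool) {
	v, ok := c.entries.Load(id)
	if !ok {
		return "", false
	}
	entry := v.(cacheEntry)
	if !c.now().Before(entry.expiresAt) {
		c.entries.Delete(id) // lazy cleanup on read
		return "", false
	}
	return entry.payload, true
}

// manualClock is a demo helper whose time moves only when Advance is called.
type manualClock struct{ t time.Time }

func (m *manualClock) Now() time.Time          { return m.t }
func (m *manualClock) Advance(d time.Duration) { m.t = m.t.Add(d) }

func main() {
	clock := &manualClock{t: time.Now()}
	cache := newMessageCache(30 * time.Second)
	cache.now = clock.Now

	cache.Put("om_123", "parent message body")
	_, hit := cache.Get("om_123")
	fmt.Println("fresh entry hit:", hit)

	clock.Advance(31 * time.Second)
	_, hit = cache.Get("om_123")
	fmt.Println("after TTL hit:", hit)
}
```

Entries that are never read again stay in the map until the process exits, which is the trade-off of skipping a background sweeper.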
* fix(feishu): early-return for non-reply messages, add cache and fetchMessageByID comment
* fix: remove duplicate test and fix gci import order
* fix(feishu): remove duplicate prependReplyContext call
* fix(gateway): validate PID ownership and clean stale pid files
- include `pid` in health responses for runtime PID verification
- add `RemovePidFileIfPID` to safely delete PID files only on PID match
- sanitize gateway PID data via process-command checks with health fallback
- ignore and remove stale/non-gateway PID files before gateway operations
- refuse stop/restart actions when the attached process is not a gateway
- update gateway and websocket tests to cover PID validation and safety paths
* test(seahorse): use shared in-memory SQLite DB in tests to fix async compaction failures
* test: remove unused sendMediaErr field from hook test mock
* feat(hooks): add respond action for tool execution bypass
Add a new HookActionRespond that allows hooks to return tool results directly, skipping actual tool execution. This enables plugin tool injection, caching, and mocking capabilities.
- Add HookActionRespond constant and support in HookManager
- Extend ToolCallHookRequest with HookResult field
- Implement respond action handling in process hooks and agent loop
- Add comprehensive tests for respond and deny_tool actions
- Update documentation with hook actions table and examples
* docs(hooks): add JSON-RPC protocol and plugin tool injection documentation
Add comprehensive documentation for hook JSON-RPC protocol and plugin tool injection capabilities:
- Add "Hook Actions" section to README.zh.md explaining respond action for tool execution bypass
- Create hook-json-protocol.md/.zh.md detailing JSON-RPC 2.0 protocol for all hook methods
- Create plugin-tool-injection.md/.zh.md with complete examples for external tool implementation
- Document how hooks can inject tool definitions and return results via respond action
- Include Python and Go examples for weather query plugin implementation
* feat(agent): emit tool events and feedback for hook results
Add ToolExecStart event emission and tool feedback for hook results to ensure consistent behavior between normal tool execution and hook bypass scenarios. This maintains parity in event tracking and user feedback when tools are executed via hooks.
* style(agent): format whitespace in hook structs and constants
Remove trailing whitespace and standardize spacing in JSON struct tags, constants, and test data for improved code consistency.
* feat(hooks): add media support for plugin tool injection
Extend the hook respond action to support media file handling:
- Add `media` field for returning images and files from hooks
- Add `response_handled` field to control turn completion behavior
- When response_handled=true, media is automatically delivered to user
- When response_handled=false, media is passed to LLM for vision requests
This enables plugins to directly return generated images, downloaded
files, and other media content either to users or for LLM analysis.
* docs(hooks): document security implications of respond action
Add security boundary documentation explaining that the respond action
bypasses ApproveTool checks, allowing hooks to return results for any
tool without approval. Include recommendations for secure hook
implementation and code comments marking the security considerations.
Changes:
- Add "Security Boundaries" section to plugin-tool-injection docs
- Document bypass of approval checks and associated risks
- Provide security recommendations and example code
- Add inline security comments in hooks.go and loop.go
* refactor(agent): improve completeness of tool result cloning and hook processing
Extend cloneToolResult to properly copy ArtifactTags and Messages fields,
ensuring deep copies of all ToolResult data. Consolidate event emission
and user message handling to match the normal tool execution flow.
* fix(agent): align hook respond path with normal tool execution flow
The hook respond code path was missing several critical behaviors that
existed in normal tool execution:
- Add logging for tool calls with arguments preview
- Add is_tool_call metadata to user-facing messages
- Handle attachment delivery failures by setting error state and
notifying LLM
- Set ResponseHandled=false when using bus for media delivery
- Check for steering messages and graceful interrupts after tool
execution, skipping remaining tools when appropriate
- Poll for SubTurn results that arrived during tool execution
This ensures consistent behavior between hook-responded tool calls and
normally executed tool calls.
* test(agent): add tests for hook respond media error handling
Add comprehensive tests for the hook respond code path when media
delivery fails. Tests cover error media channel scenarios and verify
proper error state handling.
Also document that AfterTool is not called when using respond action,
as it provides the final answer directly (design decision).
* fix(agent): disable seahorse context manager on freebsd/arm
Exclude freebsd/arm from the seahorse-enabled build and route it to the
unsupported stub implementation.
This avoids freebsd/arm build failures caused by modernc sqlite/libc while
keeping picoclaw buildable on that target.
* build: bump Go version from 1.25.8 to 1.25.9
* ci: install and run govulncheck directly in PR workflow
Replace hand-rolled HTTP/HMAC/JSON code (~270 lines) with the official
line-bot-sdk-go v8, reducing maintenance burden and eliminating potential
bugs in signature verification, request construction, and response parsing.
This continues the work started in #500 by @xiaket, addressing all review
feedback and rebasing onto current main.
Changes:
- Replace bytes/crypto/json/io imports with line-bot-sdk-go/v8
- Use webhook.ParseRequest for body reading + signature verification
- Use messaging_api.MessagingApiAPI for ReplyMessage/PushMessage/ShowLoadingAnimation/GetBotInfo
- Type-switch on webhook.MessageEvent message types (TextMessageContent,
ImageMessageContent, etc.) instead of JSON unmarshalling
- Type-switch on webhook.SourceInterface (UserSource/GroupSource/RoomSource)
- Type-switch on webhook.Mentionee (UserMentionee/AllMentionee)
Review feedback addressed (from #500):
- Use WithContext(ctx) on all SDK calls to preserve cancellation/timeout
- Fix variable shadowing of isMentioned (declared at function scope)
- Remove reflect-based message ID extraction (use type switch + msg.Id)
- Use mentionee.IsSelf for cleaner bot mention detection
- Preserve body size security check via http.MaxBytesReader before
webhook.ParseRequest (compatible with #1413)
All existing tests pass without modification.
* fix: use per-candidate provider for model_fallbacks
Each fallback model now uses its own api_base and api_key from
model_list instead of inheriting the primary model's provider config.
Previously, a single LLMProvider was created from the primary model's
ModelConfig and reused for all fallback candidates — only the model ID
string was swapped. This caused all fallback requests to be routed to
the primary provider's endpoint, making cross-provider fallback chains
non-functional (e.g., OpenRouter primary with Gemini fallback would
send the Gemini request to OpenRouter's API).
Fix: pre-create a per-candidate LLMProvider at agent initialization
time by looking up each candidate's ModelConfig from model_list. The
fallback run closure now selects the correct provider per candidate
via CandidateProviders map, falling back to agent.Provider when no
override is found.
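The lookup-with-fallback described above might look roughly like this. LLMProvider, ModelConfig, and CandidateProviders are named in the commit, but their fields and methods here are simplified stand-ins, and staticProvider, buildCandidateProviders, and providerFor are hypothetical names.

```go
package main

import "fmt"

// LLMProvider is a stand-in for the real provider interface.
type LLMProvider interface {
	Endpoint() string
}

type staticProvider struct{ apiBase string }

func (p staticProvider) Endpoint() string { return p.apiBase }

// ModelConfig mirrors the per-model fields mentioned in the commit message.
type ModelConfig struct {
	Model   string
	APIBase string
	APIKey  string
}

type Agent struct {
	Provider           LLMProvider            // primary model's provider
	CandidateProviders map[string]LLMProvider // one provider per fallback model
}

// buildCandidateProviders creates a provider for each fallback model that
// has an entry in modelList. Models without an entry are simply skipped.
func buildCandidateProviders(modelList []ModelConfig, fallbacks []string) map[string]LLMProvider {
	byName := make(map[string]ModelConfig, len(modelList))
	for _, mc := range modelList {
		byName[mc.Model] = mc
	}
	providers := make(map[string]LLMProvider)
	for _, name := range fallbacks {
		if mc, ok := byName[name]; ok {
			providers[name] = staticProvider{apiBase: mc.APIBase}
		}
	}
	return providers
}

// providerFor picks the candidate's own provider, else the primary one.
func (a *Agent) providerFor(model string) LLMProvider {
	if p, ok := a.CandidateProviders[model]; ok {
		return p
	}
	return a.Provider
}

func main() {
	modelList := []ModelConfig{
		{Model: "primary", APIBase: "https://openrouter.example/v1"},
		{Model: "fallback", APIBase: "https://gemini.example/v1"},
	}
	agent := &Agent{
		Provider:           staticProvider{apiBase: modelList[0].APIBase},
		CandidateProviders: buildCandidateProviders(modelList, []string{"fallback"}),
	}
	fmt.Println(agent.providerFor("fallback").Endpoint())
	fmt.Println(agent.providerFor("unlisted").Endpoint())
}
```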
Fixes #2140
Made-with: Cursor
test: add test for instance.go
fix: fix test
refactor: optimize
fix: fix Golang lint issues
chore: comment cleanup
* refactor: use resolvedModelConfig() instead of buildModelIndex()
* fix
Add Microsoft Teams webhook integration via Power Automate workflows.
Features:
- Output-only channel for sending notifications to Teams
- Multiple webhook targets with named configuration
- Required "default" target with automatic fallback
- Rich Adaptive Card formatting with full-width rendering
- Markdown table conversion to native Adaptive Card Tables
- Column widths based on header content length
- HTTPS-only webhook URL validation
- Proper error classification for retry behavior
Configuration:
- channels.teams_webhook.enabled: bool
- channels.teams_webhook.webhooks: map of named targets
- Each target has webhook_url (SecureString) and optional title
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
The frontend previously used ws_url returned by /api/pico/token, which
is built from the launcher's own port. Behind a reverse proxy this can
produce incorrect URLs (e.g. ws://localhost:18800 instead of the
proxy's public address).
Since the launcher already proxies /pico/ws on the same port, the
frontend can simply use window.location.host to construct the
WebSocket URL, which is always correct regardless of proxy layers.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- treat `EPERM` from `signal(0)` as “process exists” on Unix
- classify malformed PID files as invalid and auto-remove them during read
- keep cached `pidData` only for transient races and downgrade `running` to `stopped` when the tracked process is gone
- refresh PID data on WebSocket proxy requests and reject stale cached gateway state
- add regression tests for invalid PID files, status downgrade, on-demand PID loading, and stale proxy rejection
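The EPERM rule in the first bullet can be sketched as below. This is a Unix-only illustration (syscall.Kill is not available on Windows), and the function name processExists is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// processExists reports whether a process with the given PID is present.
// Signal 0 runs the existence and permission checks without delivering a
// signal. EPERM means the process exists but belongs to another user.
func processExists(pid int) bool {
	if pid <= 0 {
		return false // 0 and negative values address process groups, not one process
	}
	err := syscall.Kill(pid, 0)
	return err == nil || errors.Is(err, syscall.EPERM)
}

func main() {
	fmt.Println("current process exists:", processExists(os.Getpid()))
}
```

Treating EPERM as "not running" would make a gateway started by another user look stopped, which is the case the change addresses.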
SQLite FTS5 bm25() returns negative values where numerically smaller
(more negative) indicates a better match. The official docs state:
"The better the match, the numerically smaller the value returned."
Two comments incorrectly stated "closer to 0 = better match" and
"lower = better match". Updated all rank descriptions to use the
unambiguous "more negative = higher relevance" phrasing.
This matters because these comments are used as tool prompt hints
for LLM agents, and incorrect semantics could lead to wrong ranking
decisions.
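To make the ordering concrete, here is a small Go sketch that sorts hits by bm25 rank. The searchHit type and the rank values are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// searchHit holds one FTS5 result with its bm25() rank.
type searchHit struct {
	ID   string
	Rank float64 // more negative = higher relevance
}

// sortByRelevance orders hits best-first, which for bm25() means
// ascending numeric order.
func sortByRelevance(hits []searchHit) {
	sort.SliceStable(hits, func(i, j int) bool {
		return hits[i].Rank < hits[j].Rank
	})
}

func main() {
	hits := []searchHit{
		{ID: "a", Rank: -0.5},
		{ID: "b", Rank: -3.2},
		{ID: "c", Rank: -1.1},
	}
	sortByRelevance(hits)
	for _, h := range hits {
		fmt.Printf("%s %.1f\n", h.ID, h.Rank)
	}
}
```

This matches a plain ORDER BY rank in SQL: ascending order already puts the best match first.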
- add build tags to exclude context_seahorse.go on mipsle and netbsd
- add context_seahorse_unsupported.go to keep registration and return a clear runtime error
- remove unused indirect dependency github.com/reiver/go-porterstemmer from go.mod and go.sum
- Add -console to Dockerfile CMD so launcher outputs logs to stdout,
making docker logs work as expected
- Remove 127.0.0.1 bind from ports to allow public network access
- Add commented PICOCLAW_LAUNCHER_TOKEN env var example
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(seahorse): implement short-term memory engine of seahorse
Add pkg/seahorse/ module implementing a SQLite-backed DAG-based summary
hierarchy for context management, ported from lossless-claw's LCM design:
- types.go + short_constants.go: core types (Message, Summary, Conversation,
ContextItem) and configuration constants (fanout, token targets, thresholds)
- migration.go: idempotent DB schema with FTS5 trigram tokenizer for CJK
- store.go: full SQLite CRUD (conversations, messages, summaries DAG,
context_items with ordinal gap numbering, FTS5 search)
- short_engine.go: Engine lifecycle (NewEngine, Ingest, Assemble, Compact),
session pattern filtering (ignore/stateless glob→regex compilation),
per-session mutex via sync.Map
- short_assembler.go: budget-aware context assembly with fresh tail protection
(32 messages), oldest-first eviction, summary XML formatting, RebuildContextItems
- short_compaction.go: leaf compaction (messages→summary) and condensed
compaction (summaries→higher-level summary), 3-level LLM escalation,
CompactUntilUnder for emergency overflow
- short_retrieval.go: lookupByID, FTS5/LIKE search, recursive expand with
token cap
- context_seahorse.go: agent.ContextManager adapter, registered as "seahorse",
provider↔seahorse message type conversion (ToolCalls, tool_result)
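The glob→regex compilation mentioned for session pattern filtering could be approximated as follows. The glob semantics are an assumption ("*" matches any run of characters, "?" matches one character, everything else is literal), and globToRegexp is a hypothetical name.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegexp converts a simple glob into an anchored regular expression.
// Assumed semantics: "*" matches any run of characters, "?" matches exactly
// one character, and every other character is literal.
func globToRegexp(glob string) *regexp.Regexp {
	quoted := regexp.QuoteMeta(glob) // escapes "*" as `\*` and "?" as `\?`
	quoted = strings.ReplaceAll(quoted, `\*`, `.*`)
	quoted = strings.ReplaceAll(quoted, `\?`, `.`)
	return regexp.MustCompile(`^` + quoted + `$`)
}

func main() {
	ignore := globToRegexp("agent:cron-*")
	for _, key := range []string{"agent:cron-42", "agent:main"} {
		fmt.Printf("%-14s ignored=%v\n", key, ignore.MatchString(key))
	}
}
```

Compiling each pattern once and reusing the result avoids re-parsing the glob for every session key.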
* fix(seahorse): correct 3 adapter bugs in context management
- TokenCount: use full message (Content+ToolCalls+Media) instead of Content-only
- Empty Content: rebuild Content from tool_result Parts when stored empty
- Duplicate summaries: summaries only in Summary field, not in History messages
- Grep: fix SearchResult.Snippet→Content for summaries
- Schema: fix FTS5 SQL to use VIRTUAL TABLE, not TEMP TABLE
- TestFTS5SQLConstants: verify FTS5 SQL syntax correctness
- Test: fix flaky TestCompactLeaf
* fix(agent): ingest steering messages into seahorse SQLite
Steering messages were only persisted to session JSONL but not ingested
into seahorse SQLite, causing them to be missing from context assembly.
Added `ts.ingestMessage(turnCtx, al, pm)` call in the steering message
injection block alongside the existing JSONL persistence.
Test: TestSeahorseSteeringMessageIngested verifies steering messages
appear in seahorse SQLite DB after being processed.
* fix(seahorse): address 3 blocking bugs from code review
- Fix resequenceContextItemsTx scan error handling (store.go:850)
Changed `return err` to `return scanErr` to properly propagate scan errors
instead of returning nil (which silently corrupts data)
- Fix sql.NullString for INTEGER column (store.go:847)
Changed `mid` from sql.NullString to sql.NullInt64 since message_id
is INTEGER in schema. Removed unnecessary strconv.ParseInt call.
- Fix compactCondensed fallback deleting non-candidate items
Added ReplaceContextItemsWithSummary method for per-item deletion
when candidates are not contiguous in ordinal space.
Optimized to use range deletion when candidates are consecutive.
* fix(seahorse): pass Budget to Compact for correct condensed threshold
Issue #4 from PR review: When Budget was not passed to seahorse.Compact,
it defaulted to `tokensBefore * 0.75`, making `tokensBefore > budget`
always true and causing condensed compaction to trigger unnecessarily.
Changes:
- context_seahorse.go: Forward Budget from CompactRequest to CompactInput
- loop.go: Pass Budget (ContextWindow) in all 3 Compact calls
- Add test verifying condensed is skipped when tokens < threshold
- Fix lint issues in store.go and store_test.go
* fix(seahorse): add mutex for assembler lazy initialization
Issue #5 from PR review: The check-then-create pattern for e.assembler
was a data race when multiple goroutines called Assemble() concurrently:
if e.assembler == nil {
e.assembler = &Assembler{...}
}
Changes:
- Add assemblerMu sync.Mutex to Engine struct
- Add initAssemblerOnce() using double-checked locking (same pattern as initCompactionOnce)
- Add TestAssemblerLazyInitRace to verify thread-safety
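A simplified sketch of mutex-guarded lazy initialization. Unlike the double-checked variant named in the commit, this version always takes the lock. Engine, Assembler, and assemblerMu follow the commit text; getAssembler, the created counter, and distinctAssemblers are hypothetical additions for the demo.

```go
package main

import (
	"fmt"
	"sync"
)

type Assembler struct{ id int }

type Engine struct {
	assemblerMu sync.Mutex
	assembler   *Assembler
	created     int // how many times the constructor ran (demo only)
}

// getAssembler creates the assembler on first use. Holding the mutex across
// the nil check and the assignment removes the check-then-create race.
func (e *Engine) getAssembler() *Assembler {
	e.assemblerMu.Lock()
	defer e.assemblerMu.Unlock()
	if e.assembler == nil {
		e.created++
		e.assembler = &Assembler{id: e.created}
	}
	return e.assembler
}

// distinctAssemblers calls getAssembler from n goroutines and returns the
// number of distinct instances they observed.
func distinctAssemblers(e *Engine, n int) int {
	results := make([]*Assembler, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = e.getAssembler()
		}(i)
	}
	wg.Wait()

	seen := make(map[*Assembler]struct{})
	for _, a := range results {
		seen[a] = struct{}{}
	}
	return len(seen)
}

func main() {
	e := &Engine{}
	fmt.Println("distinct instances:", distinctAssemblers(e, 64))
	fmt.Println("constructor runs:  ", e.created)
}
```

Running the demo with go run -race is a quick way to confirm that no data race is reported.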
* fix(seahorse): handle non-consecutive depths in selectShallowestCondensationCandidate
Issue #8 from PR review: the loop iterated depth 0, 1, 2... assuming
consecutive keys, but breaking on the first missing key meant that deeper
depths were never checked.
Fix: collect all existing depth keys, sort, then iterate in order.
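The collect, sort, iterate approach can be sketched as below. The map shape, the minFanout threshold, and the function name selectShallowestDepth are assumptions for illustration; the real selectShallowestCondensationCandidate may use different types and criteria.

```go
package main

import (
	"fmt"
	"sort"
)

// selectShallowestDepth returns the smallest depth that holds at least
// minFanout summaries. Depth keys need not be consecutive.
func selectShallowestDepth(byDepth map[int][]string, minFanout int) (int, bool) {
	// Collect the depths that actually exist, then visit them in order.
	depths := make([]int, 0, len(byDepth))
	for d := range byDepth {
		depths = append(depths, d)
	}
	sort.Ints(depths)

	for _, d := range depths {
		if len(byDepth[d]) >= minFanout {
			return d, true
		}
	}
	return 0, false
}

func main() {
	// Depth 1 is missing; a loop that stops at the first gap never sees depth 2.
	byDepth := map[int][]string{
		0: {"s1"},
		2: {"s2", "s3", "s4", "s5"},
	}
	depth, ok := selectShallowestDepth(byDepth, 4)
	fmt.Println(depth, ok)
}
```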
* fix(seahorse): wrap DeleteMessagesAfterID and appendContextItems in transactions
- DeleteMessagesAfterID: wrap all DELETE operations in a transaction for
atomicity, remove redundant manual FTS delete (handled by trigger)
- appendContextItems: use transaction to fix read-then-write race condition
- Add GetMaxOrdinalTx and resolveItemTokenCountTx for transaction-scoped queries
- Remove unused resolveItemTokenCount function
Fixes PR review issues 6 and 7.
* fix(seahorse): derive readable content from Parts and cap CompactUntilUnder iterations
- Derive readable content from MessageParts in AddMessageWithParts so
FTS5 indexing and summary formatting can access tool call information
- formatMessagesForSummary and truncateSummary now fall back to Parts
when Content is empty, fixing blank summaries for Part-based messages
- Add MaxCompactIterations (20) to prevent CompactUntilUnder infinite
loops; exceeded iterations are logged as warnings
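A sketch of the iteration cap, assuming a callback that performs one compaction pass and returns the new token count. Only MaxCompactIterations and its value of 20 come from the commit; the signature and the compactOnce callback are hypothetical.

```go
package main

import "fmt"

// MaxCompactIterations matches the cap named in the commit message.
const MaxCompactIterations = 20

// compactUntilUnder applies compactOnce until the token count fits the
// budget or the iteration cap is reached. It returns the final token count,
// the number of passes used, and whether the budget was met.
func compactUntilUnder(tokens, budget int, compactOnce func(int) int) (int, int, bool) {
	passes := 0
	for tokens > budget && passes < MaxCompactIterations {
		tokens = compactOnce(tokens)
		passes++
	}
	if tokens > budget {
		fmt.Printf("warning: %d tokens still exceeds budget %d after %d passes\n",
			tokens, budget, passes)
	}
	return tokens, passes, tokens <= budget
}

func main() {
	// Each pass halves the context, so the loop converges quickly.
	fmt.Println(compactUntilUnder(1000, 100, func(t int) int { return t / 2 }))

	// A pass that makes no progress stops at the cap instead of spinning forever.
	fmt.Println(compactUntilUnder(1000, 100, func(t int) int { return t }))
}
```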
* feat(mcp): store oversized text results as artifacts
* feat(mcp): fix doc
* fix(mcp): preserve raw MCP payload in text artifacts
* fix(mcp): avoid leaking large text when artifact persistence fails
* chore(mcp): clarify inline text limit and cover artifact edge cases
- forward refs through ScrollArea so logs can access the viewport
- keep logs pinned to the bottom only when the user is already near it
- apply import and className ordering cleanup across frontend components
- add `launcher_token` to launcher config API/schema and save/load flow
- update dashboard token resolution order: env var -> launcher config -> random
- expose token source in startup logs and auth help metadata (including config path)
- add launcher token input to the config page and wire frontend form/API updates
- update login help/i18n copy and extend backend tests for new token-source behavior
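The resolution order (env var -> launcher config -> random) might be implemented along these lines. The function name, the tokenSource labels, the 16-byte random length, and the use of PICOCLAW_LAUNCHER_TOKEN as the environment variable are assumptions made for the sketch.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"os"
)

// tokenSource names where the dashboard token came from, for startup logs.
type tokenSource string

const (
	sourceEnv    tokenSource = "env"
	sourceConfig tokenSource = "config"
	sourceRandom tokenSource = "random"
)

// resolveDashboardToken applies the order env var -> launcher config -> random.
// getenv is injected so the lookup can be exercised without touching the
// real environment.
func resolveDashboardToken(getenv func(string) string, configToken string) (string, tokenSource, error) {
	if v := getenv("PICOCLAW_LAUNCHER_TOKEN"); v != "" {
		return v, sourceEnv, nil
	}
	if configToken != "" {
		return configToken, sourceConfig, nil
	}
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", "", err
	}
	return hex.EncodeToString(buf), sourceRandom, nil
}

func main() {
	token, source, err := resolveDashboardToken(os.Getenv, "")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("token source: %s (length %d)\n", source, len(token))
}
```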
* feat: add VK channel support
- Add VK channel implementation using vksdk
- Support text messages and media attachments
- Implement Long Poll API for real-time messaging
- Add group chat support with trigger prefixes
- Add user whitelist (allow_from) configuration
- Add VK channel documentation
Files:
- pkg/channels/vk/: VK channel implementation
- pkg/config/config.go: Add VKConfig structure
- pkg/channels/manager.go: Register VK channel
- pkg/gateway/gateway.go: Import VK channel package
- docs/channels/vk/: Usage documentation
* test: add unit tests for VK channel
- Test channel initialization with various configurations
- Test allow_from whitelist functionality
- Test group trigger configuration
- Test max message length (4000 chars)
- Test message splitting logic
- Test attachment processing
All tests passing ✓
* fix: resolve linting issues in VK channel
- Format VKConfig struct tags to comply with golines
- Remove unused mu sync.Mutex field
- Remove unused stripPrefix method
All tests passing ✓
* style: format VKConfig with golines
- Align struct tags to match project style
- Match formatting with other channel configs (Telegram, etc.)
- Fix golines linting error
* style: fix struct tag formatting in config.go
* docs: update VK channel docs to use secure token storage
* feat(vk): add voice capabilities support
- Implement VoiceCapabilities() method for VK channel
- Add audio_message attachment handling in processAttachments
- Add comprehensive tests for voice capabilities
- Support both ASR (speech-to-text) and TTS (text-to-speech)
* docs: add VK channel to documentation and update voice support
- Add VK channel to README.md and README.zh.md channel lists
- Update VK channel documentation with voice message support
- Document ASR and TTS capabilities for VK channel
- Add voice transcription configuration reference
* refactor(web): load channel configs without exposing secret values
- add a dedicated channel config API that returns sanitized config plus
configured secret metadata
- update channel config pages and forms to use secret presence for
placeholders, validation, reset, and save behavior
- refresh the channel settings layout and clean up related i18n copy
- add backend tests for the new channel config endpoint
* fix(config): restore missing strings import
- remove version details from the sidebar footer
- show the current app version as a badge in the config page header
- add a reusable Badge UI component for the new version label
* feat(provider): add Venice AI support and update related documentation
* revert(asr): restore asr files to previous commit
* feat(config): add Venice API base URL and local LM Studio configuration
* fix(config): update Venice API base URL to correct endpoint
* feat(updater): add web self-update endpoint and updater package
* feat(selfupgrade): use GetTestReleaseAPIURL for testing when the URL is empty
* feat(selfupgrade): use only GetTestReleaseAPIURL
* feat(upgrade): make the CLI `$0 update` command work
* fix(ci): fix ci err
* fix(test): fix ci test
* fix(ci): fix ci lint fmt err
* test(updater): add test for updater
* fix(ci): fix ci lint var copy err
* fix(ci): retry ci
* updater: require checksum verification, prefer API digest, verify SHA256, fix zip extraction, update tests
* fix(lint): lint fixed
* fix(lint): lint fixed2
* updater: stream download and verify sha256; add http client timeout and progress
Avoid double-download by streaming asset into temp file while computing SHA256 and verifying against checksum; replace http.Get with shared httpClient (2m timeout) to prevent hangs; add simple stderr progress display; remove unused helpers.
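A minimal sketch of hashing while streaming, assuming the expected checksum arrives as a hex string. The names downloadAndVerify and checkPayload are hypothetical, and an in-memory buffer stands in for the temp file.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// downloadAndVerify copies src to dst once while hashing the same bytes,
// then compares the digest with wantHex. On a mismatch the caller is
// expected to discard whatever was written to dst.
func downloadAndVerify(src io.Reader, dst io.Writer, wantHex string) (int64, error) {
	hasher := sha256.New()
	n, err := io.Copy(io.MultiWriter(dst, hasher), src)
	if err != nil {
		return n, err
	}
	got := hex.EncodeToString(hasher.Sum(nil))
	if !strings.EqualFold(got, wantHex) {
		return n, fmt.Errorf("sha256 mismatch: got %s, want %s", got, wantHex)
	}
	return n, nil
}

// checkPayload is a demo wrapper that feeds an in-memory payload through
// downloadAndVerify.
func checkPayload(payload, wantHex string) error {
	var sink bytes.Buffer
	_, err := downloadAndVerify(strings.NewReader(payload), &sink, wantHex)
	return err
}

func main() {
	const abcSHA256 = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
	fmt.Println("matching payload:", checkPayload("abc", abcSHA256))
	fmt.Println("tampered payload:", checkPayload("abd", abcSHA256))
}
```

In a real updater, src would be the HTTP response body and dst the temp file, so the asset is read from the network only once.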
* feat: add load_image tool for local file vision
* fix: address load_image PR review feedback
- Exclude load_image from sub-agent tools via Unregister after Clone,
since RunToolLoop does not call resolveMediaRefs
- Add ToolRegistry.Unregister() method
- Fix scope collision: use channel:chatID instead of filename
- Add channel/chatID context resolution matching send_file pattern
- Add comment explaining iteration > 1 guard on resolveMediaRefs
- Remove emoji from ForUser for consistency with send_file
- Add load_image_test.go
* feat: enable load_image for subagents via MediaResolver in RunToolLoop
Instead of removing load_image from sub-agent tools (28f69e71), inject a
MediaResolver into the legacy RunToolLoop fallback path so media:// refs
are resolved to base64 before each LLM call — matching the main agent
loop behavior.
- Add MediaResolver field to ToolLoopConfig and call it on iteration > 1
- Add SubagentManager.SetMediaResolver() and wire it through runTask
- Remove ToolRegistry.Unregister() (no longer needed)
- Restore load_image in sub-agent tool set (revert Clone+Unregister)
- Add TestSubagentManager_SetMediaResolver_StoresResolver
* refactor(load_image): remove prompt parameter from tool schema
* test(tools): add success-path test for LoadImageTool
Add TestLoadImage_SuccessPath that creates a real PNG file with valid
magic bytes, calls Execute with WithToolContext, and verifies:
- result.IsError == false
- ToolResult.Media contains a media:// ref
- ToolResult.ForLLM contains the [image: marker
- media ref is resolvable in the store
Add explanatory comment in loop.go for why Media and ArtifactTags
coexist on non-ResponseHandled tool results (e.g. load_image).
* fix: preallocate slice in tests and add ResponseHandled guard in toolloop
Fix prealloc linter failure in load_image_test.go.
Prevent double-resolving media by checking ResponseHandled in toolloop.go.
* Register TTS tool if provider is available
---------
Co-authored-by: Reusu <admin@yumao.name>
Co-authored-by: 美電球 <hoshina@evaz.org>
- add backend APIs for searching and installing registry skills, including origin metadata and concurrency-safe workspace writes
- introduce /agent/hub as the default agent entry with marketplace search and install UI
- refactor the skills and tools pages with filtering, dialogs, detail views, import validation, and updated i18n
- expand backend tests for search, install, import, rollback, and concurrent requests
* fix(api): enhance model availability probing with backoff and caching mechanisms
* fix(lint): resolve gci and predeclared issues in model probe
* fix(api): address copilot review feedback on probe cache key and test stability
* fix(api): reduce probe cache key fragmentation
Addresses reviewer concerns regarding silent message loss by narrowing the
error swallowing logic in EditMessage:
- Excludes context.DeadlineExceeded and context.Canceled from being swallowed,
ensuring local timeouts before transmission still trigger a fallback send.
- Adds an explicit check for the 'message is not modified' error to safely
identify edits that have already landed on Telegram's servers.
- Narrowly targets confirmed post-connect dropouts (e.g., connection reset)
instead of broad network-ish string matching.
- Fixes the missing isPostConnectError definition and required errors import.
* feat(web): clarify model availability and status display
- Rename model availability field from configured to available across backend API and frontend usage
- Keep status as reason classification (configured/unconfigured/unreachable) and show unreachable in UI
- Preserve API key preview even when local service is unreachable
- Update backend tests to assert both availability and status semantics
* fix(web): clarify unreachable model status and wording
- Show unreachable status in model cards instead of API key preview when service is down
- Keep API key placeholder preview in model settings whenever an API key is already saved
- Rename model status wording from configured to available across backend, frontend, and i18n
- Update backend model status tests to match renamed status semantics
* style(web): standardize formatting in handleListModels function
* refactor(web): enforce status field as required to follow backend behavior
- centralize gateway log level resolution and normalization
- propagate debug flags to spawned launcher and gateway processes
- add a log level selector to the logs page
- cover the new behavior with backend and config tests
Treat SystemParts as an alternative representation of message Content
rather than an additive one. This prevents systematic overestimation
of system message tokens which could trigger premature context
pruning or summarization.
- Picks the maximum of Content vs. SystemParts to stay conservative.
- Adds a per-part overhead (20 chars) to account for JSON metadata.
- Streamlines the ReasoningContent counting logic.
Fixes a deficiency where structured blocks for cache-aware adapters
caused overestimated budgets or hidden overflows.
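A minimal sketch of the max-instead-of-sum estimate described above. The `systemPart` type and `estimateSystemChars` function are hypothetical names chosen for illustration; only the 20-character per-part overhead comes from the commit message.

```go
package main

import "fmt"

// systemPart is a hypothetical stand-in for one structured system block.
type systemPart struct{ Text string }

// partOverheadChars mirrors the 20-character per-part allowance
// mentioned in the commit message.
const partOverheadChars = 20

// estimateSystemChars treats Content and SystemParts as two views of the
// same text and returns the larger estimate instead of their sum.
func estimateSystemChars(content string, parts []systemPart) int {
	partsTotal := 0
	for _, p := range parts {
		partsTotal += len(p.Text) + partOverheadChars
	}
	if partsTotal > len(content) {
		return partsTotal
	}
	return len(content)
}

func main() {
	parts := []systemPart{{Text: "You are helpful."}, {Text: "Be brief."}}
	fmt.Println(estimateSystemChars("You are helpful. Be brief.", parts))
}
```

Taking the maximum keeps the estimate conservative without counting the same text twice.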
Load the Pico token from config before validating websocket proxy requests
when the launcher attaches to an existing gateway and the in-memory cache
is still empty.
* feat(provider): add lmstudio vendor and local no-key behavior
* refactor(provider): consolidate protocol metadata and local tests
* fix(provider): sync lmstudio probing and model normalization
* test(web): format lmstudio model status cases for golines
reItalic (_text_) ran after reLink converted [text](url) to <a href>,
injecting <i> tags into URLs containing underscores (e.g. Google Flights
URL-safe base64 in the tfs param). Telegram silently dropped such malformed
<a> tags, causing only 1 of 3 links to appear in messages.
Fix: extract markdown links into placeholders before any formatting runs,
restore them as <a href> last — same pattern used for code blocks.
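The placeholder approach can be sketched as follows. This is an illustrative reduction, not the channel's actual converter: the regular expressions, the `\x00L<n>\x00` placeholder format, and the `convert` function are assumptions, and HTML escaping is omitted.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	reLink   = regexp.MustCompile(`\[([^\]]+)\]\(([^)]+)\)`)
	reItalic = regexp.MustCompile(`_([^_]+)_`)
)

// convert swaps markdown links for opaque placeholders, applies italic
// formatting, then restores the links as <a href> tags last so URLs with
// underscores are never touched by reItalic.
func convert(md string) string {
	var links []string
	out := reLink.ReplaceAllStringFunc(md, func(m string) string {
		sub := reLink.FindStringSubmatch(m)
		links = append(links, fmt.Sprintf(`<a href="%s">%s</a>`, sub[2], sub[1]))
		return fmt.Sprintf("\x00L%d\x00", len(links)-1)
	})
	out = reItalic.ReplaceAllString(out, "<i>${1}</i>")
	for i, l := range links {
		out = strings.ReplaceAll(out, fmt.Sprintf("\x00L%d\x00", i), l)
	}
	return out
}

func main() {
	fmt.Println(convert("_note_ [flights](https://example.com/?tfs=a_b_c)"))
}
```

The placeholder contains no underscores, so the italic pass cannot match inside it.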
* feat(channels): Channel.Send and MediaSender.SendMedia return delivered message IDs
Change Channel.Send signature from (ctx, msg) error to (ctx, msg) ([]string, error)
and MediaSender.SendMedia similarly, so callers can capture platform message IDs
for threading, reactions, and history annotation.
Adapters that return real IDs: Telegram (per-chunk MessageID), Discord (Message.ID),
Slack Send (ts), QQ (sentMsg.ID), Matrix (EventID). Slack SendMedia returns nil
because UploadFileV2 does not expose the posted message timestamp in its response.
All other adapters return nil IDs.
preSend and sendWithRetry in manager.go updated to propagate ([]string, bool).
README examples updated for both English and Chinese docs.
* style: apply golangci-lint fixes (golines)
* docs: fix Send migration guide — restore old error-only signature in before/after example
- Add tour guide component with floating bubbles
- Guide users through: Welcome -> Configure Models -> Start Gateway -> View Docs
- Use localStorage to persist tour state
- Support i18n (Chinese and English)
- Highlight target elements with spotlight mask
- Allow skipping tour at any time
* feat(web): display backend version info in sidebar
* fix(web): improve version parsing and timeout behavior
* refactor(web): remove useless --version fallback
* feat(web): implement version info caching and improve retrieval logic
* fix(web): clarify version timeout rationale
* fix(web): harden gateway version probing and tests
* style(web): split regexp to two lines for lint
- Add `reaction` tool that reacts to a message (defaults to current inbound message via context)
- Extend `message` tool with optional `reply_to_message_id` parameter
- Introduce `WithToolInboundContext` to inject inbound message IDs into tool execution context
- Surface `MessageID` and `ReplyToMessageID` in `processOptions` for tool-surface consumption
Refs #2137
- upgrade Vite, ESLint, React plugin, and related frontend packages to secure versions
- refresh the pnpm lockfile to pull in patched transitive dependencies
- raise the required Node.js version to match the patched toolchain
- update the web README with the new frontend runtime requirement
When the message tool sent to a different chat (e.g., a group), the
agent's final response to the originating chat was incorrectly skipped
because HasSentInRound() was a simple bool that didn't distinguish
targets. Replace with HasSentTo(channel, chatID) that tracks all
send targets per round and only suppresses when the target matches.
Fixes cross-conversation message causing "Processing..." to hang.
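A rough sketch of per-target send tracking. `HasSentTo(channel, chatID)` is named in the commit; the `sendTracker` type, `MarkSent`, and `ResetRound` are hypothetical helpers added for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// sendTracker records every (channel, chatID) target the message tool
// delivered to during the current round.
type sendTracker struct {
	mu   sync.Mutex
	sent map[string]struct{}
}

func newSendTracker() *sendTracker {
	return &sendTracker{sent: make(map[string]struct{})}
}

func targetKey(channel, chatID string) string { return channel + "\x00" + chatID }

// MarkSent notes that a message was delivered to the given target.
func (s *sendTracker) MarkSent(channel, chatID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sent[targetKey(channel, chatID)] = struct{}{}
}

// HasSentTo reports whether this exact target already received a message,
// so only a matching final response is suppressed.
func (s *sendTracker) HasSentTo(channel, chatID string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.sent[targetKey(channel, chatID)]
	return ok
}

// ResetRound clears all recorded targets at the start of a new round.
func (s *sendTracker) ResetRound() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sent = make(map[string]struct{})
}

func main() {
	tr := newSendTracker()
	tr.MarkSent("telegram", "group-1")
	fmt.Println(tr.HasSentTo("telegram", "group-1"), tr.HasSentTo("telegram", "dm-7"))
}
```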
- delegate root launcher builds to the web Makefile
- add dedicated frontend and dev picoclaw build targets
- document the WebUI architecture, runtime behavior, and build workflow
* docs: document gateway.log_level in all READMEs and i18n configuration docs
Add gateway log level note to Channels section in all 9 READMEs and
add Gateway Log Level section to zh/fr/ja/pt-br/vi configuration docs.
- gateway.log_level (default: fatal) controls log verbosity
- Supported values: debug, info, warn, error, fatal
- Can also be set via PICOCLAW_LOG_LEVEL env var
- English docs/configuration.md already had this section
* fix(docs): correct gateway.log_level default from fatal to warn
DefaultConfig() sets Gateway.LogLevel to "warn", not "fatal".
Update all READMEs and i18n configuration docs to reflect the
actual default value.
---------
Co-authored-by: BeaconCat <BeaconCat@users.noreply.github.com>
The agent path now publishes to outbound bus directly (since #2100),
making the deliver=true direct-to-bus shortcut and the directive type
prompt wrapping redundant. All cron jobs now uniformly route through
the agent. This is an intentional behavior change: old jobs with
deliver=true will execute through the agent instead of bypassing it.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix(cron): publish agent response to outbound bus for cron-triggered jobs
When a cron job triggers agent execution via ProcessDirectWithChannel,
the agent response was silently discarded — the code assumed AgentLoop
would auto-publish it, but SendResponse is false on this path.
Delegate to PublishResponseIfNeeded (exported from AgentLoop) so the
response reaches the originating channel (e.g. Telegram) only when the
message tool did not already deliver content in the same round.
Also adds a "directive" message type to CronPayload, allowing cron jobs
to instruct the agent to execute a task rather than echo static text.
* fix(cron): add type validation and directive test coverage
Address reviewer blocking feedback:
1. Server-side whitelist for `type` parameter — the `enum` in
Parameters() is only an LLM schema hint; any string was persisted.
Now `addJob` rejects values other than "message" and "directive".
2. Comprehensive test coverage for the directive code path:
- directive adds prompt prefix to ProcessDirectWithChannel
- deliver=true + directive routes through agent (not direct publish)
- directive prompt content, sessionKey, channel, chatID are correct
- invalid type is rejected; valid types ("", "message", "directive") pass
- deliver=true message type goes directly to bus (regression)
- agent error path does not trigger publish (regression)
Also merge the two UpdateJob calls in addJob into one to avoid
redundant disk I/O (non-blocking suggestion from review).
* fix(cron): remove omitempty from CronPayload.Type for consistent JSON
Empty string and "message" are semantically equivalent defaults;
always serializing the field avoids asymmetric JSON output.
* test(cron): remove redundant test, strengthen error path coverage
- Remove ExecuteJobDirectivePassesCorrectContent: its assertions on
sessionKey/channel/chatID duplicate ExecuteJobPublishesAgentResponse;
its prompt check duplicates DirectiveAddsPromptPrefix.
- Strengthen DirectiveAddsPromptPrefix with exact prompt match and
publish response assertion.
- Fix ReturnsErrorWithoutPublish: set non-empty stub response so the
test verifies the error branch early-return, not the response==""
guard.
* fix(ci): satisfy golines and gosmopolitan in cron code
Add token-based authentication for the Launcher's embedded Web Dashboard.
- Ephemeral token generated in-memory each run (or via PICOCLAW_LAUNCHER_TOKEN env var)
- HMAC-SHA256 session cookie (HttpOnly, SameSite=Lax, Secure when HTTPS)
- Bearer token support for API/script access
- Rate limiting on login (10 attempts/IP/min)
- Referrer-Policy: no-referrer on all responses
- POST-only logout with JSON content-type (CSRF-safe)
- System tray "Copy dashboard token" action
- Login page shows contextual help (console/tray/log file path)
- Path traversal protection via path.Clean
- X-Forwarded-Host/Port/Proto support for reverse proxy deployments
- Full i18n support (English, Chinese)
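For illustration, an HMAC-SHA256 signed session value could look like the sketch below. The `payload.signature` cookie layout and the `signSession`/`verifySession` names are assumptions; the launcher's real cookie format is not shown in this log.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// signSession appends a hex HMAC-SHA256 tag to the payload.
func signSession(key []byte, payload string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(payload))
	return payload + "." + hex.EncodeToString(mac.Sum(nil))
}

// verifySession recomputes the tag and compares it in constant time.
func verifySession(key []byte, cookie string) (string, bool) {
	i := strings.LastIndex(cookie, ".")
	if i < 0 {
		return "", false
	}
	payload, sig := cookie[:i], cookie[i+1:]
	got, err := hex.DecodeString(sig)
	if err != nil {
		return "", false
	}
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(payload))
	if !hmac.Equal(got, mac.Sum(nil)) {
		return "", false
	}
	return payload, true
}

func main() {
	key := []byte("ephemeral-token-for-this-run")
	cookie := signSession(key, "session-1")
	payload, ok := verifySession(key, cookie)
	fmt.Println(payload, ok)
}
```

Because the key is regenerated each run (unless supplied via the env var), cookies from a previous run fail verification.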
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Feishu returns 231001 when emoji_type is empty. Config slices like
["", "Pin"] could randomly select an empty string; filter and
trim entries and fall back to Pin when none remain.
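A small sketch of the filter-then-fallback step, assuming hypothetical helper names `cleanEmojiTypes` and `pickEmojiType`:

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// cleanEmojiTypes trims each entry, drops empty ones, and falls back to
// "Pin" when nothing usable remains.
func cleanEmojiTypes(candidates []string) []string {
	var out []string
	for _, c := range candidates {
		if c = strings.TrimSpace(c); c != "" {
			out = append(out, c)
		}
	}
	if len(out) == 0 {
		return []string{"Pin"}
	}
	return out
}

// pickEmojiType chooses randomly among the cleaned entries, so an empty
// emoji_type can never be selected.
func pickEmojiType(candidates []string) string {
	cleaned := cleanEmojiTypes(candidates)
	return cleaned[rand.Intn(len(cleaned))]
}

func main() {
	fmt.Println(pickEmojiType([]string{"", "Pin"}))
}
```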
Made-with: Cursor
Unified restart-required detection and notification mechanism so that model, tool, and configuration changes all follow the same signature-based comparison logic.
Migrate Azure OpenAI provider from legacy Chat Completions API to the OpenAI Responses API.
- Switch API endpoint from `/openai/deployments/{deployment}/chat/completions` to `/openai/v1/responses`
- Change auth header from `Api-Key` to `Authorization: Bearer`
- Use `responses.ResponseNewParams` SDK types for request construction
- Extract shared Responses API utilities into `openai_responses_common` package
- Deduplicate 178 lines from codex_provider.go by reusing shared package
- Add 593 lines of comprehensive test coverage for the shared package
Closes #2111
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- loop_test.go: replace undefined WithSecurity/SecurityConfig/ModelSecurityEntry
with direct APIKeys field using SimpleSecureStrings()
- dingtalk_test.go: use ClientSecret.String() and ClientSecret.Set()
instead of non-existent ClientSecret() and SetClientSecret() methods
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add README.my.md (full Malay translation from English, including
macOS guide, MiMo provider, unified WeCom row, all sections)
- Add docs/my/ (chat-apps, configuration, debug, docker, spawn-tasks,
troubleshooting) from upstream PR #1770
- Add [Malay](README.my.md) language link to all 8 existing READMEs
- Add v0.2.4 news entry to all 9 READMEs (en/zh/fr/ja/pt-br/vi/id/it/my)
- Move 2026-02-26 20K Stars entry into Earlier news in all READMEs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(dingtalk): honor @mention flag in mention-only groups
* fix(dingtalk): strip leading mentions in group payloads
---------
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
- Rewrite docs/channels/wecom/README.md and README.zh.md with unified
3-option setup guide (Web UI QR / CLI QR / manual config), full config
table with defaults and env vars, runtime behavior details, and
migration notes from legacy wecom_bot/wecom_app/wecom_aibot
- Add assets/wecom-qr-binding.jpg screenshot for Web UI QR binding flow
- Remove obsolete docs/channels/wecom/wecom_bot/, wecom_app/, wecom_aibot/
subdirectories (18 files, all language variants)
- Update Channels table in all 8 READMEs: replace 3 legacy WeCom rows
with single unified WeCom row; zh README links to README.zh.md,
others link to README.md
- Add Xiaomi MiMo (mimo/) to Providers table in all 8 READMEs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add comprehensive Malay (MY) documentation for PicoClaw configuration, chat applications, debugging, Docker setup, async tasks, and troubleshooting:
- Introduce a Malay chat applications guide detailing setup for Telegram, Discord, WhatsApp, and others.
- Create a Malay configuration guide outlining environment variables, workspace structure, and security settings.
- Add a Malay debugging section to help users troubleshoot and understand agent interactions.
- Provide a Malay Docker guide for easy deployment using Docker Compose.
- Document, in Malay, the use of spawn for asynchronous tasks and how to configure heartbeat settings.
- Include a Malay troubleshooting section for common model-related errors.
* docs: add Malay language support to documentation
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Add a collapsible macOS security warning guide under the WebUI Launcher
section in all 8 README languages (en/zh/fr/ja/pt-br/vi/id/it).
- New assets: macos-gatekeeper-warning.jpg, macos-gatekeeper-allow.jpg
- Updated asset: launcher-tui.jpg
- Two-step guide: shows Gatekeeper warning screenshot, then
Privacy & Security → Open Anyway flow
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: update WeChat QR code and TUI launcher screenshot
* docs: convert launcher-tui.jpg from PNG to proper JPEG format
---------
Co-authored-by: BeaconCat <BeaconCat@users.noreply.github.com>
## Summary
- Add zai-coding to Providers table
- Add Z.AI Coding Plan to All Supported Vendors table
- Add Z.AI Coding Plan configuration example with troubleshooting note
* Add multi-message sending via split marker
* Add marker and length split integration tests
Tests that SplitByMarker and SplitMessage work together correctly, and
that code block boundaries are preserved during marker splitting.
* Simplify message chunking logic in channel worker
Extract splitByLength helper function and remove goto-based control flow.
The logic now flows more naturally: try marker splitting first, then fall
back to length-based splitting.
* Update multi-message output instructions in agent context
* Add split_on_marker to config defaults
* Add split_on_marker config option
* Rename 'Multi-Message Sending' setting to 'Chatty Mode'
* Add SplitOnMarker config option
- Upgrade github.com/creack/pty from v1.1.9 to v1.1.24
- Move github.com/mattn/go-sqlite3 to indirect dependency
- Move rsc.io/qr from indirect to direct dependency
* feat: add Xiaomi MiMo provider support
- Add 'mimo' protocol prefix support in factory_provider.go
- Add default API base URL for MiMo: https://api.xiaomimimo.com/v1
- Update provider-label.ts to include Xiaomi MiMo label
- Add MiMo to provider tables in both English and Chinese documentation
- Add comprehensive unit tests for MiMo provider
MiMo API is compatible with OpenAI API format, making it easy to integrate
with the existing HTTPProvider infrastructure.
Users can now use MiMo by configuring:
{
"model_name": "mimo",
"model": "mimo/mimo-v2-pro",
"api_key": "your-mimo-api-key"
}
* remove sensitive files
* Add .security.yml and onboard to .gitignore
Add a prominent reference to security_configuration.md at the beginning
of the configuration guide. This helps new users quickly find
information about storing API keys in .security.yml.
Fixes #1986
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
GoReleaser was picking nightly tags as the "previous tag" when
generating changelogs, causing release changelogs to be incomplete.
Add git.ignore_tags to skip nightly tags.
Allow PlaceholderConfig.Text to accept either a single string or an
array of strings, from which one is randomly selected at runtime.
This maintains backward compatibility with existing single-string configs
while enabling random placeholder selection.
Changes:
- Modify PlaceholderConfig.Text type from string to FlexibleStringSlice
- Add GetRandomText() helper method for random selection
- Update SendPlaceholder in all channels to use GetRandomText()
- Update config.example.json with array placeholder examples
- Update Matrix channel documentation
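One way a string-or-array JSON field can be decoded is sketched below. `FlexibleStringSlice` is the type named in the commit, but this `UnmarshalJSON` body, the `randomText` function, and the `parseTexts` helper are illustrative assumptions rather than the project's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math/rand"
)

// FlexibleStringSlice accepts either "text" or ["a", "b"] in JSON.
type FlexibleStringSlice []string

func (f *FlexibleStringSlice) UnmarshalJSON(data []byte) error {
	if string(data) == "null" {
		*f = nil
		return nil
	}
	var single string
	if err := json.Unmarshal(data, &single); err == nil {
		*f = FlexibleStringSlice{single} // legacy single-string config
		return nil
	}
	var many []string
	if err := json.Unmarshal(data, &many); err != nil {
		return err
	}
	*f = many
	return nil
}

// randomText returns one entry at random, or "" when the slice is empty.
func randomText(f FlexibleStringSlice) string {
	if len(f) == 0 {
		return ""
	}
	return f[rand.Intn(len(f))]
}

// parseTexts is a convenience wrapper used only in this example.
func parseTexts(raw string) (FlexibleStringSlice, error) {
	var f FlexibleStringSlice
	err := json.Unmarshal([]byte(raw), &f)
	return f, err
}

func main() {
	f, _ := parseTexts(`["Thinking...", "One moment..."]`)
	fmt.Println(randomText(f))
}
```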
Exclude the Matrix gateway shim from freebsd/arm builds because
modernc.org/libc currently fails to compile on that target.
Document the upstream 32-bit FreeBSD codegen mismatch as well.
- add backend WeCom QR flow endpoints and in-memory flow state management
- add frontend WeCom binding UI with QR polling and channel enable toggle
- update channel config behavior and i18n strings for WeCom and WeChat
- apply minor formatting cleanup in model-related components
Separate embedded tray icons into platform-specific files, rename the
no-cgo systray stub for consistency, and add the app version to the
launcher startup log.
* Add extraBody field to model configuration forms
This adds a new field allowing users to specify additional JSON fields
to inject into the request body when configuring models.
* Handle ExtraBody clearing when frontend sends empty object
The backend now interprets an empty object sent from the frontend as a
signal to clear the ExtraBody field, while nil/undefined preserves the
existing value. Frontend changed to send {} instead of undefined when
the field is empty.
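The nil-versus-empty distinction maps naturally onto Go's JSON decoding, as this sketch shows. The `modelPatch` struct, the `extra_body` JSON key on the request, and the `applyExtraBody`/`decodePatch` helpers are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// modelPatch is a hypothetical request body for a model update.
type modelPatch struct {
	ExtraBody map[string]any `json:"extra_body"`
}

// applyExtraBody keeps the existing value when the field is absent or
// null, clears it when the client sends {}, and replaces it otherwise.
func applyExtraBody(existing map[string]any, patch modelPatch) map[string]any {
	switch {
	case patch.ExtraBody == nil:
		return existing
	case len(patch.ExtraBody) == 0:
		return nil
	default:
		return patch.ExtraBody
	}
}

// decodePatch is a convenience wrapper used only in this example.
func decodePatch(raw string) (modelPatch, error) {
	var p modelPatch
	err := json.Unmarshal([]byte(raw), &p)
	return p, err
}

func main() {
	existing := map[string]any{"reasoning_split": true}
	p, _ := decodePatch(`{"extra_body":{}}`)
	fmt.Println(applyExtraBody(existing, p) == nil)
}
```

`encoding/json` leaves the map nil when the key is missing but allocates an empty, non-nil map for `{}`, which is what makes the two cases distinguishable.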
* Add command pattern testing endpoint and UI tool
Adds a new API endpoint `/api/config/test-command-patterns` that tests a
command against configured whitelist and blacklist patterns, along with
a frontend UI component to interactively test patterns.
* Only process deny patterns when enableDenyPatterns is true
Virtual models generated from multi-key expansion are now marked and
filtered during config persistence. Virtual models display with a badge
in the UI and cannot be set as default.
* add handler for empty message
* fix undefined: time
* fix linter
* update test to remove 100ms wait time since the handleMessage publishes synchronously
* perf(pico): implement O(1) session lookup for pico connections
- Replace `sync.Map` with `connections` and `sessionConnections`.
- Add `addConnection`, `removeConnection`, `sessionConnectionsSnapshot`, and `takeAllConnections` with `connsMu` for concurrency.
- `broadcastToSession` now dispatches directly to `sessionConnections`.
- Add `newUniqueConnID` to avoid UUID collision/overwrites.
- Ensure `Stop` and `readLoop` use the new helpers for safe cleanup and correct `connCount` updates.
* refactor(pico): replace addConnection with createAndAddConnection for atomic connID generation
* refactor(pico): clear all connections in a single pass to improve performance
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix(pico): keep connCount consistent with connection indexes
* refactor(pico): make connCount a regular int guarded by connsMu
* fix(pico): enforce MaxConnections atomically on registration
* fix(pico): use temporary over-limit error and remove conn counter
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Add `crypto_database_path` and `crypto_passphrase` configuration
- Integrate cryptohelper for decrypting `m.room.encrypted` events
- Handle both plaintext and encrypted messages in `handleMessageEvent`
- Enable `goolm` build tag for libsignal crypto support
Fixes #1840.
- persist Weixin bindings, enable the channel automatically, and try to restart the gateway
- refresh frontend channel and gateway state after successful binding
- harden QR polling state handling and update related channel UI behavior
- localize sidebar channel priority, add Weixin icon support, and add backend test coverage
Validate tool call arguments against each tool's Parameters() JSON Schema
in ExecuteWithContext() before calling Execute(). This prevents type
confusion, argument injection, and missing-field errors from reaching tools.
Validates: required fields, type matching (string/integer/number/boolean/
array/object), enum membership, nested objects (recursive), array element
types. Rejects unexpected extra properties unless additionalProperties is
set to true (for MCP tool compatibility).
Returns ToolResult{IsError: true} on failure so the LLM can self-correct.
Ref: Security Hardening > Tool abuse prevention via strict parameter validation
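A reduced sketch of this kind of pre-execution check, covering only required fields, primitive type matching, and the additionalProperties rule. Enum, nested-object, and array-element validation are left out for brevity. The `validateArgs` and `checkType` names, and the assumption that `required` is a `[]string`, are illustrative.

```go
package main

import (
	"fmt"
	"math"
)

// checkType reports whether v matches a JSON Schema primitive type,
// assuming v came from encoding/json (numbers decode as float64).
func checkType(want string, v any) bool {
	switch want {
	case "string":
		_, ok := v.(string)
		return ok
	case "boolean":
		_, ok := v.(bool)
		return ok
	case "number":
		_, ok := v.(float64)
		return ok
	case "integer":
		f, ok := v.(float64)
		return ok && f == math.Trunc(f)
	case "array":
		_, ok := v.([]any)
		return ok
	case "object":
		_, ok := v.(map[string]any)
		return ok
	}
	return true // unknown or missing type: accept
}

// validateArgs checks args against a schema before the tool runs.
func validateArgs(schema, args map[string]any) error {
	props, _ := schema["properties"].(map[string]any)
	required, _ := schema["required"].([]string)
	for _, name := range required {
		if _, ok := args[name]; !ok {
			return fmt.Errorf("missing required field %q", name)
		}
	}
	allowExtra, _ := schema["additionalProperties"].(bool)
	for name, val := range args {
		prop, known := props[name].(map[string]any)
		if !known {
			if allowExtra {
				continue
			}
			return fmt.Errorf("unexpected property %q", name)
		}
		want, _ := prop["type"].(string)
		if !checkType(want, val) {
			return fmt.Errorf("field %q: expected %s", name, want)
		}
	}
	return nil
}

func main() {
	schema := map[string]any{
		"properties": map[string]any{"path": map[string]any{"type": "string"}},
		"required":   []string{"path"},
	}
	fmt.Println(validateArgs(schema, map[string]any{"path": 42.0}))
}
```

Returning the error text to the model, rather than aborting, is what lets it correct the call on the next turn.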
Export EnsurePicoChannel and reuse it during launcher and gateway startup
so the Pico channel is initialized earlier with a generated token when
needed.
Also extend backend tests to cover startup-time Pico setup behavior and
keep the setup path idempotent.
Make POST /api/models capture the request's api_key and store it via
ModelConfig.SetAPIKey before saving config, so newly added models keep
their credentials in the security config.
Add a backend API test covering model creation with api_key persistence.
Normalize missing security sections when attaching, loading, and saving
security config so existing config files without `.security.yml` can still
be updated safely. This fixes Pico channel setup for legacy/existing configs
and adds coverage for the missing security file path and unexported JSON
field behavior.
- ensure at least 40% of the characters are masked for secrets of length 4 or more
- secrets with length <= 6 now show first and last char with mask
- secrets with length <= 12 now show first two and last two chars
- longer secrets show 3 prefix and 4 suffix
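The masking tiers above can be sketched as a single function. `maskSecret` is a hypothetical name, and fully masking secrets shorter than 4 characters is an assumption, since the commit does not state that case.

```go
package main

import (
	"fmt"
	"strings"
)

// maskSecret reveals a short prefix and suffix and masks the rest.
// Tiers: <=6 chars show 1+1, <=12 show 2+2, longer show 3+4.
func maskSecret(s string) string {
	r := []rune(s)
	n := len(r)
	var prefix, suffix int
	switch {
	case n < 4:
		return strings.Repeat("*", n) // assumption: too short to reveal anything
	case n <= 6:
		prefix, suffix = 1, 1
	case n <= 12:
		prefix, suffix = 2, 2
	default:
		prefix, suffix = 3, 4
	}
	return string(r[:prefix]) + strings.Repeat("*", n-prefix-suffix) + string(r[n-suffix:])
}

func main() {
	for _, s := range []string{"abcd", "abcdefgh", "sk-1234567890abcdef"} {
		fmt.Println(maskSecret(s))
	}
}
```

With these tiers the least-masked cases are lengths 7 (3 of 7 masked) and 13 (6 of 13 masked), both above the 40% floor.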
* feat: add ElevenLabs Scribe STT transcriber and Telegram SendVoice support
Add ElevenLabsTranscriber as an alternative speech-to-text provider using
the ElevenLabs Scribe API (scribe_v1). This enables voice message
transcription for users who already have an ElevenLabs API key, without
requiring a separate Groq account.
Changes:
- Add ElevenLabsTranscriber implementing the Transcriber interface
- Update DetectTranscriber to check providers.elevenlabs.api_key first,
falling back to Groq for backward compatibility
- Add ElevenLabs to ProvidersConfig
- Add "voice" media type for OGG files with "voice" in filename
- Add SendVoice support in Telegram channel for voice bubble messages
- Add comprehensive tests for ElevenLabs transcriber
Configuration:
"providers": {
"elevenlabs": {
"api_key": "sk_your_key_here"
}
}
Closes #1503 (partial)
* fix: move voice-bubble detection into Telegram channel to avoid regression in other channels
Address review feedback: keep inferMediaType returning "audio" for all
OGG files. Voice-bubble detection (SendVoice vs SendAudio) is now done
inside the Telegram channel based on filename, so other channels that
map "audio" explicitly are unaffected.
* fix: align VoiceConfig struct tags to pass golines formatter
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(agent): use ModelName in loop test added by upstream
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Add support for AWS Bedrock as an LLM provider using the Converse API.
The implementation is behind a build tag (-tags bedrock) to keep the
default binary size small.
Features:
- AWS SDK v2 with automatic credential chain (env vars, profiles, IAM roles)
- Converse API for unified access to Claude, Llama, Mistral models
- Tool/function calling support with proper document handling
- Image support with base64 decoding and size limits
- Request timeout configuration
- Region validation and endpoint resolution for all AWS partitions
Usage:
go build -tags bedrock
model: bedrock/us.anthropic.claude-sonnet-4-20250514-v1:0
api_base: us-east-1 (or full endpoint URL)
LLM
Prevent the LLM from seeing its own credentials (API keys, tokens, secrets)
by filtering sensitive values from tool call results before sending to
the model. Values are collected from .security.yml and replaced with
[FILTERED] using an efficient strings.Replacer (O(n+m)).
- Add FilterSensitiveData and FilterMinLength to ToolsConfig
- Implement SensitiveDataReplacer() with sync.Once caching in SecurityConfig
- Use reflection to collect all sensitive values (Model API keys, channel
tokens, web tool API keys, skills tokens)
- Apply filtering in agent loop at 4 tool result locations
- Add comprehensive tests covering all token types
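A simplified sketch of the replacement step using `strings.Replacer`. The `newSensitiveReplacer` helper and its parameters are hypothetical; reflection-based collection and `sync.Once` caching are not shown.

```go
package main

import (
	"fmt"
	"strings"
)

// newSensitiveReplacer builds a replacer that swaps every collected
// secret of at least minLen bytes for the literal "[FILTERED]".
func newSensitiveReplacer(values []string, minLen int) *strings.Replacer {
	var pairs []string
	seen := make(map[string]bool)
	for _, v := range values {
		// Skip empty, too-short, and duplicate values.
		if v == "" || len(v) < minLen || seen[v] {
			continue
		}
		seen[v] = true
		pairs = append(pairs, v, "[FILTERED]")
	}
	return strings.NewReplacer(pairs...)
}

func main() {
	r := newSensitiveReplacer([]string{"sk-secret-123", "ab"}, 8)
	fmt.Println(r.Replace("curl -H 'Authorization: Bearer sk-secret-123'"))
}
```

A minimum length avoids replacing short values that are likely to occur by coincidence in ordinary tool output.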
- Move SecurityCopyFrom() before validateConfig() in PUT and PATCH handlers
- Make SecurityCopyFrom() call applySecurityConfig() to populate private fields
- Add tests for config save with security-only channel tokens
Without this fix, saving config via the web UI fails with 'channels.pico.token
is required' (and similar for Telegram/Discord) when tokens are stored in
.security.yml, because the validation ran before security credentials were
copied to the config struct.
* feat(chat): render mixed Markdown+HTML in assistant messages using rehype-raw + rehype-sanitize (safe default)
* build: remove irrelevant changes of pnpm-lock.yaml
* feat(skills): enable rendering of Markdown with HTML in skill details using rehype-raw and rehype-sanitize
* fix(agent): use ModelName in loop tests
Anthropic API returns 400 when multiple tool_result blocks share the same
tool_use_id, or when consecutive tool results are sent as separate user
messages. This fix:
1. Adds ToolCallID deduplication in sanitizeHistoryForProvider (context.go)
to drop duplicate tool results before sending to any provider.
2. Merges consecutive tool result messages into a single user message with
multiple tool_result content blocks in Anthropic's buildRequestBody,
for both "user" (with ToolCallID) and "tool" role messages.
3. Adds tests for both behaviors.
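The deduplication step in item 1 might look roughly like this. The `message` struct and `dedupeToolResults` function are stand-ins for illustration, and keeping the first occurrence of each ID is an assumption.

```go
package main

import "fmt"

// message is a hypothetical, trimmed-down history entry.
type message struct {
	Role       string
	ToolCallID string
	Content    string
}

// dedupeToolResults drops any later message that repeats a ToolCallID
// already seen, leaving all other messages in their original order.
func dedupeToolResults(history []message) []message {
	seen := make(map[string]bool)
	out := make([]message, 0, len(history))
	for _, m := range history {
		if m.ToolCallID != "" {
			if seen[m.ToolCallID] {
				continue
			}
			seen[m.ToolCallID] = true
		}
		out = append(out, m)
	}
	return out
}

func main() {
	history := []message{
		{Role: "tool", ToolCallID: "call_1", Content: "ok"},
		{Role: "tool", ToolCallID: "call_1", Content: "ok (duplicate)"},
		{Role: "user", Content: "thanks"},
	}
	fmt.Println(len(dedupeToolResults(history)))
}
```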
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Allow configuring provider-specific fields like reasoning_split for minimax via
the model config's extra_body map. These fields are merged into the request
body last, giving them precedence over default values.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(media): track cleanup ownership per path
Add explicit cleanup policy handling to MediaStore and count refs by path before deleting the underlying file. This prevents cleanup from removing shared files until the final ref is gone.
Refs #1886
* fix(tools): keep send_file refs forget-only
Mark send_file media registrations as forget-only so cleanup drops the ref without deleting the original workspace file.
Refs #1886
* fix(channels): declare managed media cleanup policy
Explicitly mark downloaded and managed channel media as delete-on-cleanup so media ownership is visible at each registration site.
Refs #1886
- Add hardware-banner.jpg, launcher-webui.jpg, launcher-tui.jpg (lost in
previous force push)
- Add io.LimitReader (1MB) to BaiduSearchProvider response body read
- Add no-results fallback and "Results for: ... (via Baidu Search)" header
- Add api_keys field to Brave and Perplexity tables in fr/ja/pt-br/vi
tools_configuration.md
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
golangci-lint v2.10.1 treats golines as a formatter. Running
`golangci-lint fmt` normalizes struct tag alignment in GLMSearchConfig,
WebToolsConfig, and MCPConfig — removing manual padding that golines
flagged as improperly formatted.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Run golines then gci to reach a stable state that satisfies both linters.
BaiduSearchConfig field caused gofumpt to re-align the struct, shifting
ToolConfig tag spacing and triggering golines on each subsequent fix.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove extra alignment space on ToolConfig field introduced by gofumpt
when BaiduSearchConfig was added, keeping all lines under 120 chars.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add BaiduSearchConfig struct and register in WebToolsConfig/defaults
- Insert Baidu Search in priority chain: DuckDuckGo > Baidu > GLM Search
- Use perplexityTimeout (30s) — Qianfan is LLM-based
- Fix response parsing: use references[] field per API spec
- Add baidu_search block to config.example.json
docs: sync configuration.md and README Documentation table across all languages
- Complete truncated configuration.md for fr/ja/pt-br/vi/zh: add Spawn
async flow diagram, Providers table, Model Configuration (all vendors,
examples, load balancing, migration), Provider Architecture, Scheduled
Tasks, and Advanced Topics links
- Add Hooks/Steering/SubTurn entries to Documentation table in all 8
READMEs (en/zh/fr/id/it/ja/pt-br/vi), ordered before Troubleshooting
- Add Baidu Search row to web search table in all 8 READMEs and
tools_configuration.md (en + 5 i18n); zh README reorders search
engines with China-friendly options first
- Add Matrix channel docs translations (fr/ja/pt-br/vi)
- Add Weixin channel to chat-apps.md and all README Channels tables
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two gateway tests were flaky due to race conditions:
- TestGatewayStatusReturnsRestartingDuringRestartGap
- TestGatewayRestartReturnsErrorStatusWhenReplacementFailsToStart
The handleGatewayStatus function calls getGatewayHealth which can
override the test's expected status. By mocking gatewayHealthGet
to return an error, the tests now reliably verify the expected
status values.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- spawnSubTurn: set result=nil on panic instead of constructing a non-nil ToolResult
- HardAbort: roll back session history to initialHistoryLength after Finish()
- drainBusToSteering: switch to non-blocking reads after first message so function
returns promptly when the inbound channel is empty
- remove obsolete documentation files
- Add `AudioModelTranscriber` for model-based audio transcription via LLM providers
- Support selecting a transcription model with `voice.model_name` in config
- Keep Groq transcription as a fallback and move it into dedicated files with focused tests
- Serialize `data:audio/...` media as input_audio for OpenAI-compatible providers
- Improve transcription logging by rendering error fields as strings
- Add coverage for transcriber detection, audio-model behavior, provider audio serialization, and Groq transcription
Fixes #1890.
- Add Tailwind `whitespace-pre-wrap` to the user message bubble of web chat so spaces and blank lines can be rendered correctly.
- Update chat input placeholders in en.json and zh.json to show Enter vs Shift+Enter.
Allow configuring provider-specific fields like reasoning_split for minimax via
the model config's extra_body map. These fields are merged into the request
body last, giving them precedence over default values.
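A rough sketch of the "merged last" behaviour described above, assuming the request body is built as a map before serialization. `mergeExtraBody` and the sample values are hypothetical; only `extra_body` and `reasoning_split` come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergeExtraBody (hypothetical helper) copies the default request fields
// first and the extra_body fields last, so extra_body wins on conflicts.
func mergeExtraBody(body, extra map[string]any) map[string]any {
	merged := make(map[string]any, len(body)+len(extra))
	for k, v := range body {
		merged[k] = v
	}
	for k, v := range extra {
		merged[k] = v // applied last: overrides defaults with the same key
	}
	return merged
}

func main() {
	body := map[string]any{"model": "example-model", "temperature": 0.7}
	extra := map[string]any{"reasoning_split": true, "temperature": 0.2}
	out, _ := json.Marshal(mergeExtraBody(body, extra))
	fmt.Println(string(out))
}
```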
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add agent-browser skill to the default workspace with complete CLI
reference for browser automation via Chrome/Chromium CDP. The skill
includes a runtime guard that checks for the binary before use.
Add Dockerfile.heavy — a batteries-included container image with:
- Node.js 24 + npm
- Python 3 + pip + uv
- Chromium + Playwright (for agent-browser)
- agent-browser CLI pre-installed
- Non-root picoclaw user (UID/GID 1000)
- Default workspace with all skills
- Persistent workspace volume
This complements the existing minimal Dockerfile and Dockerfile.full
for deployments that need browser automation and rich tool support.
On Android, /etc/resolv.conf does not exist, causing Go's default DNS
resolution to fail. This adds an init() hook that:
1. Detects missing /etc/resolv.conf (Android environment)
2. Configures a custom resolver with PreferGo: true
3. Supports multiple DNS servers via PICOCLAW_DNS_SERVER env var
- Semicolon-separated: "8.8.8.8:53;1.1.1.1:53"
- Single server also works: "8.8.8.8"
- Auto-appends :53 if port omitted
4. Round-robin rotation across configured servers
5. Defaults to Google DNS + Cloudflare DNS
Also patches http.DefaultTransport to use the custom resolver.
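The server parsing and round-robin rotation listed above might look roughly like this. It is a sketch under assumptions: the helper names are hypothetical, the exact default addresses are guessed from "Google DNS + Cloudflare DNS", and the http.DefaultTransport patch is not shown.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"strings"
	"sync/atomic"
	"time"
)

// parseDNSServers splits a semicolon-separated list and appends :53
// when an entry has no port. The defaults are assumed addresses.
func parseDNSServers(raw string) []string {
	var servers []string
	for _, s := range strings.Split(raw, ";") {
		s = strings.TrimSpace(s)
		if s == "" {
			continue
		}
		if _, _, err := net.SplitHostPort(s); err != nil {
			s = net.JoinHostPort(s, "53") // no port given
		}
		servers = append(servers, s)
	}
	if len(servers) == 0 {
		servers = []string{"8.8.8.8:53", "1.1.1.1:53"}
	}
	return servers
}

// newRoundRobinResolver returns a pure-Go resolver that ignores the
// system-provided address and rotates across the configured servers.
func newRoundRobinResolver(servers []string) *net.Resolver {
	var next atomic.Uint64
	return &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			idx := (next.Add(1) - 1) % uint64(len(servers))
			d := net.Dialer{Timeout: 5 * time.Second}
			return d.DialContext(ctx, network, servers[idx])
		},
	}
}

func main() {
	servers := parseDNSServers(os.Getenv("PICOCLAW_DNS_SERVER"))
	fmt.Println(servers)
	_ = newRoundRobinResolver(servers) // no lookup is performed in this demo
}
```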
* feat(telegram): stream LLM responses in real-time via sendMessageDraft
Implements real-time token streaming to Telegram using the sendMessageDraft
API (telego v1.6.0). Instead of showing only a "Thinking..." placeholder
until the full response arrives, users now see partial LLM output appear
in the chat as it's generated.
The streaming pipeline threads through all layers:
- StreamingProvider interface (providers/types.go): opt-in ChatStream()
method that receives an onChunk callback with accumulated text
- OpenAI-compatible SSE streaming (openai_compat/provider.go): parses
SSE events with stream:true, handles text deltas and tool call assembly
- Anthropic native streaming (anthropic/provider.go): uses SDK's
NewStreaming() for direct Anthropic API connections
- HTTPProvider delegation (http_provider.go): delegates ChatStream to
the underlying openai_compat provider
- StreamingCapable + Streamer interfaces (channels/interfaces.go):
opt-in channel capability like TypingCapable/PlaceholderCapable
- Telegram streamer (telegram/telegram.go): BeginStream returns a
telegramStreamer that throttles sendMessageDraft calls (3s/200 chars)
with graceful degradation on API errors
- StreamDelegate bridge (bus/bus.go): decouples agent loop from channel
manager without tight imports
- Manager integration (manager.go): implements StreamDelegate, tracks
streamActive state, coordinates with placeholder editing
- Agent loop (loop.go): uses ChatStream when both provider and channel
support streaming, cancels stream on tool calls, skips PublishOutbound
when Finalize already delivered the message
Graceful degradation:
- Bots without forum/topics mode: first sendMessageDraft error sets
failed=true, subsequent Updates become no-ops, Finalize still delivers
via SendMessage. User sees normal non-streaming behavior.
- Non-streaming providers: type assertion fails, falls back to Chat()
- Config opt-out: streaming.enabled (default true) in telegram config
Closes #1098
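The opt-in streaming interface and the type-assertion fallback described above can be illustrated with a small sketch. The method signatures here are simplified assumptions (the real ChatStream takes more parameters), and `echoProvider`, `wordStreamer`, and `respond` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Provider and StreamingProvider use simplified, assumed signatures.
type Provider interface {
	Chat(prompt string) (string, error)
}

type StreamingProvider interface {
	Provider
	// onChunk receives the accumulated text so far, not just the delta.
	ChatStream(prompt string, onChunk func(accumulated string)) (string, error)
}

// echoProvider does not support streaming.
type echoProvider struct{}

func (echoProvider) Chat(prompt string) (string, error) { return prompt, nil }

// wordStreamer streams the prompt back one word at a time.
type wordStreamer struct{ echoProvider }

func (wordStreamer) ChatStream(prompt string, onChunk func(string)) (string, error) {
	var acc string
	for i, w := range strings.Fields(prompt) {
		if i > 0 {
			acc += " "
		}
		acc += w
		onChunk(acc)
	}
	return acc, nil
}

// respond streams when the provider opts in and falls back to Chat otherwise.
func respond(p Provider, prompt string, onChunk func(string)) (string, error) {
	if sp, ok := p.(StreamingProvider); ok {
		return sp.ChatStream(prompt, onChunk)
	}
	return p.Chat(prompt) // type assertion failed: non-streaming path
}

func main() {
	draft := func(s string) { fmt.Println("draft:", s) }
	out, _ := respond(wordStreamer{}, "hello streaming world", draft)
	fmt.Println("final:", out)
	out, _ = respond(echoProvider{}, "no streaming here", draft)
	fmt.Println("final:", out)
}
```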
* fix(telegram): delete placeholder message when streaming delivers response
When streaming was active, the "Thinking..." placeholder message stayed
in the chat because preSend only deleted the tracking entry without
removing the actual Telegram message. Now preSend deletes the placeholder
via the new MessageDeleter interface when streamActive is set.
* refactor(streaming): remove dead code and simplify streaming wiring
- Delete unused Anthropic ChatStream/parseStream (-131 lines) — factory
creates HTTPProvider for all OpenAI-compat providers including OpenRouter
- Simplify runLLMIteration from 4 to 3 return values (remove unused
streamed bool)
- Replace managerStreamer struct with finalizeHookStreamer using embedding
(Update/Cancel promoted, only Finalize overridden)
* fix(streaming): skip streamer acquisition when SendResponse is false
Heartbeat messages set SendResponse=false but the streaming path
was unconditionally acquiring a streamer, causing HEARTBEAT_OK to
leak to Telegram via streamer.Finalize().
* fix(streaming): guard streamer for non-sendable messages, add streaming config
Skip streamer acquisition for heartbeat (NoHistory=true), preventing
HEARTBEAT_OK from leaking to Telegram via streamer.Finalize().
Add streaming.enabled to Telegram defaults and example config.
* fix(picoclaw): add missing closing brace for StreamingProvider interface
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve golangci-lint formatting issues
Fix gci import ordering in telegram and anthropic provider, and break
long function signature in openai_compat provider to satisfy golines.
* fix: address code review feedback on streaming PR
- Deduplicate Streamer interface: alias channels.Streamer to bus.Streamer
to prevent type drift across packages
- Increase SSE scanner buffer to 10MB max to handle large single-line
responses that exceed bufio.Scanner's 64KB default
- Switch draftID generation from math/rand to crypto/rand for
collision-resistant random IDs
- Add context cancellation check in SSE parsing loop so cancelled
streams stop processing immediately
- Log Finalize failures with chat_id and content length for debugging
silent message delivery failures
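Two of the review fixes above, the larger scanner buffer and the cancellation check in the SSE loop, can be sketched with the standard library alone. `readSSEData` and `readSSEString` are hypothetical names, and the parsing is reduced to `data:` lines for brevity.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"io"
	"strings"
)

// readSSEData collects the payload of each "data:" line until "[DONE]".
// The scanner's maximum token size is raised from the 64KB default to
// 10MB so a single large line does not fail with bufio.ErrTooLong.
func readSSEData(ctx context.Context, r io.Reader) ([]string, error) {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), 10*1024*1024)
	var events []string
	for sc.Scan() {
		// Stop processing as soon as the caller cancels the stream.
		if err := ctx.Err(); err != nil {
			return events, err
		}
		data, ok := strings.CutPrefix(sc.Text(), "data:")
		if !ok {
			continue // blank separator or other field
		}
		data = strings.TrimSpace(data)
		if data == "[DONE]" {
			break
		}
		events = append(events, data)
	}
	return events, sc.Err()
}

// readSSEString is a convenience wrapper for demos.
func readSSEString(s string) ([]string, error) {
	return readSSEData(context.Background(), strings.NewReader(s))
}

func main() {
	events, err := readSSEString("data: hello\n\ndata: world\n\ndata: [DONE]\n")
	fmt.Println(events, err)
}
```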
* feat: make streaming throttle interval and min growth configurable
Move hardcoded streamThrottleInterval (3s) and streamMinGrowth (200)
into StreamingConfig so they can be tuned per deployment via config
or environment variables.
* fix(telegram): use parseTelegramChatID in DeleteMessage and BeginStream
These two functions called undefined parseChatID. Use
parseTelegramChatID with _ for the unused threadID instead of adding
a wrapper function. Fixes all three CI checks.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(streaming): set streamActive only after successful Finalize
Move onFinalize hook to run after Streamer.Finalize succeeds, so that
if Finalize fails the streamActive flag stays false and the regular
placeholder fallback path remains available.
Addresses review feedback from @alexhoshina.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(pico): add pico_client outbound WebSocket channel
Add a client-mode counterpart to the existing pico server channel.
pico_client connects to a remote Pico Protocol WebSocket server,
enabling picoclaw to bridge messages with external Pico-compatible
services.
Includes config, factory registration, manager wiring, 8 unit tests,
and a minimal echo-server example for interactive testing.
* fix(pico): address PR #1198 review — goroutine leak, race, auth
- Add per-connection context cancel to picoConn to prevent pingLoop
goroutine leak on disconnect
- Re-acquire mutex in StartTyping stop closure to avoid stale conn race
- Remove query-param token auth from echo server (header-only)
- Move ListenAndServe to main goroutine where log.Fatal is safe
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: replace ConsumeInbound with InboundChan select in client test
MessageBus does not expose a ConsumeInbound method. Use a select on
InboundChan() with context cancellation, matching the pattern used in
the bus package tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Deleted channel management UI from channel.go, including all associated forms and menu items.
- Removed platform-specific gateway process management from gateway_posix.go and gateway_windows.go.
- Eliminated menu structure and item management from menu.go.
- Removed model management and configuration handling from model.go.
- Deleted style definitions and application logic from style.go.
- Cleared main entry point in main.go.
Refactor agent loop execution around runTurn, add explicit turn state and interrupt semantics, and automatically continue queued steering that misses the current turn boundary.
* feat(feishu): add interactive card message parsing
Add support for parsing inbound Feishu interactive card messages.
When a user sends a card message, the text content is now extracted
and passed to the LLM for processing.
- Add extractCardText() to recursively extract text from card JSON
- Support both JSON 1.0 (legacy) and JSON 2.0 schema formats
- Handle nested elements: header, body, actions, columns
- Extract text from markdown, lark_md, and plain_text elements
- Add comprehensive unit tests for card parsing
Fixes #<issue_number>
💘 Generated with Crush
Assisted-by: GLM-5 via Crush <crush@charm.land>
* feat(feishu): extract and download images from interactive cards
When receiving interactive card messages, extract embedded images
(img_key, src, icon_key) and download them for LLM processing.
- Add extractCardImageKeys() to recursively extract image keys from card JSON
- Support img elements (img_key, src) and icon elements (icon_key)
- Update downloadInboundMedia() to handle MsgTypeInteractive
- Add comprehensive unit tests for image extraction
Images are downloaded and stored via MediaStore, then appended to
the message content as [image: photo] tags for LLM visibility.
💘 Generated with Crush
Assisted-by: GLM-5 via Crush <crush@charm.land>
* fix(feishu): simplify card parsing - pass raw JSON, only extract images
Address review feedback: text extraction cannot exhaustively handle all
card formats (i18n_elements, div.fields, etc.). Pass raw JSON to LLM
instead - same approach as MsgTypePost. Only image extraction remains
as images must be downloaded for LLM to process.
- Remove extractCardText() and helper functions
- extractContent() now returns raw JSON for MsgTypeInteractive
- Keep extractCardImageKeys() for downloading embedded images
- Update tests to expect raw JSON for interactive cards
* fix(feishu): don't append media tags to interactive card JSON
Appending media tags like "[attachment]" to raw JSON content produces
invalid JSON format. For interactive cards, the JSON already contains
image information and media refs are downloaded separately.
- Skip appendMediaTags for MsgTypeInteractive to preserve valid JSON
- Add test case for interactive card with images
* fix(feishu): filter out external URLs from card image extraction
Only Feishu-hosted image keys (img_xxx, icon_xxx) can be downloaded via
the Feishu API. External URLs in src field (https://...) should be
filtered out to avoid download failures.
- Add isFeishuImageKey() to detect Feishu-hosted keys vs external URLs
- Update extractImageKeysRecursive to skip external URLs in src field
- Add tests for external URL filtering and mixed scenarios
* feat(feishu): support downloading external images from interactive cards
Previously only Feishu-hosted images (img_key, icon_key) could be
downloaded. Now external URLs in src field are also downloaded via
HTTP and made available to the LLM.
- extractCardImageKeys now returns two slices: Feishu keys and external URLs
- Add downloadExternalImage to download images from HTTP URLs
- Update downloadInboundMedia to handle both Feishu API and HTTP downloads
- Update tests for new function signature
* fix(feishu): use HTTP client with timeout for external image downloads
Replaced http.DefaultClient with a client that has a 30-second timeout
to prevent hanging on unresponsive external URLs.
Generated with Crush
Assisted-by: GLM-5 via Crush <crush@charm.land>
* fix(feishu): resolve lint errors for shadow and formatting
- Rename err variables to avoid shadowing in downloadExternalImage
- Fix struct field alignment in TestExtractCardImageKeys
Generated with Crush
Assisted-by: GLM-5 via Crush <crush@charm.land>
* refactor(feishu): pass external image URLs to LLM instead of downloading
Instead of downloading external images from interactive cards, pass
the URLs directly to LLM. This reduces network overhead and lets
vision-capable models fetch images as needed.
- Remove downloadExternalImage function
- Append external URLs to card content for LLM processing
- Only download Feishu-hosted images via API
💘 Generated with Crush
Assisted-by: GLM-5 via Crush <crush@charm.land>
* fix(feishu): add blank line between functions for gci formatting
* fix(feishu): keep interactive card content as valid JSON
Changed newTestAgentLoop calls from using 3 blank identifiers to 2 by
assigning the unused provider parameter and explicitly marking it as
unused with `_ = provider`. This fixes the dogsled linter violations
that were causing CI failures.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Rewrite README.id.md to match current upstream structure (~250 lines)
- Detailed docs moved to docs/*.md, README is quick-start only
- Sync badges (Go 1.25+, LoongArch), news (v0.2.3), Termux instructions
- Add Bahasa Indonesia + Italiano to language selectors in all 8 READMEs
When both providers and model_list are configured, model_list entries
with empty api_key or api_base now automatically inherit from the
matching provider (matched by protocol prefix in the Model field).
Example: a model_list entry with model='deepseek/deepseek-chat' and
no api_key will inherit from providers.deepseek.api_key.
Explicit model_list values always take precedence.
Changes:
- Add InheritProviderCredentials() in migration.go
- Call it in LoadConfig() after provider-to-model-list conversion
- Add protocolProviderMapping for all 25 supported protocols
- 6 new tests covering inheritance, precedence, and edge cases
Closes #1635
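A simplified sketch of the inheritance rule described above. It assumes the protocol prefix is also the provider name, whereas the commit uses a protocolProviderMapping table; the types and `inheritCredentials` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Provider and ModelEntry are reduced, hypothetical config types.
type Provider struct{ APIKey, APIBase string }

type ModelEntry struct{ Model, APIKey, APIBase string }

// inheritCredentials fills empty api_key/api_base fields from the provider
// matched by the protocol prefix of Model. Explicit values are kept.
func inheritCredentials(models []ModelEntry, providers map[string]Provider) {
	for i := range models {
		protocol, _, ok := strings.Cut(models[i].Model, "/")
		if !ok {
			continue // no protocol prefix to match on
		}
		p, found := providers[protocol]
		if !found {
			continue
		}
		if models[i].APIKey == "" {
			models[i].APIKey = p.APIKey
		}
		if models[i].APIBase == "" {
			models[i].APIBase = p.APIBase
		}
	}
}

func main() {
	providers := map[string]Provider{
		"deepseek": {APIKey: "provider-key", APIBase: "https://example.invalid/v1"},
	}
	models := []ModelEntry{
		{Model: "deepseek/deepseek-chat"},
		{Model: "deepseek/deepseek-chat", APIKey: "explicit-key"},
	}
	inheritCredentials(models, providers)
	fmt.Printf("%+v\n", models)
}
```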
* feat(wecom): add WebSocket long-connection support for WeCom AI Bot
- Introduced WeComAIBotWSChannel to handle WebSocket connections.
- Updated NewWeComAIBotChannel to prioritize WebSocket mode when BotID and Secret are provided.
- Enhanced WeComAIBotConfig to include BotID and Secret for WebSocket mode.
- Implemented message handling for text, image, voice, and mixed messages in WebSocket mode.
- Added tests for WebSocket mode functionality and ensured backward compatibility with webhook mode.
- Refactored existing code to improve clarity and maintainability.
* feat(wecom): implement periodic processing hints and enforce WeCom stream deadline
* feat(wecom): update WeCom AI Bot setup instructions and configuration parameters
* feat(wecom): enhance WeCom AI Bot with image handling and media support
* feat(wecom): refactor WeCom AI Bot task management to use req_id for concurrent message handling
* feat(wecom): refactor WeCom AI Bot to manage request states and late replies
* feat(wecom): add response timeout handling and improve WebSocket command acknowledgment
* fix(wecom): improve error handling for late reply proactive push delivery
* refactor(wecom): reorganize WeCom AI Bot configuration fields for improved readability
* fix(wecom): update error message for websocket delivery failure in late reply proactive push
* feat(wecom): implement shared HTTP clients for WeCom image handling and response URL posting
* refactor(wecom): simplify image download and storage process in storeWSImage
* fix(wecom): improve error logging for WebSocket message handling and proactive push delivery
* fix(wecom): enhance WebSocket connection stability and task cancellation handling
* fix(wecom): improve WS image message handling by ensuring proper error response and initializing mediaRefs
* feat(wecom): enhance WeCom AIBot WebSocket handling with message deduplication and support for file and video messages
* refactor(wecom): rename image handling functions to media handling and enhance media type support
* feat(wecom): implement byte-aware content splitting for WeCom AI Bot stream messages
* refactor(wecom): remove max message length constraint from WeCom AIBot WS channel
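The byte-aware content splitting mentioned in the WeCom series above could be done along these lines. This is a generic sketch, not the channel's code: `splitByBytes` is a hypothetical name and the limit is a parameter.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// splitByBytes splits s into chunks of at most maxBytes bytes without
// cutting a multi-byte UTF-8 character in half.
func splitByBytes(s string, maxBytes int) []string {
	if maxBytes < utf8.UTFMax {
		maxBytes = utf8.UTFMax // guarantee room for any single rune
	}
	var chunks []string
	for len(s) > maxBytes {
		cut := maxBytes
		// Back up until the byte at the cut point starts a new rune.
		for cut > 0 && !utf8.RuneStart(s[cut]) {
			cut--
		}
		if cut == 0 {
			cut = maxBytes // invalid UTF-8: fall back to a raw byte cut
		}
		chunks = append(chunks, s[:cut])
		s = s[cut:]
	}
	if s != "" {
		chunks = append(chunks, s)
	}
	return chunks
}

func main() {
	for _, c := range splitByBytes("你好世界", 7) {
		fmt.Printf("%q (%d bytes)\n", c, len(c))
	}
}
```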
Add nil checks in NewSpawnTool and NewSubagentTool constructors to
handle nil manager gracefully. Fix spelling errors (cancelled->canceled)
and remove unused test code. Update tests to use mock spawner.
Replace hardcoded constants with config-driven parameters in agents.defaults:
- MaxDepth, MaxConcurrent, DefaultTimeout, DefaultTokenBudget, ConcurrencyTimeout
- Support JSON config and env vars (PICOCLAW_AGENTS_DEFAULTS_SUBTURN_*)
- Add getSubTurnConfig() for runtime config resolution with defaults
- Apply defaultTokenBudget when no explicit budget is provided
Rationale: SubTurn is agent execution infrastructure, not a tool, so it belongs
in agents.defaults rather than tools config.
Example:
{
  "agents": {
    "defaults": {
      "subturn": {
        "max_depth": 5,
        "max_concurrent": 10,
        "default_timeout_minutes": 10
      }
    }
  }
}
Add ActualSystemPrompt and InitialMessages fields to SubTurnConfig to enable
stateful worker context passing across multiple evaluation iterations.
Changes:
- Add ActualSystemPrompt field to separate system role from user task description
- Add InitialMessages field to preload ephemeral session history before agent loop starts
- Add Messages field to ToolResult for carrying session history (internal use, not serialized)
- Update runTurn to inject system prompt and preload history from InitialMessages
- Update AgentLoopSpawner to map new fields from tools.SubTurnConfig to agent.SubTurnConfig
This enables the evaluator-optimizer execution strategy in team tool to maintain
worker context across iterations while keeping SubTurn isolation intact.
* feat(config): support multiple API keys for failover
Add api_keys field to ModelConfig to support multiple API keys with
automatic failover. When multiple keys are configured, they are expanded
into separate model entries with fallbacks set up for key-level failover.
Example config:
{
  "model_name": "glm-4.7",
  "model": "zhipu/glm-4.7",
  "api_keys": ["key1", "key2", "key3"]
}
Expands internally to:
- glm-4.7 (key1) -> fallbacks: [glm-4.7__key_1, glm-4.7__key_2]
- glm-4.7__key_1 (key2)
- glm-4.7__key_2 (key3)
Backward compatible: single api_key still works as before.
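The expansion shown above can be reproduced with a short sketch. `ModelConfig` here is a reduced, hypothetical struct and `expandAPIKeys` is an illustrative name; only the `__key_N` naming and the fallback layout are taken from the example.

```go
package main

import "fmt"

// ModelConfig is a reduced, hypothetical version of the real config type.
type ModelConfig struct {
	ModelName string
	Model     string
	APIKey    string
	APIKeys   []string
	Fallbacks []string
}

// expandAPIKeys turns one entry with N keys into N entries. The first
// keeps the original name and lists the others as fallbacks.
func expandAPIKeys(cfg ModelConfig) []ModelConfig {
	if len(cfg.APIKeys) == 0 {
		return []ModelConfig{cfg} // single api_key: unchanged
	}
	primary := cfg
	primary.APIKey = cfg.APIKeys[0]
	primary.APIKeys = nil
	primary.Fallbacks = nil
	expanded := []ModelConfig{primary}
	for i, key := range cfg.APIKeys[1:] {
		name := fmt.Sprintf("%s__key_%d", cfg.ModelName, i+1)
		expanded[0].Fallbacks = append(expanded[0].Fallbacks, name)
		expanded = append(expanded, ModelConfig{ModelName: name, Model: cfg.Model, APIKey: key})
	}
	// Keep any fallbacks the user configured, after the key-level ones.
	expanded[0].Fallbacks = append(expanded[0].Fallbacks, cfg.Fallbacks...)
	return expanded
}

func main() {
	cfg := ModelConfig{
		ModelName: "glm-4.7",
		Model:     "zhipu/glm-4.7",
		APIKeys:   []string{"key1", "key2", "key3"},
	}
	for _, m := range expandAPIKeys(cfg) {
		fmt.Printf("%s (%s) fallbacks=%v\n", m.ModelName, m.APIKey, m.Fallbacks)
	}
}
```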
* fix(providers): change cooldown tracking from provider to ModelKey
This enables proper key-switching when multiple API keys share the same
provider. Previously, when one key failed, all keys were blocked because
cooldown was tracked per-provider.
Now each (provider, model) combination has independent cooldown, allowing
fallback to alternate keys when one is rate limited.
Includes TestMultiKeyWithModelFallback and related failover tests.
* feat(feishu): add Lark (international) support via IsLark config field
Add IsLark field to FeishuConfig to switch between Feishu and Lark
domains. Also fix domain inconsistency where WS client defaulted to
LarkBaseUrl while HTTP client used FeishuBaseUrl.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: update documentation and web UI for Lark support
Add is_lark field to config example, feishu docs, i18n translations,
and web frontend form.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tools): propagate tool registry to subagents via Clone
SubagentManager was created with an empty ToolRegistry and SetTools()
was never called, causing all subagent tool invocations to fail with
"tool not found". This was a regression from the multi-agent refactor.
Fix: clone the parent agent's tool registry into the subagent manager
after creation but before spawn/spawn_status registration — giving
subagents access to file, exec, web, and other tools while preventing
recursive subagent spawning.
- Add ToolRegistry.Clone() for independent shallow copies
- Call subagentManager.SetTools(agent.Tools.Clone()) in registerSharedTools
- Add tests for Clone isolation, empty clone, and hidden tool state
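A minimal sketch of what an independent shallow copy means here, assuming a map-backed registry guarded by a mutex. The types below are hypothetical simplifications; the real registry also tracks state such as TTL and hidden tools.

```go
package main

import (
	"fmt"
	"sync"
)

// Tool and ToolRegistry are simplified, hypothetical types.
type Tool interface{ Name() string }

type namedTool string

func (n namedTool) Name() string { return string(n) }

type ToolRegistry struct {
	mu    sync.RWMutex
	tools map[string]Tool
}

func NewToolRegistry() *ToolRegistry {
	return &ToolRegistry{tools: make(map[string]Tool)}
}

func (r *ToolRegistry) Register(t Tool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.tools[t.Name()] = t
}

func (r *ToolRegistry) Has(name string) bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	_, ok := r.tools[name]
	return ok
}

// Clone copies the map but shares the Tool values (shallow copy), so later
// registrations on either registry do not affect the other.
func (r *ToolRegistry) Clone() *ToolRegistry {
	r.mu.RLock()
	defer r.mu.RUnlock()
	c := NewToolRegistry()
	for name, t := range r.tools {
		c.tools[name] = t
	}
	return c
}

func main() {
	parent := NewToolRegistry()
	parent.Register(namedTool("exec"))
	sub := parent.Clone()              // subagents see the tools registered so far
	parent.Register(namedTool("spawn")) // registered after the clone
	fmt.Println(sub.Has("exec"), sub.Has("spawn"))
}
```

Cloning before spawn is registered is what keeps subagents from spawning further subagents in this sketch.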
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(tools): fix cron_test build error and add TTL clone test
- Fix cron_test.go:229 — replace non-existent SubscribeOutbound(ctx)
with select on OutboundChan(), matching the MessageBus channel API
- Add TestToolRegistry_Clone_PreservesTTLValue per reviewer feedback
- Add version reset note to Clone() doc comment
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(agent): add structured agent definition loader
Parse AGENT.md frontmatter into a runtime definition and pair it with SOUL.md while keeping a legacy AGENTS.md fallback for transition.
Refs #1218
* refactor(agent): build context from structured agent files
Use AGENT.md and SOUL.md as the structured bootstrap source, ignore IDENTITY.md for structured agents, remove USER.md from the new context flow, and update pkg/agent tests accordingly.
Refs #1218
* refactor(onboard): switch workspace templates to AGENT.md
Replace the legacy AGENTS.md, IDENTITY.md, and USER.md templates with a structured AGENT.md plus SOUL.md, and update the onboard template test to assert the new generated files.
Refs #1218
* docs(readme): update workspace layout for AGENT.md
Refresh the documented workspace tree across the README translations so onboarding now points to AGENT.md and SOUL.md instead of the retired AGENTS.md, IDENTITY.md, and USER.md files.
Refs #1218
* feat(agent): restore workspace USER.md context
* docs(readme): document workspace USER.md layout
* fix: sort agent definition imports for gci
When building parameters for Anthropic API calls, tool calls with empty
names would cause 400 Bad Request errors with the message:
'tool_use.name: String should have at least 1 character'
This fix adds a check to skip tool calls that have empty names, preventing
the API error and allowing the conversation to continue normally.
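The guard amounts to a filter applied before the request parameters are built. A minimal sketch, with a hypothetical `ToolCall` type and helper name:

```go
package main

import "fmt"

// ToolCall is a reduced, hypothetical representation of a tool call.
type ToolCall struct {
	ID   string
	Name string
}

// dropUnnamedToolCalls removes tool calls with an empty name, which the
// Anthropic API would otherwise reject with a 400 error.
func dropUnnamedToolCalls(calls []ToolCall) []ToolCall {
	kept := make([]ToolCall, 0, len(calls))
	for _, c := range calls {
		if c.Name == "" {
			continue // skip: tool_use.name must have at least 1 character
		}
		kept = append(kept, c)
	}
	return kept
}

func main() {
	calls := []ToolCall{{ID: "1", Name: "read_file"}, {ID: "2", Name: ""}}
	fmt.Printf("%+v\n", dropUnnamedToolCalls(calls))
}
```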
Fixes #1658
The Lark SDK v3's built-in token retry loop does not clear stale tokens
from cache when the server returns error 99991663 (tenant_access_token
invalid), causing all API calls to fail until the token naturally
expires (~2 hours).
- Add tokenCache struct (implementing larkcore.Cache) with
Get/Set/InvalidateAll methods and proper expired-entry cleanup
- Wire custom cache into lark.NewClient via WithTokenCache()
- Add invalidateTokenOnAuthError helper called in all API methods
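An in-memory cache with the Get/Set/InvalidateAll shape listed above might look like the sketch below. The method signatures are assumed to match the SDK's cache interface and should be checked against larkcore. The wiring into lark.NewClient is omitted.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

type cacheEntry struct {
	value   string
	expires time.Time
}

// tokenCache is a sketch of an invalidatable token cache. The Get/Set
// signatures are an assumption about the SDK's Cache interface.
type tokenCache struct {
	mu      sync.Mutex
	entries map[string]cacheEntry
}

func newTokenCache() *tokenCache {
	return &tokenCache{entries: make(map[string]cacheEntry)}
}

func (c *tokenCache) Set(_ context.Context, key, value string, ttl time.Duration) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[key] = cacheEntry{value: value, expires: time.Now().Add(ttl)}
	return nil
}

func (c *tokenCache) Get(_ context.Context, key string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[key]
	if !ok {
		return "", nil
	}
	if time.Now().After(e.expires) {
		delete(c.entries, key) // clean up the expired entry
		return "", nil
	}
	return e.value, nil
}

// InvalidateAll drops every cached token, for example after the server
// reports that the tenant access token is invalid.
func (c *tokenCache) InvalidateAll() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries = make(map[string]cacheEntry)
}

func main() {
	ctx := context.Background()
	c := newTokenCache()
	_ = c.Set(ctx, "tenant_access_token", "t-123", 2*time.Hour)
	v, _ := c.Get(ctx, "tenant_access_token")
	fmt.Printf("before invalidation: %q\n", v)
	c.InvalidateAll()
	v, _ = c.Get(ctx, "tenant_access_token")
	fmt.Printf("after invalidation: %q\n", v)
}
```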
* Add Novita provider support
- Add 'novita' prefix to normalizeModel switch in openai_compat provider
- Add Novita provider to all_supported_vendors table in README.md
- Add test cases for Novita model prefix stripping
Novita endpoint: https://api.novita.ai/openai
Default models: deepseek/deepseek-v3.2, zai-org/glm-5, minimax/minimax-m2.5
* feat: complete Novita provider integration
* chore: drop README changes from Novita PR
* fix: remove duplicate function declarations in openai_compat provider
The functions buildToolsList, SupportsNativeSearch, and isNativeSearchHost
were declared twice, causing compilation failures in all CI checks.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: break long line in novita test to satisfy golines linter
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Critical flag was declared but never acted on; non-critical SubTurns
now break out of the iteration loop when IsParentEnded() returns true
- tools.SubTurnConfig was missing Critical/Timeout/MaxContextRunes,
making those fields unreachable from the tools layer; added fields and
wired them through AgentLoopSpawner.SpawnSubTurn
- Removed subTurnResults sync.Map from AgentLoop — it was a redundant
alias for the same channel already stored in turnState.pendingResults;
dequeuePendingSubTurnResults now reads directly via activeTurnStates
- Replace hardcoded concurrencySem size 5 with maxConcurrentSubTurns constant
- Update affected tests to match new dequeuePendingSubTurnResults API
* fix(telegram): improve HTML chunking and preserve word boundaries
* fix(telegram): address copilot feedback, filter empty chunks and add word-boundary regression test
* style(telegram): fix gofmt and gci lint errors in tests
* fix: address review feedback
- Added `/subagents` platform command to visualize the active task tree.
- Implemented GetAllActiveTurns and FormatTree in AgentLoop to support cross-session observability.
- Fixed a bug where sub-turns spawned via tools were not registered in the global `activeTurnStates` map, making them invisible to system queries.
- Enhanced tree rendering logic to identify and display "orphaned" subagents (children that outlive their parent turns).
- Registered the new command in `builtin.go` and injected the turn state provider into the commands runtime.
Modified Files:
- pkg/agent/turn_state.go: Added TurnInfo snapshotting and recursive tree formatting.
- pkg/agent/loop.go: Injected GetActiveTurn hook and implemented multi-root forest rendering.
- pkg/agent/subturn.go: Added child turn registration into activeTurnStates.
- pkg/commands/cmd_subagents.go: New command implementation.
- pkg/commands/builtin.go: Command registration.
This commit addresses several critical concurrency and state management bugs within the SubTurn execution and delivery logic.
1. Fix Goroutine Leak & Deadlock in deliverSubTurnResult:
- Replaced non-blocking select with a safe blocking select that listens to `resultChan` and a new `<-parentTS.Finished()` channel.
- This ensures results are not arbitrarily dropped when the channel is full (preventing orphaned valid results), while also guaranteeing the child goroutine safely unblocks and exits if the parent finishes execution early.
2. Prevent "Send on Closed Channel" Fatal Panics:
- Removed `close(pendingResults)` and `drainPendingResults` from `turnState.Finish()`.
- The pendingResults channel is now naturally garbage collected, completely eliminating the race condition panic when a child attempts delivery at the exact moment the parent finishes.
- Added a `defer recover()` failsafe inside deliverSubTurnResult to gracefully emit Orphan events in extreme edge cases.
3. Fix Truncation Recovery Prompt Drop:
- Fixed the runTurn truncation retry logic by introducing an explicit `promptAlreadyAdded` boolean.
- Ensures that the dynamically generated `recoveryPrompt` is correctly injected into the LLM history sequence on subsequent iterations, adhering to API roles without duplicating arrays.
4. Test Suite Stabilization:
- Fixed TestDeliverSubTurnResultNoDeadlock to accurately wait for deterministic deliveries instead of racing timeouts.
- Replaced defunct closed-channel tests with TestFinishedChannelClosedState matching the new Finished() mechanism.
- Fixed the Finish(true) parameter in TestGrandchildAbort_CascadingCancellation to correctly validate Context cascade behavior.
- All tests now pass cleanly without hanging or emitting false positives.
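The delivery pattern from item 1 above, a blocking select over the result channel and a "parent finished" channel, can be sketched in isolation. `deliverResult` and the string payload are hypothetical; the real function also emits orphan events.

```go
package main

import "fmt"

// deliverResult blocks until either the result is accepted or the parent
// signals that it has finished. It never drops a result just because the
// channel is momentarily full, and it never blocks forever.
func deliverResult(results chan<- string, parentFinished <-chan struct{}, r string) bool {
	select {
	case results <- r:
		return true // delivered to the parent
	case <-parentFinished:
		return false // parent is gone: caller can treat the result as orphaned
	}
}

func main() {
	results := make(chan string, 1)
	finished := make(chan struct{})

	fmt.Println(deliverResult(results, finished, "first")) // true: buffer has room

	close(finished) // parent finishes; the results channel itself is never closed
	fmt.Println(deliverResult(results, finished, "second")) // false: buffer full, parent finished
}
```

Because the results channel is never closed, a late sender cannot hit a "send on closed channel" panic. Closing only the signal channel is safe since nothing sends on it.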
Problem:
During subturn context limit or truncation recoveries, the recovery loops repeatedly
called `runAgentLoop` with the same or modified `UserMessage`. Because `runAgentLoop`
unconditionally adds the `UserMessage` to the session history, this resulted in:
1. Duplicate User Messages polluting the history upon `context_length_exceeded` retries.
2. The possibility of injecting empty User Messages if `opts.UserMessage` was artificially blanked out to work around the duplication.
3. Messy or duplicate entries during `finish_reason="truncated"` recovery injections.
Solution:
- Introduce `SkipAddUserMessage` boolean to `processOptions` to explicitly control whether the agent loop should write the user prompt to history.
- Add an explicit `opts.UserMessage != ""` check in `runAgentLoop` to prevent polluting history with empty message content.
- In `subturn.go`'s recovery loop, set `SkipAddUserMessage: contextRetryCount > 0` to skip writing the user message on context
* config: add prefer_native and NativeSearchCapable for model-native search
* providers: implement native web search for OpenAI and Codex
* agent: use provider-native search when prefer_native and supported
* tests: add coverage for model-native search
* fix: Golang lint errors
* fix: update the code based on the review
* fix: update codex_provider_test
* fix: fix bug where the bus was closed while consumers still had unfinished messages
* fix: remove unnecessary blank line in Close method
* fix: refactor message bus and channel handling for improved performance and reliability
* fix: improve message handling and bus closure logic for better reliability
* fix: reduce sleep duration in agent loop for improved responsiveness
* fix the test case
* feat(cron): enhance CronService with wake channel and improve job scheduling logic
* fix(cron): update file permission mode to use octal notation in test and fix some lint errors
* fix(cron): improve wake channel handling and enhance concurrency in tests
Problem:
When the parent turn finishes early, all child SubTurns receive a "context canceled"
error, because the child context was derived from the parent context.
Solution:
Implement a lifecycle management system that distinguishes between:
- Graceful finish (Finish(false)): signals parentEnded, children continue
- Hard abort (Finish(true)): immediately cancels all children
Changes:
- turn_state.go:
- Add parentEnded atomic.Bool to signal parent completion
- Add parentTurnState reference for IsParentEnded() checks
- Modify Finish(isHardAbort bool) to distinguish abort types
- subturn.go:
- Add Critical bool to SubTurnConfig (Critical SubTurns continue after parent ends)
- Add Timeout time.Duration for SubTurn self-protection
- Use independent context (context.Background()) instead of derived context
- SubTurns check IsParentEnded() to decide whether to continue or exit
- loop.go:
- Call Finish(false) for normal completion (graceful)
- Add IsParentEnded() check in LLM iteration loop
- steering.go:
- HardAbort calls Finish(true) to immediately cancel children
Behavior:
- Normal finish: parentEnded=true, children continue, orphan results delivered
- Hard abort: all children cancelled immediately via context
- Critical SubTurns: continue running after parent finishes gracefully
- Non-Critical SubTurns: can exit gracefully when IsParentEnded() returns true
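The graceful-versus-hard distinction above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the struct layout, `newTurnState`, and `shouldContinue` are hypothetical, and only `Finish(isHardAbort bool)`, `parentEnded`, and `IsParentEnded()` are names taken from the description.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
)

// turnState is a simplified, hypothetical stand-in for the parent turn's state.
type turnState struct {
	parentEnded atomic.Bool
	cancel      context.CancelFunc
}

// newTurnState returns the state plus the context children watch for hard aborts.
// The context is rooted at context.Background(), not at a request context.
func newTurnState() (*turnState, context.Context) {
	ctx, cancel := context.WithCancel(context.Background())
	return &turnState{cancel: cancel}, ctx
}

// Finish always marks the parent as ended; only a hard abort cancels children.
func (ts *turnState) Finish(isHardAbort bool) {
	ts.parentEnded.Store(true)
	if isHardAbort {
		ts.cancel()
	}
}

// IsParentEnded reports whether the parent turn has finished.
func (ts *turnState) IsParentEnded() bool { return ts.parentEnded.Load() }

// shouldContinue is the check a SubTurn might run between iterations (assumed helper).
func shouldContinue(ctx context.Context, parent *turnState, critical bool) bool {
	if ctx.Err() != nil {
		return false // hard abort: context was cancelled
	}
	if parent.IsParentEnded() && !critical {
		return false // graceful finish: non-critical children may exit
	}
	return true
}

func main() {
	ts, ctx := newTurnState()
	ts.Finish(false) // graceful finish
	fmt.Println(shouldContinue(ctx, ts, true), shouldContinue(ctx, ts, false))
	ts.Finish(true) // hard abort
	fmt.Println(shouldContinue(ctx, ts, true))
}
```

In this sketch a Critical SubTurn keeps running after `Finish(false)` and stops only once `Finish(true)` cancels the shared context.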
Includes JSONL session persistence (#1170), spawn_status tool, Azure provider,
credential encryption, and various fixes. SubTurn features preserved and
integrated with new spawn_status functionality.
Add a clear identity statement to all 6 README files clarifying that
PicoClaw is an independent open-source project by Sipeed, written
entirely in Go, and not a fork of OpenClaw, NanoBot, or any other
project. This addresses common AI hallucinations found during testing
of 11 AI tools. Also normalizes [nanobot] to [NanoBot] for consistent
capitalization.
Co-authored-by: BeaconCat <BeaconCat@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- add a dedicated exec settings section in the config page
- support timeout and custom allow/deny regex patterns for exec
- validate custom exec regex patterns in the config API
- block cron command scheduling and execution when exec is disabled
- update tests and i18n strings for the new command settings
* feat(gateway): support hot reload and empty startup
- extract gateway runtime into pkg/gateway
- add gateway.hot_reload config with default and example values
- allow starting the gateway without a default model via --allow-empty
- stop treating missing enabled channels as a startup error
- update related tests
* feat: replace gateway SSE updates with polling-based state sync
- remove gateway SSE broadcasting and event endpoint
- add polling-based gateway status refresh with stopping state handling
- detect when gateway restart is required after default model changes
- resolve gateway health and websocket proxy targets from configured host
- update gateway UI labels and add backend/frontend test coverage
- Modify buildWsURL to use web server port (18800) instead of gateway port (18790)
- Add WebSocket proxy handler to forward /pico/ws to gateway
- Gateway port is read from config (cfg.Gateway.Port), defaults to 18790
- This allows WebSocket connections through the same port as the web UI,
avoiding the need to expose extra ports for Tailscale/Docker
Separate web Go commands from the default Go toolchain so web builds,
tests, and vet can enable CGO on Darwin without affecting the rest of
the project. Also ensure frontend backend builds recreate backend/dist
with a .gitkeep file so the embedded output directory remains tracked.
* feat(tools): add SpawnStatusTool for reporting subagent statuses
* feat(tools): enhance SpawnStatusTool to restrict task visibility by conversation context
* feat(tests): add Unicode result truncation and channel filtering tests for SpawnStatusTool
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* feat(tools): enhance SpawnStatusTool with task ID validation and sorting by creation timestamp
* feat(tools): update SpawnStatusTool description and parameter documentation for clarity
* refactor(tests): improve comments for clarity in ChannelFiltering test case
* fix(tools): update no subagents message for clarity and remove unnecessary locking in runTask
* fix(tools): improve description clarity for SpawnStatusTool regarding task context
* feat(tools): add spawn_status tool configuration and registration
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix(agent): improve subagent management for spawn and spawn_status tools
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix(tests): update ResultTruncation_Unicode test to use valid CJK character
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: lxowalle <83055338+lxowalle@users.noreply.github.com>
- add defensive nil check for tool call Arguments field
- replace nil input with empty object to comply with Anthropic spec
- prevent API errors when GLM models return null input in tool_use blocks
Zhipu AI's GLM series models may return tool_use blocks with null input field,
which causes their API to reject subsequent requests with error:
"ClaudeContentBlockToolResult object has no attribute id"
This fix ensures compatibility by converting nil inputs to empty objects {},
matching the Anthropic Messages API specification while maintaining backward
compatibility with other providers.
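A minimal sketch of the nil-to-empty-object conversion, assuming tool arguments are held as a `map[string]any`; the helper name `normalizeToolInput` is hypothetical and not taken from the codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalizeToolInput returns an empty object when args is nil, so the
// serialized tool_use input becomes {} instead of null.
func normalizeToolInput(args map[string]any) map[string]any {
	if args == nil {
		return map[string]any{}
	}
	return args
}

func main() {
	raw, err := json.Marshal(normalizeToolInput(nil))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
}
```

Non-nil arguments pass through unchanged, which is why other providers should be unaffected.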
When the entire session history is a single Turn (e.g. one user message
followed by a massive tool response), findSafeBoundary returns 0 and
forceCompression previously did nothing — leaving the agent stuck in
a context-exceeded retry loop.
Now falls back to keeping only the most recent user message when no
safe Turn boundary exists. This breaks Turn atomicity as a last resort
but guarantees the agent can recover.
Also updates docs/agent-refactor/context.md to document this behavior.
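The last-resort fallback could look something like the sketch below. The `message` type and `keepLastUserMessage` are illustrative assumptions; the real forceCompression operates on the project's own message type.

```go
package main

import "fmt"

// message is a simplified stand-in for a session history entry.
type message struct{ Role, Content string }

// keepLastUserMessage keeps only the most recent user message.
// It is meant for the case where no safe Turn boundary exists.
func keepLastUserMessage(history []message) []message {
	for i := len(history) - 1; i >= 0; i-- {
		if history[i].Role == "user" {
			return []message{history[i]}
		}
	}
	return history // no user message found; leave history untouched
}

func main() {
	history := []message{
		{Role: "user", Content: "summarize this repo"},
		{Role: "assistant", Content: "calling tool"},
		{Role: "tool", Content: "very large tool output"},
	}
	fmt.Println(len(keepLastUserMessage(history)))
}
```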
Ref #1490
- add tools.cron.allow_command config with a default value of true
- require command_confirm only when cron command execution is disabled
- expose cron command permission and timeout settings in the config UI
- add backend tests and update i18n strings
- Fix synchronous SubTurn calls placing results in pendingResults channel,
causing double delivery. Now only async calls (Async=true) use the channel.
- Move deliverSubTurnResult into defer to ensure result delivery even when
runTurn panics. Add TestSpawnSubTurn_PanicRecovery to verify.
- Fix ContextWindow incorrectly set to MaxTokens; now inherits from
parentAgent.ContextWindow.
- Add TestSpawnSubTurn_ResultDeliverySync to verify sync behavior.
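As a rough illustration of why delivery was moved into a defer: a deferred function still runs when the turn body panics, so the result (or the panic converted to an error) is always handed over. `subTurnResult` and `runWithDelivery` are hypothetical names used only for this sketch.

```go
package main

import "fmt"

// subTurnResult is a simplified stand-in for a SubTurn's outcome.
type subTurnResult struct {
	Output string
	Err    error
}

// runWithDelivery runs the turn body and always calls deliver exactly once,
// even if run panics.
func runWithDelivery(run func() string, deliver func(subTurnResult)) {
	var res subTurnResult
	defer func() {
		if r := recover(); r != nil {
			res.Err = fmt.Errorf("subturn panicked: %v", r)
		}
		deliver(res)
	}()
	res.Output = run()
}

func main() {
	runWithDelivery(
		func() string { panic("boom") },
		func(r subTurnResult) { fmt.Println("delivered:", r.Output, r.Err) },
	)
}
```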
Critical fixes (5):
- Fix turnState hierarchy corruption in nested SubTurns by checking context
before creating new root turnState in runAgentLoop
- Fix deadlock risk in deliverSubTurnResult by separating lock and channel ops
- Fix session rollback race in HardAbort by calling Finish() before rollback
- Fix resource leak by closing pendingResults channel in Finish() with recovery
- Add thread-safety docs for childTurnIDs and isFinished fields
Medium priority fixes (5):
- Move globalTurnCounter to AgentLoop.subTurnCounter to prevent ID conflicts
- Improve semaphore acquisition to ensure release even on early validation failures
- Document design choice: ephemeral sessions start empty for complete isolation
- Add final poll before Finish() to capture late-arriving SubTurn results
- Remove duplicate channel registration in spawnSubTurn to fix timing issues
Testing:
- Add 6 new tests covering hierarchy, deadlock, ordering, channel lifecycle,
final poll, and semaphore behavior
- All 12 SubTurn tests passing with race detector
This resolves 10 critical and medium issues (5 race conditions, 2 resource leaks,
3 timing issues) identified in code review, bringing SubTurn to production-ready state.
- Fix turnState hierarchy corruption when SubTurns recursively call runAgentLoop
by checking context for existing turnState before creating new root
- Fix deadlock risk in deliverSubTurnResult by separating lock and channel operations
- Fix session rollback race in HardAbort by calling Finish() before rollback
- Fix resource leak by closing pendingResults channel in Finish() with panic recovery
- Add thread-safety documentation for childTurnIDs and isFinished fields
- Move globalTurnCounter to AgentLoop.subTurnCounter to prevent ID conflicts
- Improve semaphore acquisition to ensure release even on early validation failures
- Document design choice: ephemeral sessions start empty for complete isolation
- Add 5 new tests: hierarchy, deadlock, order, channel close, and semaphore
- Add initialHistoryLength field to turnState to snapshot session state at turn start
- Save initial history length in runAgentLoop when creating root turnState
- Implement session rollback in HardAbort via SetHistory, truncating to initial length
- Add TestHardAbortSessionRollback to verify history rollback after abort
- Import providers package in subturn_test.go for Message type
This ensures that when a user triggers hard abort, all messages added during
the aborted turn are discarded, restoring the session to its pre-turn state.
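A minimal sketch of the snapshot-and-truncate idea, assuming an in-memory session with `GetHistory`/`SetHistory` accessors. The `session` type and `rollback` helper are hypothetical; only `initialHistoryLength` and `SetHistory` come from the description above.

```go
package main

import "fmt"

// session is a hypothetical in-memory stand-in for the real session store.
type session struct{ history []string }

func (s *session) GetHistory() []string  { return s.history }
func (s *session) SetHistory(h []string) { s.history = h }

// turnState records how long the history was when the turn started.
type turnState struct{ initialHistoryLength int }

// rollback truncates the history back to the length recorded at turn start.
func rollback(s *session, ts *turnState) {
	h := s.GetHistory()
	if ts.initialHistoryLength < len(h) {
		s.SetHistory(h[:ts.initialHistoryLength])
	}
}

func main() {
	s := &session{history: []string{"user: hi", "assistant: hello"}}
	ts := &turnState{initialHistoryLength: len(s.GetHistory())}

	// Messages added during the turn that is later aborted.
	s.SetHistory(append(s.GetHistory(), "user: do X", "assistant: partial"))

	rollback(s, ts)
	fmt.Println(len(s.GetHistory()))
}
```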
- Add maxConcurrentSubTurns constant (5) and concurrencySem channel to turnState
- Acquire/release semaphore in spawnSubTurn to limit concurrent child turns per parent
- Add activeTurnStates sync.Map to AgentLoop for tracking root turn states by session
- Implement HardAbort(sessionKey) method to trigger cascading cancellation via turnState.Finish()
- Register/unregister root turnState in runAgentLoop for hard abort lookup
- Add TestSubTurnConcurrencySemaphore to verify semaphore capacity enforcement
- Add TestHardAbortCascading to verify context cancellation propagates to child turns
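The semaphore can be pictured as a buffered channel with capacity 5. The sketch below uses a non-blocking `tryAcquire` purely to make the capacity visible; whether the real spawnSubTurn blocks, times out, or fails fast is not stated here, and the helper names are assumptions.

```go
package main

import "fmt"

const maxConcurrentSubTurns = 5

// turnState holds the per-parent concurrency semaphore.
type turnState struct{ concurrencySem chan struct{} }

func newTurnState() *turnState {
	return &turnState{concurrencySem: make(chan struct{}, maxConcurrentSubTurns)}
}

// tryAcquire takes a slot if one is free and reports whether it succeeded.
func (ts *turnState) tryAcquire() bool {
	select {
	case ts.concurrencySem <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a previously acquired slot.
func (ts *turnState) release() { <-ts.concurrencySem }

func main() {
	ts := newTurnState()
	granted := 0
	for i := 0; i < maxConcurrentSubTurns+1; i++ {
		if ts.tryAcquire() {
			granted++
		}
	}
	fmt.Println("slots granted:", granted)
}
```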
- Add subTurnResults sync.Map to AgentLoop for per-session channel tracking
- Add register/unregister/dequeue methods in steering.go
- Poll SubTurn results in runLLMIteration at loop start and after each tool,
injecting results as [SubTurn Result] messages into parent conversation
- Initialize root turnState in runAgentLoop, propagate via context
(withTurnState/turnStateFromContext), call rootTS.Finish() on completion
- Wire Spawn Tool to spawnSubTurn via SetSpawner in registerSharedTools,
recovering parentTS from context for proper turn hierarchy
- Refactor subagent.go to use SetSpawner pattern
- Add TestSubTurnResultChannelRegistration and TestDequeuePendingSubTurnResults
- move chat controller, state, protocol, history, and websocket logic into a dedicated chat feature module
- improve chat reconnection, session hydration, and send gating based on actual websocket state
- preserve gateway status during transient SSE disconnects and update stop state immediately
- generate wss websocket URLs behind HTTPS proxies and add backend tests for forwarded proto handling
Document the semantic boundaries of context management as called for
in the agent-refactor README (suggested document split, item 5):
- context window region definitions and history budget formula
- ContextWindow vs MaxTokens distinction
- session history contents (no system prompt stored)
- Turn as the atomic compression unit (#1316)
- three compression paths and their ordering
- token estimation approach and its limitations
- interface boundaries between budget functions and BuildMessages
Also documents known gaps: summarization trigger not using the full
budget formula, heuristic-only token estimation, and reactive retry
not preserving media references.
Ref #1439
Session history only stores user/assistant/tool messages — the system
prompt is built dynamically by BuildMessages. Remove the incorrect
system message from TestAgentLoop_ContextExhaustionRetry test data
to match the real data model that forceCompression operates on.
When the entire history is a single Turn (one user message followed by
tool calls and responses, no subsequent user message), the only Turn
boundary is at index 0. Previously the fallback returned targetIndex,
which could land on a tool or assistant message — splitting the Turn.
Return 0 instead, so callers (forceCompression, summarizeSession) see
mid <= 0 and skip compression rather than cutting inside the Turn.
Two estimation bugs fixed:
1. Media tokens were added to the chars accumulator before the chars*2/5
conversion, resulting in 256*2/5=102 tokens per item instead of 256.
Fix: add media tokens directly to the final token count, bypassing
the character-based heuristic.
2. estimateMessageTokens counted both tc.Name and tc.Function.Name for
tool calls, but providers only send one (OpenAI-compat uses
function.name, Anthropic uses tc.Name). Fix: count tc.Function.Name
when Function is present, fall back to tc.Name only otherwise.
Also fix i18n hint text: "auto-detect" was misleading — the backend
uses a 4x max_tokens heuristic, not actual model detection.
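The first estimation bug is easy to reproduce in isolation. The sketch below contrasts the two orderings; the function names are illustrative, while the 256-token media overhead and the chars*2/5 heuristic are taken from the description.

```go
package main

import "fmt"

const mediaTokensPerItem = 256

// estimateBuggy mirrors the described bug: media tokens are added to the
// character accumulator and then scaled by the chars*2/5 heuristic.
func estimateBuggy(chars, mediaItems int) int {
	chars += mediaItems * mediaTokensPerItem
	return chars * 2 / 5
}

// estimateFixed scales only the characters and adds media tokens directly.
func estimateFixed(chars, mediaItems int) int {
	return chars*2/5 + mediaItems*mediaTokensPerItem
}

func main() {
	fmt.Println(estimateBuggy(0, 1)) // 256*2/5 = 102 with integer division
	fmt.Println(estimateFixed(0, 1)) // 256
}
```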
Introduce parseTurnBoundaries() which identifies each Turn start index
in the session history. A Turn is a complete "user input → LLM iterations
→ final response" cycle (as defined in the agent refactor design #1316).
findSafeBoundary now uses Turn boundaries instead of raw role-scanning,
making the intent explicit: "find the nearest Turn boundary."
forceCompression drops the oldest half of Turns (not arbitrary messages),
which is simpler and more intuitive. The Turn-based approach naturally
prevents splitting tool-call sequences since each Turn is atomic.
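A rough sketch of Turn-based compression, assuming each Turn starts at a user message (an assumption consistent with the single-Turn case described elsewhere in this log). The `message` type and `dropOldestHalf` are illustrative; only `parseTurnBoundaries` is a name from the change itself, and its real signature may differ.

```go
package main

import "fmt"

// message is a simplified stand-in for a session history entry.
type message struct{ Role, Content string }

// parseTurnBoundaries returns the index at which each Turn starts,
// treating every user message as the start of a new Turn.
func parseTurnBoundaries(history []message) []int {
	var starts []int
	for i, m := range history {
		if m.Role == "user" {
			starts = append(starts, i)
		}
	}
	return starts
}

// dropOldestHalf drops the oldest half of Turns, cutting only at a Turn
// boundary. With fewer than two Turns it returns the history unchanged.
func dropOldestHalf(history []message) []message {
	starts := parseTurnBoundaries(history)
	if len(starts) < 2 {
		return history
	}
	return history[starts[len(starts)/2]:]
}

func main() {
	history := []message{
		{Role: "user"}, {Role: "assistant"}, {Role: "tool"}, {Role: "assistant"},
		{Role: "user"}, {Role: "assistant"},
		{Role: "user"}, {Role: "assistant"},
	}
	fmt.Println(parseTurnBoundaries(history), len(dropOldestHalf(history)))
}
```

Because the cut index is always a Turn start, an assistant tool call and its tool results stay together on the same side of the cut.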
Add tests that reflect actual session data shape: history starts with
user messages (no system prompt), includes chained tool-call sequences,
reasoning content, and media items. Exercises the proactive budget check
path with BuildMessages-style assembled messages.
Add context_window to config.example.json, the web configuration page
(form model, input field, save handler), and i18n strings (en/zh).
The field is optional — leaving it empty falls back to the 4x max_tokens
heuristic.
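The fallback amounts to a one-line rule. A minimal sketch, with `effectiveContextWindow` as a hypothetical helper name and zero standing in for "left empty":

```go
package main

import "fmt"

// effectiveContextWindow returns the configured context window, or
// 4x max_tokens when the field was left unset (zero or negative).
func effectiveContextWindow(contextWindow, maxTokens int) int {
	if contextWindow > 0 {
		return contextWindow
	}
	return 4 * maxTokens
}

func main() {
	fmt.Println(effectiveContextWindow(0, 8192))
	fmt.Println(effectiveContextWindow(128000, 8192))
}
```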
estimateMessageTokens now counts ReasoningContent (extended thinking /
chain-of-thought) which can be substantial and is persisted in session
history. Media items get a fixed per-item overhead (256 tokens) since
actual cost depends on provider-specific image tokenization.
Session history (GetHistory) contains only user/assistant/tool messages.
The system prompt is built dynamically by BuildMessages and is never
stored in session. The previous code incorrectly treated history[0] as
a system prompt, skipping the first user message and appending a
compression note to it.
Fix: operate on the full history slice, and record the compression
note in the session summary (which BuildMessages already injects into
the system prompt) rather than modifying any history message.
Separate context_window from max_tokens — they serve different purposes
(input capacity vs output generation limit). The previous conflation caused
premature summarization or missed compression triggers.
Changes:
- Add context_window field to AgentDefaults config (default: 4x max_tokens)
- Extract boundary-safe truncation helpers (isSafeBoundary, findSafeBoundary)
into context_budget.go — pure functions with no AgentLoop dependency
- forceCompression: align split to safe boundary so tool-call sequences
(assistant+ToolCalls → tool results) are never torn apart
- summarizeSession: use findSafeBoundary instead of hardcoded keep-last-4
- estimateTokens: count ToolCalls arguments and ToolCallID metadata,
not just Content — fixes systematic undercounting in tool-heavy sessions
- Add proactive context budget check before LLM call in runAgentLoop,
preventing 400 context-length errors instead of reacting to them
- Add estimateToolDefsTokens for tool definition token cost
Closes #556, closes #665
Ref #1439
- Replace duplicate types (ToolResult/Session/Message) with real project types
- Implement ephemeralSessionStore satisfying session.SessionStore interface
- Connect runTurn to real AgentLoop via runAgentLoop + AgentInstance
- Fix subturn_test.go to match updated signatures and types
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
* feat(credential): add AES-GCM encryption, SecureStore, and onboard keygen
- pkg/credential: new package with AES-256-GCM enc:// credential format,
HKDF-SHA256 key derivation (passphrase + optional SSH key binding),
ErrPassphraseRequired / ErrDecryptionFailed sentinel errors,
and PassphraseProvider hook for runtime passphrase injection
- pkg/credential/store: lock-free SecureStore via atomic.Pointer[string];
passphrase never written to disk or os.Environ
- pkg/credential/keygen: ed25519 SSH key generation helper used by onboard
- pkg/config: replace os.Getenv(PassphraseEnvVar) with
credential.PassphraseProvider() at all three call sites so that
LoadConfig and SaveConfig use whatever passphrase source is active
- cmd/picoclaw/onboard: prompt for passphrase with echo-off, generate
picoclaw-specific SSH key, re-encrypt existing config on re-onboard
- docs/credential_encryption.md: design doc for the enc:// format
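For readers unfamiliar with `atomic.Pointer[string]`, a lock-free in-memory store along these lines might look like the sketch below. It is an approximation only: the real SecureStore may expose a different API, and the `Set`/`Get` method shapes are assumed.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// SecureStore keeps a passphrase in memory only, swapped atomically.
type SecureStore struct {
	p atomic.Pointer[string]
}

// Set replaces the stored value without taking a lock.
func (s *SecureStore) Set(v string) { s.p.Store(&v) }

// Get returns the stored value, or "" when nothing has been set.
func (s *SecureStore) Get() string {
	if p := s.p.Load(); p != nil {
		return *p
	}
	return ""
}

func main() {
	var store SecureStore
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			store.Set(fmt.Sprintf("passphrase-%d", n))
			_ = store.Get()
		}(i)
	}
	wg.Wait()
	fmt.Println(store.Get() != "")
}
```

Each `Set` stores a pointer to a fresh copy of the string, so concurrent readers see either the old or the new value, never a partial write.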
* fix(credential): address Copilot review comments on PR #1521
- credential.go: decouple ErrPassphraseRequired from env var name;
message is now 'enc:// passphrase required' since PassphraseProvider
may come from any source, not just os.Environ
- credential.go: Resolver resolves symlinks via EvalSymlinks before the
isWithinDir containment check, preventing symlink-based path traversal
for file:// credential references
- store.go: tighten comment to describe only what SecureStore guarantees
(in-memory only); remove claims about how callers transport the value
- store_test.go: replace the meaningless GetReturnsCopy test (Go strings
are immutable, equality across two calls proves nothing) with
TestSecureStore_ConcurrentSetGet that exercises atomic.Pointer under
10-goroutine concurrent Set/Get load
- config_test.go: update error-message assertion to match new sentinel text
- docs/credential_encryption.md: remove reference to non-existent
'picoclaw encrypt' subcommand; describe the onboard flow instead
* fix(config): encryptPlaintextAPIKeys: struct-based encryption, fail-fast, remove raw []byte
* fix(credential): require SSH private key for encryption/decryption, remove passphrase-only mode
* lint: fix credential keygen lint, fix test keygen
* onboard: make encryption opt-in via --enc flag
Encryption (passphrase prompt + SSH key generation) is now only
triggered when the user passes --enc to 'picoclaw onboard'.
Without the flag, onboard skips the credential-encryption setup and
writes a plain config + workspace templates directly.
- Add --enc BoolFlag in NewOnboardCommand()
- Pass encrypt bool into onboard()
- Guard passphrase prompt, SSH key generation, and related env-var
setup behind the encrypt branch
- Adjust 'Next steps' output so the passphrase reminder only appears
when --enc was used
* fix: Use secure defaults for Pico channel setup and stop leaking the token in the URL
* fix: Derive default allow_origins from the setup request's Origin header instead of hardcoding localhost ports
* Add support for azure openai provider
* Add checks for deployment model name
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Addressing @Copilot suggestion to remove the init() function which seemed redundant
* Fix readme
* Fix linting checks
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- centralize Pico chat connection and session state in a shared store
- move chat lifecycle control out of usePicoChat
- hydrate and restore the active session across the app
- add a dedicated /api/gateway/logs endpoint for incremental log polling
- keep /api/gateway/status focused on runtime and health data only
- update frontend log fetching to use the new API and add backend tests covering the status/logs separation and cleared-log behavior
* fix: safety guard incorrectly blocks commands with URLs
The absolutePathPattern regex was matching URL path components like
//github.com as file system paths, causing commands containing URLs
to be incorrectly blocked by the workspace restriction safety guard.
For example, 'agent-browser open https://github.com' would be blocked
because //github.com was treated as an absolute file path outside
the working directory.
The fix adds a check to skip any path match that starts with '//',
as these are URL path components, not file system paths.
Fixes #1203
* fix: handle file:// URIs correctly in safety guard
The previous fix skipped all paths starting with '//', which incorrectly
also skipped file:// URIs that could escape the workspace sandbox.
Changes:
- Only skip '//' paths when preceded by web URL schemes (http:, https:, ftp:, etc.)
- file:// URIs are now properly checked against workspace boundaries
- Added TestShellTool_FileURISandboxing to verify the fix
Fixes security issue raised by @alexhoshina in PR #1254
* style: fix gofumpt formatting
* fix(safety-guard): use exact match position to prevent URL exemption bypass
Using strings.Index(cmd, raw) always returned the first occurrence of the
matched substring, allowing a bypass where the same //path appeared both
inside a URL and as a standalone shell path (e.g. echo https://etc/passwd
&& cat //etc/passwd would skip the second match).
Switch to FindAllStringIndex so each match is evaluated at its actual
position in the command string.
Adds TestShellTool_URLBypassPrevented to cover the exploit scenario.
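The position-aware check can be sketched as below. The regular expression and scheme list are simplified stand-ins (the real absolutePathPattern is not shown in this log), and `suspiciousPaths` and `precededByWebScheme` are hypothetical helpers. The point is only that each match is judged at its own offset via `FindAllStringIndex`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// absolutePathPattern is a simplified, hypothetical stand-in for the real pattern.
var absolutePathPattern = regexp.MustCompile(`/[^\s"']+`)

// webSchemes lists the URL schemes whose "//host/path" part is exempt (assumed set).
var webSchemes = []string{"http:", "https:", "ftp:"}

// precededByWebScheme reports whether the text right before offset start
// ends with a web URL scheme.
func precededByWebScheme(cmd string, start int) bool {
	for _, s := range webSchemes {
		if strings.HasSuffix(cmd[:start], s) {
			return true
		}
	}
	return false
}

// suspiciousPaths returns path-like matches that are not part of a web URL,
// evaluating every match at its actual position in the command.
func suspiciousPaths(cmd string) []string {
	var out []string
	for _, loc := range absolutePathPattern.FindAllStringIndex(cmd, -1) {
		raw := cmd[loc[0]:loc[1]]
		if strings.HasPrefix(raw, "//") && precededByWebScheme(cmd, loc[0]) {
			continue // URL component, not a file system path
		}
		out = append(out, raw)
	}
	return out
}

func main() {
	fmt.Println(suspiciousPaths("echo https://etc/passwd && cat //etc/passwd"))
	fmt.Println(suspiciousPaths("cat file:///etc/passwd"))
}
```

In this sketch the second `//etc/passwd` is still reported even though an identical substring appears inside the URL, and the `file://` path is not exempted.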
- track boot and config default models in gateway status/events
- preserve running, starting, and restarting states during health checks
- add safer gateway restart handling with stronger backend test coverage
- expose restart-required UI and refresh model state after default model update
* make gateway aware of config.json change
* fix according to code review
* fix lint
* fix review comment
* fix for review
* refactor to fix review
* fix for review
* fix for review
* add model command to set default model
* fix for ci
* fix test for model
* fix active agent not recognized
* implement test for model command
* fix issue where local-model cannot be set as default
* fix review comment
* fix for comment
* feat: add anthropic-messages protocol support
Add native Anthropic Messages API format support to enable
compatibility with custom endpoints that only support Anthropic's
native message format (not OpenAI-compatible format).
Changes:
- Add new pkg/providers/anthropic_messages package with HTTP-based provider
- Implement Anthropic Messages API request/response format conversion
- Add anthropic-messages protocol support in factory_provider.go
- Include comprehensive unit tests (64.2% coverage)
Features:
- Support for system, user, assistant, and tool messages
- Support for tool calls (tool_use blocks)
- Proper header handling (x-api-key, anthropic-version)
- Configurable max_tokens and temperature
- Automatic base URL normalization
Configuration example:
model: "anthropic-messages/claude-opus-4-6"
api_base: "https://api.anthropic.com"
api_key: "sk-..."
Tested with actual API endpoint, verified compatibility
with Anthropic Messages API specification.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* docs: add anthropic-messages protocol examples to README and config
Add configuration examples and documentation for the new
anthropic-messages protocol:
- config.example.json: Add claude-opus-4.6 example with anthropic-messages
- README.md: Add "Anthropic Messages API (native format)" section
- README.zh.md: Add Chinese version of the documentation
This helps users understand when to use anthropic-messages vs
anthropic protocol and fixes issue #269.
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: format code with gofmt -s
- Align constant definitions in provider.go
- Align struct fields in test cases
- Fix gofmt formatting issues reported in review
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: address linter errors
- Fix HTTP header canonical form: "x-api-key" → "X-API-Key"
- Fix HTTP header canonical form: "anthropic-version" → "Anthropic-Version"
- Format imports with gci (standard, default, localmodule order)
- Format code with golines (max line length 120)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: resolve golangci-lint errors in anthropic-messages provider
- add nolint comment for canonicalheader rule on X-API-Key header (Anthropic API requires exact casing)
- fix golines formatting issues in provider_test.go (split long lines under 120 chars)
- fix long comment line in factory_provider.go (split into two lines)
Resolves CI linter failures for the anthropic-messages protocol implementation.
* fix(providers): address review comments in anthropic-messages provider
- fix normalizeBaseURL edge case that incorrectly appends /v1 to URLs already containing /v1 path (e.g., https://api.example.com/v1/proxy)
- remove dead code for apiBase empty check as normalizeBaseURL() always provides a default value
- update test to use proper constructor instead of direct struct initialization
- add detailed comments explaining the URL normalization logic
Resolves review comments on PR #1284
* fix(providers): remove hardcoded max_tokens in anthropic-messages provider
- remove hardcoded max_tokens value (4096) from buildRequestBody
- read max_tokens directly from options parameter
- add error handling when max_tokens is missing from options
- update test cases to include max_tokens in options
This fix ensures the provider respects the config default value (32768)
or system fallback (8192) instead of always using the hardcoded 4096.
* fix(providers): improve error handling and add edge case tests
- fix ToolCalls nil vs empty slice issue to ensure consistent JSON serialization
- add detailed HTTP error handling for common status codes (401, 429, 400, 404, 500, 503)
- add edge case tests for buildRequestBody and parseResponseBody
- clarify anthropic vs anthropic-messages protocol differences in docs
---------
Co-authored-by: Claude <noreply@anthropic.com>
When the claude CLI exits with a non-zero status, the previous error
handler only checked stderr. However, the CLI writes its output
(including error details) to stdout, especially when invoked with
--output-format json. This left the caller with only "exit status 1"
and no actionable information.
Now includes both stderr and stdout in the error message so the actual
failure reason is visible in logs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(line): limit webhook request body size to prevent DoS
Add io.LimitReader with 1 MB cap on the LINE webhook handler to prevent
unauthenticated memory exhaustion via oversized POST requests.
Follows the same pattern used in the WeCom channel (io.LimitReader).
Requests exceeding the limit are rejected with 413 Request Entity Too Large.
Fixes #1407
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor(line): hoist body size const, add boundary tests
- Move maxWebhookBodySize to package-level const
- Add TestWebhookAcceptsMaxBodySize (exact limit → 403, not 413)
- Add TestWebhookRejectsOversizedBodyBeforeSignatureCheck
- Use const in test instead of magic number
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Go's built-in mime.TypeByExtension returns 'image/svg' for .svg files,
but the correct MIME type per RFC 6838 is 'image/svg+xml'. This fix
registers the correct MIME type when setting up the static file server.
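Registering the type is a single standard-library call. The sketch below shows one way to do it before the file server is set up; `registerSVG` and `svgMIMEType` are hypothetical wrappers around `mime.AddExtensionType` and `mime.TypeByExtension`.

```go
package main

import (
	"fmt"
	"mime"
)

// registerSVG maps the .svg extension to image/svg+xml for this process.
func registerSVG() error {
	return mime.AddExtensionType(".svg", "image/svg+xml")
}

// svgMIMEType returns the type currently associated with .svg.
func svgMIMEType() string {
	return mime.TypeByExtension(".svg")
}

func main() {
	if err := registerSVG(); err != nil {
		panic(err)
	}
	fmt.Println(svgMIMEType())
}
```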
Fixes #1410
- Disable goreleaser GitHub release for nightly (Docker still pushed)
- Use GORELEASER_CURRENT_TAG with local-only tag for version/validation
- Force-update single `nightly` git tag instead of creating per-day tags
- Docker tags use only `nightly`/`nightly-launcher`, no per-day versions
- Set --latest=false on nightly release to avoid occupying latest
- Simplify workflow from 3 jobs to 1 job, remove all cleanup steps
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* docs: swap header logo to webp, move meme logo to bottom
Replace header logo with assets/logo.webp across all 6 README
language variants and move the original meme logo (logo.jpg)
to the bottom of each file.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: update GPT model names to gpt-5.4 and refine provider descriptions
Update all 6 language README variants:
- Correct GPT model references from gpt-5.2/gpt4 to gpt-5.4
- Refine provider descriptions in API Key comparison tables
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: update default model to gpt-5.4, codex to gpt-5.3-codex
Update OpenAI default model references from gpt-5.2 to gpt-5.4
across source code, config examples, tests, and docs. Set Codex
default model to gpt-5.3-codex.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
TOOLS.md was intentionally removed in 21d60f6 and #771, as tools are
now provided to the LLM via JSON schema through ToProviderDefs().
These references were missed during that cleanup.
Suggested by @yinwm in #1355.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@@ -35,6 +35,8 @@ We are committed to maintaining a welcoming and respectful community. Be kind, c
For substantial new features, please open an issue first to discuss the design before writing code. This prevents wasted effort and ensures alignment with the project's direction.
For documentation contributions, prefer the layout and naming conventions in [`docs/README.md`](docs/README.md). Run `make lint-docs` after adding or moving Markdown files to catch common consistency issues early.
---
## Getting Started
@@ -64,26 +66,30 @@ For substantial new features, please open an issue first to discuss the design b
```bash
make build # Build binary (runs go generate first)
make generate # Run go generate only
make check # Full pre-commit check: deps + fmt + vet + test
make check # Full pre-commit check: deps + fmt + vet + test + docs consistency checks
```
### Running Tests
```bash
make test # Run all tests
make integration-test # Run Docker-backed integration suites
go test -run TestName -v ./pkg/session/ # Run a single test
go test -bench=. -benchmem -run='^$' ./... # Run benchmarks
```
Docker-backed integration suites are auto-discovered from [`integration/suites/`](integration/suites/). See [`integration/README.md`](integration/README.md) for the suite layout and the conventions used by CI.
### Code Style
```bash
make fmt # Format code
make vet # Static analysis
make lint # Full linter run
make lint-docs # Check common documentation layout and naming conventions
```
All CI checks must pass before a PR can be merged. Run `make check` locally before pushing to catch issues early.
All CI checks must pass before a PR can be merged. Run `make check` locally before pushing to catch issues early, including the common docs consistency checks from `make lint-docs`.
---
@@ -108,7 +114,7 @@ Use descriptive branch names, e.g. `fix/telegram-timeout`, `feat/ollama-provider
- Reference the related issue when relevant: `Fix session leak (#123)`.
- Keep commits focused. One logical change per commit is preferred.
- For minor cleanups or typo fixes, squash them into a single commit before opening a PR.
const answerSystemPrompt = `You are a helpful assistant. Given conversation context, answer the question concisely and accurately. If the answer is not in the context, say "I don't know". Answer in 1-3 sentences maximum.`
const judgeSystemPrompt = `You are an impartial judge evaluating answer quality.
Compare the candidate answer against the reference answer.
Consider semantic equivalence — different wording expressing the same meaning should score high.
Output ONLY a single integer score from 1 to 5:
1 = completely wrong or irrelevant
2 = partially related but mostly incorrect
3 = partially correct, missing key details
4 = mostly correct with minor omissions
5 = fully correct, semantically equivalent
Output ONLY the number, nothing else.`
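The judge prompt above asks the model to reply with a single integer from 1 to 5, so the caller has to parse and validate that reply before using it. The following is a minimal sketch of such a check. `parseJudgeScore` is a hypothetical helper introduced here for illustration; it is not taken from the PR, and the actual code may handle malformed replies differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseJudgeScore is a hypothetical helper (not from the PR). It validates a
// judge reply that is expected to contain only an integer from 1 to 5.
func parseJudgeScore(reply string) (int, error) {
	trimmed := strings.TrimSpace(reply)
	score, err := strconv.Atoi(trimmed)
	if err != nil {
		return 0, fmt.Errorf("judge reply %q is not an integer: %w", trimmed, err)
	}
	if score < 1 || score > 5 {
		return 0, fmt.Errorf("judge score %d is outside the 1-5 range", score)
	}
	return score, nil
}

func main() {
	// Two well-formed replies followed by two that should be rejected.
	for _, reply := range []string{"4", " 5\n", "7", "good"} {
		score, err := parseJudgeScore(reply)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("score:", score)
	}
}
```

Rejecting out-of-range or non-numeric replies, rather than clamping them, keeps a misbehaving judge model from silently skewing the aggregate score.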
// generateAnswer asks the LLM to answer a question given retrieved context.
evalCmd.Flags().StringVar(&flagOut, "out", "./bench-out", "output working directory")
evalCmd.Flags().StringVar(&flagMode, "mode", "all", "modes to evaluate: legacy, seahorse, or all")
evalCmd.Flags().IntVar(&flagBudget, "budget", 4000, "token budget for retrieval")
evalCmd.Flags().
	StringVar(&flagEvalMode, "eval-mode", "token", "evaluation mode: token (direct match) or llm (LLM-as-Judge)")
evalCmd.Flags().
	StringVar(&flagAPIBase, "api-base", "", "API base URL with version path, e.g. http://host/v1 (default: http://127.0.0.1:8080/v1, env: MEMBENCH_API_BASE)")
evalCmd.Flags().StringVar(&flagAPIKey, "api-key", "", "API key for the LLM endpoint (env: MEMBENCH_API_KEY)")
evalCmd.Flags().StringVar(&flagModel, "model", "", "model name for LLM eval (env: MEMBENCH_MODEL)")
evalCmd.Flags().
	BoolVar(&flagNoThinking, "no-thinking", false, "disable thinking mode via chat_template_kwargs (llama.cpp + Qwen)")
evalCmd.Flags().IntVar(&flagLimit, "limit", 0, "max QA questions per sample (0 = all)")
evalCmd.Flags().IntVar(&flagTimeout, "timeout", 120, "HTTP timeout in seconds for LLM requests")
evalCmd.Flags().IntVar(&flagRetries, "retries", 3, "max retry attempts for transient LLM errors (timeout/5xx/429)")
evalCmd.Flags().StringVar(&flagJudgeModel, "judge-model", "", "model for judge scoring (defaults to --model)")
evalCmd.Flags().
	StringVar(&flagJudgeAPIBase, "judge-api-base", "", "API base URL for judge model (defaults to --api-base)")
evalCmd.Flags().StringVar(&flagJudgeAPIKey, "judge-api-key", "", "API key for judge model (defaults to --api-key)")
evalCmd.Flags().IntVar(&flagConcurrency, "concurrency", 1, "number of concurrent QA evaluations")
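The help strings above describe a fallback chain: `--api-base` falls back to `MEMBENCH_API_BASE` and then to `http://127.0.0.1:8080/v1`, while each `--judge-*` flag falls back to its non-judge counterpart. A minimal sketch of that resolution follows. The helper names `firstNonEmpty` and `resolveAPIBase` are hypothetical, and the precedence of the flag over the environment variable is an assumption, since the diff shown here does not include the resolution code.

```go
package main

import (
	"fmt"
	"os"
)

// defaultAPIBase is the default quoted in the --api-base help text.
const defaultAPIBase = "http://127.0.0.1:8080/v1"

// firstNonEmpty returns the first non-empty value, or "" if all are empty.
// Hypothetical helper, not taken from the PR.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if v != "" {
			return v
		}
	}
	return ""
}

// resolveAPIBase applies an assumed precedence: flag value, then environment
// value, then the documented default.
func resolveAPIBase(flagValue, envValue string) string {
	return firstNonEmpty(flagValue, envValue, defaultAPIBase)
}

func main() {
	// An empty flag value stands for "the flag was not set".
	apiBase := resolveAPIBase("", os.Getenv("MEMBENCH_API_BASE"))

	// The judge endpoint reuses the main endpoint unless overridden.
	judgeAPIBase := firstNonEmpty("", apiBase)

	fmt.Println("api base:", apiBase)
	fmt.Println("judge api base:", judgeAPIBase)
}
```

Passing the environment value in as an argument, instead of reading it inside the helper, keeps the resolution logic a pure function that is easy to unit test.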
reportCmd := &cobra.Command{
	Use:   "report",
	Short: "Output comparison results from evaluation",
	RunE:  runReport,
}
reportCmd.Flags().StringVar(&flagOut, "out", "./bench-out", "output working directory")
runCmd := &cobra.Command{
	Use:   "run",
	Short: "Convenience: eval + report (ingestion is done inline)",
runCmd.Flags().StringVar(&flagOut, "out", "./bench-out", "output working directory")
runCmd.Flags().StringVar(&flagMode, "mode", "all", "modes to run: legacy, seahorse, or all")
runCmd.Flags().IntVar(&flagBudget, "budget", 4000, "token budget for retrieval")
runCmd.Flags().
	StringVar(&flagEvalMode, "eval-mode", "token", "evaluation mode: token (direct match) or llm (LLM-as-Judge)")
runCmd.Flags().
	StringVar(&flagAPIBase, "api-base", "", "API base URL with version path, e.g. http://host/v1 (default: http://127.0.0.1:8080/v1, env: MEMBENCH_API_BASE)")
runCmd.Flags().StringVar(&flagAPIKey, "api-key", "", "API key for the LLM endpoint (env: MEMBENCH_API_KEY)")
runCmd.Flags().StringVar(&flagModel, "model", "", "model name for LLM eval (env: MEMBENCH_MODEL)")
runCmd.Flags().
	BoolVar(&flagNoThinking, "no-thinking", false, "disable thinking mode via chat_template_kwargs (llama.cpp + Qwen)")
runCmd.Flags().IntVar(&flagLimit, "limit", 0, "max QA questions per sample (0 = all)")
runCmd.Flags().IntVar(&flagTimeout, "timeout", 120, "HTTP timeout in seconds for LLM requests")
runCmd.Flags().IntVar(&flagRetries, "retries", 3, "max retry attempts for transient LLM errors (timeout/5xx/429)")
runCmd.Flags().StringVar(&flagJudgeModel, "judge-model", "", "model for judge scoring (defaults to --model)")
runCmd.Flags().
	StringVar(&flagJudgeAPIBase, "judge-api-base", "", "API base URL for judge model (defaults to --api-base)")
runCmd.Flags().StringVar(&flagJudgeAPIKey, "judge-api-key", "", "API key for judge model (defaults to --api-key)")
runCmd.Flags().IntVar(&flagConcurrency, "concurrency", 1, "number of concurrent QA evaluations")
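The `--retries` help text limits retries to transient LLM errors (timeout, 5xx, 429). One way to express that classification is sketched below. The `statusError` type and the `isTransient` and `withRetries` helpers are hypothetical and not taken from the PR. The sketch also assumes that the flag value counts total attempts, and it omits backoff between attempts for brevity.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
)

// statusError is a hypothetical error type carrying an HTTP status code.
type statusError struct{ code int }

func (e *statusError) Error() string {
	return fmt.Sprintf("unexpected HTTP status %d", e.code)
}

// isTransient reports whether err looks retryable: a deadline timeout,
// HTTP 429, or any 5xx status.
func isTransient(err error) bool {
	if errors.Is(err, context.DeadlineExceeded) {
		return true
	}
	var se *statusError
	if errors.As(err, &se) {
		return se.code == http.StatusTooManyRequests || se.code >= 500
	}
	return false
}

// withRetries calls fn until it succeeds, returns a non-transient error, or
// maxAttempts calls have been made. It returns the number of calls made.
func withRetries(maxAttempts int, fn func() error) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1 // always try at least once
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = fn()
		if err == nil || !isTransient(err) {
			return attempt, err
		}
	}
	return maxAttempts, err
}

func main() {
	calls := 0
	attempts, err := withRetries(3, func() error {
		calls++
		if calls < 3 {
			return &statusError{code: http.StatusServiceUnavailable}
		}
		return nil
	})
	fmt.Println("attempts:", attempts, "err:", err)
}
```

Non-transient failures such as a 400 response return immediately, so a malformed request does not consume the retry budget.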
require.NotNil(t, encFlag, "expected --enc flag to be registered")
assert.Equal(t, "false", encFlag.DefValue, "--enc should default to false")
assert.False(t, cmd.HasSubCommands())
}