The reviewer identified two bugs in the original PR:
1. PATCH /api/config leaves session.dimensions stale: LoadConfig()
derives dimensions from the old dm_scope, and the merge carries
those stale dimensions forward. ApplyDmScope() then exits early
because dimensions is already populated, causing a mismatch between
dm_scope (new) and dimensions (old).
2. Legacy/default configs omit dm_scope in GET response: configs with
explicit dimensions but no dm_scope (including DefaultConfig) return
no dm_scope field, causing the frontend to fall back to its default
('per-channel-peer'), which may not match the actual dimensions.
Fix:
- Add DeriveDmScope() to reverse-map known dimensions arrays to
dm_scope when dm_scope is empty.
- Call it in LoadConfig(), PUT handler, PATCH handler, and
ResetToDefaults() for consistent normalization.
- In PATCH handler, clear stale dimensions from the merge result when
the patch contains session.dm_scope but not session.dimensions,
allowing ApplyDmScope() to re-derive from the new scope.
- Add comprehensive unit tests for DeriveDmScope() and scope
transition scenarios.
The dm_scope field was stored in config but never translated into the
dimensions array that the routing layer actually consumes. This meant
changing the session isolation scope in the UI had no effect at runtime.
Add ApplyDmScope() to SessionConfig which maps the user-facing dm_scope
values (per-channel-peer, per-channel, per-peer, global) to the
corresponding dimension arrays. Call it in LoadConfig post-processing
and in both the PATCH and PUT API handlers.
Includes table-driven tests covering all dm_scope values and the
precedence rule (explicit dimensions > derived from dm_scope).
The frontend sends dm_scope as part of the session config, but the
backend SessionConfig struct lacked the corresponding field. Go's
encoding/json silently discards unknown fields, so the value was lost
on every PATCH request. Additionally, MarshalJSON only emitted the
session block when Dimensions or IdentityLinks were set, so even a
stored dm_scope would not appear in GET responses.
- Add DmScope string field with json tag 'dm_scope' to SessionConfig
- Update MarshalJSON condition to include session when DmScope is set
Replace raw log.Printf and fmt.Printf calls in pkg/state, pkg/agent, and pkg/tools with structured logger calls (WarnCF/InfoCF). This ensures warnings and info messages are routed through the configured logging infrastructure instead of raw stderr/stdout.
errutil.go: Change %v to %w in ClassifySendError and ClassifyNetError so callers can use errors.Is/errors.As on the underlying HTTP/network error.
isolated_command_transport.go: Change %v to %w in Close() and Write() error paths for the same reason.
When os.Getwd fails, wd is empty and builtinSkillsDir resolves to relative path, causing confusing downstream errors. Fall back to config.GetHome on error.
Add a warning log when the type assertion from sync.Map.LoadAndDelete fails in UnsubscribeEvents, per review suggestion. This makes a mismatched type observable for debugging.
Add 3 tests covering scenarios that previously panicked: 1) missing enabled key in settings 2) enabled field with non-bool type 3) teams_webhook with webhooks using map[string]any from JSON unmarshal
Address remaining review feedback: 1) Add HistoryTokens field to ContextUsage/ContextStats, showing history-only token count in /context and frontend UI alongside SummarizeAtTokens so users can see the actual summarization trigger comparison. 2) Remove .codebuddy/github-contribute/ state files accidentally included in the PR.
sync.Map.LoadAndDelete returns any; unprotected type assertion could panic if an unexpected type were stored. Add ok check to safely handle mismatched types.
Two type assertions in toChannelHashes could panic when channel config values had unexpected types from JSON unmarshal: 1) value[enabled].(bool) panics if the key is missing or not a bool 2) vv.(map[string]string) panics when JSON unmarshal produces map[string]any. Add ok checks to safely handle both cases.
When an incoming group message is received, the inbound context ChatID was set to the raw group number without the group: prefix. This caused the outbound reply to use send_private_msg instead of send_group_msg. Fix by using the prefixed chatID as inbound context ChatID. Closes#3002
The SDK renamed ReceiveIdTypeChatId to CreateMessageV1ReceiveIDTypeChatId
in v3.9.4. Update all 5 usages in feishu_64.go and bump the dependency
version.
This fixes the build failure for Dependabot PR #3005.
isProcessRunning() previously only checked whether a PID existed via signal(0)/OpenProcess, without confirming the process was actually picoclaw. When the PID was reused by an unrelated process (e.g., systemd-resolved after a kill -9), the gateway would refuse to start with 'already running'.
Add isPicoclawProcess() that verifies the process name matches picoclaw:
- Unix: reads /proc/<pid>/comm
- Windows: calls QueryFullProcessImageNameW
If the running process is not picoclaw, treat the PID file as stale and proceed with normal startup. Falls back to trusting the liveness check when identity verification is unavailable (e.g., /proc unreadable, API call fails).
Fixes#2720.
The workspace guard's absolutePathPattern regex matches /Beijing?T in commands like 'curl wttr.in/Beijing'. Since 'wttr.in' is not a recognized web scheme, the path was routed through workspace sandbox validation, which could block legitimate scheme-less URL usage (curl allows bare domains without http://).
Add detection for domain-like tokens preceding /path matches:
- looksLikeDomain: checks for dot-separated tokens that don't end with common file extensions (.py, .go, .exe, etc.)
- localPathExists: verifies the token does not exist as a local filesystem entry
This dual guard prevents the symlink bypass identified in PR #2965 review: if 'foo.bar' exists as a local symlink or directory, the path still undergoes full workspace validation.
Fixes#1042.
Replace 7 instances of ignored json.Marshal errors with proper error handling. Previously, if marshaling an ExecResponse failed, a nil byte slice would be silently converted to an empty string in the LLM response. Now each site returns ErrorResult with the marshal error message.
The /context command previously showed only the hard budget compression
threshold (contextWindow - maxTokens), which confused users who expected
to see the soft summarization trigger from summarize_token_percent.
This commit adds SummarizeAtTokens alongside the existing CompressAtTokens
so that both thresholds are visible:
- Compress at: contextWindow - maxTokens (hard budget, triggers proactive
compression when exceeded)
- Summarize at: contextWindow * summarizeTokenPercent / 100 (soft trigger,
matches maybeSummarize's threshold)
The fix updates the /context command output, the Web UI popover, and the
pico channel WebSocket payload.
Fixes#2968
The PromoteAliasHistory method previously promoted the first non-empty alias session into a new canonical session. When a user upgraded, the migrated main session contained old messages that were copied into every new Web UI session because agent:main:main is always the first alias.
Add isMainSessionAlias() to detect and skip the main session alias during promotion. Fixes#2972.
The Stop() method previously used a select/default pattern which was not
safe under concurrent calls — two goroutines could both pass the check
and attempt to close the same channel, causing a panic.
Replace with sync.Once to guarantee exactly-once close semantics,
matching the documented contract of being safe for concurrent use.
Review feedback: afjcjsbx
Previously, only timeout and network errors (matched via string
patterns) were retried. HTTP 500 server errors from
OpenRouter/OpenAI-compatible providers would fail the agent turn
immediately when no model fallback candidate was available.
This commit replaces the separate timeout/network retry branches
with a unified transientLLMRetryReason() helper that:
1. Uses providers.ClassifyError() to detect server_error (HTTP >=500),
timeout, network, and rate_limit errors
2. Falls back to the existing string-based detection for errors
not classified by the provider
A regression test (TestPipeline_CallLLM_HTTP5xxRetry) verifies that
HTTP 500 errors are retried and recover successfully.
This is a clean rebase of the approach originally proposed in #2768
by afjcjsbx.