Add a warning log when the type assertion from sync.Map.LoadAndDelete fails in UnsubscribeEvents, per review suggestion. This makes a mismatched type observable for debugging.
Add 3 tests covering scenarios that previously panicked: 1) missing enabled key in settings 2) enabled field with non-bool type 3) teams_webhook with webhooks using map[string]any from JSON unmarshal
sync.Map.LoadAndDelete returns any; unprotected type assertion could panic if an unexpected type were stored. Add ok check to safely handle mismatched types.
Two type assertions in toChannelHashes could panic when channel config values had unexpected types from JSON unmarshal: 1) value[enabled].(bool) panics if the key is missing or not a bool 2) vv.(map[string]string) panics when JSON unmarshal produces map[string]any. Add ok checks to safely handle both cases.
When an incoming group message is received, the inbound context ChatID was set to the raw group number without the group: prefix. This caused the outbound reply to use send_private_msg instead of send_group_msg. Fix by using the prefixed chatID as inbound context ChatID. Closes#3002
The SDK renamed ReceiveIdTypeChatId to CreateMessageV1ReceiveIDTypeChatId
in v3.9.4. Update all 5 usages in feishu_64.go and bump the dependency
version.
This fixes the build failure for Dependabot PR #3005.
isProcessRunning() previously only checked whether a PID existed via signal(0)/OpenProcess, without confirming the process was actually picoclaw. When the PID was reused by an unrelated process (e.g., systemd-resolved after a kill -9), the gateway would refuse to start with 'already running'.
Add isPicoclawProcess() that verifies the process name matches picoclaw:
- Unix: reads /proc/<pid>/comm
- Windows: calls QueryFullProcessImageNameW
If the running process is not picoclaw, treat the PID file as stale and proceed with normal startup. Falls back to trusting the liveness check when identity verification is unavailable (e.g., /proc unreadable, API call fails).
Fixes#2720.
The workspace guard's absolutePathPattern regex matches /Beijing?T in commands like 'curl wttr.in/Beijing'. Since 'wttr.in' is not a recognized web scheme, the path was routed through workspace sandbox validation, which could block legitimate scheme-less URL usage (curl allows bare domains without http://).
Add detection for domain-like tokens preceding /path matches:
- looksLikeDomain: checks for dot-separated tokens that don't end with common file extensions (.py, .go, .exe, etc.)
- localPathExists: verifies the token does not exist as a local filesystem entry
This dual guard prevents the symlink bypass identified in PR #2965 review: if 'foo.bar' exists as a local symlink or directory, the path still undergoes full workspace validation.
Fixes#1042.
go env GOVERSION may return values like go1.25.10 X:nodwarf5 with an embedded space on some toolchain configurations, breaking -ldflags. Use firstword to extract only the first token. Fixes#2976.
Replace 7 instances of ignored json.Marshal errors with proper error handling. Previously, if marshaling an ExecResponse failed, a nil byte slice would be silently converted to an empty string in the LLM response. Now each site returns ErrorResult with the marshal error message.
The PromoteAliasHistory method previously promoted the first non-empty alias session into a new canonical session. When a user upgraded, the migrated main session contained old messages that were copied into every new Web UI session because agent:main:main is always the first alias.
Add isMainSessionAlias() to detect and skip the main session alias during promotion. Fixes#2972.
The Stop() method previously used a select/default pattern which was not
safe under concurrent calls — two goroutines could both pass the check
and attempt to close the same channel, causing a panic.
Replace with sync.Once to guarantee exactly-once close semantics,
matching the documented contract of being safe for concurrent use.
Review feedback: afjcjsbx
Previously, only timeout and network errors (matched via string
patterns) were retried. HTTP 500 server errors from
OpenRouter/OpenAI-compatible providers would fail the agent turn
immediately when no model fallback candidate was available.
This commit replaces the separate timeout/network retry branches
with a unified transientLLMRetryReason() helper that:
1. Uses providers.ClassifyError() to detect server_error (HTTP >=500),
timeout, network, and rate_limit errors
2. Falls back to the existing string-based detection for errors
not classified by the provider
A regression test (TestPipeline_CallLLM_HTTP5xxRetry) verifies that
HTTP 500 errors are retried and recover successfully.
This is a clean rebase of the approach originally proposed in #2768
by afjcjsbx.
This fixes issue #2943 where WeChat channel image requests to Zhipu
GLM-5-Turbo vision API failed with error code 1210 (parameter error)
without triggering the fallback mechanism.
Changes:
- Added error code 1210 pattern matching to formatPatterns
- This allows the fallback mechanism to recognize Zhipu API parameter
errors and fall back to alternative vision models
Closes#2943
The SessionManager's background cleanup goroutine previously had no
shutdown mechanism. Each call to NewSessionManager() started a ticker
goroutine that ran indefinitely. In tests, where multiple
SessionManagers are created, this caused goroutine leaks.
This commit adds a Stop() method that cleanly shuts down the background
cleanup goroutine via a channel. Stop() is safe to call multiple times.
All existing tests now call t.Cleanup(sm.Stop) to ensure cleanup.