The previous dedupe map rotation logic completely cleared the map when it reached max size, causing an 'amnesia cliff' where immediately arriving duplicates of just-forgotten messages would be processed.
This change replaces that with a MessageDeduplicator struct that uses a circular queue (ring buffer) to track insertions. When the limit is reached, it only evicts the absolute oldest message from the map, completely resolving the cliff issue.
This also cleans up the WeCom Bot and App webhook handlers by encapsulating the mutex and map state.
When the dedupe map rotates, the previous logic entirely cleared the map, meaning the message that triggered the rotation was immediately forgotten and could be duplicated immediately.
This change seeds the new map with the current message to prevent that. Also adds a defensive nil check.
Match rotation semantics to prior behavior by fully resetting the dedupe map
once the size limit is exceeded, and add focused tests for duplicate detection
and boundary rotation behavior.
Centralize dedupe map access behind a mutex-safe helper and use it in both
WeCom bot and WeCom app channels to eliminate concurrent map access races while
preserving current dedupe behavior.
- Introduced WeCom AIBot channel configuration in config.go with relevant fields.
- Implemented WeCom AIBot channel factory registration in init.go.
- Created unit tests for WeCom AIBot channel functionalities including initialization, start/stop behavior, webhook path handling, message encryption/decryption, and signature generation.
- Set default values for WeCom AIBot configuration in defaults.go.
* fix(tools): allow /dev/null redirection and add read/write sandbox split
- Remove deny pattern that incorrectly blocked redirects to /dev/null
- Expand block device write pattern to cover nvme, mmcblk, vd, xvd,
hd, loop, dm-, md, sr and nbd in addition to sd
- Add safe path whitelist for kernel pseudo-devices so workspace path
check does not reject /dev/null, /dev/zero, /dev/random, /dev/urandom,
/dev/stdin, /dev/stdout and /dev/stderr
- Add allow_read_outside_workspace config option (default true) so file
read and list tools are unrestricted while write tools stay sandboxed
Closes https://github.com/sipeed/picoclaw/issues/964
Closes https://github.com/sipeed/picoclaw/issues/965
Signed-off-by: Huang Rui <vowstar@gmail.com>
* feat(tools): add configurable allow patterns and path whitelists
- Add custom_allow_patterns to exec config so users can exempt specific
commands from deny pattern checks
- Add allow_read_paths and allow_write_paths regex lists to tools config
for whitelisting specific paths outside the workspace
- Introduce whitelistFs that wraps sandboxFs and falls through to hostFs
for paths matching whitelist patterns
- Use variadic constructor signatures to keep backward compatibility
Suggested-by: lxowalle
Signed-off-by: Huang Rui <vowstar@gmail.com>
---------
Signed-off-by: Huang Rui <vowstar@gmail.com>
* fix: return fetched content to LLM in web_fetch tool
WebFetchTool.Execute was setting ForLLM to a summary string
("Fetched N bytes from URL ...") instead of the actual extracted
text. This meant the LLM never saw the page content and could not
answer questions based on fetched web pages.
Return the extracted text in ForLLM so the model can use it.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: put full JSON result in ForLLM, summary in ForUser
Accept suggestion from afjcjsbx: the LLM should receive the full JSON
result (including extracted text) while the user sees a short summary.
Update tests to match the new field assignment.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(config): Add support for env var configuration
This commit introduces support for two environment variables,
allowing users to override the default paths for picoclaw's home
directory and configuration file.
- `PICOCLAW_CONFIG`: Directly specifies the path to the `config.json` file.
This is initialised first, takes precedence over the hardcoded path, and is ideal
for containerized deployments or custom config management.
- `PICOCLAW_HOME`: Overrides the root directory for all picoclaw data, (except the config)
(e.g., `~/.picoclaw`). This is useful for portable installations or placing
data in non-standard locations.
This change provides greater flexibility for running picoclaw in various environments without
being tied to the default home directory structure.
* `README.md` updated explain PICOCLAW_CONFIG and PICOCLAW_HOME
* docs: translate environment variables section to multiple languages
---------
Co-authored-by: picoclaw <picoclaw@sipeed.com>
- Remove Feishu from webhook channel list in README.md and README.zh.md;
add clarifying note that Feishu uses WebSocket/SDK mode instead
- Replace Chinese text in README.vi.md header with Vietnamese equivalent
- Translate mixed-language WeCom note in README.vi.md to full Vietnamese
- Mark webhook_path as optional (否) in docs/channels/line/README.zh.md
- Remove incorrect yaml struct tags from new-channel example in
pkg/channels/README.md and README.zh.md (config uses json tags only)
- Fix multi-mode initChannel example to use whatsapp/whatsapp_native
(matching the "WhatsApp Bridge vs Native" comment) instead of matrix
- Correct ReasoningChannelID description: list the 12 channels that
have the field and note that PicoConfig does not expose it
* fix(pkg/providers):do regex precompile insteadd on the fly
* fix(providers): replace HTTP-specific regex with standalone status code matcher
The precompiled HTTP regex used uppercase "HTTP" which never matched
because ClassifyError lowercases the input. Replace it with a
case-insensitive word-boundary pattern that matches any standalone
3-digit status code (300-599), which also subsumes the HTTP/x.x case.
Add test case for standalone status code extraction.
* fix(providers): restore http regex and add standalone status code matcher
Restore the http-prefixed regex (without unnecessary (?i) flag since
input is already lowercased by ClassifyError) as a mid-priority pattern
to reduce false positives. Add a standalone word-boundary matcher as a
fallback for bare status codes like "429". Fix test to use lowercased
input matching the actual calling convention.
* perf(tools): move path regex compilation from per-call to package init
The path regex in guardCommand was compiled on every call. Hoist it
to a package-level var (absolutePathPattern) alongside defaultDenyPatterns
in a single var block, so it is compiled once at init time.
* style(tools): move inline comment to fix golines formatting error
- Fix ignored error from SendAndWait call
- Improve error message for unimplemented stdio mode with helpful guidance
- Add TODO comment with reference link for future stdio implementation