Commit Graph

1267 Commits

Author SHA1 Message Date
RussellLuo 4d2b244522 refactor(voice): share audio format support and restrict transcriber selection 2026-03-22 23:40:13 +08:00
RussellLuo 92678d1700 docs(voice): Update docs for audio-transcription 2026-03-22 21:04:10 +08:00
RussellLuo 8ad4b9b497 feat(voice): add audio-model transcription support
- Add `AudioModelTranscriber` for model-based audio transcription via LLM providers
- Support selecting a transcription model with `voice.model_name` in config
- Keep Groq transcription as a fallback and move it into dedicated files with focused tests
- Serialize `data:audio/...` media as input_audio for OpenAI-compatible providers
- Improve transcription logging by rendering error fields as strings
- Add coverage for transcriber detection, audio-model behavior, provider audio serialization, and Groq transcription

Fixes #1890.
2026-03-22 20:07:22 +08:00
Hua Audio dd82794255 Feat/weixin openclaw port (#1873)
* init

* fix lint

* fix go test

* update docs

* incorporate pr review

* Update pkg/channels/weixin/weixin.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* feat(weixin): add media sync and typing support

* test(weixin): cover media and sync helpers

---------

Co-authored-by: zhangmikoto <i@electromaster.me>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Hoshina <hoshina@evaz.org>
2026-03-22 14:23:39 +08:00
daming大铭 931eee92a0 Merge pull request #1853 from kunalk16/feat-configurable-logger
feat(logging): add configurability for log levels preference
2026-03-22 13:27:06 +08:00
Caize Wu 9107740781 Merge pull request #1857 from lc6464/main
docs: clean up README and update QRCode
2026-03-22 11:12:30 +08:00
Mauro c0bb8d6df9 Merge pull request #1617 from yzxlr/codex/fix-1561-heartbeat-template-idle
fix(heartbeat): ignore untouched default template
2026-03-21 20:18:39 +01:00
Mauro e6ea9c4ff3 Merge pull request #1855 from badgerbees/fix/telegram-group-id-validation
fix(identity): support negative integers in isNumeric for Telegram group IDs
2026-03-21 20:16:49 +01:00
Mauro 7a47d7a55c Merge pull request #1782 from biisal/chore/docker-data-in-gitignore
chore: Ignore the `docker/data` directory.
2026-03-21 19:52:25 +01:00
Mauro 5286464bfc Merge pull request #1861 from amirmamaghani/feat/agent-browser-skill-heavy-dockerfile
feat: add agent-browser skill and Dockerfile.heavy
2026-03-21 19:47:07 +01:00
daming大铭 3cd674e3b8 Merge pull request #1865 from sipeed/revert-1752-feat/exec-tool-enhancement
Revert "feat(tools): add exec tool enhancement with background execution and PTY support"
2026-03-22 00:46:56 +08:00
daming大铭 ebcd5645f1 Revert "feat(tools): add exec tool enhancement with background execution and …"
This reverts commit f901af8cbc.
2026-03-22 00:39:47 +08:00
Liu Yuan f901af8cbc feat(tools): add exec tool enhancement with background execution and PTY support (#1752)
- Unified exec tool with actions: run/list/poll/read/write/send-keys/kill
- PTY support using creack/pty library
- Process session management with background execution
- Process group kill for cleaning up child processes
- Session cleanup: 30-minute TTL for old sessions
- Output buffer: 100MB limit with truncation

Actions:
- run: execute command (sync or background)
- list: list all sessions
- poll: check session status
- read: read session output
- write: send input to session stdin
- send-keys: send special keys (up, down, ctrl-c, enter, etc.)
- kill: terminate session

Tests:
- PTY: allowed commands, write/read, poll, kill, process group kill
- Non-PTY: background execution, list, read, write, poll, kill, process group kill
- Session management: add/get/remove/list/cleanup
2026-03-21 22:38:03 +08:00
Amir Mamaghani 520391643b feat: add agent-browser skill and Dockerfile.heavy with full runtime
Add agent-browser skill to the default workspace with complete CLI
reference for browser automation via Chrome/Chromium CDP. The skill
includes a runtime guard that checks for the binary before use.

Add Dockerfile.heavy — a batteries-included container image with:
- Node.js 24 + npm
- Python 3 + pip + uv
- Chromium + Playwright (for agent-browser)
- agent-browser CLI pre-installed
- Non-root picoclaw user (UID/GID 1000)
- Default workspace with all skills
- Persistent workspace volume

This complements the existing minimal Dockerfile and Dockerfile.full
for deployments that need browser automation and rich tool support.
2026-03-21 15:14:32 +01:00
LC f71a6ff76c docs: update alt text of wechat.png with a more meaningful description
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-03-21 18:52:39 +08:00
lc6464 e2e3e6d5b0 docs: update WeChat QRCode for README 2026-03-21 18:40:10 +08:00
lc6464 ab93c235ae docs: clean up README by removing duplicate sections 2026-03-21 18:36:29 +08:00
Badgerbees bc0be17e88 fix(identity): support negative integers in isNumeric for Telegram group IDs 2026-03-21 17:09:02 +07:00
Kunal Karmakar 073ae4864f Fix spelling 2026-03-21 07:20:59 +00:00
Kunal Karmakar 4c8526d917 Merge branch 'feat-configurable-logger' of https://github.com/kunalk16/picoclaw into feat-configurable-logger 2026-03-21 06:54:55 +00:00
Kunal Karmakar 647071d342 Add default value for config 2026-03-21 06:54:49 +00:00
Kunal Karmakar 92b7687068 Add configurable logger 2026-03-21 06:54:49 +00:00
Kunal Karmakar 650827103d Merge branch 'main' of https://github.com/sipeed/picoclaw into feat-configurable-logger 2026-03-21 06:53:17 +00:00
Kunal Karmakar f35516c5c9 Add default value for config 2026-03-21 06:53:05 +00:00
BeaconCat 6148ccc529 docs(feishu): note that Feishu channel does not support 32-bit devices (#1851)
Co-authored-by: BeaconCat <BeaconCat@users.noreply.github.com>
2026-03-21 14:36:51 +08:00
Kunal Karmakar 8490084640 Merge branch 'main' of https://github.com/sipeed/picoclaw into feat-configurable-logger 2026-03-21 05:18:55 +00:00
Kunal Karmakar 329322075d Add configurable logger 2026-03-21 05:18:25 +00:00
Mauro 100720bb74 Merge pull request #1818 from Alix-007/fix/issue-1815-empty-response-message
fix(agent): separate empty-response and tool-limit fallbacks
2026-03-20 23:23:48 +01:00
BeaconCat 403ceb39be docs: fix inaccuracies, add translations, and expand channel docs (#1837)
## Config field fixes (cross-verified against Go source)
- MaixCam: server_address → host + port
- IRC: use_tls → tls, channels_to_join → channels (all 6 languages)
- WeCom AI Bot: callback port 18791 → 18790
- credential_encryption: base_url → api_base, add required model field,
  remove incorrect passphrase-only mode docs
- providers.md: agents.defaults.model → model_name (×4), remove
  non-existent session.backlog_limit
- migration guide, troubleshooting: agents.defaults.model → model_name
- ANTIGRAVITY_AUTH: fix file path, Go 1.21 → 1.25, model → model_name
- spawn-tasks: fix truncated file, add Heartbeat introduction
- tools_configuration: add Tavily/SearXNG/GLMSearch, exec allow_remote/
  timeout_seconds/custom_allow_patterns, cron allow_command, skills
  github/search_cache, clawhub timeout/max_zip_size/max_response_size
- configuration: fix builtin skills path (build-time embedded, not cwd),
  HEARTBEAT.md marked auto-generated

## Broken link fixes (15 total)
- chat-apps.md: WeCom/Matrix links with wrong relative paths
- providers.md: migration link with extra docs/ prefix
- hardware-compatibility.md: README links with wrong depth (all 5 langs)
- chat-apps.md: WhatsApp dead links → anchor links (zh/ja)

## Getting-started accuracy
- README (all 6 langs): add picoclaw.io as recommended download,
  add missing picoclaw model CLI command
- docker.md: clarify first-run trigger condition (all 6 langs)
- configuration.md: fix builtin skills path description (all 6 langs)

## QQ channel
- Add quick setup via q.qq.com/qqbot/openclaw (one-click bot creation)
- Add manual setup as fallback (all 6 languages)

## Feishu channel
- Update setup flow: WebSocket/SDK mode, no webhook URL needed
- Preserve Lark international domain note (all 6 languages)

## chat-apps.md
- Add Feishu, Slack, IRC, OneBot detail sections (all 6 languages)
- Add MaixCam section to ja/fr/pt-br/vi
- Fix all channel doc links to point to correct language version

## New translations (25 files, 5 docs × 5 languages)
debug.md, credential_encryption.md, hardware-compatibility.md,
ANTIGRAVITY_AUTH.md, ANTIGRAVITY_USAGE.md → zh/ja/fr/pt-br/vi

## Channel docs (6 languages each, 60 new files)
telegram, discord, qq, feishu, maixcam, dingtalk, line, slack, onebot,
wecom/wecom_aibot, wecom/wecom_app, wecom/wecom_bot

Co-authored-by: BeaconCat <BeaconCat@users.noreply.github.com>
2026-03-20 22:37:05 +08:00
liqianjie 0fe058254c fix: add fallback DNS resolver for Android with multi-DNS support (#1835)
On Android, /etc/resolv.conf does not exist, causing Go's default DNS
resolution to fail. This adds an init() hook that:

1. Detects missing /etc/resolv.conf (Android environment)
2. Configures a custom resolver with PreferGo: true
3. Supports multiple DNS servers via PICOCLAW_DNS_SERVER env var
   - Semicolon-separated: "8.8.8.8:53;1.1.1.1:53"
   - Single server also works: "8.8.8.8"
   - Auto-appends :53 if port omitted
4. Round-robin rotation across configured servers
5. Defaults to Google DNS + Cloudflare DNS

Also patches http.DefaultTransport to use the custom resolver.
2026-03-20 22:32:21 +08:00
Amir Mamaghani 71134babb9 feat(telegram): stream LLM responses via sendMessageDraft (#1101)
* feat(telegram): stream LLM responses in real-time via sendMessageDraft

Implements real-time token streaming to Telegram using the sendMessageDraft
API (telego v1.6.0). Instead of showing only a "Thinking..." placeholder
until the full response arrives, users now see partial LLM output appear
in the chat as it's generated.

The streaming pipeline threads through all layers:

- StreamingProvider interface (providers/types.go): opt-in ChatStream()
  method that receives an onChunk callback with accumulated text
- OpenAI-compatible SSE streaming (openai_compat/provider.go): parses
  SSE events with stream:true, handles text deltas and tool call assembly
- Anthropic native streaming (anthropic/provider.go): uses SDK's
  NewStreaming() for direct Anthropic API connections
- HTTPProvider delegation (http_provider.go): delegates ChatStream to
  the underlying openai_compat provider
- StreamingCapable + Streamer interfaces (channels/interfaces.go):
  opt-in channel capability like TypingCapable/PlaceholderCapable
- Telegram streamer (telegram/telegram.go): BeginStream returns a
  telegramStreamer that throttles sendMessageDraft calls (3s/200 chars)
  with graceful degradation on API errors
- StreamDelegate bridge (bus/bus.go): decouples agent loop from channel
  manager without tight imports
- Manager integration (manager.go): implements StreamDelegate, tracks
  streamActive state, coordinates with placeholder editing
- Agent loop (loop.go): uses ChatStream when both provider and channel
  support streaming, cancels stream on tool calls, skips PublishOutbound
  when Finalize already delivered the message

Graceful degradation:
- Bots without forum/topics mode: first sendMessageDraft error sets
  failed=true, subsequent Updates become no-ops, Finalize still delivers
  via SendMessage. User sees normal non-streaming behavior.
- Non-streaming providers: type assertion fails, falls back to Chat()
- Config opt-out: streaming.enabled (default true) in telegram config

Closes #1098

* fix(telegram): delete placeholder message when streaming delivers response

When streaming was active, the "Thinking..." placeholder message stayed
in the chat because preSend only deleted the tracking entry without
removing the actual Telegram message. Now preSend deletes the placeholder
via the new MessageDeleter interface when streamActive is set.

* refactor(streaming): remove dead code and simplify streaming wiring

- Delete unused Anthropic ChatStream/parseStream (-131 lines) — factory
  creates HTTPProvider for all OpenAI-compat providers including OpenRouter
- Simplify runLLMIteration from 4 to 3 return values (remove unused
  streamed bool)
- Replace managerStreamer struct with finalizeHookStreamer using embedding
  (Update/Cancel promoted, only Finalize overridden)

* fix(streaming): skip streamer acquisition when SendResponse is false

Heartbeat messages set SendResponse=false but the streaming path
was unconditionally acquiring a streamer, causing HEARTBEAT_OK to
leak to Telegram via streamer.Finalize().

* fix(streaming): guard streamer for non-sendable messages, add streaming config

Skip streamer acquisition for heartbeat (NoHistory=true), preventing
HEARTBEAT_OK from leaking to Telegram via streamer.Finalize().

Add streaming.enabled to Telegram defaults and example config.

* feat(telegram): stream LLM responses in real-time via sendMessageDraft

Implements real-time token streaming to Telegram using the sendMessageDraft
API (telego v1.6.0). Instead of showing only a "Thinking..." placeholder
until the full response arrives, users now see partial LLM output appear
in the chat as it's generated.

The streaming pipeline threads through all layers:

- StreamingProvider interface (providers/types.go): opt-in ChatStream()
  method that receives an onChunk callback with accumulated text
- OpenAI-compatible SSE streaming (openai_compat/provider.go): parses
  SSE events with stream:true, handles text deltas and tool call assembly
- Anthropic native streaming (anthropic/provider.go): uses SDK's
  NewStreaming() for direct Anthropic API connections
- HTTPProvider delegation (http_provider.go): delegates ChatStream to
  the underlying openai_compat provider
- StreamingCapable + Streamer interfaces (channels/interfaces.go):
  opt-in channel capability like TypingCapable/PlaceholderCapable
- Telegram streamer (telegram/telegram.go): BeginStream returns a
  telegramStreamer that throttles sendMessageDraft calls (3s/200 chars)
  with graceful degradation on API errors
- StreamDelegate bridge (bus/bus.go): decouples agent loop from channel
  manager without tight imports
- Manager integration (manager.go): implements StreamDelegate, tracks
  streamActive state, coordinates with placeholder editing
- Agent loop (loop.go): uses ChatStream when both provider and channel
  support streaming, cancels stream on tool calls, skips PublishOutbound
  when Finalize already delivered the message

Graceful degradation:
- Bots without forum/topics mode: first sendMessageDraft error sets
  failed=true, subsequent Updates become no-ops, Finalize still delivers
  via SendMessage. User sees normal non-streaming behavior.
- Non-streaming providers: type assertion fails, falls back to Chat()
- Config opt-out: streaming.enabled (default true) in telegram config

Closes #1098

* fix(telegram): delete placeholder message when streaming delivers response

When streaming was active, the "Thinking..." placeholder message stayed
in the chat because preSend only deleted the tracking entry without
removing the actual Telegram message. Now preSend deletes the placeholder
via the new MessageDeleter interface when streamActive is set.

* refactor(streaming): remove dead code and simplify streaming wiring

- Delete unused Anthropic ChatStream/parseStream (-131 lines) — factory
  creates HTTPProvider for all OpenAI-compat providers including OpenRouter
- Simplify runLLMIteration from 4 to 3 return values (remove unused
  streamed bool)
- Replace managerStreamer struct with finalizeHookStreamer using embedding
  (Update/Cancel promoted, only Finalize overridden)

* fix(streaming): skip streamer acquisition when SendResponse is false

Heartbeat messages set SendResponse=false but the streaming path
was unconditionally acquiring a streamer, causing HEARTBEAT_OK to
leak to Telegram via streamer.Finalize().

* fix(streaming): guard streamer for non-sendable messages, add streaming config

Skip streamer acquisition for heartbeat (NoHistory=true), preventing
HEARTBEAT_OK from leaking to Telegram via streamer.Finalize().

Add streaming.enabled to Telegram defaults and example config.

* fix(picoclaw): add missing closing brace for StreamingProvider interface

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve golangci-lint formatting issues

Fix gci import ordering in telegram and anthropic provider, and break
long function signature in openai_compat provider to satisfy golines.

* fix: address code review feedback on streaming PR

- Deduplicate Streamer interface: alias channels.Streamer to bus.Streamer
  to prevent type drift across packages
- Increase SSE scanner buffer to 10MB max to handle large single-line
  responses that exceed bufio.Scanner's 64KB default
- Switch draftID generation from math/rand to crypto/rand for
  collision-resistant random IDs
- Add context cancellation check in SSE parsing loop so cancelled
  streams stop processing immediately
- Log Finalize failures with chat_id and content length for debugging
  silent message delivery failures

* feat: make streaming throttle interval and min growth configurable

Move hardcoded streamThrottleInterval (3s) and streamMinGrowth (200)
into StreamingConfig so they can be tuned per deployment via config
or environment variables.

* fix(telegram): use parseTelegramChatID in DeleteMessage and BeginStream

These two functions called undefined parseChatID. Use
parseTelegramChatID with _ for the unused threadID instead of adding
a wrapper function. Fixes all three CI checks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(streaming): set streamActive only after successful Finalize

Move onFinalize hook to run after Streamer.Finalize succeeds, so that
if Finalize fails the streamActive flag stays false and the regular
placeholder fallback path remains available.

Addresses review feedback from @alexhoshina.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 21:04:14 +08:00
Amir Mamaghani 544940807f feat(pico): add pico_client outbound WebSocket channel (#1198)
* feat(pico): add pico_client outbound WebSocket channel

Add a client-mode counterpart to the existing pico server channel.
pico_client connects to a remote Pico Protocol WebSocket server,
enabling picoclaw to bridge messages with external Pico-compatible
services.

Includes config, factory registration, manager wiring, 8 unit tests,
and a minimal echo-server example for interactive testing.

* fix(pico): address PR #1198 review — goroutine leak, race, auth

- Add per-connection context cancel to picoConn to prevent pingLoop
  goroutine leak on disconnect
- Re-acquire mutex in StartTyping stop closure to avoid stale conn race
- Remove query-param token auth from echo server (header-only)
- Move ListenAndServe to main goroutine where log.Fatal is safe

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: replace ConsumeInbound with InboundChan select in client test

MessageBus does not expose a ConsumeInbound method. Use a select on
InboundChan() with context cancellation, matching the pattern used in
the bus package tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 20:43:40 +08:00
taorye 75cfee46de Merge pull request #1832 from taorye/main
refactor(tui): enhance TUI configuration for picoclaw-launcher-tui
2026-03-20 19:46:46 +08:00
taorye 955d6e70f1 refactor: update interface types to use 'any' and improve code formatting 2026-03-20 19:41:59 +08:00
taorye ed47d5f7c3 feat: add onboarding command execution for non-existent config directory 2026-03-20 19:28:58 +08:00
taorye 8c44597c3d feat: add chat functionality to home page for interactive AI sessions 2026-03-20 19:28:58 +08:00
taorye 02da117199 feat: add gateway management page to TUI and integrate into home menu 2026-03-20 19:28:57 +08:00
taorye 7b4d5d4513 feat: add channels management page and integrate into home menu 2026-03-20 19:28:57 +08:00
taorye 545b7afe41 feat: add model selection synchronization to main config in TUI 2026-03-20 19:28:57 +08:00
taorye 74a145c291 style: apply cyberpunk theme to TUI components for enhanced visual appeal 2026-03-20 19:28:57 +08:00
taorye 119cc2e8e1 refactor: enhance TUI configuration and user management with improved UI elements and concurrency 2026-03-20 19:28:57 +08:00
taorye 5a199ec993 feat: implement TUI configuration and user management for picoclaw-launcher-tui 2026-03-20 19:28:54 +08:00
taorye 998b456b65 Remove UI components and gateway management for picoclaw-launcher-tui
- Deleted channel management UI from channel.go, including all associated forms and menu items.
- Removed platform-specific gateway process management from gateway_posix.go and gateway_windows.go.
- Eliminated menu structure and item management from menu.go.
- Removed model management and configuration handling from model.go.
- Deleted style definitions and application logic from style.go.
- Cleared main entry point in main.go.
2026-03-20 19:24:10 +08:00
wenjie fe87376d6a chore(deps): upgrade modelcontextprotocol go-sdk to v1.4.1 for security fixes (#1823) 2026-03-20 16:13:10 +08:00
wenjie 68d182a26e chore(deps): bump Go toolchain to 1.25.8 for stdlib security fixes (#1821) 2026-03-20 15:19:33 +08:00
wenjie bda18f5ee4 chore(deps): upgrade eslint dependency chain to resolve flatted vulnerability (#1820) 2026-03-20 15:18:15 +08:00
Alix-007 82d574eb7b fix(agent): separate empty-response and tool-limit fallbacks 2026-03-20 14:37:47 +08:00
dependabot[bot] cff85cfe5c chore(deps): bump tailwindcss from 4.2.1 to 4.2.2 in /web/frontend (#1809)
* chore(deps): bump tailwindcss from 4.2.1 to 4.2.2 in /web/frontend

Bumps [tailwindcss](https://github.com/tailwindlabs/tailwindcss/tree/HEAD/packages/tailwindcss) from 4.2.1 to 4.2.2.
- [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss/commits/v4.2.2/packages/tailwindcss)

---
updated-dependencies:
- dependency-name: tailwindcss
  dependency-version: 4.2.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* fix(frontend): align tailwind vite deps

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: wenjie <meetwenjie@gmail.com>
2026-03-20 13:53:31 +08:00
dependabot[bot] 1fd6dd1ffb chore(deps): bump shadcn from 4.0.5 to 4.0.8 in /web/frontend (#1808)
Bumps [shadcn](https://github.com/shadcn-ui/ui/tree/HEAD/packages/shadcn) from 4.0.5 to 4.0.8.
- [Release notes](https://github.com/shadcn-ui/ui/releases)
- [Changelog](https://github.com/shadcn-ui/ui/blob/main/packages/shadcn/CHANGELOG.md)
- [Commits](https://github.com/shadcn-ui/ui/commits/shadcn@4.0.8/packages/shadcn)

---
updated-dependencies:
- dependency-name: shadcn
  dependency-version: 4.0.8
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-20 13:35:48 +08:00
ywj 009a8d702b Feat/feishu card parsing (#1534)
* feat(feishu): add interactive card message parsing

Add support for parsing inbound Feishu interactive card messages.
When a user sends a card message, the text content is now extracted
and passed to the LLM for processing.

- Add extractCardText() to recursively extract text from card JSON
- Support both JSON 1.0 (legacy) and JSON 2.0 schema formats
- Handle nested elements: header, body, actions, columns
- Extract text from markdown, lark_md, and plain_text elements
- Add comprehensive unit tests for card parsing

Fixes #<issue_number>

💘 Generated with Crush

Assisted-by: GLM-5 via Crush <crush@charm.land>

* feat(feishu): extract and download images from interactive cards

When receiving interactive card messages, extract embedded images
(img_key, src, icon_key) and download them for LLM processing.

- Add extractCardImageKeys() to recursively extract image keys from card JSON
- Support img elements (img_key, src) and icon elements (icon_key)
- Update downloadInboundMedia() to handle MsgTypeInteractive
- Add comprehensive unit tests for image extraction

Images are downloaded and stored via MediaStore, then appended to
the message content as [image: photo] tags for LLM visibility.

💘 Generated with Crush

Assisted-by: GLM-5 via Crush <crush@charm.land>

* fix(feishu): simplify card parsing - pass raw JSON, only extract images

Address review feedback: text extraction cannot exhaustively handle all
card formats (i18n_elements, div.fields, etc.). Pass raw JSON to LLM
instead - same approach as MsgTypePost. Only image extraction remains
as images must be downloaded for LLM to process.

- Remove extractCardText() and helper functions
- extractContent() now returns raw JSON for MsgTypeInteractive
- Keep extractCardImageKeys() for downloading embedded images
- Update tests to expect raw JSON for interactive cards

* fix(feishu): don't append media tags to interactive card JSON

Appending media tags like "[attachment]" to raw JSON content produces
invalid JSON format. For interactive cards, the JSON already contains
image information and media refs are downloaded separately.

- Skip appendMediaTags for MsgTypeInteractive to preserve valid JSON
- Add test case for interactive card with images

* fix(feishu): filter out external URLs from card image extraction

Only Feishu-hosted image keys (img_xxx, icon_xxx) can be downloaded via
the Feishu API. External URLs in src field (https://...) should be
filtered out to avoid download failures.

- Add isFeishuImageKey() to detect Feishu-hosted keys vs external URLs
- Update extractImageKeysRecursive to skip external URLs in src field
- Add tests for external URL filtering and mixed scenarios

* feat(feishu): support downloading external images from interactive cards

Previously only Feishu-hosted images (img_key, icon_key) could be
downloaded. Now external URLs in src field are also downloaded via
HTTP and made available to the LLM.

- extractCardImageKeys now returns two slices: Feishu keys and external URLs
- Add downloadExternalImage to download images from HTTP URLs
- Update downloadInboundMedia to handle both Feishu API and HTTP downloads
- Update tests for new function signature

* fix(feishu): use HTTP client with timeout for external image downloads

Replaced http.DefaultClient with a client that has a 30-second timeout
to prevent hanging on unresponsive external URLs.

Generated with Crush

Assisted-by: GLM-5 via Crush <crush@charm.land>

* fix(feishu): resolve lint errors for shadow and formatting

- Rename err variables to avoid shadowing in downloadExternalImage
- Fix struct field alignment in TestExtractCardImageKeys

Generated with Crush

Assisted-by: GLM-5 via Crush <crush@charm.land>

* refactor(feishu): pass external image URLs to LLM instead of downloading

Instead of downloading external images from interactive cards, pass
the URLs directly to LLM. This reduces network overhead and lets
vision-capable models fetch images as needed.

- Remove downloadExternalImage function
- Append external URLs to card content for LLM processing
- Only download Feishu-hosted images via API

💘 Generated with Crush

Assisted-by: GLM-5 via Crush <crush@charm.land>

* fix(feishu): add blank line between functions for gci formatting

* fix(feishu): keep interactive card content as valid JSON
2026-03-20 12:59:43 +08:00