Commit Graph

178 Commits

Author SHA1 Message Date
LC 9c3dc0ee3a fix(auth): canonicalize Google Antigravity provider and enhance credential management (#2599)
* fix(auth): canonicalize Google Antigravity provider and enhance credential management

* fix(auth): improve error handling in credential storage tests

* fix(auth): stabilize canonical provider merge precedence
2026-04-21 16:28:29 +08:00
lxowalle 6421f146a9 Revert "Feat/channel tool feedback animation (#2569)" (#2596)
This reverts commit e556a816e4.
2026-04-20 18:30:29 +08:00
lxowalle e556a816e4 Feat/channel tool feedback animation (#2569)
* feat(channels): unify tool feedback animation across discord telegram and feishu

* fix(tool-feedback): unify fallback and single-message delivery

* fix(channels): finalize tool feedback in place

* fix ci

* feat: improve tool feedback
2026-04-20 15:20:26 +08:00
lc6464 ffd30d7db7 fix(auth): improve no-browser OAuth login 2026-04-16 23:01:28 +08:00
lc6464 ab019d3f18 feat(auth): add no-browser option for OAuth login 2026-04-16 22:19:34 +08:00
BeaconCat f1b659e5ef membench: add LLM-as-Judge evaluation mode (#2484)
* membench: add LLM-as-Judge evaluation mode

Add --eval-mode=llm to membench for LLM-based answer generation and
semantic scoring via an OpenAI-compatible API endpoint.

New files:
- llm_client.go: generic OpenAI-compatible chat completion client
  with support for API key, configurable timeout, and optional
  chat_template_kwargs (for llama.cpp thinking models)
- eval_llm.go: LLM answer generation + LLM-as-Judge scoring for
  both legacy and seahorse retrieval modes

Changes to main.go:
- --eval-mode flag (token|llm) to select evaluation strategy
- --api-base, --api-key, --model flags with env var fallback
  (MEMBENCH_API_BASE, MEMBENCH_API_KEY, MEMBENCH_MODEL)
- --no-thinking flag for llama.cpp + Qwen thinking models
- --limit flag to cap QA questions per sample for quick testing

* style: fix golangci-lint formatting (gofmt + golines)

* fix: address Copilot review feedback

- Validate --model is required for LLM eval mode
- Use rune-based truncation to preserve valid UTF-8
- Precompute totalQA count outside inner loop
- Log SearchMessages errors instead of silently skipping

* fix: address Copilot review round 2

- Validate --eval-mode accepts only 'token' or 'llm'
- Normalize base URL to avoid /v1/v1 duplication
- Separate token/LLM results for correct PrintComparison labeling
- Log ExpandMessages errors instead of silently ignoring
- Short-circuit with 0 scores when no context retrieved (match token eval)
- Add --timeout flag wired to LLMClientOptions.Timeout

* fix: address review P1+P2 — sort alignment, failure sentinel, score parser

- P1: Replace hand-rolled sortByRank with sort.Slice (ascending, best
  first) matching eval.go's EvalSeahorse — ensures BudgetTruncate keeps
  best-ranked messages when truncation occurs
- P2: Use -1.0 sentinel for LLM API failures and parse errors, distinct
  from genuine 0.0 score; aggregateMetrics skips -1.0 entries for F1
  averaging while still counting HitRate
- P2: Use regexp \b([1-5])\b for judge score extraction instead of
  first-digit scan — avoids misparses on '5/5', 'Score: 3' etc.

* fix: address Copilot review round 2

- Fix F1/HitRate weighted aggregation: track ValidF1Count separately so
  computeModeAgg weights F1 by valid scores only, not TotalQuestions
- No-context retrieval failure uses 0.0 (genuine bad score) instead of
  -1.0 sentinel (reserved for API/parse failures)
- Validate --timeout > 0 to prevent disabling HTTP timeouts

* fix: remove hardcoded /v1 from API base URL

Users now provide the full versioned path in --api-base (e.g. /v1, /v4).
Code only appends /chat/completions. Default changed to
http://127.0.0.1:8080/v1 for backward compatibility.

* fix: address Copilot review round 3

- ValidF1Count=0 when all scores are sentinel (no forced =1)
- Backward compat: old eval JSON without ValidF1Count falls back to
  TotalQuestions in computeModeAgg
- Skip empty section in PrintComparison when tokenResults is empty
- Update --api-base flag help to document /v1 default and version path
- Add sentinel aggregation unit tests (partial, all, weighted)

* feat: add --retries flag with exponential backoff for transient LLM errors

Retry on timeout, 5xx, and 429 (rate limit) with 1s/2s/4s backoff.
Default 3 retries, configurable via --retries. Context cancellation
is respected between retries.

* fix: address Copilot review round 4

- runReport splits results by mode suffix into token/llm for PrintComparison
- backward compat fallback (ValidF1Count=0 -> TotalQuestions) only for
  non-LLM modes; LLM modes keep ValidF1Count=0 when all scores sentinel
- MaxRetries==0 means no retry; only negative falls back to default 3
- truncateStr uses []rune to avoid cutting multi-byte UTF-8 characters
- Complete() returns error on empty LLM response (vs silent empty string)

* feat: --no-thinking adapts to llama.cpp, Ollama, and GLM backends

Send all three disable-thinking fields simultaneously:
- chat_template_kwargs.enable_thinking=false (llama.cpp, GLM)
- think=false (Ollama 0.9+)
- thinking.type=disabled (GLM/Zhipu)
Each backend picks the field it recognizes and ignores the rest.
Also bumps max_tokens from 512 to 2048 for thinking models.

* feat: mixed model eval + concurrent QA workers

- Add --judge-model, --judge-api-base, --judge-api-key flags for separate judge model
- Add --concurrency flag (default 1) with semaphore-based goroutine pool
- Add reasoning_content fallback for GLM/DeepSeek style responses
- Prepend /no_think to system prompt for Ollama /v1 compatibility
- Reduce default MaxTokens from 2048 to 512 (answers are 1-3 sentences)
- Extract evalQAWorker and buildSeahorseContext for shared concurrent logic

---------

Co-authored-by: BeaconCat <BeaconCat@users.noreply.github.com>
2026-04-15 21:15:17 +08:00
美電球 b52eb58f03 Merge pull request #2514 from lc6464/fix/issue-2488-host-binding
feat(launcher): add host overrides for launcher and gateway
2026-04-14 23:48:24 +08:00
lxowalle 0425cd4d77 refactor skills registries and add GitHub-backed skill discovery (#2442)
* refactor skills registries and add GitHub-backed skill discovery

* fix ci

* fix command error

* fix default skills install registry behavior

* fix github registry URL parsing and versioned skill links

* fix skills registry config compatibility and URL installs

* * fix lint

* fix deprecated github base url compatibility

* fix skills registry yaml and github default branch handling

* fix github skills registry fallback and install metadata

* fix cli skills install origin metadata

* fix clawhub registry env compatibility

* fix skills registry config merge compatibility

* fix skill install metadata consistency and onboard template copy

* fix yaml overrides for default skills registries

* fix install_skill registry metadata normalization

* fix github skill URL parsing for slash branch names

* fix skills registry install/search validation and github URLs

* fix github skill URL host validation

* fix install_skill validation for invalid registry archives

* fix redundant skills registry names in saved config

* fix github blob skill URL installs and metadata links

* fix github registry URL scheme validation

* fix v0 skills migration preserving github registry defaults

* fix github blob skill install directory resolution

* fix install_skill rollback on origin metadata write failure

* fix github skill URL validation and registry JSON merging

* fix github registry target resolution and metadata links

* fix install_skill force reinstall rollback

* fix skills config compatibility and legacy security overlays

* fix ci
2026-04-14 15:14:16 +08:00
lc6464 ae195831bb fix: resolve PR2514 lint regressions 2026-04-14 14:49:23 +08:00
lc6464 d4d652b455 feat(host): complete launcher and gateway multi-host binding support
- add shared netbind planning for strict tcp4/tcp6 bind semantics
- support launcher/gateway host env overrides and launcher-to-gateway forwarding
- cover host binding and forwarding with network and subprocess env tests
2026-04-14 14:04:36 +08:00
lc6464 448027c02a fix(host): align launcher and gateway host normalization semantics 2026-04-14 14:03:22 +08:00
lc6464 4e977367c2 feat(launcher): add host overrides for launcher and gateway 2026-04-14 14:00:54 +08:00
Cytown 667fc85d54 refactor(config): make config.Channel to multiple instance support
add new field type to Channel struct
config.channels refactor to channel_list
update config version to 3
update the docs
2026-04-13 22:21:21 +08:00
dataCenter430 b6617a4b17 feat(cli): structured terminal UI for PicoClaw CLI like modern CLIs (#2229)
* feat(cli): add boxed help/error UI with no-color support

* fix: CI testing error

* fix: lint errors

* fix linter error

* fix: address review
2026-04-12 18:44:24 +08:00
Liu Yuan 1175f4a62b feat(membench): add LOCOMO memory benchmark tool (#2353)
Benchmark tool comparing legacy session manager vs seahorse short memory
retrieval on the LOCOMO long-term conversational memory dataset.

- cmd/membench/: CLI with ingest/eval/report/run subcommands
- Mode A (legacy): recency-biased budget truncation baseline
- Mode B (seahorse): per-keyword trigram FTS5 search + expand
- Metrics: Token-Overlap F1 and Recall Hit Rate
- `make mem` builds, downloads data, runs benchmark end-to-end
2026-04-06 17:26:43 +08:00
Robert Bopko e8d92e4a36 test: update root help banner expectation 2026-04-03 21:59:57 +02:00
Robert Bopko cbd0798a56 fix: avoid duplicate v in CLI help banner 2026-04-03 19:58:52 +02:00
lxowalle 849e37cf79 * Load zoneinfo from TZ and ZONEINFO env (#2279) 2026-04-03 10:12:15 +08:00
sky5454 49e61fa07f feat(updater): robust self-update selection & extraction (nightly default) (#2201)
* feat(updater): add web self-update endpoint and updater package

* feat(selfupgrade): when url empty, using GetTestReleaseAPIURL for test .

* feat(selfupgrade):  only GetTestReleaseAPIURL  .

* feat(upgrade): cli  $0 update work well!

* fix(ci): fix ci err

* fix(test): fix ci test

* fix(ci): fix ci  lint fmt err

* test(updater): add test for updater

* fix(ci): fix ci  lint var copy err

* fix(ci): retry ci

* updater: require checksum verification, prefer API digest, verify SHA256, fix zip extraction, update tests

* fix(lint): lint fixed

* fix(lint): lint fixed2

* updater: stream download and verify sha256; add http client timeout and progress

Avoid double-download by streaming asset into temp file while computing SHA256 and verifying against checksum; replace http.Get with shared httpClient (2m timeout) to prevent hangs; add simple stderr progress display; remove unused helpers.
2026-04-01 23:41:32 +08:00
Mauro 34b4848214 Merge pull request #1838 from jonahzheng/patch-1
Update helpers.go
2026-03-30 14:23:38 +02:00
Cytown 50b8d9bf83 Merge branch 'main' into t3 2026-03-30 18:01:07 +08:00
daming大铭 cbe92286e9 Merge pull request #2184 from cytown/config
refactor config and add ModelConfig.Enabled
2026-03-30 17:23:07 +08:00
LC ff0266a40e feat(web): display backend version info in sidebar (#2087)
* feat(web): display backend version info in sidebar

* fix(web): improve version parsing and timeout behavior

* refactor(web): remove useless --version fallback

* feat(web): implement version info caching and improve retrieval logic

* fix(web): clarify version timeout rationale

* fix(web): harden gateway version probing and tests

* style(web): split regexp to two lines for lint
2026-03-30 16:44:50 +08:00
Cytown 93757812fc refactor config and add ModelConfig.Enabled 2026-03-30 14:01:20 +08:00
daming大铭 1fc5345857 refactor(cron): remove deliver and type params, unify agent execution path (#2147)
The agent path now publishes to outbound bus directly (since #2100),
making the deliver=true direct-to-bus shortcut and the directive type
prompt wrapping redundant. All cron jobs now uniformly route through
the agent. This is an intentional behavior change: old jobs with
deliver=true will execute through the agent instead of bypassing it.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-29 22:52:34 +08:00
Cytown 0bb561548f add pid file for gateway running and auth token for /reload and pico channel 2026-03-29 01:14:39 +08:00
Cytown b646d3b8fe refactor config and security to simplified the structure (#2068) 2026-03-28 00:03:34 +08:00
Liu Yuan 155af28841 feat(logger): add PICOCLAW_LOG_FILE env var for file-only logging 2026-03-25 21:34:27 +08:00
taorye ee03d1247d chore(tui): add build target for picoclaw-launcher TUI and create README for TUI launcher (#1995) 2026-03-25 16:17:26 +08:00
Hoshina e760cb737c feat(auth): add wecom cli qr login 2026-03-24 20:23:29 +08:00
Hua Audio b23a6b3f54 Feat/move weixin login to auth and update docs (#1945)
* move weixin to auth and update docs

* fix ci test
2026-03-24 06:33:35 +01:00
Cytown df17684dd4 implement panic log for gateway and launcher
add file logger to gateway

ref issue: #1734

Signed-off-by: Cytown <cytown@gmail.com>
2026-03-23 15:40:30 +08:00
Cytown 36f9d20de1 Merge branch 'main' into version 2026-03-23 15:00:18 +08:00
Kunal Karmakar 40279c8dde chore(config): move loglevel settings under gateway (#1912)
* Move log level config to gateway property

* Fix unit test

* Fix linting

* Fix linting

* Add comment for log level
2026-03-23 06:14:53 +01:00
Cytown 7bf4831059 Merge branch 'main' into version 2026-03-23 10:54:08 +08:00
yinwm c48954d32d merge: sync main into refactor/agent 2026-03-22 21:44:17 +08:00
Cytown 284ced1f5c Merge branch 'main' into version 2026-03-22 19:58:33 +08:00
Administrator f7f27e237a merge: resolve conflicts between refactor/agent and main 2026-03-22 19:21:58 +08:00
Hua Audio dd82794255 Feat/weixin openclaw port (#1873)
* init

* fix lint

* fix go test

* update docs

* incorporate pr review

* Update pkg/channels/weixin/weixin.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* feat(weixin): add media sync and typing support

* test(weixin): cover media and sync helpers

---------

Co-authored-by: zhangmikoto <i@electromaster.me>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Hoshina <hoshina@evaz.org>
2026-03-22 14:23:39 +08:00
Cytown 7c854fe6d7 Merge branch 'main' into version 2026-03-22 02:53:55 +08:00
Cytown e455eb5e67 refactor: seperate security.yml for store keys 2026-03-22 01:55:00 +08:00
Kunal Karmakar 92b7687068 Add configurable logger 2026-03-21 06:54:49 +00:00
enoch.z e4b104c0de Update helpers.go
correction of the promt for the "picoclaw onboard" command.
2026-03-20 23:15:18 +08:00
liqianjie 0fe058254c fix: add fallback DNS resolver for Android with multi-DNS support (#1835)
On Android, /etc/resolv.conf does not exist, causing Go's default DNS
resolution to fail. This adds an init() hook that:

1. Detects missing /etc/resolv.conf (Android environment)
2. Configures a custom resolver with PreferGo: true
3. Supports multiple DNS servers via PICOCLAW_DNS_SERVER env var
   - Semicolon-separated: "8.8.8.8:53;1.1.1.1:53"
   - Single server also works: "8.8.8.8"
   - Auto-appends :53 if port omitted
4. Round-robin rotation across configured servers
5. Defaults to Google DNS + Cloudflare DNS

Also patches http.DefaultTransport to use the custom resolver.
2026-03-20 22:32:21 +08:00
taorye 955d6e70f1 refactor: update interface types to use 'any' and improve code formatting 2026-03-20 19:41:59 +08:00
taorye ed47d5f7c3 feat: add onboarding command execution for non-existent config directory 2026-03-20 19:28:58 +08:00
taorye 8c44597c3d feat: add chat functionality to home page for interactive AI sessions 2026-03-20 19:28:58 +08:00
taorye 02da117199 feat: add gateway management page to TUI and integrate into home menu 2026-03-20 19:28:57 +08:00
taorye 7b4d5d4513 feat: add channels management page and integrate into home menu 2026-03-20 19:28:57 +08:00
taorye 545b7afe41 feat: add model selection synchronization to main config in TUI 2026-03-20 19:28:57 +08:00