Commit Graph

1839 Commits

Author SHA1 Message Date
LC 7aa2d672ce fix(network): classify timeout errors as FailoverTimeout
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-04-16 22:00:13 +08:00
lc6464 c3f4000817 feat(network): implement network error classification and fallback handling 2026-04-16 19:59:37 +08:00
wenjie 7fdc9c7b64 fix(web): support proxies in SearXNG and web fetch (#2542)
Propagate the configured HTTP client and proxy settings to the
SearXNG search provider.

Allow web_fetch to connect to the configured proxy as the first hop
without bypassing the existing private-host checks for redirect
targets and fetched URLs.

Add tests for loopback proxy fetches and SearXNG proxy propagation.
2026-04-16 17:15:47 +08:00
wenjie 7f56ca8cc6 feat(web): refactor tools page into tabbed library and web search settings (#2539)
- split the tools page into focused components and a shared hook
- add separate Tool Library and Web Search tabs
- refresh web search settings layout and localized copy
- make provider expansion keyboard accessible
- restore wrapping for long tool names in library cards
- allow custom styling for KeyInput
2026-04-16 17:14:35 +08:00
lxowalle e22b4e1eee feat(agent): support btw side questions (#2532) 2026-04-16 10:53:09 +08:00
wenjie a8d0b03515 fix(web): save channel configs with nested channel_list patches (#2530)
Persist channel settings through the current channel_list schema, keeping common
channel fields at the top level and channel-specific fields under settings.
Return common fields and default config shapes from channel config endpoints, and
add coverage for nested patches, missing channel defaults, and secret handling.
2026-04-16 10:30:16 +08:00
wenjie f32b303d2a fix(web): avoid resetting web search draft on config refetch (#2536) 2026-04-16 10:26:18 +08:00
BeaconCat f1b659e5ef membench: add LLM-as-Judge evaluation mode (#2484)
* membench: add LLM-as-Judge evaluation mode

Add --eval-mode=llm to membench for LLM-based answer generation and
semantic scoring via an OpenAI-compatible API endpoint.

New files:
- llm_client.go: generic OpenAI-compatible chat completion client
  with support for API key, configurable timeout, and optional
  chat_template_kwargs (for llama.cpp thinking models)
- eval_llm.go: LLM answer generation + LLM-as-Judge scoring for
  both legacy and seahorse retrieval modes

Changes to main.go:
- --eval-mode flag (token|llm) to select evaluation strategy
- --api-base, --api-key, --model flags with env var fallback
  (MEMBENCH_API_BASE, MEMBENCH_API_KEY, MEMBENCH_MODEL)
- --no-thinking flag for llama.cpp + Qwen thinking models
- --limit flag to cap QA questions per sample for quick testing

* style: fix golangci-lint formatting (gofmt + golines)

* fix: address Copilot review feedback

- Validate --model is required for LLM eval mode
- Use rune-based truncation to preserve valid UTF-8
- Precompute totalQA count outside inner loop
- Log SearchMessages errors instead of silently skipping

* fix: address Copilot review round 2

- Validate --eval-mode accepts only 'token' or 'llm'
- Normalize base URL to avoid /v1/v1 duplication
- Separate token/LLM results for correct PrintComparison labeling
- Log ExpandMessages errors instead of silently ignoring
- Short-circuit with 0 scores when no context retrieved (match token eval)
- Add --timeout flag wired to LLMClientOptions.Timeout

* fix: address review P1+P2 — sort alignment, failure sentinel, score parser

- P1: Replace hand-rolled sortByRank with sort.Slice (ascending, best
  first) matching eval.go's EvalSeahorse — ensures BudgetTruncate keeps
  best-ranked messages when truncation occurs
- P2: Use -1.0 sentinel for LLM API failures and parse errors, distinct
  from genuine 0.0 score; aggregateMetrics skips -1.0 entries for F1
  averaging while still counting HitRate
- P2: Use regexp \b([1-5])\b for judge score extraction instead of
  first-digit scan — avoids misparses on '5/5', 'Score: 3' etc.

* fix: address Copilot review round 2

- Fix F1/HitRate weighted aggregation: track ValidF1Count separately so
  computeModeAgg weights F1 by valid scores only, not TotalQuestions
- No-context retrieval failure uses 0.0 (genuine bad score) instead of
  -1.0 sentinel (reserved for API/parse failures)
- Validate --timeout > 0 to prevent disabling HTTP timeouts

* fix: remove hardcoded /v1 from API base URL

Users now provide the full versioned path in --api-base (e.g. /v1, /v4).
Code only appends /chat/completions. Default changed to
http://127.0.0.1:8080/v1 for backward compatibility.

* fix: address Copilot review round 3

- ValidF1Count=0 when all scores are sentinel (no forced =1)
- Backward compat: old eval JSON without ValidF1Count falls back to
  TotalQuestions in computeModeAgg
- Skip empty section in PrintComparison when tokenResults is empty
- Update --api-base flag help to document /v1 default and version path
- Add sentinel aggregation unit tests (partial, all, weighted)

* feat: add --retries flag with exponential backoff for transient LLM errors

Retry on timeout, 5xx, and 429 (rate limit) with 1s/2s/4s backoff.
Default 3 retries, configurable via --retries. Context cancellation
is respected between retries.

* fix: address Copilot review round 4

- runReport splits results by mode suffix into token/llm for PrintComparison
- backward compat fallback (ValidF1Count=0 -> TotalQuestions) only for
  non-LLM modes; LLM modes keep ValidF1Count=0 when all scores sentinel
- MaxRetries==0 means no retry; only negative falls back to default 3
- truncateStr uses []rune to avoid cutting multi-byte UTF-8 characters
- Complete() returns error on empty LLM response (vs silent empty string)

* feat: --no-thinking adapts to llama.cpp, Ollama, and GLM backends

Send all three disable-thinking fields simultaneously:
- chat_template_kwargs.enable_thinking=false (llama.cpp, GLM)
- think=false (Ollama 0.9+)
- thinking.type=disabled (GLM/Zhipu)
Each backend picks the field it recognizes and ignores the rest.
Also bumps max_tokens from 512 to 2048 for thinking models.

* feat: mixed model eval + concurrent QA workers

- Add --judge-model, --judge-api-base, --judge-api-key flags for separate judge model
- Add --concurrency flag (default 1) with semaphore-based goroutine pool
- Add reasoning_content fallback for GLM/DeepSeek style responses
- Prepend /no_think to system prompt for Ollama /v1 compatibility
- Reduce default MaxTokens from 2048 to 512 (answers are 1-3 sentences)
- Extract evalQAWorker and buildSeahorseContext for shared concurrent logic

---------

Co-authored-by: BeaconCat <BeaconCat@users.noreply.github.com>
2026-04-15 21:15:17 +08:00
美電球 ead2dc9699 Merge pull request #2524 from SiYue-ZO/feature/sogou-web-search-default
Add configurable Sogou-backed web search
2026-04-15 20:50:53 +08:00
wenjie 7bd11181a6 fix(agent): preserve reused tool call IDs across turns (#2528)
Scope tool result deduplication to each assistant tool-call block so providers
that reuse call IDs across separate turns do not lose valid tool results. Also
drop invalid empty tool call IDs and orphaned tool messages after validation.
2026-04-15 20:18:09 +08:00
daming大铭 100e576609 Merge pull request #2529 from lc6464/feat/web-code-highlight
feat(web): add markdown syntax highlighting for chat and skills
2026-04-15 18:50:40 +08:00
SiYue-ZO 2784223ad5 Make web search auto-switch with UI language
Default the sample web search provider to auto, route Sogou vs DuckDuckGo dynamically based on query/UI language, and sync frontend language changes back to the backend so Current Service and runtime selection stay aligned.
2026-04-15 18:45:28 +08:00
lc6464 5a2e7795cd refactor(web): improve theme style element management in useHighlightTheme hook 2026-04-15 18:30:43 +08:00
lc6464 acbe654674 chore(web): move app providers out of main entry 2026-04-15 17:36:22 +08:00
lc6464 389f492d8c refactor(web): use official highlight themes for markdown 2026-04-15 17:19:48 +08:00
lc6464 25ac563406 feat(web): add syntax highlighting for markdown code blocks 2026-04-15 14:54:13 +08:00
Mauro bb14a5c7cc Merge pull request #2525 from afjcjsbx/fix/vision-unsupported-media-stuck
fix(agent): recover after image-input-unsupported failures
2026-04-15 07:54:33 +02:00
SiYue-ZO bb953b788b test(api): fix web tools lint issues 2026-04-15 13:35:39 +08:00
SiYue-ZO 75e93b5189 Merge remote-tracking branch 'upstream/main' into feature/sogou-web-search-default
# Conflicts:
#	pkg/tools/web.go
#	pkg/tools/web_test.go
2026-04-15 13:28:05 +08:00
SiYue-ZO 0b84f0ae0a fix(web): address sogou search review feedback 2026-04-15 13:03:06 +08:00
Cytown d0ff24aa87 remove useless backend output for platform-token (#2500) 2026-04-15 11:38:47 +08:00
wenjie 51ab3b1385 fix(web): restore chat composer disabled-state messaging and clean up code (#2526) 2026-04-15 11:24:27 +08:00
lxowalle 773a94c414 fix(web_search): validate missing API key/URL directly in Search methods (#2517) 2026-04-15 09:55:05 +08:00
肆月 bf6d4fd997 feat(web): show disabled reasons in tooltips when buttons are disabled (#2430)
* feat(web): show disabled reasons in tooltips when buttons are disabled

- Add disabled reason tooltips for model card actions (set default, delete)
- Add disabled reason tooltips for marketplace skill card install button
- Add disabled reason display for chat input when disabled
- Add internationalization support for all disabled reasons (en/zh)
- Model card: Show specific reasons when set-default or delete buttons are disabled
- Marketplace skill card: Show specific reasons when install button is disabled
- Chat composer: Show reason text below input when input is disabled

* fix: show disabled action reasons via tooltips

* fix(web): restore accessible labels for model action tooltips
2026-04-15 09:49:45 +08:00
afjcjsbx e60a687387 fix lint 2026-04-14 22:35:02 +02:00
afjcjsbx 7824bc715f add test 2026-04-14 22:31:30 +02:00
afjcjsbx d3d639cb7d fix lint 2026-04-14 22:21:33 +02:00
afjcjsbx 1245f2ddf6 fix(agent): recover after image-input-unsupported failures 2026-04-14 22:15:28 +02:00
美電球 c0fadc5918 Merge pull request #2523 from lc6464/feat/web-chat-disabled-reasons-hint
feat(web): show disabled chat reasons in composer
2026-04-15 00:42:55 +08:00
美電球 b52eb58f03 Merge pull request #2514 from lc6464/fix/issue-2488-host-binding
feat(launcher): add host overrides for launcher and gateway
2026-04-14 23:48:24 +08:00
lc6464 0bb9bedc44 fix(web): address latest Copilot review points 2026-04-14 23:39:59 +08:00
SiYue-ZO dcf21ef11c Fix provider return formatting for golines 2026-04-14 23:26:40 +08:00
lc6464 79f87d151e fix(web): show localhost entry only for local binds 2026-04-14 23:24:14 +08:00
SiYue-ZO 824e800d70 Fix Sogou user agent formatting for linter 2026-04-14 23:22:37 +08:00
SiYue-ZO 9ded7933f0 Fix golines formatting for web search changes 2026-04-14 23:16:23 +08:00
SiYue-ZO 93977bf348 Add configurable Sogou-backed web search 2026-04-14 22:58:07 +08:00
lc6464 d4313b5e5f feat(web): show disabled chat reasons in composer 2026-04-14 22:22:30 +08:00
Caize Wu 08fc305d5e Merge pull request #2518 from imguoguo/update-wechat-qr
docs: update wechat qrcode
2026-04-14 17:34:06 +08:00
Guoguo 8ca89c49ab docs: update wechat qrcode 2026-04-14 02:30:26 -07:00
lc6464 24382271d6 fix(web): align wildcard advertise IP preference 2026-04-14 15:17:27 +08:00
lxowalle 0425cd4d77 refactor skills registries and add GitHub-backed skill discovery (#2442)
* refactor skills registries and add GitHub-backed skill discovery

* fix ci

* fix command error

* fix default skills install registry behavior

* fix github registry URL parsing and versioned skill links

* fix skills registry config compatibility and URL installs

* * fix lint

* fix deprecated github base url compatibility

* fix skills registry yaml and github default branch handling

* fix github skills registry fallback and install metadata

* fix cli skills install origin metadata

* fix clawhub registry env compatibility

* fix skills registry config merge compatibility

* fix skill install metadata consistency and onboard template copy

* fix yaml overrides for default skills registries

* fix install_skill registry metadata normalization

* fix github skill URL parsing for slash branch names

* fix skills registry install/search validation and github URLs

* fix github skill URL host validation

* fix install_skill validation for invalid registry archives

* fix redundant skills registry names in saved config

* fix github blob skill URL installs and metadata links

* fix github registry URL scheme validation

* fix v0 skills migration preserving github registry defaults

* fix github blob skill install directory resolution

* fix install_skill rollback on origin metadata write failure

* fix github skill URL validation and registry JSON merging

* fix github registry target resolution and metadata links

* fix install_skill force reinstall rollback

* fix skills config compatibility and legacy security overlays

* fix ci
2026-04-14 15:14:16 +08:00
lc6464 ae195831bb fix: resolve PR2514 lint regressions 2026-04-14 14:49:23 +08:00
lc6464 93bf871bd2 fix(launcher): refine console host display 2026-04-14 14:04:37 +08:00
lc6464 d4d652b455 feat(host): complete launcher and gateway multi-host binding support
- add shared netbind planning for strict tcp4/tcp6 bind semantics
- support launcher/gateway host env overrides and launcher-to-gateway forwarding
- cover host binding and forwarding with network and subprocess env tests
2026-04-14 14:04:36 +08:00
lc6464 7b38d437ba feat(launcher): support multi-host bind and strict host semantics 2026-04-14 14:03:24 +08:00
lc6464 e7b3654313 fix(host): modernize default host selection order 2026-04-14 14:03:23 +08:00
lc6464 448027c02a fix(host): align launcher and gateway host normalization semantics 2026-04-14 14:03:22 +08:00
lc6464 4e977367c2 feat(launcher): add host overrides for launcher and gateway 2026-04-14 14:00:54 +08:00
daming大铭 df9124b824 Merge pull request #2249 from alexhoshina/refactor-inbound-context-routing-session
Refactor inbound context routing session
2026-04-14 12:45:34 +08:00
美電球 08283dde61 Merge pull request #2489 from afjcjsbx/fix/mcp-reload-discovery-tools
fix(agent): reinitialize MCP and discovery tools after reload
2026-04-14 11:54:47 +08:00