mirror of https://github.com/sipeed/picoclaw.git synced 2026-06-12 18:08:54 +00:00


⚙️ Configuration Guide


⚙️ Configuration

Config file: ~/.picoclaw/config.json

Environment Variables

You can override default paths using environment variables. This is useful for portable installations, containerized deployments, or running picoclaw as a system service. These variables are independent and control different paths.

Variable	Description	Default Path
`PICOCLAW_CONFIG`	Overrides the path to the configuration file. This directly tells picoclaw which `config.json` to load, ignoring all other locations.	`~/.picoclaw/config.json`
`PICOCLAW_HOME`	Overrides the root directory for picoclaw data. This changes the default location of the `workspace` and other data directories.	`~/.picoclaw`

Examples:

# Run picoclaw using a specific config file
# The workspace path will be read from within that config file
PICOCLAW_CONFIG=/etc/picoclaw/production.json picoclaw gateway

# Run picoclaw with all its data stored in /opt/picoclaw
# Config will be loaded from the default ~/.picoclaw/config.json
# Workspace will be created at /opt/picoclaw/workspace
PICOCLAW_HOME=/opt/picoclaw picoclaw agent

# Use both for a fully customized setup
PICOCLAW_HOME=/srv/picoclaw PICOCLAW_CONFIG=/srv/picoclaw/main.json picoclaw gateway
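
The way the two variables combine can be sketched as follows. This is a minimal illustration of the behavior described above, not PicoClaw's actual loader: the function name `resolve_paths` is hypothetical, and the returned workspace is only the default that applies when the config file does not set one.

```python
import os
from pathlib import Path


def resolve_paths(env):
    """Return (config_path, home_dir, default_workspace) for an environment mapping."""
    default_home = Path(os.path.expanduser("~/.picoclaw"))

    # PICOCLAW_HOME moves the data root (and therefore the default workspace).
    home = Path(env["PICOCLAW_HOME"]) if env.get("PICOCLAW_HOME") else default_home

    # PICOCLAW_CONFIG is independent: per the examples above, setting only
    # PICOCLAW_HOME still loads the config from the default location.
    if env.get("PICOCLAW_CONFIG"):
        config = Path(env["PICOCLAW_CONFIG"])
    else:
        config = default_home / "config.json"

    return config, home, home / "workspace"
```

Calling `resolve_paths(os.environ)` applies the same rules to the current process environment.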

Workspace Layout

PicoClaw stores data in your configured workspace (default: ~/.picoclaw/workspace):

~/.picoclaw/workspace/
├── sessions/         # Conversation sessions and history
├── memory/           # Long-term memory (MEMORY.md)
├── state/            # Persistent state (last channel, etc.)
├── cron/             # Scheduled jobs database
├── skills/           # Custom skills
├── AGENT.md          # Agent behavior guide
├── HEARTBEAT.md      # Periodic task prompts (checked every 30 min)
├── IDENTITY.md       # Agent identity
├── SOUL.md           # Agent soul
└── USER.md           # User preferences

Note: Changes to AGENT.md, SOUL.md, USER.md and memory/MEMORY.md are automatically detected at runtime via file modification time (mtime) tracking. You do not need to restart the gateway after editing these files — the agent picks up the new content on the next request.
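
The idea behind mtime tracking can be illustrated with a small cache that re-reads a file only when its modification time changes. This is a sketch of the general technique under the assumption of a simple per-path cache; the class name `MtimeCache` is hypothetical and PicoClaw's own implementation may differ.

```python
import os


class MtimeCache:
    """Cache file contents and re-read only when the modification time changes."""

    def __init__(self):
        self._entries = {}  # path -> (mtime_ns, text)

    def read(self, path):
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # unchanged since last read: serve cached text
        with open(path, encoding="utf-8") as fh:
            text = fh.read()
        self._entries[path] = (mtime, text)
        return text
```

Calling `read()` once per request is enough for edits to show up on the next request without a restart.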

Skill Sources

By default, skills are loaded from:

~/.picoclaw/workspace/skills (workspace)
~/.picoclaw/skills (global)
<binary-embedded-path>/skills (builtin, set at build time)

For advanced/test setups, you can override the builtin skills root with:

export PICOCLAW_BUILTIN_SKILLS=/path/to/skills

Unified Command Execution Policy

Generic slash commands are executed through a single path in pkg/agent/loop.go via commands.Executor.
Channel adapters no longer consume generic commands locally; they forward inbound text to the bus/agent path. Telegram still auto-registers supported commands at startup.
An unknown slash command (for example /foo) passes through to normal LLM processing.
A command that is registered but unsupported on the current channel (for example /show on WhatsApp) returns an explicit user-facing error and stops further processing.
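
The three outcomes described above (execute, pass through, reject) can be sketched as a single dispatch function. The registry contents and the function name `dispatch` are hypothetical and chosen only for illustration; this is not the code in pkg/agent/loop.go.

```python
# Hypothetical registry: command name -> channels that support it.
REGISTERED = {
    "/show": {"telegram", "discord"},
    "/help": {"telegram", "discord", "whatsapp"},
}


def dispatch(text, channel):
    """Return (outcome, detail), where outcome is "executed", "error", or "llm"."""
    if not text.startswith("/"):
        return ("llm", text)  # ordinary message
    name = text.split(maxsplit=1)[0]
    channels = REGISTERED.get(name)
    if channels is None:
        return ("llm", text)  # unknown command: pass through to the LLM
    if channel not in channels:
        # Registered but unsupported here: explicit error, stop processing.
        return ("error", f"{name} is not supported on {channel}")
    return ("executed", name)
```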

Agent Bindings (Route messages to specific agents)

Use bindings in config.json to route incoming messages to different agents by channel/account/context.

{
  "agents": {
    "defaults": {
      "workspace": "~/.picoclaw/workspace",
      "model_name": "gpt-4o-mini"
    },
    "list": [
      { "id": "main", "default": true, "name": "Main Assistant" },
      { "id": "support", "name": "Support Assistant" },
      { "id": "sales", "name": "Sales Assistant" }
    ]
  },
  "bindings": [
    {
      "agent_id": "support",
      "match": {
        "channel": "telegram",
        "account_id": "*",
        "peer": { "kind": "direct", "id": "user123" }
      }
    },
    {
      "agent_id": "sales",
      "match": {
        "channel": "discord",
        "account_id": "my-discord-bot",
        "guild_id": "987654321"
      }
    }
  ]
}

`bindings` fields

Field	Required	Description
`agent_id`	Yes	Target agent id in `agents.list`
`match.channel`	Yes	Channel name (e.g. `telegram`, `discord`)
`match.account_id`	No	Channel account filter. Use `"*"` for all accounts of that channel. If omitted, only the default account is matched
`match.peer.kind` + `match.peer.id`	No	Exact peer match (e.g. direct chat / topic / group id)
`match.guild_id`	No	Guild/server-level match
`match.team_id`	No	Team/workspace-level match

Matching priority

When multiple bindings exist, PicoClaw resolves in this order:

1. peer
2. parent_peer (for thread/topic parent contexts)
3. guild_id
4. team_id
5. account_id (non-wildcard)
6. channel wildcard (account_id: "*")
7. default agent

If a binding points to a missing agent_id, PicoClaw falls back to the default agent.

How matching works (step-by-step)

1. PicoClaw first filters bindings by match.channel (must equal current channel).
2. It then filters by match.account_id:
   - omitted: match only the channel's default account
   - "*": match all accounts on this channel
   - explicit value: exact account id match (case-insensitive)
3. From the remaining candidates, it applies the priority chain above and stops at the first hit.

In other words: channel + account form the candidate set; peer/guild/team then decide final winner.
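
The matching steps can be sketched in a few lines. This is an illustrative model of the rules described above, not PicoClaw's resolver. It makes several assumptions: the message is a plain dict, the default account id is passed in as `default_account`, a binding with no account_id sits on the same tier as an explicit one, and a binding is ranked by its most specific field only.

```python
def _account_ok(match, account_id, default_account):
    want = match.get("account_id")
    if want is None:
        return account_id == default_account  # omitted: default account only
    if want == "*":
        return True  # wildcard: every account on the channel
    return want.lower() == account_id.lower()  # explicit: case-insensitive


def _tier(match, msg):
    """Priority tier at which the binding matches (lower wins), or None."""
    peer = match.get("peer")
    if peer is not None:
        if peer == msg.get("peer"):
            return 0
        if peer == msg.get("parent_peer"):
            return 1
        return None
    if "guild_id" in match:
        return 2 if match["guild_id"] == msg.get("guild_id") else None
    if "team_id" in match:
        return 3 if match["team_id"] == msg.get("team_id") else None
    return 5 if match.get("account_id") == "*" else 4


def resolve_agent(bindings, agents, msg, default_account="default"):
    default_id = next(a["id"] for a in agents if a.get("default"))
    best = None  # (tier, agent_id)
    for b in bindings:
        m = b["match"]
        if m["channel"] != msg["channel"]:
            continue
        if not _account_ok(m, msg["account_id"], default_account):
            continue
        tier = _tier(m, msg)
        # Strict "<" keeps the earlier entry when two bindings tie.
        if tier is not None and (best is None or tier < best[0]):
            best = (tier, b["agent_id"])
    known = {a["id"] for a in agents}
    if best is None or best[1] not in known:
        return default_id  # no match, or binding points at a missing agent
    return best[1]
```

With the example configuration from this section, a Telegram direct message from `user123` resolves to `support`, and any Discord message outside guild `987654321` falls through to the default agent.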

Common recipes

1) Route one specific DM user to a specialist agent

{
  "agent_id": "support",
  "match": {
    "channel": "telegram",
    "account_id": "*",
    "peer": { "kind": "direct", "id": "user123" }
  }
}

2) Route one Discord server (guild) to a dedicated agent

{
  "agent_id": "sales",
  "match": {
    "channel": "discord",
    "account_id": "my-discord-bot",
    "guild_id": "987654321"
  }
}

3) Route all remaining traffic of a channel to a fallback agent

{
  "agent_id": "main",
  "match": {
    "channel": "discord",
    "account_id": "*"
  }
}

Authoring guidelines (important)

Keep exactly one clear default agent in agents.list ("default": true).
Specific rules (peer, guild_id, team_id) and broad rules (account_id: "*" only) can safely coexist; the priority chain guarantees that specific rules win.
Avoid duplicate rules with the same specificity and match values. If duplicates exist, the first matching entry in the config array wins.
Ensure every agent_id exists in agents.list; unknown IDs silently fall back to the default agent.
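
Because unknown agent ids fall back silently, it can help to check a config before deploying it. The following is a hypothetical helper (`lint_bindings` is not part of PicoClaw) that checks the guidelines above against a parsed config.json.

```python
import json


def lint_bindings(config):
    """Return a list of human-readable problems found in a parsed config."""
    problems = []
    agents = config.get("agents", {}).get("list", [])

    defaults = [a["id"] for a in agents if a.get("default")]
    if len(defaults) != 1:
        problems.append(f"expected exactly one default agent, found {len(defaults)}")

    known = {a["id"] for a in agents}
    seen = set()
    for i, b in enumerate(config.get("bindings", [])):
        if b["agent_id"] not in known:
            problems.append(f"bindings[{i}]: unknown agent_id {b['agent_id']!r}")
        # Canonical JSON makes two matches with the same values compare equal.
        key = json.dumps(b["match"], sort_keys=True)
        if key in seen:
            problems.append(f"bindings[{i}]: duplicate match, an earlier entry wins")
        seen.add(key)
    return problems
```

An empty list means none of the checked guidelines is violated.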

Troubleshooting checklist

Rule not taking effect? Check match.channel spelling first (must be exact).
Expected account-specific routing but still using default? Verify match.account_id equals actual runtime account id.
Wildcard catches too much traffic? Add more specific peer/guild/team rules for critical paths.
Unexpected default fallback? Confirm agent_id exists and is not misspelled.

🔒 Security Sandbox

PicoClaw runs in a sandboxed environment by default. The agent can only access files and execute commands within the configured workspace.

Default Configuration

{
  "agents": {
    "defaults": {
      "workspace": "~/.picoclaw/workspace",
      "restrict_to_workspace": true
    }
  }
}

Option	Default	Description
`workspace`	`~/.picoclaw/workspace`	Working directory for the agent
`restrict_to_workspace`	`true`	Restrict file/command access to workspace

Protected Tools

When restrict_to_workspace: true, the following tools are sandboxed:

Tool	Function	Restriction
`read_file`	Read files	Only files within workspace
`write_file`	Write files	Only files within workspace
`list_dir`	List directories	Only directories within workspace
`edit_file`	Edit files	Only files within workspace
`append_file`	Append to files	Only files within workspace
`exec`	Execute commands	Command paths must be within workspace

Additional Exec Protection

Even with restrict_to_workspace: false, the exec tool blocks these dangerous commands:

rm -rf, del /f, rmdir /s — Bulk deletion
format, mkfs, diskpart — Disk formatting
dd if= — Disk imaging
Writing to /dev/sd[a-z] — Direct disk writes
shutdown, reboot, poweroff — System shutdown
Fork bomb :(){ :|:& };:

File Access Control

Config Key	Type	Default	Description
`tools.allow_read_paths`	string[]	`[]`	Additional paths allowed for reading outside workspace
`tools.allow_write_paths`	string[]	`[]`	Additional paths allowed for writing outside workspace

Exec Security

Config Key	Type	Default	Description
`tools.exec.allow_remote`	bool	`false`	Allow exec tool from remote channels (Telegram/Discord etc.)
`tools.exec.enable_deny_patterns`	bool	`true`	Enable dangerous command interception
`tools.exec.custom_deny_patterns`	string[]	`[]`	Custom regex patterns to block
`tools.exec.custom_allow_patterns`	string[]	`[]`	Custom regex patterns to allow
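
Deny-pattern interception is essentially a regex scan of the command line before it runs. The sketch below illustrates that idea only: the patterns are simplified stand-ins rather than PicoClaw's real list, and `is_blocked` is a hypothetical name. It leaves out `custom_allow_patterns`, because this guide does not describe how allow and deny patterns interact.

```python
import re

# Illustrative stand-ins for some of the built-in patterns listed above.
_BUILTIN_DENY = [
    re.compile(p)
    for p in (
        r"\brm\s+-rf\b",                       # bulk deletion
        r"\b(mkfs|diskpart)\b",                # disk formatting
        r"\bdd\s+if=",                         # disk imaging
        r">\s*/dev/sd[a-z]",                   # direct disk writes
        r"\b(shutdown|reboot|poweroff)\b",     # system shutdown
    )
]


def is_blocked(command, custom_deny=()):
    """Return True if the command matches a built-in or custom deny pattern."""
    patterns = _BUILTIN_DENY + [re.compile(p) for p in custom_deny]
    return any(p.search(command) for p in patterns)
```

For example, passing `custom_deny=[r"\bcurl\b"]` would additionally block any command that invokes `curl`.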

Security Note: Symlink protection is enabled by default — all file paths are resolved through filepath.EvalSymlinks before whitelist matching, preventing symlink escape attacks.
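
The reason for resolving symlinks before matching can be shown with a small path check. This sketch uses Python's `os.path.realpath` as a rough analogue of Go's `filepath.EvalSymlinks`. The function name `is_path_allowed` and its parameters are hypothetical, with `extra_allowed` standing in for the allow-list settings above.

```python
import os


def is_path_allowed(path, workspace, extra_allowed=()):
    """Check a path against the workspace and extra roots after resolving symlinks."""
    real = os.path.realpath(path)  # follows symlinks, returns an absolute path
    roots = [os.path.realpath(workspace)]
    roots += [os.path.realpath(p) for p in extra_allowed]
    for root in roots:
        try:
            # The path is inside the root if the root is their common ancestor.
            if os.path.commonpath([real, root]) == root:
                return True
        except ValueError:
            continue  # e.g. paths on different drives on Windows
    return False
```

A symlink inside the workspace that points outside it is rejected, because the check runs on the resolved target rather than on the literal path.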

Known Limitation: Child Processes From Build Tools

The exec safety guard only inspects the command line PicoClaw launches directly. It does not recursively inspect child processes spawned by allowed developer tools such as make, go run, cargo, npm run, or custom build scripts.

That means a top-level command can still compile or launch other binaries after it passes the initial guard check. In practice, treat build scripts, Makefiles, package scripts, and generated binaries as executable code that needs the same level of review as a direct shell command.

For higher-risk environments:

Review build scripts before execution.
Prefer approval/manual review for compile-and-run workflows.
Run PicoClaw inside a container or VM if you need stronger isolation than the built-in guard provides.

Error Examples

[ERROR] tool: Tool execution failed
{tool=exec, error=Command blocked by safety guard (path outside working dir)}

[ERROR] tool: Tool execution failed
{tool=exec, error=Command blocked by safety guard (dangerous pattern detected)}

Disabling Restrictions (Security Risk)

If you need the agent to access paths outside the workspace:

Method 1: Config file

{
  "agents": {
    "defaults": {
      "restrict_to_workspace": false
    }
  }
}

Method 2: Environment variable

export PICOCLAW_AGENTS_DEFAULTS_RESTRICT_TO_WORKSPACE=false

⚠️ Warning: Disabling this restriction allows the agent to access any path on your system. Use with caution in controlled environments only.

Security Boundary Consistency

The restrict_to_workspace setting applies consistently across all execution paths:

Execution Path	Security Boundary
Main Agent	`restrict_to_workspace` ✅
Subagent / Spawn	Inherits same restriction ✅
Heartbeat tasks	Inherits same restriction ✅

All paths share the same workspace restriction — there's no way to bypass the security boundary through subagents or scheduled tasks.

Heartbeat (Periodic Tasks)

PicoClaw can perform periodic tasks automatically. Create a HEARTBEAT.md file in your workspace:

# Periodic Tasks

- Check my email for important messages
- Review my calendar for upcoming events
- Check the weather forecast

The agent will read this file every 30 minutes (configurable) and execute any tasks using available tools.

Async Tasks with Spawn

For long-running tasks (web search, API calls), use the spawn tool to create a subagent:

# Periodic Tasks

## Quick Tasks (respond directly)

- Report current time

## Long Tasks (use spawn for async)

- Search the web for AI news and summarize
- Check email and report important messages

Key behaviors:

Feature	Description
spawn	Creates async subagent, doesn't block heartbeat
Independent context	Subagent has its own context, no session history
message tool	Subagent communicates with user directly via message tool
Non-blocking	After spawning, heartbeat continues to next task

How Subagent Communication Works

Heartbeat triggers
    ↓
Agent reads HEARTBEAT.md
    ↓
For long task: spawn subagent
    ↓                           ↓
Continue to next task      Subagent works independently
    ↓                           ↓
All tasks done            Subagent uses "message" tool
    ↓                           ↓
Respond HEARTBEAT_OK      User receives result directly

The subagent has access to tools (message, web_search, etc.) and can communicate with the user independently without going through the main agent.
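
The non-blocking flow in the diagram can be modelled with asyncio. This is a conceptual sketch, not PicoClaw's implementation (which is not written in Python). `run_heartbeat`, `handle_quick`, and `handle_long` are hypothetical names, and each task is assumed to be pre-labelled as quick or long.

```python
import asyncio


async def run_heartbeat(tasks, handle_quick, handle_long):
    """tasks: list of (kind, text) pairs, where kind is "quick" or "long"."""
    spawned = []
    for kind, text in tasks:
        if kind == "long":
            # Spawn and move on: the heartbeat does not wait for the result.
            spawned.append(asyncio.create_task(handle_long(text)))
        else:
            await handle_quick(text)
    # The heartbeat finishes while spawned work may still be running.
    return "HEARTBEAT_OK", spawned
```

In this model the spawned coroutine plays the role of the subagent and would deliver its own result, analogous to the message tool, instead of returning it through the heartbeat.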

Configuration:

{
  "heartbeat": {
    "enabled": true,
    "interval": 30
  }
}

Option	Default	Description
`enabled`	`true`	Enable/disable heartbeat
`interval`	`30`	Check interval in minutes (min: 5)

Environment variables:

PICOCLAW_HEARTBEAT_ENABLED=false to disable
PICOCLAW_HEARTBEAT_INTERVAL=60 to change interval
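
How the defaults, the config file, and the environment variables could combine is sketched below. Three things here are assumptions rather than documented behavior: environment variables are taken to override the config file, intervals below the 5-minute minimum are raised to it rather than rejected, and the accepted spellings of a true value are illustrative. `heartbeat_settings` is a hypothetical name.

```python
def heartbeat_settings(config, env):
    """Merge defaults, config file values, and environment overrides."""
    settings = {"enabled": True, "interval": 30}  # documented defaults
    settings.update(config.get("heartbeat", {}))

    if "PICOCLAW_HEARTBEAT_ENABLED" in env:
        raw = env["PICOCLAW_HEARTBEAT_ENABLED"].strip().lower()
        settings["enabled"] = raw in ("1", "true", "yes")
    if "PICOCLAW_HEARTBEAT_INTERVAL" in env:
        settings["interval"] = int(env["PICOCLAW_HEARTBEAT_INTERVAL"])

    # Assumption: values under the documented minimum are raised to 5 minutes.
    settings["interval"] = max(5, settings["interval"])
    return settings
```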

Providers

Note

Groq provides free voice transcription via Whisper. If configured, audio messages from any channel will be automatically transcribed at the agent level.

Provider	Purpose	Get API Key
`gemini`	LLM (Gemini direct)	aistudio.google.com
`zhipu`	LLM (Zhipu direct)	bigmodel.cn
`volcengine`	LLM (Volcengine direct)	volcengine.com
`openrouter`	LLM (recommended, access to all models)	openrouter.ai
`anthropic`	LLM (Claude direct)	console.anthropic.com
`openai`	LLM (GPT direct)	platform.openai.com
`deepseek`	LLM (DeepSeek direct)	platform.deepseek.com
`qwen`	LLM (Qwen direct)	dashscope.console.aliyun.com
`groq`	LLM + Voice transcription (Whisper)	console.groq.com
`cerebras`	LLM (Cerebras direct)	cerebras.ai
`vivgrid`	LLM (Vivgrid direct)	vivgrid.com

Model Configuration (model_list)

What's New? PicoClaw now uses a model-centric configuration approach. Specify a model in vendor/model format (e.g., zhipu/glm-4.7) to add a new provider; no code changes are required.

This design also enables multi-agent support with flexible provider selection:

Different agents, different providers: Each agent can use its own LLM provider
Model fallbacks: Configure primary and fallback models for resilience
Load balancing: Distribute requests across multiple endpoints
Centralized configuration: Manage all providers in one place

All Supported Vendors

Vendor	`model` Prefix	Default API Base	Protocol	API Key
OpenAI	`openai/`	`https://api.openai.com/v1`	OpenAI	Get Key
Anthropic	`anthropic/`	`https://api.anthropic.com/v1`	Anthropic	Get Key
智谱 AI (GLM)	`zhipu/`	`https://open.bigmodel.cn/api/paas/v4`	OpenAI	Get Key
DeepSeek	`deepseek/`	`https://api.deepseek.com/v1`	OpenAI	Get Key
Google Gemini	`gemini/`	`https://generativelanguage.googleapis.com/v1beta`	OpenAI	Get Key
Groq	`groq/`	`https://api.groq.com/openai/v1`	OpenAI	Get Key
Moonshot	`moonshot/`	`https://api.moonshot.cn/v1`	OpenAI	Get Key
通义千问 (Qwen)	`qwen/`	`https://dashscope.aliyuncs.com/compatible-mode/v1`	OpenAI	Get Key
NVIDIA	`nvidia/`	`https://integrate.api.nvidia.com/v1`	OpenAI	Get Key
Ollama	`ollama/`	`http://localhost:11434/v1`	OpenAI	Local (no key needed)
OpenRouter	`openrouter/`	`https://openrouter.ai/api/v1`	OpenAI	Get Key
LiteLLM Proxy	`litellm/`	`http://localhost:4000/v1`	OpenAI	Your LiteLLM proxy key
vLLM	`vllm/`	`http://localhost:8000/v1`	OpenAI	Local
Cerebras	`cerebras/`	`https://api.cerebras.ai/v1`	OpenAI	Get Key
VolcEngine (Doubao)	`volcengine/`	`https://ark.cn-beijing.volces.com/api/v3`	OpenAI	Get Key
神算云	`shengsuanyun/`	`https://router.shengsuanyun.com/api/v1`	OpenAI	—
BytePlus	`byteplus/`	`https://ark.ap-southeast.bytepluses.com/api/v3`	OpenAI	Get Key
Vivgrid	`vivgrid/`	`https://api.vivgrid.com/v1`	OpenAI	Get Key
LongCat	`longcat/`	`https://api.longcat.chat/openai`	OpenAI	Get Key
ModelScope (魔搭)	`modelscope/`	`https://api-inference.modelscope.cn/v1`	OpenAI	Get Token
Antigravity	`antigravity/`	Google Cloud	Custom	OAuth only
GitHub Copilot	`github-copilot/`	`localhost:4321`	gRPC	—

Basic Configuration

{
  "model_list": [
    {
      "model_name": "ark-code-latest",
      "model": "volcengine/ark-code-latest",
      "api_key": "sk-your-api-key"
    },
    {
      "model_name": "gpt-5.4",
      "model": "openai/gpt-5.4",
      "api_key": "sk-your-openai-key"
    },
    {
      "model_name": "claude-sonnet-4.6",
      "model": "anthropic/claude-sonnet-4.6",
      "api_key": "sk-ant-your-key"
    },
    {
      "model_name": "glm-4.7",
      "model": "zhipu/glm-4.7",
      "api_key": "your-zhipu-key"
    }
  ],
  "agents": {
    "defaults": {
      "model": "gpt-5.4"
    }
  }
}
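
The relationship between an agent's model alias, the `model_list` entry, and the vendor defaults can be sketched as a lookup. This is an illustration under assumptions, not PicoClaw's resolver: `resolve_model` is a hypothetical name, the defaults table copies only a few rows from the vendor table above, and with several entries sharing one model_name it simply returns the first.

```python
# Subset of the vendor table above.
DEFAULT_API_BASE = {
    "openai": "https://api.openai.com/v1",
    "anthropic": "https://api.anthropic.com/v1",
    "zhipu": "https://open.bigmodel.cn/api/paas/v4",
    "volcengine": "https://ark.cn-beijing.volces.com/api/v3",
}


def resolve_model(config, alias):
    """Find a model_list entry by model_name and split its vendor/model string."""
    for entry in config["model_list"]:
        if entry["model_name"] != alias:
            continue
        vendor, sep, model_id = entry["model"].partition("/")
        if not sep or not model_id:
            raise ValueError(f"expected vendor/model, got {entry['model']!r}")
        return {
            "vendor": vendor,
            "model_id": model_id,
            # An explicit api_base wins over the vendor default.
            "api_base": entry.get("api_base") or DEFAULT_API_BASE.get(vendor),
        }
    raise KeyError(f"no model_list entry named {alias!r}")
```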

Vendor-Specific Examples

OpenAI

{
  "model_name": "gpt-5.4",
  "model": "openai/gpt-5.4",
  "api_key": "sk-..."
}

VolcEngine (Doubao)

{
  "model_name": "ark-code-latest",
  "model": "volcengine/ark-code-latest",
  "api_key": "sk-..."
}

智谱 AI (GLM)

{
  "model_name": "glm-4.7",
  "model": "zhipu/glm-4.7",
  "api_key": "your-key"
}

DeepSeek

{
  "model_name": "deepseek-chat",
  "model": "deepseek/deepseek-chat",
  "api_key": "sk-..."
}

Anthropic

{
  "model_name": "claude-sonnet-4.6",
  "model": "anthropic/claude-sonnet-4.6",
  "api_key": "sk-ant-your-key"
}

Run picoclaw auth login --provider anthropic to paste your API token.

For direct Anthropic API access or custom endpoints that only support Anthropic's native message format:

{
  "model_name": "claude-opus-4-6",
  "model": "anthropic-messages/claude-opus-4-6",
  "api_key": "sk-ant-your-key",
  "api_base": "https://api.anthropic.com"
}

Use anthropic-messages when the endpoint requires Anthropic's native /v1/messages format instead of OpenAI-compatible /v1/chat/completions.

Ollama (local)

{
  "model_name": "llama3",
  "model": "ollama/llama3"
}

Custom Proxy / LiteLLM

{
  "model_name": "my-custom-model",
  "model": "openai/custom-model",
  "api_base": "https://my-proxy.com/v1",
  "api_key": "sk-..."
}

PicoClaw strips only the outer litellm/ prefix before sending the request, so litellm/lite-gpt4 sends lite-gpt4, while litellm/openai/gpt-4o sends openai/gpt-4o.
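
The prefix rule can be stated in two lines of code. The function name `upstream_model_name` is hypothetical; the behavior mirrors the two examples in the sentence above.

```python
def upstream_model_name(model):
    """Strip only the outer litellm/ prefix; leave any inner vendor prefix intact."""
    prefix = "litellm/"
    return model[len(prefix):] if model.startswith(prefix) else model
```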

Load Balancing

Configure multiple endpoints for the same model name — PicoClaw will automatically round-robin between them:

{
  "model_list": [
    {
      "model_name": "gpt-5.4",
      "model": "openai/gpt-5.4",
      "api_base": "https://api1.example.com/v1",
      "api_key": "sk-key1"
    },
    {
      "model_name": "gpt-5.4",
      "model": "openai/gpt-5.4",
      "api_base": "https://api2.example.com/v1",
      "api_key": "sk-key2"
    }
  ]
}
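
Round-robin selection over entries that share a `model_name` can be sketched as follows. This is a minimal single-threaded illustration with a hypothetical class name. It does not model retries, health checks, or concurrency, and PicoClaw's own balancer may behave differently in those respects.

```python
import itertools
from collections import defaultdict


class RoundRobin:
    """Rotate through model_list entries that share the same model_name."""

    def __init__(self, model_list):
        groups = defaultdict(list)
        for entry in model_list:
            groups[entry["model_name"]].append(entry)
        # One endless iterator per model name, in config order.
        self._cycles = {name: itertools.cycle(entries) for name, entries in groups.items()}

    def next_endpoint(self, model_name):
        return next(self._cycles[model_name])
```

With the two-endpoint configuration above, successive calls to `next_endpoint("gpt-5.4")` alternate between the `api1` and `api2` entries.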

Migration from Legacy `providers` Config

The old providers configuration is deprecated but still supported for backward compatibility. See docs/migration/model-list-migration.md for the full guide.

Provider Architecture

PicoClaw routes providers by protocol family:

OpenAI-compatible: OpenRouter, Groq, Zhipu, vLLM-style endpoints, and most others.
Anthropic: Claude-native API behavior.
Codex/OAuth: OpenAI OAuth/token authentication route.

This keeps the runtime lightweight while making new OpenAI-compatible backends mostly a config operation (api_base + api_key).

Zhipu (legacy providers format)

{
  "agents": {
    "defaults": {
      "workspace": "~/.picoclaw/workspace",
      "model": "glm-4.7",
      "max_tokens": 8192,
      "temperature": 0.7,
      "max_tool_iterations": 20
    }
  },
  "providers": {
    "zhipu": {
      "api_key": "Your API Key",
      "api_base": "https://open.bigmodel.cn/api/paas/v4"
    }
  }
}

Full config example

{
  "agents": {
    "defaults": {
      "model": "anthropic/claude-opus-4-5"
    }
  },
  "session": {
    "dm_scope": "per-channel-peer",
    "backlog_limit": 20
  },
  "providers": {
    "openrouter": {
      "api_key": "sk-or-v1-xxx"
    },
    "groq": {
      "api_key": "gsk_xxx"
    }
  },
  "channels": {
    "telegram": {
      "enabled": true,
      "token": "123456:ABC...",
      "allow_from": ["123456789"]
    }
  },
  "tools": {
    "web": {
      "duckduckgo": {
        "enabled": true,
        "max_results": 5
      }
    }
  },
  "heartbeat": {
    "enabled": true,
    "interval": 30
  }
}

Scheduled Tasks / Reminders

PicoClaw supports cron-style scheduled tasks via the cron tool. The agent can set, list, and cancel reminders or recurring jobs that trigger at specified times.

{
  "tools": {
    "cron": {
      "enabled": true,
      "exec_timeout_minutes": 5
    }
  }
}

Scheduled tasks persist across restarts and are stored in ~/.picoclaw/workspace/cron/.

Advanced Topics

Topic	Description
Hook System	Event-driven hooks: observers, interceptors, approval hooks
Steering	Inject messages into a running agent loop between tool calls
SubTurn	Subagent coordination, concurrency control, lifecycle
Context Management	Context boundary detection, proactive budget check, compression
