mirror of
https://github.com/sipeed/picoclaw.git
synced 2026-06-12 18:08:54 +00:00
refactor(docs): reorganize docs by type and locale
@@ -0,0 +1,285 @@
|
||||
# Config Schema Versioning Guide
|
||||
|
||||
## Overview
|
||||
|
||||
PicoClaw uses a schema versioning system for `config.json` to ensure smooth upgrades as the configuration format evolves.
|
||||
|
||||
## Version History
|
||||
|
||||
### Version 1
|
||||
- **Introduction**: Initial version with version field support
|
||||
- **Changes**: Added `version` field to Config struct
|
||||
- **Migration**: No structural changes needed for existing configs
|
||||
|
||||
### Version 2
|
||||
- **Introduction**: Model enable/disable support and channel config unification
|
||||
- **Changes**:
|
||||
- Added `enabled` field to `ModelConfig` — allows disabling individual model entries without removing them
|
||||
- During V1→V2 migration, `enabled` is auto-inferred: models with API keys or the reserved `local-model` name are enabled; others default to disabled
|
||||
- Migrated legacy channel fields: Discord `mention_only` → `group_trigger.mention_only`, OneBot `group_trigger_prefix` → `group_trigger.prefixes`
|
||||
- V0 configs now migrate directly to CurrentVersion (V2) instead of going through V1
|
||||
- `makeBackup()` now uses date-only suffix (e.g., `config.json.20260330.bak`) and also backs up `.security.yml`
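
The `enabled` inference rule described above can be sketched in a few lines of Go. This is only an illustration of the stated rule, not PicoClaw's actual migration code: the `modelEntry` type, its field names, and the `inferEnabled` helper are hypothetical stand-ins for `ModelConfig`.

```go
package main

import "fmt"

// modelEntry is a hypothetical, trimmed-down stand-in for ModelConfig.
type modelEntry struct {
	ModelName string
	APIKeys   []string
}

// reservedLocalModel is the reserved name that is always treated as enabled.
const reservedLocalModel = "local-model"

// inferEnabled applies the V1→V2 rule: a model is enabled if it uses the
// reserved local name or has at least one non-empty API key.
func inferEnabled(m modelEntry) bool {
	if m.ModelName == reservedLocalModel {
		return true
	}
	for _, k := range m.APIKeys {
		if k != "" {
			return true
		}
	}
	return false
}

func main() {
	models := []modelEntry{
		{ModelName: "local-model"},
		{ModelName: "with-key", APIKeys: []string{"sk-example"}},
		{ModelName: "no-key"},
	}
	for _, m := range models {
		fmt.Printf("%s enabled=%v\n", m.ModelName, inferEnabled(m))
	}
}
```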
|
||||
|
||||
### Version 3
|
||||
- **Introduction**: Enhanced type safety and improved error handling
|
||||
- **Changes**:
|
||||
- Added comma-ok type assertions in channel configuration decoding to prevent potential panics
|
||||
- Improved error logging for Weixin channel configuration decoding
|
||||
- Enhanced security configuration documentation and examples
|
||||
- **Auto-migration**: V2 configs are automatically migrated to V3 on load with no user action required
|
||||
- **Backup**: Before migration, the system creates a date-stamped backup (e.g., `config.json.20260413.bak`) in the same directory
|
||||
- **Downgrade risk**: Once migrated to V3, the config cannot be safely loaded by older V2-only versions. To downgrade, restore from the auto-created backup file.
|
||||
|
||||
## How It Works
|
||||
|
||||
### Automatic Migration
|
||||
When you load a config file:
|
||||
1. The system first reads the `version` field from the JSON
|
||||
2. Based on the detected version, it loads the appropriate config struct (`configV0`, `configV1`, etc.)
|
||||
3. If the loaded version is less than the latest, migrations are applied incrementally
|
||||
4. Before saving, the system automatically creates a date-stamped backup of `config.json` and `.security.yml`
|
||||
5. The version number is updated automatically
|
||||
6. The migrated config is automatically saved back to disk
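
The first steps of this flow (reading the `version` field and deciding whether a migration is needed) can be sketched as follows. This is a minimal illustration under stated assumptions: `versionProbe`, `detectVersion`, and `needsMigration` are hypothetical names, and the current version is assumed to be 3 as described in this guide.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// currentVersion mirrors the guide's CurrentVersion constant (assumed to be 3).
const currentVersion = 3

// versionProbe decodes only the "version" field; a missing field decodes to 0.
type versionProbe struct {
	Version int `json:"version"`
}

// detectVersion returns the schema version stored in raw config JSON.
func detectVersion(data []byte) (int, error) {
	var p versionProbe
	if err := json.Unmarshal(data, &p); err != nil {
		return 0, fmt.Errorf("read version: %w", err)
	}
	return p.Version, nil
}

// needsMigration reports whether a config at version v must be upgraded and
// rejects versions that this build does not know about.
func needsMigration(v int) (bool, error) {
	if v < 0 || v > currentVersion {
		return false, fmt.Errorf("unsupported config version: %d", v)
	}
	return v < currentVersion, nil
}

func main() {
	for _, raw := range []string{`{"agents":{}}`, `{"version":2}`, `{"version":3}`} {
		v, err := detectVersion([]byte(raw))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		migrate, err := needsMigration(v)
		fmt.Printf("version=%d migrate=%v err=%v\n", v, migrate, err)
	}
}
```

Probing the version with a tiny struct first keeps the loader free to pick the right version-specific struct before decoding the full file.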
|
||||
|
||||
### Version Field
|
||||
The `version` field in `config.json` indicates the schema version:
|
||||
- `0` or missing: Legacy config (no version field)
|
||||
- `1` or `2`: Previous versions (auto-migrated to V3 on load)

- `3`: Current version
|
||||
|
||||
```json
|
||||
{
|
||||
"version": 3,
|
||||
"agents": {...},
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
## Adding a New Migration
|
||||
|
||||
When making breaking changes to the config schema:
|
||||
|
||||
### Step 1: Define the New Version Struct
|
||||
|
||||
Create a new struct for the new version if the structure changes significantly:
|
||||
|
||||
```go
// ConfigV3 represents the version 3 config structure
type ConfigV3 struct {
	Version int          `json:"version"`
	Agents  AgentsConfig `json:"agents"`
	// ... other fields with new structure
}
```
|
||||
|
||||
### Step 2: Update Current Config Version
|
||||
|
||||
```go
|
||||
const CurrentVersion = 3 // Increment this
|
||||
```
|
||||
|
||||
### Step 3: Add a Loader Function
|
||||
|
||||
```go
// loadConfigV3 loads a version 3 config
func loadConfigV3(data []byte) (*Config, error) {
	cfg := DefaultConfig()

	// Parse to ConfigV3 struct
	var v3 ConfigV3
	if err := json.Unmarshal(data, &v3); err != nil {
		return nil, err
	}

	// Convert to current Config
	cfg.Version = v3.Version
	cfg.Agents = v3.Agents
	// ... map other fields

	return cfg, nil
}
```
|
||||
|
||||
### Step 4: Add Migration Logic
|
||||
|
||||
```go
func (c *configV2) Migrate() (*Config, error) {
	// Start from the V2 data and bump the version
	migrated := &c.Config
	migrated.Version = 3

	// Apply V2→V3 structural changes here

	return migrated, nil
}
```
|
||||
|
||||
### Step 5: Update LoadConfig Switch
|
||||
|
||||
```go
func LoadConfig(path string) (*Config, error) {
	// ... read file ...

	switch versionInfo.Version {
	case 0:
		cfg, err = loadConfigV0(data)
	case 1:
		cfg, err = loadConfigV1(data)
	case 2:
		cfg, err = loadConfig(data)
	case 3:
		cfg, err = loadConfigV3(data)
	default:
		return nil, fmt.Errorf("unsupported config version: %d", versionInfo.Version)
	}

	// ... migrate and validate ...
}
```
|
||||
|
||||
### Step 6: Test Your Migration
|
||||
|
||||
Create a test in `config_migration_test.go`:
|
||||
|
||||
```go
func TestMigrateV2ToV3(t *testing.T) {
	// Create a version 2 config. Migrate() is defined on *configV2 (Step 4),
	// so the test data is wrapped in that type.
	v2Config := &configV2{
		Config: Config{
			Version: 2,
			// ... set up test data
		},
	}

	// Apply migration
	migrated, err := v2Config.Migrate()
	if err != nil {
		t.Fatalf("Migration failed: %v", err)
	}

	// Verify version is updated
	if migrated.Version != 3 {
		t.Errorf("Expected version 3, got %d", migrated.Version)
	}

	// Verify data is preserved/transformed correctly
	// ...
}
```
|
||||
|
||||
## Migration Best Practices
|
||||
|
||||
1. **Version-Specific Structs**: Define a separate struct for each version that has structural changes
|
||||
2. **Backward Compatibility**: Ensure old configs can still be loaded with their specific structs
|
||||
3. **No Data Loss**: Migrations should preserve all user settings
|
||||
4. **Idempotent**: Running the same migration multiple times should be safe
|
||||
5. **Auto-Save**: Migrated configs are automatically saved to update the user's file
|
||||
6. **Auto-Backup**: Before saving, the system creates a date-stamped backup of `config.json` and `.security.yml`
|
||||
7. **Test Thoroughly**: Test with real user config files
|
||||
8. **Update Defaults**: Keep `defaults.go` in sync with the latest schema
|
||||
|
||||
## V2→V3 Migration Guide
|
||||
|
||||
### What Changed?
|
||||
|
||||
Version 3 introduces improved type safety and error handling:
|
||||
|
||||
- **Type-safe channel decoding**: All channel type assertions now use comma-ok pattern (`val, ok := v.(*Settings)`) to prevent panics if Type and Settings are mismatched
|
||||
- **Enhanced error logging**: Weixin channel now logs errors on `GetDecoded()` failure for consistency with other channels
|
||||
- **Documentation fixes**: Corrected stray quotes in JSON configuration examples
|
||||
|
||||
### Auto-Migration Behavior
|
||||
|
||||
When you run PicoClaw with a V2 config file:
|
||||
|
||||
1. **Detection**: PicoClaw reads the `version` field and detects V2
|
||||
2. **Backup**: Before any changes, creates `config.json.YYYYMMDD.bak` (e.g., `config.json.20260413.bak`)
|
||||
3. **Migration**: Applies V2→V3 structural changes (primarily internal type safety improvements)
|
||||
4. **Save**: Writes the updated config with `"version": 3`
|
||||
5. **Continue**: Starts normally with the V3 config
|
||||
|
||||
**No user action required** — the migration happens automatically on first load.
|
||||
|
||||
### Backup Location
|
||||
|
||||
Backups are created in the same directory as your config file:
|
||||
|
||||
- **Default**: `~/.picoclaw/config.json.20260413.bak`
|
||||
- **Custom path**: If using `PICOCLAW_CONFIG`, backup is created next to that file
|
||||
- **Security file**: `.security.yml` is also backed up as `.security.yml.YYYYMMDD.bak`
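
The backup naming scheme above can be illustrated with a short Go sketch. It is not the real `makeBackup()` implementation: the helper names `dateStamp`, `backupPath`, and `securityPath` are hypothetical, and the sketch assumes `.security.yml` sits in the same directory as the config file.

```go
package main

import (
	"fmt"
	"path/filepath"
	"time"
)

// dateStamp formats t as YYYYMMDD, the date-only suffix described above.
func dateStamp(t time.Time) string {
	return t.Format("20060102")
}

// backupPath appends ".<stamp>.bak" to the original file path.
func backupPath(path, stamp string) string {
	return path + "." + stamp + ".bak"
}

// securityPath assumes .security.yml lives next to the config file.
func securityPath(configPath string) string {
	return filepath.Join(filepath.Dir(configPath), ".security.yml")
}

func main() {
	stamp := dateStamp(time.Now())
	cfg := filepath.Join("home", "user", ".picoclaw", "config.json")
	fmt.Println(backupPath(cfg, stamp))
	fmt.Println(backupPath(securityPath(cfg), stamp))
}
```

Because the suffix is date-only, a second migration on the same day would produce the same backup name; how the real code handles that case is not covered here.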
|
||||
|
||||
### Downgrade Risk
|
||||
|
||||
⚠️ **Important**: Once migrated to V3, the config **cannot** be safely loaded by older PicoClaw versions that only support V2.
|
||||
|
||||
**To downgrade:**
|
||||
|
||||
1. Stop PicoClaw
|
||||
2. Restore the backup:
|
||||
```bash
|
||||
cp ~/.picoclaw/config.json.20260413.bak ~/.picoclaw/config.json
|
||||
cp ~/.picoclaw/.security.yml.20260413.bak ~/.picoclaw/.security.yml # if it exists
|
||||
```
|
||||
3. Use a PicoClaw version that supports V2 configs
|
||||
|
||||
**Alternative**: Manually edit `config.json` and change `"version": 3` to `"version": 2`. This works because V3 changes are primarily code-level safety improvements, not structural schema changes.
|
||||
|
||||
## Example Migration
|
||||
|
||||
### Scenario: Adding a new field with default value
|
||||
|
||||
Old config (version 2):
|
||||
```json
{
  "version": 2,
  "model_list": [
    {
      "model_name": "gpt-5.4",
      "model": "openai/gpt-5.4"
    }
  ]
}
```
|
||||
|
||||
Migration to version 3:
|
||||
```go
|
||||
func (c *configV2) Migrate() (*Config, error) {
|
||||
migrated := &c.Config
|
||||
migrated.Version = 3
|
||||
|
||||
// Add new field with default value if not set
|
||||
// ...
|
||||
|
||||
return migrated, nil
|
||||
}
|
||||
```
|
||||
|
||||
New config (version 3):
|
||||
```json
|
||||
{
|
||||
"version": 3,
|
||||
"model_list": [
|
||||
{
|
||||
"model_name": "gpt-5.4",
|
||||
"model": "openai/gpt-5.4",
|
||||
"new_option": true
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Config Not Upgrading
|
||||
- Check that `CurrentVersion` is incremented
|
||||
- Verify migration logic handles the target version
|
||||
- Ensure `Migrate()` is called in `LoadConfig()`
|
||||
|
||||
### Migration Errors
|
||||
- Check error messages for specific migration failures
|
||||
- Review migration logic for edge cases
|
||||
- Ensure all required fields are properly initialized
|
||||
- Verify the loader function for the source version
|
||||
|
||||
### Data Loss After Migration
|
||||
- Ensure all fields are copied during migration
|
||||
- Check that the migration doesn't overwrite values with defaults unnecessarily
|
||||
- Review the conversion logic in the loader functions
|
||||
- Check the auto-backup files (e.g., `config.json.20260330.bak`) to recover original data
|
||||
|
||||
@@ -0,0 +1,125 @@
|
||||
# Scheduled Tasks and Cron Jobs
|
||||
|
||||
> Back to [README](../README.md)
|
||||
|
||||
PicoClaw stores scheduled jobs in the current workspace and can run them as reminders, full agent turns, or shell commands.
|
||||
|
||||
## Schedule Types
|
||||
|
||||
PicoClaw currently uses three schedule forms in the cron tool:
|
||||
|
||||
- `at_seconds`: one-time job, relative to now. After it runs, the job is removed from the store.
|
||||
- `every_seconds`: recurring interval, in seconds.
|
||||
- `cron_expr`: recurring cron expression such as `0 9 * * *`.
|
||||
|
||||
The CLI command `picoclaw cron add` currently supports recurring jobs only:
|
||||
|
||||
- `--every <seconds>`
|
||||
- `--cron '<expr>'`
|
||||
|
||||
There is no CLI flag for a one-time `at` job today.
|
||||
|
||||
Examples:
|
||||
|
||||
```bash
|
||||
picoclaw cron add --name "Daily summary" --message "Summarize today's logs" --cron "0 18 * * *"
|
||||
picoclaw cron add --name "Ping" --message "heartbeat" --every 300 --deliver
|
||||
```
|
||||
|
||||
## Execution Modes
|
||||
|
||||
Jobs are stored with a message payload and can execute in three stable user-facing modes:
|
||||
|
||||
### `deliver: false`
|
||||
|
||||
This is the default for the cron tool.
|
||||
|
||||
When the job fires, PicoClaw sends the saved message back through the agent loop as a new agent turn. Use this for scheduled work that may need reasoning, tools, or a generated reply.
|
||||
|
||||
### `deliver: true`
|
||||
|
||||
When the job fires, PicoClaw publishes the saved message directly to the target channel and recipient without agent processing.
|
||||
|
||||
The CLI `picoclaw cron add --deliver` flag uses this mode.
|
||||
|
||||
### `command`
|
||||
|
||||
When a cron-tool job includes `command`, PicoClaw runs that shell command through the `exec` tool and publishes the command output back to the channel.
|
||||
|
||||
For command jobs, `deliver` is forced to `false` when the job is created. The saved `message` becomes descriptive text only; the scheduled action is the shell command.
|
||||
|
||||
The current CLI `picoclaw cron add` command does not expose a `command` flag.
|
||||
|
||||
## Config and Security Gates
|
||||
|
||||
### `tools.cron`
|
||||
|
||||
`tools.cron.enabled` controls whether the agent-facing `cron` tool is registered. Default: `true`.
|
||||
|
||||
If you disable `tools.cron`, users can no longer create or manage jobs through the agent tool. The gateway still starts `CronService`, but it does not install the job execution callback. As a result, due jobs do not actually run; one-time jobs may be deleted and recurring jobs may be rescheduled without executing their payload. The CLI still uses the same job store.
|
||||
|
||||
`tools.cron.exec_timeout_minutes` sets the timeout used for scheduled command execution. Default: `5`. Set `0` for no timeout.
|
||||
|
||||
### `tools.exec`
|
||||
|
||||
Scheduled command jobs depend on `tools.exec.enabled`. Default: `true`.
|
||||
|
||||
If `tools.exec.enabled` is `false`:
|
||||
|
||||
- new command jobs are rejected by the cron tool
|
||||
- existing command jobs publish a `command execution is disabled` error when they fire
|
||||
|
||||
`tools.exec.allow_remote` is still enforced by the exec tool, but cron command scheduling already requires an internal channel when the job is created. In practice, reminder jobs can be scheduled from remote channels, while scheduled command jobs are limited to internal channels.
|
||||
|
||||
### `allow_command`
|
||||
|
||||
`tools.cron.allow_command` defaults to `true`.
|
||||
|
||||
This is not a hard disable switch. If you set `allow_command` to `false`, PicoClaw still allows a command job when the caller explicitly passes `command_confirm: true`.
|
||||
|
||||
Command jobs also require an internal channel. Non-command reminders do not have that restriction.
|
||||
|
||||
Example:
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"cron": {
|
||||
"enabled": true,
|
||||
"exec_timeout_minutes": 5,
|
||||
"allow_command": true
|
||||
},
|
||||
"exec": {
|
||||
"enabled": true
|
||||
}
|
||||
}
|
||||
}
|
||||
```
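
The gates described in this section can be combined into one check, sketched below in Go. This is an illustration only: the `cronGate` and `jobRequest` types, the `checkCommandJob` helper, the error wording (apart from `command execution is disabled`), and the order of the checks are assumptions, not PicoClaw's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// cronGate holds the two config switches discussed above (hypothetical type).
type cronGate struct {
	ExecEnabled  bool // tools.exec.enabled
	AllowCommand bool // tools.cron.allow_command
}

// jobRequest describes a job being created through the cron tool (hypothetical type).
type jobRequest struct {
	Command         string // empty for reminders and agent turns
	CommandConfirm  bool   // command_confirm passed by the caller
	InternalChannel bool   // whether the request came from an internal channel
}

// checkCommandJob returns nil if the job may be created.
func checkCommandJob(g cronGate, r jobRequest) error {
	if r.Command == "" {
		return nil // non-command jobs are not restricted by these gates
	}
	if !g.ExecEnabled {
		return errors.New("command execution is disabled")
	}
	if !r.InternalChannel {
		return errors.New("command jobs require an internal channel")
	}
	if !g.AllowCommand && !r.CommandConfirm {
		return errors.New("allow_command is false and command_confirm was not set")
	}
	return nil
}

func main() {
	g := cronGate{ExecEnabled: true, AllowCommand: false}
	fmt.Println(checkCommandJob(g, jobRequest{Command: "uptime", InternalChannel: true}))
	fmt.Println(checkCommandJob(g, jobRequest{Command: "uptime", InternalChannel: true, CommandConfirm: true}))
}
```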
|
||||
|
||||
## Persistence and Location
|
||||
|
||||
Cron jobs are stored in:
|
||||
|
||||
```text
|
||||
<workspace>/cron/jobs.json
|
||||
```
|
||||
|
||||
By default, the workspace is:
|
||||
|
||||
```text
|
||||
~/.picoclaw/workspace
|
||||
```
|
||||
|
||||
If `PICOCLAW_HOME` is set, the default workspace becomes:
|
||||
|
||||
```text
|
||||
$PICOCLAW_HOME/workspace
|
||||
```
|
||||
|
||||
Both the gateway and `picoclaw cron` CLI subcommands use the same `cron/jobs.json` file.
|
||||
|
||||
Notes:
|
||||
|
||||
- one-time `at_seconds` jobs are deleted after they run
|
||||
- recurring jobs stay in the store until removed
|
||||
- disabled jobs stay in the store and still appear in `picoclaw cron list`
|
||||
@@ -0,0 +1,95 @@
|
||||
# Dynamic Rate Limiting
|
||||
|
||||
PicoClaw helps avoid 429 errors from LLM provider APIs by enforcing configurable per-model request-rate limits **before** sending each request. Unlike the reactive cooldown/fallback system (which activates *after* a 429 is received), rate limiting is **proactive**: it keeps the outbound request rate within the provider's free-tier or plan limits.
|
||||
|
||||
## How it works
|
||||
|
||||
### Token-bucket algorithm
|
||||
|
||||
Each rate-limited model gets a token bucket:
|
||||
|
||||
- **Capacity** = `rpm` (burst size equals the per-minute limit)
|
||||
- **Refill rate** = `rpm / 60` tokens per second
|
||||
- Tokens are consumed one per LLM call; if the bucket is empty, the call blocks until a token refills or the request context is cancelled
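
The bucket arithmetic above can be sketched in Go as follows. This is a simplified illustration, not the code in `pkg/providers/ratelimiter.go`: the `tokenBucket` type and `tryTake` method are hypothetical, the clock is passed in as plain seconds to keep the example deterministic, and the sketch returns `false` instead of blocking when the bucket is empty.

```go
package main

import "fmt"

// tokenBucket is a minimal bucket: capacity = rpm, refill = rpm/60 tokens per second.
type tokenBucket struct {
	capacity float64 // maximum tokens (burst size)
	tokens   float64 // tokens currently available
	perSec   float64 // refill rate in tokens per second
	last     float64 // time of the last refill, in seconds
}

// newTokenBucket creates a bucket that starts full.
func newTokenBucket(rpm int, now float64) *tokenBucket {
	c := float64(rpm)
	return &tokenBucket{capacity: c, tokens: c, perSec: c / 60, last: now}
}

// tryTake refills the bucket for the elapsed time, then consumes one token
// if one is available. It never blocks.
func (b *tokenBucket) tryTake(now float64) bool {
	if elapsed := now - b.last; elapsed > 0 {
		b.tokens += elapsed * b.perSec
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	b := newTokenBucket(3, 0)
	for _, t := range []float64{0, 0, 0, 0, 20, 20} {
		fmt.Printf("t=%2.0fs allowed=%v\n", t, b.tryTake(t))
	}
}
```

A blocking `Wait` could be built on the same state by sleeping until the next token is due or until the request context is cancelled.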
|
||||
|
||||
### Call chain integration
|
||||
|
||||
```
AgentLoop.callLLM()
  └─ FallbackChain.Execute()             ← iterate candidates
       ├─ CooldownTracker.IsAvailable()  ← skip if post-429 cooldown active
       ├─ RateLimiterRegistry.Wait()     ← NEW: block until token available
       └─ provider.Chat()                ← actual LLM HTTP call
```
|
||||
|
||||
The rate limiter runs **after** the cooldown check and **before** the provider call, so:
|
||||
- Candidates already in cooldown are skipped entirely (no token consumed)
|
||||
- Candidates that are available get throttled to the configured RPM
|
||||
|
||||
The same check applies in `ExecuteImage`.
|
||||
|
||||
### Thread safety
|
||||
|
||||
`RateLimiterRegistry` is safe for concurrent use. The per-limiter token bucket uses a fine-grained mutex so concurrent goroutines each acquire their own token independently.
|
||||
|
||||
## Configuration
|
||||
|
||||
Set `rpm` on any model in `model_list`:
|
||||
|
||||
```yaml
model_list:
  - model_name: gpt-4o-free
    model: openai/gpt-4o
    api_base: https://api.openai.com/v1
    rpm: 3          # max 3 requests per minute
    api_keys:
      - sk-...

  - model_name: claude-haiku
    model: anthropic/claude-haiku-4-5
    rpm: 60         # 60 rpm (Anthropic free tier)
    api_keys:
      - sk-ant-...

  - model_name: local-llm
    model: openai/llama3
    api_base: http://localhost:11434/v1
    # no rpm → unrestricted
```
|
||||
|
||||
| Field | Type | Default | Description |
|
||||
|---|---|---|---|
|
||||
| `rpm` | `int` | `0` | Requests per minute. `0` means no limit. |
|
||||
|
||||
### Interaction with fallbacks
|
||||
|
||||
When a model has fallbacks configured, each candidate is rate-limited **independently**:
|
||||
|
||||
```yaml
model_list:
  - model_name: gpt4-with-fallback
    model: openai/gpt-4o
    rpm: 5
    fallbacks:
      - gpt-4o-mini   # must also be in model_list; its own rpm applies
```
|
||||
|
||||
If the current candidate's bucket is empty and there are more candidates available, PicoClaw skips the locally saturated candidate and tries the next fallback immediately. Only the last remaining candidate waits for a token to refill. If the context deadline is hit while waiting on that last candidate, the wait error propagates.
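
The skip-or-wait rule above can be sketched as a small selection function. This is an illustration under stated assumptions, not the logic in `pkg/providers/fallback.go`: the `choose` helper is hypothetical, it ignores cooldown state, and it takes a precomputed list saying whether each candidate's bucket currently holds a token.

```go
package main

import "fmt"

// choose returns the index of the candidate to call and whether the caller
// must block for a token first. hasToken[i] reports whether candidate i's
// bucket currently holds a token. It returns -1 for an empty list.
func choose(hasToken []bool) (int, bool) {
	for i, ok := range hasToken {
		if ok {
			return i, false // token available: call this candidate now
		}
		if i == len(hasToken)-1 {
			return i, true // last remaining candidate: wait for a refill
		}
		// Saturated and not last: skip to the next fallback.
	}
	return -1, false
}

func main() {
	idx, wait := choose([]bool{false, true})
	fmt.Printf("primary saturated: candidate=%d wait=%v\n", idx, wait)

	idx, wait = choose([]bool{false, false})
	fmt.Printf("all saturated:     candidate=%d wait=%v\n", idx, wait)
}
```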
|
||||
|
||||
For `model_list` aliases that resolve to the same underlying provider/model, rate limiting is keyed by the stable config identity (for example `model_name`) rather than the resolved runtime model string. This preserves distinct RPM settings for multi-key and alias-based configurations.
|
||||
|
||||
### Burst behaviour
|
||||
|
||||
The bucket starts **full** (burst = RPM). For `rpm: 3`, the first 3 requests fire instantly; subsequent requests are spaced ~20 s apart.
|
||||
|
||||
To reduce burstiness for strict APIs, set a lower `rpm` and rely on the steady-state refill.
|
||||
|
||||
## Files changed
|
||||
|
||||
| File | What |
|
||||
|---|---|
|
||||
| `pkg/providers/ratelimiter.go` | `RateLimiter` (token bucket) + `RateLimiterRegistry` |
|
||||
| `pkg/providers/ratelimiter_test.go` | Unit tests for limiter and registry |
|
||||
| `pkg/providers/fallback.go` | `FallbackCandidate.RPM` field; `FallbackChain.rl`; `Wait()` call in `Execute`/`ExecuteImage` |
|
||||
| `pkg/agent/model_resolution.go` | Resolves candidates from `model_list`, preserving stable config identity and propagating `RPM` into `FallbackCandidate` |
|
||||
| `pkg/agent/loop.go` | Build `RateLimiterRegistry`, register all agents' candidates, pass to `NewFallbackChain` |
|
||||
@@ -0,0 +1,415 @@
|
||||
# 🔧 Tools Configuration

> Back to [README](../project/README.fr.md)

PicoClaw's tool configuration lives in the `tools` field of `config.json`.

## Directory Structure
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"web": {
|
||||
...
|
||||
},
|
||||
"mcp": {
|
||||
...
|
||||
},
|
||||
"exec": {
|
||||
...
|
||||
},
|
||||
"cron": {
|
||||
...
|
||||
},
|
||||
"skills": {
|
||||
...
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Web Tools

Web tools are used for web search and web page fetching.

### Web Fetcher

General settings for fetching and processing web page content.

| Config              | Type   | Default       | Description                                                                              |
|---------------------|--------|---------------|------------------------------------------------------------------------------------------|
| `enabled`           | bool   | true          | Enable web page fetching.                                                                |
| `fetch_limit_bytes` | int    | 10485760      | Maximum size of the web page content to fetch, in bytes (default 10 MB).                 |
| `format`            | string | "plaintext"   | Output format of the fetched content. Options: `plaintext` or `markdown` (recommended).  |

### DuckDuckGo

| Config        | Type | Default | Description               |
|---------------|------|---------|---------------------------|
| `enabled`     | bool | true    | Enable DuckDuckGo search  |
| `max_results` | int  | 5       | Maximum number of results |

### Baidu Search

| Config        | Type   | Default                                                  | Description               |
|---------------|--------|----------------------------------------------------------|---------------------------|
| `enabled`     | bool   | false                                                    | Enable Baidu search       |
| `api_key`     | string | -                                                        | Qianfan API key           |
| `base_url`    | string | `https://qianfan.baidubce.com/v2/ai_search/web_search`   | Baidu Search API URL      |
| `max_results` | int    | 10                                                       | Maximum number of results |
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"web": {
|
||||
"baidu_search": {
|
||||
"enabled": true,
|
||||
"api_key": "YOUR_BAIDU_QIANFAN_API_KEY",
|
||||
"max_results": 10
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Perplexity

| Config        | Type     | Default | Description                                                            |
|---------------|----------|---------|------------------------------------------------------------------------|
| `enabled`     | bool     | false   | Enable Perplexity search                                               |
| `api_key`     | string   | -       | Perplexity API key                                                     |
| `api_keys`    | string[] | -       | Multiple Perplexity API keys for rotation (`api_key` takes precedence) |
| `max_results` | int      | 5       | Maximum number of results                                              |

### Brave

| Config        | Type     | Default | Description                                                              |
|---------------|----------|---------|--------------------------------------------------------------------------|
| `enabled`     | bool     | false   | Enable Brave search                                                      |
| `api_key`     | string   | -       | Brave Search API key                                                     |
| `api_keys`    | string[] | -       | Multiple Brave Search API keys for rotation (`api_key` takes precedence) |
| `max_results` | int      | 5       | Maximum number of results                                                |

### Tavily

| Config        | Type   | Default | Description                             |
|---------------|--------|---------|-----------------------------------------|
| `enabled`     | bool   | false   | Enable Tavily search                    |
| `api_key`     | string | -       | Tavily API key                          |
| `base_url`    | string | -       | Custom Tavily base URL                  |
| `max_results` | int    | 0       | Maximum number of results (0 = default) |

### SearXNG

| Config        | Type   | Default                  | Description               |
|---------------|--------|--------------------------|---------------------------|
| `enabled`     | bool   | false                    | Enable SearXNG search     |
| `base_url`    | string | `http://localhost:8888`  | SearXNG instance URL      |
| `max_results` | int    | 5                        | Maximum number of results |

### GLM Search

| Config          | Type   | Default                                              | Description               |
|-----------------|--------|------------------------------------------------------|---------------------------|
| `enabled`       | bool   | false                                                | Enable GLM Search         |
| `api_key`       | string | -                                                    | GLM API key               |
| `base_url`      | string | `https://open.bigmodel.cn/api/paas/v4/web_search`    | GLM Search API URL        |
| `search_engine` | string | `search_std`                                         | Search engine type        |
| `max_results`   | int    | 5                                                    | Maximum number of results |

## Exec Tool

The exec tool is used to run shell commands.

| Config                 | Type  | Default | Description                                   |
|------------------------|-------|---------|-----------------------------------------------|
| `enabled`              | bool  | true    | Enable the exec tool                          |
| `enable_deny_patterns` | bool  | true    | Enable default blocking of dangerous commands |
| `custom_deny_patterns` | array | []      | Custom deny patterns (regular expressions)    |

### Disabling the Exec Tool

To disable the `exec` tool completely, set `enabled` to `false`:

**Via the configuration file:**
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"exec": {
|
||||
"enabled": false
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Via environment variable:**
|
||||
```bash
|
||||
PICOCLAW_TOOLS_EXEC_ENABLED=false
|
||||
```
|
||||
|
||||
> **Note:** When the exec tool is disabled, the agent cannot run shell commands. This also affects the Cron tool's ability to run scheduled shell commands.

### Functionality

- **`enable_deny_patterns`**: Set to `false` to completely disable the default blocking patterns for dangerous commands
- **`custom_deny_patterns`**: Add custom deny regex patterns; matching commands are blocked

### Default Blocked Command Patterns

By default, PicoClaw blocks the following dangerous commands:

- Delete commands: `rm -rf`, `del /f/q`, `rmdir /s`
- Disk operations: `format`, `mkfs`, `diskpart`, `dd if=`, writes to `/dev/sd*`
- System operations: `shutdown`, `reboot`, `poweroff`
- Command substitution: `$()`, `${}`, backticks
- Pipe to shell: `| sh`, `| bash`
- Privilege escalation: `sudo`, `chmod`, `chown`
- Process control: `pkill`, `killall`, `kill -9`
- Remote operations: `curl | sh`, `wget | sh`, `ssh`
- Package management: `apt`, `yum`, `dnf`, `npm install -g`, `pip install --user`
- Containers: `docker run`, `docker exec`
- Git: `git push`, `git force`
- Other: `eval`, `source *.sh`

### Known Architectural Limitation

The exec guard validates only the top-level command sent to PicoClaw. It does **not** recursively inspect child processes spawned by build tools or scripts after that command starts.

Examples of workflows that can bypass the direct command guard once the initial command is allowed:

- `make run`
- `go run ./cmd/...`
- `cargo run`
- `npm run build`

This means the guard is useful for blocking obviously dangerous direct commands, but it is **not** a complete sandbox for unvetted build pipelines. If your threat model includes untrusted code in the workspace, use stronger isolation such as containers, VMs, or an approval flow around build and run commands.

### Configuration Example
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"exec": {
|
||||
"enable_deny_patterns": true,
|
||||
"custom_deny_patterns": [
|
||||
"\\brm\\s+-r\\b",
|
||||
"\\bkillall\\s+python"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
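
As an illustration of how regex deny patterns like the two above behave, here is a minimal Go sketch. It is not PicoClaw's exec guard: `compilePatterns` and `blocked` are hypothetical helpers, and the sketch only checks the custom patterns, not the default deny list.

```go
package main

import (
	"fmt"
	"regexp"
)

// compilePatterns compiles each deny pattern and reports the first invalid one.
func compilePatterns(patterns []string) ([]*regexp.Regexp, error) {
	out := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, fmt.Errorf("invalid deny pattern %q: %w", p, err)
		}
		out = append(out, re)
	}
	return out, nil
}

// blocked reports whether cmd matches any deny pattern.
func blocked(cmd string, deny []*regexp.Regexp) bool {
	for _, re := range deny {
		if re.MatchString(cmd) {
			return true
		}
	}
	return false
}

func main() {
	// Same patterns as the JSON example, without the JSON string escaping.
	deny, err := compilePatterns([]string{`\brm\s+-r\b`, `\bkillall\s+python`})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, cmd := range []string{"rm -r build", "killall python3", "ls -la"} {
		fmt.Printf("%-16s blocked=%v\n", cmd, blocked(cmd, deny))
	}
}
```

Note that in `config.json` each backslash must be doubled (`\\b`, `\\s`), as in the example above, because JSON strings use the backslash as an escape character.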
|
||||
|
||||
## Cron Tool

The cron tool is used to schedule periodic tasks.

| Config                 | Type | Default | Description                                    |
|------------------------|------|---------|------------------------------------------------|
| `exec_timeout_minutes` | int  | 5       | Execution timeout in minutes; 0 means no limit |

<a id="mcp-tool"></a>
## MCP Tool

The MCP tool enables integration with external Model Context Protocol servers.

### Tool Discovery (Lazy Loading)

When connecting to multiple MCP servers, exposing hundreds of tools at once can exhaust the LLM's context window and increase API costs. The **Discovery** feature solves this by keeping MCP tools *hidden* by default.

Instead of loading every tool, the LLM receives a lightweight search tool (using BM25 keyword matching or regular expressions). When the LLM needs a specific capability, it searches the hidden library. Matching tools are then temporarily "unlocked" and injected into the context for a configured number of turns (`ttl`).

### Global Configuration

| Config      | Type   | Default | Description                                      |
|-------------|--------|---------|--------------------------------------------------|
| `enabled`   | bool   | false   | Enable MCP integration globally                  |
| `discovery` | object | `{}`    | Tool discovery configuration (see below)         |
| `servers`   | object | `{}`    | Mapping from server name to server configuration |

### Discovery Configuration (`discovery`)

| Config               | Type | Default | Description                                                                                                                              |
|----------------------|------|---------|------------------------------------------------------------------------------------------------------------------------------------------|
| `enabled`            | bool | false   | If true, MCP tools are hidden and loaded on demand through search. If false, all tools are loaded                                        |
| `ttl`                | int  | 5       | Number of conversation turns during which a discovered tool stays unlocked                                                               |
| `max_search_results` | int  | 5       | Maximum number of tools returned per search query                                                                                        |
| `use_bm25`           | bool | true    | Enable the natural-language/keyword search tool (`tool_search_tool_bm25`). **Warning**: uses more resources than regex search            |
| `use_regex`          | bool | false   | Enable the regex pattern search tool (`tool_search_tool_regex`)                                                                          |

> **Note:** If `discovery.enabled` is `true`, you **must** enable at least one search engine (`use_bm25` or `use_regex`),
> otherwise the application will not start.
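
The startup rule in the note above amounts to a small validation step, sketched here in Go. The `discoveryConfig` type and `validateDiscovery` function are hypothetical names used for illustration; only the rule itself comes from this document.

```go
package main

import (
	"errors"
	"fmt"
)

// discoveryConfig mirrors the discovery fields relevant to validation (hypothetical type).
type discoveryConfig struct {
	Enabled  bool // discovery.enabled
	UseBM25  bool // discovery.use_bm25
	UseRegex bool // discovery.use_regex
}

// validateDiscovery rejects a config that enables discovery without any search engine.
func validateDiscovery(d discoveryConfig) error {
	if d.Enabled && !d.UseBM25 && !d.UseRegex {
		return errors.New("discovery is enabled but neither use_bm25 nor use_regex is set")
	}
	return nil
}

func main() {
	fmt.Println(validateDiscovery(discoveryConfig{Enabled: true}))
	fmt.Println(validateDiscovery(discoveryConfig{Enabled: true, UseBM25: true}))
}
```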
|
||||
|
||||
### Per-Server Configuration

| Config     | Type   | Required | Description                                  |
|------------|--------|----------|----------------------------------------------|
| `enabled`  | bool   | yes      | Enable this MCP server                       |
| `type`     | string | no       | Transport type: `stdio`, `sse`, `http`       |
| `command`  | string | stdio    | Executable command for the stdio transport   |
| `args`     | array  | no       | Command arguments for the stdio transport    |
| `env`      | object | no       | Environment variables for the stdio process  |
| `env_file` | string | no       | Path to the environment file for the stdio process |
| `url`      | string | sse/http | Endpoint URL for the `sse`/`http` transport  |
| `headers`  | object | no       | HTTP headers for the `sse`/`http` transport  |

### Transport Behavior

- If `type` is omitted, the transport is detected automatically:
  - `url` is set → `sse`
  - `command` is set → `stdio`
- `http` and `sse` both use `url` plus optional `headers`.
- `env` and `env_file` are applied only to `stdio` servers.
|
||||
|
||||
### Exemples de configuration

#### 1) Serveur MCP Stdio

```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "servers": {
        "filesystem": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-filesystem",
            "/tmp"
          ]
        }
      }
    }
  }
}
```

#### 2) Serveur MCP distant SSE/HTTP

```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "servers": {
        "remote-mcp": {
          "enabled": true,
          "type": "sse",
          "url": "https://example.com/mcp",
          "headers": {
            "Authorization": "Bearer YOUR_TOKEN"
          }
        }
      }
    }
  }
}
```

#### 3) Configuration MCP massive avec découverte d'outils activée

*Dans cet exemple, le LLM ne verra que `tool_search_tool_bm25`. Il recherchera et déverrouillera dynamiquement les outils GitHub ou Postgres uniquement lorsque l'utilisateur le demande.*

```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "discovery": {
        "enabled": true,
        "ttl": 5,
        "max_search_results": 5,
        "use_bm25": true,
        "use_regex": false
      },
      "servers": {
        "github": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-github"
          ],
          "env": {
            "GITHUB_PERSONAL_ACCESS_TOKEN": "YOUR_GITHUB_TOKEN"
          }
        },
        "postgres": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-postgres",
            "postgresql://user:password@localhost/dbname"
          ]
        },
        "slack": {
          "enabled": true,
          "type": "stdio",
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-slack"
          ],
          "env": {
            "SLACK_BOT_TOKEN": "YOUR_SLACK_BOT_TOKEN",
            "SLACK_TEAM_ID": "YOUR_SLACK_TEAM_ID"
          }
        }
      }
    }
  }
}
```

<a id="skills-tool"></a>
## Outil Skills

L'outil skills configure la découverte et l'installation de compétences via des registres comme ClawHub.

### Registres

| Config | Type | Par défaut | Description |
|------------------------------------|--------|----------------------|-------------|
| `registries.clawhub.enabled` | bool | true | Activer le registre ClawHub |
| `registries.clawhub.base_url` | string | `https://clawhub.ai` | URL de base ClawHub |
| `registries.clawhub.auth_token` | string | `""` | Jeton Bearer optionnel pour des limites de débit plus élevées |
| `registries.clawhub.search_path` | string | `/api/v1/search` | Chemin de l'API de recherche |
| `registries.clawhub.skills_path` | string | `/api/v1/skills` | Chemin de l'API Skills |
| `registries.clawhub.download_path` | string | `/api/v1/download` | Chemin de l'API de téléchargement |

### Exemple de configuration

```json
{
  "tools": {
    "skills": {
      "registries": {
        "clawhub": {
          "enabled": true,
          "base_url": "https://clawhub.ai",
          "auth_token": "",
          "search_path": "/api/v1/search",
          "skills_path": "/api/v1/skills",
          "download_path": "/api/v1/download"
        }
      }
    }
  }
}
```

## Variables d'environnement

Toutes les options de configuration peuvent être remplacées via des variables d'environnement au format `PICOCLAW_TOOLS_<SECTION>_<KEY>` :

Par exemple :

- `PICOCLAW_TOOLS_WEB_BRAVE_ENABLED=true`
- `PICOCLAW_TOOLS_EXEC_ENABLED=false`
- `PICOCLAW_TOOLS_EXEC_ENABLE_DENY_PATTERNS=false`
- `PICOCLAW_TOOLS_CRON_EXEC_TIMEOUT_MINUTES=10`
- `PICOCLAW_TOOLS_MCP_ENABLED=true`

Note : La configuration de type map imbriquée (par exemple `tools.mcp.servers.<name>.*`) est configurée dans `config.json` plutôt que via des variables d'environnement.

@@ -0,0 +1,415 @@
# 🔧 ツール設定

> [README](../project/README.ja.md) に戻る

PicoClaw のツール設定は `config.json` の `tools` フィールドにあります。

## ディレクトリ構造

```json
{
  "tools": {
    "web": {
      ...
    },
    "mcp": {
      ...
    },
    "exec": {
      ...
    },
    "cron": {
      ...
    },
    "skills": {
      ...
    }
  }
}
```

## Web ツール

Web ツールはウェブ検索とフェッチに使用されます。

### Web Fetcher
ウェブページコンテンツの取得と処理に関する一般設定。

| 設定項目 | 型 | デフォルト | 説明 |
|---------------------|--------|-------------|------|
| `enabled` | bool | true | ウェブページ取得機能を有効にする。 |
| `fetch_limit_bytes` | int | 10485760 | 取得するウェブページペイロードの最大サイズ（バイト単位、デフォルトは10MB）。 |
| `format` | string | "plaintext" | 取得コンテンツの出力形式。オプション：`plaintext` または `markdown`（推奨）。 |

### DuckDuckGo

| 設定項目 | 型 | デフォルト | 説明 |
|---------------|------|------------|------|
| `enabled` | bool | true | DuckDuckGo 検索を有効にする |
| `max_results` | int | 5 | 最大結果数 |

### Baidu Search

| 設定項目 | 型 | デフォルト | 説明 |
|---------------|--------|------------|------|
| `enabled` | bool | false | Baidu 検索を有効にする |
| `api_key` | string | - | Qianfan API キー |
| `base_url` | string | `https://qianfan.baidubce.com/v2/ai_search/web_search` | Baidu Search API URL |
| `max_results` | int | 10 | 最大結果数 |

```json
{
  "tools": {
    "web": {
      "baidu_search": {
        "enabled": true,
        "api_key": "YOUR_BAIDU_QIANFAN_API_KEY",
        "max_results": 10
      }
    }
  }
}
```

### Perplexity

| 設定項目 | 型 | デフォルト | 説明 |
|---------------|----------|------------|------|
| `enabled` | bool | false | Perplexity 検索を有効にする |
| `api_key` | string | - | Perplexity API キー |
| `api_keys` | string[] | - | 複数の Perplexity API キー（ローテーション用、`api_key` より優先） |
| `max_results` | int | 5 | 最大結果数 |

### Brave

| 設定項目 | 型 | デフォルト | 説明 |
|---------------|----------|------------|------|
| `enabled` | bool | false | Brave 検索を有効にする |
| `api_key` | string | - | Brave Search API キー |
| `api_keys` | string[] | - | 複数の Brave Search API キー（ローテーション用、`api_key` より優先） |
| `max_results` | int | 5 | 最大結果数 |

### Tavily

| 設定項目 | 型 | デフォルト | 説明 |
|---------------|--------|------------|------|
| `enabled` | bool | false | Tavily 検索を有効にする |
| `api_key` | string | - | Tavily API キー |
| `base_url` | string | - | カスタム Tavily API ベース URL |
| `max_results` | int | 0 | 最大結果数（0 = デフォルト） |

### SearXNG

| 設定項目 | 型 | デフォルト | 説明 |
|---------------|--------|-------------------------|------|
| `enabled` | bool | false | SearXNG 検索を有効にする |
| `base_url` | string | `http://localhost:8888` | SearXNG インスタンス URL |
| `max_results` | int | 5 | 最大結果数 |

### GLM Search

| 設定項目 | 型 | デフォルト | 説明 |
|-----------------|--------|------------|------|
| `enabled` | bool | false | GLM Search を有効にする |
| `api_key` | string | - | GLM API キー |
| `base_url` | string | `https://open.bigmodel.cn/api/paas/v4/web_search` | GLM Search API URL |
| `search_engine` | string | `search_std` | 検索エンジンタイプ |
| `max_results` | int | 5 | 最大結果数 |

## Exec ツール

Exec ツールはシェルコマンドの実行に使用されます。

| 設定項目 | 型 | デフォルト | 説明 |
|------------------------|-------|------------|------|
| `enabled` | bool | true | Exec ツールを有効にする |
| `enable_deny_patterns` | bool | true | デフォルトの危険コマンドブロックを有効にする |
| `custom_deny_patterns` | array | [] | カスタム拒否パターン（正規表現） |

### Exec ツールの無効化

`exec` ツールを完全に無効にするには、`enabled` を `false` に設定します：

**設定ファイル経由：**
```json
{
  "tools": {
    "exec": {
      "enabled": false
    }
  }
}
```

**環境変数経由：**
```bash
PICOCLAW_TOOLS_EXEC_ENABLED=false
```

> **注意：** 無効にすると、エージェントはシェルコマンドを実行できなくなります。これは Cron ツールがスケジュールされたシェルコマンドを実行する能力にも影響します。

### 機能

- **`enable_deny_patterns`**：`false` に設定すると、デフォルトの危険コマンドブロックパターンを完全に無効にします
- **`custom_deny_patterns`**：カスタム拒否正規表現パターンを追加します。一致するコマンドはブロックされます

### デフォルトでブロックされるコマンドパターン

デフォルトで、PicoClaw は以下の危険なコマンドをブロックします：

- 削除コマンド：`rm -rf`、`del /f/q`、`rmdir /s`
- ディスク操作：`format`、`mkfs`、`diskpart`、`dd if=`、`/dev/sd*` への書き込み
- システム操作：`shutdown`、`reboot`、`poweroff`
- コマンド置換：`$()`、`${}`、バッククォート
- シェルへのパイプ：`| sh`、`| bash`
- 権限昇格：`sudo`、`chmod`、`chown`
- プロセス制御：`pkill`、`killall`、`kill -9`
- リモート操作：`curl | sh`、`wget | sh`、`ssh`
- パッケージ管理：`apt`、`yum`、`dnf`、`npm install -g`、`pip install --user`
- コンテナ：`docker run`、`docker exec`
- Git：`git push`、`git force`
- その他：`eval`、`source *.sh`

### 既知のアーキテクチャ上の制限

exec ガードは PicoClaw に送信されたトップレベルのコマンドのみを検証します。そのコマンドの実行開始後にビルドツールやスクリプトが生成する子プロセスを再帰的に検査することは**ありません**。

初期コマンドが許可された後、直接コマンドガードをバイパスできるワークフローの例：

- `make run`
- `go run ./cmd/...`
- `cargo run`
- `npm run build`

これは、明らかに危険な直接コマンドのブロックには有用ですが、未レビューのビルドパイプラインに対する完全なサンドボックスでは**ありません**。脅威モデルにワークスペース内の信頼できないコードが含まれる場合は、コンテナ、VM、またはビルド・実行コマンドに対する承認フローなど、より強力な分離を使用してください。

### 設定例

```json
{
  "tools": {
    "exec": {
      "enable_deny_patterns": true,
      "custom_deny_patterns": [
        "\\brm\\s+-r\\b",
        "\\bkillall\\s+python"
      ]
    }
  }
}
```

## Cron ツール

Cron ツールは定期タスクのスケジューリングに使用されます。

| 設定項目 | 型 | デフォルト | 説明 |
|------------------------|-----|------------|------|
| `exec_timeout_minutes` | int | 5 | 実行タイムアウト（分）、0 は無制限 |

<a id="mcp-tool"></a>
## MCP ツール

MCP ツールは外部の Model Context Protocol サーバーとの統合を可能にします。

### ツールディスカバリ（遅延読み込み）

複数の MCP サーバーに接続する場合、数百のツールを同時に公開すると LLM のコンテキストウィンドウを使い果たし、API コストが増加する可能性があります。**Discovery** 機能は、MCP ツールをデフォルトで*非表示*にすることでこの問題を解決します。

すべてのツールを読み込む代わりに、LLM には軽量な検索ツール（BM25 キーワードマッチングまたは正規表現を使用）が提供されます。LLM が特定の機能を必要とする場合、非表示のライブラリを検索します。一致するツールは一時的に「アンロック」され、設定されたターン数（`ttl`）の間コンテキストに注入されます。

### グローバル設定

| 設定項目 | 型 | デフォルト | 説明 |
|-------------|--------|------------|------|
| `enabled` | bool | false | MCP 統合をグローバルに有効にする |
| `discovery` | object | `{}` | ツールディスカバリ設定（下記参照） |
| `servers` | object | `{}` | サーバー名からサーバー設定へのマップ |

### Discovery 設定（`discovery`）

| 設定項目 | 型 | デフォルト | 説明 |
|----------------------|------|------------|------|
| `enabled` | bool | false | true の場合、MCP ツールは非表示になり、検索を通じてオンデマンドで読み込まれます。false の場合、すべてのツールが読み込まれます |
| `ttl` | int | 5 | 発見されたツールがアンロック状態を維持する会話ターン数 |
| `max_search_results` | int | 5 | 検索クエリごとに返されるツールの最大数 |
| `use_bm25` | bool | true | 自然言語/キーワード検索ツール（`tool_search_tool_bm25`）を有効にする。**警告**：正規表現検索よりリソースを消費します |
| `use_regex` | bool | false | 正規表現パターン検索ツール（`tool_search_tool_regex`）を有効にする |

> **注意：** `discovery.enabled` が `true` の場合、少なくとも1つの検索エンジン（`use_bm25` または `use_regex`）を有効にする**必要があります**。
> そうしないとアプリケーションの起動に失敗します。

### サーバーごとの設定

| 設定項目 | 型 | 必須 | 説明 |
|------------|--------|----------|------|
| `enabled` | bool | はい | この MCP サーバーを有効にする |
| `type` | string | いいえ | トランスポートタイプ：`stdio`、`sse`、`http` |
| `command` | string | stdio | stdio トランスポートの実行コマンド |
| `args` | array | いいえ | stdio トランスポートのコマンド引数 |
| `env` | object | いいえ | stdio プロセスの環境変数 |
| `env_file` | string | いいえ | stdio プロセスの環境ファイルパス |
| `url` | string | sse/http | `sse`/`http` トランスポートのエンドポイント URL |
| `headers` | object | いいえ | `sse`/`http` トランスポートの HTTP ヘッダー |

### トランスポートの動作

- `type` を省略した場合、トランスポートは自動検出されます：
  - `url` が設定されている → `sse`
  - `command` が設定されている → `stdio`
- `http` と `sse` はどちらも `url` + オプションの `headers` を使用します。
- `env` と `env_file` は `stdio` サーバーにのみ適用されます。

### 設定例

#### 1) Stdio MCP サーバー

```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "servers": {
        "filesystem": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-filesystem",
            "/tmp"
          ]
        }
      }
    }
  }
}
```

#### 2) リモート SSE/HTTP MCP サーバー

```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "servers": {
        "remote-mcp": {
          "enabled": true,
          "type": "sse",
          "url": "https://example.com/mcp",
          "headers": {
            "Authorization": "Bearer YOUR_TOKEN"
          }
        }
      }
    }
  }
}
```

#### 3) ツールディスカバリを有効にした大規模 MCP セットアップ

*この例では、LLM は `tool_search_tool_bm25` のみを認識します。ユーザーからリクエストがあった場合にのみ、GitHub や Postgres のツールを動的に検索してアンロックします。*

```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "discovery": {
        "enabled": true,
        "ttl": 5,
        "max_search_results": 5,
        "use_bm25": true,
        "use_regex": false
      },
      "servers": {
        "github": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-github"
          ],
          "env": {
            "GITHUB_PERSONAL_ACCESS_TOKEN": "YOUR_GITHUB_TOKEN"
          }
        },
        "postgres": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-postgres",
            "postgresql://user:password@localhost/dbname"
          ]
        },
        "slack": {
          "enabled": true,
          "type": "stdio",
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-slack"
          ],
          "env": {
            "SLACK_BOT_TOKEN": "YOUR_SLACK_BOT_TOKEN",
            "SLACK_TEAM_ID": "YOUR_SLACK_TEAM_ID"
          }
        }
      }
    }
  }
}
```

<a id="skills-tool"></a>
## Skills ツール

Skills ツールは ClawHub などのレジストリを通じたスキルの発見とインストールを設定します。

### レジストリ

| 設定項目 | 型 | デフォルト | 説明 |
|------------------------------------|--------|----------------------|------|
| `registries.clawhub.enabled` | bool | true | ClawHub レジストリを有効にする |
| `registries.clawhub.base_url` | string | `https://clawhub.ai` | ClawHub ベース URL |
| `registries.clawhub.auth_token` | string | `""` | より高いレート制限のためのオプションの Bearer トークン |
| `registries.clawhub.search_path` | string | `/api/v1/search` | 検索 API パス |
| `registries.clawhub.skills_path` | string | `/api/v1/skills` | Skills API パス |
| `registries.clawhub.download_path` | string | `/api/v1/download` | ダウンロード API パス |

### 設定例

```json
{
  "tools": {
    "skills": {
      "registries": {
        "clawhub": {
          "enabled": true,
          "base_url": "https://clawhub.ai",
          "auth_token": "",
          "search_path": "/api/v1/search",
          "skills_path": "/api/v1/skills",
          "download_path": "/api/v1/download"
        }
      }
    }
  }
}
```

## 環境変数

すべての設定オプションは `PICOCLAW_TOOLS_<SECTION>_<KEY>` 形式の環境変数で上書きできます：

例：

- `PICOCLAW_TOOLS_WEB_BRAVE_ENABLED=true`
- `PICOCLAW_TOOLS_EXEC_ENABLED=false`
- `PICOCLAW_TOOLS_EXEC_ENABLE_DENY_PATTERNS=false`
- `PICOCLAW_TOOLS_CRON_EXEC_TIMEOUT_MINUTES=10`
- `PICOCLAW_TOOLS_MCP_ENABLED=true`

注意：ネストされたマップ形式の設定（例：`tools.mcp.servers.<name>.*`）は環境変数ではなく `config.json` で設定します。

@@ -0,0 +1,557 @@
# Tools Configuration

PicoClaw's tools configuration is located in the `tools` field of `config.json`.

## Directory Structure

```json
{
  "tools": {
    "web": {
      ...
    },
    "mcp": {
      ...
    },
    "exec": {
      ...
    },
    "cron": {
      ...
    },
    "skills": {
      ...
    }
  }
}
```

## Sensitive Data Filtering

Before tool results are sent to the LLM, PicoClaw can filter sensitive values (API keys, tokens, secrets) from the output. This prevents the LLM from seeing its own credentials.

See [Sensitive Data Filtering](../security/sensitive_data_filtering.md) for full documentation.

| Config | Type | Default | Description |
|--------|------|---------|-------------|
| `filter_sensitive_data` | bool | `true` | Enable/disable filtering |
| `filter_min_length` | int | `8` | Minimum content length to trigger filtering |

## Web Tools

Web tools are used for web search and fetching.

### Web Fetcher
General settings for fetching and processing webpage content.

| Config | Type | Default | Description |
|---------------------|--------|-------------|-------------|
| `enabled` | bool | true | Enable the webpage fetching capability. |
| `fetch_limit_bytes` | int | 10485760 | Maximum size of the webpage payload to fetch, in bytes (default is 10MB). |
| `format` | string | "plaintext" | Output format of the fetched content. Options: `plaintext` or `markdown` (recommended). |

### Brave

| Config | Type | Default | Description |
|---------------|----------|---------|-------------|
| `enabled` | bool | false | Enable Brave search |
| `api_key` | string | - | Brave Search API key |
| `api_keys` | string[] | - | Multiple API keys for rotation (takes priority over `api_key`) |
| `max_results` | int | 5 | Maximum number of results |

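The `api_keys` option implies that requests are spread across several keys, but the rotation strategy is not described here. The following is a minimal Python sketch, assuming plain round-robin rotation; the `KeyRotator` class is a hypothetical helper for illustration and is not part of PicoClaw.

```python
from itertools import cycle


class KeyRotator:
    """Hypothetical helper: round-robin over the configured API keys."""

    def __init__(self, api_key="", api_keys=None):
        # `api_keys` takes priority over `api_key`, as the table states.
        keys = [k for k in (api_keys or []) if k] or ([api_key] if api_key else [])
        if not keys:
            raise ValueError("no API key configured")
        self._keys = cycle(keys)

    def next_key(self):
        # Assumption: simple round-robin; the real strategy may differ.
        return next(self._keys)


rotator = KeyRotator(api_key="single", api_keys=["key-a", "key-b"])
picked = [rotator.next_key() for _ in range(3)]
```

With both fields set, only the entries in `api_keys` are used; `api_key` serves as the fallback when the list is empty.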
### DuckDuckGo

| Config | Type | Default | Description |
|---------------|------|---------|-------------|
| `enabled` | bool | true | Enable DuckDuckGo search |
| `max_results` | int | 5 | Maximum number of results |

### Baidu Search

Baidu Search uses the [Qianfan AI Search API](https://cloud.baidu.com/doc/qianfan-api/s/Wmbq4z7e5), which is AI-powered and optimized for Chinese-language queries.

| Config | Type | Default | Description |
|---------------|--------|---------|-------------|
| `enabled` | bool | false | Enable Baidu Search |
| `api_key` | string | - | Qianfan API key |
| `base_url` | string | `https://qianfan.baidubce.com/v2/ai_search/web_search` | Baidu Search API URL |
| `max_results` | int | 5 | Maximum number of results |

```json
{
  "tools": {
    "web": {
      "baidu_search": {
        "enabled": true,
        "api_key": "YOUR_BAIDU_QIANFAN_API_KEY",
        "max_results": 10
      }
    }
  }
}
```

### Perplexity

| Config | Type | Default | Description |
|---------------|----------|---------|-------------|
| `enabled` | bool | false | Enable Perplexity search |
| `api_key` | string | - | Perplexity API key |
| `api_keys` | string[] | - | Multiple API keys for rotation (takes priority over `api_key`) |
| `max_results` | int | 5 | Maximum number of results |

### Tavily

| Config | Type | Default | Description |
|---------------|--------|---------|-------------|
| `enabled` | bool | false | Enable Tavily search |
| `api_key` | string | - | Tavily API key |
| `base_url` | string | - | Custom Tavily API base URL |
| `max_results` | int | 5 | Maximum number of results |

### SearXNG

| Config | Type | Default | Description |
|---------------|--------|-------------------------|-------------|
| `enabled` | bool | false | Enable SearXNG search |
| `base_url` | string | `http://localhost:8888` | SearXNG instance URL |
| `max_results` | int | 5 | Maximum number of results |

### GLM Search

| Config | Type | Default | Description |
|-----------------|--------|---------|-------------|
| `enabled` | bool | false | Enable GLM Search |
| `api_key` | string | - | GLM API key |
| `base_url` | string | `https://open.bigmodel.cn/api/paas/v4/web_search` | GLM Search API URL |
| `search_engine` | string | `search_std` | Search engine type |
| `max_results` | int | 5 | Maximum number of results |

### Additional Web Settings

| Config | Type | Default | Description |
|--------------------------|----------|---------|-------------|
| `prefer_native` | bool | true | Prefer provider's native search over configured search engines |
| `private_host_whitelist` | string[] | `[]` | Private/internal hosts allowed for web fetching |

### `web_search` Tool Parameters

At runtime, the `web_search` tool accepts the following parameters:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `query` | string | yes | Search query string |
| `count` | integer | no | Number of results to return. Default: `10`, max: `10` |
| `range` | string | no | Optional time filter: `d` (day), `w` (week), `m` (month), `y` (year) |

If `range` is omitted, PicoClaw performs an unrestricted search.

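The parameter rules above can be sketched as a small validator. This is an illustrative Python sketch, not PicoClaw's code: `normalize_web_search_args` is a hypothetical name, and clamping an out-of-range `count` (rather than rejecting it) is an assumption.

```python
VALID_RANGES = {"d", "w", "m", "y"}
MAX_COUNT = 10  # documented default and maximum for `count`


def normalize_web_search_args(args):
    """Hypothetical validator for `web_search` arguments."""
    query = args.get("query", "")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query is required")
    # Assumption: out-of-range counts are clamped rather than rejected.
    count = max(1, min(int(args.get("count", MAX_COUNT)), MAX_COUNT))
    time_range = args.get("range")  # None means an unrestricted search
    if time_range is not None and time_range not in VALID_RANGES:
        raise ValueError("range must be one of d, w, m, y")
    return {"query": query, "count": count, "range": time_range}
```

Leaving `range` out yields `None` in the normalized result, which corresponds to the unrestricted search described above.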
### Example `web_search` Call

```json
{
  "query": "ai agent news",
  "count": 10,
  "range": "w"
}
```

## Exec Tool

The exec tool is used to execute shell commands.

| Config | Type | Default | Description |
|------------------------|-------|---------|-------------|
| `enabled` | bool | true | Enable the exec tool |
| `enable_deny_patterns` | bool | true | Enable default dangerous command blocking |
| `custom_deny_patterns` | array | [] | Custom deny patterns (regular expressions) |

### Disabling the Exec Tool

To completely disable the `exec` tool, set `enabled` to `false`:

**Via config file:**
```json
{
  "tools": {
    "exec": {
      "enabled": false
    }
  }
}
```

**Via environment variable:**
```bash
PICOCLAW_TOOLS_EXEC_ENABLED=false
```

> **Note:** When disabled, the agent will not be able to execute shell commands. This also affects the Cron tool's ability to run scheduled shell commands.

### Functionality

- **`enable_deny_patterns`**: Set to `false` to completely disable the default dangerous command blocking patterns
- **`custom_deny_patterns`**: Add custom deny regex patterns; commands matching these will be blocked

### Default Blocked Command Patterns

By default, PicoClaw blocks the following dangerous commands:

- Delete commands: `rm -rf`, `del /f/q`, `rmdir /s`
- Disk operations: `format`, `mkfs`, `diskpart`, `dd if=`, writing to `/dev/sd*`
- System operations: `shutdown`, `reboot`, `poweroff`
- Command substitution: `$()`, `${}`, backticks
- Pipe to shell: `| sh`, `| bash`
- Privilege escalation: `sudo`, `chmod`, `chown`
- Process control: `pkill`, `killall`, `kill -9`
- Remote operations: `curl | sh`, `wget | sh`, `ssh`
- Package management: `apt`, `yum`, `dnf`, `npm install -g`, `pip install --user`
- Containers: `docker run`, `docker exec`
- Git: `git push`, `git force`
- Other: `eval`, `source *.sh`

### Known Architectural Limitation

The exec guard only validates the top-level command sent to PicoClaw. It does **not** recursively inspect child
processes spawned by build tools or scripts after that command starts running.

Examples of workflows that can bypass the direct command guard once the initial command is allowed:

- `make run`
- `go run ./cmd/...`
- `cargo run`
- `npm run build`

This means the guard is useful for blocking obviously dangerous direct commands, but it is **not** a full sandbox for
unreviewed build pipelines. If your threat model includes untrusted code in the workspace, use stronger isolation such
as containers, VMs, or an approval flow around build-and-run commands.

### Configuration Example

```json
{
  "tools": {
    "exec": {
      "enable_deny_patterns": true,
      "custom_deny_patterns": [
        "\\brm\\s+-r\\b",
        "\\bkillall\\s+python"
      ]
    }
  }
}
```

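To see what the two custom patterns in this example actually match, here is a small Python sketch. It uses Python's `re` module purely for illustration (PicoClaw's own matching may differ in detail), and `is_denied` is a hypothetical helper name. Only the custom patterns are modeled; the default deny list is not reproduced.

```python
import re

# The two patterns from the configuration example, after JSON unescaping.
CUSTOM_DENY_PATTERNS = [r"\brm\s+-r\b", r"\bkillall\s+python"]


def is_denied(command, patterns=CUSTOM_DENY_PATTERNS):
    """Return True if any deny pattern matches the top-level command string."""
    return any(re.search(p, command) for p in patterns)


blocked = is_denied("rm -r build")        # matches the first pattern
not_matched = is_denied("rm -rf build")   # `-r\b` needs a word boundary after `r`
passes_guard = not is_denied("make run")  # only the top-level string is checked
```

Note that `rm -rf` is not caught by the first custom pattern because of the trailing `\b`; in this document it is listed among the default blocked patterns instead. The `make run` case illustrates the limitation described earlier: whatever the Makefile executes is never seen by a string-level check.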
## Cron Tool

The cron tool is used for scheduling periodic tasks.

| Config | Type | Default | Description |
|------------------------|------|---------|-------------|
| `enabled` | bool | true | Register the agent-facing cron tool |
| `allow_command` | bool | true | Allow command jobs without extra confirmation |
| `exec_timeout_minutes` | int | 5 | Execution timeout in minutes, 0 means no limit |

For schedule types, execution modes (`deliver`, agent turn, and command jobs), persistence, and the current command-security gates, see [Scheduled Tasks and Cron Jobs](cron.md).

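The "0 means no limit" convention for `exec_timeout_minutes` can be expressed as a one-line mapping. This is a hedged Python sketch; `timeout_seconds` is a hypothetical helper, and rejecting negative values is an assumption rather than documented behavior.

```python
def timeout_seconds(exec_timeout_minutes):
    """Map `exec_timeout_minutes` to seconds; 0 means no limit (None)."""
    if exec_timeout_minutes < 0:
        # Assumption: negative values are treated as a configuration error.
        raise ValueError("exec_timeout_minutes must be >= 0")
    return None if exec_timeout_minutes == 0 else exec_timeout_minutes * 60
```

The result can be passed directly to an API that treats `None` as "no timeout", such as `subprocess.run(cmd, timeout=timeout_seconds(5))` in Python.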
## MCP Tool

The MCP tool enables integration with external Model Context Protocol servers.

### Tool Discovery (Lazy Loading)

When connecting to multiple MCP servers, exposing hundreds of tools simultaneously can exhaust the LLM's context window
and increase API costs. The **Discovery** feature solves this by keeping MCP tools *hidden* by default.

Instead of loading all tools, the LLM is provided with a lightweight search tool (using BM25 keyword matching or Regex).
When the LLM needs a specific capability, it searches the hidden library. Matching tools are then temporarily "unlocked"
and injected into the context for a configured number of turns (`ttl`).

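The unlock-with-TTL behavior described above can be modeled as a per-tool countdown. The sketch below is only an illustration in Python: the `DiscoveredTools` class and its method names are hypothetical, and the choice to restart the countdown when a tool is found again is an assumption.

```python
class DiscoveredTools:
    """Hypothetical tracker for tools unlocked through search."""

    def __init__(self, ttl=5):
        self.ttl = ttl
        self._remaining = {}  # tool name -> turns left

    def unlock(self, names):
        # Assumption: a new search hit (re)starts the countdown for each tool.
        for name in names:
            self._remaining[name] = self.ttl

    def end_turn(self):
        # Called once per conversational turn; expired tools are hidden again.
        self._remaining = {n: t - 1 for n, t in self._remaining.items() if t > 1}

    def visible(self):
        return sorted(self._remaining)


tools = DiscoveredTools(ttl=2)
tools.unlock(["github_create_issue"])  # hypothetical tool name
```

With `ttl=2`, the tool stays visible for two turns and disappears after the second call to `end_turn()`.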
### Global Config

| Config | Type | Default | Description |
|-------------|--------|---------|-------------|
| `enabled` | bool | false | Enable MCP integration globally |
| `discovery` | object | `{}` | Configuration for Tool Discovery (see below) |
| `servers` | object | `{}` | Map of server name to server config |

### Discovery Config (`discovery`)

| Config | Type | Default | Description |
|----------------------|------|---------|-------------|
| `enabled` | bool | false | Global default: if `true`, all MCP tools are hidden and loaded on-demand via search; if `false`, all tools are loaded into context. Individual servers can override this with the per-server `deferred` field. |
| `ttl` | int | 5 | Number of conversational turns a discovered tool remains unlocked |
| `max_search_results` | int | 5 | Maximum number of tools returned per search query |
| `use_bm25` | bool | true | Enable the natural language/keyword search tool (`tool_search_tool_bm25`). **Warning**: consumes more resources than regex search |
| `use_regex` | bool | false | Enable the regex pattern search tool (`tool_search_tool_regex`) |

> **Note:** If `discovery.enabled` is `true`, you MUST enable at least one search engine (`use_bm25` or `use_regex`),
> otherwise the application will fail to start.

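The startup rule in the note above amounts to a simple consistency check. A minimal Python sketch, assuming the defaults from the table; `validate_discovery` and the error message are hypothetical and do not reproduce PicoClaw's actual error output:

```python
def validate_discovery(discovery):
    """Raise if discovery is enabled without any search tool."""
    enabled = discovery.get("enabled", False)
    use_bm25 = discovery.get("use_bm25", True)     # documented default
    use_regex = discovery.get("use_regex", False)  # documented default
    if enabled and not (use_bm25 or use_regex):
        raise ValueError(
            "tools.mcp.discovery: enable use_bm25 or use_regex when enabled is true"
        )
```

Because `use_bm25` defaults to `true`, a configuration that only sets `"enabled": true` passes the check; the failure case requires both search tools to be switched off explicitly.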
### Per-Server Config

| Config | Type | Required | Description |
|------------|--------|----------|-------------|
| `enabled` | bool | yes | Enable this MCP server |
| `deferred` | bool | no | Override deferred mode for this server only. `true` = tools are hidden and discoverable via search; `false` = tools are always visible in context. When omitted, the global `discovery.enabled` value applies. |
| `type` | string | no | Transport type: `stdio`, `sse`, `http` |
| `command` | string | stdio | Executable command for stdio transport |
| `args` | array | no | Command arguments for stdio transport |
| `env` | object | no | Environment variables for stdio process |
| `env_file` | string | no | Path to environment file for stdio process |
| `url` | string | sse/http | Endpoint URL for `sse`/`http` transport |
| `headers` | object | no | HTTP headers for `sse`/`http` transport |

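The relationship between the per-server `deferred` field and the global `discovery.enabled` value is a "specific overrides general" lookup. A short Python sketch of that resolution, with `is_deferred` as a hypothetical helper name:

```python
def is_deferred(server, discovery_enabled):
    """Resolve whether a server's tools are hidden behind search."""
    deferred = server.get("deferred")  # None when the field is omitted
    return discovery_enabled if deferred is None else bool(deferred)
```

For example, with discovery enabled globally, a server that sets `"deferred": false` stays always visible, while a server that omits the field follows the global default.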
### Transport Behavior

- If `type` is omitted, transport is auto-detected:
  - `url` is set → `sse`
  - `command` is set → `stdio`
- `http` and `sse` both use `url` + optional `headers`.
- `env` and `env_file` are only applied to `stdio` servers.

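The auto-detection rules above can be sketched as follows. This is an illustrative Python version, not PicoClaw's implementation: `resolve_transport` is a hypothetical name, checking `url` before `command` simply mirrors the order of the list, and rejecting unknown `type` values is an assumption.

```python
SUPPORTED_TRANSPORTS = ("stdio", "sse", "http")


def resolve_transport(server):
    """Pick the transport for a server entry, following the documented rules."""
    explicit = server.get("type")
    if explicit:
        # Assumption: values outside the documented set are rejected.
        if explicit not in SUPPORTED_TRANSPORTS:
            raise ValueError(f"unsupported transport type: {explicit!r}")
        return explicit
    # Assumption: `url` is checked before `command`, matching the list order.
    if server.get("url"):
        return "sse"
    if server.get("command"):
        return "stdio"
    raise ValueError("server needs either `url` or `command`")
```

A server entry with only `command` resolves to `stdio`, and one with only `url` resolves to `sse`; selecting `http` requires setting `type` explicitly.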
### Configuration Examples
|
||||
|
||||
#### 1) Stdio MCP server
|
||||
|
||||
```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "servers": {
        "filesystem": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-filesystem",
            "/tmp"
          ]
        }
      }
    }
  }
}
```
|
||||
|
||||
#### 2) Remote SSE/HTTP MCP server
|
||||
|
||||
```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "servers": {
        "remote-mcp": {
          "enabled": true,
          "type": "sse",
          "url": "https://example.com/mcp",
          "headers": {
            "Authorization": "Bearer YOUR_TOKEN"
          }
        }
      }
    }
  }
}
```
|
||||
|
||||
#### 3) Massive MCP setup with Tool Discovery enabled
|
||||
|
||||
*In this example, the LLM initially sees only `tool_search_tool_bm25`. It searches for and unlocks GitHub, Postgres, or Slack tools dynamically, only when the user's request calls for them.*
|
||||
|
||||
```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "discovery": {
        "enabled": true,
        "ttl": 5,
        "max_search_results": 5,
        "use_bm25": true,
        "use_regex": false
      },
      "servers": {
        "github": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-github"
          ],
          "env": {
            "GITHUB_PERSONAL_ACCESS_TOKEN": "YOUR_GITHUB_TOKEN"
          }
        },
        "postgres": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-postgres",
            "postgresql://user:password@localhost/dbname"
          ]
        },
        "slack": {
          "enabled": true,
          "type": "stdio",
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-slack"
          ],
          "env": {
            "SLACK_BOT_TOKEN": "YOUR_SLACK_BOT_TOKEN",
            "SLACK_TEAM_ID": "YOUR_SLACK_TEAM_ID"
          }
        }
      }
    }
  }
}
```
|
||||
|
||||
#### 4) Mixed setup: per-server deferred override
|
||||
|
||||
*Discovery is enabled globally, but `filesystem` is pinned as always visible, while `context7` follows the global default (deferred). `aws` explicitly opts in to deferred mode, even though that matches the global default.*
|
||||
|
||||
```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "discovery": {
        "enabled": true,
        "ttl": 5,
        "max_search_results": 5,
        "use_bm25": true
      },
      "servers": {
        "filesystem": {
          "enabled": true,
          "command": "npx",
          "args": ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"],
          "deferred": false
        },
        "context7": {
          "enabled": true,
          "command": "npx",
          "args": ["-y", "@upstash/context7-mcp"]
        },
        "aws": {
          "enabled": true,
          "command": "npx",
          "args": ["-y", "aws-mcp-server"],
          "deferred": true
        }
      }
    }
  }
}
```
|
||||
|
||||
> **Tip:** `deferred` on a per-server basis is independent of `discovery.enabled`. You can keep
> `discovery.enabled: false` globally (all tools visible by default) and still mark individual
> high-volume servers as `"deferred": true` to avoid polluting the context with their tools.
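
How the per-server override combines with the global setting can be sketched as below. This is a small Python illustration under stated assumptions: `is_deferred` is a hypothetical helper, and it only encodes the documented rule that an omitted `deferred` falls back to `discovery.enabled`.

```python
def is_deferred(server: dict, discovery_enabled: bool) -> bool:
    """Decide whether a server's tools are hidden behind search."""
    override = server.get("deferred")
    if override is None:
        # No per-server value: the global discovery.enabled setting applies.
        return discovery_enabled
    return bool(override)
```

With `discovery.enabled` set to `true`, the `filesystem` entry from example 4 (`"deferred": false`) stays visible, while `context7` (no `deferred` key) is deferred.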
|
||||
|
||||
## Skills Tool
|
||||
|
||||
The skills tool configures skill discovery and installation via registries like ClawHub and GitHub.
|
||||
|
||||
### Registries
|
||||
|
||||
| Config                                 | Type   | Default              | Description                                  |
|----------------------------------------|--------|----------------------|----------------------------------------------|
| `registries.clawhub.enabled`           | bool   | true                 | Enable ClawHub registry                      |
| `registries.clawhub.base_url`          | string | `https://clawhub.ai` | ClawHub base URL                             |
| `registries.clawhub.auth_token`        | string | `""`                 | Optional Bearer token for higher rate limits |
| `registries.clawhub.search_path`       | string | `""`                 | Search API path                              |
| `registries.clawhub.skills_path`       | string | `""`                 | Skills API path                              |
| `registries.clawhub.download_path`     | string | `""`                 | Download API path                            |
| `registries.clawhub.timeout`           | int    | 0                    | Request timeout in seconds (0 = default)     |
| `registries.clawhub.max_zip_size`      | int    | 0                    | Max skill zip size in bytes (0 = default)    |
| `registries.clawhub.max_response_size` | int    | 0                    | Max API response size in bytes (0 = default) |
| `registries.github.enabled`            | bool   | true                 | Enable GitHub installs via registry config   |
| `registries.github.base_url`           | string | `https://github.com` | GitHub or GitHub Enterprise base URL         |
| `registries.github.auth_token`         | string | `""`                 | GitHub personal access token                 |
| `registries.github.proxy`              | string | `""`                 | HTTP proxy for GitHub API requests           |
|
||||
|
||||
### Legacy GitHub Config
|
||||
|
||||
`github.*` is deprecated; use `registries.github.*` instead. The legacy fields are still supported for compatibility and will be removed in a future release.

| Config            | Type   | Default              | Description                |
|-------------------|--------|----------------------|----------------------------|
| `github.base_url` | string | `https://github.com` | Deprecated GitHub base URL |
| `github.proxy`    | string | `""`                 | Deprecated GitHub proxy    |
| `github.token`    | string | `""`                 | Deprecated GitHub token    |
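
One plausible way to think about the compatibility behaviour is a fallback from the new fields to the legacy ones. The sketch below is only an assumption for illustration: the document does not state which side wins when both are set, so the precedence shown here (new `registries.github.*` first, legacy `github.*` second) and the helper name `resolve_github_settings` are hypothetical.

```python
def resolve_github_settings(skills: dict) -> dict:
    """Merge registries.github.* with the deprecated github.* fields.

    Assumed precedence (not confirmed by the docs): non-empty new fields
    win; legacy fields are used only as a fallback.
    """
    legacy = skills.get("github", {})
    current = skills.get("registries", {}).get("github", {})
    return {
        "base_url": current.get("base_url") or legacy.get("base_url") or "https://github.com",
        # The legacy key is "token"; the new key is "auth_token".
        "auth_token": current.get("auth_token") or legacy.get("token") or "",
        "proxy": current.get("proxy") or legacy.get("proxy") or "",
    }
```

Whatever the real precedence is, migrating values to `registries.github.*` avoids depending on it.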
|
||||
|
||||
### Search Settings
|
||||
|
||||
| Config                     | Type | Default | Description                          |
|----------------------------|------|---------|--------------------------------------|
| `max_concurrent_searches`  | int  | 2       | Max concurrent skill search requests |
| `search_cache.max_size`    | int  | 50      | Max cached search results            |
| `search_cache.ttl_seconds` | int  | 300     | Cache TTL in seconds                 |
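
To make the two cache settings concrete, here is a minimal sketch of a size-bounded cache with a time-to-live. It is not PicoClaw's implementation: the class name `SearchCache`, the injectable `clock`, and the oldest-first eviction policy are assumptions chosen for illustration.

```python
import time
from collections import OrderedDict


class SearchCache:
    """Tiny TTL cache: at most max_size entries, each valid for ttl_seconds."""

    def __init__(self, max_size=50, ttl_seconds=300, clock=time.monotonic):
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()  # query -> (stored_at, results)

    def get(self, query):
        entry = self._entries.get(query)
        if entry is None:
            return None
        stored_at, results = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[query]  # expired: drop and report a miss
            return None
        return results

    def put(self, query, results):
        self._entries.pop(query, None)  # re-inserting refreshes the entry
        self._entries[query] = (self._clock(), results)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict the oldest entry
```

With the defaults above, a repeated search for the same query within five minutes would be served from the cache instead of hitting the registry again.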
|
||||
|
||||
### Configuration Example
|
||||
|
||||
```json
{
  "tools": {
    "skills": {
      "registries": {
        "clawhub": {
          "enabled": true,
          "base_url": "https://clawhub.ai",
          "auth_token": "",
          "search_path": "",
          "skills_path": "",
          "download_path": "",
          "timeout": 0,
          "max_zip_size": 0,
          "max_response_size": 0
        },
        "github": {
          "enabled": true,
          "base_url": "https://github.com",
          "auth_token": "",
          "proxy": ""
        }
      },
      "github": {
        "base_url": "https://github.com",
        "proxy": "",
        "token": ""
      },
      "max_concurrent_searches": 2,
      "search_cache": {
        "max_size": 50,
        "ttl_seconds": 300
      }
    }
  }
}
```
|
||||
|
||||
## Environment Variables
|
||||
|
||||
Tool configuration options can be overridden via environment variables named `PICOCLAW_TOOLS_<SECTION>_<KEY>`. For example:

- `PICOCLAW_TOOLS_WEB_BRAVE_ENABLED=true`
- `PICOCLAW_TOOLS_EXEC_ENABLED=false`
- `PICOCLAW_TOOLS_EXEC_ENABLE_DENY_PATTERNS=false`
- `PICOCLAW_TOOLS_CRON_EXEC_TIMEOUT_MINUTES=10`
- `PICOCLAW_TOOLS_MCP_ENABLED=true`
- `PICOCLAW_TOOLS_MCP_MAX_INLINE_TEXT_CHARS=16384`

Note: Nested map-style config (for example `tools.mcp.servers.<name>.*`) is configured in `config.json` rather than
environment variables.
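
The naming pattern can be derived mechanically from the config path under `tools`. The helper below is a hypothetical illustration (`env_var_name` is not part of PicoClaw); it simply upper-cases each path segment and joins them with underscores, which reproduces the examples listed above.

```python
def env_var_name(*path: str) -> str:
    """Build the environment variable name for a key under `tools`.

    Example: env_var_name("exec", "enabled") for tools.exec.enabled.
    """
    return "PICOCLAW_TOOLS_" + "_".join(part.upper() for part in path)
```

For instance, `env_var_name("web", "brave", "enabled")` gives the name used for `tools.web.brave.enabled`.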
|
||||
|
||||
For MCP tools, `tools.mcp.max_inline_text_chars` controls how much of a text result is kept inline in the model context. The threshold is counted in Unicode characters (Go runes), not bytes. For example, `16384` allows up to 16,384 characters inline, which may occupy more than 16 KB for multibyte text such as CJK. Above this threshold, PicoClaw saves the MCP text result as a local artifact in the agent workspace and gives the model a short note plus a structured `[file:...]` artifact path instead of injecting the full payload into context.
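
The difference between counting characters and counting bytes can be illustrated as follows. This is a Python sketch with a hypothetical helper name (`fits_inline`); Python's `len()` on a `str` counts Unicode code points, which corresponds to counting runes in Go for valid text.

```python
def fits_inline(text: str, max_inline_text_chars: int) -> bool:
    """True if the text is at or below the inline threshold, in characters."""
    return len(text) <= max_inline_text_chars


cjk = "漢" * 16384                      # 16,384 characters
byte_size = len(cjk.encode("utf-8"))    # each of these characters is 3 bytes in UTF-8
```

Here `cjk` is exactly at the `16384` threshold and would stay inline, even though its UTF-8 encoding is three times larger than 16 KB.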
|
||||
@@ -0,0 +1,415 @@
|
||||
# 🔧 Tools Configuration

> Back to [README](../project/README.pt-br.md)

PicoClaw's tools configuration is located in the `tools` field of `config.json`.

## Directory structure
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"web": {
|
||||
...
|
||||
},
|
||||
"mcp": {
|
||||
...
|
||||
},
|
||||
"exec": {
|
||||
...
|
||||
},
|
||||
"cron": {
|
||||
...
|
||||
},
|
||||
"skills": {
|
||||
...
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Web Tools

Web tools are used for web search and for fetching web pages.

### Web Fetcher

General settings for fetching and processing web page content.

| Config              | Type   | Default     | Description |
|---------------------|--------|-------------|-------------|
| `enabled`           | bool   | true        | Enable web page fetching. |
| `fetch_limit_bytes` | int    | 10485760    | Maximum size of the web page payload to fetch, in bytes (default is 10 MB). |
| `format`            | string | "plaintext" | Output format of the fetched content. Options: `plaintext` or `markdown` (recommended). |

### DuckDuckGo

| Config        | Type | Default | Description               |
|---------------|------|---------|---------------------------|
| `enabled`     | bool | true    | Enable DuckDuckGo search  |
| `max_results` | int  | 5       | Maximum number of results |

### Baidu Search

| Config        | Type   | Default                                                | Description               |
|---------------|--------|--------------------------------------------------------|---------------------------|
| `enabled`     | bool   | false                                                  | Enable Baidu search       |
| `api_key`     | string | -                                                      | Qianfan API key           |
| `base_url`    | string | `https://qianfan.baidubce.com/v2/ai_search/web_search` | Baidu Search API URL      |
| `max_results` | int    | 10                                                     | Maximum number of results |
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"web": {
|
||||
"baidu_search": {
|
||||
"enabled": true,
|
||||
"api_key": "YOUR_BAIDU_QIANFAN_API_KEY",
|
||||
"max_results": 10
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Perplexity

| Config        | Type     | Default | Description |
|---------------|----------|---------|-------------|
| `enabled`     | bool     | false   | Enable Perplexity search |
| `api_key`     | string   | -       | Perplexity API key |
| `api_keys`    | string[] | -       | Multiple Perplexity API keys for rotation (takes priority over `api_key`) |
| `max_results` | int      | 5       | Maximum number of results |

### Brave

| Config        | Type     | Default | Description |
|---------------|----------|---------|-------------|
| `enabled`     | bool     | false   | Enable Brave search |
| `api_key`     | string   | -       | Single Brave Search API key |
| `api_keys`    | string[] | -       | Multiple Brave API keys for rotation (takes priority over `api_key`) |
| `max_results` | int      | 5       | Maximum number of results |

### Tavily

| Config        | Type   | Default | Description |
|---------------|--------|---------|-------------|
| `enabled`     | bool   | false   | Enable Tavily search |
| `api_key`     | string | -       | Tavily API key |
| `base_url`    | string | -       | Custom Tavily base URL |
| `max_results` | int    | 0       | Maximum number of results (0 = default) |

### SearXNG

| Config        | Type   | Default                 | Description               |
|---------------|--------|-------------------------|---------------------------|
| `enabled`     | bool   | false                   | Enable SearXNG search     |
| `base_url`    | string | `http://localhost:8888` | SearXNG instance URL      |
| `max_results` | int    | 5                       | Maximum number of results |

### GLM Search

| Config          | Type   | Default                                           | Description               |
|-----------------|--------|---------------------------------------------------|---------------------------|
| `enabled`       | bool   | false                                             | Enable GLM Search         |
| `api_key`       | string | -                                                 | GLM API key               |
| `base_url`      | string | `https://open.bigmodel.cn/api/paas/v4/web_search` | GLM Search API URL        |
| `search_engine` | string | `search_std`                                      | Search engine type        |
| `max_results`   | int    | 5                                                 | Maximum number of results |

## Exec Tool

The exec tool is used to execute shell commands.

| Config                 | Type  | Default | Description |
|------------------------|-------|---------|-------------|
| `enabled`              | bool  | true    | Enable the exec tool |
| `enable_deny_patterns` | bool  | true    | Enable default blocking of dangerous commands |
| `custom_deny_patterns` | array | []      | Custom deny patterns (regular expressions) |

### Disabling the Exec Tool

To completely disable the `exec` tool, set `enabled` to `false`:

**Via config file:**
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"exec": {
|
||||
"enabled": false
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Via environment variable:**
|
||||
```bash
|
||||
PICOCLAW_TOOLS_EXEC_ENABLED=false
|
||||
```
|
||||
|
||||
> **Note:** When disabled, the agent cannot execute shell commands. This also affects the Cron tool's ability to run scheduled shell commands.

### Functionality

- **`enable_deny_patterns`**: Set to `false` to completely disable the default dangerous-command blocking patterns
- **`custom_deny_patterns`**: Add custom deny regex patterns; matching commands will be blocked

### Default Blocked Command Patterns

By default, PicoClaw blocks the following dangerous commands:

- Delete commands: `rm -rf`, `del /f/q`, `rmdir /s`
- Disk operations: `format`, `mkfs`, `diskpart`, `dd if=`, writing to `/dev/sd*`
- System operations: `shutdown`, `reboot`, `poweroff`
- Command substitution: `$()`, `${}`, backticks
- Pipe to shell: `| sh`, `| bash`
- Privilege escalation: `sudo`, `chmod`, `chown`
- Process control: `pkill`, `killall`, `kill -9`
- Remote operations: `curl | sh`, `wget | sh`, `ssh`
- Package management: `apt`, `yum`, `dnf`, `npm install -g`, `pip install --user`
- Containers: `docker run`, `docker exec`
- Git: `git push`, `git force`
- Other: `eval`, `source *.sh`

### Known Architectural Limitation

The exec guard only validates the top-level command sent to PicoClaw. It does **not** recursively inspect child processes spawned by build tools or scripts after that command starts.

Examples of workflows that can bypass the direct command guard once the initial command is allowed:

- `make run`
- `go run ./cmd/...`
- `cargo run`
- `npm run build`

This means the guard is useful for blocking obviously dangerous direct commands, but it is **not** a complete sandbox for unreviewed build pipelines. If your threat model includes untrusted code in the workspace, use stronger isolation such as containers, VMs, or an approval flow around build and run commands.

### Configuration Example
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"exec": {
|
||||
"enable_deny_patterns": true,
|
||||
"custom_deny_patterns": [
|
||||
"\\brm\\s+-r\\b",
|
||||
"\\bkillall\\s+python"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
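
Before adding custom patterns to `config.json`, it can help to try them against sample commands. The sketch below uses Python's `re` module as a stand-in; how PicoClaw actually applies the patterns (regex dialect, case sensitivity, matching anywhere in the command) is an assumption here, and `is_denied` is a hypothetical helper.

```python
import re

# The two patterns from the example config, written without JSON escaping.
CUSTOM_DENY_PATTERNS = [r"\brm\s+-r\b", r"\bkillall\s+python"]


def is_denied(command: str, patterns=CUSTOM_DENY_PATTERNS) -> bool:
    """True if any pattern matches somewhere in the command (assumed semantics)."""
    return any(re.search(pattern, command) for pattern in patterns)


samples = ["rm -r build", "killall python3", "ls -la", "rm -rf build"]
results = {cmd: is_denied(cmd) for cmd in samples}
```

Note that `rm -rf build` does not match the first custom pattern, because `\b` requires a word boundary right after `-r`; that form is covered by the default deny patterns instead.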
|
||||
|
||||
## Cron Tool

The cron tool is used to schedule periodic tasks.

| Config                 | Type | Default | Description |
|------------------------|------|---------|-------------|
| `exec_timeout_minutes` | int  | 5       | Execution timeout in minutes; 0 means no limit |
|
||||
|
||||
<a id="mcp-tool"></a>
|
||||
## MCP Tool

The MCP tool enables integration with external Model Context Protocol servers.

### Tool Discovery (lazy loading)

When connecting to multiple MCP servers, exposing hundreds of tools at once can exhaust the LLM's context window and increase API costs. The **Discovery** feature solves this by keeping MCP tools *hidden* by default.

Instead of loading all tools, the LLM is given a lightweight search tool (using BM25 keyword matching or regex). When the LLM needs a specific capability, it searches the hidden library. Matching tools are then temporarily "unlocked" and injected into the context for a configured number of turns (`ttl`).
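
The unlock-with-`ttl` behaviour can be pictured with a small bookkeeping sketch. It is only an illustration under assumptions: the class name `UnlockedTools`, the tool name `github_create_issue`, and the choices to decrement once per turn and to reset the counter on re-discovery are all hypothetical, not taken from PicoClaw's code.

```python
class UnlockedTools:
    """Track which discovered tools are currently visible to the LLM."""

    def __init__(self, ttl: int = 5):
        self.ttl = ttl
        self._remaining = {}  # tool name -> turns left

    def unlock(self, names):
        for name in names:
            self._remaining[name] = self.ttl  # (re)discovery resets the counter

    def end_turn(self):
        for name in list(self._remaining):
            self._remaining[name] -= 1
            if self._remaining[name] <= 0:
                del self._remaining[name]  # tool is hidden again

    def visible(self):
        return sorted(self._remaining)
```

With `ttl` set to 2, a tool unlocked by a search stays in context for two turns and is hidden again afterwards unless it is rediscovered.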
|
||||
|
||||
### Global Config

| Config      | Type   | Default | Description |
|-------------|--------|---------|-------------|
| `enabled`   | bool   | false   | Enable MCP integration globally |
| `discovery` | object | `{}`    | Tool discovery configuration (see below) |
| `servers`   | object | `{}`    | Map of server name to server configuration |

### Discovery Config (`discovery`)

| Config               | Type | Default | Description |
|----------------------|------|---------|-------------|
| `enabled`            | bool | false   | If true, MCP tools are hidden and loaded on demand via search. If false, all tools are loaded |
| `ttl`                | int  | 5       | Number of conversation turns a discovered tool remains unlocked |
| `max_search_results` | int  | 5       | Maximum number of tools returned per search query |
| `use_bm25`           | bool | true    | Enable the natural-language/keyword search tool (`tool_search_tool_bm25`). **Warning**: consumes more resources than regex search |
| `use_regex`          | bool | false   | Enable the regex pattern search tool (`tool_search_tool_regex`) |

> **Note:** If `discovery.enabled` is `true`, you **must** enable at least one search engine (`use_bm25` or `use_regex`),
> otherwise the application will fail to start.
|
||||
|
||||
### Per-Server Config

| Config     | Type   | Required         | Description |
|------------|--------|------------------|-------------|
| `enabled`  | bool   | yes              | Enable this MCP server |
| `type`     | string | no               | Transport type: `stdio`, `sse`, `http` |
| `command`  | string | for `stdio`      | Executable command for stdio transport |
| `args`     | array  | no               | Command arguments for stdio transport |
| `env`      | object | no               | Environment variables for stdio process |
| `env_file` | string | no               | Path to environment file for stdio process |
| `url`      | string | for `sse`/`http` | Endpoint URL for `sse`/`http` transport |
| `headers`  | object | no               | HTTP headers for `sse`/`http` transport |

### Transport Behavior

- If `type` is omitted, transport is auto-detected:
    - `url` is set → `sse`
    - `command` is set → `stdio`
- `http` and `sse` both use `url` + optional `headers`.
- `env` and `env_file` are only applied to `stdio` servers.

### Configuration Examples

#### 1) Stdio MCP server
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"mcp": {
|
||||
"enabled": true,
|
||||
"servers": {
|
||||
"filesystem": {
|
||||
"enabled": true,
|
||||
"command": "npx",
|
||||
"args": [
|
||||
"-y",
|
||||
"@modelcontextprotocol/server-filesystem",
|
||||
"/tmp"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### 2) Remote SSE/HTTP MCP server
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"mcp": {
|
||||
"enabled": true,
|
||||
"servers": {
|
||||
"remote-mcp": {
|
||||
"enabled": true,
|
||||
"type": "sse",
|
||||
"url": "https://example.com/mcp",
|
||||
"headers": {
|
||||
"Authorization": "Bearer YOUR_TOKEN"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### 3) Massive MCP setup with Tool Discovery enabled

*In this example, the LLM initially sees only `tool_search_tool_bm25`. It searches for and unlocks GitHub, Postgres, or Slack tools dynamically, only when the user's request calls for them.*
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"mcp": {
|
||||
"enabled": true,
|
||||
"discovery": {
|
||||
"enabled": true,
|
||||
"ttl": 5,
|
||||
"max_search_results": 5,
|
||||
"use_bm25": true,
|
||||
"use_regex": false
|
||||
},
|
||||
"servers": {
|
||||
"github": {
|
||||
"enabled": true,
|
||||
"command": "npx",
|
||||
"args": [
|
||||
"-y",
|
||||
"@modelcontextprotocol/server-github"
|
||||
],
|
||||
"env": {
|
||||
"GITHUB_PERSONAL_ACCESS_TOKEN": "YOUR_GITHUB_TOKEN"
|
||||
}
|
||||
},
|
||||
"postgres": {
|
||||
"enabled": true,
|
||||
"command": "npx",
|
||||
"args": [
|
||||
"-y",
|
||||
"@modelcontextprotocol/server-postgres",
|
||||
"postgresql://user:password@localhost/dbname"
|
||||
]
|
||||
},
|
||||
"slack": {
|
||||
"enabled": true,
|
"type": "stdio",
|
||||
"command": "npx",
|
||||
"args": [
|
||||
"-y",
|
||||
"@modelcontextprotocol/server-slack"
|
||||
],
|
||||
"env": {
|
||||
"SLACK_BOT_TOKEN": "YOUR_SLACK_BOT_TOKEN",
|
||||
"SLACK_TEAM_ID": "YOUR_SLACK_TEAM_ID"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
<a id="skills-tool"></a>
|
||||
## Skills Tool

The skills tool configures skill discovery and installation via registries such as ClawHub.

### Registries

| Config                             | Type   | Default              | Description |
|------------------------------------|--------|----------------------|-------------|
| `registries.clawhub.enabled`       | bool   | true                 | Enable ClawHub registry |
| `registries.clawhub.base_url`      | string | `https://clawhub.ai` | ClawHub base URL |
| `registries.clawhub.auth_token`    | string | `""`                 | Optional Bearer token for higher rate limits |
| `registries.clawhub.search_path`   | string | `/api/v1/search`     | Search API path |
| `registries.clawhub.skills_path`   | string | `/api/v1/skills`     | Skills API path |
| `registries.clawhub.download_path` | string | `/api/v1/download`   | Download API path |

### Configuration Example
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"skills": {
|
||||
"registries": {
|
||||
"clawhub": {
|
||||
"enabled": true,
|
||||
"base_url": "https://clawhub.ai",
|
||||
"auth_token": "",
|
||||
"search_path": "/api/v1/search",
|
||||
"skills_path": "/api/v1/skills",
|
||||
"download_path": "/api/v1/download"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Environment Variables

Tool configuration options can be overridden via environment variables named `PICOCLAW_TOOLS_<SECTION>_<KEY>`. For example:

- `PICOCLAW_TOOLS_WEB_BRAVE_ENABLED=true`
- `PICOCLAW_TOOLS_EXEC_ENABLED=false`
- `PICOCLAW_TOOLS_EXEC_ENABLE_DENY_PATTERNS=false`
- `PICOCLAW_TOOLS_CRON_EXEC_TIMEOUT_MINUTES=10`
- `PICOCLAW_TOOLS_MCP_ENABLED=true`

Note: Nested map-style config (for example `tools.mcp.servers.<name>.*`) is configured in `config.json` rather than environment variables.
|
||||
@@ -0,0 +1,415 @@
|
||||
# 🔧 Tools Configuration

> Back to [README](../project/README.vi.md)

PicoClaw's tools configuration is located in the `tools` field of `config.json`.

## Directory structure
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"web": {
|
||||
...
|
||||
},
|
||||
"mcp": {
|
||||
...
|
||||
},
|
||||
"exec": {
|
||||
...
|
||||
},
|
||||
"cron": {
|
||||
...
|
||||
},
|
||||
"skills": {
|
||||
...
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Web Tools

Web tools are used for web search and for fetching web content.

### Web Fetcher

General settings for fetching and processing web page content.

| Config              | Type   | Default     | Description |
|---------------------|--------|-------------|-------------|
| `enabled`           | bool   | true        | Enable web page fetching. |
| `fetch_limit_bytes` | int    | 10485760    | Maximum size of the web page payload to fetch, in bytes (default is 10 MB). |
| `format`            | string | "plaintext" | Output format of the fetched content. Options: `plaintext` or `markdown` (recommended). |

### DuckDuckGo

| Config        | Type | Default | Description               |
|---------------|------|---------|---------------------------|
| `enabled`     | bool | true    | Enable DuckDuckGo search  |
| `max_results` | int  | 5       | Maximum number of results |

### Baidu Search

| Config        | Type   | Default                                                | Description               |
|---------------|--------|--------------------------------------------------------|---------------------------|
| `enabled`     | bool   | false                                                  | Enable Baidu search       |
| `api_key`     | string | -                                                      | Qianfan API key           |
| `base_url`    | string | `https://qianfan.baidubce.com/v2/ai_search/web_search` | Baidu Search API URL      |
| `max_results` | int    | 10                                                     | Maximum number of results |
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"web": {
|
||||
"baidu_search": {
|
||||
"enabled": true,
|
||||
"api_key": "YOUR_BAIDU_QIANFAN_API_KEY",
|
||||
"max_results": 10
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Perplexity

| Config        | Type     | Default | Description |
|---------------|----------|---------|-------------|
| `enabled`     | bool     | false   | Enable Perplexity search |
| `api_key`     | string   | -       | Perplexity API key |
| `api_keys`    | string[] | -       | Multiple Perplexity API keys for rotation (takes priority over `api_key`) |
| `max_results` | int      | 5       | Maximum number of results |

### Brave

| Config        | Type     | Default | Description |
|---------------|----------|---------|-------------|
| `enabled`     | bool     | false   | Enable Brave search |
| `api_key`     | string   | -       | Brave Search API key |
| `api_keys`    | string[] | -       | Multiple Brave Search API keys for rotation (takes priority over `api_key`) |
| `max_results` | int      | 5       | Maximum number of results |

### Tavily

| Config        | Type   | Default | Description |
|---------------|--------|---------|-------------|
| `enabled`     | bool   | false   | Enable Tavily search |
| `api_key`     | string | -       | Tavily API key |
| `base_url`    | string | -       | Custom Tavily base URL |
| `max_results` | int    | 0       | Maximum number of results (0 = default) |

### SearXNG

| Config        | Type   | Default                 | Description               |
|---------------|--------|-------------------------|---------------------------|
| `enabled`     | bool   | false                   | Enable SearXNG search     |
| `base_url`    | string | `http://localhost:8888` | SearXNG instance URL      |
| `max_results` | int    | 5                       | Maximum number of results |

### GLM Search

| Config          | Type   | Default                                           | Description               |
|-----------------|--------|---------------------------------------------------|---------------------------|
| `enabled`       | bool   | false                                             | Enable GLM Search         |
| `api_key`       | string | -                                                 | GLM API key               |
| `base_url`      | string | `https://open.bigmodel.cn/api/paas/v4/web_search` | GLM Search API URL        |
| `search_engine` | string | `search_std`                                      | Search engine type        |
| `max_results`   | int    | 5                                                 | Maximum number of results |

## Exec Tool

The exec tool is used to execute shell commands.

| Config                 | Type  | Default | Description |
|------------------------|-------|---------|-------------|
| `enabled`              | bool  | true    | Enable the exec tool |
| `enable_deny_patterns` | bool  | true    | Enable default blocking of dangerous commands |
| `custom_deny_patterns` | array | []      | Custom deny patterns (regular expressions) |

### Disabling the Exec Tool

To completely disable the `exec` tool, set `enabled` to `false`:

**Via config file:**
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"exec": {
|
||||
"enabled": false
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Via environment variable:**
|
||||
```bash
|
||||
PICOCLAW_TOOLS_EXEC_ENABLED=false
|
||||
```
|
||||
|
||||
> **Note:** When disabled, the agent cannot execute shell commands. This also affects the Cron tool's ability to run scheduled shell commands.

### Functionality

- **`enable_deny_patterns`**: Set to `false` to completely disable the default dangerous-command blocking patterns
- **`custom_deny_patterns`**: Add custom deny regex patterns; matching commands will be blocked

### Default Blocked Command Patterns

By default, PicoClaw blocks the following dangerous commands:

- Delete commands: `rm -rf`, `del /f/q`, `rmdir /s`
- Disk operations: `format`, `mkfs`, `diskpart`, `dd if=`, writing to `/dev/sd*`
- System operations: `shutdown`, `reboot`, `poweroff`
- Command substitution: `$()`, `${}`, backticks
- Pipe to shell: `| sh`, `| bash`
- Privilege escalation: `sudo`, `chmod`, `chown`
- Process control: `pkill`, `killall`, `kill -9`
- Remote operations: `curl | sh`, `wget | sh`, `ssh`
- Package management: `apt`, `yum`, `dnf`, `npm install -g`, `pip install --user`
- Containers: `docker run`, `docker exec`
- Git: `git push`, `git force`
- Other: `eval`, `source *.sh`

### Known Architectural Limitation

The exec guard only validates the top-level command sent to PicoClaw. It does **not** recursively inspect child processes spawned by build tools or scripts after that command starts.

Examples of workflows that can bypass the direct command guard once the initial command is allowed:

- `make run`
- `go run ./cmd/...`
- `cargo run`
- `npm run build`

This means the guard is useful for blocking obviously dangerous direct commands, but it is **not** a complete sandbox for unreviewed build pipelines. If your threat model includes untrusted code in the workspace, use stronger isolation such as containers, VMs, or an approval flow around build and run commands.

### Configuration Example
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"exec": {
|
||||
"enable_deny_patterns": true,
|
||||
"custom_deny_patterns": [
|
||||
"\\brm\\s+-r\\b",
|
||||
"\\bkillall\\s+python"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Cron Tool

The cron tool is used to schedule periodic tasks.

| Config                 | Type | Default | Description |
|------------------------|------|---------|-------------|
| `exec_timeout_minutes` | int  | 5       | Execution timeout in minutes; 0 means no limit |
|
||||
|
||||
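As an illustration of the `exec_timeout_minutes` semantics, the sketch below converts the setting into a timeout in seconds and treats 0 as "no limit". The helper name `timeout_seconds` is hypothetical, and passing the result to `subprocess.run` is an assumption about how a scheduler could apply the value, not a description of PicoClaw's internals.

```python
import subprocess
import sys
from typing import Optional


def timeout_seconds(exec_timeout_minutes: int) -> Optional[float]:
    """Map the config value to seconds; 0 means no limit (None)."""
    if exec_timeout_minutes < 0:
        raise ValueError("exec_timeout_minutes must be >= 0")
    return None if exec_timeout_minutes == 0 else exec_timeout_minutes * 60.0


if __name__ == "__main__":
    # Run a trivial child process under the documented default of 5 minutes.
    result = subprocess.run(
        [sys.executable, "-c", "print('job done')"],
        capture_output=True,
        text=True,
        timeout=timeout_seconds(5),
    )
    print(result.stdout.strip())
```
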
<a id="mcp-tool"></a>
## MCP Tool

The MCP tool enables integration with external Model Context Protocol servers.

### Tool Discovery (Lazy Loading)

When connecting to many MCP servers, exposing hundreds of tools at once can exhaust the LLM's context window and increase API costs. The **Discovery** feature addresses this by keeping MCP tools *hidden* by default.

Instead of loading every tool, the LLM is given a lightweight search tool (using BM25 keyword matching or regex). When the LLM needs a specific capability, it searches the hidden library. Matching tools are then temporarily "unlocked" and injected into the context for a configured number of turns (`ttl`).

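The unlock-then-expire behaviour described above can be sketched roughly as follows. This is an illustrative model only: the class `ToolDiscovery`, its method names, and the tool names in the usage note are hypothetical, and the ranking is a naive keyword-overlap count standing in for BM25.

```python
from dataclasses import dataclass, field


@dataclass
class ToolDiscovery:
    """Keeps tools hidden until a search unlocks them for `ttl` turns."""
    hidden: dict                      # tool name -> description
    ttl: int = 5
    max_search_results: int = 5
    unlocked: dict = field(default_factory=dict)  # tool name -> turns left

    def search(self, query: str) -> list:
        """Naive keyword-overlap ranking (a stand-in for BM25)."""
        words = set(query.lower().split())
        scored = []
        for name, desc in self.hidden.items():
            score = len(words & set(desc.lower().split()))
            if score > 0:
                scored.append((score, name))
        scored.sort(key=lambda s: (-s[0], s[1]))
        hits = [name for _, name in scored[: self.max_search_results]]
        for name in hits:
            self.unlocked[name] = self.ttl  # (re)start the countdown
        return hits

    def end_turn(self) -> None:
        """Age unlocked tools by one turn and drop expired ones."""
        self.unlocked = {n: t - 1 for n, t in self.unlocked.items() if t > 1}

    def visible_tools(self) -> list:
        """Names of the tools currently exposed to the LLM."""
        return sorted(self.unlocked)
```

With `ttl=2`, a tool found by `search("github issue")` stays in `visible_tools()` through one call to `end_turn()` and disappears after the second.
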
### Global Configuration

| Config      | Type   | Default | Description                                   |
|-------------|--------|---------|-----------------------------------------------|
| `enabled`   | bool   | false   | Enable MCP integration globally               |
| `discovery` | object | `{}`    | Tool discovery configuration (see below)      |
| `servers`   | object | `{}`    | Map of server names to server configurations  |

### Discovery Configuration (`discovery`)

| Config               | Type | Default | Description |
|----------------------|------|---------|-------------|
| `enabled`            | bool | false   | If true, MCP tools are hidden and loaded on demand via search. If false, all tools are loaded |
| `ttl`                | int  | 5       | Number of conversation turns a discovered tool stays unlocked |
| `max_search_results` | int  | 5       | Maximum number of tools returned per search query |
| `use_bm25`           | bool | true    | Enable the natural-language/keyword search tool (`tool_search_tool_bm25`). **Warning**: uses more resources than regex search |
| `use_regex`          | bool | false   | Enable the regex pattern search tool (`tool_search_tool_regex`) |

> **Note:** If `discovery.enabled` is `true`, you **must** enable at least one search tool (`use_bm25` or `use_regex`);
> otherwise the application will fail to start.

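The startup rule in the note above can be expressed as a small validation check. This is a sketch under stated assumptions: the function name `validate_discovery` is hypothetical, and the fallback values mirror the defaults documented in the table rather than any code from PicoClaw.

```python
def validate_discovery(discovery: dict) -> None:
    """Raise ValueError if discovery is enabled without any search tool."""
    if not discovery.get("enabled", False):
        return  # nothing to check when discovery is off
    use_bm25 = discovery.get("use_bm25", True)     # documented default: true
    use_regex = discovery.get("use_regex", False)  # documented default: false
    if not (use_bm25 or use_regex):
        raise ValueError(
            "discovery.enabled is true but both use_bm25 and use_regex are false"
        )
```
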
### Per-Server Configuration

| Config     | Type   | Required | Description                                   |
|------------|--------|----------|-----------------------------------------------|
| `enabled`  | bool   | yes      | Enable this MCP server                        |
| `type`     | string | no       | Transport type: `stdio`, `sse`, `http`        |
| `command`  | string | stdio    | Executable command for the stdio transport    |
| `args`     | array  | no       | Command arguments for the stdio transport     |
| `env`      | object | no       | Environment variables for the stdio process   |
| `env_file` | string | no       | Path to an environment file for the stdio process |
| `url`      | string | sse/http | Endpoint URL for the `sse`/`http` transport   |
| `headers`  | object | no       | HTTP headers for the `sse`/`http` transport   |

### Transport Behavior

- If `type` is omitted, the transport is auto-detected:
  - `url` is set → `sse`
  - `command` is set → `stdio`
- `http` and `sse` both use `url` plus optional `headers`.
- `env` and `env_file` apply only to `stdio` servers.

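The auto-detection rules above could be modelled as shown below. This is a hedged sketch, not the project's code: `resolve_transport` is a hypothetical name, and giving `url` precedence when both `url` and `command` are present is an assumption based only on the order in which the rules are listed.

```python
VALID_TYPES = {"stdio", "sse", "http"}


def resolve_transport(server: dict) -> str:
    """Return the transport for one server entry, auto-detecting if needed."""
    explicit = server.get("type")
    if explicit:
        if explicit not in VALID_TYPES:
            raise ValueError(f"unsupported transport type: {explicit!r}")
        return explicit
    if server.get("url"):       # assumption: `url` is checked first
        return "sse"
    if server.get("command"):
        return "stdio"
    raise ValueError("server needs `type`, `url`, or `command`")
```
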
### Configuration Examples

#### 1) Stdio MCP Server

```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "servers": {
        "filesystem": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-filesystem",
            "/tmp"
          ]
        }
      }
    }
  }
}
```

#### 2) Remote SSE/HTTP MCP Server

```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "servers": {
        "remote-mcp": {
          "enabled": true,
          "type": "sse",
          "url": "https://example.com/mcp",
          "headers": {
            "Authorization": "Bearer YOUR_TOKEN"
          }
        }
      }
    }
  }
}
```

#### 3) Large-Scale MCP Setup with Tool Discovery Enabled

*In this example, the LLM sees only `tool_search_tool_bm25`. It searches for and dynamically unlocks the GitHub or Postgres tools only when the user's request calls for them.*

```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "discovery": {
        "enabled": true,
        "ttl": 5,
        "max_search_results": 5,
        "use_bm25": true,
        "use_regex": false
      },
      "servers": {
        "github": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-github"
          ],
          "env": {
            "GITHUB_PERSONAL_ACCESS_TOKEN": "YOUR_GITHUB_TOKEN"
          }
        },
        "postgres": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-postgres",
            "postgresql://user:password@localhost/dbname"
          ]
        },
        "slack": {
          "enabled": true,
          "type": "stdio",
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-slack"
          ],
          "env": {
            "SLACK_BOT_TOKEN": "YOUR_SLACK_BOT_TOKEN",
            "SLACK_TEAM_ID": "YOUR_SLACK_TEAM_ID"
          }
        }
      }
    }
  }
}
```

<a id="skills-tool"></a>
## Skills Tool

The skills tool configures skill discovery and installation through registries such as ClawHub.

### Registry

| Config                             | Type   | Default              | Description                                  |
|------------------------------------|--------|----------------------|----------------------------------------------|
| `registries.clawhub.enabled`       | bool   | true                 | Enable the ClawHub registry                  |
| `registries.clawhub.base_url`      | string | `https://clawhub.ai` | ClawHub base URL                             |
| `registries.clawhub.auth_token`    | string | `""`                 | Optional Bearer token for higher rate limits |
| `registries.clawhub.search_path`   | string | `/api/v1/search`     | Search API path                              |
| `registries.clawhub.skills_path`   | string | `/api/v1/skills`     | Skills API path                              |
| `registries.clawhub.download_path` | string | `/api/v1/download`   | Download API path                            |

### Configuration Example

```json
{
  "tools": {
    "skills": {
      "registries": {
        "clawhub": {
          "enabled": true,
          "base_url": "https://clawhub.ai",
          "auth_token": "",
          "search_path": "/api/v1/search",
          "skills_path": "/api/v1/skills",
          "download_path": "/api/v1/download"
        }
      }
    }
  }
}
```

## Environment Variables

All configuration options can be overridden via environment variables using the format `PICOCLAW_TOOLS_<SECTION>_<KEY>`.

Examples:

- `PICOCLAW_TOOLS_WEB_BRAVE_ENABLED=true`
- `PICOCLAW_TOOLS_EXEC_ENABLED=false`
- `PICOCLAW_TOOLS_EXEC_ENABLE_DENY_PATTERNS=false`
- `PICOCLAW_TOOLS_CRON_EXEC_TIMEOUT_MINUTES=10`
- `PICOCLAW_TOOLS_MCP_ENABLED=true`

Note: Nested map-style configuration (for example `tools.mcp.servers.<name>.*`) is set in `config.json` rather than through environment variables.

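To make the naming convention concrete, the sketch below derives a variable name from a dotted config path. The helper `env_var_name` is hypothetical and simply upper-cases and joins the path segments; it reproduces the examples listed above but is not taken from PicoClaw's source.

```python
def env_var_name(config_path: str) -> str:
    """Turn a dotted path under `tools` into its environment variable name."""
    parts = config_path.split(".")
    if parts[0] != "tools" or len(parts) < 3:
        raise ValueError("expected a path like tools.<section>.<key>")
    return "PICOCLAW_" + "_".join(p.upper() for p in parts)


if __name__ == "__main__":
    for path in ("tools.exec.enabled", "tools.cron.exec_timeout_minutes"):
        print(path, "->", env_var_name(path))
```
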
@@ -0,0 +1,492 @@
# 🔧 Tools Configuration

> Back to [README](../project/README.zh.md)

PicoClaw's tool configuration lives in the `tools` field of `config.json`.

## Directory Structure

```json
{
  "tools": {
    "web": {
      ...
    },
    "mcp": {
      ...
    },
    "exec": {
      ...
    },
    "cron": {
      ...
    },
    "skills": {
      ...
    }
  }
}
```

## Sensitive Data Filtering

Before sending tool results to the LLM, PicoClaw can filter sensitive values (API keys, tokens, passwords) out of the output. This prevents the LLM from seeing its own credentials.

See [Sensitive Data Filtering](../security/sensitive_data_filtering.zh.md) for details.

| Config | Type | Default | Description |
|--------|------|---------|-------------|
| `filter_sensitive_data` | bool | `true` | Enable or disable filtering |
| `filter_min_length` | int | `8` | Minimum content length that triggers filtering |

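A rough sketch of what such a filter might do is shown below. Several details are assumptions made for illustration: the function name `filter_sensitive`, the `[FILTERED]` placeholder, the idea that secrets are supplied as a list of known values, and the reading of `filter_min_length` as a threshold on the length of the tool output. The linked document is the authoritative description.

```python
def filter_sensitive(content: str, secrets: list, min_length: int = 8,
                     enabled: bool = True) -> str:
    """Replace known secret values in tool output before it reaches the LLM."""
    if not enabled or len(content) < min_length:
        return content  # filtering disabled, or content too short to bother
    # Replace longer secrets first so overlapping values are fully masked.
    for secret in sorted(filter(None, secrets), key=len, reverse=True):
        content = content.replace(secret, "[FILTERED]")
    return content
```
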
## Web Tools

The web tools handle web search and page fetching.

### Web Fetcher

General settings for fetching and processing web page content.

| Config              | Type   | Default     | Description |
|---------------------|--------|-------------|-------------|
| `enabled`           | bool   | true        | Enable web page fetching. |
| `fetch_limit_bytes` | int    | 10485760    | Maximum size of a fetched page payload, in bytes (default 10 MiB). |
| `format`            | string | "plaintext" | Output format of fetched content. Options: `plaintext` or `markdown` (recommended). |

### Baidu Search

Uses the [Qianfan AI Search API](https://cloud.baidu.com/doc/qianfan-api/s/Wmbq4z7e5). It is reliably reachable from mainland China and performs well on Chinese-language queries.

| Config        | Type   | Default                                                | Description          |
|---------------|--------|--------------------------------------------------------|----------------------|
| `enabled`     | bool   | false                                                  | Enable Baidu search  |
| `api_key`     | string | -                                                      | Qianfan API key      |
| `base_url`    | string | `https://qianfan.baidubce.com/v2/ai_search/web_search` | Baidu search API URL |
| `max_results` | int    | 10                                                     | Maximum result count |

```json
{
  "tools": {
    "web": {
      "baidu_search": {
        "enabled": true,
        "api_key": "YOUR_BAIDU_QIANFAN_API_KEY",
        "max_results": 10
      }
    }
  }
}
```

### Tavily

| Config        | Type   | Default | Description                          |
|---------------|--------|---------|--------------------------------------|
| `enabled`     | bool   | false   | Enable Tavily search                 |
| `api_key`     | string | -       | Tavily API key                       |
| `base_url`    | string | -       | Custom Tavily API base URL           |
| `max_results` | int    | 0       | Maximum result count (0 = default)   |

### GLM Search

| Config          | Type   | Default                                           | Description          |
|-----------------|--------|---------------------------------------------------|----------------------|
| `enabled`       | bool   | false                                             | Enable GLM search    |
| `api_key`       | string | -                                                 | GLM API key          |
| `base_url`      | string | `https://open.bigmodel.cn/api/paas/v4/web_search` | GLM Search API URL   |
| `search_engine` | string | `search_std`                                      | Search engine type   |
| `max_results`   | int    | 5                                                 | Maximum result count |

### DuckDuckGo

> ⚠️ Difficult to reach from mainland China; using a proxy is recommended.

| Config        | Type | Default | Description              |
|---------------|------|---------|--------------------------|
| `enabled`     | bool | true    | Enable DuckDuckGo search |
| `max_results` | int  | 5       | Maximum result count     |

### Perplexity

> ⚠️ Difficult to reach from mainland China; using a proxy is recommended.

| Config        | Type     | Default | Description                                                  |
|---------------|----------|---------|--------------------------------------------------------------|
| `enabled`     | bool     | false   | Enable Perplexity search                                     |
| `api_key`     | string   | -       | Perplexity API key                                           |
| `api_keys`    | string[] | -       | Multiple API keys for rotation (takes precedence over `api_key`) |
| `max_results` | int      | 5       | Maximum result count                                         |

### Brave

> ⚠️ Difficult to reach from mainland China; using a proxy is recommended.

| Config        | Type     | Default | Description                                                  |
|---------------|----------|---------|--------------------------------------------------------------|
| `enabled`     | bool     | false   | Enable Brave search                                          |
| `api_key`     | string   | -       | Brave Search API key                                         |
| `api_keys`    | string[] | -       | Multiple API keys for rotation (takes precedence over `api_key`) |
| `max_results` | int      | 5       | Maximum result count                                         |

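The relationship between `api_key` and `api_keys` can be pictured with the sketch below. The tables only say that keys are rotated and that `api_keys` takes precedence; the round-robin order and the helper name `key_cycle` are assumptions chosen for illustration.

```python
import itertools


def key_cycle(api_key: str = "", api_keys=None):
    """Return an endless iterator of keys; `api_keys` wins over `api_key`."""
    keys = [k for k in (api_keys or []) if k] or ([api_key] if api_key else [])
    if not keys:
        raise ValueError("no API key configured")
    return itertools.cycle(keys)  # assumption: simple round-robin rotation
```

Each outgoing request would take the next value from the iterator, so a single `api_key` is simply repeated while a list is walked in order.
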
### SearXNG

| Config        | Type   | Default                 | Description           |
|---------------|--------|-------------------------|-----------------------|
| `enabled`     | bool   | false                   | Enable SearXNG search |
| `base_url`    | string | `http://localhost:8888` | SearXNG instance URL  |
| `max_results` | int    | 5                       | Maximum result count  |

### Other Web Settings

| Config                   | Type     | Default | Description                                                        |
|--------------------------|----------|---------|--------------------------------------------------------------------|
| `prefer_native`          | bool     | true    | Prefer the provider's native search over the configured search engines |
| `private_host_whitelist` | string[] | `[]`    | Whitelist of private/internal hosts that web fetching may access   |

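One way to read `private_host_whitelist` is as an exception list to a rule that otherwise refuses private or internal targets. The sketch below follows that reading; it is an assumption-laden illustration (the function name `is_fetch_allowed` is hypothetical, no DNS resolution is performed, and only IP literals and `localhost` are classified), not a description of PicoClaw's actual check.

```python
import ipaddress


def is_fetch_allowed(host: str, private_host_whitelist=()) -> bool:
    """Allow public hosts; allow private/loopback ones only if whitelisted."""
    if host in private_host_whitelist:
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; this sketch only treats `localhost` as internal.
        return host.lower() != "localhost"
    return not (addr.is_private or addr.is_loopback or addr.is_link_local)
```
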
## Exec Tool

The exec tool runs shell commands.

| Config                 | Type  | Default | Description                                   |
|------------------------|-------|---------|-----------------------------------------------|
| `enabled`              | bool  | true    | Enable the exec tool                          |
| `enable_deny_patterns` | bool  | true    | Enable the default dangerous-command blocking |
| `custom_deny_patterns` | array | []      | Custom deny patterns (regular expressions)    |

### Disabling the Exec Tool

To disable the `exec` tool entirely, set `enabled` to `false`:

**Via the config file:**
```json
{
  "tools": {
    "exec": {
      "enabled": false
    }
  }
}
```

**Via an environment variable:**
```bash
PICOCLAW_TOOLS_EXEC_ENABLED=false
```

> **Note:** When the tool is disabled, the agent cannot execute shell commands. This also affects the Cron tool's ability to run scheduled shell commands.

### Feature Notes

- **`enable_deny_patterns`**: Set to `false` to disable the default dangerous-command blocking patterns entirely
- **`custom_deny_patterns`**: Add custom deny regex patterns; matching commands are blocked

### Command Patterns Blocked by Default

By default, PicoClaw blocks the following dangerous commands:

- Delete commands: `rm -rf`, `del /f/q`, `rmdir /s`
- Disk operations: `format`, `mkfs`, `diskpart`, `dd if=`, writes to `/dev/sd*`
- System operations: `shutdown`, `reboot`, `poweroff`
- Command substitution: `$()`, `${}`, backticks
- Pipe to shell: `| sh`, `| bash`
- Privilege escalation: `sudo`, `chmod`, `chown`
- Process control: `pkill`, `killall`, `kill -9`
- Remote operations: `curl | sh`, `wget | sh`, `ssh`
- Package management: `apt`, `yum`, `dnf`, `npm install -g`, `pip install --user`
- Containers: `docker run`, `docker exec`
- Git: `git push`, `git force`
- Other: `eval`, `source *.sh`

### Known Architectural Limitations

The exec guard validates only the top-level command sent to PicoClaw. It does **not** recursively inspect child processes spawned by build tools or scripts after that command starts.

The following workflows can bypass the direct command guard once the initial command has been allowed:

- `make run`
- `go run ./cmd/...`
- `cargo run`
- `npm run build`

This means the guard is useful for blocking obviously dangerous direct commands, but it is **not** a full sandbox for unreviewed build pipelines. If your threat model includes untrusted code in the workspace, use stronger isolation such as containers, virtual machines, or an approval workflow around build and run commands.

### Configuration Example

```json
{
  "tools": {
    "exec": {
      "enable_deny_patterns": true,
      "custom_deny_patterns": [
        "\\brm\\s+-r\\b",
        "\\bkillall\\s+python"
      ]
    }
  }
}
```

## Cron Tool

The cron tool schedules recurring tasks.

| Config                 | Type | Default | Description                                    |
|------------------------|------|---------|------------------------------------------------|
| `exec_timeout_minutes` | int  | 5       | Execution timeout in minutes; 0 means no limit |
| `allow_command`        | bool | false   | Allow cron jobs to execute shell commands      |

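<a id="mcp-tool"></a>
## MCP Tool

The MCP tool supports integration with external Model Context Protocol servers.

### Tool Discovery (Lazy Loading)

When connecting to many MCP servers, exposing hundreds of tools at once can exhaust the LLM's context window and increase API costs. The **Discovery** feature addresses this by keeping MCP tools *hidden* by default.

Instead of loading every tool, the LLM receives a lightweight search tool (using BM25 keyword matching or regular expressions). When the LLM needs a specific capability, it searches the hidden tool library. Matching tools are then temporarily "unlocked" and injected into the context for the configured number of turns (`ttl`).
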
### Global Configuration

| Config      | Type   | Default | Description                                  |
|-------------|--------|---------|----------------------------------------------|
| `enabled`   | bool   | false   | Enable MCP integration globally              |
| `discovery` | object | `{}`    | Tool discovery configuration (see below)     |
| `servers`   | object | `{}`    | Map of server names to server configurations |

### Discovery Configuration (`discovery`)

| Config               | Type | Default | Description |
|----------------------|------|---------|-------------|
| `enabled`            | bool | false   | If true, MCP tools are hidden and loaded on demand via search. If false, all tools are loaded |
| `ttl`                | int  | 5       | Number of conversation turns a discovered tool stays unlocked |
| `max_search_results` | int  | 5       | Maximum number of tools returned per search query |
| `use_bm25`           | bool | true    | Enable the natural-language/keyword search tool (`tool_search_tool_bm25`). **Warning**: uses more resources than regex search |
| `use_regex`          | bool | false   | Enable the regex pattern search tool (`tool_search_tool_regex`) |

> **Note:** If `discovery.enabled` is `true`, you **must** enable at least one search tool (`use_bm25` or `use_regex`);
> otherwise the application will fail to start.

### Per-Server Configuration

| Config     | Type   | Required | Description                                  |
|------------|--------|----------|----------------------------------------------|
| `enabled`  | bool   | yes      | Enable this MCP server                       |
| `type`     | string | no       | Transport type: `stdio`, `sse`, `http`       |
| `command`  | string | stdio    | Executable command for the stdio transport   |
| `args`     | array  | no       | Command arguments for the stdio transport    |
| `env`      | object | no       | Environment variables for the stdio process  |
| `env_file` | string | no       | Path to an environment file for the stdio process |
| `url`      | string | sse/http | Endpoint URL for the `sse`/`http` transport  |
| `headers`  | object | no       | HTTP headers for the `sse`/`http` transport  |

### Transport Behavior

- If `type` is omitted, the transport is auto-detected:
  - `url` is set → `sse`
  - `command` is set → `stdio`
- `http` and `sse` both use `url` plus optional `headers`.
- `env` and `env_file` apply only to `stdio` servers.

### Configuration Examples

#### 1) Stdio MCP Server

```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "servers": {
        "filesystem": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-filesystem",
            "/tmp"
          ]
        }
      }
    }
  }
}
```

#### 2) Remote SSE/HTTP MCP Server

```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "servers": {
        "remote-mcp": {
          "enabled": true,
          "type": "sse",
          "url": "https://example.com/mcp",
          "headers": {
            "Authorization": "Bearer YOUR_TOKEN"
          }
        }
      }
    }
  }
}
```

#### 3) Large-Scale MCP Setup with Tool Discovery Enabled

*In this example, the LLM sees only `tool_search_tool_bm25`. It searches for and dynamically unlocks the GitHub or Postgres tools only when the user's request calls for them.*

```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "discovery": {
        "enabled": true,
        "ttl": 5,
        "max_search_results": 5,
        "use_bm25": true,
        "use_regex": false
      },
      "servers": {
        "github": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-github"
          ],
          "env": {
            "GITHUB_PERSONAL_ACCESS_TOKEN": "YOUR_GITHUB_TOKEN"
          }
        },
        "postgres": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-postgres",
            "postgresql://user:password@localhost/dbname"
          ]
        },
        "slack": {
          "enabled": true,
          "type": "stdio",
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-slack"
          ],
          "env": {
            "SLACK_BOT_TOKEN": "YOUR_SLACK_BOT_TOKEN",
            "SLACK_TEAM_ID": "YOUR_SLACK_TEAM_ID"
          }
        }
      }
    }
  }
}
```

<a id="skills-tool"></a>
## Skills Tool

The skills tool configures skill discovery and installation through registries such as ClawHub.

### Registry

| Config                                 | Type   | Default              | Description                                        |
|----------------------------------------|--------|----------------------|----------------------------------------------------|
| `registries.clawhub.enabled`           | bool   | true                 | Enable the ClawHub registry                        |
| `registries.clawhub.base_url`          | string | `https://clawhub.ai` | ClawHub base URL                                   |
| `registries.clawhub.auth_token`        | string | `""`                 | Optional Bearer token for higher rate limits       |
| `registries.clawhub.search_path`       | string | `""`                 | Search API path                                    |
| `registries.clawhub.skills_path`       | string | `""`                 | Skills API path                                    |
| `registries.clawhub.download_path`     | string | `""`                 | Download API path                                  |
| `registries.clawhub.timeout`           | int    | 0                    | Request timeout in seconds; 0 = default            |
| `registries.clawhub.max_zip_size`      | int    | 0                    | Maximum skill zip size in bytes; 0 = default       |
| `registries.clawhub.max_response_size` | int    | 0                    | Maximum API response size in bytes; 0 = default    |

### GitHub Integration

| Config         | Type   | Default | Description                        |
|----------------|--------|---------|------------------------------------|
| `github.proxy` | string | `""`    | HTTP proxy for GitHub API requests |
| `github.token` | string | `""`    | GitHub personal access token       |

### Search Settings

| Config                     | Type | Default | Description                                     |
|----------------------------|------|---------|-------------------------------------------------|
| `max_concurrent_searches`  | int  | 2       | Maximum number of concurrent skill search requests |
| `search_cache.max_size`    | int  | 50      | Maximum number of cached search results         |
| `search_cache.ttl_seconds` | int  | 300     | Cache TTL in seconds                            |

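The two `search_cache` settings bound the cache by entry count and by age. The sketch below shows one way such a cache could work; the class name `SearchCache`, the least-recently-used eviction policy, and the injectable `clock` parameter are all assumptions made for illustration rather than details taken from PicoClaw.

```python
import time
from collections import OrderedDict


class SearchCache:
    """Small cache bounded by entry count (max_size) and age (ttl_seconds)."""

    def __init__(self, max_size: int = 50, ttl_seconds: float = 300,
                 clock=time.monotonic):
        self.max_size = max_size
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()  # query -> (stored_at, results)

    def get(self, query: str):
        """Return cached results, or None if missing or expired."""
        item = self._entries.get(query)
        if item is None:
            return None
        stored_at, results = item
        if self._clock() - stored_at >= self.ttl_seconds:
            del self._entries[query]  # expired
            return None
        self._entries.move_to_end(query)  # mark as recently used
        return results

    def put(self, query: str, results) -> None:
        """Store results and evict the least recently used entries if full."""
        self._entries[query] = (self._clock(), results)
        self._entries.move_to_end(query)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)
```
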
### Configuration Example

```json
{
  "tools": {
    "skills": {
      "registries": {
        "clawhub": {
          "enabled": true,
          "base_url": "https://clawhub.ai",
          "auth_token": ""
        }
      },
      "github": {
        "proxy": "",
        "token": ""
      },
      "max_concurrent_searches": 2,
      "search_cache": {
        "max_size": 50,
        "ttl_seconds": 300
      }
    }
  }
}
```

## Environment Variables

All configuration options can be overridden via environment variables using the format `PICOCLAW_TOOLS_<SECTION>_<KEY>`.

For example:

- `PICOCLAW_TOOLS_WEB_BRAVE_ENABLED=true`
- `PICOCLAW_TOOLS_EXEC_ENABLED=false`
- `PICOCLAW_TOOLS_EXEC_ENABLE_DENY_PATTERNS=false`
- `PICOCLAW_TOOLS_CRON_EXEC_TIMEOUT_MINUTES=10`
- `PICOCLAW_TOOLS_MCP_ENABLED=true`

Note: Nested map-style configuration (for example `tools.mcp.servers.<name>.*`) is set in `config.json` rather than through environment variables.

## Skills Tool

The Skills tool discovers and installs skills from registry sources. ClawHub and GitHub are supported.

### Registries

| Config | Type | Default | Description |
|--------|------|---------|-------------|
| `registries.clawhub.enabled` | bool | true | Whether ClawHub is enabled |
| `registries.clawhub.base_url` | string | `https://clawhub.ai` | ClawHub base URL |
| `registries.clawhub.auth_token` | string | `""` | ClawHub auth token |
| `registries.github.enabled` | bool | true | Whether GitHub is enabled |
| `registries.github.base_url` | string | `https://github.com` | GitHub or GitHub Enterprise base URL |
| `registries.github.auth_token` | string | `""` | GitHub access token |
| `registries.github.proxy` | string | `""` | Proxy for GitHub requests |

### Legacy GitHub Configuration

`github.*` is deprecated; migrate to `registries.github.*`. The old keys are still accepted for compatibility and may be removed in a future release.

| Config | Type | Default | Description |
|--------|------|---------|-------------|
| `github.base_url` | string | `https://github.com` | Deprecated |
| `github.proxy` | string | `""` | Deprecated |
| `github.token` | string | `""` | Deprecated |

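For anyone moving an existing config by hand, the sketch below shows how the deprecated keys might map onto the new ones. It is a hypothetical helper, not PicoClaw's migration code: the function name `migrate_github_config` is invented, and mapping `github.token` to `registries.github.auth_token` is an assumption inferred from the table descriptions.

```python
def migrate_github_config(skills: dict) -> dict:
    """Copy deprecated `github.*` values into `registries.github.*`."""
    migrated = {k: v for k, v in skills.items() if k != "github"}
    registries = dict(migrated.get("registries", {}))
    target = dict(registries.get("github", {}))
    legacy = skills.get("github", {})
    # Assumption: legacy `token` corresponds to the new `auth_token` key.
    key_map = (("base_url", "base_url"), ("proxy", "proxy"), ("token", "auth_token"))
    for old_key, new_key in key_map:
        value = legacy.get(old_key)
        if value and not target.get(new_key):
            target[new_key] = value  # never overwrite an explicit new-style value
    if target:
        registries["github"] = target
    if registries:
        migrated["registries"] = registries
    return migrated
```

The input is the `tools.skills` object; values already present under `registries.github` are left untouched.
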