mirror of
https://github.com/sipeed/picoclaw.git
synced 2026-06-12 18:08:54 +00:00
feat(mcp): tool search tools (#1243)
* feat(mcp): tool search tools
* removed unused call_discovered_tool
* improvements and optimizations
* fix gate mcp enabled
* fix TOCTOU race BM25 cache version check
* fix encapsulation bypass on registry internals
* safety comment on TickTTL
* added more unit tests
* enhanced logs
+124
-27
@@ -7,11 +7,21 @@ PicoClaw's tools configuration is located in the `tools` field of `config.json`.

```json
{
  "tools": {
    "web": {
      ...
    },
    "mcp": {
      ...
    },
    "exec": {
      ...
    },
    "cron": {
      ...
    },
    "skills": {
      ...
    }
  }
}
```

@@ -23,7 +33,7 @@ Web tools are used for web search and fetching.

### Brave

| Config        | Type   | Default | Description               |
|---------------|--------|---------|---------------------------|
| `enabled`     | bool   | false   | Enable Brave search       |
| `api_key`     | string | -       | Brave Search API key      |
| `max_results` | int    | 5       | Maximum number of results |
@@ -31,14 +41,14 @@ Web tools are used for web search and fetching.

### DuckDuckGo

| Config        | Type | Default | Description               |
|---------------|------|---------|---------------------------|
| `enabled`     | bool | true    | Enable DuckDuckGo search  |
| `max_results` | int  | 5       | Maximum number of results |

### Perplexity

| Config        | Type   | Default | Description               |
|---------------|--------|---------|---------------------------|
| `enabled`     | bool   | false   | Enable Perplexity search  |
| `api_key`     | string | -       | Perplexity API key        |
| `max_results` | int    | 5       | Maximum number of results |
@@ -48,7 +58,7 @@ Web tools are used for web search and fetching.

The exec tool is used to execute shell commands.

| Config                 | Type  | Default | Description                                |
|------------------------|-------|---------|--------------------------------------------|
| `enable_deny_patterns` | bool  | true    | Enable default dangerous command blocking  |
| `custom_deny_patterns` | array | []      | Custom deny patterns (regular expressions) |

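
As a rough illustration of how regular-expression deny patterns of this kind behave, here is a minimal Python sketch. It is not PicoClaw's implementation: `is_denied` is a hypothetical helper, the patterns are the two from the `custom_deny_patterns` configuration example in this document, and the sketch assumes a command is blocked when a pattern matches anywhere in the command string. Python's `re` module is used only for illustration; PicoClaw's own regex engine may differ in details.

```python
import re

# The two patterns from the custom_deny_patterns example. They are written as
# raw strings here, so the doubled JSON backslashes become single backslashes.
CUSTOM_DENY_PATTERNS = [r"\brm\s+-r\b", r"\bkillall\s+python"]


def is_denied(command: str, patterns: list[str]) -> bool:
    """Return True if any deny pattern matches somewhere in the command.

    Assumption: "deny" means an unanchored search over the whole command line.
    """
    return any(re.search(pattern, command) for pattern in patterns)


if __name__ == "__main__":
    for cmd in ["rm -r /tmp/build", "killall python3", "ls -la", "rm -rf /"]:
        print(f"{cmd!r}: denied={is_denied(cmd, CUSTOM_DENY_PATTERNS)}")
```

Under Python's `re` semantics, `\brm\s+-r\b` matches `rm -r /tmp/build` but not `rm -rf /`, because `\b` requires a word boundary immediately after the `r`. It is worth testing custom patterns against the commands they are meant to catch.
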
@@ -81,7 +91,10 @@ By default, PicoClaw blocks the following dangerous commands:
  "tools": {
    "exec": {
      "enable_deny_patterns": true,
      "custom_deny_patterns": [
        "\\brm\\s+-r\\b",
        "\\bkillall\\s+python"
      ]
    }
  }
}
@@ -92,24 +105,47 @@ By default, PicoClaw blocks the following dangerous commands:

The cron tool is used for scheduling periodic tasks.

| Config                 | Type | Default | Description                                    |
|------------------------|------|---------|------------------------------------------------|
| `exec_timeout_minutes` | int  | 5       | Execution timeout in minutes, 0 means no limit |

## MCP Tool

The MCP tool enables integration with external Model Context Protocol servers.

### Tool Discovery (Lazy Loading)

When connecting to multiple MCP servers, exposing hundreds of tools simultaneously can exhaust the LLM's context window
and increase API costs. The **Discovery** feature solves this by keeping MCP tools *hidden* by default.

Instead of loading all tools, the LLM is provided with a lightweight search tool (using BM25 keyword matching or Regex).
When the LLM needs a specific capability, it searches the hidden library. Matching tools are then temporarily "unlocked"
and injected into the context for a configured number of turns (`ttl`).
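
The search-then-unlock cycle described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not PicoClaw's code: the `DiscoveryRegistry` class, its method names, and the tool names are hypothetical, only the regex search variant is modelled (BM25 ranking is omitted), and it assumes the turn in which a tool is unlocked counts as the first of its `ttl` turns.

```python
import re
from dataclasses import dataclass, field


@dataclass
class DiscoveryRegistry:
    """Toy registry: tools stay hidden until a search unlocks them."""

    hidden: dict[str, str]                 # tool name -> description
    ttl: int = 5                           # turns a discovered tool stays unlocked
    max_search_results: int = 5            # cap on tools returned per search
    unlocked: dict[str, int] = field(default_factory=dict)  # name -> turns left

    def search_regex(self, pattern: str) -> list[str]:
        """Unlock tools whose name or description matches the pattern."""
        rx = re.compile(pattern, re.IGNORECASE)
        hits = [
            name
            for name, desc in self.hidden.items()
            if rx.search(name) or rx.search(desc)
        ][: self.max_search_results]
        for name in hits:
            self.unlocked[name] = self.ttl  # start (or restart) the countdown
        return hits

    def tick(self) -> None:
        """Advance one conversational turn and drop tools whose TTL ran out."""
        self.unlocked = {n: t - 1 for n, t in self.unlocked.items() if t > 1}

    def visible_tools(self) -> list[str]:
        """Names of the tools currently exposed to the LLM."""
        return sorted(self.unlocked)


if __name__ == "__main__":
    registry = DiscoveryRegistry(
        hidden={
            "github_create_issue": "Create an issue in a GitHub repository",
            "postgres_query": "Run a read-only SQL query",
        },
        ttl=2,
    )
    print(registry.search_regex("issue"))  # unlocks the matching tool
    registry.tick()
    print(registry.visible_tools())        # still unlocked after one turn
    registry.tick()
    print(registry.visible_tools())        # expired after two turns
```

Only the tools returned by a search occupy context, and they disappear again unless a later search refreshes them, which is what keeps the prompt small when many servers are connected.
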

### Global Config

| Config      | Type   | Default | Description                                  |
|-------------|--------|---------|----------------------------------------------|
| `enabled`   | bool   | false   | Enable MCP integration globally              |
| `discovery` | object | `{}`    | Configuration for Tool Discovery (see below) |
| `servers`   | object | `{}`    | Map of server name to server config          |

### Discovery Config (`discovery`)

| Config               | Type | Default | Description                                                                                                                       |
|----------------------|------|---------|-----------------------------------------------------------------------------------------------------------------------------------|
| `enabled`            | bool | false   | If true, MCP tools are hidden and loaded on-demand via search. If false, all tools are loaded                                     |
| `ttl`                | int  | 5       | Number of conversational turns a discovered tool remains unlocked                                                                 |
| `max_search_results` | int  | 5       | Maximum number of tools returned per search query                                                                                 |
| `use_bm25`           | bool | true    | Enable the natural language/keyword search tool (`tool_search_tool_bm25`). **Warning**: consumes more resources than regex search |
| `use_regex`          | bool | false   | Enable the regex pattern search tool (`tool_search_tool_regex`)                                                                   |

> **Note:** If `discovery.enabled` is `true`, you MUST enable at least one search engine (`use_bm25` or `use_regex`),
> otherwise the application will fail to start.
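
The startup rule in the note can be expressed as a small check. The Python sketch below is illustrative only: `validate_discovery` is a hypothetical function, the error message is invented, and the fallback values simply mirror the defaults listed in the Discovery Config table.

```python
def validate_discovery(discovery: dict) -> None:
    """Raise ValueError if discovery is enabled without any search tool.

    Missing keys fall back to the documented defaults:
    enabled=false, use_bm25=true, use_regex=false.
    """
    if not discovery.get("enabled", False):
        return  # discovery is off, so the search-tool flags do not matter
    use_bm25 = discovery.get("use_bm25", True)
    use_regex = discovery.get("use_regex", False)
    if not (use_bm25 or use_regex):
        raise ValueError(
            "tools.mcp.discovery: enable at least one of use_bm25 or use_regex"
        )


if __name__ == "__main__":
    validate_discovery({"enabled": True, "use_bm25": True})  # accepted
    try:
        validate_discovery({"enabled": True, "use_bm25": False, "use_regex": False})
    except ValueError as err:
        print(f"rejected: {err}")
```
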

### Per-Server Config

| Config     | Type   | Required | Description                                |
|------------|--------|----------|--------------------------------------------|
| `enabled`  | bool   | yes      | Enable this MCP server                     |
| `type`     | string | no       | Transport type: `stdio`, `sse`, `http`     |
| `command`  | string | stdio    | Executable command for stdio transport     |
@@ -122,8 +158,8 @@ The MCP tool enables integration with external Model Context Protocol servers.

### Transport Behavior

- If `type` is omitted, transport is auto-detected:
    - `url` is set → `sse`
    - `command` is set → `stdio`
- `http` and `sse` both use `url` + optional `headers`.
- `env` and `env_file` are only applied to `stdio` servers.
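
The auto-detection rules above amount to a short decision function. The following Python sketch is an assumption-laden illustration rather than PicoClaw's actual logic: `resolve_transport` is a hypothetical name, and checking `url` before `command` when both are present is a guess based only on the order of the bullets.

```python
VALID_TRANSPORTS = ("stdio", "sse", "http")


def resolve_transport(server: dict) -> str:
    """Pick the transport for one server entry under tools.mcp.servers."""
    explicit = server.get("type")
    if explicit:
        if explicit not in VALID_TRANSPORTS:
            raise ValueError(f"unknown transport type: {explicit!r}")
        return explicit          # an explicit type always wins
    if server.get("url"):
        return "sse"             # url set and no type: auto-detected as sse
    if server.get("command"):
        return "stdio"           # command set and no type: auto-detected as stdio
    raise ValueError("cannot detect transport: set type, url, or command")


if __name__ == "__main__":
    print(resolve_transport({"command": "npx", "args": ["-y", "some-server"]}))
    print(resolve_transport({"url": "https://example.invalid/mcp"}))
    print(resolve_transport({"type": "http", "url": "https://example.invalid/mcp"}))
```

One consequence of these rules is that `http` is never chosen automatically: a server that should use it has to set `type` explicitly.
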
@@ -140,7 +176,11 @@ The MCP tool enables integration with external Model Context Protocol servers.
        "filesystem": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-filesystem",
            "/tmp"
          ]
        }
      }
    }
@@ -170,20 +210,76 @@ The MCP tool enables integration with external Model Context Protocol servers.
}
```

#### 3) Massive MCP setup with Tool Discovery enabled

*In this example, the LLM initially sees only the `tool_search_tool_bm25` tool. It searches for and unlocks GitHub or
Postgres tools dynamically, only when the user's request calls for them.*

```json
{
  "tools": {
    "mcp": {
      "enabled": true,
      "discovery": {
        "enabled": true,
        "ttl": 5,
        "max_search_results": 5,
        "use_bm25": true,
        "use_regex": false
      },
      "servers": {
        "github": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-github"
          ],
          "env": {
            "GITHUB_PERSONAL_ACCESS_TOKEN": "YOUR_GITHUB_TOKEN"
          }
        },
        "postgres": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-postgres",
            "postgresql://user:password@localhost/dbname"
          ]
        },
        "slack": {
          "enabled": true,
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-slack"
          ],
          "env": {
            "SLACK_BOT_TOKEN": "YOUR_SLACK_BOT_TOKEN",
            "SLACK_TEAM_ID": "YOUR_SLACK_TEAM_ID"
          }
        }
      }
    }
  }
}
```

## Skills Tool

The skills tool configures skill discovery and installation via registries like ClawHub.

### Registries

| Config                             | Type   | Default              | Description                                  |
|------------------------------------|--------|----------------------|----------------------------------------------|
| `registries.clawhub.enabled`       | bool   | true                 | Enable ClawHub registry                      |
| `registries.clawhub.base_url`      | string | `https://clawhub.ai` | ClawHub base URL                             |
| `registries.clawhub.auth_token`    | string | `""`                 | Optional Bearer token for higher rate limits |
| `registries.clawhub.search_path`   | string | `/api/v1/search`     | Search API path                              |
| `registries.clawhub.skills_path`   | string | `/api/v1/skills`     | Skills API path                              |
| `registries.clawhub.download_path` | string | `/api/v1/download`   | Download API path                            |

### Configuration Example

@@ -217,4 +313,5 @@ For example:

- `PICOCLAW_TOOLS_CRON_EXEC_TIMEOUT_MINUTES=10`
- `PICOCLAW_TOOLS_MCP_ENABLED=true`

Note: Nested map-style config (for example `tools.mcp.servers.<name>.*`) is configured in `config.json` rather than
environment variables.
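
The two examples above suggest a naming pattern: take the dotted path of a setting in `config.json`, upper-case it, join the segments with underscores, and prefix the result with `PICOCLAW`. The Python sketch below encodes that inferred pattern. `env_var_name` is a hypothetical helper, and the rule is an assumption drawn from these two examples only, so any name it produces should be checked against PicoClaw itself.

```python
def env_var_name(config_path: str, prefix: str = "PICOCLAW") -> str:
    """Map a dotted config path to the environment variable name it implies.

    Assumed rule (inferred from the documented examples, not from the source):
    PREFIX + "_" + path segments joined by "_", all upper-cased.
    """
    return "_".join([prefix, *config_path.split(".")]).upper()


if __name__ == "__main__":
    print(env_var_name("tools.cron.exec_timeout_minutes"))
    print(env_var_name("tools.mcp.enabled"))
```

As the note says, this applies to scalar settings; entries under map-style keys such as `tools.mcp.servers.<name>` are configured in `config.json`.
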