Compare commits

7 Commits

Author SHA1 Message Date
zepan da79c201c7 1. fix typo 2026-02-17 18:03:02 +08:00
zepan 5fb2721d22 1. add android phone termux quick guide 2026-02-17 18:01:39 +08:00
zepan 951b05d255 1. add AI Code Generation selection in pr template 2026-02-17 17:15:40 +08:00
zepan ac4b16dfb4 1. rename doc to docs 2026-02-17 16:51:38 +08:00
zepan 0fadbcd340 1. add roadmap.md 2026-02-17 16:03:07 +08:00
Guoguo a961a2df87 fix(ci): use env var for release tag (#342) (Signed-off-by: Guoguo <i@qwq.trade>) 2026-02-17 14:32:51 +08:00
zepan 57dac394c5 update pr template 2026-02-17 09:30:30 +08:00
11 changed files with 191 additions and 409 deletions
+37
@@ -0,0 +1,37 @@
## 📝 Description
## 🗣️ Type of Change
- [ ] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ] ✨ New feature (non-breaking change which adds functionality)
- [ ] 📖 Documentation update
- [ ] ⚡ Code refactoring (no functional changes, no api changes)
## 🤖 AI Code Generation
- [ ] 🤖 Fully AI-generated (100% AI, 0% Human)
- [ ] 🛠️ Mostly AI-generated (AI draft, Human verified/modified)
- [ ] 👨‍💻 Mostly Human-written (Human lead, AI assisted or none)
## 🔗 Linked Issue
## 📚 Technical Context (Skip for Docs)
* **Reference:** [URL]
* **Reasoning:** ...
## 🧪 Test Environment & Hardware
- **Hardware:** [e.g. Raspberry Pi 5, Orange Pi, PC]
- **OS:** [e.g. Debian 12, Ubuntu 22.04]
- **Model/Provider:** [e.g. OpenAI GPT-4o, Kimi k2, DeepSeek-V3]
- **Channels:** [e.g. Discord, Telegram, Feishu, ...]
## 📸 Proof of Work (Optional for Docs)
<details>
<summary>Click to view Logs/Screenshots</summary>
</details>
## ☑️ Checklist
- [ ] My code/docs follow the style of this project.
- [ ] I have performed a self-review of my own changes.
- [ ] I have updated the documentation accordingly.
+4 -2
@@ -32,11 +32,13 @@ jobs:
- name: Create and push tag
shell: bash
env:
RELEASE_TAG: ${{ inputs.tag }}
run: |
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git tag -a "${{ inputs.tag }}" -m "Release ${{ inputs.tag }}"
git push origin "${{ inputs.tag }}"
git tag -a "$RELEASE_TAG" -m "Release $RELEASE_TAG"
git push origin "$RELEASE_TAG"
release:
name: GoReleaser Release
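Why this hunk matters: `${{ inputs.tag }}` is substituted into the script text before the shell parses it, so a crafted tag could inject commands, whereas a value passed through `env:` reaches the shell only as data. The Go sketch below illustrates the same idea outside of GitHub Actions. The helper name `printTagViaEnv` and the hostile tag are hypothetical, and it assumes a POSIX `sh` is on the PATH.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// printTagViaEnv runs a fixed shell script and hands it the tag through the
// RELEASE_TAG environment variable. The script text never contains the tag,
// so the shell cannot interpret the tag's characters as syntax.
// (Hypothetical helper for illustration; not part of the PicoClaw code base.)
func printTagViaEnv(tag string) (string, error) {
	cmd := exec.Command("sh", "-c", `printf '%s' "$RELEASE_TAG"`)
	// When a key appears more than once, exec.Cmd uses the last value.
	cmd.Env = append(os.Environ(), "RELEASE_TAG="+tag)
	out, err := cmd.Output()
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// A tag that would break out of the quotes if it were pasted into the script.
	hostile := `v1.0.0"; echo injected; "`
	got, err := printTagViaEnv(hostile)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("tag passed through unchanged:", got == hostile)
}
```

The important detail is the quoting of `"$RELEASE_TAG"` inside the script: the variable is expanded by the shell after parsing, which matches the quoted `"$RELEASE_TAG"` usage in the new workflow lines.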
+15 -1
@@ -49,7 +49,7 @@
## 📢 News
2026-02-16 🎉 PicoClaw hit 12K stars in one week! Thank you all for your support! PicoClaw is growing faster than we ever imagined. Given the high volume of PRs, we urgently need community maintainers. Our volunteer roles and roadmap are officially posted [here](doc/picoclaw_community_roadmap_260216.md) —we cant wait to have you on board!
2026-02-16 🎉 PicoClaw hit 12K stars in one week! Thank you all for your support! PicoClaw is growing faster than we ever imagined. Given the high volume of PRs, we urgently need community maintainers. Our volunteer roles and roadmap are officially posted [here](docs/picoclaw_community_roadmap_260216.md). We can't wait to have you on board!
2026-02-13 🎉 PicoClaw hit 5000 stars in 4 days! Thank you to the community! With so many PRs and issues coming in (during the Chinese New Year holidays), we are finalizing the Project Roadmap and setting up the Developer Group to accelerate PicoClaw's development.
🚀 Call to Action: Please submit your feature requests in GitHub Discussions. We will review and prioritize them during our upcoming weekly meeting.
@@ -99,6 +99,20 @@
</tr>
</table>
### 📱 Run on old Android Phones
Give your decade-old phone a second life! Turn it into a smart AI Assistant with PicoClaw. Quick Start:
1. **Install Termux** (Available on F-Droid or Google Play).
2. **Run the following commands**
```bash
# Note: Replace v0.1.1 with the latest version from the Releases page
wget https://github.com/sipeed/picoclaw/releases/download/v0.1.1/picoclaw-linux-arm64
chmod +x picoclaw-linux-arm64
pkg install proot
termux-chroot ./picoclaw-linux-arm64 onboard
```
Then follow the instructions in the "Quick Start" section to complete the configuration!
<img src="assets/termux.jpg" alt="PicoClaw" width="512">
### 🐜 Innovative Low-Footprint Deploy
PicoClaw can be deployed on almost any Linux device!
+18 -1
@@ -50,7 +50,7 @@
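## 📢 News
2026-02-16 🎉 PicoClaw passed 12K stars within one week! Thank you all for your attention! PicoClaw is growing faster than we expected. With the number of PRs expanding rapidly, we urgently need community developers to help with maintenance. The volunteer roles we need and the roadmap have been published [here](doc/picoclaw_community_roadmap_260216.md). We look forward to your participation!
2026-02-16 🎉 PicoClaw passed 12K stars within one week! Thank you all for your attention! PicoClaw is growing faster than we expected. With the number of PRs expanding rapidly, we urgently need community developers to help with maintenance. The volunteer roles we need and the roadmap have been published [here](docs/picoclaw_community_roadmap_260216.md). We look forward to your participation!
2026-02-13 🎉 **PicoClaw passed 5000 stars within 4 days!** Thank you to the community for your support! Because it is the Chinese New Year holiday, many PRs and issues are coming in, and we are using this time to finalize the **project roadmap** and set up a **developer group** to accelerate PicoClaw's development.
🚀 **Call to Action:** Please submit your feature requests in GitHub Discussions. We will review and prioritize them at the upcoming weekly meeting.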
@@ -100,6 +100,23 @@
</tr>
</table>
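### 📱 Run easily on a phone
PicoClaw can give your 10-year-old phone a second life by turning it into your AI assistant! Quick guide:
1. Download and install Termux from an app store
2. Open it and run the following commands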
```bash
# Note: replace v0.1.1 below with the latest version you actually see
wget https://github.com/sipeed/picoclaw/releases/download/v0.1.1/picoclaw-linux-arm64
chmod +x picoclaw-linux-arm64
pkg install proot
termux-chroot ./picoclaw-linux-arm64 onboard
```
Then follow the "Quick Start" section below to finish configuring PicoClaw, and it is ready to use!
<img src="assets/termux.jpg" alt="PicoClaw" width="512">
### 🐜 Innovative low-footprint deployment
PicoClaw can be deployed on almost any Linux device!
+116
@@ -0,0 +1,116 @@
# 🦐 PicoClaw Roadmap
> **Vision**: To build the ultimate lightweight, secure, and fully autonomous AI Agent infrastructure. Automate the mundane, unleash your creativity.
---
## 🚀 1. Core Optimization: Extreme Lightweight
*Our defining characteristic. We fight software bloat to ensure PicoClaw runs smoothly on the smallest embedded devices.*
* [**Memory Footprint Reduction**](https://github.com/sipeed/picoclaw/issues/346)
* **Goal**: Run smoothly on 64MB RAM embedded boards (e.g., low-end RISC-V SBCs) with the core process consuming < 20MB.
* **Context**: RAM is expensive and scarce on edge devices. Memory optimization takes precedence over storage size.
* **Action**: Analyze memory growth between releases, remove redundant dependencies, and optimize data structures.
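
As a rough illustration of the "analyze memory growth" action above, the sketch below reads Go runtime statistics and compares them against a budget. It assumes the 20MB goal is checked against `runtime.MemStats.Sys`, which approximates but is not identical to the process RSS; the `memSnapshot` type and both helpers are hypothetical names, not existing PicoClaw code.

```go
package main

import (
	"fmt"
	"runtime"
)

// memSnapshot holds the two figures this sketch tracks, in bytes.
type memSnapshot struct {
	HeapAlloc uint64 // bytes of live, allocated heap objects
	Sys       uint64 // total bytes the Go runtime has obtained from the OS
}

// takeSnapshot reads the current runtime memory statistics.
func takeSnapshot() memSnapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return memSnapshot{HeapAlloc: m.HeapAlloc, Sys: m.Sys}
}

// overBudget reports whether the runtime's OS-level footprint exceeds budgetBytes.
func overBudget(s memSnapshot, budgetBytes uint64) bool {
	return s.Sys > budgetBytes
}

func main() {
	const budget = 20 << 20 // 20 MiB, mirroring the roadmap's "< 20MB" goal
	s := takeSnapshot()
	fmt.Printf("heap=%d KiB sys=%d KiB overBudget=%v\n",
		s.HeapAlloc/1024, s.Sys/1024, overBudget(s, budget))
}
```

Recording such a snapshot at the same point in each release (for example, after startup) would give a simple number to compare between versions.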
## 🛡️ 2. Security Hardening: Defense in Depth
*Paying off early technical debt. We invite security experts to help build a "Secure-by-Default" agent.*
* **Input Defense & Permission Control**
* **Prompt Injection Defense**: Harden JSON extraction logic to prevent LLM manipulation.
* **Tool Abuse Prevention**: Strict parameter validation to ensure generated commands stay within safe boundaries.
* **SSRF Protection**: Built-in blocklists for network tools to prevent accessing internal IPs (LAN/Metadata services).
* **Sandboxing & Isolation**
* **Filesystem Sandbox**: Restrict file R/W operations to specific directories only.
* **Context Isolation**: Prevent data leakage between different user sessions or channels.
* **Privacy Redaction**: Auto-redact sensitive info (API Keys, PII) from logs and standard outputs.
* **Authentication & Secrets**
* **Crypto Upgrade**: Adopt modern algorithms like `ChaCha20-Poly1305` for secret storage.
* **OAuth 2.0 Flow**: Deprecate hardcoded API keys in the CLI; move to secure OAuth flows.
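
The SSRF Protection item above could start from a check like the following minimal sketch, which uses only the standard `net` package. The function names are hypothetical, and the sketch only covers literal IP addresses: a real implementation would also need to resolve hostnames and re-check the address actually dialed.

```go
package main

import (
	"fmt"
	"net"
)

// isBlockedIP reports whether ip is a loopback, private (RFC 1918 / fc00::/7),
// link-local, or unspecified address. Link-local covers 169.254.169.254,
// the address commonly used by cloud metadata services.
func isBlockedIP(ip net.IP) bool {
	return ip.IsLoopback() ||
		ip.IsPrivate() || // available since Go 1.17
		ip.IsLinkLocalUnicast() ||
		ip.IsLinkLocalMulticast() ||
		ip.IsUnspecified()
}

// isBlockedLiteral parses s as an IP literal and applies isBlockedIP.
// Input that does not parse is blocked, so the check fails closed.
func isBlockedLiteral(s string) bool {
	ip := net.ParseIP(s)
	if ip == nil {
		return true
	}
	return isBlockedIP(ip)
}

func main() {
	for _, addr := range []string{"127.0.0.1", "10.0.0.1", "169.254.169.254", "8.8.8.8"} {
		fmt.Printf("%-16s blocked=%v\n", addr, isBlockedLiteral(addr))
	}
}
```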
## 🔌 3. Connectivity: Protocol-First Architecture
*Connect every model, reach every platform.*
* **Provider**
* [**Architecture Upgrade**](https://github.com/sipeed/picoclaw/issues/283): Refactor from "Vendor-based" to "Protocol-based" classification (e.g., OpenAI-compatible, Ollama-compatible). *(Status: In progress by @Daming, ETA 5 days)*
* **Local Models**: Deep integration with **Ollama**, **vLLM**, **LM Studio**, and **Mistral** (local inference).
* **Online Models**: Continued support for frontier closed-source models.
* **Channel**
* **IM Matrix**: QQ, WeChat (Work), DingTalk, Feishu (Lark), Telegram, Discord, WhatsApp, LINE, Slack, Email, KOOK, Signal, ...
* **Standards**: Support for the **OneBot** protocol.
* [**Attachments**](https://github.com/sipeed/picoclaw/issues/348): Native handling of images, audio, and video attachments.
* **Skill Marketplace**
* [**Skill Discovery**](https://github.com/sipeed/picoclaw/issues/287): Implement `find_skill` to automatically discover and install skills from the [GitHub Skills Repo] or other registries.
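
To make the "Vendor-based" to "Protocol-based" idea in the Provider item above more concrete, here is a minimal sketch. Every type, constant, provider name, and URL in it is hypothetical and chosen for illustration; the actual design is tracked in the linked issue and may look quite different.

```go
package main

import (
	"fmt"
	"sort"
)

// Protocol names a wire protocol rather than a vendor.
type Protocol string

const (
	ProtocolOpenAI Protocol = "openai-compatible"
	ProtocolOllama Protocol = "ollama-compatible"
)

// ProviderConfig is a hypothetical config entry: the vendor is just a label,
// while the protocol decides which client implementation would be used.
type ProviderConfig struct {
	Name     string
	Protocol Protocol
	BaseURL  string
}

// groupByProtocol buckets provider names by the protocol they speak,
// returning each bucket in sorted order.
func groupByProtocol(cfgs []ProviderConfig) map[Protocol][]string {
	out := make(map[Protocol][]string)
	for _, c := range cfgs {
		out[c.Protocol] = append(out[c.Protocol], c.Name)
	}
	for p := range out {
		sort.Strings(out[p])
	}
	return out
}

func main() {
	cfgs := []ProviderConfig{
		{Name: "vllm-local", Protocol: ProtocolOpenAI, BaseURL: "http://vllm.example.invalid/v1"},
		{Name: "hosted-api", Protocol: ProtocolOpenAI, BaseURL: "https://api.example.invalid/v1"},
		{Name: "ollama-local", Protocol: ProtocolOllama, BaseURL: "http://ollama.example.invalid"},
	}
	groups := groupByProtocol(cfgs)
	fmt.Println(ProtocolOpenAI, groups[ProtocolOpenAI])
	fmt.Println(ProtocolOllama, groups[ProtocolOllama])
}
```

Under this classification, adding a new OpenAI-compatible backend would only need a new config entry, not a new vendor-specific code path.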
## 🧠 4. Advanced Capabilities: From Chatbot to Agentic AI
*Beyond conversation—focusing on action and collaboration.*
* **Operations**
* [**MCP Support**](https://github.com/sipeed/picoclaw/issues/290): Native support for the **Model Context Protocol (MCP)**.
* [**Browser Automation**](https://github.com/sipeed/picoclaw/issues/293): Headless browser control via CDP (Chrome DevTools Protocol) or ActionBook.
* [**Mobile Operation**](https://github.com/sipeed/picoclaw/issues/292): Android device control (similar to BotDrop).
* **Multi-Agent Collaboration**
* [**Basic Multi-Agent**](https://github.com/sipeed/picoclaw/issues/294): Implement basic multi-agent support.
* [**Model Routing**](https://github.com/sipeed/picoclaw/issues/295): "Smart Routing" — dispatch simple tasks to small/local models (fast/cheap) and complex tasks to SOTA models (smart).
* [**Swarm Mode**](https://github.com/sipeed/picoclaw/issues/284): Collaboration between multiple PicoClaw instances on the same network.
* [**AIEOS**](https://github.com/sipeed/picoclaw/issues/296): Exploring AI-Native Operating System interaction paradigms.
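
The Model Routing item above can be pictured with a deliberately naive sketch: route by prompt length. The word-count heuristic, the threshold, and all names here are assumptions made for illustration; the roadmap does not specify how "simple" and "complex" tasks would be told apart.

```go
package main

import (
	"fmt"
	"strings"
)

// Route identifies which class of model should handle a task.
type Route string

const (
	RouteLocal Route = "small-local" // fast and cheap
	RouteSOTA  Route = "sota"        // slower, more capable
)

// chooseRoute sends prompts with more than maxSimpleWords words to the
// SOTA route and everything else to the small/local route.
// This is a placeholder heuristic, not the project's routing logic.
func chooseRoute(prompt string, maxSimpleWords int) Route {
	if len(strings.Fields(prompt)) > maxSimpleWords {
		return RouteSOTA
	}
	return RouteLocal
}

func main() {
	prompts := []string{
		"What time is it?",
		"Refactor this module, add tests for every exported function, and explain each design trade-off in detail.",
	}
	for _, p := range prompts {
		fmt.Printf("%-12s <- %q\n", chooseRoute(p, 8), p)
	}
}
```

A real router would more likely combine several signals (tool use, required context length, past failures on the small model) rather than a single threshold.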
## 📚 5. Developer Experience (DevEx) & Documentation
*Lowering the barrier to entry so anyone can deploy in minutes.*
* [**QuickGuide (Zero-Config Start)**](https://github.com/sipeed/picoclaw/issues/350)
* Interactive CLI Wizard: If launched without config, automatically detect the environment and guide the user through Token/Network setup step-by-step.
* **Comprehensive Documentation**
* **Platform Guides**: Dedicated guides for Windows, macOS, Linux, and Android.
* **Step-by-Step Tutorials**: "Babysitter-level" guides for configuring Providers and Channels.
* **AI-Assisted Docs**: Using AI to auto-generate API references and code comments (with human verification to prevent hallucinations).
## 🤖 6. Engineering: AI-Powered Open Source
*Born from Vibe Coding, we continue to use AI to accelerate development.*
* **AI-Enhanced CI/CD**
* Integrate AI for automated Code Review, Linting, and PR Labeling.
* **Bot Noise Reduction**: Optimize bot interactions to keep PR timelines clean.
* **Issue Triage**: AI agents to analyze incoming issues and suggest preliminary fixes.
## 🎨 7. Brand & Community
* [**Logo Design**](https://github.com/sipeed/picoclaw/issues/297): We are looking for a **Mantis Shrimp (Stomatopoda)** logo design!
* *Concept*: Needs to reflect "Small but Mighty" and "Lightning Fast Strikes."
---
### 🤝 Call for Contributions
We welcome community contributions to any item on this roadmap! Please comment on the relevant Issue or submit a PR. Let's build the best Edge AI Agent together!
BIN
Binary file not shown. Size: 97 KiB
-10
@@ -84,16 +84,6 @@ func createToolRegistry(workspace string, restrict bool, cfg *config.Config, msg
}
registry.Register(tools.NewWebFetchTool(50000))
// Browser automation tool (agent-browser CLI)
if cfg.Tools.Browser.Enabled {
registry.Register(tools.NewBrowserTool(tools.BrowserToolOptions{
Session: cfg.Tools.Browser.Session,
Headless: cfg.Tools.Browser.Headless,
Timeout: cfg.Tools.Browser.Timeout,
CDPPort: cfg.Tools.Browser.CDPPort,
}))
}
// Hardware tools (I2C, SPI) - Linux only, returns error on other platforms
registry.Register(tools.NewI2CTool())
registry.Register(tools.NewSPITool())
+1 -16
@@ -211,17 +211,8 @@ type WebToolsConfig struct {
DuckDuckGo DuckDuckGoConfig `json:"duckduckgo"`
}
type BrowserConfig struct {
Enabled bool `json:"enabled" env:"PICOCLAW_TOOLS_BROWSER_ENABLED"`
Session string `json:"session" env:"PICOCLAW_TOOLS_BROWSER_SESSION"`
Headless bool `json:"headless" env:"PICOCLAW_TOOLS_BROWSER_HEADLESS"`
Timeout int `json:"timeout" env:"PICOCLAW_TOOLS_BROWSER_TIMEOUT"`
CDPPort int `json:"cdp_port" env:"PICOCLAW_TOOLS_BROWSER_CDP_PORT"`
}
type ToolsConfig struct {
Web WebToolsConfig `json:"web"`
Browser BrowserConfig `json:"browser"`
Web WebToolsConfig `json:"web"`
}
func DefaultConfig() *Config {
@@ -331,12 +322,6 @@ func DefaultConfig() *Config {
MaxResults: 5,
},
},
Browser: BrowserConfig{
Enabled: false,
Headless: true,
Timeout: 30,
CDPPort: 9222,
},
},
Heartbeat: HeartbeatConfig{
Enabled: true,
-229
@@ -1,229 +0,0 @@
package tools
import (
"bytes"
"context"
"fmt"
"os/exec"
"strings"
"time"
)
// BrowserToolOptions configures the BrowserTool.
type BrowserToolOptions struct {
Session string // Session name for isolation
Headless bool // Run in headless mode (default true)
Timeout int // Command timeout in seconds (default 30)
CDPPort int // Chrome DevTools Protocol port (default 9222)
}
// BrowserTool wraps the agent-browser CLI for headless browser automation.
// It delegates all browser complexity to the external `agent-browser` binary.
type BrowserTool struct {
session string
headless bool
timeout time.Duration
cdpPort int
}
// NewBrowserTool creates a new BrowserTool with the given options.
func NewBrowserTool(opts BrowserToolOptions) *BrowserTool {
timeout := 30
if opts.Timeout > 0 {
timeout = opts.Timeout
}
cdpPort := 9222
if opts.CDPPort > 0 {
cdpPort = opts.CDPPort
}
return &BrowserTool{
session: opts.Session,
headless: opts.Headless,
timeout: time.Duration(timeout) * time.Second,
cdpPort: cdpPort,
}
}
func (t *BrowserTool) Name() string {
return "browser"
}
func (t *BrowserTool) Description() string {
return `Automate a headless browser via agent-browser CLI. Pass the subcommand as 'command'.
The browser daemon persists between calls — open a page first, then interact with it.
Core workflow:
browser open <url> → Navigate to URL
browser snapshot -i → Get interactive elements with refs (@e1, @e2, ...)
browser click @e2 → Click element by ref
browser fill @e3 "text" → Fill input by ref
browser type @e3 "text" → Type into element
browser press Enter → Press a key
browser screenshot [path] → Take screenshot
browser get text @e1 → Get text content of element
browser get title → Get page title
browser get url → Get current URL
browser eval "js code" → Run JavaScript
browser scroll down [px] → Scroll page
browser wait <selector|ms> → Wait for element or time
browser close → Close browser
CSS selectors also work: browser click "#submit"
Examples:
command: "open https://example.com"
command: "snapshot -i"
command: "click @e2"
command: "fill @e3 \"user@example.com\""
command: "get title"
command: "screenshot /tmp/page.png"
command: "close"`
}
func (t *BrowserTool) Parameters() map[string]interface{} {
return map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"command": map[string]interface{}{
"type": "string",
"description": "The agent-browser subcommand to execute (e.g. 'open https://example.com', 'snapshot -i', 'click @e2')",
},
},
"required": []string{"command"},
}
}
func (t *BrowserTool) Execute(ctx context.Context, args map[string]interface{}) *ToolResult {
command, ok := args["command"].(string)
if !ok || strings.TrimSpace(command) == "" {
return ErrorResult("command is required (e.g. 'open https://example.com')")
}
// Build the full agent-browser command line
cmdArgs := t.buildArgs(command)
cmdCtx, cancel := context.WithTimeout(ctx, t.timeout)
defer cancel()
cmd := exec.CommandContext(cmdCtx, "agent-browser", cmdArgs...)
var stdout, stderr bytes.Buffer
cmd.Stdout = &stdout
cmd.Stderr = &stderr
err := cmd.Run()
output := stdout.String()
if stderr.Len() > 0 {
errOut := stderr.String()
// Filter out noise from stderr (daemon startup messages, etc.)
if !strings.Contains(errOut, "Daemon started") {
if output != "" {
output += "\n"
}
output += errOut
}
}
if err != nil {
if cmdCtx.Err() == context.DeadlineExceeded {
msg := fmt.Sprintf("Browser command timed out after %v: %s", t.timeout, command)
return &ToolResult{
ForLLM: msg,
ForUser: msg,
IsError: true,
}
}
// Include output even on error — agent-browser often puts useful info in stdout
if output == "" {
output = fmt.Sprintf("command failed: %v", err)
} else {
output += fmt.Sprintf("\nExit code: %v", err)
}
}
if output == "" {
output = "(no output)"
}
// Truncate long output
maxLen := 10000
if len(output) > maxLen {
output = output[:maxLen] + fmt.Sprintf("\n... (truncated, %d more chars)", len(output)-maxLen)
}
if err != nil {
return &ToolResult{
ForLLM: output,
ForUser: output,
IsError: true,
}
}
return &ToolResult{
ForLLM: output,
ForUser: output,
IsError: false,
}
}
// buildArgs constructs the argument list for the agent-browser command.
// It splits the user command string and prepends global flags.
func (t *BrowserTool) buildArgs(command string) []string {
var globalArgs []string
// Add CDP port
globalArgs = append(globalArgs, "--cdp", fmt.Sprintf("%d", t.cdpPort))
// Add session flag if configured
if t.session != "" {
globalArgs = append(globalArgs, "--session", t.session)
}
// Add --headed if not headless (agent-browser defaults to headless)
if !t.headless {
globalArgs = append(globalArgs, "--headed")
}
// Add --json for machine-readable output
globalArgs = append(globalArgs, "--json")
// Parse the command string into arguments, respecting quotes
cmdArgs := splitCommand(command)
return append(globalArgs, cmdArgs...)
}
// splitCommand splits a command string into arguments, respecting quoted strings.
func splitCommand(command string) []string {
var args []string
var current strings.Builder
inQuote := false
quoteChar := byte(0)
for i := 0; i < len(command); i++ {
ch := command[i]
switch {
case inQuote:
if ch == quoteChar {
inQuote = false
} else {
current.WriteByte(ch)
}
case ch == '"' || ch == '\'':
inQuote = true
quoteChar = ch
case ch == ' ' || ch == '\t':
if current.Len() > 0 {
args = append(args, current.String())
current.Reset()
}
default:
current.WriteByte(ch)
}
}
if current.Len() > 0 {
args = append(args, current.String())
}
return args
}
-150
@@ -1,150 +0,0 @@
package tools
import (
"context"
"strings"
"testing"
)
func TestBrowserTool_Name(t *testing.T) {
tool := NewBrowserTool(BrowserToolOptions{})
if tool.Name() != "browser" {
t.Errorf("Expected name 'browser', got %q", tool.Name())
}
}
func TestBrowserTool_Description(t *testing.T) {
tool := NewBrowserTool(BrowserToolOptions{})
desc := tool.Description()
if !strings.Contains(desc, "agent-browser") {
t.Error("Description should mention agent-browser")
}
if !strings.Contains(desc, "snapshot") {
t.Error("Description should mention snapshot command")
}
}
func TestBrowserTool_Parameters(t *testing.T) {
tool := NewBrowserTool(BrowserToolOptions{})
params := tool.Parameters()
props, ok := params["properties"].(map[string]interface{})
if !ok {
t.Fatal("Expected properties map")
}
if _, ok := props["command"]; !ok {
t.Error("Expected 'command' in properties")
}
required, ok := params["required"].([]string)
if !ok {
t.Fatal("Expected required slice")
}
if len(required) != 1 || required[0] != "command" {
t.Errorf("Expected required=['command'], got %v", required)
}
}
func TestBrowserTool_MissingCommand(t *testing.T) {
tool := NewBrowserTool(BrowserToolOptions{})
ctx := context.Background()
// Empty args
result := tool.Execute(ctx, map[string]interface{}{})
if !result.IsError {
t.Error("Expected error for missing command")
}
// Empty string
result = tool.Execute(ctx, map[string]interface{}{"command": ""})
if !result.IsError {
t.Error("Expected error for empty command")
}
// Whitespace only
result = tool.Execute(ctx, map[string]interface{}{"command": " "})
if !result.IsError {
t.Error("Expected error for whitespace-only command")
}
}
func TestBrowserTool_BuildArgs(t *testing.T) {
tests := []struct {
name string
session string
command string
wantArgs []string
}{
{
name: "simple command",
command: "open https://example.com",
wantArgs: []string{"--cdp", "9222", "--headed", "--json", "open", "https://example.com"},
},
{
name: "with session",
session: "test-session",
command: "snapshot -i",
wantArgs: []string{"--cdp", "9222", "--session", "test-session", "--headed", "--json", "snapshot", "-i"},
},
{
name: "quoted arguments",
command: `fill @e3 "hello world"`,
wantArgs: []string{"--cdp", "9222", "--headed", "--json", "fill", "@e3", "hello world"},
},
{
name: "single quoted",
command: `fill @e3 'hello world'`,
wantArgs: []string{"--cdp", "9222", "--headed", "--json", "fill", "@e3", "hello world"},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tool := NewBrowserTool(BrowserToolOptions{Session: tt.session})
got := tool.buildArgs(tt.command)
if len(got) != len(tt.wantArgs) {
t.Errorf("buildArgs(%q) = %v (len %d), want %v (len %d)",
tt.command, got, len(got), tt.wantArgs, len(tt.wantArgs))
return
}
for i := range got {
if got[i] != tt.wantArgs[i] {
t.Errorf("buildArgs(%q)[%d] = %q, want %q",
tt.command, i, got[i], tt.wantArgs[i])
}
}
})
}
}
func TestSplitCommand(t *testing.T) {
tests := []struct {
input string
want []string
}{
{"open https://example.com", []string{"open", "https://example.com"}},
{`fill @e3 "test@example.com"`, []string{"fill", "@e3", "test@example.com"}},
{"snapshot -i -c -d 3", []string{"snapshot", "-i", "-c", "-d", "3"}},
{`eval "document.title"`, []string{"eval", "document.title"}},
{" click @e2 ", []string{"click", "@e2"}},
{`get text @e1`, []string{"get", "text", "@e1"}},
}
for _, tt := range tests {
t.Run(tt.input, func(t *testing.T) {
got := splitCommand(tt.input)
if len(got) != len(tt.want) {
t.Errorf("splitCommand(%q) = %v, want %v", tt.input, got, tt.want)
return
}
for i := range got {
if got[i] != tt.want[i] {
t.Errorf("splitCommand(%q)[%d] = %q, want %q", tt.input, i, got[i], tt.want[i])
}
}
})
}
}