mirror of https://github.com/sipeed/picoclaw.git synced 2026-06-12 18:08:54 +00:00

RussellLuo 92678d1700 docs(voice): Update docs for audio-transcription

2026-03-22 21:04:10 +08:00

🔌 Providers & Model Configuration

Providers

Note

Voice transcription can use a configured multimodal model via voice.model_name. Groq Whisper remains available as a fallback when no voice model is configured.

| Provider | Purpose | Get API Key |
| --- | --- | --- |
| `gemini` | LLM (Gemini direct) | aistudio.google.com |
| `zhipu` | LLM (Zhipu direct) | bigmodel.cn |
| `volcengine` | LLM (Volcengine direct) | volcengine.com |
| `openrouter` | LLM (recommended, access to all models) | openrouter.ai |
| `anthropic` | LLM (Claude direct) | console.anthropic.com |
| `openai` | LLM (GPT direct) | platform.openai.com |
| `deepseek` | LLM (DeepSeek direct) | platform.deepseek.com |
| `qwen` | LLM (Qwen direct) | dashscope.console.aliyun.com |
| `groq` | LLM + Voice transcription (Whisper) | console.groq.com |
| `cerebras` | LLM (Cerebras direct) | cerebras.ai |
| `vivgrid` | LLM (Vivgrid direct) | vivgrid.com |
| `nvidia` | LLM (NVIDIA NIM) | build.nvidia.com |
| `moonshot` | LLM (Kimi/Moonshot direct) | platform.moonshot.cn |
| `minimax` | LLM (Minimax direct) | platform.minimaxi.com |
| `avian` | LLM (Avian direct) | avian.io |
| `mistral` | LLM (Mistral direct) | console.mistral.ai |
| `longcat` | LLM (Longcat direct) | longcat.ai |
| `modelscope` | LLM (ModelScope direct) | modelscope.cn |

Model Configuration (model_list)

What's New? PicoClaw now uses a model-centric configuration approach. To add a new provider, specify the model in vendor/model format (e.g., zhipu/glm-4.7); no code changes are required.

This design also enables multi-agent support with flexible provider selection:

- Different agents, different providers: Each agent can use its own LLM provider
- Model fallbacks: Configure primary and fallback models for resilience
- Load balancing: Distribute requests across multiple endpoints
- Centralized configuration: Manage all providers in one place

📋 All Supported Vendors

| Vendor | `model` Prefix | Default API Base | Protocol | API Key |
| --- | --- | --- | --- | --- |
| OpenAI | `openai/` | `https://api.openai.com/v1` | OpenAI | Get Key |
| Anthropic | `anthropic/` | `https://api.anthropic.com/v1` | Anthropic | Get Key |
| Zhipu AI (GLM) | `zhipu/` | `https://open.bigmodel.cn/api/paas/v4` | OpenAI | Get Key |
| DeepSeek | `deepseek/` | `https://api.deepseek.com/v1` | OpenAI | Get Key |
| Google Gemini | `gemini/` | `https://generativelanguage.googleapis.com/v1beta` | OpenAI | Get Key |
| Groq | `groq/` | `https://api.groq.com/openai/v1` | OpenAI | Get Key |
| Moonshot | `moonshot/` | `https://api.moonshot.cn/v1` | OpenAI | Get Key |
| Tongyi Qianwen (Qwen) | `qwen/` | `https://dashscope.aliyuncs.com/compatible-mode/v1` | OpenAI | Get Key |
| NVIDIA | `nvidia/` | `https://integrate.api.nvidia.com/v1` | OpenAI | Get Key |
| Ollama | `ollama/` | `http://localhost:11434/v1` | OpenAI | Local (no key needed) |
| OpenRouter | `openrouter/` | `https://openrouter.ai/api/v1` | OpenAI | Get Key |
| LiteLLM Proxy | `litellm/` | `http://localhost:4000/v1` | OpenAI | Your LiteLLM proxy key |
| vLLM | `vllm/` | `http://localhost:8000/v1` | OpenAI | Local |
| Cerebras | `cerebras/` | `https://api.cerebras.ai/v1` | OpenAI | Get Key |
| VolcEngine (Doubao) | `volcengine/` | `https://ark.cn-beijing.volces.com/api/v3` | OpenAI | Get Key |
| ShengSuanYun | `shengsuanyun/` | `https://router.shengsuanyun.com/api/v1` | OpenAI | - |
| BytePlus | `byteplus/` | `https://ark.ap-southeast.bytepluses.com/api/v3` | OpenAI | Get Key |
| Vivgrid | `vivgrid/` | `https://api.vivgrid.com/v1` | OpenAI | Get Key |
| LongCat | `longcat/` | `https://api.longcat.chat/openai` | OpenAI | Get Key |
| ModelScope | `modelscope/` | `https://api-inference.modelscope.cn/v1` | OpenAI | Get Token |
| Azure OpenAI | `azure/` | `https://{resource}.openai.azure.com` | Azure | Get Key |
| Antigravity | `antigravity/` | Google Cloud | Custom | OAuth only |
| GitHub Copilot | `github-copilot/` | `localhost:4321` | gRPC | - |
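
To make the "Default API Base" column concrete, the following Python sketch shows one way a default could be chosen from the `model` prefix. It is illustrative only: `DEFAULT_API_BASE` and `effective_api_base` are hypothetical names, not PicoClaw code, and the rule that an explicit `api_base` overrides the default is an assumption based on the column name.

```python
# Defaults copied from a few rows of the vendor table.
DEFAULT_API_BASE = {
    "openai": "https://api.openai.com/v1",
    "zhipu": "https://open.bigmodel.cn/api/paas/v4",
    "ollama": "http://localhost:11434/v1",
    "litellm": "http://localhost:4000/v1",
}


def effective_api_base(entry: dict) -> str:
    """Return the entry's api_base, else the default for its vendor prefix.

    Assumption: an explicit api_base takes precedence over the default.
    """
    if entry.get("api_base"):
        return entry["api_base"]
    # The vendor prefix is everything before the first slash.
    vendor = entry["model"].partition("/")[0]
    return DEFAULT_API_BASE[vendor]


print(effective_api_base({"model": "zhipu/glm-4.7"}))
print(effective_api_base({"model": "openai/custom-model",
                          "api_base": "https://my-proxy.com/v1"}))
```

The second call mirrors the custom proxy case, where the `openai/` prefix is kept but requests go to a different base URL.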

Basic Configuration

{
  "model_list": [
    {
      "model_name": "ark-code-latest",
      "model": "volcengine/ark-code-latest",
      "api_key": "sk-your-api-key"
    },
    {
      "model_name": "gpt-5.4",
      "model": "openai/gpt-5.4",
      "api_key": "sk-your-openai-key"
    },
    {
      "model_name": "claude-sonnet-4.6",
      "model": "anthropic/claude-sonnet-4.6",
      "api_key": "sk-ant-your-key"
    },
    {
      "model_name": "glm-4.7",
      "model": "zhipu/glm-4.7",
      "api_key": "your-zhipu-key"
    }
  ],
  "agents": {
    "defaults": {
      "model_name": "gpt-5.4"
    }
  }
}
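
The relationship between `agents.defaults.model_name` and `model_list` can be sketched in Python as follows. This is a minimal illustration under assumptions, not PicoClaw's implementation: `split_model` and `resolve_model` are hypothetical helpers, and the lookup simply takes the first entry with a matching `model_name`.

```python
import json

# A trimmed copy of the basic configuration.
CONFIG = json.loads("""
{
  "model_list": [
    {"model_name": "gpt-5.4", "model": "openai/gpt-5.4", "api_key": "sk-your-openai-key"},
    {"model_name": "glm-4.7", "model": "zhipu/glm-4.7", "api_key": "your-zhipu-key"}
  ],
  "agents": {"defaults": {"model_name": "gpt-5.4"}}
}
""")


def split_model(model: str) -> tuple[str, str]:
    """Split "vendor/model" at the first slash into (vendor, model_id)."""
    vendor, sep, model_id = model.partition("/")
    if not sep or not vendor or not model_id:
        raise ValueError(f"expected vendor/model, got {model!r}")
    return vendor, model_id


def resolve_model(config: dict, model_name: str) -> dict:
    """Return the first model_list entry whose model_name matches."""
    for entry in config.get("model_list", []):
        if entry.get("model_name") == model_name:
            return entry
    raise KeyError(f"no model_list entry named {model_name!r}")


default_name = CONFIG["agents"]["defaults"]["model_name"]
entry = resolve_model(CONFIG, default_name)
vendor, model_id = split_model(entry["model"])
print(vendor, model_id)
```

Here `model_name` acts as a local alias, while the `model` field carries the vendor prefix and the identifier sent to that vendor.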

Voice Transcription

You can configure a dedicated model for audio transcription with voice.model_name. This lets you reuse existing multimodal providers that support audio input instead of relying only on Groq.

If voice.model_name is not configured, PicoClaw will continue to fall back to Groq transcription when a Groq API key is available.

{
  "model_list": [
    {
      "model_name": "voice-gemini",
      "model": "gemini/gemini-2.5-flash",
      "api_key": "your-gemini-key"
    }
  ],
  "voice": {
    "model_name": "voice-gemini",
    "echo_transcription": false
  },
  "providers": {
    "groq": {
      "api_key": "gsk_xxx"
    }
  }
}
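
The selection order described above (a configured voice model first, Groq as the fallback) might look like this in Python. `pick_transcriber` and its return values are hypothetical, chosen only to illustrate the decision; the real logic may check more conditions.

```python
from typing import Optional


def pick_transcriber(config: dict) -> Optional[str]:
    """Decide which transcription backend a config would select.

    Returns "model:<name>" when voice.model_name is set, "groq" when only
    a Groq API key is available, and None when neither is configured.
    """
    voice_model = config.get("voice", {}).get("model_name")
    if voice_model:
        return f"model:{voice_model}"
    groq_key = config.get("providers", {}).get("groq", {}).get("api_key")
    if groq_key:
        return "groq"
    return None


with_voice = {
    "voice": {"model_name": "voice-gemini"},
    "providers": {"groq": {"api_key": "gsk_xxx"}},
}
groq_only = {"providers": {"groq": {"api_key": "gsk_xxx"}}}

print(pick_transcriber(with_voice))
print(pick_transcriber(groq_only))
print(pick_transcriber({}))
```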

Vendor-Specific Examples

OpenAI

{
  "model_name": "gpt-5.4",
  "model": "openai/gpt-5.4",
  "api_key": "sk-..."
}

VolcEngine (Doubao)

{
  "model_name": "ark-code-latest",
  "model": "volcengine/ark-code-latest",
  "api_key": "sk-..."
}

Zhipu AI (GLM)

{
  "model_name": "glm-4.7",
  "model": "zhipu/glm-4.7",
  "api_key": "your-key"
}

DeepSeek

{
  "model_name": "deepseek-chat",
  "model": "deepseek/deepseek-chat",
  "api_key": "sk-..."
}

Anthropic (with API key)

{
  "model_name": "claude-sonnet-4.6",
  "model": "anthropic/claude-sonnet-4.6",
  "api_key": "sk-ant-your-key"
}

Run picoclaw auth login --provider anthropic to paste your API token.

Anthropic Messages API (native format)

For direct Anthropic API access or custom endpoints that only support Anthropic's native message format:

{
  "model_name": "claude-opus-4-6",
  "model": "anthropic-messages/claude-opus-4-6",
  "api_key": "sk-ant-your-key",
  "api_base": "https://api.anthropic.com"
}

Use the anthropic-messages protocol when:

- You are using a third-party proxy that only supports Anthropic's native /v1/messages endpoint (not the OpenAI-compatible /v1/chat/completions)
- You are connecting to services such as MiniMax or Synthetic that require Anthropic's native message format
- The existing anthropic protocol returns 404 errors (indicating that the endpoint doesn't support the OpenAI-compatible format)

Note: The anthropic protocol uses OpenAI-compatible format (/v1/chat/completions), while anthropic-messages uses Anthropic's native format (/v1/messages). Choose based on your endpoint's supported format.

Ollama (local)

{
  "model_name": "llama3",
  "model": "ollama/llama3"
}

Custom Proxy/API

{
  "model_name": "my-custom-model",
  "model": "openai/custom-model",
  "api_base": "https://my-proxy.com/v1",
  "api_key": "sk-...",
  "request_timeout": 300
}

LiteLLM Proxy

{
  "model_name": "lite-gpt4",
  "model": "litellm/lite-gpt4",
  "api_base": "http://localhost:4000/v1",
  "api_key": "sk-..."
}

PicoClaw strips only the outer litellm/ prefix before sending the request, so proxy aliases like litellm/lite-gpt4 send lite-gpt4, while litellm/openai/gpt-4o sends openai/gpt-4o.
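
The prefix rule can be expressed as a short Python sketch. `strip_litellm_prefix` is a hypothetical name used for illustration; it reproduces only the behavior described above, not PicoClaw's actual code.

```python
def strip_litellm_prefix(model: str) -> str:
    """Remove one leading "litellm/" and leave the rest untouched."""
    prefix = "litellm/"
    if model.startswith(prefix):
        return model[len(prefix):]
    return model


print(strip_litellm_prefix("litellm/lite-gpt4"))
print(strip_litellm_prefix("litellm/openai/gpt-4o"))
```

Because only the outer prefix is removed, any inner vendor prefix such as `openai/` reaches the proxy intact.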

Load Balancing

Configure multiple endpoints under the same model name, and PicoClaw will automatically round-robin between them:

{
  "model_list": [
    {
      "model_name": "gpt-5.4",
      "model": "openai/gpt-5.4",
      "api_base": "https://api1.example.com/v1",
      "api_key": "sk-key1"
    },
    {
      "model_name": "gpt-5.4",
      "model": "openai/gpt-5.4",
      "api_base": "https://api2.example.com/v1",
      "api_key": "sk-key2"
    }
  ]
}
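
For readers unfamiliar with round-robin selection, here is a minimal Python sketch of the idea: entries that share a `model_name` are grouped, and each request takes the next one in turn. `build_round_robin` is a hypothetical helper; a real scheduler would likely also handle concurrency and failing endpoints, which this sketch ignores.

```python
import itertools
from collections import defaultdict

MODEL_LIST = [
    {"model_name": "gpt-5.4", "model": "openai/gpt-5.4",
     "api_base": "https://api1.example.com/v1", "api_key": "sk-key1"},
    {"model_name": "gpt-5.4", "model": "openai/gpt-5.4",
     "api_base": "https://api2.example.com/v1", "api_key": "sk-key2"},
]


def build_round_robin(model_list: list) -> dict:
    """Group entries by model_name and return an endless cycle per name."""
    groups = defaultdict(list)
    for entry in model_list:
        groups[entry["model_name"]].append(entry)
    return {name: itertools.cycle(entries) for name, entries in groups.items()}


pools = build_round_robin(MODEL_LIST)
# Three consecutive requests for the same model name.
picked = [next(pools["gpt-5.4"])["api_base"] for _ in range(3)]
print(picked)
```

The third request wraps around to the first endpoint again.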

Migration from Legacy `providers` Config

The old providers configuration is deprecated but still supported for backward compatibility.

Old Config (deprecated):

{
  "providers": {
    "zhipu": {
      "api_key": "your-key",
      "api_base": "https://open.bigmodel.cn/api/paas/v4"
    }
  },
  "agents": {
    "defaults": {
      "provider": "zhipu",
      "model": "glm-4.7"
    }
  }
}

New Config (recommended):

{
  "model_list": [
    {
      "model_name": "glm-4.7",
      "model": "zhipu/glm-4.7",
      "api_key": "your-key"
    }
  ],
  "agents": {
    "defaults": {
      "model_name": "glm-4.7"
    }
  }
}

For a detailed migration guide, see migration/model-list-migration.md.
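
The mapping between the two formats can be sketched in Python. This is an illustration of the old-to-new correspondence shown above, not an official migration tool: `migrate_legacy` is a hypothetical helper, it handles only the provider named in `agents.defaults`, and it carries `api_base` over unchanged even though the new format can omit it when the vendor default applies.

```python
import copy

OLD = {
    "providers": {
        "zhipu": {
            "api_key": "your-key",
            "api_base": "https://open.bigmodel.cn/api/paas/v4",
        }
    },
    "agents": {"defaults": {"provider": "zhipu", "model": "glm-4.7"}},
}


def migrate_legacy(config: dict) -> dict:
    """Convert the default provider/model pair into a model_list entry."""
    defaults = config["agents"]["defaults"]
    provider, model = defaults["provider"], defaults["model"]
    settings = config["providers"][provider]

    entry = {"model_name": model, "model": f"{provider}/{model}"}
    # Carry over credentials and any custom endpoint as-is.
    for key in ("api_key", "api_base"):
        if settings.get(key):
            entry[key] = settings[key]

    migrated = copy.deepcopy(config)  # leave the input untouched
    providers = migrated.get("providers", {})
    providers.pop(provider, None)
    if not providers:
        migrated.pop("providers", None)
    migrated["model_list"] = [entry]

    new_defaults = migrated["agents"]["defaults"]
    new_defaults.pop("provider", None)
    new_defaults.pop("model", None)
    new_defaults["model_name"] = model
    return migrated


NEW = migrate_legacy(OLD)
print(NEW["model_list"][0]["model"])
print(NEW["agents"]["defaults"])
```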

Provider Architecture

PicoClaw routes providers by protocol family:

- OpenAI-compatible protocol: OpenRouter, OpenAI-compatible gateways, Groq, Zhipu, and vLLM-style endpoints.
- Anthropic protocol: Claude-native API behavior.
- Codex/OAuth path: OpenAI OAuth/token authentication route.

This keeps the runtime lightweight while making new OpenAI-compatible backends mostly a config operation (api_base + api_key).

Zhipu

1. Get API key and base URL

Get API key

2. Configure

{
  "agents": {
    "defaults": {
      "workspace": "~/.picoclaw/workspace",
      "model_name": "glm-4.7",
      "max_tokens": 8192,
      "temperature": 0.7,
      "max_tool_iterations": 20
    }
  },
  "providers": {
    "zhipu": {
      "api_key": "Your API Key",
      "api_base": "https://open.bigmodel.cn/api/paas/v4"
    }
  }
}

3. Run

picoclaw agent -m "Hello"

Full config example

{
  "agents": {
    "defaults": {
      "model_name": "anthropic/claude-opus-4-5"
    }
  },
  "session": {
    "dm_scope": "per-channel-peer"
  },
  "providers": {
    "openrouter": {
      "api_key": "sk-or-v1-xxx"
    },
    "groq": {
      "api_key": "gsk_xxx"
    }
  },
  "voice": {
    "model_name": "voice-gemini",
    "echo_transcription": false
  },
  "channels": {
    "telegram": {
      "enabled": true,
      "token": "123456:ABC...",
      "allow_from": ["123456789"]
    },
    "discord": {
      "enabled": true,
      "token": "",
      "allow_from": [""]
    },
    "whatsapp": {
      "enabled": false,
      "bridge_url": "ws://localhost:3001",
      "use_native": false,
      "session_store_path": "",
      "allow_from": []
    },
    "feishu": {
      "enabled": false,
      "app_id": "cli_xxx",
      "app_secret": "xxx",
      "encrypt_key": "",
      "verification_token": "",
      "allow_from": []
    },
    "qq": {
      "enabled": false,
      "app_id": "",
      "app_secret": "",
      "allow_from": []
    }
  },
  "tools": {
    "web": {
      "brave": {
        "enabled": false,
        "api_key": "BSA...",
        "max_results": 5
      },
      "duckduckgo": {
        "enabled": true,
        "max_results": 5
      },
      "perplexity": {
        "enabled": false,
        "api_key": "",
        "max_results": 5
      },
      "searxng": {
        "enabled": false,
        "base_url": "http://localhost:8888",
        "max_results": 5
      }
    },
    "cron": {
      "exec_timeout_minutes": 5
    }
  },
  "heartbeat": {
    "enabled": true,
    "interval": 30
  }
}

📝 API Key Comparison

| Service | Pricing | Use Case |
| --- | --- | --- |
| OpenRouter | Free: 200K tokens/month | Multiple models (Claude, GPT-4, etc.) |
| Volcengine CodingPlan | ¥9.9/first month | Best for Chinese users, multiple SOTA models (Doubao, DeepSeek, etc.) |
| Zhipu | Free: 200K tokens/month | Suitable for Chinese users |
| Brave Search | $5/1000 queries | Web search functionality |
| SearXNG | Free (self-hosted) | Privacy-focused metasearch (70+ engines) |
| Groq | Free tier available | Fast inference (Llama, Mixtral) |
| Cerebras | Free tier available | Fast inference (Llama, Qwen, etc.) |
| LongCat | Free: up to 5M tokens/day | Fast inference |
| ModelScope | Free: 2000 requests/day | Inference (Qwen, GLM, DeepSeek, etc.) |
