Hoshina 795ee362ea refactor(events): emit agent runtime events directly
Remove the legacy EventKind/Event envelope mapping and let agent event emission build pkg/events.Event values directly.

Keep HookMeta as the shared hook metadata shape and preserve legacy observe string aliases by mapping them to runtime event kinds.

Validation: GOCACHE=/tmp/picoclaw-go-cache go test ./pkg/agent; make lint
2026-04-26 16:55:02 +08:00

# Hook System Guide
This document describes the hook system that is implemented in the current repository, not the older design draft.
The current implementation supports two mounting modes:
1. In-process hooks
2. Out-of-process process hooks (`JSON-RPC over stdio`)

The repository no longer ships standalone example source files. The Go and Python examples below are embedded directly in this document. If you want to use them, copy them into your own local files first.
## Supported Hook Types
| Type | Interface | Stage | Can modify data |
| --- | --- | --- | --- |
| Observer | `RuntimeEventObserver` | Runtime event bus broadcast | No |
| LLM interceptor | `LLMInterceptor` | `before_llm` / `after_llm` | Yes |
| Tool interceptor | `ToolInterceptor` | `before_tool` / `after_tool` | Yes |
| Tool approver | `ToolApprover` | `approve_tool` | No, returns allow/deny |

The currently exposed synchronous hook points are:
- `before_llm`
- `after_llm`
- `before_tool`
- `after_tool`
- `approve_tool`

Everything else is exposed as read-only events.
## Hook Actions
Hooks can return different actions to control the flow:
| Action | Applicable Stages | Effect |
| --- | --- | --- |
| `continue` | All interceptors | Pass through without modification |
| `modify` | `before_llm`, `after_llm`, `before_tool`, `after_tool` | Modify request/response and continue |
| `respond` | `before_tool` | Return a tool result directly, skip actual tool execution |
| `deny_tool` | `before_tool` | Deny tool execution, return error message |
| `abort_turn` | All interceptors | Abort the current turn |
| `hard_abort` | All interceptors | Force stop the entire agent loop |
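
The table can be turned into a small lookup that a hook author might use to catch an invalid action before returning it. This is only a sketch: the helper name `action_allowed` is hypothetical, and "All interceptors" is assumed to mean the four `before_*` / `after_*` stages.

```python
from __future__ import annotations

# Assumption: "All interceptors" covers the four before/after stages.
INTERCEPTOR_STAGES = frozenset({"before_llm", "after_llm", "before_tool", "after_tool"})

# Action -> stages where the table lists it as applicable.
ALLOWED_STAGES: dict[str, frozenset[str]] = {
    "continue": INTERCEPTOR_STAGES,
    "modify": INTERCEPTOR_STAGES,
    "respond": frozenset({"before_tool"}),
    "deny_tool": frozenset({"before_tool"}),
    "abort_turn": INTERCEPTOR_STAGES,
    "hard_abort": INTERCEPTOR_STAGES,
}


def action_allowed(stage: str, action: str) -> bool:
    """Return True when the table lists `action` as applicable to `stage`."""
    return stage in ALLOWED_STAGES.get(action, frozenset())
```

For example, `action_allowed("after_tool", "respond")` is false, because `respond` is only meaningful before the tool runs.
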
### The `respond` Action
The `respond` action is special: it allows a `before_tool` hook to provide the tool result directly, skipping the actual tool execution. This is useful for:
1. **Plugin tool injection**: External hooks can implement tools without registering them in the tool registry
2. **Tool result caching**: Return cached results for repeated tool calls
3. **Tool mocking**: Return mock results for testing purposes

When a hook returns `respond` with a `HookResult`, the agent loop:
1. Skips the actual tool execution
2. Uses the provided result as if the tool had executed
3. Continues the turn normally with the result

Example (Go in-process hook):
```go
func (h *MyHook) BeforeTool(
	ctx context.Context,
	call *agent.ToolCallHookRequest,
) (*agent.ToolCallHookRequest, agent.HookDecision, error) {
	if call.Tool == "my_plugin_tool" {
		next := call.Clone()
		next.HookResult = &tools.ToolResult{
			ForLLM:  "Plugin tool executed successfully",
			Silent:  false,
			IsError: false,
		}
		return next, agent.HookDecision{Action: agent.HookActionRespond}, nil
	}
	return call, agent.HookDecision{Action: agent.HookActionContinue}, nil
}
```
Example (Python process hook):
```python
def handle_before_tool(params: dict) -> dict:
    tool = params.get("tool", "")
    if tool == "my_plugin_tool":
        return {
            "action": "respond",
            "result": {
                "for_llm": "Plugin tool executed successfully",
                "silent": False,
                "is_error": False
            }
        }
    return {"action": "continue"}
```
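
The tool-mocking use case can be sketched the same way. The version below reuses the response shape from the Python example above and answers from a lookup table; the table `MOCK_RESULTS` and the tool name `get_weather` are hypothetical and exist only for illustration.

```python
from __future__ import annotations

from typing import Any

# Hypothetical mock table: tool name -> text handed back to the LLM.
MOCK_RESULTS: dict[str, str] = {
    "get_weather": "Sunny, 21 C (mocked)",
}


def handle_before_tool(params: dict[str, Any]) -> dict[str, Any]:
    """Answer mocked tools with `respond`; let every other tool run normally."""
    tool = params.get("tool", "")
    if tool in MOCK_RESULTS:
        return {
            "action": "respond",
            "result": {
                "for_llm": MOCK_RESULTS[tool],
                "silent": False,
                "is_error": False,
            },
        }
    return {"action": "continue"}
```

Tools that are not in the table fall through to `continue`, so real tool execution is unaffected.
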
## Execution Order
`HookManager` sorts hooks like this:
1. In-process hooks first
2. Process hooks second
3. Lower `priority` first within the same source
4. Name order as the final tie-breaker
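
The ordering rules above amount to a three-part sort key. The following Python sketch illustrates the rule only; it is not the `HookManager` implementation, and the `HookEntry` type and the source labels are assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass

# In-process hooks sort before process hooks (labels chosen for this sketch).
SOURCE_RANK = {"in_process": 0, "process": 1}


@dataclass(frozen=True)
class HookEntry:
    name: str
    source: str  # "in_process" or "process"
    priority: int


def execution_order(hooks: list[HookEntry]) -> list[str]:
    """Sort by source, then ascending priority, then name."""
    ordered = sorted(hooks, key=lambda h: (SOURCE_RANK[h.source], h.priority, h.name))
    return [h.name for h in ordered]


hooks = [
    HookEntry("py_review_gate", "process", 100),
    HookEntry("audit", "process", 10),
    HookEntry("example_logger", "in_process", 50),
    HookEntry("a_logger", "in_process", 50),
]
print(execution_order(hooks))
# ['a_logger', 'example_logger', 'audit', 'py_review_gate']
```

Note that a process hook with priority 10 still runs after an in-process hook with priority 50, because the source is compared first.
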
## Timeouts
Global defaults live under `hooks.defaults`:
- `observer_timeout_ms`
- `interceptor_timeout_ms`
- `approval_timeout_ms`

Note: the current implementation does not support per-process-hook `timeout_ms`. Timeouts are global defaults.
## Quick Start
If your first goal is simply to prove that the hook flow works and observe real requests, the easiest path is the Python process-hook example below:
1. Enable `hooks.enabled`
2. Save the Python example from this document to a local file, for example `/tmp/review_gate.py`
3. Set `PICOCLAW_HOOK_LOG_FILE`
4. Restart the gateway
5. Watch the log file with `tail -f`

Example:
```json
{
  "hooks": {
    "enabled": true,
    "processes": {
      "py_review_gate": {
        "enabled": true,
        "priority": 100,
        "transport": "stdio",
        "command": [
          "python3",
          "/tmp/review_gate.py"
        ],
        "observe": [
          "agent.tool.exec_start",
          "agent.tool.exec_end",
          "agent.tool.exec_skipped"
        ],
        "intercept": [
          "before_tool",
          "approve_tool"
        ],
        "env": {
          "PICOCLAW_HOOK_LOG_FILE": "/tmp/picoclaw-hook-review-gate.log"
        }
      }
    }
  }
}
```
Watch it with:
```bash
tail -f /tmp/picoclaw-hook-review-gate.log
```
If you are developing PicoClaw itself rather than only validating the protocol, continue with the Go in-process example as well.
## What The Two Examples Are For
- Go in-process example: best for validating the host-side hook chain and understanding `MountHook()` plus the synchronous stages
- Python process example: best for understanding the `JSON-RPC over stdio` protocol and verifying the message flow between PicoClaw and an external process

Both examples are intentionally safe: they only log, never rewrite, and never deny.
## Go In-Process Example
The following is a minimal logging hook for in-process use. It implements:
1. `RuntimeEventObserver`
2. `LLMInterceptor`
3. `ToolInterceptor`
4. `ToolApprover`

It only records activity. It does not rewrite requests or reject tools.
You can save it as your own Go file, for example `pkg/myhooks/example_logger.go`:
```go
package myhooks

import (
	"context"
	"encoding/json"
	"os"
	"path/filepath"
	"strings"
	"sync"
	"time"

	"github.com/sipeed/picoclaw/pkg/agent"
	runtimeevents "github.com/sipeed/picoclaw/pkg/events"
	"github.com/sipeed/picoclaw/pkg/logger"
)

type ExampleLoggerHookOptions struct {
	LogFile   string `json:"log_file,omitempty"`
	LogEvents bool   `json:"log_events,omitempty"`
}

type ExampleLoggerHook struct {
	logFile   string
	logEvents bool
	mu        sync.Mutex
}

func NewExampleLoggerHook(opts ExampleLoggerHookOptions) *ExampleLoggerHook {
	return &ExampleLoggerHook{
		logFile:   strings.TrimSpace(opts.LogFile),
		logEvents: opts.LogEvents,
	}
}

func (h *ExampleLoggerHook) OnRuntimeEvent(ctx context.Context, evt runtimeevents.Event) error {
	_ = ctx
	if h == nil || !h.logEvents {
		return nil
	}
	h.record("event", evt.Scope, map[string]any{
		"event":   evt.Kind.String(),
		"payload": evt.Payload,
	}, nil)
	return nil
}

func (h *ExampleLoggerHook) BeforeLLM(
	ctx context.Context,
	req *agent.LLMHookRequest,
) (*agent.LLMHookRequest, agent.HookDecision, error) {
	_ = ctx
	h.record("before_llm", req.Meta, req, agent.HookDecision{Action: agent.HookActionContinue})
	return req, agent.HookDecision{Action: agent.HookActionContinue}, nil
}

func (h *ExampleLoggerHook) AfterLLM(
	ctx context.Context,
	resp *agent.LLMHookResponse,
) (*agent.LLMHookResponse, agent.HookDecision, error) {
	_ = ctx
	h.record("after_llm", resp.Meta, resp, agent.HookDecision{Action: agent.HookActionContinue})
	return resp, agent.HookDecision{Action: agent.HookActionContinue}, nil
}

func (h *ExampleLoggerHook) BeforeTool(
	ctx context.Context,
	call *agent.ToolCallHookRequest,
) (*agent.ToolCallHookRequest, agent.HookDecision, error) {
	_ = ctx
	h.record("before_tool", call.Meta, call, agent.HookDecision{Action: agent.HookActionContinue})
	return call, agent.HookDecision{Action: agent.HookActionContinue}, nil
}

func (h *ExampleLoggerHook) AfterTool(
	ctx context.Context,
	result *agent.ToolResultHookResponse,
) (*agent.ToolResultHookResponse, agent.HookDecision, error) {
	_ = ctx
	h.record("after_tool", result.Meta, result, agent.HookDecision{Action: agent.HookActionContinue})
	return result, agent.HookDecision{Action: agent.HookActionContinue}, nil
}

func (h *ExampleLoggerHook) ApproveTool(
	ctx context.Context,
	req *agent.ToolApprovalRequest,
) (agent.ApprovalDecision, error) {
	_ = ctx
	decision := agent.ApprovalDecision{Approved: true}
	h.record("approve_tool", req.Meta, req, decision)
	return decision, nil
}

func (h *ExampleLoggerHook) record(stage string, refs any, payload any, decision any) {
	logger.InfoCF("hooks", "Example hook observed", map[string]any{
		"stage": stage,
	})

	if h == nil || h.logFile == "" {
		return
	}

	entry := map[string]any{
		"ts":       time.Now().UTC(),
		"stage":    stage,
		"refs":     refs,
		"payload":  payload,
		"decision": decision,
	}
	body, err := json.Marshal(entry)
	if err != nil {
		logger.WarnCF("hooks", "Example hook log encode failed", map[string]any{
			"stage": stage,
			"error": err.Error(),
		})
		return
	}

	h.mu.Lock()
	defer h.mu.Unlock()

	if dir := filepath.Dir(h.logFile); dir != "" && dir != "." {
		if err := os.MkdirAll(dir, 0o755); err != nil {
			logger.WarnCF("hooks", "Example hook log mkdir failed", map[string]any{
				"stage": stage,
				"path":  h.logFile,
				"error": err.Error(),
			})
			return
		}
	}

	file, err := os.OpenFile(h.logFile, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		logger.WarnCF("hooks", "Example hook log open failed", map[string]any{
			"stage": stage,
			"path":  h.logFile,
			"error": err.Error(),
		})
		return
	}
	defer func() { _ = file.Close() }()

	if _, err := file.Write(append(body, '\n')); err != nil {
		logger.WarnCF("hooks", "Example hook log write failed", map[string]any{
			"stage": stage,
			"path":  h.logFile,
			"error": err.Error(),
		})
	}
}
```
### Mounting It In Code
If code mounting is enough, call this after `AgentLoop` is initialized:
```go
hook := myhooks.NewExampleLoggerHook(myhooks.ExampleLoggerHookOptions{
	LogFile:   "/tmp/picoclaw-hook-example-logger.log",
	LogEvents: true,
})
if err := al.MountHook(agent.NamedHook("example-logger", hook)); err != nil {
	panic(err)
}
```
### If You Also Want Config Mounting
The hook system supports builtin hooks, but that requires you to compile the factory into your binary. In practice, that means you need registration code like this alongside the hook definition above:
```go
package myhooks

import (
	"context"
	"encoding/json"
	"fmt"

	"github.com/sipeed/picoclaw/pkg/agent"
	"github.com/sipeed/picoclaw/pkg/config"
)

func init() {
	if err := agent.RegisterBuiltinHook("example_logger", func(
		ctx context.Context,
		spec config.BuiltinHookConfig,
	) (any, error) {
		_ = ctx
		var opts ExampleLoggerHookOptions
		if len(spec.Config) > 0 {
			if err := json.Unmarshal(spec.Config, &opts); err != nil {
				return nil, fmt.Errorf("decode example_logger config: %w", err)
			}
		}
		return NewExampleLoggerHook(opts), nil
	}); err != nil {
		panic(err)
	}
}
```
Only after you register that builtin will the following config work:
```json
{
  "hooks": {
    "enabled": true,
    "builtins": {
      "example_logger": {
        "enabled": true,
        "priority": 10,
        "config": {
          "log_file": "/tmp/picoclaw-hook-example-logger.log",
          "log_events": true
        }
      }
    }
  }
}
```
### How To Observe It
- If `log_file` is set, each hook call is appended as JSON Lines
- If `log_file` is not set, the hook still writes summaries to the gateway log
- Requests that only hit the LLM path usually show `before_llm` and `after_llm`
- Requests that trigger tools usually also show `before_tool`, `approve_tool`, and `after_tool`
- If `log_events=true`, you will also see `event`

Typical log lines:
```json
{"ts":"2026-03-21T14:10:00Z","stage":"before_tool","refs":{"session_key":"session-1"},"payload":{"tool":"echo_text","arguments":{"text":"hello"}},"decision":{"action":"continue"}}
{"ts":"2026-03-21T14:10:00Z","stage":"approve_tool","refs":{"session_key":"session-1"},"payload":{"tool":"echo_text","arguments":{"text":"hello"}},"decision":{"approved":true}}
```
If you only see `before_llm` and `after_llm`, that usually means the request did not trigger any tool call, not that the hook failed to mount.
## Python Process-Hook Example
The following script is a minimal process-hook example. It uses only the Python standard library and supports:
1. `hook.hello`
2. `hook.runtime_event`
3. `hook.before_tool`
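4. `hook.approve_tool`

It only records activity. It does not rewrite or deny anything. The script also answers `hook.before_llm`, `hook.after_llm`, and `hook.after_tool` with `continue`, although the sample configuration does not subscribe to those stages.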
Save it to any local path, for example `/tmp/review_gate.py`:
```python
#!/usr/bin/env python3
from __future__ import annotations

import json
import os
import signal
import sys
from datetime import datetime, timezone
from typing import Any

LOG_EVENTS = os.getenv("PICOCLAW_HOOK_LOG_EVENTS", "1").lower() not in {"0", "false", "no"}
LOG_FILE = os.getenv("PICOCLAW_HOOK_LOG_FILE", "").strip()


def append_log(entry: dict[str, Any]) -> None:
    if not LOG_FILE:
        return

    payload = {
        "ts": datetime.now(timezone.utc).isoformat(),
        **entry,
    }

    try:
        log_dir = os.path.dirname(LOG_FILE)
        if log_dir:
            os.makedirs(log_dir, exist_ok=True)
        with open(LOG_FILE, "a", encoding="utf-8") as handle:
            handle.write(json.dumps(payload, ensure_ascii=True) + "\n")
    except OSError as exc:
        log_stderr(f"failed to write hook log file {LOG_FILE}: {exc}")


def send_response(message_id: int, result: Any | None = None, error: str | None = None) -> None:
    payload: dict[str, Any] = {
        "jsonrpc": "2.0",
        "id": message_id,
    }
    if error is not None:
        payload["error"] = {"code": -32000, "message": error}
    else:
        payload["result"] = result if result is not None else {}

    append_log({
        "direction": "out",
        "id": message_id,
        "response": payload.get("result"),
        "error": payload.get("error"),
    })

    try:
        sys.stdout.write(json.dumps(payload, ensure_ascii=True) + "\n")
        sys.stdout.flush()
    except BrokenPipeError:
        raise SystemExit(0) from None


def log_stderr(message: str) -> None:
    try:
        sys.stderr.write(message + "\n")
        sys.stderr.flush()
    except BrokenPipeError:
        raise SystemExit(0) from None


def handle_shutdown_signal(signum: int, _frame: Any) -> None:
    raise KeyboardInterrupt(f"received signal {signum}")


def handle_before_tool(params: dict[str, Any]) -> dict[str, Any]:
    _ = params
    return {"action": "continue"}


def handle_approve_tool(params: dict[str, Any]) -> dict[str, Any]:
    _ = params
    return {"approved": True}


def handle_request(method: str, params: dict[str, Any]) -> dict[str, Any]:
    if method == "hook.hello":
        return {"ok": True, "name": "python-review-gate"}
    if method == "hook.before_tool":
        return handle_before_tool(params)
    if method == "hook.approve_tool":
        return handle_approve_tool(params)
    if method == "hook.before_llm":
        return {"action": "continue"}
    if method == "hook.after_llm":
        return {"action": "continue"}
    if method == "hook.after_tool":
        return {"action": "continue"}
    raise KeyError(f"method not found: {method}")


def main() -> int:
    try:
        for raw_line in sys.stdin:
            line = raw_line.strip()
            if not line:
                continue

            try:
                message = json.loads(line)
            except json.JSONDecodeError as exc:
                log_stderr(f"failed to decode request: {exc}")
                append_log({
                    "direction": "in",
                    "decode_error": str(exc),
                    "raw": line,
                })
                continue

            method = message.get("method")
            message_id = message.get("id", 0)
            params = message.get("params") or {}
            if not isinstance(params, dict):
                params = {}

            append_log({
                "direction": "in",
                "id": message_id,
                "method": method,
                "params": params,
                "notification": not bool(message_id),
            })

            if not message_id:
                if method == "hook.runtime_event" and LOG_EVENTS:
                    log_stderr(f"observed event: {params.get('kind')}")
                continue

            try:
                result = handle_request(str(method or ""), params)
            except KeyError as exc:
                send_response(int(message_id), error=str(exc))
                continue
            except Exception as exc:
                send_response(int(message_id), error=f"unexpected error: {exc}")
                continue

            send_response(int(message_id), result=result)
    except KeyboardInterrupt:
        return 0

    return 0


if __name__ == "__main__":
    signal.signal(signal.SIGINT, handle_shutdown_signal)
    signal.signal(signal.SIGTERM, handle_shutdown_signal)
    raise SystemExit(main())
```
### Configuration
```json
{
  "hooks": {
    "enabled": true,
    "processes": {
      "py_review_gate": {
        "enabled": true,
        "priority": 100,
        "transport": "stdio",
        "command": [
          "python3",
          "/abs/path/to/review_gate.py"
        ],
        "observe": [
          "agent.tool.exec_start",
          "agent.tool.exec_end",
          "agent.tool.exec_skipped"
        ],
        "intercept": [
          "before_tool",
          "approve_tool"
        ],
        "env": {
          "PICOCLAW_HOOK_LOG_FILE": "/tmp/picoclaw-hook-review-gate.log"
        }
      }
    }
  }
}
```
### Environment Variables
- `PICOCLAW_HOOK_LOG_EVENTS`: whether to write `hook.runtime_event` summaries to `stderr`; enabled by default
- `PICOCLAW_HOOK_LOG_FILE`: path to an external log file. When set, the script appends inbound hook requests, notifications, and outbound responses as JSON Lines

Note: `PICOCLAW_HOOK_LOG_FILE` has no default. If you do not set it, the script does not write any file logs.
### How To Confirm It Received Hooks
Watch two places:
- Gateway logs: useful for confirming that the host successfully started the process and for seeing event summaries written to `stderr`
- `PICOCLAW_HOOK_LOG_FILE`: useful for seeing the exact requests the script received and the exact responses it returned

Typical interpretation:
- Only `hook.hello`: the process started and completed the handshake, but no business hook request has arrived yet
- `hook.runtime_event`: the `observe` configuration is working
- `hook.before_tool`: the `intercept: ["before_tool", ...]` configuration is working
- `hook.approve_tool`: the approval hook path is working

Because this example never rewrites or denies, the expected responses look like:
```json
{"direction":"out","id":7,"response":{"action":"continue"},"error":null}
{"direction":"out","id":8,"response":{"approved":true},"error":null}
```
A complete sample:
```json
{"ts":"2026-03-21T14:12:00+00:00","direction":"in","id":1,"method":"hook.hello","params":{"name":"py_review_gate","version":1,"modes":["observe","tool","approve"]},"notification":false}
{"ts":"2026-03-21T14:12:00+00:00","direction":"out","id":1,"response":{"ok":true,"name":"python-review-gate"},"error":null}
{"ts":"2026-03-21T14:12:05+00:00","direction":"in","id":0,"method":"hook.runtime_event","params":{"kind":"agent.tool.exec_start"},"notification":true}
{"ts":"2026-03-21T14:12:05+00:00","direction":"in","id":7,"method":"hook.before_tool","params":{"tool":"echo_text","arguments":{"text":"hello"}},"notification":false}
{"ts":"2026-03-21T14:12:05+00:00","direction":"out","id":7,"response":{"action":"continue"},"error":null}
```
Additional notes:
- Timestamps are UTC
- `notification=true` means it was a notification such as `hook.runtime_event`, which does not expect a response
- `id` increases within a single hook process; if the process restarts, the counter starts over
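
When reading a longer log, it can help to pair each inbound request with its outbound response. The helper below is a rough sketch that assumes the JSON Lines format written by the example script; the function name `unanswered_requests` is hypothetical.

```python
from __future__ import annotations

import json
from typing import Any, Iterable


def unanswered_requests(lines: Iterable[str]) -> list[dict[str, Any]]:
    """Return inbound requests that have no outbound entry with the same id."""
    unanswered: list[dict[str, Any]] = []
    pending: dict[int, dict[str, Any]] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("direction") == "in":
            # Notifications and decode-error entries never get a reply.
            if entry.get("notification") or "method" not in entry:
                continue
            previous = pending.pop(entry["id"], None)
            if previous is not None:
                # The id was reused while still pending, which suggests the
                # hook process restarted before answering the earlier request.
                unanswered.append(previous)
            pending[entry["id"]] = entry
        elif entry.get("direction") == "out":
            pending.pop(entry.get("id"), None)
    return unanswered + list(pending.values())
```

Passing an open log file (for example the path set in `PICOCLAW_HOOK_LOG_FILE`) as `lines` works, because file objects iterate line by line. An empty result means every logged request was answered.
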
## Process-Hook Protocol
Current process hooks use `JSON-RPC over stdio`:
- PicoClaw starts the external process
- Requests and responses are exchanged as one JSON message per line
- `hook.runtime_event` is a notification and does not need a response
- `hook.before_llm`, `hook.after_llm`, `hook.before_tool`, `hook.after_tool`, and `hook.approve_tool` are request/response calls

The host does not currently accept new RPCs initiated by the process hook. In practice, that means an external hook can only respond to PicoClaw calls; it cannot call back into the host to send channel messages.
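
To make the line framing concrete, the sketch below builds and parses messages the way a test harness for a process hook might. It is not PicoClaw's host code. The helper names are hypothetical, and omitting `id` on notifications is an assumption based on JSON-RPC 2.0 rather than something this document specifies.

```python
from __future__ import annotations

import json
from typing import Any


def encode_request(message_id: int, method: str, params: dict[str, Any]) -> str:
    """One request per line; the hook is expected to answer with the same id."""
    message = {"jsonrpc": "2.0", "id": message_id, "method": method, "params": params}
    return json.dumps(message) + "\n"


def encode_notification(method: str, params: dict[str, Any]) -> str:
    """Notifications carry no id (assumption), so no reply is expected."""
    return json.dumps({"jsonrpc": "2.0", "method": method, "params": params}) + "\n"


def decode_response(line: str) -> tuple[int, Any]:
    """Return (id, result), or raise RuntimeError if the hook reported an error."""
    message = json.loads(line)
    if "error" in message:
        raise RuntimeError(message["error"].get("message", "unknown hook error"))
    return message["id"], message.get("result")
```

Writing the output of `encode_request(7, "hook.before_tool", {...})` to the stdin of the example script and feeding the line it prints to `decode_response` should yield `(7, {"action": "continue"})`.
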
## Configuration Fields
### `hooks.builtins.<name>`
- `enabled`
- `priority`
- `config`
### `hooks.processes.<name>`
- `enabled`
- `priority`
- `transport`: currently only `stdio` is supported
- `command`
- `dir`
- `env`
- `observe`
- `intercept`
## Troubleshooting
If a hook looks like it is not firing, check these in order:
1. `hooks.enabled`
2. Whether the target builtin or process hook is `enabled`
3. Whether the process-hook `command` path is correct
4. Whether you are watching the correct log file
5. Whether the current request actually reached the stage you care about
6. Whether `observe` or `intercept` contains the hook point you want

A practical minimal troubleshooting pair is:
- Use the Python process-hook example from this document to validate the external protocol
- Use the Go in-process example from this document to validate the host-side chain

If the Python side shows `hook.hello` but no business hook requests, the protocol is usually fine; the current request simply did not trigger the stage you expected.
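
Several of the checks in the list above can be run mechanically against a loaded configuration. The following sketch covers checks 1, 2, 3, and 6 for one process hook. The function name `check_process_hook` is hypothetical, and the config is assumed to be already parsed into a dictionary with the field names shown in this document.

```python
from __future__ import annotations

import os
from typing import Any


def check_process_hook(config: dict[str, Any], name: str, hook_point: str) -> list[str]:
    """Return human-readable problems for one `hooks.processes.<name>` entry."""
    hooks = config.get("hooks") or {}
    problems: list[str] = []
    if not hooks.get("enabled"):
        problems.append("hooks.enabled is not true")
    spec = (hooks.get("processes") or {}).get(name)
    if spec is None:
        return problems + [f"hooks.processes.{name} is missing"]
    if not spec.get("enabled"):
        problems.append(f"hooks.processes.{name}.enabled is not true")
    command = spec.get("command") or []
    if not command:
        problems.append("command is empty")
    # Only absolute paths are checked; bare names such as "python3" are skipped.
    for part in command:
        if os.path.isabs(part) and not os.path.exists(part):
            problems.append(f"command path does not exist: {part}")
    listed = list(spec.get("observe") or []) + list(spec.get("intercept") or [])
    if hook_point not in listed:
        problems.append(f"{hook_point} is not listed under observe or intercept")
    return problems
```

An empty list means the static checks passed; the remaining items, such as watching the right log file and confirming that the request reached the stage, still need a manual look.
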
## Scope And Limits
The current hook system is best suited for:
- LLM request rewriting
- Tool argument normalization
- Pre-execution tool approval
- Auditing and observability

It is not yet well suited for:
- External hooks actively sending channel messages
- Suspending a turn and waiting for human approval replies
- Full inbound/outbound message interception across the whole platform

If you want a real human approval workflow, use hooks as the approval entry point and keep the state machine plus channel interaction in a separate `ApprovalManager`.