mirror of https://github.com/sipeed/picoclaw.git synced 2026-05-25 16:00:35 +00:00


Hoshina e613258fa5 feat(gateway): publish lifecycle runtime events

Emit gateway.start, gateway.ready, and gateway.shutdown on the shared runtime event bus, while keeping reload events on the same helper path.

Update subturn architecture docs to refer to runtime event kinds instead of the removed agent EventBus names.

Validation: GOCACHE=/tmp/picoclaw-go-cache go test ./pkg/gateway ./pkg/events; GOCACHE=/tmp/picoclaw-go-cache go test ./pkg/bus ./pkg/channels ./pkg/mcp ./pkg/tools/integration ./pkg/events ./pkg/gateway; make lint

2026-04-26 17:02:48 +08:00


🔄 SubTurn Mechanism


Overview

The SubTurn mechanism is a core feature in PicoClaw that allows tools to spawn isolated, nested agent loops to handle complex sub-tasks.

By using a SubTurn, an agent can break down a problem and run a separate LLM invocation in an independent, ephemeral session. This ensures that intermediate reasoning, background tasks, or sub-agent outputs do not pollute the main conversation history.

Core Capabilities

- Context Isolation: Each SubTurn uses an ephemeralSessionStore. Its message history does not leak into the parent task and is destroyed upon completion. The ephemeral session holds at most 50 messages; older messages are automatically truncated when this limit is reached.
- Depth & Concurrency Limits: Prevents infinite loops and resource exhaustion.
  - Maximum Depth: Up to 3 nested levels.
  - Maximum Concurrency: Up to 5 concurrent sub-turns per parent turn (managed via a semaphore with a 30-second timeout).
- Context Protection: Supports soft context limits (MaxContextRunes). It proactively truncates old messages (while preserving system prompts and recent context) before hitting the provider's hard context window limit.
- Error Recovery: Automatically detects and recovers from provider context length exceeded errors and truncation errors by compressing history and retrying.
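
The 50-message cap on the ephemeral session can be pictured as a bounded append. This is a minimal sketch, not PicoClaw's `ephemeralSessionStore`: the `Message` type, `boundedHistory`, and its `Append` method are hypothetical names, and only the limit of 50 comes from the text above.

```go
package main

import "fmt"

// maxEphemeralHistorySize mirrors the 50-message cap described above.
const maxEphemeralHistorySize = 50

// Message is a hypothetical stand-in for the real chat message type.
type Message struct {
	Role    string
	Content string
}

// boundedHistory is an illustrative in-memory history, not PicoClaw's type.
type boundedHistory struct {
	messages []Message
}

// Append adds a message and drops the oldest entries beyond the cap.
func (s *boundedHistory) Append(m Message) {
	s.messages = append(s.messages, m)
	if extra := len(s.messages) - maxEphemeralHistorySize; extra > 0 {
		// Copy so the old backing array can be garbage collected.
		s.messages = append([]Message(nil), s.messages[extra:]...)
	}
}

func main() {
	s := &boundedHistory{}
	for i := 0; i < 60; i++ {
		s.Append(Message{Role: "user", Content: fmt.Sprintf("msg-%d", i)})
	}
	fmt.Println(len(s.messages), s.messages[0].Content) // prints "50 msg-10"
}
```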

Configuration (`SubTurnConfig`)

When spawning a SubTurn, you must provide a SubTurnConfig:

| Field | Type | Description |
|---|---|---|
| `Model` | `string` | The LLM model to use for the sub-turn (e.g., `gpt-4o-mini`). Required. |
| `Tools` | `[]tools.Tool` | Tools granted to the sub-turn. If empty, it inherits the parent's tools. |
| `SystemPrompt` | `string` | The task description for the sub-turn. Sent as the first user message to the LLM (not as a system prompt override). |
| `ActualSystemPrompt` | `string` | Optional explicit system prompt to replace the agent's default. Leave empty to inherit the parent agent's system prompt. |
| `MaxTokens` | `int` | Maximum tokens for the generated response. |
| `Async` | `bool` | Controls the result delivery mode (Synchronous vs. Asynchronous). |
| `Critical` | `bool` | If `true`, the sub-turn continues running even if the parent finishes gracefully. |
| `Timeout` | `time.Duration` | Maximum execution time (default: 5 minutes). |
| `MaxContextRunes` | `int` | Soft context limit. `0` = auto-calculate (75% of model's context window, recommended), `-1` = no limit (disable soft truncation, rely only on hard context error recovery), `>0` = use specified rune limit. |
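
The three `MaxContextRunes` cases can be read as a small decision function. The sketch below is illustrative only: `resolveSoftLimit` is a hypothetical helper, it assumes the model's context window is already expressed in runes, and it treats every negative value like `-1`.

```go
package main

import "fmt"

// resolveSoftLimit is a hypothetical helper that maps a MaxContextRunes value
// to an effective soft limit. windowRunes is the model's context window
// expressed in runes (an assumption of this sketch).
func resolveSoftLimit(maxContextRunes, windowRunes int) (limit int, enabled bool) {
	switch {
	case maxContextRunes < 0: // -1: soft truncation disabled
		return 0, false
	case maxContextRunes == 0: // auto: 75% of the window
		return windowRunes * 3 / 4, true
	default: // explicit rune limit
		return maxContextRunes, true
	}
}

func main() {
	for _, v := range []int{0, -1, 20000} {
		limit, enabled := resolveSoftLimit(v, 128000)
		fmt.Println(v, limit, enabled)
	}
	// prints:
	// 0 96000 true
	// -1 0 false
	// 20000 20000 true
}
```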

Note: The Async flag does not make the call non-blocking. It only controls whether the result is also delivered to the parent's pendingResults channel. Both modes block the caller until the sub-turn completes. For true non-blocking execution, the caller must spawn the sub-turn in a separate goroutine.
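
One way to get the non-blocking behavior mentioned in the note is to wrap the blocking call in a goroutine and hand back a channel. The sketch below uses a fake spawner so it runs on its own; `Result`, `spawnFunc`, `outcome`, and `spawnInBackground` are hypothetical names, not part of PicoClaw.

```go
package main

import (
	"context"
	"fmt"
)

// Result is a hypothetical stand-in for *tools.ToolResult.
type Result struct{ ForLLM string }

// spawnFunc stands in for a blocking call such as SpawnSubTurn.
type spawnFunc func(ctx context.Context) (*Result, error)

// outcome pairs the result with its error so both travel over one channel.
type outcome struct {
	res *Result
	err error
}

// spawnInBackground runs the blocking call in its own goroutine and returns
// immediately. The channel is buffered so the goroutine never blocks on send,
// even if the caller stops listening.
func spawnInBackground(ctx context.Context, spawn spawnFunc) <-chan outcome {
	ch := make(chan outcome, 1)
	go func() {
		res, err := spawn(ctx)
		ch <- outcome{res: res, err: err}
	}()
	return ch
}

// runDemo uses a fake spawner in place of a real sub-turn.
func runDemo() (string, error) {
	fake := func(ctx context.Context) (*Result, error) {
		return &Result{ForLLM: "scan complete"}, nil
	}
	done := spawnInBackground(context.Background(), fake)
	// The caller is free to do other work here before collecting the result.
	out := <-done
	if out.err != nil {
		return "", out.err
	}
	return out.res.ForLLM, nil
}

func main() {
	fmt.Println(runDemo()) // prints "scan complete <nil>"
}
```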

Execution Modes

Synchronous (`Async: false`)

This is the standard mode where the caller needs the result immediately to proceed.

- The caller blocks until the sub-turn completes.
- The result is returned only via the function return value.
- It is not delivered to the parent's pendingResults channel.

Example:

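```go
cfg := agent.SubTurnConfig{
    Model:        "gpt-4o-mini",
    SystemPrompt: "Analyze the provided codebase...",
    Async:        false,
}
result, err := agent.SpawnSubTurn(ctx, cfg)
if err != nil {
    // Handle the error, e.g., ErrDepthLimitExceeded or ErrConcurrencyTimeout
    return err
}
// Process result.ForLLM immediately
```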

Asynchronous (`Async: true`)

Used for background or parallel work whose result the parent turn collects later. The call itself still blocks until the sub-turn completes, so "fire-and-forget" use requires running it in a separate goroutine.

- The result is delivered to the parent turn's pendingResults channel.
- The result is also returned via the function return value (for consistency).
- The parent's Agent Loop polls this channel in subsequent iterations and automatically injects the results into the ongoing conversation context as [SubTurn Result].

Example:

```go
cfg := agent.SubTurnConfig{
    Model:        "gpt-4o-mini",
    SystemPrompt: "Run a background security scan...",
    Async:        true,
}
result, err := agent.SpawnSubTurn(ctx, cfg)
if err != nil {
    return err
}
// result is available here; it is also injected into the parent loop later via the channel
```

Error Recovery and Retries

SubTurns implement automatic retry mechanisms for transient errors:

| Error Type | Max Retries | Recovery Action |
|---|---|---|
| Context Length Exceeded | 2 | Force compress history and retry |
| Response Truncated (`finish_reason="truncated"`) | 2 | Inject recovery prompt and retry |

Truncation Recovery

When the LLM response is truncated (finish_reason="truncated"), SubTurn automatically:

- Detects the truncation from turnState.lastFinishReason
- Injects a recovery prompt: "Your previous response was truncated due to length. Please provide a shorter, complete response..."
- Retries up to 2 times

Context Error Recovery

When the provider returns a context length error (e.g., context_length_exceeded):

- Force compresses the message history (drops oldest 50% of conversation)
- Retries with the compressed context
- Up to 2 retries before failing
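
The compress-and-retry steps above can be sketched as a bounded loop. This is an illustration under stated assumptions rather than PicoClaw's implementation: `errContextLength`, `compressHistory`, `callWithRecovery`, and `fakeProvider` are hypothetical names, and the history is reduced to plain strings.

```go
package main

import (
	"errors"
	"fmt"
)

// errContextLength is a hypothetical sentinel for the provider's
// context_length_exceeded error.
var errContextLength = errors.New("context_length_exceeded")

// maxContextRetries mirrors the retry budget from the table above.
const maxContextRetries = 2

// compressHistory drops the oldest half of the conversation.
func compressHistory(history []string) []string {
	return append([]string(nil), history[len(history)/2:]...)
}

// callWithRecovery retries on context-length errors, compressing the history
// before each retry, and gives up after maxContextRetries retries.
func callWithRecovery(history []string, call func([]string) (string, error)) (string, error) {
	for attempt := 0; ; attempt++ {
		out, err := call(history)
		if !errors.Is(err, errContextLength) || attempt == maxContextRetries {
			return out, err
		}
		history = compressHistory(history)
	}
}

// fakeProvider rejects any history longer than 3 messages.
func fakeProvider(history []string) (string, error) {
	if len(history) > 3 {
		return "", errContextLength
	}
	return fmt.Sprintf("ok with %d messages", len(history)), nil
}

func main() {
	// 8 -> 4 -> 2 messages: succeeds on the second retry.
	fmt.Println(callWithRecovery(make([]string, 8), fakeProvider))
	// 16 -> 8 -> 4 messages: still too long after two retries, so it fails.
	fmt.Println(callWithRecovery(make([]string, 16), fakeProvider))
}
```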

Lifecycle and Cancellation

SubTurns operate within an independent context but maintain a structural link to their parent turnState.

Graceful Parent Finish

When the parent task finishes naturally (Finish(false)):

- Non-critical sub-turns receive a signal to exit gracefully without throwing an error.
- Critical (Critical: true) sub-turns continue running in the background. Once finished, their results are emitted as Orphan Results so the data is not lost.

Hard Abort

When the parent task is forcefully aborted (e.g., user interrupts with /stop):

- A cascading cancellation is triggered, instantly terminating all child and grandchild sub-turns.
- The root turn's session history rolls back to the snapshot taken at turn start (initialHistoryLength), preventing dirty context. SubTurns are not affected by this rollback as they use ephemeral sessions that are discarded anyway.
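
The hard-abort behavior can be pictured as a walk over the turn tree followed by a history truncation. The sketch below is an assumption-laden illustration: `turn`, `newTurn`, `hardAbort`, and `rollback` are hypothetical names, locking is omitted, and the cascade follows the structural parent-child link because each turn's context is created independently.

```go
package main

import (
	"context"
	"fmt"
)

// turn is a hypothetical stand-in for turnState. Locking is omitted for
// brevity; a concurrent implementation would guard children and history.
type turn struct {
	ctx                  context.Context
	cancel               context.CancelFunc
	children             []*turn
	history              []string
	initialHistoryLength int
}

// newTurn creates a turn with its own cancellable context and links it to
// its parent, if any.
func newTurn(parent *turn, history []string) *turn {
	ctx, cancel := context.WithCancel(context.Background())
	t := &turn{ctx: ctx, cancel: cancel, history: history, initialHistoryLength: len(history)}
	if parent != nil {
		parent.children = append(parent.children, t)
	}
	return t
}

// hardAbort cancels this turn and, recursively, every descendant.
func (t *turn) hardAbort() {
	t.cancel()
	for _, c := range t.children {
		c.hardAbort()
	}
}

// rollback discards everything appended since the turn started.
func (t *turn) rollback() {
	t.history = t.history[:t.initialHistoryLength]
}

// runDemo aborts a three-level tree and rolls back the root history.
func runDemo() (grandchildCancelled bool, historyLen int) {
	root := newTurn(nil, []string{"earlier message"})
	child := newTurn(root, nil)
	grandchild := newTurn(child, nil)
	root.history = append(root.history, "partial output")
	root.hardAbort()
	root.rollback()
	return grandchild.ctx.Err() != nil, len(root.history)
}

func main() {
	fmt.Println(runDemo()) // prints "true 1"
}
```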

Agent Loop Integration

Message Routing and Steering

When a message enters the Run() loop, the agent determines whether to start a new worker or enqueue the message into the steering queue:

- If no active turn exists for the message's session key, the session is atomically reserved and a worker goroutine is spawned. The worker processes the full turn lifecycle: processMessage → tool execution → steering drain → Continue for queued messages.
- If an active turn already exists for the same session, the message is enqueued directly into that session's steering queue. It will be picked up by the existing worker's steering drain loop.

This ensures that:

- Messages from different sessions are processed in parallel (up to max_parallel_turns concurrent workers)
- Messages from the same session are strictly serialized: they go to the steering queue and are processed sequentially within the active turn
- No background drain goroutine is needed; steering is handled by the worker itself after processing

Pending Result Polling

The agent loop polls for async SubTurn results at two points per iteration, plus once more at the end of the turn:

- Before the LLM call: injects any arrived results as [SubTurn Result] messages into the conversation context.
- After all tool executions: polls again to catch results that arrived during tool execution.
- After the final iteration: one last poll before the turn ends to avoid losing late-arriving results.
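
Each of these polls can be thought of as a non-blocking drain of the pending-results channel. The following sketch assumes string results and a hypothetical `drainPending` helper; only the channel buffer of 16 and the `[SubTurn Result]` label come from this document.

```go
package main

import "fmt"

// pendingResultsBuffer mirrors the channel buffer size listed in the Reference table.
const pendingResultsBuffer = 16

// drainPending collects every result that has already arrived, without
// blocking when the channel is empty, and labels each one for injection.
func drainPending(pending <-chan string) []string {
	var injected []string
	for {
		select {
		case r, ok := <-pending:
			if !ok {
				return injected // channel closed: nothing more will arrive
			}
			injected = append(injected, "[SubTurn Result] "+r)
		default:
			return injected // channel empty: return immediately
		}
	}
}

func main() {
	pending := make(chan string, pendingResultsBuffer)
	pending <- "scan finished"
	pending <- "index rebuilt"
	fmt.Println(len(drainPending(pending))) // prints "2"
	fmt.Println(len(drainPending(pending))) // prints "0"
}
```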

Turn State Tracking

All active turns are registered in AgentLoop.activeTurnStates (a sync.Map keyed by session key), which allows HardAbort and the /subagents observability commands to find and operate on active turns. Before the worker starts, a reservation sentinel is stored atomically via LoadOrStore; it is then replaced with the real *turnState when runTurn registers. This prevents a TOCTOU race where multiple messages for the same session could spawn concurrent workers. The sentinel is cleaned up by the worker's deferred cleanup.
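
The reservation step can be sketched with `sync.Map.LoadOrStore`, which checks and stores in a single atomic operation. The `router`, `reserved`, `route`, and `release` names below are hypothetical, and the steering queue is reduced to a mutex-guarded map of string slices.

```go
package main

import (
	"fmt"
	"sync"
)

// reserved is the sentinel stored before the real turn state exists.
type reserved struct{}

// router is a hypothetical, simplified view of the routing state.
type router struct {
	active   sync.Map // session key -> reserved{} (or the real turn state)
	mu       sync.Mutex
	steering map[string][]string // per-session steering queues
}

func newRouter() *router {
	return &router{steering: make(map[string][]string)}
}

// route reports whether the caller should start a worker for msg. If the
// session is already reserved, msg goes to the steering queue instead.
func (r *router) route(sessionKey, msg string) bool {
	if _, loaded := r.active.LoadOrStore(sessionKey, reserved{}); loaded {
		r.mu.Lock()
		r.steering[sessionKey] = append(r.steering[sessionKey], msg)
		r.mu.Unlock()
		return false
	}
	return true
}

// release mirrors the worker's deferred cleanup.
func (r *router) release(sessionKey string) {
	r.active.Delete(sessionKey)
}

func main() {
	r := newRouter()
	fmt.Println(r.route("s1", "hello"))     // true: new worker
	fmt.Println(r.route("s1", "follow-up")) // false: enqueued to steering
	fmt.Println(r.route("s2", "other"))     // true: different session
	r.release("s1")
	fmt.Println(r.route("s1", "again")) // true: session is free again
}
```

Because the check and the store happen in one call, two goroutines racing on the same session key cannot both observe "no active turn".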

Runtime Event Integration

SubTurns emit runtime events through pkg/events for observability and debugging:

| Event Kind | When Emitted | Payload |
|---|---|---|
| `agent.subturn.spawn` | Sub-turn successfully initialized | `SubTurnSpawnPayload{AgentID, Label, ParentTurnID}` |
| `agent.subturn.end` | Sub-turn finishes (success or error) | `SubTurnEndPayload{AgentID, Status}` |
| `agent.subturn.result_delivered` | Async result successfully delivered to parent | `SubTurnResultDeliveredPayload{TargetChannel, TargetChatID, ContentLen}` |
| `agent.subturn.orphan` | Result cannot be delivered (parent finished or channel full) | `SubTurnOrphanPayload{ParentTurnID, ChildTurnID, Reason}` |

API Reference

SpawnSubTurn (Public Entry Point)

func SpawnSubTurn(ctx context.Context, cfg SubTurnConfig) (*tools.ToolResult, error)

This is the exported package-level entry point for agent-internal code (e.g., tests, direct invocations). It retrieves AgentLoop and turnState from context and delegates to the internal spawnSubTurn.

Requirements:

- AgentLoop must be injected into context via WithAgentLoop()
- Parent turnState must exist in context (automatically set when called from tools)

Returns:

- *tools.ToolResult: Contains the ForLLM field with the sub-turn's output
- error: One of the defined error types or context errors

AgentLoopSpawner (Interface Implementation)

type AgentLoopSpawner struct { al *AgentLoop }

func (s *AgentLoopSpawner) SpawnSubTurn(ctx context.Context, cfg tools.SubTurnConfig) (*tools.ToolResult, error)

This implements the tools.SubTurnSpawner interface for use by tools that need to spawn sub-turns without a direct import of the agent package (avoiding circular dependencies). It converts tools.SubTurnConfig → agent.SubTurnConfig before delegating to the internal spawnSubTurn.

NewSubTurnSpawner

func NewSubTurnSpawner(al *AgentLoop) *AgentLoopSpawner

Creates a new spawner instance for the given AgentLoop. Pass the returned value to SpawnTool.SetSpawner() or SubagentTool.SetSpawner() during tool registration.

Continue

func (al *AgentLoop) Continue(ctx context.Context, sessionKey, channel, chatID string) (string, error)

Resumes an idle agent turn by dequeuing steering messages for the given session and running them through the agent loop. Returns the response string if processing occurred, or empty string if no steering messages were pending. Uses session-aware active turn checking — it only blocks if a turn is active for the same session, not for unrelated sessions.

Context Propagation

SubTurn relies on context values for proper operation:

| Context Key | Purpose |
|---|---|
| `agentLoopKey` | Stores `*AgentLoop` for tool access and SubTurn spawning |
| `turnStateKey` | Stores `*turnState` for hierarchy tracking and result delivery |

Injecting Dependencies

```go
// Before calling tools that may spawn SubTurns
ctx = WithAgentLoop(ctx, agentLoop)
ctx = withTurnState(ctx, turnState)
```

Independent Child Context

Important: The child SubTurn uses an independent context derived from context.Background(), not from the parent context. This design choice:

- Allows critical SubTurns to continue after parent cancellation
- Prevents parent timeout from affecting child execution

For self-protection, the child has its own timeout (the Timeout config value, or 5 minutes by default).
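
The effect of deriving the child context from `context.Background()` can be shown with the standard library alone. In this sketch `newChildContext` is a hypothetical helper, and treating a non-positive timeout as "use the default" is an assumption.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// defaultSubTurnTimeout mirrors the 5-minute default from this document.
const defaultSubTurnTimeout = 5 * time.Minute

// newChildContext builds a context that is independent of any parent and
// protected only by its own timeout. Falling back to the default for a
// non-positive timeout is an assumption of this sketch.
func newChildContext(timeout time.Duration) (context.Context, context.CancelFunc) {
	if timeout <= 0 {
		timeout = defaultSubTurnTimeout
	}
	return context.WithTimeout(context.Background(), timeout)
}

// runDemo cancels the parent and reports the state of both contexts.
func runDemo() (parentErr, childErr error, hasDeadline bool) {
	parent, cancelParent := context.WithCancel(context.Background())
	child, cancelChild := newChildContext(0)
	defer cancelChild()

	cancelParent() // the parent is cancelled; the child must be unaffected
	_, hasDeadline = child.Deadline()
	return parent.Err(), child.Err(), hasDeadline
}

func main() {
	fmt.Println(runDemo()) // prints "context canceled <nil> true"
}
```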

Error Types

| Error | Condition |
|---|---|
| `ErrDepthLimitExceeded` | SubTurn depth exceeds 3 levels |
| `ErrInvalidSubTurnConfig` | Required field `Model` is empty |
| `ErrConcurrencyTimeout` | All 5 concurrency slots occupied for 30+ seconds |
| Context errors | Parent context cancelled during semaphore acquisition |
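
The last two rows describe the two ways slot acquisition can fail. A buffered channel used as a semaphore, combined with `select`, is one common way to express that; the sketch below is not PicoClaw's code, and `acquire`, `release`, and the error message text are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ErrConcurrencyTimeout reuses the name from the table; the message text is
// made up for this sketch.
var ErrConcurrencyTimeout = errors.New("subturn: timed out waiting for a concurrency slot")

// acquire takes a slot from the semaphore, or fails when the timeout elapses
// or the context is cancelled first.
func acquire(ctx context.Context, sem chan struct{}, timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case sem <- struct{}{}:
		return nil
	case <-timer.C:
		return ErrConcurrencyTimeout
	case <-ctx.Done():
		return ctx.Err()
	}
}

// release frees a previously acquired slot.
func release(sem chan struct{}) { <-sem }

// runDemo uses a single slot and short timeouts so it finishes quickly; the
// documented values are 5 slots and 30 seconds.
func runDemo() (first, second, third error) {
	sem := make(chan struct{}, 1)
	first = acquire(context.Background(), sem, 20*time.Millisecond)  // slot free
	second = acquire(context.Background(), sem, 20*time.Millisecond) // slot busy: times out

	cancelled, cancel := context.WithCancel(context.Background())
	cancel()
	third = acquire(cancelled, sem, time.Minute) // slot busy and context cancelled
	release(sem)
	return
}

func main() {
	fmt.Println(runDemo())
}
```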

Thread Safety

SubTurns are designed for concurrent execution:

- Parent-child relationships: Managed under mutex (parentTS.mu.Lock())
- Active turn tracking: Uses sync.Map for concurrent access to activeTurnStates
- ID generation: Uses atomic.Int64 for unique SubTurn IDs (format: subturn-N, globally monotonic per AgentLoop instance)
- Result delivery: Reads parent state under lock, then releases the lock before the channel send. The parent's state can change in the small window between the two steps; this race is considered acceptable.

Orphan Results

An orphan result occurs when:

- Parent turn finishes before the SubTurn completes
- The pendingResults channel is full (buffer size: 16)

When a result becomes an orphan:

- agent.subturn.orphan is emitted to the runtime event bus
- The result is NOT delivered to the LLM context
- External systems can listen to this event for custom handling

Preventing Orphan Results

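- Use Critical: true for important SubTurns that must run to completion; if the parent finishes first, their results are still emitted as orphan results rather than lost
- Monitor agent.subturn.orphan for observability
- Consider the 16-buffer limit when spawning many async SubTurns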
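
The two orphan conditions map naturally onto a non-blocking channel send. The sketch below is illustrative: `parentTurn`, `deliver`, and the reason strings are hypothetical, and locking around the `finished` flag is left out.

```go
package main

import "fmt"

// pendingResultsBuffer mirrors the buffer size of 16 mentioned above.
const pendingResultsBuffer = 16

// parentTurn is a hypothetical, lock-free stand-in for the parent's state.
type parentTurn struct {
	finished bool
	pending  chan string
}

// deliver tries to hand a result to the parent without blocking. When it
// cannot, it returns the reason the result became an orphan.
func deliver(p *parentTurn, result string) (delivered bool, reason string) {
	if p.finished {
		return false, "parent finished"
	}
	select {
	case p.pending <- result:
		return true, ""
	default:
		return false, "channel full"
	}
}

func main() {
	p := &parentTurn{pending: make(chan string, pendingResultsBuffer)}
	orphans := 0
	for i := 0; i < 20; i++ {
		if ok, _ := deliver(p, fmt.Sprintf("result-%d", i)); !ok {
			orphans++
		}
	}
	fmt.Println(len(p.pending), orphans) // prints "16 4"
}
```

With nothing draining the channel, the 17th and later results become orphans, which is why the buffer size matters when many async SubTurns finish between two polls.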

Tool Inheritance

When `cfg.Tools` is empty:

- SubTurn inherits all tools from the parent agent
- Tools are registered in a new ToolRegistry instance
- Tool TTL is managed independently from parent

When `cfg.Tools` is specified:

- Only the specified tools are available to the SubTurn
- Parent tools are NOT merged
- Use this to restrict SubTurn capabilities for security or focus

Example - Restricted SubTurn:

```go
cfg := agent.SubTurnConfig{
    Model:        "gpt-4o-mini",
    Tools:        []tools.Tool{readOnlyTool}, // Only read-only access
    SystemPrompt: "Analyze the file structure...",
}
```

Reference

| Constant | Value |
|---|---|
| `maxSubTurnDepth` | 3 |
| `maxConcurrentSubTurns` | 5 |
| `concurrencyTimeout` | 30s |
| `defaultSubTurnTimeout` | 5m |
| `maxEphemeralHistorySize` | 50 messages |
| `pendingResults` buffer | 16 |
| `MaxContextRunes` default | 75% of model context window |
