picoclaw/pkg/seahorse/tool_expand_test.go
Liu Yuan 15a70ac45c feat(seahorse): implement short-term memory engine (LCM) (#2285)
* feat(seahorse): implement short-term memory engine of seahorse

Add pkg/seahorse/ module implementing a SQLite-backed DAG-based summary
hierarchy for context management, ported from lossless-claw's LCM design:

- types.go + short_constants.go: core types (Message, Summary, Conversation,
  ContextItem) and configuration constants (fanout, token targets, thresholds)
- migration.go: idempotent DB schema with FTS5 trigram tokenizer for CJK
- store.go: full SQLite CRUD (conversations, messages, summaries DAG,
  context_items with ordinal gap numbering, FTS5 search)
- short_engine.go: Engine lifecycle (NewEngine, Ingest, Assemble, Compact),
  session pattern filtering (ignore/stateless glob→regex compilation),
  per-session mutex via sync.Map
- short_assembler.go: budget-aware context assembly with fresh tail protection
  (32 messages), oldest-first eviction, summary XML formatting, RebuildContextItems
- short_compaction.go: leaf compaction (messages→summary) and condensed
  compaction (summaries→higher-level summary), 3-level LLM escalation,
  CompactUntilUnder for emergency overflow
- short_retrieval.go: lookupByID, FTS5/LIKE search, recursive expand with
  token cap
- context_seahorse.go: agent.ContextManager adapter, registered as "seahorse",
  provider↔seahorse message type conversion (ToolCalls, tool_result)

* fix(seahorse): correct 3 adapter bugs in context management

- TokenCount: use full message (Content+ToolCalls+Media) instead of Content-only
- Empty Content: rebuild Content from tool_result Parts when stored empty
- Duplicate summaries: summaries only in Summary field, not in History messages
- Grep: fix SearchResult.Snippet→Content for summaries
- Schema: fix FTS5 SQL to use VIRTUAL TABLE, not TEMP TABLE
- TestFTS5SQLConstants: verify FTS5 SQL syntax correctness
- Test: fix flaky TestCompactLeaf

* fix(agent): ingest steering messages into seahorse SQLite

Steering messages were only persisted to session JSONL but not ingested
into seahorse SQLite, causing them to be missing from context assembly.

Added `ts.ingestMessage(turnCtx, al, pm)` call in the steering message
injection block alongside the existing JSONL persistence.

Test: TestSeahorseSteeringMessageIngested verifies steering messages
appear in seahorse SQLite DB after being processed.

* fix(seahorse): address 3 blocking bugs from code review

- Fix resequenceContextItemsTx scan error handling (store.go:850)
  Changed `return err` to `return scanErr` to properly propagate scan errors
  instead of returning nil (which silently corrupts data)

- Fix sql.NullString for INTEGER column (store.go:847)
  Changed `mid` from sql.NullString to sql.NullInt64 since message_id
  is INTEGER in schema. Removed unnecessary strconv.ParseInt call.

- Fix compactCondensed fallback deleting non-candidate items
  Added ReplaceContextItemsWithSummary method for per-item deletion
  when candidates are not contiguous in ordinal space.
  Optimized to use range deletion when candidates are consecutive.
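
The choice between range deletion and per-item deletion depends on whether the candidates form an unbroken run among the context items. A minimal sketch of that check, assuming ordinals are plain ints sorted ascending; the helper name `contiguous` and its signature are hypothetical and not taken from store.go:

```go
package main

import "fmt"

// contiguous reports whether the candidate ordinals occupy one unbroken run
// within all, which must be sorted ascending. Hypothetical helper: the real
// ReplaceContextItemsWithSummary may decide this differently.
func contiguous(all []int, candidates map[int]struct{}) bool {
	seen := 0
	inRun, ended := false, false
	for _, o := range all {
		if _, ok := candidates[o]; ok {
			if ended {
				return false // a non-candidate item sits between two candidates
			}
			inRun = true
			seen++
		} else if inRun {
			ended = true
		}
	}
	// Every candidate must actually exist in all.
	return seen == len(candidates)
}

func main() {
	all := []int{10, 20, 30, 40}
	fmt.Println(contiguous(all, map[int]struct{}{20: {}, 30: {}})) // range delete is safe
	fmt.Println(contiguous(all, map[int]struct{}{10: {}, 30: {}})) // delete item by item
}
```

When the check returns true, a single ranged DELETE covers exactly the candidates; otherwise a ranged DELETE would also remove the non-candidate items in between.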

* fix(seahorse): pass Budget to Compact for correct condensed threshold

Issue #4 from PR review: When Budget was not passed to seahorse.Compact,
it defaulted to `tokensBefore * 0.75`, making `tokensBefore > budget`
always true and causing condensed compaction to trigger unnecessarily.
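
The arithmetic behind this bug can be sketched as follows. The function names `effectiveBudget` and `needsCondensed` are hypothetical; only the `tokensBefore * 0.75` fallback and the `tokensBefore > budget` comparison come from the description above:

```go
package main

import "fmt"

// effectiveBudget returns the caller's budget when one is set, otherwise the
// fallback of 75% of the current token count. Hypothetical helper.
func effectiveBudget(budget, tokensBefore int) int {
	if budget > 0 {
		return budget
	}
	return int(float64(tokensBefore) * 0.75)
}

// needsCondensed mirrors the tokensBefore > budget comparison.
func needsCondensed(budget, tokensBefore int) bool {
	return tokensBefore > effectiveBudget(budget, tokensBefore)
}

func main() {
	// Budget missing: 1000 > 750, so condensed compaction always triggers.
	fmt.Println(needsCondensed(0, 1000))
	// Budget forwarded (for example the context window): 1000 > 8000 is false.
	fmt.Println(needsCondensed(8000, 1000))
}
```

For any positive token count the fallback is strictly smaller than the count itself, which is why the comparison could never be false without a forwarded Budget.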

Changes:
- context_seahorse.go: Forward Budget from CompactRequest to CompactInput
- loop.go: Pass Budget (ContextWindow) in all 3 Compact calls
- Add test verifying condensed is skipped when tokens < threshold
- Fix lint issues in store.go and store_test.go

* fix(seahorse): add mutex for assembler lazy initialization

Issue #5 from PR review: The check-then-create pattern for e.assembler
was a data race when multiple goroutines called Assemble() concurrently:
    if e.assembler == nil {
        e.assembler = &Assembler{...}
    }

Changes:
- Add assemblerMu sync.Mutex to Engine struct
- Add initAssemblerOnce() using double-checked locking (same pattern as initCompactionOnce)
- Add TestAssemblerLazyInitRace to verify thread-safety
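
A minimal sketch of mutex-guarded lazy initialization. It takes the lock on every call rather than reproducing the double-checked variant, since the body of initAssemblerOnce is not shown here; `getAssembler`, `created`, and the `id` field are hypothetical and exist only for illustration:

```go
package main

import (
	"fmt"
	"sync"
)

// Assembler stands in for the real type; its fields are not shown in the source.
type Assembler struct{ id int }

// Engine holds a lazily created assembler guarded by assemblerMu.
type Engine struct {
	assemblerMu sync.Mutex
	assembler   *Assembler
	created     int // counts constructions, for illustration only
}

// getAssembler creates the assembler at most once, even under concurrency.
func (e *Engine) getAssembler() *Assembler {
	e.assemblerMu.Lock()
	defer e.assemblerMu.Unlock()
	if e.assembler == nil {
		e.created++
		e.assembler = &Assembler{id: e.created}
	}
	return e.assembler
}

// concurrentInit calls getAssembler from n goroutines and reports how many
// assemblers were constructed.
func concurrentInit(n int) int {
	e := &Engine{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			e.getAssembler()
		}()
	}
	wg.Wait()
	return e.created
}

func main() {
	fmt.Println(concurrentInit(64)) // prints 1
}
```

Running a test built on this shape with `go test -race` is one way to confirm that the unguarded check-then-create pattern is gone.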

* fix(seahorse): handle non-consecutive depths in selectShallowestCondensationCandidate

Issue #8 from PR review: the loop iterated depth 0, 1, 2... assuming
consecutive keys, and broke out at the first missing key, so deeper
depths were never checked.

Fix: collect all existing depth keys, sort, then iterate in order.
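
The fix can be sketched as below. The map shape, the `minCount` criterion, and the name `shallowestDepthWith` are assumptions made for illustration; only the collect, sort, iterate structure comes from the description above:

```go
package main

import (
	"fmt"
	"sort"
)

// shallowestDepthWith returns the smallest depth whose group holds at least
// minCount summaries. ok is false when no depth qualifies.
func shallowestDepthWith(byDepth map[int][]string, minCount int) (depth int, ok bool) {
	// Collect the depths that actually exist instead of assuming 0, 1, 2...
	depths := make([]int, 0, len(byDepth))
	for d := range byDepth {
		depths = append(depths, d)
	}
	sort.Ints(depths)
	for _, d := range depths {
		if len(byDepth[d]) >= minCount {
			return d, true
		}
	}
	return 0, false
}

func main() {
	// Depth 1 is missing; a loop that breaks on the first gap never reaches depth 2.
	byDepth := map[int][]string{0: {"s1"}, 2: {"s2", "s3"}}
	fmt.Println(shallowestDepthWith(byDepth, 2)) // prints 2 true
}
```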

* fix(seahorse): wrap DeleteMessagesAfterID and appendContextItems in transactions

- DeleteMessagesAfterID: wrap all DELETE operations in a transaction for
  atomicity, remove redundant manual FTS delete (handled by trigger)
- appendContextItems: use transaction to fix read-then-write race condition
- Add GetMaxOrdinalTx and resolveItemTokenCountTx for transaction-scoped queries
- Remove unused resolveItemTokenCount function

Fixes PR review issues 6 and 7.

* fix(seahorse): derive readable content from Parts and cap CompactUntilUnder iterations

- Derive readable content from MessageParts in AddMessageWithParts so
  FTS5 indexing and summary formatting can access tool call information
- formatMessagesForSummary and truncateSummary now fall back to Parts
  when Content is empty, fixing blank summaries for Part-based messages
- Add MaxCompactIterations (20) to prevent CompactUntilUnder infinite
  loops; exceeded iterations are logged as warnings
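
A minimal sketch of an iteration cap of this kind. The constant value 20 comes from the commit message; the function signature, the `compactOnce` callback, and the returned `capped` flag are hypothetical:

```go
package main

import "fmt"

// maxCompactIterations mirrors MaxCompactIterations (20) from the commit message.
const maxCompactIterations = 20

// compactUntilUnder applies compactOnce until tokens fit the budget or the
// iteration cap is reached. capped is true when the cap was hit while still
// over budget, which is where a warning would be logged.
func compactUntilUnder(tokens, budget int, compactOnce func(int) int) (final, iterations int, capped bool) {
	for iterations < maxCompactIterations {
		if tokens <= budget {
			return tokens, iterations, false
		}
		tokens = compactOnce(tokens)
		iterations++
	}
	return tokens, iterations, tokens > budget
}

func main() {
	halve := func(n int) int { return n / 2 }
	stuck := func(n int) int { return n } // a pass that makes no progress

	fmt.Println(compactUntilUnder(100, 50, halve)) // prints 50 1 false
	fmt.Println(compactUntilUnder(100, 50, stuck)) // prints 100 20 true
}
```

Without the cap, the second call would loop forever because each pass leaves the token count unchanged.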
2026-04-05 09:05:16 +08:00


package seahorse

import (
	"context"
	"encoding/json"
	"fmt"
	"testing"
)

// TestExpandToolByMessageIDs expands two stored messages by ID and checks the
// returned message count and the summed token count.
func TestExpandToolByMessageIDs(t *testing.T) {
	s := openTestStore(t)
	ctx := context.Background()

	// Seed a conversation with two messages of 10 tokens each.
	conv, err := s.GetOrCreateConversation(ctx, "test:expand-tool")
	if err != nil {
		t.Fatalf("GetOrCreateConversation: %v", err)
	}
	msg1, err := s.AddMessage(ctx, conv.ConversationID, "user", "first message", 10)
	if err != nil {
		t.Fatalf("AddMessage 1: %v", err)
	}
	msg2, err := s.AddMessage(ctx, conv.ConversationID, "assistant", "second message", 10)
	if err != nil {
		t.Fatalf("AddMessage 2: %v", err)
	}

	re := &RetrievalEngine{store: s}
	tool := NewExpandTool(re)

	result := tool.Execute(ctx, map[string]any{
		"message_ids": []any{fmt.Sprintf("%d", msg1.ID), fmt.Sprintf("%d", msg2.ID)},
	})
	if result.IsError {
		t.Fatalf("Expand failed: %s", result.ForLLM)
	}

	// Parse result
	var output struct {
		Success    bool             `json:"success"`
		TokenCount int              `json:"tokenCount"`
		Messages   []map[string]any `json:"messages"`
	}
	if err := json.Unmarshal([]byte(result.ForLLM), &output); err != nil {
		t.Fatalf("Parse result: %v", err)
	}

	if !output.Success {
		t.Error("expected success=true")
	}
	if len(output.Messages) != 2 {
		t.Errorf("Messages = %d, want 2", len(output.Messages))
	}
	if output.TokenCount != 20 {
		t.Errorf("TokenCount = %d, want 20", output.TokenCount)
	}
}

// TestExpandToolMissingIDs checks that Execute reports an error when the
// message_ids argument is absent.
func TestExpandToolMissingIDs(t *testing.T) {
	s := openTestStore(t)
	re := &RetrievalEngine{store: s}
	tool := NewExpandTool(re)

	result := tool.Execute(context.Background(), map[string]any{})
	if !result.IsError {
		t.Error("expected error for missing message_ids")
	}
}

// TestExpandToolWithParts stores a message built from parts and checks how
// each part type is rendered in the expand output.
func TestExpandToolWithParts(t *testing.T) {
	s := openTestStore(t)
	ctx := context.Background()

	conv, err := s.GetOrCreateConversation(ctx, "test:expand-parts")
	if err != nil {
		t.Fatalf("GetOrCreateConversation: %v", err)
	}

	// Create message with parts
	parts := []MessagePart{
		{Type: "text", Text: "Hello"},
		{Type: "tool_use", Name: "bash", Arguments: `{"command":"ls"}`, ToolCallID: "call_123"},
		{Type: "tool_result", ToolCallID: "call_123", Text: "file1.txt\nfile2.txt"},
	}
	msg, err := s.AddMessageWithParts(ctx, conv.ConversationID, "assistant", parts, 50)
	if err != nil {
		t.Fatalf("AddMessageWithParts: %v", err)
	}

	re := &RetrievalEngine{store: s}
	tool := NewExpandTool(re)

	result := tool.Execute(ctx, map[string]any{
		"message_ids": []any{fmt.Sprintf("%d", msg.ID)},
	})
	if result.IsError {
		t.Fatalf("Expand failed: %s", result.ForLLM)
	}

	var output struct {
		Messages []struct {
			Parts []map[string]any `json:"parts"`
		} `json:"messages"`
	}
	if err := json.Unmarshal([]byte(result.ForLLM), &output); err != nil {
		t.Fatalf("Parse result: %v", err)
	}
	if len(output.Messages) != 1 {
		t.Fatalf("Messages = %d, want 1", len(output.Messages))
	}

	// Verify parts are filtered correctly
	foundText := false
	foundToolUse := false
	foundToolResult := false
	for _, p := range output.Messages[0].Parts {
		// Comma-ok assertion: a part without a string "type" is skipped
		// instead of panicking the test.
		typ, _ := p["type"].(string)
		switch typ {
		case "text":
			foundText = true
			if p["text"] != "Hello" {
				t.Errorf("text = %v, want Hello", p["text"])
			}
		case "tool_use":
			foundToolUse = true
			if p["name"] != "bash" {
				t.Errorf("name = %v, want bash", p["name"])
			}
		case "tool_result":
			foundToolResult = true
			// tool_result should NOT have content
			if _, hasContent := p["content"]; hasContent {
				t.Error("tool_result should not have content field")
			}
			if p["toolCallId"] != "call_123" {
				t.Errorf("toolCallId = %v, want call_123", p["toolCallId"])
			}
		}
	}

	if !foundText {
		t.Error("missing text part")
	}
	if !foundToolUse {
		t.Error("missing tool_use part")
	}
	if !foundToolResult {
		t.Error("missing tool_result part")
	}
}