Commit Graph

8 Commits

Author SHA1 Message Date
Administrator 12a8590ada fix(agent): enhance SubTurn robustness and fix race conditions
Major improvements to SubTurn implementation:

**Fixes:**
- Channel close race condition (guarded by sync.Once)
- Semaphore blocking (30s acquisition timeout)
- Redundant context wrapping
- Memory accumulation (auto-truncate at 50 messages)
- Channel draining on Finish()
- Missing depth limit logging
- Model validation
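
The channel close race in the list above is typically fixed by routing every close through a sync.Once. The following is a minimal sketch of that pattern, not the commit's actual code: the type resultChan and the helper closeConcurrently are hypothetical names introduced for illustration.

```go
package main

import "sync"

// resultChan is a hypothetical stand-in for a result channel owned by a
// turn; the real type is not shown in the commit message.
type resultChan struct {
	ch        chan string
	closeOnce sync.Once
}

func newResultChan(size int) *resultChan {
	return &resultChan{ch: make(chan string, size)}
}

// Close may be called from several goroutines. sync.Once guarantees that
// close(ch) runs exactly once, so a second call cannot panic.
func (r *resultChan) Close() {
	r.closeOnce.Do(func() { close(r.ch) })
}

// closeConcurrently calls Close from n goroutines and reports whether the
// channel ended up closed. A double close would crash the program instead.
func closeConcurrently(n int) bool {
	r := newResultChan(1)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			r.Close()
		}()
	}
	wg.Wait()
	_, ok := <-r.ch // receive on a closed, empty channel yields ok == false
	return !ok
}
```

Note that sync.Once only protects against a double close. A send on an already closed channel still panics, so senders need their own guard (for example a finished flag checked under a mutex).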

**Enhancements:**
- Comprehensive documentation (150+ lines)
- 11 new tests covering edge cases
- Improved error messages

All tests pass. Production-ready.

Related: #1316
2026-03-17 12:50:32 +08:00
Administrator 672d11c7d4 fix(agent): prevent double result delivery and panic bypass in SubTurn
- Fix synchronous SubTurn calls placing results in pendingResults channel,
  causing double delivery. Now only async calls (Async=true) use the channel.
- Move deliverSubTurnResult into defer to ensure result delivery even when
  runTurn panics. Add TestSpawnSubTurn_PanicRecovery to verify.
- Fix ContextWindow incorrectly set to MaxTokens; now inherits from
  parentAgent.ContextWindow.
- Add TestSpawnSubTurn_ResultDeliverySync to verify sync behavior.
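
Moving result delivery into a defer, as described above, can be sketched as follows. This is an illustration under assumptions: subTurnResult, runWithDelivery, and deliveredAfterPanic are hypothetical names, and the real deliverSubTurnResult signature is not shown in the commit.

```go
package main

import "fmt"

// subTurnResult is a hypothetical result type used only in this sketch.
type subTurnResult struct {
	Output string
	Err    error
}

// runWithDelivery runs work and always hands exactly one result to
// deliver, even if work panics. The deferred function recovers the panic
// and converts it into an error before delivering.
func runWithDelivery(work func() string, deliver func(subTurnResult)) {
	var res subTurnResult
	defer func() {
		if r := recover(); r != nil {
			res = subTurnResult{Err: fmt.Errorf("subturn panicked: %v", r)}
		}
		deliver(res)
	}()
	res.Output = work()
}

// deliveredAfterPanic runs a panicking work function and reports how many
// results were delivered and whether the delivered result carries an error.
func deliveredAfterPanic() (count int, gotErr bool) {
	runWithDelivery(
		func() string { panic("boom") },
		func(r subTurnResult) {
			count++
			gotErr = r.Err != nil
		},
	)
	return count, gotErr
}
```

Because delivery happens in a single deferred call, the parent receives one result on both the normal and the panic path, which is the property a test such as TestSpawnSubTurn_PanicRecovery would check.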
2026-03-16 23:48:51 +08:00
Administrator 3c2d373a5c fix(agent): resolve race conditions and resource leaks in SubTurn
Critical fixes (5):
- Fix turnState hierarchy corruption in nested SubTurns by checking context
  before creating new root turnState in runAgentLoop
- Fix deadlock risk in deliverSubTurnResult by separating lock and channel ops
- Fix session rollback race in HardAbort by calling Finish() before rollback
- Fix resource leak by closing pendingResults channel in Finish() with recovery
- Add thread-safety docs for childTurnIDs and isFinished fields
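
One common way to separate lock and channel operations, as the deadlock fix above describes, is to copy the needed state under the mutex, release it, and only then touch the channel. The sketch below assumes a hypothetical parentTurn type and a non-blocking send; the actual deliverSubTurnResult may differ.

```go
package main

import "sync"

// parentTurn is a hypothetical parent turn holding a result channel.
type parentTurn struct {
	mu       sync.Mutex
	finished bool
	results  chan string
}

// deliver reads shared state under the lock, releases the lock, and only
// then performs the channel send. No goroutine ever blocks on the channel
// while holding mu, so finish() can always acquire the lock.
func (p *parentTurn) deliver(msg string) bool {
	p.mu.Lock()
	finished := p.finished
	ch := p.results
	p.mu.Unlock()

	if finished {
		return false
	}
	select {
	case ch <- msg:
		return true
	default:
		return false // buffer full: report failure rather than block
	}
}

// finish marks the turn as finished. This sketch does not close the
// channel, so a send racing with finish cannot panic.
func (p *parentTurn) finish() {
	p.mu.Lock()
	p.finished = true
	p.mu.Unlock()
}

// deliverDemo exercises the three outcomes: room in the buffer, a full
// buffer, and delivery after finish.
func deliverDemo() (first, second, afterFinish bool) {
	p := &parentTurn{results: make(chan string, 1)}
	first = p.deliver("a")
	second = p.deliver("b")
	p.finish()
	<-p.results // free the buffer so only the finished flag matters
	afterFinish = p.deliver("c")
	return
}
```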

Medium priority fixes (5):
- Move globalTurnCounter to AgentLoop.subTurnCounter to prevent ID conflicts
- Improve semaphore acquisition to ensure release even on early validation failures
- Document design choice: ephemeral sessions start empty for complete isolation
- Add final poll before Finish() to capture late-arriving SubTurn results
- Remove duplicate channel registration in spawnSubTurn to fix timing issues
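
Replacing a package-level counter with a per-instance one might look like the sketch below. The agentLoop type, the nextSubTurnID method, and the "subturn-N" ID format are assumptions made for illustration; only the field name subTurnCounter comes from the commit message.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// agentLoop is a hypothetical stand-in for AgentLoop. Each instance owns
// its counter, so separate loops (for example in parallel tests) do not
// share or disturb each other's ID sequence.
type agentLoop struct {
	subTurnCounter atomic.Int64 // requires Go 1.19 or later
}

// nextSubTurnID returns a new ID that is unique within this loop. The ID
// format is an assumption of this sketch.
func (a *agentLoop) nextSubTurnID() string {
	return fmt.Sprintf("subturn-%d", a.subTurnCounter.Add(1))
}

// firstIDs returns two IDs from one loop and one ID from another.
func firstIDs() (string, string, string) {
	a, b := &agentLoop{}, &agentLoop{}
	return a.nextSubTurnID(), a.nextSubTurnID(), b.nextSubTurnID()
}
```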

Testing:
- Add 6 new tests covering hierarchy, deadlock, ordering, channel lifecycle,
  final poll, and semaphore behavior
- All 12 SubTurn tests passing with race detector

This resolves 10 critical and medium issues (5 race conditions, 2 resource leaks,
3 timing issues) identified in code review, bringing SubTurn to production-ready state.
2026-03-16 22:54:01 +08:00
Administrator 6b5d7e3fd7 fix(agent): resolve critical race conditions and resource leaks in SubTurn
- Fix turnState hierarchy corruption when SubTurns recursively call runAgentLoop
  by checking context for existing turnState before creating new root
- Fix deadlock risk in deliverSubTurnResult by separating lock and channel operations
- Fix session rollback race in HardAbort by calling Finish() before rollback
- Fix resource leak by closing pendingResults channel in Finish() with panic recovery
- Add thread-safety documentation for childTurnIDs and isFinished fields
- Move globalTurnCounter to AgentLoop.subTurnCounter to prevent ID conflicts
- Improve semaphore acquisition to ensure release even on early validation failures
- Document design choice: ephemeral sessions start empty for complete isolation
- Add 5 new tests: hierarchy, deadlock, order, channel close, and semaphore
2026-03-16 22:37:21 +08:00
Administrator acd436acfe feat(agent): add session state rollback on hard abort
- Add initialHistoryLength field to turnState to snapshot session state at turn start
- Save initial history length in runAgentLoop when creating root turnState
- Implement session rollback in HardAbort via SetHistory, truncating to initial length
- Add TestHardAbortSessionRollback to verify history rollback after abort
- Import providers package in subturn_test.go for Message type

This ensures that when a user triggers hard abort, all messages added during
the aborted turn are discarded, restoring the session to its pre-turn state.
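
The snapshot-and-truncate rollback described above can be sketched as follows. The message and session types, startTurn, and rollback are hypothetical; initialHistoryLength and SetHistory are named in the commit, but their exact signatures are assumed here.

```go
package main

// message and session are simplified, hypothetical stand-ins for the
// project's real types.
type message struct{ Role, Content string }

type session struct{ history []message }

func (s *session) History() []message     { return s.history }
func (s *session) SetHistory(h []message) { s.history = h }

// turnState records how long the history was when the turn started.
type turnState struct{ initialHistoryLength int }

func startTurn(s *session) *turnState {
	return &turnState{initialHistoryLength: len(s.History())}
}

// rollback truncates the history back to its length at turn start,
// discarding every message appended during the aborted turn.
func (ts *turnState) rollback(s *session) {
	h := s.History()
	if ts.initialHistoryLength < len(h) {
		s.SetHistory(h[:ts.initialHistoryLength])
	}
}

// rollbackDemo returns the history length before, during, and after an
// aborted turn.
func rollbackDemo() (before, during, after int) {
	s := &session{history: []message{{"user", "hi"}, {"assistant", "hello"}}}
	before = len(s.History())
	ts := startTurn(s)
	s.SetHistory(append(s.History(),
		message{"user", "do X"}, message{"assistant", "partial answer"}))
	during = len(s.History())
	ts.rollback(s)
	after = len(s.History())
	return
}
```

Storing only a length keeps the snapshot cheap, but it assumes the history is append-only during a turn. If earlier messages could be edited in place, a length-based rollback would not restore them.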
2026-03-16 21:49:58 +08:00
Administrator 1236dd9e6d feat(agent): add concurrency semaphore and hard abort for SubTurn
- Add maxConcurrentSubTurns constant (5) and concurrencySem channel to turnState
- Acquire/release semaphore in spawnSubTurn to limit concurrent child turns per parent
- Add activeTurnStates sync.Map to AgentLoop for tracking root turn states by session
- Implement HardAbort(sessionKey) method to trigger cascading cancellation via turnState.Finish()
- Register/unregister root turnState in runAgentLoop for hard abort lookup
- Add TestSubTurnConcurrencySemaphore to verify semaphore capacity enforcement
- Add TestHardAbortCascading to verify context cancellation propagates to child turns
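
A buffered channel is the usual way to build such a semaphore in Go. The sketch below uses the names maxConcurrentSubTurns and concurrencySem from the commit message; the acquire and release helpers, the timeout parameter, and errSemTimeout are assumptions added for illustration (this commit only mentions acquire and release).

```go
package main

import (
	"errors"
	"time"
)

const maxConcurrentSubTurns = 5

// errSemTimeout is a hypothetical error used only in this sketch.
var errSemTimeout = errors.New("timed out waiting for a subturn slot")

type turnState struct {
	concurrencySem chan struct{}
}

func newTurnState() *turnState {
	return &turnState{concurrencySem: make(chan struct{}, maxConcurrentSubTurns)}
}

// acquire takes one slot. Sending blocks once the buffer holds
// maxConcurrentSubTurns tokens; the timer bounds how long we wait.
func (ts *turnState) acquire(timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case ts.concurrencySem <- struct{}{}:
		return nil
	case <-timer.C:
		return errSemTimeout
	}
}

// release frees one slot. Callers would normally defer it right after a
// successful acquire so the slot is returned on every exit path.
func (ts *turnState) release() { <-ts.concurrencySem }

// semaphoreDemo fills the semaphore, shows that one more acquire times
// out, and shows that a release makes a slot available again.
func semaphoreDemo() (acquired int, sixthTimedOut, afterRelease bool) {
	ts := newTurnState()
	for i := 0; i < maxConcurrentSubTurns; i++ {
		if ts.acquire(10*time.Millisecond) == nil {
			acquired++
		}
	}
	sixthTimedOut = ts.acquire(10*time.Millisecond) == errSemTimeout
	ts.release()
	afterRelease = ts.acquire(10*time.Millisecond) == nil
	return
}
```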
2026-03-16 21:03:58 +08:00
Administrator ceeae15d8a feat(agent): wire SubTurn into AgentLoop and Spawn Tool
- Add subTurnResults sync.Map to AgentLoop for per-session channel tracking
- Add register/unregister/dequeue methods in steering.go
- Poll SubTurn results in runLLMIteration at loop start and after each tool,
  injecting results as [SubTurn Result] messages into parent conversation
- Initialize root turnState in runAgentLoop, propagate via context
  (withTurnState/turnStateFromContext), call rootTS.Finish() on completion
- Wire Spawn Tool to spawnSubTurn via SetSpawner in registerSharedTools,
  recovering parentTS from context for proper turn hierarchy
- Refactor subagent.go to use SetSpawner pattern
- Add TestSubTurnResultChannelRegistration and TestDequeuePendingSubTurnResults
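
Propagating the turn state through context, as described above, might be sketched like this. The function names withTurnState and turnStateFromContext come from the commit message, but their signatures, the turnState fields, and the spawnChild helper are assumptions.

```go
package main

import "context"

// turnState is a simplified, hypothetical version holding only what this
// sketch needs.
type turnState struct {
	id     string
	parent *turnState
}

// turnStateKey is an unexported key type, which prevents collisions with
// context values set by other packages.
type turnStateKey struct{}

func withTurnState(ctx context.Context, ts *turnState) context.Context {
	return context.WithValue(ctx, turnStateKey{}, ts)
}

func turnStateFromContext(ctx context.Context) (*turnState, bool) {
	ts, ok := ctx.Value(turnStateKey{}).(*turnState)
	return ts, ok
}

// spawnChild recovers the parent turn state from ctx (nil for a root
// turn), links a new turn state to it, and returns a derived context
// carrying the new state.
func spawnChild(ctx context.Context, id string) (context.Context, *turnState) {
	parent, _ := turnStateFromContext(ctx)
	child := &turnState{id: id, parent: parent}
	return withTurnState(ctx, child), child
}

// hierarchyDemo builds a root turn and one child, then reports whether the
// root has a parent and which ID the child sees as its parent.
func hierarchyDemo() (rootHasParent bool, childParentID string) {
	rootCtx, root := spawnChild(context.Background(), "root")
	_, child := spawnChild(rootCtx, "child-1")
	return root.parent != nil, child.parent.id
}
```

Checking the context before creating a new root is also what keeps nested calls from overwriting the hierarchy: a call that finds an existing turn state attaches to it instead of starting over.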
2026-03-16 20:44:04 +08:00
Administrator ae23193295 feat(agent): port subturn PoC to refactor/agent branch
- Replace duplicate types (ToolResult/Session/Message) with real project types
- Implement ephemeralSessionStore satisfying session.SessionStore interface
- Connect runTurn to real AgentLoop via runAgentLoop + AgentInstance
- Fix subturn_test.go to match updated signatures and types

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
2026-03-16 14:31:32 +08:00