feat: add extended thinking support for Anthropic models (#1076)

* feat: add extended thinking support for Anthropic models

Support configurable thinking levels (off/low/medium/high/xhigh/adaptive)
via `agents.defaults.thinking_level` config field.

- "adaptive": uses Anthropic's adaptive thinking API (Claude 4.6+)
- "low/medium/high/xhigh": uses budget_tokens (all thinking-capable models)
- "off": disables thinking (default)

API constraints handled:
- Temperature cleared when thinking is enabled
- budget_tokens clamped to max_tokens-1
- Thinking response blocks parsed into Reasoning field

Relates to #645, #966

* fix: address PR review feedback for thinking support

- Add ThinkingCapable interface for provider capability detection
- Warn when thinking_level is set but provider doesn't support it
- Warn when temperature is cleared due to thinking enabled
- Adjust budget values per Anthropic best practices (medium=16K, xhigh=64K)
- Add budget clamp warning and 80% threshold warning
- Add parseResponse thinking block tests
- Add thinking_level field to config.example.json

* refactor: move ThinkingLevel from AgentDefaults to ModelConfig

Thinking is a model-level capability, not a global agent property.
Per-model config avoids silent ignoring on non-Anthropic providers
and eliminates spurious warning logs in multi-provider setups.

Addresses PR #1076 review feedback from @yinwm.
This commit is contained in:
Larry Koo
2026-03-05 09:51:18 +08:00
committed by GitHub
parent 325af2163b
commit 204038ec60
9 changed files with 401 additions and 17 deletions
+1
View File
@@ -507,6 +507,7 @@ type ModelConfig struct {
RPM int `json:"rpm,omitempty"` // Requests per minute limit
MaxTokensField string `json:"max_tokens_field,omitempty"` // Field name for max tokens (e.g., "max_completion_tokens")
RequestTimeout int `json:"request_timeout,omitempty"`
ThinkingLevel string `json:"thinking_level,omitempty"` // Extended thinking: off|low|medium|high|xhigh|adaptive
}
// Validate checks if the ModelConfig has all required fields.