Commit Graph

5 Commits

Author SHA1 Message Date
Mauro b114dcaeb1 feat(model): llm rate limiting (#2198)
* feat(model): rate limiting

* fix(agent): preserve per-model identity in rate limiting and fallback

* fix test
2026-04-02 19:26:26 +08:00
Liu Yuan e73d9d959e feat(config): support multiple API keys for failover (#1707)
* feat(config): support multiple API keys for failover

Add api_keys field to ModelConfig to support multiple API keys with
automatic failover. When multiple keys are configured, they are expanded
into separate model entries with fallbacks set up for key-level failover.

Example config:
  {
    "model_name": "glm-4.7",
    "model": "zhipu/glm-4.7",
    "api_keys": ["key1", "key2", "key3"]
  }

Expands internally to:
  - glm-4.7 (key1) -> fallbacks: [glm-4.7__key_1, glm-4.7__key_2]
  - glm-4.7__key_1 (key2)
  - glm-4.7__key_2 (key3)

Backward compatible: single api_key still works as before.

* fix(providers): change cooldown tracking from provider to ModelKey

This enables proper key-switching when multiple API keys share the same
provider. Previously, when one key failed, all keys were blocked because
cooldown was tracked per-provider.

Now each (provider, model) combination has independent cooldown, allowing
fallback to alternate keys when one is rate limited.

Includes TestMultiKeyWithModelFallback and related failover tests.
2026-03-19 00:57:20 +08:00
Yiliu fb96645ea9 fix(providers): support lookup-based fallback candidate resolution 2026-02-26 23:50:33 +08:00
Artem Yadelskyi 9e120f90ea feat(fmt): Run formatters 2026-02-18 21:48:23 +02:00
Leandro Barbosa 6e7149509a feat: add model fallback chain with error classification
Add 2-layer fallback system (text + image) with automatic candidate
resolution. Includes error classifier (~40 patterns), per-provider
cooldown (exponential backoff), and model reference parsing.

- FailoverError/FailoverReason types for structured error handling
- ErrorClassifier with rate_limit, billing, auth, timeout patterns
- FallbackChain with cooldown management and candidate rotation
- ModelRef parser for provider/model string format
- 128 tests, 95%+ coverage
2026-02-13 12:12:12 -03:00