ChatModel Failover Guide
Overview
ChatModelAgent has built-in model failover capability: when the primary model call fails, it automatically switches to a backup model, supporting both Generate (synchronous) and Stream (streaming). Configured via ModelFailoverConfig[M], it composes orthogonally with TypedModelRetryConfig[M] (same-model retry).
This document uses the default *schema.Message type as an example. For generic usage, replace the APIs with their Typed-prefixed versions and parameterize the message type as M MessageType.
Core Data Structures
ModelFailoverConfig[M]
type ModelFailoverConfig[M MessageType] struct {
    // Maximum failover attempts. 0 means no failover;
    // 1 means GetFailoverModel is called at most once.
    // When lastSuccessModel exists, it is tried first before calling GetFailoverModel.
    MaxRetries uint

    // Determines whether to trigger failover. When ctx.Err() != nil, stops regardless of return value.
    // When combined with ModelRetryConfig, outputErr is *RetryExhaustedError;
    // the original error is obtained via RetryExhaustedError.LastErr.
    // In streaming scenarios, outputMessage may carry partially received messages.
    // This field is required when configuring ModelFailoverConfig.
    ShouldFailover func(ctx context.Context, outputMessage M, outputErr error) bool

    // Selects the next model and optionally transforms input messages.
    // failoverCtx.FailoverAttempt starts from 1.
    // Returning nil failoverModelInputMessages means using the original input.
    // Returning non-nil failoverErr immediately terminates failover.
    // This field is required when configuring ModelFailoverConfig.
    GetFailoverModel func(ctx context.Context, failoverCtx *FailoverContext[M]) (
        failoverModel model.BaseModel[M],
        failoverModelInputMessages []M,
        failoverErr error,
    )
}
FailoverContext[M]
type FailoverContext[M MessageType] struct {
    FailoverAttempt   uint // Current attempt number, starting from 1
    InputMessages     []M  // Original input before transformation
    LastOutputMessage M    // Output from last failure (partial message in streaming)
    // When combined with ModelRetryConfig, this is *RetryExhaustedError
    LastErr error // Error from last failure
}
Quick Start
Basic Usage: Dual-Model Failover
agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Name:        "my-agent",
    Instruction: "You are a helpful assistant.",
    Model:       primaryModel, // model.BaseModel[*schema.Message], required
    ModelFailoverConfig: &adk.ModelFailoverConfig{
        MaxRetries: 1, // At most 1 failover (2 calls total)
        ShouldFailover: func(ctx context.Context, msg *schema.Message, err error) bool {
            return !errors.Is(err, context.Canceled) &&
                !errors.Is(err, context.DeadlineExceeded)
        },
        GetFailoverModel: func(ctx context.Context, fc *adk.FailoverContext) (
            model.BaseChatModel, []*schema.Message, error,
        ) {
            return fallbackModel, nil, nil // nil messages: use the original input
        },
    },
})
Note: model.BaseChatModel is a type alias for model.BaseModel[*schema.Message]; the two can be used interchangeably.
Transforming Input During Failover
When the backup model doesn't support certain features (e.g., image input):
ModelFailoverConfig: &adk.ModelFailoverConfig{
    MaxRetries: 1,
    ShouldFailover: func(_ context.Context, _ *schema.Message, _ error) bool {
        return true
    },
    GetFailoverModel: func(_ context.Context, fc *adk.FailoverContext) (
        model.BaseChatModel, []*schema.Message, error,
    ) {
        // Filter out image content, downgrade to text-only model
        return textModel, filterTextOnly(fc.InputMessages), nil
    },
},
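The guide does not define `filterTextOnly`. One possible shape for such a helper is sketched below. The `message` and `part` types, including the `Kind` discriminator, are hypothetical stand-ins: the real *schema.Message represents multimodal content differently, so the field access would need to be adapted.

```go
package main

import "fmt"

// Stand-in types for illustration only; the real *schema.Message has a
// different layout for multimodal content.
type part struct {
	Kind string // "text" or "image" (hypothetical discriminator)
	Text string
}

type message struct {
	Role  string
	Parts []part
}

// filterTextOnly returns a copy of msgs with every non-text part removed.
// The input slice and its messages are not modified, so the original
// input remains available for later attempts.
func filterTextOnly(msgs []*message) []*message {
	out := make([]*message, 0, len(msgs))
	for _, m := range msgs {
		cp := &message{Role: m.Role}
		for _, p := range m.Parts {
			if p.Kind == "text" {
				cp.Parts = append(cp.Parts, p)
			}
		}
		out = append(out, cp)
	}
	return out
}

func main() {
	in := []*message{{Role: "user", Parts: []part{
		{Kind: "text", Text: "Describe this"},
		{Kind: "image"},
	}}}
	out := filterTextOnly(in)
	fmt.Println(len(in[0].Parts), len(out[0].Parts))
}
```

Copying instead of mutating matters here because FailoverContext.InputMessages is documented as the original input before transformation.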
Combining with Retry
Failover and Retry compose orthogonally. Semantics: each model first retries according to the Retry strategy; after retries are exhausted, Failover switches to a different model.
agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Model: primaryModel,
    // ...
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 2,
        IsRetryAble: func(_ context.Context, err error) bool {
            return isTransientError(err)
        },
    },
    ModelFailoverConfig: &adk.ModelFailoverConfig{
        MaxRetries: 1,
        ShouldFailover: func(_ context.Context, _ *schema.Message, err error) bool {
            // err is *RetryExhaustedError at this point
            return true
        },
        GetFailoverModel: func(_ context.Context, _ *adk.FailoverContext) (
            model.BaseChatModel, []*schema.Message, error,
        ) {
            return fallbackModel, nil, nil
        },
    },
})
Streaming Failover Behavior
| Scenario | Behavior |
| --- | --- |
| Stream() initialization failure | Same as Generate, directly triggers failover evaluation |
| Mid-stream error | Received chunks are concatenated into LastOutputMessage and passed to ShouldFailover; after deciding to failover, the current stream is closed and restarted with the new model |
| Client impact | Events already sent during the failed attempt are not retracted. Clients should reset partial results or deduplicate by metadata when receiving a new stream round |
Note: ErrStreamCanceled (caller actively abandons the stream) does not trigger failover and returns immediately.
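On the client side, resetting partial results could look like the sketch below. It assumes each received chunk carries some metadata that identifies the stream round. The `chunk` type and its `Attempt` field are hypothetical; this guide does not specify which metadata the events actually expose.

```go
package main

import (
	"fmt"
	"strings"
)

// chunk is a hypothetical client-side event; Attempt is an assumed piece of
// metadata identifying which stream round produced the text.
type chunk struct {
	Attempt int
	Text    string
}

// assemble keeps only the text of the latest attempt, discarding partial
// output from earlier attempts that failed mid-stream.
func assemble(chunks []chunk) string {
	var sb strings.Builder
	current := -1
	for _, c := range chunks {
		if c.Attempt != current {
			sb.Reset() // a new round started: drop stale partial output
			current = c.Attempt
		}
		sb.WriteString(c.Text)
	}
	return sb.String()
}

func main() {
	events := []chunk{{0, "Hel"}, {0, "lo wo"}, {1, "Hello "}, {1, "world"}}
	fmt.Println(assemble(events)) // prints "Hello world"
}
```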
Model Call Chain Execution Order
Position of Failover in the wrapper chain (outer to inner):
1. AgentMiddleware.BeforeChatModel
2. ChatModelAgentMiddleware.BeforeModelRewriteState
3. failoverModelWrapper (failover happens at this layer)
4. retryModelWrapper (internal retry within each failover model)
5. eventSenderModelWrapper
6. ChatModelAgentMiddleware.WrapModel (first registered = outermost)
7. callbackInjectionModelWrapper (handled internally by failoverProxyModel when failover is enabled)
8. failoverProxyModel / Model.Generate|Stream
9. ChatModelAgentMiddleware.AfterModelRewriteState
10. AgentMiddleware.AfterChatModel
Important Notes
- Required field validation: Both ShouldFailover and GetFailoverModel are required when configuring ModelFailoverConfig; missing either causes NewChatModelAgent to return an error. The Model field is always required.
- Attempt numbering: FailoverAttempt starts from 1. A single Model call executes at most 1 + MaxRetries times (1 initial + up to MaxRetries failovers).
- Input messages: When GetFailoverModel returns nil messages, the original input is used; when returning non-nil, it replaces the original input.
- Error type when combined with Retry: ShouldFailover and FailoverContext.LastErr receive *RetryExhaustedError; the original error is obtained via RetryExhaustedError.LastErr.