# Fallback Models

> Automatically switch to backup models when the primary model hits rate limits, outages, or context window limits.

Pass `fallback_models` to any Agent or Team. If the primary model fails after exhausting its retries, each fallback is tried in order until one succeeds.

```python
from agno.agent import Agent
from agno.models.anthropic import Claude
from agno.models.openai import OpenAIChat

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    fallback_models=[Claude(id="claude-sonnet-4-20250514")],
)
```

If `gpt-4o` fails after exhausting its own retries, Claude is tried automatically.

[Model strings](/models/model-as-string) work too:

```python
from agno.agent import Agent

agent = Agent(
    model="openai:gpt-4o",
    fallback_models=["anthropic:claude-sonnet-4-20250514"],
)
```

## Usage with Teams

Fallback models apply to the team leader's model calls. Member agents keep their own models and are not affected by the leader's fallback config.

```python
from agno.agent import Agent
from agno.models.anthropic import Claude
from agno.models.openai import OpenAIChat
from agno.team import Team

researcher = Agent(
    name="Researcher",
    role="You research topics and provide detailed findings.",
    model=OpenAIChat(id="gpt-4o-mini"),
)

writer = Agent(
    name="Writer",
    role="You write clear, concise summaries from research findings.",
    model=OpenAIChat(id="gpt-4o-mini"),
)

team = Team(
    name="Research Team",
    model=OpenAIChat(id="gpt-4o"),
    fallback_models=[Claude(id="claude-sonnet-4-20250514")],
    members=[researcher, writer],
    markdown=True,
)
```

## Error-Specific Fallbacks

`FallbackConfig` lets you route different error types to different fallback models. Instead of a flat list, you specify which models to try for rate limits, context window overflows, and general errors separately.

```python
from agno.agent import Agent
from agno.models.fallback import FallbackConfig
from agno.models.anthropic import Claude
from agno.models.openai import OpenAIChat

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    fallback_config=FallbackConfig(
        # On rate-limit (429/529) errors
        on_rate_limit=[
            OpenAIChat(id="gpt-4o-mini"),
            Claude(id="claude-sonnet-4-20250514"),
        ],
        # On context-window-exceeded errors
        on_context_overflow=[
            Claude(id="claude-sonnet-4-20250514"),
        ],
        # General fallback for any other retryable error
        on_error=[
            Claude(id="claude-sonnet-4-20250514"),
        ],
    ),
)
```

### Error routing

When the primary model fails, the error is classified and routed to the matching fallback list:

| Error Type              | Fallback List         | Example                                   |
| ----------------------- | --------------------- | ----------------------------------------- |
| Rate limit (429/529)    | `on_rate_limit`       | Provider throttling, Anthropic overloaded |
| Context window exceeded | `on_context_overflow` | Input too long for model's context window |
| Other retryable errors  | `on_error`            | Server errors (5xx), network failures     |

If a specific list (like `on_rate_limit`) is empty, `on_error` is used as a catch-all.

Non-retryable client errors like 400, 401, 403, 404, and 422 are **not** caught by fallback. These indicate configuration problems (bad API key, invalid request) that need to be fixed rather than masked by switching models.
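
The routing rules above can be sketched in plain Python. This is an illustration of the documented behavior, not Agno's internal code: `RoutingConfig`, `select_fallbacks`, and the `is_context_overflow` flag are hypothetical names, model ids are plain strings, and a real classifier would likely inspect provider-specific exceptions rather than a bare status code.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Client errors the docs describe as non-retryable (never routed to a fallback).
NON_RETRYABLE = {400, 401, 403, 404, 422}
# Status codes the docs treat as rate limiting.
RATE_LIMIT = {429, 529}


@dataclass
class RoutingConfig:
    """Hypothetical stand-in for FallbackConfig, holding model ids as strings."""

    on_error: List[str] = field(default_factory=list)
    on_rate_limit: List[str] = field(default_factory=list)
    on_context_overflow: List[str] = field(default_factory=list)


def select_fallbacks(
    config: RoutingConfig,
    status_code: Optional[int],
    is_context_overflow: bool = False,
) -> List[str]:
    """Return the fallback list for an error, or [] if no fallback applies."""
    # Assumption of this sketch: context overflow is checked first, because
    # some providers report it with a 400 status.
    if is_context_overflow:
        return config.on_context_overflow or config.on_error
    if status_code in NON_RETRYABLE:
        return []
    if status_code in RATE_LIMIT:
        # An empty specific list falls through to the catch-all.
        return config.on_rate_limit or config.on_error
    # Everything else (5xx, network failures with no status code).
    return config.on_error


config = RoutingConfig(on_error=["claude"], on_rate_limit=["gpt-4o-mini", "claude"])
print(select_fallbacks(config, 429))  # ['gpt-4o-mini', 'claude']
print(select_fallbacks(config, 503))  # ['claude']
print(select_fallbacks(config, 401))  # []
```

Because `on_context_overflow` is empty in this example, a context overflow would be routed to `on_error`, matching the catch-all rule.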

## Fallback Callback

Use the `callback` parameter to get notified whenever a fallback model is activated. This is useful for logging, metrics, or alerting.

```python
from agno.agent import Agent
from agno.models.fallback import FallbackConfig
from agno.models.anthropic import Claude
from agno.models.openai import OpenAIChat


def on_fallback(primary_model_id: str, fallback_model_id: str, error: Exception) -> None:
    print(f"[fallback] {primary_model_id} -> {fallback_model_id} (reason: {error})")


agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    fallback_config=FallbackConfig(
        on_error=[Claude(id="claude-sonnet-4-20250514")],
        callback=on_fallback,
    ),
)
```

The callback fires after the fallback model succeeds. For streaming calls, it fires after the full stream completes.
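
For metrics or alerting, the callback can do more than print. The sketch below keeps the documented `(primary_model_id, fallback_model_id, error)` signature and uses only the standard library; the `fallback_counts` counter and the `"fallback"` logger name are illustrative choices, not part of Agno.

```python
import logging
from collections import Counter

logger = logging.getLogger("fallback")

# Activations per (primary, fallback) pair; swap in a real metrics client as needed.
fallback_counts: Counter = Counter()


def on_fallback(primary_model_id: str, fallback_model_id: str, error: Exception) -> None:
    """Count the activation and log which error triggered it."""
    fallback_counts[(primary_model_id, fallback_model_id)] += 1
    logger.warning(
        "fallback activated: %s -> %s (%s: %s)",
        primary_model_id,
        fallback_model_id,
        type(error).__name__,
        error,
    )
```

Pass it as `callback=on_fallback`, exactly as in the example above.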

## Retry vs. Fallback

Retry and fallback are separate layers. Retry happens inside each model. Fallback only triggers after the primary model's retry loop is fully exhausted.

```
Primary model
  └── _invoke_with_retry()        # retries N times (per model config)
On failure
  └── classify error type
  └── select matching fallback list
  └── try each fallback in order
        └── fallback._invoke_with_retry()   # each fallback retries independently
```

Each model controls its own retry behavior:

```python
from agno.agent import Agent
from agno.models.anthropic import Claude
from agno.models.openai import OpenAIChat

agent = Agent(
    model=OpenAIChat(id="gpt-4o", retries=3, exponential_backoff=True),
    fallback_models=[
        Claude(id="claude-sonnet-4-20250514", retries=2),
    ],
)
```

The primary model is configured with `retries=3` and exponential backoff. Only after those retries are exhausted does the fallback kick in, and it gets its own budget of `retries=2`.
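
To see the two layers interact without calling a real provider, here is a small simulation. Everything in it is hypothetical (`FlakyModel`, `invoke_with_fallback`, and the `max_attempts` parameter are stand-ins, not Agno classes), and it models each model's retry budget as a total attempt count to keep the arithmetic explicit.

```python
from typing import List, Optional


class FlakyModel:
    """Hypothetical stand-in for a model: fails a fixed number of calls, then succeeds."""

    def __init__(self, model_id: str, failures: int, max_attempts: int) -> None:
        self.id = model_id
        self.failures_left = failures
        self.max_attempts = max_attempts
        self.calls = 0

    def _call_once(self, prompt: str) -> str:
        self.calls += 1
        if self.failures_left > 0:
            self.failures_left -= 1
            raise ConnectionError(f"{self.id} unavailable")
        return f"{self.id}: reply to {prompt!r}"

    def invoke_with_retry(self, prompt: str) -> str:
        # Inner layer: this model's own attempt budget.
        last_error: Optional[Exception] = None
        for _ in range(self.max_attempts):
            try:
                return self._call_once(prompt)
            except ConnectionError as exc:
                last_error = exc
        raise last_error or RuntimeError("max_attempts must be at least 1")


def invoke_with_fallback(primary: FlakyModel, fallbacks: List[FlakyModel], prompt: str) -> str:
    # Outer layer: only reached once the primary's retry loop has given up.
    try:
        return primary.invoke_with_retry(prompt)
    except ConnectionError as exc:
        last_error: Exception = exc
    for model in fallbacks:
        try:
            return model.invoke_with_retry(prompt)
        except ConnectionError as exc:
            last_error = exc
    raise last_error


primary = FlakyModel("primary", failures=10, max_attempts=3)
backup = FlakyModel("backup", failures=1, max_attempts=2)
result = invoke_with_fallback(primary, [backup], "hi")
print(result)         # backup: reply to 'hi'
print(primary.calls)  # 3
print(backup.calls)   # 2
```

The primary uses its full budget before the backup is touched, and the backup's single transient failure is absorbed by its own retry loop rather than ending the chain.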

## Streaming

Fallback works with streaming responses. If the primary model fails mid-stream, the fallback model takes over and the response content is reset so the consumer receives a clean response from the fallback model only.
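
One way to picture the reset is a stream of events in which a marker tells the consumer to discard partial output. This is a conceptual sketch only: the `("content", ...)` and `("reset", ...)` event tuples, `stream_with_fallback`, and `collect` are invented for illustration, and this page does not specify the event types or mechanism Agno actually uses.

```python
from typing import Callable, Iterator, List, Tuple

Event = Tuple[str, str]  # ("content", chunk) or ("reset", reason)


def stream_with_fallback(
    primary: Callable[[], Iterator[str]],
    fallback: Callable[[], Iterator[str]],
) -> Iterator[Event]:
    """Yield content chunks; emit a reset event if the primary fails mid-stream."""
    try:
        for chunk in primary():
            yield ("content", chunk)
        return
    except ConnectionError as exc:
        # Signal that everything streamed so far should be thrown away.
        yield ("reset", str(exc))
    for chunk in fallback():
        yield ("content", chunk)


def collect(events: Iterator[Event]) -> str:
    """Consumer side: accumulate chunks and clear the buffer on reset."""
    parts: List[str] = []
    for kind, payload in events:
        if kind == "reset":
            parts.clear()
        else:
            parts.append(payload)
    return "".join(parts)


def broken_primary() -> Iterator[str]:
    yield "Hel"
    raise ConnectionError("stream dropped")


def healthy_fallback() -> Iterator[str]:
    yield "Hello, "
    yield "world"


text = collect(stream_with_fallback(broken_primary, healthy_fallback))
print(text)  # Hello, world
```

The partial `"Hel"` from the failed primary never reaches the final text, which is the behavior described above.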

## Parameters

Available on both `Agent` and `Team`:

| Parameter         | Type                 | Description                                                                         |
| ----------------- | -------------------- | ----------------------------------------------------------------------------------- |
| `fallback_models` | `List[Model \| str]` | Models tried in order on any retryable failure. Shorthand for `FallbackConfig(on_error=...)`. |
| `fallback_config` | `FallbackConfig`     | Error-specific routing. Takes precedence over `fallback_models` if both are set.    |

### FallbackConfig

| Field                 | Type                                    | Description                                                                                         |
| --------------------- | --------------------------------------- | --------------------------------------------------------------------------------------------------- |
| `on_error`            | `List[Model \| str]`                    | General fallback for any retryable error.                                                           |
| `on_rate_limit`       | `List[Model \| str]`                    | Fallback for rate-limit (429/529) errors. Falls back to `on_error` if empty.                        |
| `on_context_overflow` | `List[Model \| str]`                    | Fallback for context-window-exceeded errors. Falls back to `on_error` if empty.                     |
| `callback`            | `Callable[[str, str, Exception], None]` | Called when a fallback model is activated. Receives `(primary_model_id, fallback_model_id, error)`. |

## Developer Resources

* [Basic fallback example](https://github.com/agno-agi/agno/blob/main/cookbook/02_agents/17_fallback_models/01_basic_fallback.py)
* [Error-specific fallbacks example](https://github.com/agno-agi/agno/blob/main/cookbook/02_agents/17_fallback_models/02_error_specific_fallbacks.py)
* [Fallback callback example](https://github.com/agno-agi/agno/blob/main/cookbook/02_agents/17_fallback_models/04_fallback_callback.py)
* [Team fallback example](https://github.com/agno-agi/agno/blob/main/cookbook/03_teams/17_fallback_models/01_basic_fallback.py)