Model Fallbacks

Fallbacks are how you keep an application usable when one model, route, or provider path is unavailable.

Good fallback patterns

primary model plus one lower-cost alternative
primary premium model plus one stable baseline
one route family per workload, with a documented escape hatch
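As a rough illustration of these patterns, a fallback policy can be written down as a small table with one entry per workload. This is only a sketch: the workload names, model names, and the `FallbackPolicy` type are hypothetical placeholders, not identifiers from any real API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FallbackPolicy:
    primary: str   # model used on the normal path
    fallback: str  # single alternative used when the primary is unavailable
    family: str    # compatibility family shared by both models


# Hypothetical workloads and model names, for illustration only.
POLICIES = {
    # Premium primary plus one stable baseline.
    "chat-support": FallbackPolicy(
        primary="premium-chat-model",
        fallback="baseline-chat-model",
        family="chat",
    ),
    # Primary plus one lower-cost alternative.
    "batch-summaries": FallbackPolicy(
        primary="standard-summary-model",
        fallback="low-cost-summary-model",
        family="chat",
    ),
}


def models_for(workload: str) -> list[str]:
    """Return the ordered models to try for a workload: primary, then fallback."""
    policy = POLICIES[workload]
    return [policy.primary, policy.fallback]
```

Keeping the table to exactly one fallback per workload makes the escape hatch easy to document and review.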

What to avoid

silent fallback that changes output quality without observability
fallback chains that cross incompatible request shapes
using fallback as a substitute for real entitlement checks

A practical fallback policy

choose one primary model per workload
pick one fallback in the same compatibility family
log every fallback event
expose degraded behavior clearly in internal dashboards
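The policy above could be sketched roughly as follows. The `send` callable, the `ModelUnavailable` exception, and the `served_by` and `degraded` fields are assumptions made for this example; substitute whatever client and error types your integration actually uses.

```python
import logging
from typing import Callable

logger = logging.getLogger("fallback")


class ModelUnavailable(Exception):
    """Hypothetical error raised by `send` when a model or route is unavailable."""


def call_with_fallback(
    send: Callable[[str, dict], dict],
    request: dict,
    primary: str,
    fallback: str,
) -> dict:
    """Try the primary model, then exactly one fallback, logging the switch."""
    try:
        response = send(primary, request)
        return {**response, "served_by": primary, "degraded": False}
    except ModelUnavailable as exc:
        # Log every fallback event so it can be counted and alerted on.
        logger.warning(
            "fallback event: primary=%s fallback=%s reason=%s",
            primary, fallback, exc,
        )
        # A failure here propagates to the caller; there is no second fallback.
        response = send(fallback, request)
        # Mark the response as degraded so dashboards can surface it.
        return {**response, "served_by": fallback, "degraded": True}
```

Tagging the response with the model that served it, rather than only logging, lets downstream code and dashboards distinguish degraded results from normal ones.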

When not to fall back

Do not fall back silently when:

the user paid for a specific service tier
the request contains provider-specific directives that another route cannot honor
billing or compliance boundaries would change
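One way to enforce these conditions is a guard that runs before any fallback attempt. The sketch below assumes your application can already answer each question for a given request; the `RequestContext` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestContext:
    # Hypothetical flags; derive them from your own entitlement and routing data.
    paid_for_specific_tier: bool
    uses_provider_specific_features: bool
    crosses_billing_or_compliance_boundary: bool


def silent_fallback_allowed(ctx: RequestContext) -> bool:
    """Return True only when none of the blocking conditions apply."""
    return not (
        ctx.paid_for_specific_tier
        or ctx.uses_provider_specific_features
        or ctx.crosses_billing_or_compliance_boundary
    )
```

When the guard returns False, the caller can surface the failure or ask for explicit consent instead of switching models quietly.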
