Model Fallbacks
Fallbacks are how you keep an application usable when one model, route, or provider path is unavailable.
Good fallback patterns
- primary model plus one lower-cost alternative
- primary premium model plus one stable baseline
- one route family per workload, with a documented escape hatch
What to avoid
- silent fallback that changes output quality without observability
- fallback chains that cross incompatible request shapes
- using fallback as a substitute for real entitlement checks
A practical fallback policy
- choose one primary model per workload
- pick one fallback in the same compatibility family
- log every fallback event
- expose degraded behavior clearly in internal dashboards
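The policy above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an AnyInt API: `FallbackPolicy`, `ModelUnavailable`, `call_with_fallback`, and the `send` callable are all hypothetical names, and `send` stands in for whatever client call your application already makes.

```python
import logging
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("fallback")


class ModelUnavailable(Exception):
    """Hypothetical error: the model, route, or provider path cannot serve the request."""


@dataclass(frozen=True)
class FallbackPolicy:
    workload: str   # the workload this policy applies to
    primary: str    # one primary model per workload
    fallback: str   # one fallback in the same compatibility family
    family: str     # label for the shared compatibility family


def call_with_fallback(
    policy: FallbackPolicy,
    send: Callable[[str, str], str],  # assumed signature: send(model, prompt) -> text
    prompt: str,
    events: list,
) -> tuple:
    """Try the primary model; on unavailability, record the event and use the fallback."""
    try:
        return policy.primary, send(policy.primary, prompt)
    except ModelUnavailable as exc:
        event = {
            "workload": policy.workload,
            "primary": policy.primary,
            "fallback": policy.fallback,
            "family": policy.family,
            "reason": str(exc),
        }
        events.append(event)                      # feed this list to a dashboard or metrics sink
        logger.warning("fallback used: %s", event)  # log every fallback event
        return policy.fallback, send(policy.fallback, prompt)
```

Returning the model name alongside the output lets the caller surface degraded behavior instead of hiding it. The sketch retries once only, which matches the single-fallback policy; a fallback failure propagates to the caller.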
When not to fall back
Do not fall back silently when:
- the user paid for a specific service tier
- the request contains provider-specific parameters or instructions that another route cannot honor
- billing or compliance boundaries would change
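One way to enforce these conditions is a guard that runs before any fallback is attempted. The sketch below is illustrative only: `RequestContext`, its fields, and `silent_fallback_blockers` are hypothetical names, and how you detect a pinned tier or a boundary change depends on your own account and routing data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestContext:
    # All fields are assumptions made for illustration.
    tier_pinned: bool                            # the user paid for a specific service tier
    provider_features: frozenset = frozenset()   # provider-specific features the request uses
    crosses_boundary: bool = False               # fallback would change billing or compliance scope


def silent_fallback_blockers(ctx: RequestContext, fallback_supports: frozenset) -> list:
    """Return the reasons a silent fallback is not acceptable; an empty list means none apply."""
    blockers = []
    if ctx.tier_pinned:
        blockers.append("paid service tier")
    unsupported = ctx.provider_features - fallback_supports
    if unsupported:
        blockers.append(
            "unsupported provider-specific features: " + ", ".join(sorted(unsupported))
        )
    if ctx.crosses_boundary:
        blockers.append("billing or compliance boundary")
    return blockers
```

When the list is non-empty, the caller can fail the request with a clear error or ask for explicit consent rather than switching routes quietly.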
Prompt Caching
Prompt caching reduces repeated-context cost and latency when the same prompt segments are reused. In practice, the main work is prompt design: stable prefixes are easier to reuse than constantly changing prompts.
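As a rough illustration of the prompt-design point, the sketch below keeps the reusable material in a fixed prefix and appends the per-request text last. The names `SYSTEM_INSTRUCTIONS`, `REFERENCE_CONTEXT`, `build_prompt`, and `prefix_key` are hypothetical, and the hash is only a local way to check that the prefix stays byte-identical across requests; it is not how any particular provider keys its cache.

```python
import hashlib

# Stable segments: identical on every request, so they can be reused.
SYSTEM_INSTRUCTIONS = "You are a support assistant. Answer using the policy text only."
REFERENCE_CONTEXT = "Policy: refunds are available within 30 days of purchase."


def build_prompt(user_question: str) -> tuple:
    """Return (stable_prefix, full_prompt) with the changing text placed at the end."""
    prefix = SYSTEM_INSTRUCTIONS + "\n\n" + REFERENCE_CONTEXT
    full_prompt = prefix + "\n\nQuestion: " + user_question
    return prefix, full_prompt


def prefix_key(prefix: str) -> str:
    """Fingerprint the prefix so accidental changes to it are easy to spot."""
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()
```

Putting a timestamp, request ID, or user name at the top of the prompt would change the prefix on every call, which is the pattern this layout avoids.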
Verify Your Integration
Run customer-facing checks before depending on AnyInt in production: model discovery, authentication, first requests, streaming, async tasks, callbacks, and error handling.