Model Fallbacks

Fallbacks are how you keep an application usable when one model, route, or provider path is unavailable.

Use a fallback policy for reliability, not to hide uncertainty. A fallback should be explicit enough that engineering, product, and support teams can tell when the application used a secondary path and what changed for the user.

Good fallback patterns

  • primary model plus one lower-cost alternative
  • primary premium model plus one stable baseline
  • one route family per workload, with a documented escape hatch

| Decision | Recommendation |
| --- | --- |
| Scope | Define fallbacks per workload, not globally for the whole application |
| Compatibility | Prefer a fallback in the same API family and request shape |
| Trigger | Retry on temporary failures, but do not retry invalid requests or auth failures as fallbacks |
| Visibility | Log every fallback event with workload, primary model, fallback model, status, and reason |
| User experience | Decide whether users should see a degraded-mode message, especially for paid tiers |
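
The visibility recommendation lists five fields to record for every fallback event. A minimal sketch of such a record in Python follows. The `FallbackEvent` class, the `format_fallback_event` and `log_fallback_event` helpers, the logger name, and the convention of using `0` for a timeout are all assumptions made for illustration, not part of any documented API.

```python
import json
import logging
from dataclasses import asdict, dataclass

# Hypothetical logger name; use whatever your application already configures.
logger = logging.getLogger("fallback")


@dataclass(frozen=True)
class FallbackEvent:
    """One fallback occurrence, carrying the five recommended fields."""
    workload: str
    primary_model: str
    fallback_model: str
    status: int  # HTTP status of the failed primary attempt (assumed: 0 for a timeout)
    reason: str


def format_fallback_event(event: FallbackEvent) -> str:
    """Serialize the event as one JSON line with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)


def log_fallback_event(event: FallbackEvent) -> None:
    """Emit the event at WARNING level so it is easy to filter and alert on."""
    logger.warning("fallback_used %s", format_fallback_event(event))
```

Emitting one structured line per event keeps the data easy to aggregate by workload or by reason in whatever log tooling is already in place.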

Example policy

| Workload | Primary | Fallback | Allowed trigger |
| --- | --- | --- | --- |
| Support chat | premium chat model | stable baseline chat model | transient 5xx, timeout, or upstream unavailable |
| Batch extraction | low-cost extraction model | same-family baseline model | temporary provider failure |
| Code assistant | stronger code-capable model | none without user notice | quality-sensitive paid workflow |
| Image understanding | vision-capable model | same-family vision-capable model | transient provider failure only |
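
One way to keep a policy like this reviewable is to store it as data rather than scattering it through request code. The sketch below encodes the four rows in Python. The model names, trigger labels, and the `WorkloadPolicy` and `fallback_for` names are placeholders invented for illustration; substitute the identifiers your deployment actually uses.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class WorkloadPolicy:
    primary: str
    fallback: Optional[str]  # None means no automatic fallback
    allowed_triggers: FrozenSet[str]


# Placeholder model names and trigger labels mirroring the example policy.
POLICIES = {
    "support_chat": WorkloadPolicy(
        "premium-chat", "baseline-chat",
        frozenset({"5xx", "timeout", "upstream_unavailable"}),
    ),
    "batch_extraction": WorkloadPolicy(
        "low-cost-extraction", "baseline-extraction",
        frozenset({"provider_failure"}),
    ),
    # Quality-sensitive paid workflow: never switch without user notice.
    "code_assistant": WorkloadPolicy("code-model", None, frozenset()),
    "image_understanding": WorkloadPolicy(
        "vision-model", "baseline-vision",
        frozenset({"provider_failure"}),
    ),
}


def fallback_for(workload: str, trigger: str) -> Optional[str]:
    """Return the fallback model if the policy allows this trigger, else None."""
    policy = POLICIES.get(workload)
    if policy is None or policy.fallback is None:
        return None
    if trigger not in policy.allowed_triggers:
        return None
    return policy.fallback
```

With this table, `fallback_for("code_assistant", "timeout")` returns `None`, so the caller has to surface the failure instead of switching models quietly.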

What to avoid

  • silent fallback that changes output quality without observability
  • fallback chains that cross incompatible request shapes
  • using fallback as a substitute for real entitlement checks

A practical fallback policy

  1. Choose one primary model per workload.
  2. Pick one fallback in the same compatibility family.
  3. Verify the fallback request shape with the same payload class.
  4. Log every fallback event.
  5. Expose degraded behavior clearly in internal dashboards.
  6. Review fallback output quality before enabling it for user-visible workflows.
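
The runtime part of these steps (one primary, one fallback, a logged event, and a visible degraded flag) can be sketched as a small wrapper. Everything named here is hypothetical: `send` stands in for whatever client call your application makes, and `TransientError` is an assumed exception type that the client raises only for temporary failures.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("fallback")


class TransientError(Exception):
    """Assumed to be raised by `send` for a 5xx, timeout, or upstream outage."""


def call_with_fallback(
    send: Callable[[str, Dict[str, Any]], Any],
    workload: str,
    primary: str,
    fallback: str,
    payload: Dict[str, Any],
) -> Dict[str, Any]:
    """Try the primary model; on a transient failure, log and try one fallback.

    The same payload goes to both models, which only works if they share a
    request shape. Non-transient errors propagate untouched.
    """
    try:
        output = send(primary, payload)
        return {"model": primary, "degraded": False, "output": output}
    except TransientError as exc:
        logger.warning(
            "fallback_used workload=%s primary=%s fallback=%s reason=%s",
            workload, primary, fallback, exc,
        )
        output = send(fallback, payload)
        return {"model": fallback, "degraded": True, "output": output}
```

Returning the `degraded` flag alongside the output lets dashboards and, where appropriate, the user interface show that a secondary path served the request.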

Error handling boundary

| Error class | Fallback? | Reason |
| --- | --- | --- |
| 400 invalid request | No | The payload must be fixed before retrying |
| 401 missing or invalid key | No | This is an authentication problem |
| 403 unavailable model or policy limit | Usually no | Check entitlement before changing behavior |
| 429 rate or quota limit | Maybe | Back off first; fallback only if policy allows it |
| 5xx or timeout | Often yes | Temporary failures are the most common fallback trigger |
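
The boundary above can be expressed as a single classification function, so the decision lives in one place. This is a sketch: the `Decision` enum and `fallback_decision` function are illustrative names, a timeout is represented as `None`, and treating every status not listed in the table as "no" is a conservative assumption rather than a documented rule.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    NO = "no"
    USUALLY_NO = "usually no"
    MAYBE = "maybe"
    OFTEN_YES = "often yes"


def fallback_decision(status: Optional[int]) -> Decision:
    """Map an HTTP status (or None for a timeout) to a fallback decision."""
    if status is None or 500 <= status <= 599:
        return Decision.OFTEN_YES   # temporary failure
    if status == 429:
        return Decision.MAYBE       # back off first, then consult policy
    if status == 403:
        return Decision.USUALLY_NO  # check entitlement before changing behavior
    # 400, 401, and (by assumption) any other status: fix the request or key.
    return Decision.NO
```

Callers can then treat only `Decision.OFTEN_YES` as an automatic trigger and route `MAYBE` through backoff plus an explicit policy check.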

When not to fall back

Do not fall back silently when:

  • the user paid for a specific service tier
  • the request contains provider-specific instructions that another route cannot honor
  • billing or compliance boundaries would change
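
These three conditions can be checked before any fallback is attempted. The sketch below assumes the application can describe each request with three booleans; the `RequestContext` fields and the `silent_fallback_blockers` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RequestContext:
    paid_specific_tier: bool             # the user paid for a specific service tier
    provider_specific_request: bool      # content another route cannot honor
    crosses_billing_or_compliance: bool  # fallback would change a boundary


def silent_fallback_blockers(ctx: RequestContext) -> List[str]:
    """Return the reasons a silent fallback is not allowed (empty if none)."""
    blockers = []
    if ctx.paid_specific_tier:
        blockers.append("paid_specific_tier")
    if ctx.provider_specific_request:
        blockers.append("provider_specific_request")
    if ctx.crosses_billing_or_compliance:
        blockers.append("billing_or_compliance_boundary")
    return blockers
```

Returning the reasons instead of a bare boolean means the same values can go into the fallback log or a user-facing degraded-mode notice.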
