Guides
Guides for request behavior that cuts across API families, including streaming, tool calling, structured outputs, prompt caching, and fallbacks.
Use guides when you already know the API family you want to call, but need to make the integration production-ready. These pages cover behavior that cuts across route families: streaming, structured outputs, tool calling, prompt caching, fallback planning, and local agent integrations.
For exact request fields, always return to API Reference. A guide explains the pattern; the API page defines the public contract.
Choose by behavior
| Need | Page |
|---|---|
| Understand the feature set at a high level | Overview |
| Send partial output to users as it is generated | Streaming |
| Ask models to return predictable JSON-like output | Structured Outputs |
| Let models call functions or tools | Tool Calling |
| Reuse long prompt prefixes when the route supports it | Prompt Caching |
| Keep production workloads usable when a model path fails | Model Fallbacks |
| Configure coding agents or local tools against AnyInt | Agent Tool Integrations |
Recommended production order
| Stage | What to verify | Related page |
|---|---|---|
| First request | Key, base URL, model ID, and a minimal text response | Verify Your Integration |
| User experience | Whether the UI needs incremental chunks or a complete response | Streaming |
| Machine-readable output | Whether downstream code can safely parse the model response | Structured Outputs |
| External actions | Whether the model should call functions or tools controlled by your app | Tool Calling |
| Cost and latency | Whether repeated prompt prefixes can be reused | Prompt Caching |
| Reliability | What happens when the primary model or provider path fails | Model Fallbacks |
| Local tooling | How coding agents, CLIs, and automation should discover docs and endpoints | Agent Tool Integrations |
API family matters
These guides describe patterns, not universal request fields. Always confirm the exact payload shape in API Reference before shipping a production client.
| Family | Common guide concern | What changes by family |
|---|---|---|
| OpenAI-compatible | First integration, streaming, structured outputs, tool calling | Uses OpenAI-style messages, stream, and tool fields |
| Anthropic-compatible | Claude-style content blocks and image input | Uses Anthropic headers and content block shapes |
| Gemini-compatible | Gemini-native generation, streaming, and function declarations | Uses contents[].parts[] and route methods such as generateContent |
| Media and music APIs | Async tasks, polling, callbacks, and output URLs | Creation responses may return task IDs instead of final assets |
What to document in your own app
For each production workload, keep a small integration note next to the code that records:
- Primary API family and route.
- Model ID source and refresh process.
- Whether the request is sync, streaming, or task-based.
- Retry and fallback behavior.
- Error messages surfaced to users for auth, quota, invalid request, and temporary upstream failures.