AnyInt Docs
Models

Image Input

The current published catalog shows image-aware behavior most clearly in provider-compatible request bodies rather than in one generic vision endpoint.

The current published catalog shows image-aware behavior most clearly in provider-compatible request bodies rather than in one generic vision endpoint.

Image input today

Route familyWhat it supports
Anthropic-compatible messagesimage input plus text instructions in the same message
Gemini image generationtext-to-image and mixed text-plus-image output
DashScope image generationprompt-driven asset generation with image-specific parameters

Anthropic-style image input

The clearest published image-input route in the current catalog is:

POST /anthropic/v1/messages

In that request body, messages[].content can be an array of blocks such as:

  • image
  • text

That makes it a good fit for captioning, scene understanding, or document-like visual reasoning where your application already uses Claude-style content blocks.

Practical guidance

  • Use provider-native request shapes for image input instead of trying to force everything into one generic schema
  • If your application already uses Claude-style content blocks, Anthropic-compatible routes are the cleanest fit
  • If your application is generating or editing visuals, use the Gemini or DashScope media pages in this docs set

On this page