Qwen3 VL Embedding 8B

Summary: Qwen3 VL Embedding 8B is a multimodal embedding model from Alibaba's Qwen team that generates semantic vector representations from text and images. It supports more than 30 languages and a 32,768-token context window, and is suited to multimodal search, visual document retrieval, and cross-modal semantic matching, for example image-text retrieval, screenshot search, and PDF document discovery.

Intelligence: High
Speed: Medium
Sovereignty: Medium
Input: Text, Image
Output: Number Vector

Central parameters

Description: Multimodal embedding model by Alibaba's Qwen team, generating 4096-dimensional vectors from text and image inputs across 30+ languages.

Model identifier: Qwen/Qwen3-VL-Embedding-8B

IONOS CLOUD AI Model Hub Lifecycle and Alternatives

IONOS Launch

End of Life

Alternative

Successor

May 12, 2026

N/A

Origin

Provider

Country

License

Flavor

Release

Community

-

January 8, 2026

Technology

Input Length: 32,768 tokens
Parameters: 8B
Tensor Type: bfloat16
Multilingual: Yes
Further details (input and output): Each input produces one embedding vector, regardless of modality. A document can be text-only, image-only, or a combination of text and image; each produces a single vector of up to 4096 dimensions.
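Because every input yields a single vector whatever its modality, vectors from text and image inputs can be compared directly. The sketch below ranks candidate vectors against a query vector by cosine similarity. Cosine similarity is a common choice for embedding comparison and an assumption here, not something this page prescribes; the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return candidate indices ordered from most to least similar to query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
```

In practice `query` might be the vector for a text query and `candidates` the vectors for stored images or document pages, all produced by the same model.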

Modalities

Text: Input and output
Image: Input
Audio: Not supported

Endpoints

Chat Completions: Not supported
Embeddings: v1/embeddings
Image generation: Not supported

Features

Streaming: Not supported
Reasoning: Not supported
Tool calling: Not supported

Usage examples

Text embeddings

The following example demonstrates how to generate text embeddings using Qwen3 VL Embedding 8B.

API Endpoint: POST https://openai.inference.de-txl.ionos.com/v1/embeddings

Request:

Response:
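A minimal Python sketch of a text embedding call follows. It assumes the endpoint accepts an OpenAI-style embeddings body (`model` and `input` fields), uses Bearer-token authentication, and returns vectors under `data[i].embedding`; none of these details are confirmed on this page. The names `build_text_request` and `embed_texts` are illustrative.

```python
import json
import urllib.request

ENDPOINT = "https://openai.inference.de-txl.ionos.com/v1/embeddings"
MODEL = "Qwen/Qwen3-VL-Embedding-8B"


def build_text_request(texts, token):
    """Build (but do not send) a POST request for a list of input strings."""
    # ASSUMPTION: OpenAI-style body with "model" and "input" fields.
    body = json.dumps({"model": MODEL, "input": texts}).encode("utf-8")
    headers = {
        # ASSUMPTION: the API expects a Bearer token.
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(ENDPOINT, data=body, headers=headers, method="POST")


def embed_texts(texts, token):
    """Send the request and return one embedding vector per input string."""
    with urllib.request.urlopen(build_text_request(texts, token)) as response:
        payload = json.load(response)
    # ASSUMPTION: OpenAI-style response with a "data" list of {"embedding": [...]}.
    return [item["embedding"] for item in payload["data"]]
```

Calling `embed_texts(["first document", "second document"], token)` with a valid API token would, under these assumptions, return two vectors, one per input string.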

Multimodal embeddings

The following example demonstrates how to generate embeddings from combined text and image input using Qwen3 VL Embedding 8B.

Request:

Response:
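For combined text and image input, the image usually has to be inlined into the JSON request. The sketch below encodes image bytes as a base64 data URL, which is a standard way to do that. The shape of the input item (the `text` and `image` keys) is hypothetical and labeled as such in the code; check it against the request schema in the API reference before use. The function names are illustrative.

```python
import base64
import mimetypes

MODEL = "Qwen/Qwen3-VL-Embedding-8B"


def image_to_data_url(image_bytes, filename):
    """Encode raw image bytes as a base64 data URL, inferring the MIME type from the filename."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type from {filename!r}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_multimodal_payload(text, image_bytes, filename):
    """Build a request body that pairs one text with one image."""
    # ASSUMPTION: the "text" and "image" keys are hypothetical placeholders
    # for whatever item shape the embeddings endpoint actually expects.
    item = {"text": text, "image": image_to_data_url(image_bytes, filename)}
    return {"model": MODEL, "input": [item]}
```

The resulting dictionary would be serialized to JSON and sent with POST to the same `v1/embeddings` endpoint as in the text example.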

Rate limits

Rate limits ensure fair usage and reliable access to the AI Model Hub. Only the contract-wide rate limits apply to this model; there are no model-specific limits.
