# Qwen3 VL Embedding 8B

**Summary:** Qwen3 VL Embedding 8B is a multimodal embedding model by Alibaba's Qwen team that generates semantic vector representations from both text and images. Supporting over 30 languages and a 32,768-token context window, this model excels in multimodal search, visual document retrieval, and cross-modal semantic matching, making it ideal for applications that require understanding across both textual and visual content such as image-text retrieval, screenshot search, and PDF document discovery.

|                                          **Intelligence**                                          |                             **Speed**                             |                          **Sovereignty**                          |                                              **Input**                                             |                                             **Output**                                             |
| :------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------: | :---------------------------------------------------------------: | :------------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------: |
| ![](/files/dnDi7yuqXqkBFqwaxdnm) ![](/files/dnDi7yuqXqkBFqwaxdnm) ![](/files/dnDi7yuqXqkBFqwaxdnm) | ![](/files/evfYW3bq4dTBLlZH3dQf) ![](/files/evfYW3bq4dTBLlZH3dQf) | ![](/files/bNpzGRJfez9SidEjNCoy) ![](/files/bNpzGRJfez9SidEjNCoy) | ![](/files/45qlqURbT8c2Ekr8HJfK) ![](/files/DNwY25ymIs7iVH2CHTv1) ![](/files/PRglWWEC5Zoc5fgynNLM) | ![](/files/45qlqURbT8c2Ekr8HJfK) ![](/files/0mPVwOtrYhZrpz9clC3D) ![](/files/PRglWWEC5Zoc5fgynNLM) |
|                                               *High*                                               |                              *Medium*                             |                              *Medium*                             |                                            *Text, Image*                                           |                                           *Number Vector*                                          |

## Central parameters

**Description:** Multimodal embedding model by Alibaba's Qwen team, generating 4096-dimensional vectors from text and image inputs across 30+ languages.

**Model identifier:** `Qwen/Qwen3-VL-Embedding-8B`

## IONOS CLOUD AI Model Hub Lifecycle and Alternatives

| **IONOS Launch** | **End of Life** | **Alternative** | **Successor** |
| :--------------: | :-------------: | :-------------: | :-----------: |
|  *May 12, 2026*  |       N/A       |                 |               |

## Origin

|                                   **Provider**                                   | **Country** |                                           **License**                                          | **Flavor** |    **Release**    |
| :------------------------------------------------------------------------------: | :---------: | :--------------------------------------------------------------------------------------------: | :--------: | :---------------: |
| [<mark style="color:blue;">**Qwen (Alibaba)**</mark>](https://qwenlm.github.io/) |  Community  | [<mark style="color:blue;">**Apache 2.0**</mark>](https://www.apache.org/licenses/LICENSE-2.0) |      -     | *January 8, 2026* |

## Technology

| **Input Length** | **Parameters** | **Tensor Type** | **Multilingual** |                                           **Further details**                                          |
| :--------------: | :------------: | :-------------: | :--------------: | :----------------------------------------------------------------------------------------------------: |
|      *32768*     |      *8B*      |    *bfloat16*   |       *Yes*      | [<mark style="color:blue;">**Hugging Face**</mark>](https://huggingface.co/Qwen/Qwen3-VL-Embedding-8B) |

**Input and output:** Each input produces exactly one embedding vector, regardless of modality. An input can be text-only, image-only, or a combination of text and image; in every case the result is a single vector of up to 4096 dimensions.
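Because every input maps to a single vector, inputs of different modalities can be compared directly. The following is a minimal sketch of such a comparison using cosine similarity. Cosine similarity is a common choice and an assumption here; this page does not prescribe a metric. The function name and the toy 3-dimensional vectors are illustrative only.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for embeddings; real vectors have up to 4096 dimensions.
text_vector = [0.1, 0.3, 0.5]
image_vector = [0.2, 0.6, 1.0]      # points in the same direction as text_vector
other_vector = [0.5, -0.1, -0.04]   # orthogonal to text_vector

print(cosine_similarity(text_vector, image_vector))  # close to 1: very similar
print(cosine_similarity(text_vector, other_vector))  # close to 0: unrelated
```

Values near 1 indicate closely related inputs, and values near 0 indicate unrelated ones.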

## Modalities

|     **Text**     | **Image** |   **Audio**   |
| :--------------: | :-------: | :-----------: |
| Input and output |   Input   | Not supported |

## Endpoints

| **Chat Completions** | **Embeddings** | **Image generation** |
| :------------------: | :------------: | :------------------: |
|     Not supported    |  v1/embeddings |     Not supported    |

## Features

| **Streaming** | **Reasoning** | **Tool calling** |
| :-----------: | :-----------: | :--------------: |
| Not supported | Not supported |   Not supported  |

## Usage examples

### Text embeddings

The following example demonstrates how to generate text embeddings using **Qwen3 VL Embedding 8B**.

**API Endpoint:** `POST https://openai.inference.de-txl.ionos.com/v1/embeddings`

**Request:**

```json
{
  "model": "Qwen/Qwen3-VL-Embedding-8B",
  "input": "The food was delicious and the waiter was friendly."
}
```

**Response:**

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        0.0023064255,
        -0.009327292,
        0.0028842222
      ],
      "index": 0
    }
  ],
  "model": "Qwen/Qwen3-VL-Embedding-8B",
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10
  }
}
```
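The same request can be sent from Python. The sketch below uses only the standard library and assumes bearer-token authentication through an `Authorization` header; the auth scheme, the function names, and the 30-second timeout are assumptions, not details taken from this page.

```python
import json
import urllib.request
from typing import List

ENDPOINT = "https://openai.inference.de-txl.ionos.com/v1/embeddings"
MODEL = "Qwen/Qwen3-VL-Embedding-8B"


def build_text_request(text: str, token: str) -> urllib.request.Request:
    """Build the POST request for a single text input without sending it."""
    payload = {"model": MODEL, "input": text}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def embed_text(text: str, token: str) -> List[float]:
    """Send the request and return the embedding of the single input."""
    request = build_text_request(text, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    # One input yields one entry in "data", as in the response shown above.
    return body["data"][0]["embedding"]
```

Calling `embed_text("The food was delicious and the waiter was friendly.", token)` with a valid token should return the list found under `data[0].embedding` in the response.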

### Multimodal embeddings

The following example demonstrates how to generate embeddings from combined text and image input using **Qwen3 VL Embedding 8B**.

**Request:**

```json
{
  "model": "Qwen/Qwen3-VL-Embedding-8B",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": "https://example.com/image.jpg"
          }
        },
        {
          "type": "text",
          "text": "A product image showing a laptop"
        }
      ]
    }
  ],
  "encoding_format": "float"
}
```

**Response:**

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        0.0023064255,
        -0.009327292,
        0.0028842222
      ],
      "index": 0
    }
  ],
  "model": "Qwen/Qwen3-VL-Embedding-8B",
  "usage": {
    "prompt_tokens": 261,
    "total_tokens": 261
  }
}
```
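As a sketch of how a client might assemble the multimodal request body and read the vector out of the response, consider the following. The helper names `build_multimodal_payload` and `extract_embeddings` are hypothetical, and the sample response simply mirrors the truncated three-value example above rather than a real full-length embedding.

```python
import json
from typing import Any, Dict, List

MODEL = "Qwen/Qwen3-VL-Embedding-8B"


def build_multimodal_payload(image_url: str, text: str) -> Dict[str, Any]:
    """Return a request body that combines one image and one text part."""
    return {
        "model": MODEL,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": text},
                ],
            }
        ],
        "encoding_format": "float",
    }


def extract_embeddings(response_body: Dict[str, Any]) -> List[List[float]]:
    """Return the embedding vectors from a response, ordered by their index."""
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


payload = build_multimodal_payload(
    "https://example.com/image.jpg", "A product image showing a laptop"
)

# Truncated sample mirroring the documented response shape.
sample_response = json.loads("""
{
  "object": "list",
  "data": [
    {"object": "embedding",
     "embedding": [0.0023064255, -0.009327292, 0.0028842222],
     "index": 0}
  ],
  "model": "Qwen/Qwen3-VL-Embedding-8B",
  "usage": {"prompt_tokens": 261, "total_tokens": 261}
}
""")

vectors = extract_embeddings(sample_response)
print(len(vectors), len(vectors[0]))
```

The combined image and text input yields a single entry in `data`, so `vectors` holds exactly one embedding.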

## Rate limits

Rate limits ensure fair usage and reliable access to the AI Model Hub. Only the [<mark style="color:blue;">contract-wide rate limits</mark>](/cloud/ai/ai-model-hub/how-tos/rate-limits.md) apply to this model; there are no additional model-specific limits.

