# Qwen3 VL Reranker 8B

**Summary:** Qwen3 VL Reranker 8B is a multimodal reranking model by Alibaba's Qwen team that scores how relevant each candidate document is to a query, accepting both text and images. It supports over 30 languages and a 32,768-token context window. The model is designed as the precision refinement step in two-stage retrieval pipelines, which makes it well suited to multimodal search, visual document retrieval, and cross-lingual reranking where high-precision relevance scoring is required.

|                                          **Intelligence**                                          |                             **Speed**                             |                          **Sovereignty**                          |                                              **Input**                                             |                                             **Output**                                             |
| :------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------: | :---------------------------------------------------------------: | :------------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------: |
| ![](/files/dnDi7yuqXqkBFqwaxdnm) ![](/files/dnDi7yuqXqkBFqwaxdnm) ![](/files/dnDi7yuqXqkBFqwaxdnm) | ![](/files/evfYW3bq4dTBLlZH3dQf) ![](/files/evfYW3bq4dTBLlZH3dQf) | ![](/files/bNpzGRJfez9SidEjNCoy) ![](/files/bNpzGRJfez9SidEjNCoy) | ![](/files/45qlqURbT8c2Ekr8HJfK) ![](/files/DNwY25ymIs7iVH2CHTv1) ![](/files/PRglWWEC5Zoc5fgynNLM) | ![](/files/45qlqURbT8c2Ekr8HJfK) ![](/files/0mPVwOtrYhZrpz9clC3D) ![](/files/PRglWWEC5Zoc5fgynNLM) |
|                                               *High*                                               |                              *Medium*                             |                              *Medium*                             |                                            *Text, Image*                                           |                                               *Score*                                              |

## Central parameters

**Description:** Multimodal reranking model by Alibaba's Qwen team, scoring query-document relevance from text and image inputs across 30+ languages.

**Model identifier:** `Qwen/Qwen3-VL-Reranker-8B`

## IONOS AI Model Hub Lifecycle and Alternatives

| **IONOS Launch** | **End of Life** | **Alternative** | **Successor** |
| :--------------: | :-------------: | :-------------: | :-----------: |
|  *May 12, 2026*  |       N/A       |                 |               |

## Origin

|                                   **Provider**                                   | **Country** |                                           **License**                                          | **Flavor** |    **Release**    |
| :------------------------------------------------------------------------------: | :---------: | :--------------------------------------------------------------------------------------------: | :--------: | :---------------: |
| [<mark style="color:blue;">**Qwen (Alibaba)**</mark>](https://qwenlm.github.io/) |  Community  | [<mark style="color:blue;">**Apache 2.0**</mark>](https://www.apache.org/licenses/LICENSE-2.0) |      -     | *January 8, 2026* |

## Technology

| **Input Length** | **Parameters** | **Tensor Type** | **Multilingual** |                                          **Further details**                                          |
| :--------------: | :------------: | :-------------: | :--------------: | :---------------------------------------------------------------------------------------------------: |
|      *32768*     |      *8B*      |    *bfloat16*   |       *Yes*      | [<mark style="color:blue;">**Hugging Face**</mark>](https://huggingface.co/Qwen/Qwen3-VL-Reranker-8B) |

**Image tokenisation:** Images are tokenised at 1 token per 32×32 pixel block. Images larger than 1,310,720 pixels are downscaled proportionally before tokenisation. The token cost for a given image is:

```
tokens = (resized_width / 32) × (resized_height / 32)
```

For example, a 1296×1936 px image (2,509,056 px) is downscaled to fit within 1,310,720 px, resulting in approximately 1,247 tokens. With a 32,768-token context window and \~200 tokens of query and prompt overhead, a single document can contain approximately 25 images of that resolution.
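
The calculation above can be sketched in a few lines of Python. This is an estimate under stated assumptions, not the service's actual preprocessing: the helper names are hypothetical, and rounding each resized side down to whole 32-pixel blocks is an assumption chosen because it reproduces the 1,247-token figure.

```python
import math

PATCH = 32               # pixels per block side, from the text above
MAX_PIXELS = 1_310_720   # downscaling threshold, from the text above


def image_tokens(width: int, height: int) -> int:
    """Estimate the token cost of a single image.

    Assumption: after proportional downscaling, each side is rounded
    down to a whole number of 32-pixel blocks.
    """
    w, h = float(width), float(height)
    pixels = width * height
    if pixels > MAX_PIXELS:
        # Scale both sides by the same factor so the area fits the limit.
        scale = math.sqrt(MAX_PIXELS / pixels)
        w, h = w * scale, h * scale
    cols = max(1, math.floor(w / PATCH))
    rows = max(1, math.floor(h / PATCH))
    return cols * rows


def images_per_document(tokens_per_image: int,
                        context: int = 32_768,
                        overhead: int = 200) -> int:
    """Upper bound on images per document, ignoring any per-image overhead."""
    return (context - overhead) // tokens_per_image


tokens = image_tokens(1296, 1936)
print(tokens, images_per_document(tokens))
```

Plain integer division yields an upper bound slightly above the figure of approximately 25 quoted above, so treat that figure as leaving some headroom for additional prompt tokens.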

## Modalities

|     **Text**     | **Image** |   **Audio**   |
| :--------------: | :-------: | :-----------: |
| Input and output |   Input   | Not supported |

## Endpoints

| **Chat Completions** | **Rerank** |
| :------------------: | :--------: |
|     Not supported    |  v1/rerank |

## Features

| **Streaming** | **Reasoning** | **Tool calling** |
| :-----------: | :-----------: | :--------------: |
| Not supported | Not supported |   Not supported  |

## Usage example

### Rerank

The following example demonstrates how to rerank a set of documents by relevance to a query using **Qwen3 VL Reranker 8B**.

**API Endpoint:** `POST https://openai.inference.de-txl.ionos.com/v1/rerank`

**Request:**

```json
{
  "model": "Qwen/Qwen3-VL-Reranker-8B",
  "query": "What is the capital of France?",
  "documents": [
    "Paris is the capital of France.",
    "London is the capital of England.",
    "Berlin is the capital of Germany."
  ],
  "top_n": 2
}
```

**Response:**

```json
{
  "id": "rerank-abc123",
  "object": "list",
  "model": "Qwen/Qwen3-VL-Reranker-8B",
  "results": [
    {
      "index": 0,
      "document": {
        "text": "Paris is the capital of France."
      },
      "relevance_score": 0.9234
    },
    {
      "index": 2,
      "document": {
        "text": "Berlin is the capital of Germany."
      },
      "relevance_score": 0.0456
    }
  ],
  "usage": {
    "prompt_tokens": 65,
    "total_tokens": 65
  }
}
```
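
As a minimal sketch, the request above could be sent from Python using only the standard library. Bearer-token authentication and the `IONOS_API_TOKEN` environment variable name are assumptions, and the helper names are hypothetical. The sketch also assumes that each result's `index` refers to the position of the document in the request's `documents` array, as the sample response suggests.

```python
import json
import os
import urllib.request

RERANK_URL = "https://openai.inference.de-txl.ionos.com/v1/rerank"
MODEL = "Qwen/Qwen3-VL-Reranker-8B"


def build_payload(query, documents, top_n):
    """Assemble the request body shown in the example above."""
    return {"model": MODEL, "query": query, "documents": documents, "top_n": top_n}


def rerank(payload, token):
    """POST the payload. Assumption: the API expects a Bearer token."""
    request = urllib.request.Request(
        RERANK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def ranked_texts(response, documents):
    """Pair each score with its input document, highest score first."""
    results = sorted(response["results"],
                     key=lambda r: r["relevance_score"], reverse=True)
    return [(documents[r["index"]], r["relevance_score"]) for r in results]


def main():
    docs = [
        "Paris is the capital of France.",
        "London is the capital of England.",
        "Berlin is the capital of Germany.",
    ]
    payload = build_payload("What is the capital of France?", docs, top_n=2)
    token = os.environ["IONOS_API_TOKEN"]  # hypothetical variable name
    for text, score in ranked_texts(rerank(payload, token), docs):
        print(f"{score:.4f}  {text}")
```

Calling `main()` performs the network request. Looking documents up by `index` avoids relying on the echoed `document` field, which is useful when the inputs are large.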

### Multimodal rerank

The following example demonstrates how to rerank image-only documents using **Qwen3 VL Reranker 8B**. The first document is a hosted image URL; the second is a base64-encoded image.

**API Endpoint:** `POST https://openai.inference.de-txl.ionos.com/v1/rerank`

**Request:**

```json
{
  "model": "Qwen/Qwen3-VL-Reranker-8B",
  "query": "Show me the Red Hat and IONOS partnership logo",
  "documents": [
    {
      "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": "https://www.ionos.de/newsroom/wp-content/uploads/2026/01/Red-Hat_IONOS-1024x576.png"
          }
        }
      ]
    },
    {
      "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/jpeg;base64,/9j/4AAQSkZJRgAB..."
          }
        }
      ]
    }
  ],
  "top_n": 2
}
```

**Response:**

```json
{
  "id": "rerank-def456",
  "object": "list",
  "model": "Qwen/Qwen3-VL-Reranker-8B",
  "results": [
    {
      "index": 0,
      "document": {
        "text": null,
        "multi_modal": [
          {
            "image_url": {
              "url": "https://www.ionos.de/newsroom/wp-content/uploads/2026/01/Red-Hat_IONOS-1024x576.png"
            },
            "type": "image_url"
          }
        ]
      },
      "relevance_score": 0.9187
    },
    {
      "index": 1,
      "document": {
        "text": null,
        "multi_modal": [
          {
            "image_url": {
              "url": "data:image/jpeg;base64,/9j/4AAQSkZJRgAB..."
            },
            "type": "image_url"
          }
        ]
      },
      "relevance_score": 0.0341
    }
  ],
  "usage": {
    "prompt_tokens": 820,
    "total_tokens": 820
  }
}
```
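
The two document shapes used in the request above can be built programmatically. The following sketch uses hypothetical helper names and assumes the image is available as raw bytes; it wraps either a hosted URL or a base64 data URL in the `content` structure shown in the example.

```python
import base64


def image_url_document(url: str) -> dict:
    """Wrap an image URL in the document structure from the request above."""
    return {"content": [{"type": "image_url", "image_url": {"url": url}}]}


def image_bytes_document(data: bytes, mime_type: str = "image/jpeg") -> dict:
    """Encode raw image bytes as a base64 data URL and wrap it as a document."""
    encoded = base64.b64encode(data).decode("ascii")
    return image_url_document(f"data:{mime_type};base64,{encoded}")


# The first bytes of a JPEG file header, used here only to show the encoding.
jpeg_header = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01"
print(image_bytes_document(jpeg_header)["content"][0]["image_url"]["url"])
```

In practice the bytes would come from reading the image file in binary mode, and `mime_type` should match the actual image format.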

## Rate limits

Rate limits ensure fair usage and reliable access to the AI Model Hub. Only the [<mark style="color:blue;">contract-wide rate limits</mark>](/cloud/ai/ai-model-hub/how-tos/rate-limits.md) apply to this model; there are no model-specific limits.


