# Mistral Nemo 12B

**Summary:** Mistral Nemo is a compact yet powerful 12-billion-parameter language model co-developed by Mistral AI and NVIDIA, designed for conversational agents and virtual assistants. Built by the French company Mistral AI, the model uses the Tekken tokenizer, which encodes European languages efficiently, making it well suited to applications that need fast response times together with high-quality multilingual performance across German, French, Spanish, Italian, and Portuguese.

|                   **Intelligence**                  |                                                                **Speed**                                                               |                                            **Sovereignty**                                            |                                                                 **Input**                                                                 |                                                                 **Output**                                                                |
| :-------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------------------------: |
| ![Intelligence active](/files/dnDi7yuqXqkBFqwaxdnm) | ![Speed active](/files/evfYW3bq4dTBLlZH3dQf) ![Speed active](/files/evfYW3bq4dTBLlZH3dQf) ![Speed active](/files/evfYW3bq4dTBLlZH3dQf) | ![Sovereignty active](/files/bNpzGRJfez9SidEjNCoy) ![Sovereignty active](/files/bNpzGRJfez9SidEjNCoy) | ![Text active](/files/45qlqURbT8c2Ekr8HJfK) ![Image inactive](/files/0mPVwOtrYhZrpz9clC3D) ![Audio inactive](/files/PRglWWEC5Zoc5fgynNLM) | ![Text active](/files/45qlqURbT8c2Ekr8HJfK) ![Image inactive](/files/0mPVwOtrYhZrpz9clC3D) ![Audio inactive](/files/PRglWWEC5Zoc5fgynNLM) |
|                        *Low*                        |                                                                 *High*                                                                 |                                               *Moderate*                                              |                                                                   *Text*                                                                  |                                                                   *Text*                                                                  |

## Central parameters

**Description:** Mistral Nemo succeeds Mistral 7B as the latest-generation small language model, offering improved performance with 12.2B parameters and a 128k-token context window. The model is served with fp8 quantization for efficient inference.

**Model identifier:** `mistralai/Mistral-Nemo-Instruct-2407`

## IONOS AI Model Hub Lifecycle and Alternatives

| **IONOS Launch** | **End of Life** |                                                 **Alternative**                                                | **Successor** |
| :--------------: | :-------------: | :------------------------------------------------------------------------------------------------------------: | :-----------: |
|  *May 15, 2025*  |       N/A       | [<mark style="color:blue;">**Llama 3.1 (8B)**</mark>](/cloud/ai/ai-model-hub/models/llms/meta-llama-3-1-8b.md) |      N/A      |

## Origin

|                              **Provider**                              | **Country** |                                            **License**                                           | **Flavor** |   **Release**   |
| :--------------------------------------------------------------------: | :---------: | :----------------------------------------------------------------------------------------------: | :--------: | :-------------: |
| [<mark style="color:blue;">**Mistral AI**</mark>](https://mistral.ai/) |    France   | [<mark style="color:blue;">**License**</mark>](https://www.apache.org/licenses/LICENSE-2.0.html) |  Instruct  | *July 18, 2024* |

## Technology

| **Context window** | **Parameters** | **Quantization** | **Multilingual** |                                                **Further details**                                               |
| :----------------: | :------------: | :--------------: | :--------------: | :--------------------------------------------------------------------------------------------------------------: |
|       *128k*       |     *12.2B*    |       *fp8*      |       *Yes*      | [<mark style="color:blue;">**Hugging Face**</mark>](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) |

## Modalities

|     **Text**     |   **Image**   |   **Audio**   |
| :--------------: | :-----------: | :-----------: |
| Input and output | Not supported | Not supported |

## Endpoints

| **Chat Completions** | **Embeddings** | **Image generation** |
| :------------------: | :------------: | :------------------: |
| `v1/chat/completions` |  Not supported |     Not supported    |

## Features

| **Streaming** | **Reasoning** | **Tool calling** |
| :-----------: | :-----------: | :--------------: |
|   Supported   | Not supported |     Supported    |
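Since tool calling is supported, a request can advertise callable functions in the OpenAI-style `tools` format. The sketch below builds such a payload; the `get_weather` function and its schema are purely illustrative, not part of the Model Hub API:

```python
def make_tool_request(question: str) -> dict:
    """Build a chat-completions payload that advertises one callable tool.

    The get_weather tool is a hypothetical example; the model may answer
    with a tool call instead of plain text when it decides the tool helps.
    """
    return {
        "model": "mistralai/Mistral-Nemo-Instruct-2407",
        "messages": [{"role": "user", "content": question}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # illustrative tool name
                    "description": "Look up the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }
```

When the model chooses to call the tool, the response's `finish_reason` is typically `tool_calls` rather than `stop`, and the arguments arrive as a JSON string to be parsed by the caller.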

## Usage example

### Chat completions

The following example shows how to prompt **Mistral Nemo** for an instructional task: explaining a complex concept in simple terms.

**API Endpoint:** `POST https://openai.inference.de-txl.ionos.com/v1/chat/completions`

**Request:**

```json
{
  "model": "mistralai/Mistral-Nemo-Instruct-2407",
  "messages": [
    {
      "role": "user",
      "content": "Explain the concept of quantum entanglement to a 5-year-old using simple analogies."
    }
  ],
  "temperature": 0.7,
  "top_p": 0.9,
  "n": 1,
  "stream": false,
  "max_tokens": 1000
}
```
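The same request can be sent from Python's standard library. A minimal sketch, assuming an API token is available in an `IONOS_API_TOKEN` environment variable; the Bearer header scheme is an assumption here, so check the Model Hub authentication documentation for the exact scheme:

```python
import json
import os
import urllib.request

ENDPOINT = "https://openai.inference.de-txl.ionos.com/v1/chat/completions"

# Mirrors the JSON request body shown above.
payload = {
    "model": "mistralai/Mistral-Nemo-Instruct-2407",
    "messages": [
        {
            "role": "user",
            "content": "Explain the concept of quantum entanglement "
                       "to a 5-year-old using simple analogies.",
        }
    ],
    "temperature": 0.7,
    "top_p": 0.9,
    "n": 1,
    "stream": False,
    "max_tokens": 1000,
}

def build_request(token: str) -> urllib.request.Request:
    """Assemble the POST request; Bearer auth is an assumption."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

if __name__ == "__main__":
    # Sending the request requires a valid token and network access.
    req = build_request(os.environ["IONOS_API_TOKEN"])
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```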

**Response:**

```json
{
  "id": "chatcmpl-456",
  "object": "chat.completion",
  "created": 1677652290,
  "model": "mistralai/Mistral-Nemo-Instruct-2407",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Imagine you have two magic dice. No matter how far apart they are—even if one is on Earth and the other is on Mars—if you roll a 6 on one, the other one will instantly show a 6 too! They are connected in a special way that lets them 'talk' to each other instantly."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 60,
    "total_tokens": 85
  }
}
```
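Given a response of this shape, the assistant text and token accounting can be extracted like so; a minimal sketch over a trimmed copy of the example response above:

```python
def extract_reply(response: dict) -> tuple[str, int]:
    """Return the first choice's text and the total token count."""
    choice = response["choices"][0]
    return choice["message"]["content"], response["usage"]["total_tokens"]

# Trimmed version of the example response shown above.
example = {
    "choices": [{
        "index": 0,
        "message": {"role": "assistant",
                    "content": "Imagine you have two magic dice..."},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 25, "completion_tokens": 60,
              "total_tokens": 85},
}

text, tokens = extract_reply(example)
```

Checking `finish_reason` is also worthwhile in practice: a value of `length` instead of `stop` means the reply was cut off by `max_tokens`.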

## Rate limits

Rate limits ensure fair usage and reliable access to the AI Model Hub. Beyond the [<mark style="color:blue;">contract-wide rate limits</mark>](/cloud/ai/ai-model-hub/how-tos/rate-limits.md), no model-specific limits apply.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.ionos.com/cloud/ai/ai-model-hub/models/llms/mistral-nemo.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response contains a direct answer to the question, along with relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
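Since the question travels in a query parameter, it must be URL-encoded. A minimal sketch using Python's standard library; the example question is illustrative:

```python
from urllib.parse import quote

PAGE_URL = ("https://docs.ionos.com/cloud/ai/ai-model-hub"
            "/models/llms/mistral-nemo.md")

def ask_url(question: str) -> str:
    """Build the GET URL with the question URL-encoded in `ask`."""
    return f"{PAGE_URL}?ask={quote(question)}"

# e.g. ask_url("What rate limits apply to Mistral Nemo?")
```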
