# Llama 3.1 8B

**Summary:** Llama 3.1 8B is a compact, efficient language model from Meta's Llama family, optimized for conversational agents and real-time applications. With a 128k-token context window and robust multilingual support, it delivers strong performance for chatbots, virtual assistants, and interactive applications where speed and responsiveness are crucial, while maintaining high-quality natural language understanding.

| **Intelligence** | **Speed** | **Sovereignty** | **Input** | **Output** |
| :---: | :---: | :---: | :---: | :---: |
| ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-b23196ddc0cba1be0b981aa5572379cec1538be3%2Fai-model-hub-intelligence.png?alt=media) | ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-be3201cb2eba83650220699adf5b3d9120c83377%2Fai-model-hub-speed.png?alt=media) ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-be3201cb2eba83650220699adf5b3d9120c83377%2Fai-model-hub-speed.png?alt=media) ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-be3201cb2eba83650220699adf5b3d9120c83377%2Fai-model-hub-speed.png?alt=media) | ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-2c04b225a16490c4ff8c3e062bb166f25e05e1c2%2Fai-model-hub-sovereignty.png?alt=media) | ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-cc6707e286bceb4641047e45e095950e8db880fd%2Fai-model-hub-text.png?alt=media) ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-bac2752d06f18e86dc7f0b9531ab32ec58f30aec%2Fai-model-hub-image.png?alt=media) ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-8b0332538fcac6644893a504f9fbbd1ba2b56d21%2Fai-model-hub-audio.png?alt=media) | ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-cc6707e286bceb4641047e45e095950e8db880fd%2Fai-model-hub-text.png?alt=media) ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-bac2752d06f18e86dc7f0b9531ab32ec58f30aec%2Fai-model-hub-image.png?alt=media) ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-8b0332538fcac6644893a504f9fbbd1ba2b56d21%2Fai-model-hub-audio.png?alt=media) |
| *Low* | *High* | *Low* | *Text* | *Text* |

## Central parameters

**Description:** Latest small-sized model from Meta's Llama 3.1 series with optimized architecture for efficient inference.

**Model identifier:** `meta-llama/Meta-Llama-3.1-8B-Instruct`

## IONOS AI Model Hub Lifecycle and Alternatives

| **IONOS Launch** | **End of Life** |                                                             **Alternative**                                                             | **Successor** |
| :--------------: | :-------------: | :-------------------------------------------------------------------------------------------------------------------------------------: | :-----------: |
|  *July 1, 2024*  |       N/A       | [<mark style="color:blue;">**Nemo (12B)**</mark>](https://docs.ionos.com/sections-test/guides/ai/ai-model-hub/models/llms/mistral-nemo) |               |

## Origin

|                            **Provider**                            | **Country** |                                       **License**                                      | **Flavor** |   **Release**   |
| :----------------------------------------------------------------: | :---------: | :------------------------------------------------------------------------------------: | :--------: | :-------------: |
| [<mark style="color:blue;">**Meta**</mark>](https://www.meta.com/) |     USA     | [<mark style="color:blue;">**License**</mark>](https://llama.meta.com/llama3/license/) |  Instruct  | *July 23, 2024* |

## Technology

| **Context window** | **Parameters** | **Quantization** | **Multilingual** |                                              **Further details**                                             |
| :----------------: | :------------: | :--------------: | :--------------: | :----------------------------------------------------------------------------------------------------------: |
|       *128k*       |     *8.03B*    |       *fp8*      |       *Yes*      | [<mark style="color:blue;">**Hugging Face**</mark>](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) |

## Modalities

|     **Text**     |   **Image**   |   **Audio**   |
| :--------------: | :-----------: | :-----------: |
| Input and output | Not supported | Not supported |

## Endpoints

| **Chat Completions** | **Embeddings** | **Image generation** |
| :------------------: | :------------: | :------------------: |
| `v1/chat/completions` | Not supported | Not supported |

## Features

| **Streaming** | **Reasoning** | **Tool calling** |
| :-----------: | :-----------: | :--------------: |
|   Supported   | Not supported |     Supported    |
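Streaming is enabled by setting `"stream": true` in the request body; the response then arrives as server-sent events in the OpenAI-compatible chunk format. A minimal parsing sketch for a single streamed line, assuming that chunk format (the `parse_sse_line` helper is illustrative, not part of any SDK):

```python
import json

def parse_sse_line(line: str):
    """Extract the content delta from one server-sent-event line.

    Returns the text fragment carried by a data chunk, or None for
    non-data lines and the terminating [DONE] marker.
    """
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    delta = chunk["choices"][0].get("delta", {})
    return delta.get("content")

# A chunk as it would arrive when "stream": true is set in the request.
sample = 'data: {"choices": [{"index": 0, "delta": {"content": "The waves"}}]}'
print(parse_sse_line(sample))  # prints: The waves
```

In a real client, each fragment would typically be written to the output as it arrives, so users see the answer appear incrementally.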

## Usage example

### Chat completions

The following example demonstrates how to use **Llama 3.1 8B** for conversational tasks.

**API Endpoint:** `POST https://openai.inference.de-txl.ionos.com/v1/chat/completions`

**Request:**

```json
{
  "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
  "messages": [
    {
      "role": "user",
      "content": "Compose a short poem about the sea."
    }
  ],
  "temperature": 0.7,
  "max_tokens": 100
}
```

**Response:**

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "The waves whisper secrets to the sand,\nA salt-kissed breeze sweeps across the land.\nBlue horizons stretch endlessly wide,\nWith the rhythm of the eternal tide."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 30,
    "total_tokens": 45
  }
}
```
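The request above can be sent with any HTTP client. Below is a minimal Python sketch using only the standard library; the `IONOS_API_TOKEN` environment variable is an assumption for where the bearer token is stored, not an official convention:

```python
import json
import os
import urllib.request

ENDPOINT = "https://openai.inference.de-txl.ionos.com/v1/chat/completions"

def build_payload(prompt: str, temperature: float = 0.7, max_tokens: int = 100) -> dict:
    """Assemble the chat completion request body for Llama 3.1 8B."""
    return {
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

def chat(prompt: str) -> str:
    """Send a single-turn chat completion request and return the reply text."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Token location is an assumption; adapt to your setup.
            "Authorization": f"Bearer {os.environ['IONOS_API_TOKEN']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `chat("Compose a short poem about the sea.")` sends the same request shown above and returns the assistant message content from the response.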

## Rate limits

Rate limits ensure fair usage and reliable access to the AI Model Hub. Beyond the [<mark style="color:blue;">contract-wide rate limits</mark>](https://docs.ionos.com/sections-test/guides/ai/ai-model-hub/how-tos/rate-limits), no model-specific limits apply to this model.
