# Llama 3.3 70B

**Summary:** Llama 3.3 70B is a mid-sized language model benchmarked to deliver quality comparable to 405B models while requiring far less compute. It handles complex reasoning, nuanced language understanding, and multi-step problem-solving, which suits it to enterprise applications, chatbots, content generation, and professional AI assistants that need high-quality responses without the computational overhead of larger models.

| **Intelligence** | **Speed** | **Sovereignty** | **Input** | **Output** |
| :--------------: | :-------: | :-------------: | :-------: | :--------: |
| ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-b23196ddc0cba1be0b981aa5572379cec1538be3%2Fai-model-hub-intelligence.png?alt=media) ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-b23196ddc0cba1be0b981aa5572379cec1538be3%2Fai-model-hub-intelligence.png?alt=media) ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-b23196ddc0cba1be0b981aa5572379cec1538be3%2Fai-model-hub-intelligence.png?alt=media) | ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-be3201cb2eba83650220699adf5b3d9120c83377%2Fai-model-hub-speed.png?alt=media) ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-be3201cb2eba83650220699adf5b3d9120c83377%2Fai-model-hub-speed.png?alt=media) | ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-2c04b225a16490c4ff8c3e062bb166f25e05e1c2%2Fai-model-hub-sovereignty.png?alt=media) | ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-cc6707e286bceb4641047e45e095950e8db880fd%2Fai-model-hub-text.png?alt=media) ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-bac2752d06f18e86dc7f0b9531ab32ec58f30aec%2Fai-model-hub-image.png?alt=media) ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-8b0332538fcac6644893a504f9fbbd1ba2b56d21%2Fai-model-hub-audio.png?alt=media) | ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-cc6707e286bceb4641047e45e095950e8db880fd%2Fai-model-hub-text.png?alt=media) ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-bac2752d06f18e86dc7f0b9531ab32ec58f30aec%2Fai-model-hub-image.png?alt=media) ![](https://1737632334-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MifAzdGvKLDTtvJP8sm%2Fuploads%2Fgit-blob-8b0332538fcac6644893a504f9fbbd1ba2b56d21%2Fai-model-hub-audio.png?alt=media) |
| *High* | *Medium* | *Low* | *Text* | *Text* |

## Central parameters

**Description:** Latest text-only model from Meta with 70B parameters, benchmarked to achieve 405B-level quality at 70B inference speeds.

**Model identifier:** `meta-llama/Llama-3.3-70B-Instruct`

## IONOS AI Model Hub Lifecycle and Alternatives

| **IONOS Launch** | **End of Life** |                                                                  **Alternative**                                                                 | **Successor** |
| :--------------: | :-------------: | :----------------------------------------------------------------------------------------------------------------------------------------------: | :-----------: |
| *March 15, 2025* |       N/A       | [<mark style="color:blue;">**gpt-oss-120b**</mark>](https://docs.ionos.com/sections-test/guides/ai/ai-model-hub/models/llms/openai-gpt-oss-120b) |               |

## Origin

The model available in AI Model Hub is an optimized variant, quantized by IONOS Cloud for high performance.

### IONOS Variant

| **Provider** | **Modification** |   **Release**  |
| :----------: | :--------------: | :------------: |
|     IONOS    | FP8 Quantization | *May 20, 2025* |

### Base Model

|                            **Provider**                            | **Country** |                                        **License**                                       | **Flavor** |     **Release**    |
| :----------------------------------------------------------------: | :---------: | :--------------------------------------------------------------------------------------: | :--------: | :----------------: |
| [<mark style="color:blue;">**Meta**</mark>](https://www.meta.com/) |     USA     | [<mark style="color:blue;">**License**</mark>](https://llama.meta.com/llama3_3/license/) |  Instruct  | *December 9, 2024* |

## Technology

| **Context window** | **Parameters** | **Quantization** | **Multilingual** |                                              **Further details**                                             |
| :----------------: | :------------: | :--------------: | :--------------: | :----------------------------------------------------------------------------------------------------------: |
|       *128k*       |     *70.6B*    |       *fp8*      |       *Yes*      | [<mark style="color:blue;">**Hugging Face**</mark>](https://huggingface.co/ionos/Llama-3.3-70B-Instruct-FP8) |

## Modalities

|     **Text**     |   **Image**   |   **Audio**   |
| :--------------: | :-----------: | :-----------: |
| Input and output | Not supported | Not supported |

## Endpoints

| **Chat Completions** | **Embeddings** | **Image generation** |
| :------------------: | :------------: | :------------------: |
|  v1/chat/completions |  Not supported |     Not supported    |

## Features

| **Streaming** | **Reasoning** | **Tool calling** |
| :-----------: | :-----------: | :--------------: |
|   Supported   | Not supported |     Supported    |
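
As an illustration of what consuming a streamed reply could look like, the sketch below parses server-sent event lines. It assumes the endpoint follows the common OpenAI-compatible streaming convention (`data: {...}` lines carrying a `delta` object, terminated by `data: [DONE]`), which this page does not confirm. The function name `iter_stream_text` and the sample chunks are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data line
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hypothetical chunks, shaped like OpenAI-compatible streaming responses.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

In a real client, `lines` would be the decoded lines of the HTTP response body for a request sent with `"stream": true`, again assuming the OpenAI-compatible parameter name applies here.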

## Usage example

### Chat completions

The following example demonstrates how to use **Structured Outputs** to extract specific entities from unstructured text into a predefined JSON schema.

**API Endpoint:** `POST https://openai.inference.de-txl.ionos.com/v1/chat/completions`

**Request:**

```json
{
  "model": "meta-llama/Llama-3.3-70B-Instruct",
  "messages": [
    {
      "role": "user",
      "content": "Extract the name, age, and occupation from this text - \"John Doe is a 30-year-old software engineer from Berlin.\""
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "person_info",
      "schema": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "age": { "type": "integer" },
          "occupation": { "type": "string" }
        },
        "required": ["name", "age", "occupation"],
        "additionalProperties": false
      },
      "strict": true
    }
  },
  "temperature": 0.1,
  "max_completion_tokens": 500
}
```

**Response:**

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "meta-llama/Llama-3.3-70B-Instruct",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "{\"name\": \"John Doe\", \"age\": 30, \"occupation\": \"software engineer\"}"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 12,
    "total_tokens": 32
  }
}
```
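
The request and response above can be wired together in a few lines of Python. The following is a minimal sketch using only the standard library. It assumes the endpoint accepts a Bearer token in the `Authorization` header, as is usual for OpenAI-compatible APIs. The helper names `build_payload`, `parse_person`, and `extract_person` are hypothetical and chosen for this example.

```python
import json
import urllib.request

API_URL = "https://openai.inference.de-txl.ionos.com/v1/chat/completions"
MODEL = "meta-llama/Llama-3.3-70B-Instruct"

# Same schema as in the JSON request above.
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "occupation": {"type": "string"},
    },
    "required": ["name", "age", "occupation"],
    "additionalProperties": False,
}


def build_payload(text: str) -> dict:
    """Build the request body shown above for an arbitrary input text."""
    prompt = f'Extract the name, age, and occupation from this text - "{text}"'
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "person_info",
                "schema": PERSON_SCHEMA,
                "strict": True,
            },
        },
        "temperature": 0.1,
        "max_completion_tokens": 500,
    }


def parse_person(response: dict) -> dict:
    """Decode the JSON string that arrives in the first choice's message content."""
    return json.loads(response["choices"][0]["message"]["content"])


def extract_person(text: str, token: str) -> dict:
    """Send the request (assumes Bearer-token auth) and return the parsed fields."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as reply:
        return parse_person(json.load(reply))
```

Note that the structured result is delivered as a JSON string inside `message.content`, so it needs a second `json.loads` step after the HTTP response itself has been decoded. Calling `extract_person(text, token)` with a valid token should return a dictionary with the `name`, `age`, and `occupation` keys required by the schema.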

## Rate limits

Rate limits ensure fair usage and reliable access to the AI Model Hub. Only the [<mark style="color:blue;">contract-wide rate limits</mark>](https://docs.ionos.com/sections-test/guides/ai/ai-model-hub/how-tos/rate-limits) apply to this model; there are no model-specific limits.
