# GPT-OSS 120B

**Summary:** GPT-OSS 120B is a cutting-edge open-source Mixture of Experts (MoE) model, specifically optimized for agent workflows and complex reasoning tasks. This medium-sized model combines the efficiency of selective parameter activation with exceptional language understanding, making it ideal for research, development, and production environments where transparency, customization, and sophisticated AI capabilities are essential.

| **Intelligence** | **Speed** | **Sovereignty** | **Input** | **Output** |
| :--------------: | :-------: | :-------------: | :-------: | :--------: |
|      *High*      | *Medium*  |     *Medium*    |  *Text*   |   *Text*   |

## Central parameters

**Description:** Open-source Mixture-of-Experts architecture with efficient expert routing for optimized inference performance.

**Model identifier:** `openai/gpt-oss-120b`

## IONOS AI Model Hub Lifecycle and Alternatives

|  **IONOS Launch** | **End of Life** |                                                                  **Alternative**                                                                 | **Successor** |
| :---------------: | :-------------: | :----------------------------------------------------------------------------------------------------------------------------------------------: | :-----------: |
| *August 12, 2025* |       N/A       | [<mark style="color:blue;">**Llama 3.3 70B**</mark>](https://docs.ionos.com/sections-test/guides/ai/ai-model-hub/models/llms/meta-llama-3-3-70b) |               |

## Origin

|                            **Provider**                            | **Country** |                                             **License**                                             | **Flavor** |  **Release**  |
| :----------------------------------------------------------------: | :---------: | :-------------------------------------------------------------------------------------------------: | :--------: | :-----------: |
| [<mark style="color:blue;">**OpenAI**</mark>](https://openai.com/) |     USA     | [<mark style="color:blue;">**Apache 2.0**</mark>](https://www.apache.org/licenses/LICENSE-2.0.html) |    Base    | *August 2025* |

## Technology

| **Context window** | **Parameters** | **Quantization** | **Multilingual** |                                       **Further details**                                       |
| :----------------: | :------------: | :--------------: | :--------------: | :---------------------------------------------------------------------------------------------: |
|       *128k*       |     *120B*     |      *MXFP4*     |       *Yes*      | [<mark style="color:blue;">**Hugging Face**</mark>](https://huggingface.co/openai/gpt-oss-120b) |

## Modalities

|     **Text**     |   **Image**   |   **Audio**   |
| :--------------: | :-----------: | :-----------: |
| Input and output | Not supported | Not supported |

## Endpoints

| **Chat Completions** | **Embeddings** | **Image generation** |
| :------------------: | :------------: | :------------------: |
|  v1/chat/completions |  Not supported |     Not supported    |

## Features

| **Streaming** | **Reasoning** | **Tool calling** |
| :-----------: | :-----------: | :--------------: |
|   Supported   |   Supported   |     Supported    |

### Reasoning Example

GPT-OSS 120B supports advanced reasoning capabilities with configurable reasoning effort. Control how deeply the model thinks by setting the `reasoning_effort` parameter to `low`, `medium` (default), or `high`. The model's reasoning process is included in the output response.

**Reasoning Effort Levels:**

* **Low**: Fast responses with minimal internal reasoning. Best for straightforward questions and when speed is prioritized. Uses fewer tokens.
* **Medium** (default): Balanced approach with moderate reasoning depth. Suitable for most use cases requiring thoughtful responses.
* **High**: Deep analytical thinking with extensive internal reasoning. Ideal for complex problem-solving, mathematical proofs, and multi-step reasoning tasks. Uses more tokens due to an extended reasoning process.

Higher reasoning effort levels result in more comprehensive analysis but consume additional tokens and increase response time. The reasoning tokens are included in token usage.

#### Request

```json
{
  "stream": false,
  "model": "openai/gpt-oss-120b",
  "reasoning_effort": "low",
  "messages": [
    {
      "role": "user",
      "content": "Answer me with one letter. Maybe A."
    }
  ]
}
```

#### Response (shortened for readability)

```json
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "A",
        "reasoning": "User asks to answer with one letter, maybe A. So respond with a single letter."
      }
    }
  ]
}
```
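The request above can be sent with a short Python sketch using only the standard library. The endpoint URL matches the one documented later on this page; the `IONOS_API_TOKEN` environment variable is an assumption for illustration, not an official name.

```python
import json
import os
import urllib.request

API_URL = "https://openai.inference.de-txl.ionos.com/v1/chat/completions"


def build_reasoning_request(prompt: str, effort: str = "medium") -> dict:
    """Build a chat-completions payload with a configurable reasoning effort."""
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unsupported reasoning effort: {effort}")
    return {
        "stream": False,
        "model": "openai/gpt-oss-120b",
        "reasoning_effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str, effort: str = "medium") -> dict:
    """Send the request; assumes a Bearer token in IONOS_API_TOKEN (illustrative)."""
    payload = json.dumps(build_reasoning_request(prompt, effort)).encode()
    req = urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['IONOS_API_TOKEN']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Setting `effort="low"` reproduces the request shown above; the returned dictionary has the same shape as the shortened response, with the reasoning text in `choices[0]["message"]["reasoning"]`.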

## Usage example

### Chat completions

The following example demonstrates how to use **GPT-OSS 120B** for complex reasoning tasks, such as analyzing data trends and making predictions.

**API Endpoint:** `POST https://openai.inference.de-txl.ionos.com/v1/chat/completions`

**Request:**

```json
{
  "model": "openai/gpt-oss-120b",
  "messages": [
    {
      "role": "system",
      "content": "You are an expert data analyst. Analyze the provided data and give a summary of the key trends."
    },
    {
      "role": "user",
      "content": "Here is the sales data for Q1: Jan: $10k, Feb: $12k, Mar: $15k. Predict the trend for Q2 based on this growth."
    }
  ],
  "temperature": 0.5,
  "max_tokens": 1000
}
```

**Response:**

```json
{
  "id": "chatcmpl-890",
  "object": "chat.completion",
  "created": 1677652289,
  "model": "openai/gpt-oss-120b",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Based on the Q1 data, there is a consistent monthly growth of roughly 20-25%. If this trend continues, Q2 sales are projected to reach approximately $18k in April, $21.6k in May, and $26k in June.",
      "reasoning": "The user provided Q1 sales data (Jan: 10k, Feb: 12k, Mar: 15k). Calculating growth: Jan to Feb is +20%, Feb to Mar is +25%. I will assume a continued growth trend of roughly 20-25% for Q2 (April, May, June) to make the prediction.",
      "reasoning_content": "The user provided Q1 sales data (Jan: 10k, Feb: 12k, Mar: 15k). Calculating growth: Jan to Feb is +20%, Feb to Mar is +25%. I will assume a continued growth trend of roughly 20-25% for Q2 (April, May, June) to make the prediction."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 48,
    "total_tokens": 93
  }
}
```

### Stream chat completions

To receive responses in real time, you can use streaming. When streaming is enabled, usage statistics are not included by default; to receive usage information in the final stream chunk, explicitly set `"stream_options": {"include_usage": true}`.

**API Endpoint:** `POST https://openai.inference.de-txl.ionos.com/v1/chat/completions`

**Request:**

```json
{
  "model": "openai/gpt-oss-120b",
  "messages": [
    {
      "role": "user",
      "content": "Compose a poem about the sea."
    }
  ],
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}
```

**Response:**

```
...
data: {"choices":[],"created":1769419547,"id":"chatcmpl-3d0323bb-67f8-41c5-b757-8fc5a48230f4","model":"openai/gpt-oss-120b","object":"chat.completion.chunk","usage":{"completion_tokens":26,"prompt_tokens":78,"total_tokens":104}}

data: [DONE]
```
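A client consuming this stream splits the body into `data:` lines, stops at the `[DONE]` sentinel, and reads the usage object from the final chunk, which carries an empty `choices` list. The helper below is a minimal sketch of that parsing step; it operates on already-received lines and does not perform the HTTP request itself.

```python
import json


def parse_stream(lines):
    """Collect streamed content deltas and the final usage stats from SSE lines."""
    content_parts = []
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                content_parts.append(delta["content"])
        if chunk.get("usage"):  # present only in the final chunk
            usage = chunk["usage"]
    return "".join(content_parts), usage
```

Without `"stream_options": {"include_usage": true}` in the request, `usage` stays `None`.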

## Rate limits

Rate limits ensure fair usage and reliable access to the AI Model Hub. Beyond the [<mark style="color:blue;">contract-wide rate limits</mark>](https://docs.ionos.com/sections-test/guides/ai/ai-model-hub/how-tos/rate-limits), no model-specific limits apply to this model.
