OpenAI Compatible Endpoints

Endpoints compatible with OpenAI's API specification

Create Chat Completions

post

Create Chat Completions by calling an available model in a format that is compatible with the OpenAI API. Supports both text-only and multimodal (text + images) inputs for compatible models. Rate limits apply per contract. Default limits apply unless a custom rate limit is configured for your contract. Exceeding the limit returns HTTP 429 with a Retry-After header.

Authorizations

Authorization · string · Required

Provide the header value as 'Bearer YOUR_SECRET_TOKEN', replacing YOUR_SECRET_TOKEN with your token. The 'Bearer' HTTP Authorization scheme must precede the token.
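As a minimal sketch of the header described above, the following Python helper builds the Authorization value. The function name auth_headers is an illustrative assumption and not part of the API; the placeholder token is the one used in the request examples on this page.

```python
def auth_headers(token: str) -> dict[str, str]:
    """Return request headers carrying the token under the Bearer scheme."""
    token = token.strip()
    if not token:
        raise ValueError("token must not be empty")
    # The scheme name and the token are separated by a single space.
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


headers = auth_headers("YOUR_SECRET_TOKEN")
```

The same headers apply to every endpoint on this page, so a helper like this can be shared across calls.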

Body

model · string · Required

ID of the model to use

response_format · one of · Optional

An object specifying the format that the model must output. Use json_object for JSON mode or json_schema to enforce a specific schema (Structured Outputs). If omitted, default text output is used.
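The two modes named above might be expressed as follows. This is a sketch under assumptions: the nested json_schema layout (name, schema) follows the OpenAI Structured Outputs convention and is assumed to carry over to this endpoint, and the "greeting" schema is hypothetical.

```python
import json

# JSON mode: the model is asked to emit some valid JSON object.
json_mode = {"type": "json_object"}

# Structured Outputs: the model is asked to follow a specific schema.
# The nested layout below follows the OpenAI convention (an assumption here).
structured = {
    "type": "json_schema",
    "json_schema": {
        "name": "greeting",  # hypothetical schema name
        "schema": {
            "type": "object",
            "properties": {"message": {"type": "string"}},
            "required": ["message"],
            "additionalProperties": False,
        },
    },
}

body = {
    "model": "meta-llama/Llama-3.3-70B-Instruct",
    "messages": [{"role": "user", "content": "Greet me as JSON."}],
    "response_format": structured,  # or json_mode, or omit for plain text
}

print(json.dumps(body, indent=2))
```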

temperature · number · Optional

The sampling temperature to be used

Default: 1

top_p · number · Optional

An alternative to sampling with temperature

Default: -1

n · integer · Optional

The number of chat completion choices to generate for each input message

Default: 1

stream · boolean · Optional

If set to true, it sends partial message deltas

Default: false
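A client that sets stream to true has to reassemble the partial deltas. The sketch below assumes the stream follows the OpenAI server-sent events convention: each event is a line of the form "data: {json}", the text fragment sits in choices[].delta.content, and the stream ends with "data: [DONE]". None of these details are stated on this page, and iter_deltas is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from 'data: ...' lines of a streamed response."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample lines, not captured from the service.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
streamed_text = "".join(iter_deltas(sample))
```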

stop · string[] · Optional

Up to 4 sequences where the API will stop generating further tokens

max_tokens · integer · Optional · Deprecated

The maximum number of tokens to generate in the chat completion. This value is now deprecated in favor of max_completion_tokens.

Default: 16

max_completion_tokens · integer · Optional

An upper bound for the number of tokens that can be generated for a completion, including visible output tokens

Default: 16
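Because max_tokens is deprecated in favor of max_completion_tokens, existing request bodies may need a small migration step. The helper below is a sketch; prefer_max_completion_tokens is a hypothetical name, and the choice to keep an explicitly set max_completion_tokens over the legacy value is a design assumption.

```python
def prefer_max_completion_tokens(body: dict) -> dict:
    """Return a copy of body that uses max_completion_tokens instead of max_tokens."""
    updated = dict(body)
    if "max_tokens" in updated:
        legacy = updated.pop("max_tokens")
        # Keep an explicit max_completion_tokens if the caller already set one.
        updated.setdefault("max_completion_tokens", legacy)
    return updated


migrated = prefer_max_completion_tokens(
    {"model": "meta-llama/Llama-3.3-70B-Instruct", "max_tokens": 1000}
)
```

With a listed default of 16 tokens, callers who expect longer answers will likely want to set this value explicitly.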

presence_penalty · number · Optional

Penalizes new tokens based on whether they already appear in the text so far

Default: 0

frequency_penalty · number · Optional

Penalizes new tokens based on their frequency in the text so far

Default: 0

logit_bias · object · Optional

Used to modify the probability of specific tokens appearing in the completion

user · string · Optional

A unique identifier representing your end-user

tool_choice · one of · Optional

Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

string · enum · Optional

none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

Possible values: none, auto, required
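The tool_choice forms described above could be assembled as in the following sketch. The tools array is not documented in this section; its layout here follows the OpenAI convention and is an assumption, as are the helper name force_tool and the example tool definition.

```python
# Hypothetical function tool; the "tools" layout follows the OpenAI convention.
tools = [
    {
        "type": "function",
        "function": {
            "name": "my_function",
            "description": "Example tool used only for illustration.",
            "parameters": {"type": "object", "properties": {}},
        },
    }
]


def force_tool(name: str) -> dict:
    """Build the tool_choice object that forces a call to the named function."""
    return {"type": "function", "function": {"name": name}}


body = {
    "model": "meta-llama/Llama-3.3-70B-Instruct",
    "messages": [{"role": "user", "content": "Please say hello."}],
    "tools": tools,
    # String alternatives: "none", "auto", or "required".
    "tool_choice": force_tool("my_function"),
}
```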

Responses

200

Successful operation

application/json

400

Bad request

429

Rate limit exceeded. Retry after the number of seconds indicated by the Retry-After header. Limits are contract-specific; check X-RateLimit-Limit and X-RateLimit-Burst in the response headers for the values that apply to your contract.

application/json

500

Server error
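A client can honor the 429 response described above by waiting for the number of seconds given in Retry-After before trying again. The sketch below keeps the HTTP call abstract: send is a caller-supplied function returning (status, headers, body), and the names send_with_retry and retry_after_seconds, the attempt limit, and the one-second fallback are all assumptions made for illustration.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read Retry-After as a number of seconds, falling back to default."""
    lowered = {key.lower(): value for key, value in headers.items()}
    try:
        return max(0.0, float(lowered["retry-after"]))
    except (KeyError, ValueError):
        return default


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        sleep(retry_after_seconds(headers))
    raise ValueError("max_attempts must be at least 1")
```

Passing sleep as a parameter keeps the helper testable without real delays; in production the default time.sleep applies.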

post

/v1/chat/completions

POST /v1/chat/completions HTTP/1.1
Host: openai.inference.de-txl.ionos.com
Authorization: Bearer YOUR_SECRET_TOKEN
Content-Type: application/json
Accept: */*
Content-Length: 323

{
  "model": "meta-llama/Llama-3.3-70B-Instruct",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Please say hello."
    }
  ],
  "temperature": 0.7,
  "top_p": 0.9,
  "n": 1,
  "stream": false,
  "stop": [
    "\n"
  ],
  "max_tokens": 1000,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "logit_bias": {},
  "user": "user-123"
}

{
  "id": "text",
  "choices": [
    {
      "finish_reason": "text",
      "index": 1,
      "message": {
        "role": "text",
        "content": "text",
        "tool_calls": [
          {
            "id": "text",
            "type": "function",
            "function": {
              "name": "text",
              "arguments": "text"
            }
          }
        ],
        "refusal": "text"
      }
    }
  ],
  "created": 1,
  "object": "text",
  "model": "text",
  "system_fingerprint": "text",
  "usage": {
    "prompt_tokens": 1,
    "completion_tokens": 1,
    "total_tokens": 1
  }
}
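The request and response shown above could be handled from Python with the standard library alone, as in this sketch. The https scheme on the host is assumed, and build_chat_request and first_message are hypothetical helper names. The request is only prepared, not sent, so the block runs without network access.

```python
import json
import urllib.request

# Host taken from the example request above; the https scheme is assumed.
BASE_URL = "https://openai.inference.de-txl.ionos.com"


def build_chat_request(token: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /v1/chat/completions."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def first_message(response: dict) -> dict:
    """Return the message of the first choice in a parsed response."""
    return response["choices"][0]["message"]


request = build_chat_request(
    "YOUR_SECRET_TOKEN",
    {
        "model": "meta-llama/Llama-3.3-70B-Instruct",
        "messages": [{"role": "user", "content": "Please say hello."}],
    },
)
# Sending is left commented out so the sketch runs offline:
# with urllib.request.urlopen(request) as resp:
#     message = first_message(json.load(resp))
```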

Create Completions

post

Create Completions by calling an available model in a format that is compatible with the OpenAI API

Authorizations

Authorization · string · Required

Provide the header value as 'Bearer YOUR_SECRET_TOKEN', replacing YOUR_SECRET_TOKEN with your token. The 'Bearer' HTTP Authorization scheme must precede the token.

Body

model · string · Required

ID of the model to use

prompt · string · Required

The prompt to generate completions from

temperature · number · Optional

The sampling temperature to be used

top_p · number · Optional

An alternative to sampling with temperature

n · integer · Optional

The number of completion choices to generate for each prompt

stream · boolean · Optional

If set to true, it sends partial completion deltas

stop · string[] · Optional

Up to 4 sequences where the API will stop generating further tokens

max_tokens · integer · Optional

The maximum number of tokens to generate in the completion

presence_penalty · number · Optional

Penalizes new tokens based on whether they already appear in the text so far

frequency_penalty · number · Optional

Penalizes new tokens based on their frequency in the text so far

logit_bias · object · Optional

Used to modify the probability of specific tokens appearing in the completion

user · string · Optional

A unique identifier representing your end-user

Responses

200

Successful operation

application/json

400

Bad request

429

Rate limit exceeded

application/json

500

Server error

post

/v1/completions

POST /v1/completions HTTP/1.1
Host: openai.inference.de-txl.ionos.com
Authorization: Bearer YOUR_SECRET_TOKEN
Content-Type: application/json
Accept: */*
Content-Length: 236

{
  "model": "meta-llama/Llama-3.3-70B-Instruct",
  "prompt": "Say this is a test",
  "temperature": 0.01,
  "top_p": 0.9,
  "n": 1,
  "stream": false,
  "stop": [
    "\n"
  ],
  "max_tokens": 1000,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "logit_bias": {},
  "user": "user-123"
}

{
  "id": "text",
  "choices": [
    {
      "finish_reason": "text",
      "index": 1,
      "text": "text"
    }
  ],
  "created": 1,
  "object": "text",
  "model": "text",
  "usage": {
    "prompt_tokens": 1,
    "completion_tokens": 1,
    "total_tokens": 1
  }
}
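Given a parsed response of the shape shown above, the generated text of each choice can be collected as follows. This is a sketch: completion_texts is a hypothetical helper, and the sample values are hand-written rather than captured from the service.

```python
def completion_texts(response: dict) -> list[str]:
    """Return the generated text of every choice, ordered by index."""
    choices = sorted(response.get("choices", []), key=lambda c: c.get("index", 0))
    return [choice["text"] for choice in choices]


# Hand-written sample mirroring the response fields listed above.
sample = {
    "id": "example-id",
    "choices": [
        {"finish_reason": "stop", "index": 1, "text": "second"},
        {"finish_reason": "stop", "index": 0, "text": "first"},
    ],
    "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7},
}
texts = completion_texts(sample)
```

Sorting by index matters only when n is greater than 1, since a single choice needs no ordering.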

Get the entire list of available models

get

Get the entire list of available models in a format that is compatible with the OpenAI API

Authorizations

Authorization · string · Required

Provide the header value as 'Bearer YOUR_SECRET_TOKEN', replacing YOUR_SECRET_TOKEN with your token. The 'Bearer' HTTP Authorization scheme must precede the token.

Responses

200

Successful operation

application/json

429

Rate limit exceeded

application/json

get

/v1/models

GET /v1/models HTTP/1.1
Host: openai.inference.de-txl.ionos.com
Authorization: Bearer YOUR_SECRET_TOKEN
Accept: */*

{
  "object": "text",
  "data": [
    {
      "id": "text",
      "object": "text",
      "created": 1,
      "owned_by": "text"
    }
  ]
}
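A typical use of this listing is to check that a model ID exists before calling another endpoint. The sketch below works on an already parsed response; model_ids is a hypothetical helper, and the sample entries reuse model names from the examples on this page with hand-written values for the other fields.

```python
def model_ids(listing: dict) -> list[str]:
    """Return the sorted IDs from a parsed /v1/models response."""
    return sorted(item["id"] for item in listing.get("data", []))


# Hand-written sample; real responses may contain other models.
sample_listing = {
    "object": "list",
    "data": [
        {"id": "meta-llama/Llama-3.3-70B-Instruct", "object": "model", "created": 1, "owned_by": "example"},
        {"id": "intfloat/e5-large-v2", "object": "model", "created": 1, "owned_by": "example"},
    ],
}
available = model_ids(sample_listing)
has_llama = "meta-llama/Llama-3.3-70B-Instruct" in available
```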

Generate an image using a model

post

Generate an image using a model in a format that is compatible with the OpenAI API

Authorizations

Authorization · string · Required

Provide the header value as 'Bearer YOUR_SECRET_TOKEN', replacing YOUR_SECRET_TOKEN with your token. The 'Bearer' HTTP Authorization scheme must precede the token.

Body

model · string · Required

ID of the model to use. Please check /v1/models for available models

prompt · string · Required

The prompt to generate images from

n · integer · Optional

The number of images to generate. Defaults to 1.

Default: 1

size · string · Optional

The size of the image to generate. Defaults to "1024*1024". Must be one of "1024*1024", "1792*1024", or "1024*1792". The maximum supported resolution is "1792*1024"

Default: 1024*1024

response_format · string · enum · Optional

The format of the response.

Default: b64_json

user · string · Optional

A unique identifier representing your end-user

Responses

200

Successful operation

application/json

400

Bad request

429

Rate limit exceeded

application/json

500

Server error

post

/v1/images/generations

POST /v1/images/generations HTTP/1.1
Host: openai.inference.de-txl.ionos.com
Authorization: Bearer YOUR_SECRET_TOKEN
Content-Type: application/json
Accept: */*
Content-Length: 151

{
  "model": "stabilityai/stable-diffusion-xl-base-1.0",
  "prompt": "A beautiful sunset over the ocean",
  "n": 1,
  "size": "1024*1024",
  "response_format": "b64_json"
}

{
  "created": 1,
  "data": [
    {
      "url": "text",
      "b64_json": "text",
      "revised_prompt": "text"
    }
  ]
}
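With response_format set to b64_json, each entry in data is assumed to carry the image as a base64 string that must be decoded before it can be saved. The helper decode_images and the output file name below are hypothetical, and the sample payload is a stand-in for real image data.

```python
import base64


def decode_images(response: dict) -> list[bytes]:
    """Decode every b64_json entry of a parsed image response into raw bytes."""
    images = []
    for item in response.get("data", []):
        encoded = item.get("b64_json")
        if encoded:
            images.append(base64.b64decode(encoded))
    return images


# Hand-written sample: a few bytes stand in for a real image.
fake_image = b"\x89PNG-example"
sample = {"created": 1, "data": [{"b64_json": base64.b64encode(fake_image).decode("ascii")}]}
images = decode_images(sample)

# Writing to disk is left commented out; the file name is illustrative.
# with open("sunset.png", "wb") as handle:
#     handle.write(images[0])
```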

Creates an embedding vector.

post

Creates an embedding vector representing the input text.

Authorizations

Authorization · string · Required

Provide the header value as 'Bearer YOUR_SECRET_TOKEN', replacing YOUR_SECRET_TOKEN with your token. The 'Bearer' HTTP Authorization scheme must precede the token.

Body

model · string · Optional

ID of the model to use. Please check /v1/models for available models

input · one of · Optional

string · Optional

The input text to create an embedding for (single string)

string[] · Optional

The input text to create embeddings for (list of strings)

Responses

200

Successful operation

application/json

400

Bad request

429

Rate limit exceeded

application/json

500

Server error

post

/v1/embeddings

POST /v1/embeddings HTTP/1.1
Host: openai.inference.de-txl.ionos.com
Authorization: Bearer YOUR_SECRET_TOKEN
Content-Type: application/json
Accept: */*
Content-Length: 83

{
  "input": [
    "The food was delicious and the waiter."
  ],
  "model": "intfloat/e5-large-v2"
}

{
  "model": "text",
  "object": "text",
  "data": [
    {
      "index": 1,
      "object": "text",
      "embedding": [
        1
      ]
    }
  ],
  "usage": {
    "prompt_tokens": 1,
    "total_tokens": 1
  }
}
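Embedding vectors from a response of the shape shown above are often compared with cosine similarity. The sketch below orders the vectors by their index field and compares two of them. Both helper names are hypothetical, cosine similarity is a common choice rather than something this API prescribes, and the two-dimensional sample vectors are hand-written.

```python
import math


def vectors_in_input_order(response: dict) -> list[list[float]]:
    """Return the embedding vectors sorted by their index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


# Hand-written sample with tiny vectors; real embeddings are much longer.
sample = {
    "data": [
        {"index": 1, "object": "embedding", "embedding": [0.0, 1.0]},
        {"index": 0, "object": "embedding", "embedding": [1.0, 0.0]},
    ]
}
first, second = vectors_in_input_order(sample)
similarity = cosine_similarity(first, second)
```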
