OpenAI Compatible Endpoints
Endpoints compatible with OpenAI's API specification
Create Chat Completions by calling an available model in a format that is compatible with the OpenAI API
model: ID of the model to use
temperature: The sampling temperature to be used. Default: 1
top_p: An alternative to sampling with temperature. Default: -1
n: The number of chat completion choices to generate for each input message. Default: 1
stream: If set to true, partial message deltas are sent. Default: false
stop: Up to 4 sequences where the API will stop generating further tokens
max_tokens: The maximum number of tokens to generate in the chat completion. This value is now deprecated in favor of max_completion_tokens. Default: 16
max_completion_tokens: An upper bound for the number of tokens that can be generated for a completion, including visible output tokens. Default: 16
presence_penalty: Penalizes new tokens based on whether they already appear in the text so far. Default: 0
frequency_penalty: Penalizes new tokens based on their frequency in the text so far. Default: 0
logit_bias: Used to modify the probability of specific tokens appearing in the completion
user: A unique identifier representing your end-user
tool_choice: Controls which (if any) tool is called by the model. "none" means the model will not call any tool and instead generates a message. "auto" means the model can pick between generating a message or calling one or more tools. "required" means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. "none" is the default when no tools are present; "auto" is the default if tools are present.
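For illustration, a request that forces a specific tool call could look like the following sketch using the openai Python package. The get_current_weather tool definition is a hypothetical example, not part of this API, and IONOS_API_TOKEN is an assumed environment variable name for your bearer token.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openai.inference.de-txl.ionos.com/v1",
    api_key=os.environ["IONOS_API_TOKEN"],  # assumed env var holding your bearer token
)

# Hypothetical tool definition for illustration only.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-70B-Instruct",
    messages=[{"role": "user", "content": "What is the weather in Berlin?"}],
    tools=tools,
    # Forces the model to call get_current_weather instead of answering directly.
    tool_choice={"type": "function", "function": {"name": "get_current_weather"}},
)
print(response.choices[0].message.tool_calls)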
POST /v1/chat/completions HTTP/1.1
Host: openai.inference.de-txl.ionos.com
Authorization: Bearer JWT
Content-Type: application/json
Accept: */*
Content-Length: 326
{
  "model": "meta-llama/Meta-Llama-3-70B-Instruct",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Please say hello."
    }
  ],
  "temperature": 0.7,
  "top_p": 0.9,
  "n": 1,
  "stream": false,
  "stop": [
    "\n"
  ],
  "max_tokens": 1000,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "logit_bias": {},
  "user": "user-123"
}
{
  "id": "text",
  "choices": [
    {
      "finish_reason": "text",
      "index": 1,
      "message": {
        "role": "text",
        "content": "text",
        "tool_calls": [
          {
            "id": "text",
            "type": "function",
            "function": {
              "name": "text",
              "arguments": "text"
            }
          }
        ]
      }
    }
  ],
  "created": 1,
  "object": "text",
  "model": "text",
  "system_fingerprint": "text",
  "usage": {
    "prompt_tokens": 1,
    "completion_tokens": 1,
    "total_tokens": 1
  }
}
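Because the endpoint follows the OpenAI specification, the official openai Python package can be pointed at it directly. A minimal sketch of the request above (IONOS_API_TOKEN is an assumed environment variable name for your bearer token):

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openai.inference.de-txl.ionos.com/v1",
    api_key=os.environ["IONOS_API_TOKEN"],  # the JWT from the Authorization header
)

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-70B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Please say hello."},
    ],
    temperature=0.7,
    max_tokens=1000,
    stream=False,  # set True to receive partial message deltas instead
)
print(response.choices[0].message.content)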
Create Completions by calling an available model in a format that is compatible with the OpenAI API
model: ID of the model to use
prompt: The prompt to generate completions from
temperature: The sampling temperature to be used
top_p: An alternative to sampling with temperature
n: The number of completion choices to generate for each input prompt
stream: If set to true, partial message deltas are sent
stop: Up to 4 sequences where the API will stop generating further tokens
max_tokens: The maximum number of tokens to generate in the completion
presence_penalty: Penalizes new tokens based on whether they already appear in the text so far
frequency_penalty: Penalizes new tokens based on their frequency in the text so far
logit_bias: Used to modify the probability of specific tokens appearing in the completion
user: A unique identifier representing your end-user
POST /v1/completions HTTP/1.1
Host: openai.inference.de-txl.ionos.com
Authorization: Bearer JWT
Content-Type: application/json
Accept: */*
Content-Length: 239
{
  "model": "meta-llama/Meta-Llama-3-70B-Instruct",
  "prompt": "Say this is a test",
  "temperature": 0.01,
  "top_p": 0.9,
  "n": 1,
  "stream": false,
  "stop": [
    "\n"
  ],
  "max_tokens": 1000,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "logit_bias": {},
  "user": "user-123"
}
{
  "id": "text",
  "choices": [
    {
      "finish_reason": "text",
      "index": 1,
      "text": "text"
    }
  ],
  "created": 1,
  "object": "text",
  "model": "text",
  "usage": {
    "prompt_tokens": 1,
    "completion_tokens": 1,
    "total_tokens": 1
  }
}
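A minimal sketch of the same request with the openai Python package (same assumptions as above):

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openai.inference.de-txl.ionos.com/v1",
    api_key=os.environ["IONOS_API_TOKEN"],
)

response = client.completions.create(
    model="meta-llama/Meta-Llama-3-70B-Instruct",
    prompt="Say this is a test",
    temperature=0.01,
    max_tokens=1000,
    stop=["\n"],
)
print(response.choices[0].text)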
Get the entire list of available models in a format that is compatible with the OpenAI API
GET /v1/models HTTP/1.1
Host: openai.inference.de-txl.ionos.com
Authorization: Bearer JWT
Accept: */*
Successful operation
No content
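A sketch of listing the available models with the openai Python package (same assumptions as above):

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openai.inference.de-txl.ionos.com/v1",
    api_key=os.environ["IONOS_API_TOKEN"],
)

# Iterate over the returned model list and print each model ID.
for model in client.models.list():
    print(model.id)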
Generate one or more images using a model in a format that is compatible with the OpenAI API
model: ID of the model to use. Please check /v1/models for available models
prompt: The prompt to generate images from
n: The number of images to generate. Default: 1
size: The size of the image to generate. Must be one of "1024*1024", "1792*1024", or "1024*1792"; the maximum supported resolution is "1792*1024". Default: "1024*1024"
response_format: The format of the response. Possible values: url, b64_json. Default: b64_json
user: A unique identifier representing your end-user
POST /v1/images/generations HTTP/1.1
Host: openai.inference.de-txl.ionos.com
Authorization: Bearer JWT
Content-Type: application/json
Accept: */*
Content-Length: 151
{
  "model": "stabilityai/stable-diffusion-xl-base-1.0",
  "prompt": "A beautiful sunset over the ocean",
  "n": 1,
  "size": "1024*1024",
  "response_format": "b64_json"
}
{
  "created": 1,
  "data": [
    {
      "url": "text",
      "b64_json": "text",
      "revised_prompt": "text"
    }
  ]
}
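A sketch of the image request with the openai Python package, decoding the b64_json payload to a file. Note that this API uses an asterisk-separated size string, which is passed through to the server as-is; strict type checkers may flag it because the SDK's type hints only list OpenAI's own size values.

import base64
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openai.inference.de-txl.ionos.com/v1",
    api_key=os.environ["IONOS_API_TOKEN"],
)

result = client.images.generate(
    model="stabilityai/stable-diffusion-xl-base-1.0",
    prompt="A beautiful sunset over the ocean",
    n=1,
    size="1024*1024",  # asterisk separator as used by this API
    response_format="b64_json",
)

# Decode the base64 payload and write the image to disk.
with open("sunset.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))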
Creates an embedding vector representing the input text.
model: ID of the model to use. Please check /v1/models for available models
input: The input text to create embeddings for, either a single string or a list of strings
POST /v1/embeddings HTTP/1.1
Host: openai.inference.de-txl.ionos.com
Authorization: Bearer JWT
Content-Type: application/json
Accept: */*
Content-Length: 83
{
  "input": [
    "The food was delicious and the waiter."
  ],
  "model": "intfloat/e5-large-v2"
}
{
  "model": "text",
  "object": "text",
  "data": [
    {
      "index": 1,
      "object": "text",
      "embedding": [
        1
      ]
    }
  ],
  "usage": {
    "prompt_tokens": 1,
    "total_tokens": 1
  }
}
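A sketch of the embedding request with the openai Python package (same assumptions as above):

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openai.inference.de-txl.ionos.com/v1",
    api_key=os.environ["IONOS_API_TOKEN"],
)

response = client.embeddings.create(
    model="intfloat/e5-large-v2",
    input=["The food was delicious and the waiter."],
)
# Each input string yields one vector; print its dimensionality.
print(len(response.data[0].embedding))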