Migration Guide from Predictions Endpoint

Note: The /collections, /documents, and /query endpoints for managing document collections and their documents remain unaffected by this migration.

This guide explains how to migrate from the native /predictions endpoint to the OpenAI-compatible API for text generation, image generation, and Retrieval Augmented Generation (RAG) in the IONOS AI Model Hub.


If you are currently using the native /predictions endpoint (for example, https://inference.de-txl.ionos.com/models/{modelId}/predictions), you can migrate to the OpenAI-compatible API for standard text and image generation use cases. The migration simplifies integration with OpenAI-compatible tools and SDKs and provides a more standardized developer experience.


Text Generation Migration Example

  • Native Endpoint: POST https://inference.de-txl.ionos.com/models/{modelId}/predictions

  • Native Request Body:

    {
      "type": "prediction",
      "properties": {
        "input": "Please give me 5 domain suggestions for a flower shop in Berlin. Provide for each domain name a paragraph explaining the domain name and why it is valuable.",
        "options": {
          "max_length": "1000",
          "temperature": "0.5"
        }
      }
    }
  • OpenAI-Compatible Endpoint: POST https://openai.inference.de-txl.ionos.com/v1/chat/completions

  • Model Selection: The model name for the OpenAI-compatible API is taken from the list of available models at https://openai.inference.de-txl.ionos.com/models. For example, you can use openai/gpt-oss-120b as the model name.

  • OpenAI-Compatible Request Body:

    {
      "model": "openai/gpt-oss-120b",
      "messages": [
        { "role": "system", "content": "You are a helpful assistant." },
        { "role": "user", "content": "Please give me 5 domain suggestions for a flower shop in Berlin. Provide for each domain name a paragraph explaining the domain name and why it is valuable." }
      ],
      "max_tokens": 2000,
      "temperature": 0.5
    }
  • Field Mapping:

    • properties.input → messages[1].content

    • properties.options.max_length → max_tokens

    • properties.options.temperature → temperature

    • modelId in the URL → model field in the request body
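
If you call the API from Python, the migrated request can go through the official openai SDK pointed at the IONOS base URL. The following is a minimal sketch; the IONOS_API_TOKEN environment variable name is an assumption, so substitute whichever credential your setup uses.

    import os

    from openai import OpenAI

    # Point the standard OpenAI client at the IONOS OpenAI-compatible endpoint.
    # IONOS_API_TOKEN is an assumed variable name; use your actual IONOS credential.
    client = OpenAI(
        base_url="https://openai.inference.de-txl.ionos.com/v1",
        api_key=os.environ["IONOS_API_TOKEN"],
    )

    response = client.chat.completions.create(
        model="openai/gpt-oss-120b",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Please give me 5 domain suggestions for a flower shop in Berlin."},
        ],
        max_tokens=2000,
        temperature=0.5,
    )
    print(response.choices[0].message.content)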


Image Generation Migration Example

  • Native Endpoint: POST https://inference.de-txl.ionos.com/models/{modelId}/predictions

  • Native Request Body (illustrative values):

    {
      "type": "prediction",
      "properties": {
        "input": "An impressionist painting of a flower shop in Berlin",
        "options": {
          "size": "1024x1024"
        }
      }
    }

  • OpenAI-Compatible Endpoint: POST https://openai.inference.de-txl.ionos.com/v1/images/generations

  • Model Selection: The model name for the OpenAI-compatible API is taken from the list of available models at https://openai.inference.de-txl.ionos.com/models. For example, you can use black-forest-labs/FLUX.1-schnell as the model name.

  • OpenAI-Compatible Request Body (illustrative values):

    {
      "model": "black-forest-labs/FLUX.1-schnell",
      "prompt": "An impressionist painting of a flower shop in Berlin",
      "size": "1024x1024"
    }

  • Field Mapping:

    • properties.input → prompt

    • properties.options.size → size

    • modelId in the URL → model field in the request body
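
The same client can handle the migrated image request. The sketch below reuses the client from the text generation example above; whether the response carries a URL or a base64-encoded payload depends on the response format, so both accessors are shown as assumptions.

    # Generate an image through the OpenAI-compatible endpoint,
    # reusing the client configured in the text generation sketch.
    image = client.images.generate(
        model="black-forest-labs/FLUX.1-schnell",
        prompt="An impressionist painting of a flower shop in Berlin",
        size="1024x1024",
    )
    # Depending on the response format, the result is a URL or base64 data (assumption).
    print(image.data[0].url or image.data[0].b64_json)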


Migrating from /predictions for the Retrieval Augmented Generation (RAG) Use Case

Users who require Retrieval Augmented Generation (RAG) or document-based querying should migrate to a combination of the native /query endpoint and the OpenAI-compatible API. The new approach separates document retrieval from text generation into two steps:

Step 1: Query Your Document Collection

Endpoint: POST https://inference.de-txl.ionos.com/collections/{collectionId}/query

Request Body (a minimal example; the query text is the required input):

    {
      "query": "Which plants grow best in shade?"
    }

Response: The endpoint returns the document chunks from the collection that best match the query. Collect their content to use as context in the next step.

Step 2: Generate a Response Using Retrieved Context

Endpoint: POST https://openai.inference.de-txl.ionos.com/v1/chat/completions

Request Body:

    {
      "model": "openai/gpt-oss-120b",
      "messages": [
        { "role": "system", "content": "Answer the user's question using only the following context:\n\n<retrieved document chunks from Step 1>" },
        { "role": "user", "content": "Which plants grow best in shade?" }
      ]
    }

Best Practice: Place the retrieved context in the system message to clearly separate instructional context from the user's question. This approach provides cleaner separation of concerns, easier conversation continuation for follow-up questions, and better model adherence to grounding information.
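
Both steps can be combined in a short script. The sketch below is a minimal example: the IONOS_API_TOKEN variable name and the parsing of the query response are assumptions, so adapt them to the response schema your collection returns.

    import os

    import requests
    from openai import OpenAI

    token = os.environ["IONOS_API_TOKEN"]  # assumed variable name
    collection_id = "<your-collection-id>"
    question = "Which plants grow best in shade?"

    # Step 1: retrieve the document chunks most relevant to the question.
    query_response = requests.post(
        f"https://inference.de-txl.ionos.com/collections/{collection_id}/query",
        headers={"Authorization": f"Bearer {token}"},
        json={"query": question},
    )
    query_response.raise_for_status()
    # Assumption: the matched chunks are exposed in the response JSON; adapt
    # this extraction to the schema documented in the API reference.
    context = "\n\n".join(str(match) for match in query_response.json().get("matches", []))

    # Step 2: generate an answer grounded in the retrieved context,
    # with the context placed in the system message as recommended above.
    client = OpenAI(
        base_url="https://openai.inference.de-txl.ionos.com/v1",
        api_key=token,
    )
    answer = client.chat.completions.create(
        model="openai/gpt-oss-120b",
        messages=[
            {"role": "system", "content": f"Answer the user's question using only the following context:\n\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    print(answer.choices[0].message.content)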
