Migration Guide from Predictions Endpoint

Note: The /collections, /documents, and /query endpoints for managing document collections and their documents remain unaffected by this migration.

This guide explains how to migrate from the native /predictions endpoint to the OpenAI-compatible API for text generation, image generation, and Retrieval Augmented Generation (RAG) in the IONOS AI Model Hub.


If you are currently using the native /predictions endpoint (for example, https://inference.de-txl.ionos.com/models/{modelId}/predictions), you can migrate to the OpenAI-compatible API for standard text and image generation use cases. The migration simplifies integration with OpenAI-compatible tools and SDKs and provides a more standardized developer experience.


Text Generation Migration Example

  • Native Endpoint: POST https://inference.de-txl.ionos.com/models/{modelId}/predictions

  • Native Request Body:

    {
      "type": "prediction",
      "properties": {
        "input": "Please give me 5 domain suggestions for a flower shop in Berlin. Provide for each domain name a paragraph explaining the domain name and why it is valuable.",
        "options": {
          "max_length": "1000",
          "temperature": "0.5"
        }
      }
    }
  • OpenAI-Compatible Endpoint: POST https://openai.inference.de-txl.ionos.com/v1/chat/completions

  • Model Selection: The model name for the OpenAI-compatible API is taken from the list of available models at https://openai.inference.de-txl.ionos.com/models. For example, you can use openai/gpt-oss-120b as the model name.

  • OpenAI-Compatible Request Body:

    {
      "model": "openai/gpt-oss-120b",
      "messages": [
        { "role": "system", "content": "You are a helpful assistant." },
        { "role": "user", "content": "Please give me 5 domain suggestions for a flower shop in Berlin. Provide for each domain name a paragraph explaining the domain name and why it is valuable." }
      ],
      "max_tokens": 2000,
      "temperature": 0.5
    }
  • Field Mapping:

    • properties.input → messages[1].content

    • properties.options.max_length → max_tokens

    • properties.options.temperature → temperature

    • modelId in the URL → model field in the request body
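
If you call the API from Python, the migrated request can go through the official openai SDK pointed at the IONOS base URL. The following is a minimal sketch; the IONOS_API_TOKEN environment variable name is an assumption, so substitute whichever credential your setup uses.

    import os

    from openai import OpenAI

    # Point the standard OpenAI client at the IONOS OpenAI-compatible endpoint.
    # IONOS_API_TOKEN is an assumed variable name; use your actual IONOS credential.
    client = OpenAI(
        base_url="https://openai.inference.de-txl.ionos.com/v1",
        api_key=os.environ["IONOS_API_TOKEN"],
    )

    response = client.chat.completions.create(
        model="openai/gpt-oss-120b",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Please give me 5 domain suggestions for a flower shop in Berlin."},
        ],
        max_tokens=2000,
        temperature=0.5,
    )
    print(response.choices[0].message.content)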


Image Generation Migration Example

  • Native Endpoint: POST https://inference.de-txl.ionos.com/models/{modelId}/predictions

  • Native Request Body (illustrative values):

    {
      "type": "prediction",
      "properties": {
        "input": "An impressionist painting of a flower shop in Berlin",
        "options": {
          "size": "1024x1024"
        }
      }
    }

  • OpenAI-Compatible Endpoint: POST https://openai.inference.de-txl.ionos.com/v1/images/generations

  • Model Selection: The model name for the OpenAI-compatible API is taken from the list of available models at https://openai.inference.de-txl.ionos.com/models. For example, you can use black-forest-labs/FLUX.1-schnell as the model name.

  • OpenAI-Compatible Request Body (illustrative values):

    {
      "model": "black-forest-labs/FLUX.1-schnell",
      "prompt": "An impressionist painting of a flower shop in Berlin",
      "size": "1024x1024"
    }

  • Field Mapping:

    • properties.input → prompt

    • properties.options.size → size

    • modelId in the URL → model field in the request body
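
The same client can handle the migrated image request. The sketch below reuses the client from the text generation example above; whether the response carries a URL or a base64-encoded payload depends on the response format, so both accessors are shown as assumptions.

    # Generate an image through the OpenAI-compatible endpoint,
    # reusing the client configured in the text generation sketch.
    image = client.images.generate(
        model="black-forest-labs/FLUX.1-schnell",
        prompt="An impressionist painting of a flower shop in Berlin",
        size="1024x1024",
    )
    # Depending on the response format, the result is a URL or base64 data (assumption).
    print(image.data[0].url or image.data[0].b64_json)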


Migrating from /predictions for the Retrieval Augmented Generation (RAG) Use Case

Users who require Retrieval Augmented Generation (RAG) or document-based querying should migrate to a combination of the native /query endpoint and the OpenAI-compatible API. The new approach separates document retrieval from text generation into two steps:

Step 1: Query Your Document Collection

Endpoint: POST https://inference.de-txl.ionos.com/collections/{collectionId}/query

Request Body (a minimal example; the query text is the required input):

    {
      "query": "Which plants grow best in shade?"
    }

Response: The endpoint returns the document chunks from the collection that best match the query. Collect their content to use as context in the next step.

Step 2: Generate a Response Using Retrieved Context

Endpoint: POST https://openai.inference.de-txl.ionos.com/v1/chat/completions

Request Body:

    {
      "model": "openai/gpt-oss-120b",
      "messages": [
        { "role": "system", "content": "Answer the user's question using only the following context:\n\n<retrieved document chunks from Step 1>" },
        { "role": "user", "content": "Which plants grow best in shade?" }
      ]
    }

Best Practice: Place the retrieved context in the system message to clearly separate instructional context from the user's question. This approach provides cleaner separation of concerns, easier conversation continuation for follow-up questions, and better model adherence to grounding information.
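
Both steps can be combined in a short script. The sketch below is a minimal example: the IONOS_API_TOKEN variable name and the parsing of the query response are assumptions, so adapt them to the response schema your collection returns.

    import os

    import requests
    from openai import OpenAI

    token = os.environ["IONOS_API_TOKEN"]  # assumed variable name
    collection_id = "<your-collection-id>"
    question = "Which plants grow best in shade?"

    # Step 1: retrieve the document chunks most relevant to the question.
    query_response = requests.post(
        f"https://inference.de-txl.ionos.com/collections/{collection_id}/query",
        headers={"Authorization": f"Bearer {token}"},
        json={"query": question},
    )
    query_response.raise_for_status()
    # Assumption: the matched chunks are exposed in the response JSON; adapt
    # this extraction to the schema documented in the API reference.
    context = "\n\n".join(str(match) for match in query_response.json().get("matches", []))

    # Step 2: generate an answer grounded in the retrieved context,
    # with the context placed in the system message as recommended above.
    client = OpenAI(
        base_url="https://openai.inference.de-txl.ionos.com/v1",
        api_key=token,
    )
    answer = client.chat.completions.create(
        model="openai/gpt-oss-120b",
        messages=[
            {"role": "system", "content": f"Answer the user's question using only the following context:\n\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    print(answer.choices[0].message.content)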
