# Overview

{% hint style="warning" %}
**Scheduled to Retire**: The native `/predictions` endpoint is scheduled to retire on May 5, 2026.

* For text and image generation use cases, please switch to using the OpenAI-compatible API endpoints.
* For Retrieval Augmented Generation (RAG) scenarios, please continue to use the native API endpoints, `/collections`, `/documents`, and `/query`.

For details on how to migrate your existing implementations, see the [<mark style="color:blue;">Migration Guide from Predictions Endpoint</mark>](/cloud/ai/ai-model-hub/how-tos/migration-guide.md). This migration facilitates:

* More straightforward integration with OpenAI-compatible tools and SDKs.
* Provides a more standardized developer experience by straightforwardly integrating with OpenAI-compatible tools and SDKs. straightforward integration with OpenAI-compatible tools and SDKs and provides a more standardized developer experience.
  {% endhint %}

The IONOS AI Model Hub is designed to simplify the deployment and management of advanced machine learning models, eliminating the complexities associated with hardware and infrastructure. This inference service serves a range of powerful AI models that enable developers to implement sophisticated AI solutions without concerns about underlying hardware and operational overhead.

IONOS' AI Model Hub supports various use cases, including:

* **Text Generation**: Utilize pre-trained Large Language Models (LLMs) to generate text and answer queries using textual descriptions.
* **Image Generation**: Utilize pre-trained text-to-image models to create images based on textual descriptions.
* **Document Collections**: Store and query extensive document collections based on semantic similarity.
* **Retrieval Augmented Generation (RAG)**: Enhance responses by combining Large Language Models with contextually relevant documents stored in a vector database.
* **Tool Calling**: Enable AI models to interact with external systems by invoking APIs or executing predefined functions. This allows for dynamic, task-based automation such as triggering workflows, retrieving real-time data, or integrating with business applications —all initiated through natural language prompts.

## Features

The IONOS AI Model Hub Service offers a wide array of features tailored to meet the needs of modern developers:

* **Managed Hosting**: Utilize AI models without needing to maintain the underlying infrastructure.
* **Security and Compliance**: Keep your data secure and compliant with regulations, as data processing is confined within Germany. Your input data remains exclusively for your use and is excluded from training purposes.
* **Scalability**: Scale your AI deployments seamlessly to meet your needs.
* **Integration Options**: Easily integrate with your applications using REST APIs that are fully OpenAI-compatible, with support for popular programming languages like Python and Bash.
* **Diverse Model Offerings**: Choose from various foundation models, including Large Language Models and text-to-image models, each capable of generating innovative, and sophisticated AI outputs.
* **Document Collections**: Store and manage document collections and perform semantic similarity searches to extract contextually relevant information.
* **Retrieval Augmented Generation**: Combine vector databases and Large Language Models to generate enhanced outputs that are contextually aware, providing more accurate and helpful responses.
* **Token-based Billing**: Pay for the services based on the number of tokens used, enabling cost-efficient usage and transparency in billing.

## Concepts

Understanding the foundational concepts of the IONOS AI Model Hub will help you leverage its full potential:

### Foundation Models

Foundation models are pre-trained on massive datasets to perform a wide range of language and image processing tasks. They can generate text, answer questions, and create images based on textual descriptions. With IONOS, you can access these models through APIs, simplifying the process of integrating advanced AI capabilities into your applications.

#### Key Points

* Access various open-source Large Language Models for text generation and text-to-image models for image generation.
* Use models without managing underlying hardware.
* Maintain data privacy and comply with German data protection regulations.

### Document Collections

Vector databases provide a way to store and manage document collections, enabling semantic similarity searches. Documents are converted to embeddings (vector representations), allowing the discovery of related content through similarity searches.

#### Key points

* Persist documents and search for semantically similar content.
* Use API endpoints to manage document collections and perform searches.
* Ensure document storage and processing stays within Germany.

### Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation enhances the performance of Large Language Models by combining their inherent capabilities with contextually relevant information retrieved from document collections stored in vector databases. This approach allows the model to produce highly accurate and detailed responses tailored to specific queries.

#### Key points

* Use Large Language Models together with document collections from vector databases.
* Improve response accuracy and relevance by incorporating more context.
* Implement sophisticated AI solutions using a combination of querying and generation.

## Components

### API Endpoints

Use dedicated REST API endpoints to interact with various models and services. These endpoints are designed to facilitate the quick and easy integration of AI capabilities into your applications. The IONOS AI Model Hub provides two API options for maximum flexibility: its native IONOS AI Model Hub API and an OpenAI-compatible API, making it easy to work with tools that support OpenAI endpoints.

#### OpenAI-Compatible endpoints

These endpoints mirror [<mark style="color:blue;">OpenAI’s API structure</mark>](https://platform.openai.com/docs/api-reference), allowing for seamless integration with tools and platforms already designed for OpenAI:

1. [<mark style="color:blue;">**Models**</mark>](https://platform.openai.com/docs/api-reference/models): Retrieve the list of available models and their details.
2. [<mark style="color:blue;">**Chat Completions**</mark>](https://platform.openai.com/docs/api-reference/chat): Generate conversational responses using supported Large Language Models.
3. [<mark style="color:blue;">**Image Generations**</mark>](https://platform.openai.com/docs/api-reference/images): Generate high-quality images based on text prompts.
4. [<mark style="color:blue;">**Embeddings**</mark>](https://platform.openai.com/docs/api-reference/embeddings): Generate text embeddings as numerical vectors for semantic search, text similarity, and clustering.

#### Native IONOS AI Model Hub endpoints

1. **Model Management**: Endpoints for retrieving model lists, querying models, and managing predictions.
2. **Document Management**: Endpoints to create, modify, retrieve, and delete document collections, and individual documents.
3. **Querying and Generating**: Endpoints for combining semantic searches with Large Language Models to implement Retrieval Augmented Generation scenarios.

> **Recommendation:**

* **Pure text generation** (Plain completions, chat‑style conversations, code completion, or any use‑case that does **not** require external knowledge): Use the **OpenAI‑compatible endpoints** (`POST /v1/completions` or `POST /v1/chat/completions`).
* **Retrieval Augmented Generation** (When the model must consult a document collection or a vector store): Use the native\*\* IONOS AI Model Hub APIs\*\* (the prediction/query endpoints) so that retrieval and generation are performed together inside the AI Model Hub.

### Authentication and Authorization

Security is paramount, and IONOS provides robust mechanisms to authenticate and authorize API requests. You must generate and use API tokens to access the AI services securely. For more information about generating a corresponding token, see [<mark style="color:blue;">Access Management</mark>](/cloud/ai/ai-model-hub/how-tos/access-management.md).

### Data Privacy and Compliance

IONOS ensures that all data processing complies with German and European data protection regulations. Your data is processed within Germany, providing more layer of security and compliance. For more information, see [<mark style="color:blue;">Data Handling</mark>](/cloud/ai/ai-model-hub/governance-and-compliance/data-handling.md).

### Technical Support

IONOS offers expert technical support to help you troubleshoot and optimize your AI deployments. Whether you need assistance with API integration or model performance, the support and Professional Service team is available to ensure your success during German business hours.

### Backup of Collections in Vector Database

IONOS recommends implementing a backup strategy for the data saved to collections in the vector database. This ensures that your collections can be restored in case of accidental deletion or other unforeseen events.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.ionos.com/cloud/ai/ai-model-hub/ai-model-hub.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
