The IONOS AI Model Hub API allows you to access vector databases to persist your document collections and find semantically similar documents.
The vector database is used to persist documents in document collections. Each document is any form of pure text. In the document collection not only the input text is persisted, but also a transformation of the input text into an embedding. Each embedding is a vector of numbers. Input texts which are semantically similar have similar embeddings. A similarity search on a document collection finds the most similar embeddings for a given input text. These embeddings and the corresponding input text are returned to the user.
This tutorial is intended for developers. It assumes you have basic knowledge of:
REST APIs and how to call them
A programming language to handle REST API endpoints (for illustration purposes, the tutorials uses Python and Bash scripting)
By the end of this tutorial, you'll be able to:
Create, delete and query a document collection in the IONOS vector database
Save, delete and modify documents in the document collection and
Answer customer queries using the document collection.
The IONOS AI Model Hub API offers a vector database that you can use to persist text in document collections without having to manage corresponding hardware yourself.
Our AI Model Hub API provides all required functionality without your data being transfered out of Germany.
To get started, you should open your IDE to enter Python code.
Next generate a header document to authenticate yourself against the endpoints of our REST API:
After this step, you have one variable header you can use to access our vector database.
To get started, you should open a terminal and ensure that curl and jq is installed. While curl is essential for communicating with our API service, we use jq throughout our examples the improve the readability of the results of our API.
In this section you learn how to create a document collection. We will use this document collection to fill it with the data from your knowledge base in the next step.
To track, if something went wrong this section also shows how to:
List existing document collections
Remove document collections
Get meta data of a document collection
To create a document collection, you have to specify the name of the collection and a description and invoke the endpoint to generate document collections:
If the creation of the document collection was successful, the status code of the request is 201 and it returns a JSON document with all relevant information concerning the document collection.
To modify the document collection you need its identifier. You can extract it from the returned JSON document in the variable id.
To ensure that the previous step went as expected, you can list the existing document collections.
To retrieve a list of all document collections saved by you:
This query returns a JSON document consisting of your document collections and corresponding meta information
The result consists of 8 attributes per collection of which 3 are relevant for you:
id: The identifier of the document collection
properties.description: The textual description of the document collection
properties.documentsCount: The number of documents persisted in the document collection
If you have not created a collection yet, the field items is an empty list.
If the list of document collections consists of document collections you do not need anymore, you can remove a document collection by invoking:
This query returns a status code which indicates whether the deletion was successful:
204: Status code for successfull deletion
404: Status code given the collection did not exist
If you are interested in the meta data of a collection, you can extract it by invoking:
This query returns a status code which indicates whether the collection exists:
200: Status code if the collection exists
404: Status code given the collection does not exist
The body of the request consists of all meta data of the document collection.
In this section, you learn how to add documents to the newly created document collection. To validate your insertion, this section also shows how to
List the documents in the document collection,
Get meta data for a document,
Update an existing document and
Prune a document collection.
To add an entry to the document collection, you need to at least specify the content, the name of the content and the contentType:
Note:
You need to encode your content using base64 prior to adding it to the document collection. This is done here in line 7 of the source code.
This request returns a status code 200 if adding the document to the document collection was successful.
To ensure that the previous step went as expected, you can list the existing documents of your document collection.
To retrieve a list of all documents in the document collection saved by you:
This query returns a JSON document consisting of your documents in the document collection and corresponding meta information
The result has a field items with all documents in the collection. This field consists of 10 attributes per entry of which 5 are relevant for you:
id: The identifier of the document
properties.content: The base64 encoded content of the document
properties.name: The name of the document
properties.description: The description of the document
properties.labels.number_of_tokens: The number of tokens in the document
If you have not created the collection yet, the request will return a status code 404. It will return a JSON document with the field items set to an empty list if no documents were added yet.
If you are interested in the metadata of a document, you can extract it by invoking:
This query returns a status code which indicates whether the document exists:
200: Status code if the document exists
404: Status code given the document does not exist
The body of the request consists of all meta data of the document.
If you want to update a document, invoke:
This will replace the existing entry in the document collection with the given id by the payload of this request.
If you want to remove all documents from a document collection invoke:
This query returns the status code 204 if pruning the document collection was successful.
Finally, this section shows how to use the document collection and the contained documents to answer a user query.
To retrieve the documents relevant for answering the user query, invoke the query endpoint as follows:
This will return a list of the NUM_OF_DOCUMENTS most relevant documents in your document collection for answering the user query.
In this tutorial you learned how to use the IONOS AI Model Hub API to conduct semantic similarity searches using our vector database.
Namely, you learned how to:
Create a necessary document collection in the vector database and modify it
Insert your documents into the document collection and modify the documents
Conduct semantic similarity searches using your document collection.
The IONOS AI Model Hub allows you to combine Large Language Models and a vector database to implement Retrieval Augmented Generation use cases.
Retrieval Augmented Generation is an approach that allows you to teach an existing Large Language Model, such as LLama or Mistral, to answer not only based on the knowledge the model learned during training, but also based on the knowledge you specified yourself.
Retrieval Augmented Generation uses two components:
a Large Language Model (we offer corresponding models for ) and
If one of your users queries your Retrieval Augmented Generation system, you first get the most similar documents from the corresponding document collection. Second, you ask the Large Language Model to answer the query by using both the knowledge it was trained on and the most similar documents from your document collection.
This tutorial is intended for developers. It assumes you have basic knowledge of:
REST APIs and how to call them
A programming language to handle REST API endpoints (for illustration purposes, the tutorials use Python and Bash scripting)
You should also be familiar with the IONOS:
By the end of this tutorial, you'll be able to: Answer customer queries using a Large Language Model which adds data from your document collections to the answers.
The IONOS AI Model Hub API offers both document embeddings and Large Language Models that you can use to implement retrieval augmented generation without having to manage corresponding hardware yourself.
Our AI Model Hub API provides all required functionality without your data being transferred out of Germany.
You will need this identifier in the subsequent steps.
To get started, you should open your IDE to enter Python code.
Next generate a header document to authenticate yourself against the endpoints of our REST API:
After this step, you have one variable header you can use to access our vector database.
To get started, you should open a terminal and ensure that curl
and jq
are installed. While curl
is essential for communicating with our API service, we use jq
throughout our examples the improve the readability of the results of our API.
To retrieve a list of Large Language Models supported by the IONOS AI Model Hub API enter:
This query returns a JSON document consisting of all foundation models and corresponding meta information.
The JSON document consists an entry items*. This is a list of all available foundation models. Of the 7 attributes per foundation model 3 are relevant for you:
id: The identifier of the foundation model
properties.description: The textual description of the model
properties.name: The name of the model
Note:
The identifiers for the foundation models differ between our API for Retrival Augmented Generation and for the image generation and text generation endpoints compatible with OpenAI.
From the list you generated in the previous step, choose the model you want to use and the id. You will use this id in the next step to use the foundation model.
This section shows how to use the document collection and the contained documents to answer a user query.
To retrieve the documents relevant to answering the user query, invoke the query endpoint as follows:
This will return a list of the NUM_OF_DOCUMENTS
most relevant documents in your document collection for answering the user query.
Now, combine the user query and the result from the document collection in one prompt:
The result will be a JSON-Document
consisting of the answer to the customer and some meta information. You can access it in the field at properties.output
Note:
The best prompt strongly depends on the Large Language Model used. You might need to adapt your prompt to improve results.
The IONOS AI Model Hub allows for automating the process described above. Namely, by specifying the collection ID and the collection query directly to our foundation model endpoint, it first queries the document collection and returns it in a variable which you can then directly use in your prompt. This section describes how to do this.
To implement a Retrieval Augmented Generation use case with only one prompt, you have to invoke the /predictions endpoint of the Large Language Model you want to use and send the prompt as part of the body of this query:
This query conducts all steps necessary to answer a user query using Retrieval Augmented Generation:
The user query (saved at collectionQuery) is sent to the collection (specified at collectionId).
The results of this query are saved in a variable .context, while the user query is saved in a variable .collection_query. You can use both variables in your prompt.
The example prompt uses the variables .context and .collection_query to answer the customer query.
Note:
The best prompt strongly depends on the Large Language Model used. You might need to adapt your prompt to improve results.
In this tutorial, you learned how to use the IONOS AI Model Hub API to implement Retrieval Augmented Generation use cases.
Namely, you learned how to: Derive answers to user queries using the content of your document collection and one of the IONOS foundation models.
The IONOS AI Model Hub offers powerful AI capabilities to meet various needs. Here are five pivotal use cases you can implement with this service:
Text generation models enable advanced language processing tasks, such as content creation, summarization, conversational responses, and question-answering. These models are pre-trained on extensive datasets, allowing for high-quality text generation with minimal setup.
Key Features:
Access open-source Large Language Models (LLMs) via an OpenAI-compatible API.
Ensure data privacy with processing confined within Germany.
For step-by-step instructions on text generation, see the tutorial.
Image generation models allow you to create high-quality, detailed images from descriptive text prompts. These models can be used for applications in creative design, marketing visuals, and more.
Key Features:
Generate photorealistic or stylized images based on specific prompts.
Choose from models optimized for realism or creative, artistic output.
To learn how to implement image generation, see the tutorial.
Vector databases enable you to store and query large collections of documents based on semantic similarity. Converting documents into embeddings allows you to perform effective similarity searches, making it ideal for applications like document retrieval and recommendation systems.
Key Features:
Persist documents and search for semantically similar content.
Manage document collections through simple API endpoints.
RAG combines the strengths of foundation models and vector databases. It retrieves the most relevant documents from the database and uses them to augment the output of a foundation model. This approach enriches the responses, making them more accurate and context-aware.
Key Features:
Use foundation models with additional context from document collections.
Enhance response accuracy and relevance for user queries.
The IONOS AI Model Hub can be seamlessly integrated into various frontend tools that use Large Language Models or text-to-image models through its OpenAI-compatible API. This integration allows you to leverage foundation models in applications without complex setups. For example, using the tool AnythingLLM, you can configure and connect to the IONOS AI Model Hub to serve as the backend for Large Language Model functionalities.
Key Features:
Easily connect to third-party tools with the OpenAI-compatible API.
Enable custom applications with IONOS-hosted foundation models.
These tutorials will guide you through each use case, providing clear and actionable steps to integrate advanced AI capabilities into your applications using the IONOS AI Model Hub.
The IONOS AI Model Hub provides an OpenAI-compatible API that enables high-quality image generation using state-of-the-art foundation models. By inputting descriptive prompts, users can create detailed images directly through the API, without the need for managing underlying hardware or infrastructure.
The following models are currently available for image generation, each suited to different types of visual outputs:
In this tutorial, you will learn how to generate images using foundation models via the IONOS API. This tutorial is intended for developers with basic knowledge of:
REST APIs
A programming language for handling REST API endpoints (Python and Bash examples are provided)
By the end, you will be able to:
Retrieve a list of available image generation models in the IONOS AI Model Hub.
Use prompts to generate images with these models.
To use image generation models, first set up your environment and authenticate using the OpenAI-compatible API endpoints.
Fetch a list of models to see which are available for your use case:
This query returns a JSON document listing each model's name, which you’ll use to specify a model for image generation in later steps.
To generate an image, send a prompt to the /images/generations
endpoint. Customize parameters like size
for the resolution of the output image.
The returned JSON includes several key fields, most importantly:
data.[].b64_json
: The generated image in base64 format.
usage.prompt_tokens
: Token count for the input prompt.
usage.total_tokens
: Token count for the entire process (usually zero for image generation, as billing is per image).
In this tutorial, you learned how to:
Access available image generation models.
Use descriptive prompts to generate high-quality images, ideal for applications in design, creative work, and more.
To get started, set up a document collection using and get the identifier of this document collection.
For detailed instructions, see tutorial.
To learn how to implement Retrieval Augmented Generation, see the tutorial.
For detailed guidance on integrating with tools, see the tutorial.
For information on text generation, refer to our dedicated tutorial on models.
stability.ai (License)
Stable Diffusion XL
Generates photorealistic images, ideal for marketing visuals, product mockups, and natural scenes.
BlackForestLab (License)
FLUX.1-schnell
Generates artistic, stylized images, well-suited for creative projects, digital art, and unique concept designs.
The IONOS AI Model Hub offers an OpenAI-compatible API that enables powerful text generation capabilities through foundation models. These Large Language Models (LLMs) can perform a wide variety of tasks, such as generating conversational responses, summaries, and contextual answers, without requiring you to manage hardware or extensive infrastructure.
The following models are currently available for text generation, each suited to different applications:
Llama 3.1 Instruct (8B, 70B and 405B)
Ideal for dialogue use cases and natural language tasks: conversational agents, virtual assistants, and chatbots.
Code Llama Instruct HF (13B)
Focuses on generating different kinds of computer code, understands programming languages
Mistral Instruct v0.3 (7B), Mixtral (8x7B)
Ideal for: Conversational agents, virtual assistants, and chatbots; Comparison to Llama 3: better with European languages; supports longer context length
In this tutorial, you will learn how to generate text using foundation models via the IONOS API. This tutorial is intended for developers with basic knowledge of:
REST APIs
A programming language for handling REST API endpoints (Python and Bash examples are provided)
By the end, you will be able to:
Retrieve a list of text generation models available in the IONOS AI Model Hub.
Apply prompts to these models to generate text responses, supporting applications like virtual assistants and content creation.
To use text generation models, first set up your environment and authenticate using the OpenAI-compatible API endpoints.
Fetch a list of models to see which are available for your use case:
This query returns a JSON document listing each models name, which you’ll use to specify a model for text generation in later steps.
To generate text, send a prompt to the chat/completions endpoint.
The returned JSON includes several key fields, most importantly:
choices.[].message.content
: The generated text based on your prompt.
usage.prompt_tokens
: Token count for the input prompt.
usage.completion_tokens
: Token count for the generated output.
In this tutorial, you learned how to:
Access available text generation models.
Use prompts to generate text responses, ideal for applications such as conversational agents, content creation, and more.
For information on image generation, refer to our dedicated tutorial on text-to-image models.
The IONOS AI Model Hub provides an OpenAI-compatible API, allowing seamless integration with various frontend tools that use Large Language Models (LLMs). This guide walks you through the setup process, using AnythingLLM as an example tool.
By the end of this tutorial, you will be able to configure AnythingLLM to use the IONOS AI Model Hub as its backend for AI-powered responses.
You will need an authentication token to access the IONOS AI Model Hub. For more information about how to generate your token in the IONOS DCD, see Generate authentication token.
Save this token in a secure place, as you’ll need to enter it into AnythingLLM during setup.
The IONOS AI Model Hub offers a variety of Large Language Models to suit different needs. Choose the model that best fits your use case from the table below:
Llama 3.1 Instruct, 8B
meta-llama/Meta-Llama-3.1-8B-Instruct
Suitable for general-purpose dialogue and language tasks.
Llama 3.1 Instruct, 70B
meta-llama/Meta-Llama-3.1-70B-Instruct
Ideal for more complex conversational agents and virtual assistants.
Llama 3.1 Instruct, 405B
meta-llama/Meta-Llama-3.1-405B-Instruct-FP8
Optimized for extensive dialogue tasks, supporting large context windows.
Mistral Instruct v0.3, 7B
mistralai/Mistral-7B-Instruct-v0.3
Designed for conversational agents, with enhanced European language support.
Mixtral, 8x7B
mistralai/Mixtral-8x7B-Instruct-v0.1
Supports multilingual interactions and is optimized for diverse contexts.
During setup, you’ll enter the model’s "Model Name" value into AnythingLLM’s configuration.
For connecting to the IONOS AI Model Hub, use the following Base URL for the OpenAI-compatible API:
You will enter this URL in the configuration settings of AnythingLLM.
With your authentication token, selected model name, and base URL in hand, you’re ready to set up AnythingLLM:
Open AnythingLLM and go to the configuration page for the Large Language Model (LLM) settings.
In AnythingLLM, this can be accessed by clicking the wrench icon in the lower left corner, then navigating to AI Providers -> LLM.
Choose Generic OpenAI as the provider.
Enter the following information in the respective fields:
API Key: Your IONOS authentication token.
Model Name: The name of the model you selected from the table (e.g., meta-llama/Meta-Llama-3.1-8B-Instruct
).
Base URL: https://openai.inference.de-txl.ionos.com/v1
Your screen should look similar to the image below:
Click Save Changes to apply the settings.
From now on, AnythingLLM will use the IONOS AI Model Hub as its backend, enabling AI-powered functionality based on your chosen Large Language Model.
This guide provides a straightforward path for integrating the IONOS AI Model Hub into third-party frontend tools using the OpenAI-compatible API. For other tools and more advanced configurations, the steps will be similar: generate an API key, select a model, and configure the tool’s API settings.
Meta ()
Meta ()
Mistral AI ()