The IONOS AI Model Hub is designed to simplify the deployment and management of advanced machine learning models, eliminating the complexities associated with hardware and infrastructure. It hosts a range of powerful AI models that let developers implement sophisticated AI solutions without worrying about underlying hardware or operational overhead.
IONOS' AI Model Hub supports various use cases, including:
Foundation Models: Utilize pre-trained Large Language Models (LLMs) and text-to-image models.
Document Embeddings: Store and query extensive document collections based on semantic similarity.
Retrieval Augmented Generation (RAG): Enhance responses by combining LLMs with contextually relevant documents stored in a vector database.
The IONOS AI Model Hub Service offers a wide array of features tailored to meet the needs of modern developers:
Managed Hosting: Utilize AI models without needing to maintain the underlying infrastructure.
Security and Compliance: Keep your data secure and compliant with regulations, as data processing is confined within Germany. Your input data is not used for training purposes in any way.
Scalability: Scale your AI deployments seamlessly based on your needs.
Integration Options: Easily integrate with your applications using REST APIs, with support for popular programming languages like Python and Bash.
Diverse Model Offerings: Choose from various foundation models, including LLMs and text-to-image models, each capable of generating innovative and sophisticated AI outputs.
Document Embeddings: Store and manage document collections and perform semantic similarity searches to extract contextually relevant information.
Retrieval Augmented Generation: Combine vector databases and foundation models to generate enhanced outputs that are contextually aware, providing more accurate and helpful responses.
Token-based Billing: Pay for the services based on the number of tokens used, enabling cost-efficient usage and transparency in billing.
Understanding the foundational concepts of the IONOS AI Model Hub will help you leverage its full potential:
Foundation models are pre-trained on massive datasets to perform a wide range of language and image processing tasks. They can generate text, answer questions, and create images based on textual descriptions. With IONOS, you can access these models via APIs, simplifying the process of integrating advanced AI capabilities into your applications.
Access various open-source LLMs and text-to-image models.
Use models without managing underlying hardware.
Maintain data privacy and comply with German data protection regulations.
Vector databases provide a way to store and manage document collections, enabling semantic similarity searches. Documents are converted to embeddings (vector representations), allowing the discovery of related content through similarity searches.
Persist documents and search for semantically similar content.
Use API endpoints to manage document collections and perform searches.
Ensure document storage and processing stays within Germany.
RAG combines the capabilities of foundation models and vector databases to improve the quality of responses. By supplementing the inherent knowledge of LLMs with specific, contextually relevant information from document collections, RAG provides more accurate and detailed answers.
Use foundation models together with document collections from vector databases.
Improve response accuracy and relevance by incorporating additional context.
Implement sophisticated AI solutions using a combination of querying and generation.
Use dedicated REST API endpoints to interact with various models and services. These endpoints are designed to facilitate the quick and easy integration of AI capabilities into your applications.
Model Management: Endpoints for retrieving model lists, querying models, and managing predictions.
Document Management: Endpoints for creating, modifying, retrieving, and deleting document collections and individual documents.
Querying and Generating: Endpoints for combining semantic searches with generative models to implement RAG scenarios.
Security is paramount, and IONOS provides robust mechanisms to authenticate and authorize API requests. Users must generate and use API tokens to access the AI services securely.
IONOS ensures that all data processing complies with German and European data protection regulations. Your data is processed within Germany, providing an additional layer of security and compliance.
IONOS offers expert technical support to help you troubleshoot and optimize your AI deployments. Whether you need assistance with API integration or model performance, the support and Professional Service team is available during German business hours to ensure your success.
IONOS does not back up the data saved to collections in the vector database. Please ensure that you can restore the content of your collections in case of deletion.
The IONOS AI Model Hub offers powerful AI capabilities to meet various needs. Here are three pivotal use cases you can implement with this service:
Foundation models are pre-trained on extensive datasets, allowing you to leverage state-of-the-art AI for text and image generation. These models can streamline tasks such as content generation, summarization, and question-answering.
Key Features:
Access various open-source Large Language Models (LLMs) and text-to-image models without managing the hardware.
Ensure data privacy with processing confined within Germany.
For a step-by-step guide on using Foundation Models, see the Foundation Models tutorial.
Vector databases enable you to store and query large collections of documents based on semantic similarity. Converting documents into embeddings allows you to perform effective similarity searches, making it ideal for applications like document retrieval and recommendation systems.
Key Features:
Persist documents and search for semantically similar content.
Manage document collections through simple API endpoints.
For detailed instructions, see Document Embeddings tutorial.
RAG combines the strengths of foundation models and vector databases. It retrieves the most relevant documents from the database and uses them to augment the output of a foundation model. This approach enriches the responses, making them more accurate and context-aware.
Key Features:
Use foundation models with additional context from document collections.
Enhance response accuracy and relevance for user queries.
To learn how to implement RAG, see the Retrieval Augmented Generation tutorial.
These tutorials will guide you through each use case, providing clear and actionable steps to integrate advanced AI capabilities into your applications using the IONOS AI Model Hub.
The IONOS AI Model Hub is a comprehensive platform that empowers developers to easily implement advanced AI functionalities. You can enhance your applications' capabilities by leveraging managed foundation models, vector databases, and advanced retrieval augmented generation techniques while ensuring security and compliance. Explore the potential of IONOS AI Model Hub Service to transform your AI projects today.
Prerequisite: Before using the AI Model Hub, make sure you have a working Authentication Token. Without an Authentication Token, you cannot access the AI Model Hub.
The IONOS AI Model Hub API allows you to access foundation models, namely Large Language Models (LLMs) and text-to-image models. Currently, we offer the following foundation models:
In this tutorial, you will learn how to access all foundation models hosted by IONOS. This tutorial is intended for developers. It assumes you have basic knowledge of:
REST APIs and how to call them
A programming language to handle REST API endpoints (for illustration purposes, the tutorials use Python and Bash scripting)
By the end of this tutorial, you will be able to:
Get a list of all foundation models IONOS currently offers
Apply your prompt to one of the offered foundation models
The IONOS AI Model Hub API is an inference service that you can use to apply deep learning foundation models without having to manage necessary hardware yourself.
Our foundation model offering provides many state-of-the-art open-source models, which you can use without your data being transferred out of Germany.
Using the foundation models enables you to use Generative Artificial Intelligence out of the box.
To get started, you should open your IDE to enter Python code.
1. Install required libraries
You need to install the module requests in your Python environment. Optionally, install pandas to format results:
2. Import required libraries
You need to import the modules requests and pandas:
After this step, you have installed all Python modules needed to use the foundation model API endpoints.
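With the modules installed, a minimal setup sketch might look as follows. The base URL and the environment-variable name are assumptions made for this example; adapt them to your account and the current API reference:

```python
import os

# Assumed base URL for the AI Model Hub inference API -- check the current
# API reference for the correct regional endpoint.
API_BASE = "https://inference.de-txl.ionos.com"

# The token is read from an environment variable (name chosen for this example).
API_TOKEN = os.environ.get("IONOS_API_TOKEN", "<your-token>")

def auth_header(token: str) -> dict:
    """Build the Bearer-authentication header used by all subsequent requests."""
    return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}

header = auth_header(API_TOKEN)
```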
To get started, you should open a terminal and ensure that curl and jq are installed. While curl is essential for communicating with our API service, we use jq throughout our examples to improve the readability of the API results.
Invoke endpoint to get all models
To retrieve a list of foundation models supported by the IONOS AI Model Hub API enter:
This query returns a JSON document consisting of all foundation models and their corresponding metadata.
Convert the list of models to a human-readable form
You can convert this JSON document to a pandas dataframe using:
You can pretty print the content of this JSON document using jq:
The JSON document contains seven attributes per foundation model, of which three are relevant for you:
id: The identifier of the foundation model
properties.description (IONOS API only): The textual description of the model
properties.name (IONOS API only): The name of the model
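To illustrate, here is how you might pick out these three fields from a response in plain Python. The sample item below is invented for illustration and only mirrors the attribute names described above:

```python
# Invented sample item mirroring the attribute names described above.
sample_items = [
    {
        "id": "example-model-id",
        "properties": {
            "name": "Example Model",
            "description": "An example description",
        },
    }
]

# Reduce each model entry to the three relevant attributes.
overview = [
    (m["id"], m["properties"]["name"], m["properties"]["description"])
    for m in sample_items
]
print(overview)
```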
Note:
The identifiers for the foundation models differ between the IONOS API and the OpenAI API.
Select the model to use
From the list you generated in the previous step, choose the model you want to use and note its id. You will use this id in the next step to query the foundation model.
Apply prompt to foundation model
To use a foundation model with a prompt you wrote, you have to invoke the /predictions endpoint of the model and send the prompt as part of the body of this query:
The endpoint will return the result after applying the prompt to the foundation model.
Our Large Language Models support two parameters when querying:
max_length (max_tokens for OpenAI compatibility) specifies the maximum length of the output generated by the Large Language Model, in tokens.
temperature specifies the degree of creativity of the Large Language Model. It can vary between 0 and 1; lower values produce less creative, more deterministic output, while higher values produce more creative output.
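Putting this together, a request body for the /predictions endpoint might be sketched as follows. The exact field names and value types should be checked against the current API reference; this shape follows the properties/options structure described in this tutorial:

```python
MODEL_ID = "<model-id-from-previous-step>"  # placeholder

# Assumed body shape: the prompt goes into properties.input, tuning
# parameters into properties.options.
body = {
    "properties": {
        "input": "Explain retrieval augmented generation in one sentence.",
        "options": {
            "max_length": "500",   # maximum output length in tokens
            "temperature": "0.1",  # low value: less creative, more deterministic
        },
    }
}
# The request itself (not executed here):
# requests.post(f"{API_BASE}/models/{MODEL_ID}/predictions", json=body, headers=header)
```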
Extract result
The result returned by the endpoint combines metadata and the output of the foundation model in one JSON object. The relevant data is stored in the field properties. You can access it using:
The field properties again consists of several key-value pairs. The most relevant are:
input: The prompt you specified
output: The output of the foundation model after applying your prompt
inputLengthInTokens: The length of tokens of your input
outputLengthInTokens: The length of tokens of your output
The result consists of several key-value pairs. The most relevant are:
choices.[].message.content: The output of the foundation model after applying your prompt
usage.prompt_tokens: The length of tokens of your input
usage.completion_tokens: The length of tokens of your output
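For the IONOS API variant, extracting the output and token counts can be sketched like this. The response below is a hand-written sample that only follows the field names described above:

```python
# Hand-written sample response following the IONOS field names above.
response_json = {
    "properties": {
        "input": "What is a vector database?",
        "output": "A vector database stores embeddings ...",
        "inputLengthInTokens": 7,
        "outputLengthInTokens": 41,
    }
}

props = response_json["properties"]
answer = props["output"]
# Billing is based on input plus output tokens.
total_tokens = props["inputLengthInTokens"] + props["outputLengthInTokens"]
```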
Note:
You are billed based on the length of your input and output in tokens. That is, you can calculate the cost of each query based on the fields inputLengthInTokens and outputLengthInTokens when using the IONOS API and usage.prompt_tokens and usage.completion_tokens when using the OpenAI API.
In this tutorial you learned how to use the IONOS AI Model Hub API to apply your prompts to the hosted foundation models.
Namely, you learned how to:
Get the list of supported foundation models
Make predictions by inputting your prompt to one of the foundation models.
The IONOS AI Model Hub allows you to combine foundation models and a vector database to implement retrieval augmented generation use cases.
Retrieval augmented generation is an approach that allows you to teach an existing Large Language Model, such as Llama or Mistral, to answer not only based on the knowledge the model learned during training, but also based on knowledge you provide yourself.
Retrieval augmented generation uses two components: a document collection in a vector database that stores your own knowledge, and a Large Language Model.
If one of your users queries your retrieval augmented generation system, you first get the most similar documents from the corresponding document collection. Second, you ask the Large Language Model to answer the query by using both the knowledge it was trained on and the most similar documents from your document collection.
This tutorial is intended for developers. It assumes you have basic knowledge of:
REST APIs and how to call them
A programming language to handle REST API endpoints (for illustration purposes, the tutorials use Python and Bash scripting)
You should also be familiar with the IONOS:
By the end of this tutorial, you'll be able to: Answer customer queries using a Large Language Model that enriches its answers with data from your document collections.
The IONOS AI Model Hub API offers both document embeddings and Large Language Models that you can use to implement retrieval augmented generation without having to manage corresponding hardware yourself.
Our AI Model Hub API provides all required functionality without your data being transferred out of Germany.
To get started,
You will need both identifiers in the subsequent steps.
Next, you should open your IDE to enter Python code.
1. Install required libraries
You need to install the modules requests and pandas to your Python environment:
2. Import required libraries
You need to import the following modules:
3. Generate header for API requests
Next, generate a header document to authenticate yourself against the REST API:
After this step, you have installed all Python modules and have one variable header you can use to implement your first retrieval augmented generation use case.
To get started, you should open a terminal and ensure that curl and jq are installed. While curl is essential for communicating with our API service, we use jq throughout our examples to improve the readability of the API results.
This section shows how to use the document collection and the contained documents to answer a user query.
Retrieve documents relevant for querying
To retrieve the documents relevant to answering the user query, invoke the query endpoint as follows:
This will return a list of the NUM_OF_DOCUMENTS most relevant documents in your document collection for answering the user query.
Decode Base64 encoded documents
Now, decode the retrieved documents back to a string using:
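Decoding can be sketched as follows. The nesting of the document content in the match result is an assumption based on the document fields described in the vector-database tutorial:

```python
import base64

# Invented sample match with base64-encoded document content.
encoded = base64.b64encode(
    "Our returns policy allows returns within 30 days.".encode("utf-8")
).decode()
match = {"document": {"properties": {"content": encoded}}}

# Decode the content back to a readable string.
text = base64.b64decode(match["document"]["properties"]["content"]).decode("utf-8")
print(text)
```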
Generate final answer
Now, combine the user query and the result from the document collection in one prompt:
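A minimal prompt-assembly sketch; the wording is illustrative and should be adapted to the model you use:

```python
user_query = "How do I return a product?"
# Retrieved and decoded document content from the previous step.
context = "Our returns policy allows returns within 30 days."

# Combine the retrieved context and the user query into one prompt.
prompt = (
    "Answer the question using only the context below.\n"
    f"Context: {context}\n"
    f"Question: {user_query}"
)
```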
The result will be a JSON document consisting of the answer to the customer and some metadata. You can access the answer using:
Note:
The best prompt strongly depends on the Large Language Model used. You might need to adapt your prompt to improve results.
To implement a retrieval augmented generation use case with only one prompt, you have to invoke the /predictions endpoint of the Large Language Model you want to use and send the prompt as part of the body of this query:
This query conducts all steps necessary to answer a user query using retrieval augmented generation:
The user query (saved at collectionQuery) is sent to the collection (specified at collectionId).
The results of this query are saved in a variable .context, while the user query is saved in a variable .collection_query. You can use both variables in your prompt.
The example prompt uses the variables .context and .collection_query to answer the customer query.
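As a sketch, such a request body might look as follows. The collectionId and collectionQuery field names follow the description above, but the placeholder syntax for referencing .context and .collection_query inside the prompt is illustrative; check the API reference for the exact template syntax:

```python
body = {
    "properties": {
        # The {{...}} placeholder syntax for .context and .collection_query
        # is illustrative only.
        "input": "Context: {{.context}}\nQuestion: {{.collection_query}}\nAnswer:",
        "collectionId": "<your-collection-id>",          # placeholder
        "collectionQuery": "How do I return a product?",
        "options": {"max_length": "500"},
    }
}
# The request itself (not executed here):
# requests.post(f"{API_BASE}/models/{MODEL_ID}/predictions", json=body, headers=header)
```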
Note:
The best prompt strongly depends on the Large Language Model used. You might need to adapt your prompt to improve results.
In this tutorial, you learned how to use the IONOS AI Model Hub API to implement retrieval augmented generation use cases.
Namely, you learned how to: Derive answers to user queries using the content of your document collection and one of the IONOS foundation models.
The IONOS AI Model Hub API allows you to access vector databases to persist your document collections and find semantically similar documents.
The vector database persists documents in document collections. Each document is plain text. A document collection stores not only the input text but also a transformation of the input text into an embedding, which is a vector of numbers. Input texts that are semantically similar have similar embeddings. A similarity search on a document collection finds the most similar embeddings for a given input text and returns these embeddings together with the corresponding input texts.
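The idea that similar texts have similar embeddings can be illustrated with a toy cosine-similarity computation. Real embeddings have hundreds of dimensions and are produced by the embedding model; the three-dimensional vectors below are hand-written for illustration:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy three-dimensional "embeddings".
doc_returns = [0.9, 0.1, 0.0]   # text about the returns policy
doc_refunds = [0.8, 0.2, 0.1]   # semantically related text
doc_weather = [0.0, 0.1, 0.9]   # unrelated text

# Related texts score higher than unrelated ones.
assert cosine_similarity(doc_returns, doc_refunds) > cosine_similarity(doc_returns, doc_weather)
```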
This tutorial is intended for developers. It assumes you have basic knowledge of:
REST APIs and how to call them
A programming language to handle REST API endpoints (for illustration purposes, the tutorials use Python and Bash scripting)
By the end of this tutorial, you'll be able to:
Create, delete and query a document collection in the IONOS vector database
Save, delete and modify documents in the document collection and
Answer customer queries using the document collection.
The IONOS AI Model Hub API offers a vector database that you can use to persist text in document collections without having to manage corresponding hardware yourself.
Our AI Model Hub API provides all required functionality without your data being transferred out of Germany.
To get started, you should open your IDE to enter Python code.
Install required libraries
You need to install the modules requests and pandas in your Python environment:
Import required libraries
You need to import the following modules:
Generate header for API requests
Next, generate a header document to authenticate yourself against the REST API:
After this step, you have installed all Python modules and have one variable header you can use to access our vector database.
To get started, you should open a terminal and ensure that curl and jq are installed. While curl is essential for communicating with our API service, we use jq throughout our examples to improve the readability of the API results.
In this section, you learn how to create a document collection. In the next step, we will fill this document collection with the data from your knowledge base.
To help you verify that everything went as expected, this section also shows how to:
List existing document collections
Remove document collections
Get the metadata of a document collection
Create a document collection
To create a document collection, you have to specify the name of the collection and a description and invoke the endpoint to generate document collections:
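A sketch of such a creation request follows. The /collections endpoint path and the body shape are assumptions for this example; check the API reference for the exact path and fields:

```python
# Assumed body shape: name and description under "properties".
body = {
    "properties": {
        "name": "my-documents",
        "description": "Demo collection for the tutorial",
    }
}
# The request itself (not executed here); the /collections path is an assumption:
# response = requests.post(f"{API_BASE}/collections", json=body, headers=header)
```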
If the creation of the document collection was successful, the status code of the request is 201 and it returns a JSON document with all relevant information concerning the document collection.
Extract collection id from request result
To modify the document collection you need its identifier. You can extract it using:
To ensure that the previous step went as expected, you can list the existing document collections.
List all existing document collections
To retrieve a list of all document collections saved by you:
This query returns a JSON document consisting of your document collections and their corresponding metadata.
Convert the list of document collections to a pandas dataframe
You can convert this JSON document to a human readable form using:
The result consists of eight attributes, of which three are relevant for you:
id: The identifier of the document collection
properties.description: The textual description of the document collection
properties.documentsCount: The number of documents persisted in the document collection
If you have not created a collection yet, the field items is an empty list.
If the list contains document collections you no longer need, you can remove a document collection by invoking:
This query returns a status code which indicates whether the deletion was successful:
204: Status code for successful deletion
404: Status code if the collection did not exist
Access metadata of a document collection
If you are interested in the meta data of a collection, you can extract it by invoking:
This query returns a status code which indicates whether the collection exists:
200: Status code if the collection exists
404: Status code if the collection does not exist
Extract collection metadata from the request result
The body of the response contains all metadata of the document collection.
In this section, you learn how to add documents to the newly created document collection. To validate your insertion, this section also shows how to:
List the documents in the document collection,
Get meta data for a document,
Update an existing document and
Prune a document collection.
To add an entry to the document collection, you need to specify at least the content, the name of the content, and the contentType:
Note:
You need to encode your content using base64 before adding it to the document collection.
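The base64 encoding can be sketched as follows. The items wrapper in the body is an assumption for this example; the property names follow the document attributes listed later in this tutorial:

```python
import base64

content = "Our returns policy allows returns within 30 days."

# Content must be base64-encoded before it is added to the collection.
body = {
    "items": [  # "items" wrapper is an assumption -- check the API reference
        {
            "properties": {
                "name": "returns-policy.txt",
                "contentType": "text/plain",
                "content": base64.b64encode(content.encode("utf-8")).decode(),
            }
        }
    ]
}
```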
This request returns a status code 200 if adding the document to the document collection was successful.
To ensure that the previous step went as expected, you can list the existing documents of your document collection.
List all existing documents in a document collection
To retrieve a list of all documents in the document collection saved by you:
This query returns a JSON document consisting of the documents in your document collection and their corresponding metadata.
Convert list of documents to a pandas dataframe
You can convert this JSON document to a pandas dataframe using:
The result consists of ten attributes, of which five are relevant for you:
id: The identifier of the document
properties.content: The base64 encoded content of the document
properties.name: The name of the document
properties.description: The description of the document
properties.labels.number_of_tokens: The number of tokens in the document
If you have not created the collection yet, the request will return a status code 404. It will return a JSON document with the field items set to an empty list if no documents were added yet.
Access meta data from a document
If you are interested in the metadata of a document, you can extract it by invoking:
This query returns a status code which indicates whether the document exists:
200: Status code if the document exists
404: Status code if the document does not exist
Extract document metadata from the request result
The body of the response contains all metadata of the document.
If you want to update a document, invoke:
This will replace the existing entry with the given id in the document collection with the payload of this request.
If you want to remove all documents from a document collection invoke:
This query returns the status code 204 if pruning the document collection was successful.
Finally, this section shows how to use the document collection and the contained documents to answer a user query.
Retrieve documents relevant for querying
To retrieve the documents relevant for answering the user query, invoke the query endpoint as follows:
This will return a list of the NUM_OF_DOCUMENTS most relevant documents in your document collection for answering the user query.
Decode Base64 encoded documents
Now, decode the retrieved documents back to a string using:
In this tutorial you learned how to use the IONOS AI Model Hub API to conduct semantic similarity searches using our vector database.
Namely, you learned how to:
Create a necessary document collection in the vector database and modify it
Insert your documents into the document collection and modify the documents
Conduct semantic similarity searches using your document collection.
From | Foundation Model | Purpose |
---|---|---|
Meta (Licence) | Llama 3.1 Instruct (8B and 70B) | Ideal for dialogue use cases and natural language tasks: conversational agents, virtual assistants, and chatbots. |
Meta (Licence) | Code Llama Instruct HF (13B) | Focuses on generating different kinds of computer code; understands programming languages. |
Mistral AI (Licence) | Mistral Instruct v0.3 (7B), Mixtral (8x7B) | Ideal for conversational agents, virtual assistants, and chatbots; compared to Llama 3, better with European languages and supports a longer context length. |
stability.ai (Licence) | Stable Diffusion XL | Text to high-quality images |
a Large Language Model (we offer a corresponding model as part of our ) and
set up a document collection using and get the identifier of this document collection.
choose a Large Language Model out of our and derive the identifier of this Large Language Model.
For details on how to use the foundation model, see .
Our allows for automating the process described above. Namely, by specifying the collection ID and the collection query directly to our foundation model endpoint, it first queries the document collection and returns the result in a variable which you can then directly use in your prompt. This section describes how to do this.
For details on how to use the foundation model, see .
Use the API to access Foundation Models
Use the API to persist Document Embeddings.
Use Foundation Models and Document Embeddings to implement a Retrieval Augmented Generation use case.