Text Embeddings


The IONOS AI Model Hub provides an OpenAI-compatible API that generates embeddings for text input using state-of-the-art embedding models. Embeddings are multi-dimensional vectors, that is, lists of numerical values. The more semantically similar two text inputs are, the more similar their embeddings.

Supported embedding models

The following models are currently available for embedding generation in the IONOS AI Model Hub, each suited for different use cases:

Model Name | Description
Paraphrase Multilingual MPNet base v2 | Transformer model supporting several languages with high performance and short input length (128 tokens).
BAAI Large EN V1.5 | Embedding model specific to English, for medium-sized inputs (512 tokens).
BAAI M3 | Multipurpose embedding model for multilingual text (100 working languages) and large documents (8192 tokens).

Embeddings and semantic similarity

One trivial task for each of us is identifying semantically similar concepts. Think, for example, of the following three texts:

  • "AI Model Hub"

  • "Michael Jackson"

  • "best selling music artists"

For a human, it would be trivial to decide that "Michael Jackson" and "best selling music artists" are semantically similar, while "AI Model Hub" is not.

Embeddings, a central concept in modern foundation models, offer an effective solution for identifying semantically similar texts. An embedding is a numerical vector representation of a text with one central property: semantically similar texts have similar vectors.

In this sense, the three texts above could be mapped to the following embeddings:

  • "AI Model Hub": (0.10; 0.10)

  • "Michael Jackson": (0.95; 0.90)

  • "best selling music artists": (0.96; 0.87)

Embedding vectors typically have dozens to thousands of dimensions, but for simplicity, we use 2D vectors in this example. One could illustrate these embeddings in a chart as follows:

Figure: Embedding Space

As you can see, the texts "Michael Jackson" and "best selling music artists" are close to each other, while "AI Model Hub" is far away from both.
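The notion of "close to each other" can be made concrete with a small sketch. It assumes that closeness is measured as Euclidean distance, which is only one possible choice, and it reuses the illustrative 2D vectors from above (they are not real model output). The helper names `euclidean_distance` and `closest_pair` are hypothetical.

```python
import math

# Illustrative 2D vectors from the example above (not real model output).
embeddings = {
    "AI Model Hub": (0.10, 0.10),
    "Michael Jackson": (0.95, 0.90),
    "best selling music artists": (0.96, 0.87),
}

def euclidean_distance(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.dist(a, b)

def closest_pair(vectors):
    """Return the two labels whose vectors are nearest to each other."""
    labels = list(vectors)
    pairs = [(x, y) for i, x in enumerate(labels) for y in labels[i + 1:]]
    return min(pairs, key=lambda p: euclidean_distance(vectors[p[0]], vectors[p[1]]))

for first, second in [("Michael Jackson", "best selling music artists"),
                      ("Michael Jackson", "AI Model Hub")]:
    distance = euclidean_distance(embeddings[first], embeddings[second])
    print(f"{first} <-> {second}: {distance:.3f}")

print(closest_pair(embeddings))
```

With these toy values, the two music-related texts are roughly 0.03 apart, while "Michael Jackson" and "AI Model Hub" are roughly 1.17 apart.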

Embedding models convert inputs such as texts into embeddings of this kind. They are available for texts in a single language, texts in multiple languages, images, spoken language, and more. The IONOS AI Model Hub currently supports embedding models for texts in English as well as models for multiple languages.

Additional information

Mathematically, to compare the similarity of two texts, one has to define a "similarity score". A frequently used similarity score is cosine similarity, which measures the cosine of the angle between two vectors. For vectors normalized to unit length, it is identical to the dot product. Cosine similarity returns values between -1 and 1: semantically similar texts score close to 1, and the lower the score, the less similar the texts.
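The following minimal sketch illustrates the score with NumPy. The function names `cosine_similarity` and `normalize` and all vectors are hypothetical and chosen only to show the range of values; they are not real embeddings. The last lines show that the plain dot product coincides with cosine similarity once both vectors have unit length.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (both non-zero)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def normalize(v):
    """Scale a non-zero vector to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

# Hypothetical vectors chosen only to show the range of the score.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # approx. 1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 3.0])       # 0
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])       # approx. -1

# For unit-length vectors, the dot product equals the cosine similarity.
u, v = normalize([3.0, 4.0]), normalize([4.0, 3.0])
dot_of_normalized = float(np.dot(u, v))

print(same_direction, orthogonal, opposite, dot_of_normalized)
```

Note that cosine similarity ignores vector length: two vectors pointing in the same direction score close to 1 even if one is much longer than the other.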

Overview

In this tutorial, you will learn how to generate embeddings via the OpenAI-compatible API. This tutorial is intended for developers with basic knowledge of:

  • REST APIs

  • A programming language for handling REST API endpoints (Python and Bash examples are provided)

By the end, you will be able to:

  1. Retrieve a list of available embedding models in the IONOS AI Model Hub.

  2. Use the API to generate embeddings with these models.

  3. Use the generated embeddings as input to calculate similarity scores.

Getting Started with Embedding Generation

To use embedding models, first set up your environment and authenticate using the OpenAI-compatible API endpoints.

Step 1: Retrieve Available Models

Fetch a list of embedding models to see which models are available for your use case:

# Python example to retrieve available models
import requests

IONOS_API_TOKEN = "[YOUR API TOKEN HERE]"

endpoint = "https://openai.inference.de-txl.ionos.com/v1/models"

header = {
    "Authorization": f"Bearer {IONOS_API_TOKEN}",
    "Content-Type": "application/json"
}

# Send the request and print the parsed JSON response
response = requests.get(endpoint, headers=header)
print(response.json())

Output (excerpt)

      {
         "id":"sentence-transformers/paraphrase-multilingual-mpnet-base-v2",
         "object":"model",
         "created":1677610602,
      },
      {
         "id":"BAAI/bge-m3",
         "object":"model",
         "created":1677610602,
      },
      {
         "id":"BAAI/bge-large-en-v1.5",
         "object":"model",
         "created":1677610602,
      },

This query returns a JSON document listing each model's identifier (the id field), which you will use to specify a model for embedding generation in later steps.
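As a minimal sketch, the identifiers could be collected from the parsed response as follows. It assumes the model entries sit in a top-level "data" list, as is common for OpenAI-style responses; the output excerpt above does not show the enclosing structure, so treat this as an assumption. The function name `extract_model_ids` and the sample payload are hypothetical.

```python
def extract_model_ids(payload):
    """Collect the "id" of every model entry in a /models response.

    Assumption: the entries sit in a top-level "data" list (OpenAI-style);
    a bare list of entries is accepted as well.
    """
    entries = payload.get("data", []) if isinstance(payload, dict) else payload
    return [entry["id"] for entry in entries if "id" in entry]

# Hypothetical payload built from the entries shown in the output above.
sample = {"data": [
    {"id": "sentence-transformers/paraphrase-multilingual-mpnet-base-v2", "object": "model"},
    {"id": "BAAI/bge-m3", "object": "model"},
    {"id": "BAAI/bge-large-en-v1.5", "object": "model"},
]}

model_ids = extract_model_ids(sample)
print(model_ids)
```

Any of the returned strings can then be used as the model name in the embedding request.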

Step 2: Generate Embeddings with Your Prompt

To generate an embedding, send the text to the /embeddings endpoint.

# Python example for embedding generation
import requests

IONOS_API_TOKEN = "[YOUR API TOKEN HERE]"
MODEL_NAME = "[MODEL NAME HERE]"
INPUT = ["Michael Jackson", "Metallica"]

endpoint = "https://openai.inference.de-txl.ionos.com/v1/embeddings"

header = {
    "Authorization": f"Bearer {IONOS_API_TOKEN}",
    "Content-Type": "application/json"
}
body = {
    "model": MODEL_NAME,
    "input": INPUT
}

# Send the texts to the API and parse the JSON response
result = requests.post(endpoint, json=body, headers=header).json()
print(result)

Step 3: Calculate Similarity Scores

The returned JSON includes several key fields, most importantly:

  • data[..].embedding: The generated embedding as a vector of numeric values, one entry per input text.

  • usage.prompt_tokens: Token count for the input prompt.

  • usage.total_tokens: Total token count for the request.
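The fields above can be read from the parsed response as in the following sketch. It assumes that each entry in "data" may carry an "index" field identifying the input text it belongs to, as in OpenAI-style responses; if the field is absent, the list order is used. The function name `parse_embedding_response` and the shortened sample response are hypothetical.

```python
def parse_embedding_response(result):
    """Return (vectors, usage) from an embeddings response dictionary.

    Assumption: each item in "data" may carry an "index" field giving the
    position of its input text; if absent, the list order is used.
    """
    items = sorted(enumerate(result["data"]),
                   key=lambda pair: pair[1].get("index", pair[0]))
    vectors = [item["embedding"] for _, item in items]
    return vectors, result.get("usage", {})

# Hypothetical, shortened response; real embeddings have far more dimensions.
sample = {
    "data": [
        {"index": 1, "embedding": [0.3, 0.4]},
        {"index": 0, "embedding": [0.1, 0.2]},
    ],
    "usage": {"prompt_tokens": 5, "total_tokens": 5},
}

vectors, usage = parse_embedding_response(sample)
print(len(vectors), len(vectors[0]), usage["prompt_tokens"])
```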

Using Python, you can calculate the similarity of two results:

# Python example for similarity scoring
import numpy as np
import requests

IONOS_API_TOKEN = "[YOUR API TOKEN HERE]"
MODEL_NAME = "sentence-transformers/paraphrase-multilingual-mpnet-base-v2"
INPUT = ["Michael Jackson", "Metallica"]

endpoint = "https://openai.inference.de-txl.ionos.com/v1/embeddings"

header = {
    "Authorization": f"Bearer {IONOS_API_TOKEN}",
    "Content-Type": "application/json"
}
body = {
    "model": MODEL_NAME,
    "input": INPUT
}
result = requests.post(endpoint, json=body, headers=header).json()

# One embedding is returned per input text, in the same order
embedding_1 = result['data'][0]['embedding']
embedding_2 = result['data'][1]['embedding']

# Dot product of the two embeddings; this equals the cosine similarity
# only if the embeddings are normalized to unit length
similarity = np.dot(embedding_1, embedding_2)
print(similarity)

# Example output: 0.18887
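When more than two texts are embedded, all pairwise scores can be computed at once. The sketch below normalizes each vector explicitly, so it does not rely on the API returning unit-length embeddings. The function names `cosine_similarity_matrix` and `most_similar` are hypothetical, and the low-dimensional vectors merely stand in for API results.

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarities for a list of equal-length vectors."""
    matrix = np.asarray(embeddings, dtype=float)
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    unit = matrix / norms            # each row scaled to unit length
    return unit @ unit.T             # entry [i, j] compares text i and text j

def most_similar(embeddings, query_index):
    """Index of the vector most similar to the one at query_index."""
    scores = cosine_similarity_matrix(embeddings)[query_index].copy()
    scores[query_index] = -np.inf    # ignore the trivial self-match
    return int(np.argmax(scores))

# Hypothetical low-dimensional vectors standing in for API results.
vectors = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]

similarities = cosine_similarity_matrix(vectors)
print(np.round(similarities, 3))
print(most_similar(vectors, 0))
```

Entry [i, j] of the matrix is the score between input i and input j; the diagonal is 1 because every vector is identical to itself.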

The Embeddings API uses standard HTTP status codes to indicate the outcome of a request. The status codes and their descriptions are as follows:

  • 200 OK: The request was successful.

  • 401 Unauthorized: The request was unauthorized.

  • 404 Not Found: The requested resource was not found.

  • 500 Internal Server Error: An internal server error occurred.
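A client could react to these status codes as in the following sketch. The exception type `EmbeddingsApiError` and the function `check_status` are hypothetical names introduced for illustration; only the codes and descriptions come from the list above.

```python
# Status codes and descriptions as listed above.
STATUS_DESCRIPTIONS = {
    200: "The request was successful.",
    401: "The request was unauthorized.",
    404: "The requested resource was not found.",
    500: "An internal server error occurred.",
}

class EmbeddingsApiError(Exception):
    """Hypothetical exception type used only in this sketch."""

def check_status(status_code):
    """Return normally for 200, raise EmbeddingsApiError otherwise."""
    if status_code == 200:
        return
    description = STATUS_DESCRIPTIONS.get(status_code, "Unexpected status code.")
    raise EmbeddingsApiError(f"HTTP {status_code}: {description}")

check_status(200)  # passes silently
try:
    check_status(401)
except EmbeddingsApiError as error:
    print(error)
```

With the requests library, the function would be called as `check_status(response.status_code)` before the response body is parsed.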

Summary

In this tutorial, you learned how to:

  1. Access available embedding models.

  2. Generate embeddings with these models.

  3. Calculate similarity scores using the numpy library.

For information on how to use embeddings in document collections, refer to our dedicated tutorial on Document Collections.
