# Embeddings

One seemingly simple task for humans is identifying semantically similar concepts. Take the following three texts, for example:

* "AI Model Hub"
* "Michael Jackson"
* "best-selling music artists"

For a human, it is easy to tell that "Michael Jackson" and "best-selling music artists" are related, while "AI Model Hub" is not.

**Embeddings**, a core concept in modern foundation models, provide an effective way for machines to perform this kind of semantic comparison. An embedding is a numerical vector representation of a text, where the key property is this: **semantically similar texts have similar vectors.**

## Semantic similarity

To illustrate this, imagine we convert the three example texts into embeddings:

* "AI Model Hub": (0.10; 0.10)
* "Michael Jackson": (0.95; 0.90)
* "best-selling music artists": (0.96; 0.87)

In practice, embedding vectors have dozens or even thousands of dimensions. Here, we simplify them to 2D for clarity. These embeddings can be visualized like this:

![Embedding space](/files/0ZBNZ7g3iMonvx8xXzdn)

As shown, the embeddings for "Michael Jackson" and "best-selling music artists" are close to each other, while "AI Model Hub" is farther away, mirroring our intuitive understanding.
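
To make this concrete, here is a minimal Python sketch (using the simplified 2D toy values from above, not real model output) that computes the pairwise Euclidean distances between the three embeddings:

```python
import math

# Simplified 2D toy embeddings from the example above
embeddings = {
    "AI Model Hub": (0.10, 0.10),
    "Michael Jackson": (0.95, 0.90),
    "best-selling music artists": (0.96, 0.87),
}

def euclidean_distance(a, b):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

texts = list(embeddings)
for i, t1 in enumerate(texts):
    for t2 in texts[i + 1:]:
        d = euclidean_distance(embeddings[t1], embeddings[t2])
        print(f"{t1} <-> {t2}: {d:.2f}")
```

The two music-related texts end up about 0.03 apart, while "AI Model Hub" is more than 1.1 away from both.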

**Embedding models** convert data, such as text, into these embeddings. They exist for different types of content, including text (in one or multiple languages), images, audio, and more. The IONOS AI Model Hub provides embedding models for English text as well as multilingual models.
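
For illustration only, the sketch below shows what requesting embeddings from an OpenAI-compatible endpoint typically looks like. The base URL, model name, and environment variable are placeholders rather than actual AI Model Hub values; see the Text Embeddings how-to linked below for the real ones:

```python
import os
from openai import OpenAI  # pip install openai

# Placeholder endpoint and model -- substitute the values from the
# Text Embeddings how-to.
client = OpenAI(
    base_url="https://example-inference-endpoint/v1",
    api_key=os.environ["API_TOKEN"],
)

response = client.embeddings.create(
    model="example-embedding-model",
    input=["Michael Jackson", "best-selling music artists"],
)

# Each item in response.data holds one embedding vector (a list of floats).
for item in response.data:
    print(len(item.embedding))
```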

To measure how similar two embeddings are, we use a **similarity score**. One commonly used metric is **cosine similarity**, which calculates the cosine of the angle between two vectors (see the sketch after the list below). The score ranges from -1 to 1:

* **1**: very similar
* **0**: unrelated
* **-1**: opposite meanings (in rare cases)
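
As a minimal sketch, cosine similarity can be computed from the dot product and the vector norms. The unit vectors below are chosen purely to demonstrate the three ends of the scale:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity((1.0, 0.0), (1.0, 0.0)))   # 1.0  -> very similar
print(cosine_similarity((1.0, 0.0), (0.0, 1.0)))   # 0.0  -> unrelated
print(cosine_similarity((1.0, 0.0), (-1.0, 0.0)))  # -1.0 -> opposite meanings
```

Because it measures the angle rather than the distance between vectors, cosine similarity is insensitive to differences in vector length.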

## Chunking and embedding quality

In Retrieval-Augmented Generation (RAG), the granularity of your text embeddings determines the overall quality of retrieval. Embedding a whole document produces a single vector that averages the meaning of all its content, masking the specific details required for accurate query retrieval.

**Chunking** splits source content into smaller, discrete passages before embedding, producing sharper, more specific vectors. This improves semantic search precision and gives the LLM relevant context.
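
As a minimal, model-agnostic sketch (the chunk size and overlap values are arbitrary illustration choices), fixed-size chunking with overlap can look like this:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap by `overlap` characters,
    so content cut at a boundary still appears intact in a neighboring chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# Each chunk is then embedded individually instead of embedding
# the whole document as one vector.
chunks = chunk_text("example document text " * 100)
print(len(chunks), "chunks")
```

In practice, boundaries that respect sentences or paragraphs usually retrieve better than raw character offsets.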

For guidance on how to apply chunking in practice, see [<mark style="color:blue;">Document Collections</mark>](/cloud/ai/ai-model-hub/how-tos/document-collections.md).

## Explore further

To try out **embeddings** in practice, see [<mark style="color:blue;">Text Embeddings</mark>](/cloud/ai/ai-model-hub/how-tos/text-embeddings.md).


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.ionos.com/cloud/ai/ai-model-hub/advanced-concepts/embeddings.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, when you need clarification or additional context, or when you want to retrieve related documentation sections.
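
For example (the question itself is illustrative and URL-encoded):

```
GET https://docs.ionos.com/cloud/ai/ai-model-hub/advanced-concepts/embeddings.md?ask=Which%20embedding%20models%20are%20supported%3F
```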
