# Large Language Models

#### Small models (fewer than 15B parameters)

Small models are optimized for fast inference and low resource consumption, which makes them a good fit for real-time applications and other scenarios where latency and cost are critical. Their compact size means they carry less "world knowledge" and deliver somewhat lower response quality than larger models, but they perform well at conversational tasks, virtual assistance, and domain adaptation.

* [<mark style="color:blue;">**Mistral Nemo**</mark>](/cloud/ai/ai-model-hub/models/llms/mistral-nemo.md): A French model from Mistral, tailored for conversational agents and virtual assistants. Its small size ensures efficient inference, and it is particularly effective in scenarios where rapid, context-aware dialogue is essential.
* [<mark style="color:blue;">**Meta Llama 3.1 8B**</mark>](/cloud/ai/ai-model-hub/models/llms/meta-llama-3-1-8b.md): A US-developed model by Meta, designed for conversational agents and virtual assistants. It balances speed and language understanding well, making it suitable for interactive applications that require reliable, fast responses.

#### Medium models (between 15B and 150B parameters)

Medium-sized models provide a strong balance between response quality and inference speed. They are suitable for applications that demand higher accuracy, broader knowledge, and more nuanced language understanding while maintaining reasonable performance, cost, and reliability.

* [<mark style="color:blue;">**Mistral Small 24B**</mark>](/cloud/ai/ai-model-hub/models/llms/mistral-small-24b.md): A French multilingual and multimodal model from Mistral, designed for conversational agents and virtual assistants. It supports both text and image input, offering enhanced performance for applications requiring fast and reliable chat completions across various European languages.
* [<mark style="color:blue;">**Meta Llama 3.3 70B**</mark>](/cloud/ai/ai-model-hub/models/llms/meta-llama-3-3-70b.md): A US model by Meta, offering enhanced response quality for conversational agents and virtual assistants. Its larger parameter count enables more sophisticated reasoning and richer language capabilities, making it ideal for demanding dialogue systems.
* [<mark style="color:blue;">**OpenAI GPT-OSS 120B**</mark>](/cloud/ai/ai-model-hub/models/llms/openai-gpt-oss-120b.md): An open-weight language model by OpenAI and the largest model in the medium tier. It offers high response quality and broad knowledge coverage, and it excels in complex reasoning, content generation, and conversational tasks while keeping a strong balance between inference speed and quality for demanding applications.

#### Large models (more than 150B parameters)

Large models are designed for maximum language understanding, deep reasoning, and high-quality responses. They are best suited for advanced applications where accuracy and depth of knowledge are paramount, though they require more computational resources and have slower inference speeds.

* [<mark style="color:blue;">**Meta Llama 3.1 405B**</mark>](/cloud/ai/ai-model-hub/models/llms/meta-llama-3-1-405b.md): Meta's flagship large language model, delivering exceptional response quality and comprehensive knowledge coverage. It is ideal for research, content generation, and complex conversational tasks where the highest level of language proficiency is required, albeit with slower response times.
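
The size tiers above can be expressed as a simple lookup. The following is a minimal sketch, not part of any official SDK: the helper names `size_tier` and `models_in_tier` are hypothetical, the tier boundaries come from the section headings (with exactly 15B and 150B treated as medium, which is an assumption), and the 12B figure for Mistral Nemo is assumed from its public model card because the count is not part of its name.

```python
# Tier boundaries taken from the section headings (billions of parameters).
SMALL_LIMIT_B = 15    # fewer than this: small
LARGE_LIMIT_B = 150   # more than this: large

# Parameter counts in billions. All but Mistral Nemo are read off the model
# names; 12 for Mistral Nemo is an assumption based on its public model card.
MODEL_PARAMS_B = {
    "Mistral Nemo": 12,
    "Meta Llama 3.1 8B": 8,
    "Mistral Small 24B": 24,
    "Meta Llama 3.3 70B": 70,
    "OpenAI GPT-OSS 120B": 120,
    "Meta Llama 3.1 405B": 405,
}


def size_tier(params_b: float) -> str:
    """Return 'small', 'medium', or 'large' for a parameter count in billions."""
    if params_b < SMALL_LIMIT_B:
        return "small"
    if params_b <= LARGE_LIMIT_B:  # assumption: both boundaries count as medium
        return "medium"
    return "large"


def models_in_tier(tier: str) -> list[str]:
    """List the catalog models that fall into a tier, smallest first."""
    matches = [(n, name) for name, n in MODEL_PARAMS_B.items() if size_tier(n) == tier]
    return [name for _, name in sorted(matches)]


if __name__ == "__main__":
    for tier in ("small", "medium", "large"):
        print(tier, models_in_tier(tier))
```

A helper like this can serve as a starting point for routing: try a model from the small tier for latency-sensitive requests, and move up a tier only when response quality is insufficient.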

