Enrich Text with AI-Generated Images

AI Model Hub for Free: From December 1, 2024, to September 30, 2025, IONOS is offering all foundation models of the AI Model Hub for free. Create your contract now and get your AI journey started today!

Visuals are crucial in making text more engaging, understandable, and memorable. Whether it is a book cover, an advertisement, or an educational resource, the right image can enhance the impact of written content and capture the audience’s attention. Many types of text benefit from being accompanied by visuals:

Book Covers: A well-designed cover captures the essence of a book and attracts potential readers.
Advertising: Eye-catching visuals increase engagement and make products more appealing to buyers.
Web Design: Articles and content on websites are more likely to be read when paired with images.
Social Media: Compelling graphics boost visibility and interaction on social platforms.
Education: Illustrations enhance comprehension and make complex topics easier to understand.

Overview

With AI-powered tools, generating images tailored to your text has never been easier. In this guide, we’ll demonstrate how to automate image creation using two core components of the AI Model Hub:

A Large Language Model to craft concise prompts.
A Text-to-Image model to generate high-quality visuals.

By the end of this guide, you will have a fully functional workflow for generating custom images from text, enabling you to create book covers, marketing visuals, and other illustrations with ease.

Example Scenario

As an example, we’ll walk through the process of creating a book cover for a children’s story, showing how you can apply these techniques to various use cases. Below is a summary of the book we’ll use.

Book summary

Title: Fluff a Heart

Author: Little Lion

Fluff, a kind golden retriever, lived with Lily, a baker.

One crisp autumn day, while Lily was busy making her famous honey buns, something strange happened. A thick fog rolled into the village, and the usually busy streets fell silent. No one could see beyond a few feet, and the villagers started to worry. Lily had never seen such a fog before, and she knew something wasn’t right.

Fluff, always curious and brave, sniffed the air, and noticed something odd—a faint, distant cry coming from somewhere beyond the fog. It sounded like someone was in trouble.

Venturing into the dense fog, he discovered a small rabbit trapped beneath a fallen branch. Fluff pushed the branch aside, freeing the rabbit and returning home as a hero.

The end.

The image we will generate with the source code might look like this:

Get Started with Enriching Your Text

To follow this tutorial, ensure you have:

Python 3.8 or higher installed on your machine,
The IONOS_API_TOKEN environment variable set with your authentication token.
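Both prerequisites can be verified before you run anything else. The following is a minimal sketch of such a pre-flight check; the function name `check_prerequisites` is hypothetical and not part of the downloadable code.

```python
import os
import sys


def check_prerequisites(env=os.environ, version=sys.version_info):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # Compare only (major, minor) against the minimum version named above.
    if tuple(version[:2]) < (3, 8):
        problems.append("Python 3.8 or higher is required.")
    # The token must be present and non-empty.
    if not env.get("IONOS_API_TOKEN"):
        problems.append("The IONOS_API_TOKEN environment variable is not set.")
    return problems
```

Calling `check_prerequisites()` without arguments inspects the current interpreter and environment; printing the returned list shows what still needs fixing.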

Download the Python code and install the packages from requirements.txt to see everything in action.

ai-model-hub-enrich-generated-images.zip (12 KB)

Step 1: Generate Prompt

To create a compelling book cover, we first need a well-crafted prompt for the Text-to-Image model. Instead of manually writing it, we use a Large Language Model (LLM) to generate a concise, descriptive prompt based on the book summary.

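    import os
    import requests

    # Assumption: the token is sent as a Bearer token in the Authorization header.
    HEADERS = {
        "Authorization": f"Bearer {os.environ['IONOS_API_TOKEN']}",
        "Content-Type": "application/json",
    }

    # The book summary that the LLM condenses into an image prompt.
    with open("input_text.txt", encoding="utf-8") as f:
        text = f.read()

    endpoint = 'https://openai.inference.de-txl.ionos.com/v1/chat/completions'
    body = {
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "messages": [
            {
                # System instructions: define the task and the output format.
                "role": "system",
                "content": """
                    You are a helpful assistant that transforms the following 
                    text into a very short prompt for an image generation model. 
                    The image needs to be in the style of a children's book and 
                    should show both title and author of the book.
                    
                    You will only respond with the prompt and 
                    nothing else. Do not repeat this message!
                """
            },
            {
                # User input: the book summary itself.
                "role": "user",
                "content": text
            }
        ],
        "temperature": 0.7,
        "top_p": 0.9,
        "stop": [ "\n" ],  # stop at the first line break to keep the prompt to one line
        "max_tokens": 1000
    }
    response = requests.post(endpoint, headers=HEADERS, json=body)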

Understanding the Prompt Design

The prompt structure consists of two key components:

System Instructions: These guide the LLM to generate a short and focused description suitable for an image-generation model.

“Very short prompt” ensures the output remains concise and relevant.
“In the style of a children’s book” helps achieve a comic-like illustration rather than a photorealistic image.
“Show both title and author” asks the model to include these text elements in the generated image.

User Input: This is dynamically replaced by the book summary to tailor the prompt for each case.

Example Output

For our book summary, the generated prompt might look like this:

Generate a children's book illustration, featuring a golden retriever (Fluff) and a 
rabbit in a foggy autumn village, with the title 'Fluff a Heart' by Little Lion 
written in bold, cursive font at the top.

By leveraging an LLM, we ensure that our image-generation process is efficient and adaptable to different book summaries.
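The request in Step 1 returns a JSON document, and the prompt still has to be pulled out of it. The sketch below assumes the endpoint follows the OpenAI-compatible chat-completions response layout (`choices[0].message.content`); the helper name `extract_prompt` is hypothetical.

```python
def extract_prompt(payload):
    """Pull the generated prompt out of a chat-completions response body."""
    content = payload["choices"][0]["message"]["content"]
    # Collapse line breaks and repeated spaces into single spaces.
    cleaned = " ".join(content.split())
    # Drop double quotes the model may have put at either end of the text.
    return cleaned.strip('"')
```

With the Step 1 snippet, this would be used as `prompt = extract_prompt(response.json())`. Stripping the outer quotes is a design choice for tidy prompts; skip it if your prompts may legitimately begin or end with a quoted phrase.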

Step 2: Generate a Book Cover

With a well-crafted prompt from Step 1, we can generate an image using a Text-to-Image model. In this step, we use the FLUX.1-schnell model to create a visually appealing book cover.

Below is a Python snippet demonstrating how to generate the image:

endpoint = 'https://openai.inference.de-txl.ionos.com/v1/images/generations'
size = "1024x1024"
body = {
    "model": "black-forest-labs/FLUX.1-schnell",
    "prompt": prompt,
    "response_format": "b64_json",
    "size": size,
    "n": 1
}
response = requests.post(endpoint, headers=HEADERS, json=body)

Understanding the Parameters

prompt: The text generated in Step 1, which tells the model what to depict.
model: Specifies the Text-to-Image model to use (FLUX.1-schnell in this case).
size: Determines the image dimensions:
• "1024x1024": Square format.
• "1024x1792": Portrait orientation.
• "1792x1024": Landscape orientation.
response_format: Defines the output format (base64-encoded JSON in this case).
n: Specifies the number of images to generate (1 by default).
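Because `response_format` is set to `b64_json`, the image arrives as a base64 string that must be decoded before it can be viewed. The following sketch assumes the response follows the OpenAI-compatible layout, with one entry per image under `data` and the encoded bytes in `b64_json`. The helper name `save_images` and the `.png` extension are assumptions for illustration.

```python
import base64
from pathlib import Path


def save_images(payload, out_dir="images", stem="generated_image"):
    """Decode every image in the response and write it to disk."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, item in enumerate(payload["data"]):
        # The first image keeps the plain name; later ones get a numeric suffix.
        suffix = "" if index == 0 else f"_{index}"
        path = out / f"{stem}{suffix}.png"
        path.write_bytes(base64.b64decode(item["b64_json"]))
        paths.append(path)
    return paths
```

After the request above, `save_images(response.json())` would write `images/generated_image.png`. Requesting more than one image via `n` would produce additional files with numbered suffixes.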

Iterating for Better Results

The first generated image may not always be perfect. Some common issues include:

Typos in text elements: The model might misinterpret letters or add extra words.
Unintended elements: Additional characters, objects, or artifacts may appear.
Styling inconsistencies: The image style might not fully align with expectations.

To improve the results:

1. Refine the prompt: Add clearer descriptions or constraints.
2. Try multiple generations: Running the model multiple times increases the chance of a better output.
3. Experiment with different models: Some models handle text placement or stylistic choices better than others.
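The first two suggestions can be sketched in a few lines of Python. Both helper names are hypothetical, and `generate` stands for any function of your own that sends a prompt to the Text-to-Image endpoint and returns the result; it is passed in so the sketch stays independent of the network call.

```python
def refine_prompt(prompt, constraints):
    """Append explicit constraints to a prompt, one sentence per constraint."""
    clauses = [c.strip().rstrip(".") for c in constraints if c.strip()]
    if not clauses:
        return prompt
    return prompt.rstrip().rstrip(".") + ". " + ". ".join(clauses) + "."


def generate_candidates(generate, prompt, attempts=3):
    """Call generate(prompt) several times and keep every result for review."""
    return [generate(prompt) for _ in range(attempts)]
```

For example, `refine_prompt(prompt, ["No additional animals", "Spell the title exactly as given"])` adds two constraints, and `generate_candidates(my_generate, refined, attempts=4)` collects four images to choose from.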

Example Output

The book cover displayed earlier in this guide was generated using this approach. While it aligns well with the book’s theme, refining the final result took a few iterations.

By following this process, you can generate engaging book covers that visually represent your stories!

Step 3: Try It Yourself!

Now it's your turn to generate book covers using the provided code. If you haven't done so yet, download the source code from the link at the top of this page, extract it on your machine, and install the required dependencies:

pip install -r requirements.txt

Once set up, generate a book cover using:

python src/main.py --path input_text.txt --orientation square

Here, input_text.txt contains the book summary for which you want to generate an image. The downloaded source code includes an example file.

Use the orientation parameter to specify the image format:

"square" (1024x1024)
"portrait" (1024x1792)
"landscape" (1792x1024)

After execution, the generated image will be saved in the images/ folder as generated_image.png.
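To illustrate how the command-line options relate to the API parameters, here is a sketch of one way the `--path` and `--orientation` arguments could be parsed and mapped to a `size` value. It is an assumption about the structure, not a copy of `src/main.py`, which may be organized differently.

```python
import argparse

# Orientation names from the list above, mapped to the API's size strings.
SIZES = {
    "square": "1024x1024",
    "portrait": "1024x1792",
    "landscape": "1792x1024",
}


def parse_args(argv=None):
    """Parse the two options used in the command above."""
    parser = argparse.ArgumentParser(description="Generate an image for a text file.")
    parser.add_argument("--path", required=True, help="Text file with the book summary.")
    parser.add_argument("--orientation", choices=sorted(SIZES), default="square")
    return parser.parse_args(argv)
```

With this sketch, `SIZES[parse_args().orientation]` yields the value to send as `size`, and an unknown orientation is rejected by `argparse` before any request is made.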

Explore further

Now that you have successfully generated an image, try experimenting with different settings:

Run the code multiple times with the same parameters. Since AI-generated images vary, refining input_text.txt and the prompt (prompt.py) can improve consistency.
Test different Text-to-Image models (image.py). While this guide uses FLUX.1-schnell, try Stable Diffusion XL and compare the results.
Experiment with different Large Language Models. The guide uses a smaller model for prompt generation—test a larger model and observe how it affects image quality.

You can fine-tune your workflow by iterating on these settings to create visually appealing, customized book covers.
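When comparing Text-to-Image models, it helps to keep everything except the model name constant. The sketch below builds one request body per model using the same fields as the Step 2 snippet. The function names are hypothetical, and any model identifier other than the FLUX.1-schnell one shown in this guide has to be looked up in your AI Model Hub model list.

```python
def build_image_request(prompt, model="black-forest-labs/FLUX.1-schnell",
                        size="1024x1024", n=1):
    """Build the JSON body for the image generation endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "response_format": "b64_json",
        "size": size,
        "n": n,
    }


def requests_per_model(prompt, models, size="1024x1024"):
    """Return one request body per model, identical except for the model name."""
    return {model: build_image_request(prompt, model=model, size=size)
            for model in models}
```

Each body can then be posted to the images endpoint in turn, and the results saved under different file names for a side-by-side comparison.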

In this tutorial, you learned how to generate illustrations from text with the IONOS AI Model Hub API by combining a Large Language Model with a Text-to-Image model. Our example focused on creating a book cover for a children's story, but the same approach applies to other scenarios once you adapt the prompt to your needs.

Want to explore more? Check out our use case on Intelligent Document Search with AI.
