The Synergy of AI and Image SEO: Transforming Online Visibility

Image SEO is a strategic asset for any website aiming to climb the ranks of search engine results. This article looks at how GPT Vision can reshape image SEO and offers actionable insights for website owners who want a more dynamic and visible online presence.

Exploring the Power of Image SEO

Text-based SEO has traditionally taken center stage, and the significant impact of images on website traffic and user engagement has often been overlooked. Many of us are guilty of this oversight, myself included. On numerous occasions, I’ve uploaded images to my website using generic names like “1223.jpg,” skipping the opportunity to add meaningful descriptions or optimise for keywords. The reality is that, amid the hustle of daily tasks, dedicating time to optimising every image feels like a luxury we can’t afford.

This frequently skipped ‘must-do’ practice is more than a catalyst for SEO enhancement; it is a critical part of making our websites accessible. Optimising images, often seen as optional or secondary, is pivotal both for improving search engine visibility and for ensuring that our digital spaces are inclusive of all users, including those who rely on alt text. Integrating tools like GPT Vision into our workflows can turn this obligation from a cumbersome task into a streamlined, almost effortless process, putting accessibility alongside SEO in our digital priorities.

Streamlining Image SEO: An App That Optimises with Keywords

To boost my blog’s search rankings and tackle my own tendency to skip the image description steps, I created a GPT Vision app that automates adding SEO-friendly names, titles, and alt texts to images. This solution streamlines the process, making it easier to optimise visuals quickly and effectively without compromising on quality. It can be a practical tool for anyone looking to enhance their digital content’s visibility with minimal effort.

Inside the App: Automating SEO

Here’s an overview of how the app operates:

Uploading Images: Users start by uploading one or several images they wish to optimise. The application is configured to handle multiple uploads in one go, allowing for a batch processing approach to enhance efficiency.
Adding Keywords: Alongside image uploads, users are prompted to input keywords that are relevant to their content. These keywords play a pivotal role as they guide the AI in generating pertinent, SEO-optimised suggestions for each image.
Selecting Random Keywords: To introduce variety, the application randomly shuffles the submitted keywords and then selects a subset for use in the optimisation process. Because a different subset is likely to be drawn for each request, the suggestions draw on a more diverse mix of keywords across images, which can surface unexpected yet valuable combinations and improve the chances of covering a broader spectrum of search queries.
Generating SEO Suggestions: With the images and selected keywords at the ready, the application interacts with the OpenAI API. It sends a request that includes the image (converted into base64 format for compatibility) and the chosen keywords. Utilising its understanding of SEO best practices, along with the implications of the keywords and image, the AI generates suggestions for SEO-friendly file names, titles, and alt texts.
Displaying Results: Upon receiving suggestions from the OpenAI API, the application parses this feedback to extract the proposed file names, titles, and alt texts. These suggestions are then presented to the user, offering a clear and actionable set of optimisations for each image.
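
The keyword selection step described above can be sketched as follows. This is a minimal illustration and not the app's actual code: the function name select_keywords, the default subset size of 3, and the sample keywords are all assumptions made for the example.

```python
import random


def select_keywords(keywords, k=3, rng=None):
    """Shuffle the submitted keywords and return up to k of them.

    Hypothetical helper: the name and the default k=3 are assumptions.
    """
    if rng is None:
        rng = random.Random()
    # Drop blank entries and duplicates while keeping the original order
    cleaned = list(dict.fromkeys(kw.strip() for kw in keywords if kw.strip()))
    # Shuffle a copy so the caller's list is left untouched
    shuffled = cleaned[:]
    rng.shuffle(shuffled)
    return shuffled[:k]


# Sample input standing in for the keywords a user would type in
keywords = ["ai", "image seo", "gpt vision", "alt text", "accessibility"]
selected_keywords = select_keywords(keywords, k=3)
random_keywords_prompt = ", ".join(selected_keywords)
print(random_keywords_prompt)
```

Passing a seeded random.Random instance as rng makes the selection repeatable, which can help when comparing prompts.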

GPT Vision at the Core

At the heart of the application is the call to the OpenAI API, specifically GPT Vision, which is the step that generates the SEO-optimised suggestions for each image. Here’s a closer look at how this API call is structured and executed within the code.

Constructing the Prompt: The application creates a detailed prompt that encapsulates the task at hand. This prompt includes the selected keywords and specifies that the goal is to generate SEO-friendly file names, titles, and alt texts for the images. The formulation of this prompt is crucial as it guides the AI in understanding exactly what is required.
Here is the prompt:

            random_keywords_prompt = ', '.join(selected_keywords)
            prompt_text = (
                "Suggest an SEO optimised image file name for an image with the file type '{}', "
                "using the keywords: '{}'. Use hyphens to separate words in the file name, ensure the alt "
                "text is descriptive and natural, and craft the image title to be concise, engaging, and including primary keywords. "
                "The file type should be retained in the file name suggestion.\n"
                "img_file_name:\n"
                "img_title:\n"
                "alt_text:\n"
            ).format(file_extension, random_keywords_prompt)

Preparing the Image: Before the API call, each image uploaded by the user is converted into a base64-encoded string. The request body is JSON, which cannot carry raw binary data, so the image bytes are encoded as base64 text and embedded in a data URL. (The API also accepts a link to a publicly hosted image, but uploaded files have no such link.)
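
As a rough sketch of this encoding step, the helper below turns raw image bytes into a base64 data URL. The function name image_to_data_url is hypothetical, and guessing the MIME type from the file name (with a JPEG fallback) is an assumption of this example and not something taken from the app.

```python
import base64
import mimetypes


def image_to_data_url(image_bytes, filename):
    """Return a base64 data URL for the image (hypothetical helper)."""
    # Guess the MIME type from the file extension, e.g. "image/png" for .png
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        # Assumption: fall back to JPEG when the type cannot be determined
        mime_type = "image/jpeg"
    # b64encode returns bytes, so decode to get a plain string for the JSON body
    base64_image = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime_type};base64,{base64_image}"


# In-memory bytes standing in for the contents of an uploaded file
data_url = image_to_data_url(b"\x89PNG\r\n\x1a\n", "diagram.png")
print(data_url.split(",")[0])
```

Deriving the MIME type from the file name avoids labelling a PNG or WebP upload as a JPEG in the data URL.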

API Request: With the prompt and the base64-encoded image prepared, the application makes a request to the OpenAI API. This request is sent via a POST method, containing the prompt text and the encoded image. The API is designed to accept this combination of text and visual data, enabling it to generate comprehensive and contextually relevant SEO suggestions.

            response = client.chat.completions.create(
                model="gpt-4-vision-preview",
                messages=[
                    {
                        "role": "user",
                        "content": [
                            {"type": "text", "text": prompt_text},
                            # The image travels as a data URL; the documented form
                            # wraps it in an object with a "url" key.
                            {"type": "image_url",
                                "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
                        ]
                    }
                ],
                max_tokens=600,
            )

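Handling the Response: Once the API processes the request, it returns a response that includes the AI-generated suggestions for the image’s file name, title, and alt text. The suggestions arrive as plain text in the message content of the first choice. Because the prompt ends with the labels img_file_name, img_title, and alt_text, the reply usually follows the same layout, which makes it straightforward to parse.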

Parsing and Display: The final step involves parsing the API response to retrieve the suggested SEO elements. These suggestions are then formatted and displayed to the user, providing them with ready-to-use, optimised text for their images.
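
One way to parse the labelled reply is sketched below. It is an illustration only: the function name parse_suggestions and the sample reply are made up for the example, and the app's real parsing logic may differ. In the app, the text would come from response.choices[0].message.content.

```python
FIELDS = ("img_file_name", "img_title", "alt_text")


def parse_suggestions(text):
    """Extract the labelled fields from the model's reply.

    Hypothetical helper; fields the model did not return are set to None.
    """
    result = {field: None for field in FIELDS}
    for line in text.splitlines():
        # Split on the first colon only, so colons inside a title survive
        label, sep, value = line.partition(":")
        # Tolerate list markers or markdown emphasis around the label
        key = label.strip(" *-`").lower()
        if sep and key in result and result[key] is None:
            # Trim spaces, emphasis markers, and double quotes around the value
            result[key] = value.strip(' *`"') or None
    return result


# Made-up reply standing in for response.choices[0].message.content
sample_reply = (
    "img_file_name: ai-image-seo-workflow.jpg\n"
    "img_title: AI and Image SEO: A Simple Workflow\n"
    "alt_text: Flowchart showing how keywords and an image become SEO suggestions\n"
)
for field, value in parse_suggestions(sample_reply).items():
    print(f"{field}: {value}")
```

Returning None for a missing field lets the display step flag incomplete suggestions instead of showing empty text.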

Optimisation in Action: Two Case Studies

Case #1: A DALL-E generated image I used in my last article.

The APP at work:

Case #2: An infographic flowchart

The APP at work:

In Conclusion:

In wrapping up, it’s important to note that my expertise doesn’t lie in SEO techniques, so I openly welcome any suggestions or improvements to the prompts used in this application (please leave a comment).

The examples provided not only demonstrate the technology’s power but also underscore the ease with which such AI innovations can be woven into our daily routines. This shift towards automation transcends mere time savings, enhancing the depth and efficacy of SEO practices. It ensures that content isn’t just easier to find but also reaches a broader audience with greater accessibility.

Nicola Lazzari: AI & Tech Innovation Specialist in London & Milan