Introduction: The challenges facing Product Managers when it comes to image generation
With the rise of Generative AI, tools such as Photoshop's Generative Fill are revolutionising visual creation. This type of technology is transforming a wide range of sectors: architecture, e-commerce, entertainment, education and even medicine. The breadth and diversity of use cases for GenAI, as illustrated in this article on its history and applications, show the extent to which these innovations fit into multiple contexts.
Product Managers face a number of challenges. How can they ensure that the technology meets the real needs of users? How can they manage subjective feedback while aligning the technical and creative teams? And above all, how can they define and share a clear vision of the quality of the generated image?
This article proposes four keys to overcoming these challenges and ensuring the success of image generation projects.
Key 1: Understand the business to identify value drivers
The Product Discovery stage in an image generation project is not just about identifying the needs expressed by users. It also involves mapping their workflow to identify where AI can bring real value. This means paying particular attention to specific aspects, such as the image formats required, the quality expected, evaluation criteria, adjustments, and integration with existing tools. This approach involves immersing ourselves in their business processes, and identifying the critical stages that have a direct impact on their efficiency and results.
For example, in a project aimed at architects, knowing that they need precise sketches quickly is only a starting point. We need to go further: how do users get from the first sketches to 3D modelling? Where do they waste the most time? Which stages require human intervention that AI cannot replace? These considerations will help to target friction points, such as the time spent adjusting models for customer presentations, where AI can offer real efficiency gains.
By understanding these issues, Product Managers can transform vague ideas into clear, actionable user stories, aligned with real needs.
Key 2: Translate requirements into functionality using a structured approach
Designing relevant features starts with prioritising what to build, based on user feedback. Workshops and interviews, for example, help to identify and rank essential requirements. By structuring this feedback, teams can focus on concrete expectations and avoid wasting time on secondary features. As explained in this article on learning from successes and failures to create good AI products, integrating user feedback and experimenting from the earliest stages is an effective method for refining and developing AI products tailored to real needs.
In an image generation project, it is crucial that each feature meets precise aesthetic criteria while respecting technical constraints. This requires:
- Clear visual references, derived from existing data produced by users, to align expectations with the final rendering.
- A structured evaluation grid, incorporating criteria such as realism, line style, general atmosphere and conformity with a reference image.
- Consideration of technical limitations, such as the model's difficulty in generating textures or shadows, in order to prioritise the most significant areas for improvement.
Criteria type | Criterion | Description
---|---|---
Aesthetics | Line style | Thickness and regularity of lines: precise, sharp lines for the foreground, blurred lines for the background |
Aesthetics | Colour consistency | The palette is defined according to the purpose of the image: neutral colours for technical renderings, or bright colours for more dynamic visuals |
Aesthetics | Drawing medium | Simulated medium for the visual rendering, such as a pencil line, ink finish, watercolour effect or digital drawing
Technical | Shadow management | Generates realistic shadows that are consistent across light sources |
Technical | Texture quality | Diversification of textures to ensure realistic rendering and avoid visual inconsistencies |
Technical | Image resolution | Generation of high-resolution images (such as 4K for printing) or lower resolution images for fast previews |
Example of a structured evaluation grid for analysing aesthetic and technical criteria in an image generation project.
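To make a grid like this usable in practice, it can help to store it as data so that scores can be grouped by criteria type. The following is only a minimal sketch: the `Criterion` class, the `GRID` subset, the `average_by_kind` helper and the example scores are illustrative assumptions, not part of any existing tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Criterion:
    kind: str  # "Aesthetics" or "Technical", as in the grid above
    name: str


# A subset of the grid above (hypothetical structure); extend it as needed.
GRID = [
    Criterion("Aesthetics", "Line style"),
    Criterion("Aesthetics", "Colour consistency"),
    Criterion("Technical", "Shadow management"),
    Criterion("Technical", "Texture quality"),
    Criterion("Technical", "Image resolution"),
]


def average_by_kind(scores: dict[str, float]) -> dict[str, float]:
    """Average the per-criterion scores within each criteria type."""
    grouped = defaultdict(list)
    for criterion in GRID:
        if criterion.name in scores:  # criteria without a score are skipped
            grouped[criterion.kind].append(scores[criterion.name])
    return {kind: mean(values) for kind, values in grouped.items()}


# Made-up scores for one generated image, for illustration only.
example_scores = {
    "Line style": 4,
    "Colour consistency": 2,
    "Shadow management": 3,
    "Texture quality": 5,
    "Image resolution": 4,
}
print(average_by_kind(example_scores))
```

Grouping by criteria type makes it easier to see whether an image falls short for aesthetic reasons (a matter for the creative team) or technical ones (a matter for the model team).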
Key 3: Use qualitative AND quantitative feedback to optimise iterations
Image generation projects face a particular difficulty: subjectivity. Measuring the quality of results is not straightforward even in text generation projects: although text follows relatively fixed rules (grammar, spelling, consistency), evaluating it remains difficult without a suitable test dataset. For images, it's even more complex, because the human eye tolerates a great deal of visual variation: a bit like suddenly being allowed to write ‘liiKe thaat’ without completely losing the meaning.
For example, in an image generation project for an e-commerce site, aimed at creating visuals of clothes from descriptions, we can use two types of feedback to refine the model:
- Quantitative: Users rate the images on a scale of 1 to 5 according to precise criteria such as fidelity to the real product, brightness and texture. This allows us to quickly identify areas for improvement.
- Qualitative: Free comments can be added to the ratings, such as: ‘The folds in the fabric lack realism, giving the impression of an artificial image.’
With structured feedback, it's possible to refine the model effectively, while respecting expectations that are sometimes difficult to objectify.
Reference image
Generated image 1: fidelity to the real product 2/5, brightness 3/5, texture 0/5
Generated image 2: fidelity to the real product 4/5, brightness 4/5, texture 5/5
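The ratings of the two example images above can be summarised in a few lines, to show how quantitative feedback points to the weakest criterion. This is a sketch only: the dictionary layout and the helper names `mean_score` and `weakest_criterion` are assumptions made for illustration.

```python
from statistics import mean

# Scores taken from the two example images above.
ratings = {
    "Generated image 1": {"fidelity": 2, "brightness": 3, "texture": 0},
    "Generated image 2": {"fidelity": 4, "brightness": 4, "texture": 5},
}


def mean_score(scores: dict[str, int]) -> float:
    """Average rating across all criteria for one image."""
    return mean(scores.values())


def weakest_criterion(scores: dict[str, int]) -> str:
    """Criterion with the lowest rating (first one listed in case of a tie)."""
    return min(scores, key=scores.get)


for image, scores in ratings.items():
    print(image, round(mean_score(scores), 2), weakest_criterion(scores))
```

For the first image, the lowest-rated criterion is texture, which is where free comments such as the one about fabric folds become useful for understanding what exactly went wrong.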
Key 4: Measure the value of the product using specific KPIs
Finally, how do you know if the tool is providing real value to users? It's not just a question of technical performance, but also of the tangible impact on their day-to-day work. Traditional AI metrics, such as accuracy (proportion of correct results among all predictions) and recall (ability to detect all correct answers), are relevant in highly structured scenarios. However, these metrics show their limits when it comes to image generation, where evaluation is also based on qualitative criteria as seen in the previous point.
The indicators should reflect the impact of the product on users' day-to-day work, taking into account the overall time required to achieve a satisfactory result:
- Acceptance rate: Percentage of images deemed usable without modification, indicating the extent to which the tool directly meets expectations. This can be an automatic assessment (for example, images are validated if their average score exceeds a predefined threshold such as 4/5, based on criteria such as those defined above) or a binary assessment (image accepted or rejected) made by the user in a dedicated interface.
- Tweak time: Total time required to make an image usable, including both AI work and human adjustments. This indicator assesses overall productivity.
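Both indicators could be computed from a simple log of generated images. The sketch below assumes a hypothetical `GenerationRecord` with one average score and two durations per image; the field names, the 4.0 threshold and the sample values are illustrative assumptions, not measurements.

```python
from dataclasses import dataclass


@dataclass
class GenerationRecord:
    mean_score: float      # average user rating for the image
    ai_minutes: float      # time spent by the AI generating the image
    human_minutes: float   # time spent on manual adjustments


def acceptance_rate(records: list[GenerationRecord], threshold: float = 4.0) -> float:
    """Share of images whose average score reaches the threshold.

    Assumption: a score equal to the threshold counts as accepted;
    use a strict comparison if your rule is "strictly above".
    """
    if not records:
        raise ValueError("at least one record is required")
    accepted = sum(1 for r in records if r.mean_score >= threshold)
    return accepted / len(records)


def average_tweak_time(records: list[GenerationRecord]) -> float:
    """Average total time (AI work plus human adjustments) per image, in minutes."""
    if not records:
        raise ValueError("at least one record is required")
    return sum(r.ai_minutes + r.human_minutes for r in records) / len(records)


# Made-up sample log, for illustration only.
sample_log = [
    GenerationRecord(4.5, 1, 10),
    GenerationRecord(3.0, 1, 30),
    GenerationRecord(4.0, 1, 14),
    GenerationRecord(4.2, 1, 15),
]
print(acceptance_rate(sample_log), average_tweak_time(sample_log))
```

Tracking the two numbers together matters: a high acceptance rate with a long tweak time suggests that images are being kept but still cost users a lot of manual work.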
Let's take the case of an image generation project where the acceptance rate is 80%. Most of the images generated are retained, but require an average of 15 minutes of correction time. Compared with the 45 minutes needed to produce a sketch without AI, this represents a net saving of 30 minutes per usable image. On the other hand, the remaining 20% are deemed unusable or require more substantial adjustments.
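The arithmetic behind this example can be written out as follows. The three constants come from the paragraph above; the batch of 100 images and the `batch_saving_minutes` helper are hypothetical, and the calculation deliberately ignores the time spent on the rejected 20%, which the example does not quantify.

```python
BASELINE_MINUTES = 45     # time to produce a sketch without AI
CORRECTION_MINUTES = 15   # average correction time for a retained image
ACCEPTANCE_RATE = 0.80    # share of generated images that are retained

# Net saving per usable image: 45 - 15 = 30 minutes.
saving_per_image = BASELINE_MINUTES - CORRECTION_MINUTES

# Relative saving compared with the baseline (two thirds of the time).
relative_saving = saving_per_image / BASELINE_MINUTES


def batch_saving_minutes(n_images: int) -> int:
    """Time saved on the retained images of a batch (rejected images ignored)."""
    usable = round(n_images * ACCEPTANCE_RATE)
    return usable * saving_per_image


# Hypothetical batch of 100 generated images.
print(saving_per_image, round(relative_saving, 3), batch_saving_minutes(100))
```

Because rejected images also consume time, the batch figure should be read as an optimistic estimate until that cost is measured as well.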
These data demonstrate a significant impact on productivity, by freeing up time previously spent on repetitive, mechanical tasks. This allows users to concentrate on more intellectual activities with higher added value, while revealing areas for improvement to further reduce the number of adjustments required.
Conclusion
The success of an image generation project depends on a detailed understanding of the specific characteristics of this type of project. This includes the precise identification of the image production workflow, the translation of user requirements in terms of visual quality, the integration of qualitative and quantitative feedback to refine renderings, and the continuous adaptation of KPIs to measure not only technical performance but also the impact on users' day-to-day work.
By applying these principles, Product Managers can build truly useful tools that meet business needs while optimising technical and creative processes. This framework differs from the approach taken in more traditional projects in that it takes into account the unique challenges associated with image generation.
If you have any feedback or would like to discuss these practices, please don't hesitate to contact me!