AI Model Performance Results
See how top AI models perform across creative and technical criteria, as evaluated by expert judges.
AI Model | Image Sets | Evaluations | Overall Score |
---|---|---|---|
Flux 1.1 Pro Ultra | 0 | 0 | 0.0 |
Imagen | 0 | 0 | 0.0 |
Recraft v3 | 0 | 0 | 0.0 |
Stable Diffusion 3.5 | 32 | 0 | 0.0 |
Dall-E 3 | 0 | 0 | 0.0 |
Reve | 64 | 0 | 0.0 |
Leonardo Phoenix | 0 | 0 | 0.0 |
GPT-Image | 0 | 0 | 0.0 |
Midjourney | 0 | 0 | 0.0 |
Ideogram 3.0 | 22 | 1 | 3.0 |
About the Evaluation Process
Qualified judges rated each set of AI-generated images on a scale of 1 to 5 for each of the following criteria:
- Prompt Adherence: How well the images follow the given prompt
- Technical Quality: Overall technical execution
- Artistic Merit: Aesthetic value and artistic qualities
- Creativity: Originality and creative interpretation
- Consistency: Uniformity across the four images in a set
- Detail Richness: Level of detail in the generated images
- Style Accuracy: Appropriateness of style for the genre
- Overall Score: General impression and quality
Each image set was evaluated by multiple judges to ensure fair assessment.
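As a rough illustration of how ratings from several judges might be combined into per-criterion scores, the sketch below averages each criterion across evaluations. The page does not state the actual aggregation rule, so the use of a mean, the rounding to one decimal place, the 0.0 fallback for sets with no evaluations, and all identifiers (`CRITERIA`, `validate`, `aggregate`, and the snake_case keys) are assumptions made for this example.

```python
from statistics import mean

# Criterion names follow the list above; the snake_case keys are illustrative.
CRITERIA = (
    "prompt_adherence",
    "technical_quality",
    "artistic_merit",
    "creativity",
    "consistency",
    "detail_richness",
    "style_accuracy",
    "overall_score",
)


def validate(evaluation):
    """Check that one judge's evaluation rates every criterion from 1 to 5."""
    for name in CRITERIA:
        if name not in evaluation:
            raise ValueError(f"missing criterion: {name}")
        score = evaluation[name]
        if not 1 <= score <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {score}")


def aggregate(evaluations):
    """Average each criterion across judges (an assumed aggregation rule).

    Returns a dict mapping criterion -> mean score rounded to one decimal.
    With no evaluations, every criterion is reported as 0.0 (assumption).
    """
    if not evaluations:
        return {name: 0.0 for name in CRITERIA}
    for evaluation in evaluations:
        validate(evaluation)
    return {
        name: round(float(mean(e[name] for e in evaluations)), 1)
        for name in CRITERIA
    }


if __name__ == "__main__":
    # Two hypothetical judges rating the same image set.
    judge_a = {name: 3 for name in CRITERIA}
    judge_b = {name: 4 for name in CRITERIA}
    print(aggregate([judge_a, judge_b])["overall_score"])
```

Averaging per criterion keeps each dimension visible instead of collapsing everything into one number. A median or a trimmed mean would be a reasonable alternative if a single outlying judge should carry less weight.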