Nano Banana Pro: Can Google’s 14-Image Model Fix the 6-Finger Glitch?
Image Credit: Jacky Lee
Google DeepMind’s Nano Banana Pro – officially branded as Gemini 3 Pro Image – is Google’s latest image generation and editing model, built on the Gemini 3 Pro architecture and designed to improve how AI handles text, diagrams and complex scenes. Announced on 20 November 2025, the model is now rolling out across the Gemini app, Google AI Studio and Google Ads as Google’s “state-of-the-art” image backend.
Anatomy as a Persistent Weak Spot
Despite rapid improvements in text-to-image systems, human anatomy remains a well-documented failure mode. Common issues include proliferated limbs, missing fingers, deformed feet and fused body parts.
A 2025 paper by Lu Ma et al. introduced the Distortion-5K dataset – about 4,700 AI-generated images of people, labelled for distorted regions – and a “Human Distortion Benchmark” used to test several popular text-to-image models. Using their ViT-HD detector, the authors reported that nearly 50% of evaluated human-related generated images contained at least one anatomical distortion, highlighting how widespread the problem still is across current systems.
These distortions particularly affect hands and overlapping limbs, where partial occlusion and unusual viewpoints make it hard for models to infer structure from training images. The Distortion-5K analysis emphasizes that distortions often occupy small regions of the image, demanding high-resolution reasoning to detect and avoid them.
Against this backdrop, Nano Banana Pro arrives as Google’s attempt to push image generation closer to reliable human figures and more structurally coherent compositions, while also addressing legible text and grounded diagrams.
What Nano Banana Pro Adds
According to Google’s launch blog and the model’s official page, Nano Banana Pro is positioned as a Gemini 3 Pro–based image model with three main pillars: clearer text, “studio-quality” creative control and stronger use of world knowledge.
Key capabilities include:
Text inside images
Nano Banana Pro is designed to generate sharp, legible text for posters, diagrams and product mockups, and to translate that text into multiple languages. Internal Gemini 3 Pro Image benchmarks published by Google show the model achieving the lowest single-line text error rates, often under 10%, across many languages, compared with models such as GPT-Image 1 and Flux Pro Kontext Max.
Multi-image and multi-character consistency
The launch material states that Nano Banana Pro can blend up to 14 input images while keeping the resemblance of up to five people consistent across a scene – for example, combining several reference photos into a single fashion editorial or lifestyle composition.
Higher resolutions and finer controls
Google advertises outputs up to 2K and 4K resolution via the API, with controls for aspect ratio, camera angle, lighting, depth of field and localized edits such as changing clothing, time of day or focus within a frame.
Grounded, context-rich visuals
Through Gemini 3’s reasoning and optional Search grounding, Nano Banana Pro can generate infographics, step-by-step recipe visuals, or simple educational diagrams that attempt to reflect real-world information, rather than only generic stylistic prompts.
Taken together, these features move the model beyond pure “prompt-to-picture” art toward structured design work, layouts and information graphics.
Benchmarks and Early External Tests
Google’s own model card and landing page present Nano Banana Pro (Gemini 3 Pro Image) as state-of-the-art in several internal benchmarks, leading on:
Elo preference scores for image editing and text-to-image tasks versus Gemini 2.5 Flash Image, GPT-Image 1, Seedream v4 and Flux Pro Kontext Max.
Single-line text rendering error rates across multiple languages, where Gemini 3 Pro Image is shown with the lowest error rates in a heatmap comparison table.
Independent reviews provide additional, though narrower, evidence:
In a 2025 comparison on PageOn.ai, a reviewer tested 100 prompts that required specific text inside images. In that small benchmark, Nano Banana rendered the text correctly in 94% of cases, while Midjourney was reported at 71%, meaning 29 of the 100 Midjourney images (nearly one in three) needed manual correction for text. The author framed this difference as significant for marketing and information graphics workflows that rely on clean typography.
This is a single reviewer’s test rather than a field-wide standard, but it aligns with Google’s own emphasis on text fidelity.
Guides and blog posts from third-party tools that integrate Nano Banana or Nano Banana Pro note that it is particularly strong at precise image editing and text replacement, for example preserving layout and textures while swapping or translating text on packaging or posters.
In contrast, comprehensive, model-by-model anatomy statistics comparable to Distortion-5K’s cross-model analysis have not yet been published for Nano Banana Pro. Most available evidence on anatomy – especially hands and complex poses – is qualitative, based on examples and early user feedback.
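To make the PageOn.ai text-rendering figures concrete, the short Python sketch below converts the reported accuracy percentages into failure counts. It assumes, as the review describes, 100 prompts per model; the function and variable names are illustrative and do not come from the review.

```python
def failure_rate(correct: int, total: int) -> float:
    """Share of images whose text needed manual correction."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (total - correct) / total


# Figures as reported in the PageOn.ai comparison (single reviewer's test).
TOTAL_PROMPTS = 100
correct_counts = {"Nano Banana": 94, "Midjourney": 71}

# Images per model that would still need a manual text fix.
failures = {name: TOTAL_PROMPTS - c for name, c in correct_counts.items()}
rates = {name: failure_rate(c, TOTAL_PROMPTS) for name, c in correct_counts.items()}

for name in correct_counts:
    print(f"{name}: {failures[name]} of {TOTAL_PROMPTS} need correction ({rates[name]:.0%})")
```

On these numbers, 29 Midjourney images versus 6 Nano Banana images would need a text fix, which is where the "nearly one in three" description comes from.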
Where and How It’s Available
Google is distributing Nano Banana Pro across several product lines rather than a single interface.
Gemini app (consumers and students)
Available in the Gemini app when users choose “Create images” with a “Thinking” model.
Free-tier users receive limited Nano Banana Pro quotas; once those are exhausted, image creation falls back to the original Nano Banana model.
Subscribers to Google AI Plus, Pro and Ultra tiers receive higher image limits.
Workspace and creative tools
Rolling out to Google Ads for creative testing and ad asset generation.
Being introduced to Google Slides and Vids for Workspace customers, and to Flow, Google’s AI filmmaking tool, for AI Ultra subscribers.
Developers and enterprises
Exposed via the Gemini API and Google AI Studio under the Gemini 3 Pro Image model family, and available in Vertex AI for large-scale or production use.
This distribution strategy positions Nano Banana Pro as both a consumer-facing creative tool and an infrastructure component for advertising, office productivity and enterprise applications.
Safety, Watermarking and Provenance
Google has coupled Nano Banana Pro with an expanded provenance and safety stack:
SynthID watermarking
All images generated or edited by Nano Banana Pro carry an imperceptible SynthID digital watermark, intended to survive common operations such as cropping, resizing or format conversion.
Visible “Gemini sparkle” badge
In many consumer contexts, including the Gemini app for free and Google AI Pro tiers, AI-generated images include a visible icon to indicate that they were produced by Google AI.
C2PA metadata and verification
Google is embedding C2PA provenance metadata in images across Gemini, Google Ads, Workspace and developer tools, and has added a feature in the Gemini app that lets users upload an image and ask whether it was generated by Google AI, based on detecting SynthID.
The Gemini 3 Pro Image page also lists known limitations: text can still be misspelled, small faces and fine details may be inaccurate, and data-driven diagrams may misrepresent information, which Google explicitly recommends users verify.
Position in a Growing Visual-AI Market
Nano Banana Pro arrives in a context where AI visuals are already common in marketing and social media.
A September 2025 statistics roundup by Zebracat reports that:
68% of marketers say they use AI-powered tools to create or enhance visual content, up from 40% in 2021.
AI-generated visuals appear in 35% of social media campaigns in 2025, more than doubling from 15% in 2020.
These figures suggest that, for many teams, AI imagery has moved from experimental to routine, especially for online campaigns. A model like Nano Banana Pro, which emphasizes text accuracy, multi-image composition and higher resolution, is likely to be evaluated not only on artistic flair but also on its ability to fit into established marketing and product-design workflows.
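The growth in the Zebracat figures can be checked with a few lines of Python. This is only an arithmetic sketch on the four percentages quoted above; the helper name is hypothetical.

```python
def change(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (percentage-point change, ratio of after to before)."""
    if before_pct <= 0:
        raise ValueError("before_pct must be positive")
    return after_pct - before_pct, after_pct / before_pct


# Marketers using AI-powered visual tools: 40% (2021) to 68% (2025).
marketer_points, marketer_ratio = change(40, 68)

# Social media campaigns featuring AI-generated visuals: 15% (2020) to 35% (2025).
campaign_points, campaign_ratio = change(15, 35)

print(f"Marketers: +{marketer_points} points, {marketer_ratio:.2f}x")
print(f"Campaigns: +{campaign_points} points, {campaign_ratio:.2f}x")
```

The campaign share rises by 20 percentage points, a ratio above 2, which is consistent with the "more than doubling" description; marketer adoption rises by 28 points, a ratio of 1.7.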
At the same time, Lu Ma et al.’s Distortion-5K work shows that anatomical distortions remain prevalent – affecting nearly half of evaluated AI-generated human images across multiple models. Until similarly systematic benchmarks are published for Nano Banana Pro, it is more accurate to say the model appears to reduce obvious anatomy failures – based on Google’s sample images and early user reports – rather than to claim it has solved them outright.
Outlook
The available evidence supports several cautious conclusions:
Substantive improvement over earlier Gemini image models
Internal DeepMind benchmarks show Gemini 3 Pro Image leading prior Gemini 2.5 Flash Image (Nano Banana) and several competitors on image editing and text-to-image preference scores, particularly for text-heavy and infographic-style tasks.
Strong but not perfect text handling
Google’s own heatmaps and the PageOn.ai 100-prompt test both point to notably higher text accuracy than many rivals in practical use, though this is based on early, limited evaluations rather than a universal standard.
Human figures still demand scrutiny
Broader research on AI image generation indicates that hands, limbs and overlapping bodies remain challenging areas. Nano Banana Pro offers more control and higher resolution, which should help, but its performance on anatomy is best treated as an incremental improvement pending independent, model-specific studies.
Traceability features are helpful but not a complete safeguard
SynthID watermarks and C2PA metadata make it easier to flag images from Google’s tools, yet they do not stop misuse via screenshots, re-encoding or other generators that lack similar safeguards.
Nano Banana Pro, therefore, looks less like a final answer to AI’s visual reliability problems and more like a significant step in a broader shift: from loosely controlled, one-off AI artwork toward structured, traceable and text-heavy visual assets that can be integrated into professional workflows, albeit still with a need for human oversight.
