Google Just Dropped a Bombshell! The Upgraded Multimodal Capabilities Are Insanely Powerful • Lucky Snail

Hi everyone, I’m luckySnail. Yesterday, Google officially released the native multimodal image generation feature for Gemini 2.0 Flash Experimental. After playing with it for a few hours, I couldn’t wait to share — it’s incredibly powerful! Let’s take a look at what it can do:

Image generation
Image editing
Creating image stories
Designing birthday cards

Let’s first check out the results, and then I’ll explain how to try it. If you want to jump right in, you can scroll down to the “Tutorial” section.

Image Generation

Here’s an image of my favorites, SpongeBob and Patrick, generated by Gemini in one go and ready to use right away!

Now we can use our imagination and generate a picture of Tom and Jerry shaking hands!

Image Editing

The text placement in the Tom and Jerry image above might not be ideal — we can adjust it through conversation:

We can also remix images from the web:

Creating Image Stories

This is the most promising feature. Let’s use it to generate a story of a food delivery rider picking up an order and delivering it to the customer:

Here’s an example of generating a game character from scratch. Amazing!

Designing Birthday Cards

I think this is a very practical feature. One idea I have right now is to use it to design wedding invitations. What do you think of the result?

Prompt: Design a Chinese-style wedding invitation card. Use Chinese red as the theme color, with large text reading: "Xiao Zhang ❤️ Xiao Wang's Wedding Invitation"

Tutorial

Right now, Google still offers it free and without limits. What a generous move from a tech giant! First, open this URL in your browser: https://aistudio.google.com/ (you’ll need a VPN if the site is not reachable from your region). You’ll see this:

When you first enter, I suggest trying out the three sample cards provided by the official site to understand how it works. Then you can let your imagination run wild! I really envy those creative minds now.
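If you would rather script your prompts than type them into the web page, here is a minimal sketch in Python using only the standard library. It is my own illustration, not something from the AI Studio page. The assumptions are: the public Gemini REST endpoint, the model ID `gemini-2.0-flash-exp` (check the current name in AI Studio), an API key stored in a `GEMINI_API_KEY` environment variable, and the helper names `build_request`, `split_parts`, and `generate`, which are hypothetical.

```python
import base64
import json
import os
import urllib.request

# Assumption: model ID and endpoint follow the public Gemini REST API;
# verify the exact model name in AI Studio before relying on it.
MODEL = "gemini-2.0-flash-exp"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)


def build_request(prompt: str) -> dict:
    """Build a request body that asks for both text and image output."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
    }


def split_parts(response: dict) -> tuple[list[str], list[bytes]]:
    """Separate the text parts and the decoded image bytes of a response."""
    texts, images = [], []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            if "text" in part:
                texts.append(part["text"])
            elif "inlineData" in part:
                # Images come back base64-encoded inside the JSON.
                images.append(base64.b64decode(part["inlineData"]["data"]))
    return texts, images


def generate(prompt: str, api_key: str) -> dict:
    """Send the prompt to the API and return the parsed JSON response."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")  # hypothetical variable name
    if key:
        texts, images = split_parts(
            generate("A picture of Tom and Jerry shaking hands", key)
        )
        for index, image in enumerate(images):
            # Assumption: the image is a PNG; check the part's mimeType.
            with open(f"image_{index}.png", "wb") as handle:
                handle.write(image)
        print("\n".join(texts))
    else:
        print("Set GEMINI_API_KEY to run this example.")
```

Because a story-style reply mixes text and images, `split_parts` walks through every part instead of expecting a single image.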

Summary

After using it for a few hours, I feel the generation capability is already very strong, though it sometimes returns a ⚠️ warning. For touching up images, it already feels good enough for real work. Right now, the image story feature seems the most exciting: it can generate images with narrative context in one go, which is perfect for content creation. If you think plain text is too dry, give it a try. It’s really useful!

I recently launched my own product: https://www.svgshow.cn. It’s a website that helps you quickly turn content into beautiful images, with online editing capabilities. My cover image was generated using it.