
New Gemini Feature Turns Photos into Videos

Google is once again redefining the boundaries of digital creativity. Its Gemini platform now lets users transform ordinary still photos into short animated video clips, complete with sound. This new capability, announced by David Sharon, who leads Multimodal Generation for Gemini Apps, is powered by the company’s latest video model, Veo 3.


How It Works

Breathing life into a static photo might sound like something out of a sci-fi movie, but with Gemini, the process feels intuitive and fun. Inside the Gemini interface, users can head over to the prompt area and select the “Videos” option. Once a photo is uploaded, all that’s left to do is describe what the scene should look like in motion, and optionally, suggest accompanying audio.

That’s all it takes. A few inputs later, your snapshot becomes an eight-second animated video. Whether you're reimagining a childhood drawing or adding motion to a scenic photo from a recent hike, the possibilities feel nearly limitless. Finished videos can be downloaded or shared with friends and family.

The AI Engine Behind the Art

Under the hood, all of this is made possible by Veo 3, Google's advanced video-generation engine. Introduced in May, this model is already making waves. It recently became available to Google AI Pro users across more than 150 countries.

And users are clearly loving it. In just the past seven weeks, over 40 million videos have been created using Veo 3 across Gemini and Flow, Google’s AI-powered storytelling tool. People are using it to do everything from reimagining classic fairy tales with a modern spin to building ASMR experiences around nature’s most mesmerizing sounds.

Where and How to Try It

The photo-to-video feature is currently rolling out to Google AI Pro and Ultra subscribers in select countries. Curious users can try it at gemini.google.com. The same capability is also available in Flow, which is better suited to creators working on longer or more cinematic projects.

Built With Safety in Mind

As with all of Google’s AI innovations, the launch of this feature comes with a focus on responsibility and safety. Behind the scenes, the tech giant is running continuous “red teaming” simulations, essentially stress tests designed to catch problems before they reach real users.

Each AI-generated video carries a visible watermark indicating that it was created by artificial intelligence. Every file also includes a SynthID digital watermark, Google’s invisible watermarking system designed for traceability.

And user feedback is more than welcome. With a quick thumbs-up or thumbs-down on each video, creators can share their impressions. This feedback loop helps Google continuously fine-tune the experience and maintain high standards of safety.


This feature is more than just a novelty; it’s a glimpse into what the future of personal storytelling could look like. By giving users the ability to animate their memories, drawings, or ideas with just a few prompts, Google is turning imagination into a playable format. Whether you’re an artist, a content creator, or just someone curious to explore what AI can do, Gemini now offers a platform filled with limitless potential.
