Gemini Omni by Google: multimodal video in high resolution, up to 4K.
Gemini Omni builds video from text, photos or another video, taking up to seven images as input. Clips run from 4 to 10 seconds at 720p, 1080p or 4K. It handles very long prompts, so you can describe a complex scene in full.
| Type | Video generation (text-to-video, image-to-video, video-to-video) |
|---|---|
| Animate a photo | Yes |
| Input frames | up to 7 photos |
| Audio | No |
| Clip length | 4 to 10 s |
| Resolution | 720p, 1080p, 4K |
| Prompt length | up to 20,000 characters |
| Provider model | Google Gemini Omni (Omni Flash) |
| Released | 2026-05-19 |
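
The limits in the table can be checked before submitting a request. The following is only a minimal sketch: the `VideoRequest` type and the `validate` helper are hypothetical names made up for illustration, not part of any Mixer AI or Google API, and only the numeric limits come from the table above.

```python
from dataclasses import dataclass, field

# Limits taken from the spec table; all names below are hypothetical.
MAX_INPUT_IMAGES = 7
MIN_CLIP_SECONDS = 4
MAX_CLIP_SECONDS = 10
RESOLUTIONS = ("720p", "1080p", "4K")
MAX_PROMPT_CHARS = 20_000


@dataclass
class VideoRequest:
    """Hypothetical request shape used only for this illustration."""
    prompt: str
    duration_s: float
    resolution: str
    images: list = field(default_factory=list)  # paths or URLs of input photos


def validate(req: VideoRequest) -> list:
    """Return a list of problems; an empty list means the request fits the limits."""
    problems = []
    if len(req.prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if not MIN_CLIP_SECONDS <= req.duration_s <= MAX_CLIP_SECONDS:
        problems.append(
            f"duration must be {MIN_CLIP_SECONDS} to {MAX_CLIP_SECONDS} seconds"
        )
    if req.resolution not in RESOLUTIONS:
        problems.append(f"resolution must be one of {', '.join(RESOLUTIONS)}")
    if len(req.images) > MAX_INPUT_IMAGES:
        problems.append(f"at most {MAX_INPUT_IMAGES} input images are allowed")
    return problems


if __name__ == "__main__":
    request = VideoRequest(
        prompt="A slow pan across a rainy street at night",
        duration_s=8,
        resolution="1080p",
        images=["street.jpg"],
    )
    print(validate(request))
```

Returning a list of problems rather than raising on the first one lets a caller show every violated limit at once.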
Gemini Omni is a video model by Google (provider model: Google Gemini Omni, Omni Flash). It is available on Mixer AI on a pay-as-you-go basis, from 54 coins.
Pricing is pay-as-you-go with no plans, starting from 54 coins. The exact price is shown before you run a generation.
Yes, it can animate a photo: upload the photo as a frame or reference and the model turns it into video. Text-to-video also works.
No subscription is needed. Mixer AI is pay-as-you-go: you top up a balance in coins and spend it only on the generations you want. It is available on the site and in the Telegram bot @addbeer_bot.