Animated story video generation #450

Closed

Yousif-GO

This code demonstrates how to generate an animated story video by:

  1. Generating a story sequence using structured output from the Google Gemini API (for character consistency).
  2. Generating images for each scene using Google’s Imagen API.
  3. Synthesizing narration audio using Kokoro's KPipeline.
  4. Creating short video clips (image + audio overlay) for each scene.
  5. Combining all clips into one final video.
  6. Cleaning up temporary files after processing.
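
The six steps above can be sketched as a small orchestration function. This is only an illustration under stated assumptions, not the notebook's actual code: the JSON shape (`character`, `scenes`, `narration`, `visual`), the `Scene` class, and the four injected callables (`make_image`, `make_audio`, `make_clip`, `concat_clips`) are hypothetical stand-ins for the Gemini, Imagen, narration, and video-editing calls.

```python
import json
import shutil
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


@dataclass
class Scene:
    narration: str     # text spoken over the scene
    image_prompt: str  # prompt handed to the image model


def parse_story(raw_json: str) -> List[Scene]:
    """Turn a structured (JSON) story response into scenes.

    Assumed shape (hypothetical):
      {"character": str, "scenes": [{"narration": str, "visual": str}]}
    The shared character description is repeated in every image prompt,
    which is one simple way to keep the character consistent.
    """
    story = json.loads(raw_json)
    character = story["character"]
    return [
        Scene(narration=s["narration"], image_prompt=f"{character}. {s['visual']}")
        for s in story["scenes"]
    ]


def build_story_video(
    raw_story: str,
    make_image: Callable[[str, Path], None],        # step 2: prompt -> image file
    make_audio: Callable[[str, Path], None],        # step 3: narration -> audio file
    make_clip: Callable[[Path, Path, Path], None],  # step 4: image + audio -> clip
    concat_clips: Callable[[List[Path], Path], None],  # step 5: clips -> final video
    output: Path,
) -> Path:
    scenes = parse_story(raw_story)  # step 1 (story already generated as JSON)
    workdir = Path(tempfile.mkdtemp(prefix="story_"))
    try:
        clips = []
        for i, scene in enumerate(scenes):
            image = workdir / f"scene_{i}.png"
            audio = workdir / f"scene_{i}.wav"
            clip = workdir / f"scene_{i}.mp4"
            make_image(scene.image_prompt, image)
            make_audio(scene.narration, audio)
            make_clip(image, audio, clip)
            clips.append(clip)
        concat_clips(clips, output)
    finally:
        # step 6: remove temporary files even if an earlier step failed
        shutil.rmtree(workdir, ignore_errors=True)
    return output
```

Passing the model and video steps in as callables keeps the sketch independent of any particular SDK, and the `finally` block ensures the temporary directory is removed whether or not generation succeeds.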
1739225609_output_video.mp4

…structured Google Gemini API (for character consistency) and Imagen


@github-actions github-actions bot added status:awaiting review PR awaiting review from a maintainer component:examples Issues/PR referencing examples folder labels Feb 11, 2025
@markmcd
Member

markmcd commented Feb 12, 2025

This is very cool! I love that it's a complex end-to-end example, contains text instructions throughout, and uses Google models and open-source tools exclusively. Plus it's fun!

I'll go over in a bit more detail when I get some time but for now, would you be able to use the google-genai SDK instead of google-generativeai? It'd be great to show this example off, and having the latest SDK makes it a useful demo.

Have you tried doing the audio generation using Gemini 2.0 by any chance? It's only available through the Live API now. But there is also a preview that we can look at getting you onto. This isn't required at all, but something we'll want to do once audio generation becomes GA.

And thank you for following our contrib template! Do you want to add your name on this anywhere? You're welcome to add a byline at the top with a link to a social account or website.

@Yousif-GO
Author

Hi Mark,

Thank you for your feedback. I just created another pull request with the suggested changes. The new code now provides a full demonstration of Gemini text, Live API, and Imagen working together to generate a story video.

Notable changes include:

  • Removed the use of Kokoro and added Gemini Live instead.
  • Tweaked various parts of the code to ensure smooth functionality.
  • Updated to the latest google-genai SDK rather than using google-generativeai.
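
For readers unfamiliar with the two SDKs, the switch from google-generativeai to google-genai roughly amounts to the following. This is a minimal sketch, not the code from the pull request: the helper names (`generate_text_legacy`, `make_client`, `generate_text`) are hypothetical, and the model name in the usage note is an assumption. The SDK imports are done lazily so the helpers can be defined without either package installed.

```python
def generate_text_legacy(prompt: str, model_name: str, api_key: str) -> str:
    """Older google-generativeai style: module-level configuration."""
    import google.generativeai as genai  # pip install google-generativeai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    return model.generate_content(prompt).text


def make_client(api_key: str):
    """Newer google-genai style: an explicit client object."""
    from google import genai  # pip install google-genai

    return genai.Client(api_key=api_key)


def generate_text(client, prompt: str, model_name: str) -> str:
    """Generate text through a google-genai client."""
    response = client.models.generate_content(model=model_name, contents=prompt)
    return response.text
```

With a real key this might be used as `generate_text(make_client(api_key), "Tell a short story", "gemini-2.0-flash")`; the model name here is only an example.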

Also, access to native audio output would be great for additional experimentation.

I'm glad you liked it, and I hope this demo effectively showcases the capabilities of Gemini and Imagen in a fun and engaging way.

@Yousif-GO Yousif-GO closed this Feb 12, 2025