This post is my submission for DEV Education Track: Build Apps with Google AI Studio.
What I Built
I built SceneCraft, a dynamic, AI-powered storyboard generator designed to help writers, filmmakers, and creatives visualize their scenes. The application transforms a sequence of textual prompts into a cinematic storyboard, using Google's Gemini API to generate both the images and creative suggestions.
The core of the app was built around these key functionalities:
- Sequential Image Generation: Using the `imagen-3.0-generate-002` model to generate a series of images based on a combination of a base style, character descriptions, and individual shot prompts.
- AI-Powered Suggestions: Leveraging the `gemini-2.5-flash-preview-04-17` model with a specific system instruction and `responseMimeType: 'application/json'` to provide users with creative alternatives for their style, character, and shot descriptions, acting as a creative partner.
- Dynamic UI: A fully interactive interface built with React and TypeScript where users can add, edit, and remove characters and shots on the fly, making the storyboarding process fluid and iterative.
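To illustrate how a base style, character descriptions, and a single shot prompt might be combined into one image prompt, here is a minimal sketch. The `Scene` and `Character` types, the `buildShotPrompt` helper, and the prompt layout are all assumptions made for illustration; the post does not show SceneCraft's actual prompt format.

```typescript
// Hypothetical shapes: the post does not show SceneCraft's real types.
interface Character {
  name: string;
  description: string;
}

interface Scene {
  style: string;
  characters: Character[];
  shots: string[];
}

// Combine the shared style and character descriptions with one shot's text,
// so every image in the sequence is generated from the same visual context.
function buildShotPrompt(scene: Scene, shotIndex: number): string {
  const shot = scene.shots[shotIndex];
  if (shot === undefined) {
    throw new RangeError(`No shot at index ${shotIndex}`);
  }

  const parts: string[] = [`Style: ${scene.style.trim()}`];

  // Skip characters whose description is still empty.
  const cast = scene.characters
    .filter((c) => c.description.trim() !== "")
    .map((c) => `${c.name.trim()}: ${c.description.trim()}`);
  if (cast.length > 0) {
    parts.push(`Characters: ${cast.join("; ")}`);
  }

  parts.push(`Shot ${shotIndex + 1}: ${shot.trim()}`);
  return parts.join("\n");
}
```

Because the style and characters are repeated in every prompt, each shot only has to describe what changes, which is one plausible way to keep a sequence of images visually consistent.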
Demo
A link to the fully deployed application is here.
The application interface is clean and divided into two columns. On the left, the "Scene Details" panel allows the user to define their vision. It contains editable text areas for the overall Style, Character descriptions (with the ability to add/remove characters), and the Shot Sequence (with the ability to add/remove shots). Each input field is enhanced with a "Suggest" button.
On the right, the "Storyboard Grid" area starts empty, waiting for the user's creation.
When the user clicks "Generate Storyboard", each shot card in the grid enters a loading state. As the `imagen-3.0-generate-002` model returns images, they populate the cards sequentially, creating a visual narrative. If a user feels stuck, clicking the suggest button on any field calls the `gemini-2.5-flash-preview-04-17` model to provide creative alternatives, which can be applied with a single click.
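The loading-then-populate behaviour described above could be modelled with a small reducer plus a loop that awaits one image at a time. This is a sketch under assumptions, not the app's actual code: `ShotCard`, `CardAction`, `cardsReducer`, and `generateSequentially` are hypothetical names, and `generateImage` is an injected function standing in for whatever wraps the Imagen call.

```typescript
// Hypothetical card state: the post does not show the app's real state shape.
type ShotCard =
  | { status: "idle"; prompt: string }
  | { status: "loading"; prompt: string }
  | { status: "done"; prompt: string; imageUrl: string }
  | { status: "error"; prompt: string; message: string };

type CardAction =
  | { type: "generateAll" }
  | { type: "imageReady"; index: number; imageUrl: string }
  | { type: "imageFailed"; index: number; message: string };

// Pure reducer, usable with React's useReducer.
function cardsReducer(cards: ShotCard[], action: CardAction): ShotCard[] {
  switch (action.type) {
    case "generateAll":
      // Every card shows a spinner as soon as generation starts.
      return cards.map((c): ShotCard => ({ status: "loading", prompt: c.prompt }));
    case "imageReady": {
      const { index, imageUrl } = action;
      return cards.map((c, i): ShotCard =>
        i === index ? { status: "done", prompt: c.prompt, imageUrl } : c,
      );
    }
    case "imageFailed": {
      const { index, message } = action;
      return cards.map((c, i): ShotCard =>
        i === index ? { status: "error", prompt: c.prompt, message } : c,
      );
    }
  }
}

// Await each image before requesting the next, so cards fill in shot order.
// A failed shot is marked as an error and the loop moves on to the next one.
async function generateSequentially(
  cards: ShotCard[],
  generateImage: (prompt: string) => Promise<string>, // assumed wrapper around the image model
  dispatch: (action: CardAction) => void,
): Promise<void> {
  dispatch({ type: "generateAll" });
  for (let index = 0; index < cards.length; index++) {
    try {
      const imageUrl = await generateImage(cards[index].prompt);
      dispatch({ type: "imageReady", index, imageUrl });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      dispatch({ type: "imageFailed", index, message });
    }
  }
}
```

Awaiting inside the loop trades total speed for a predictable fill order; firing all requests at once would finish sooner but cards would complete in arbitrary order.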
My Experience
Working with the Google AI SDK on this project was a fantastic experience. The key takeaway for me was the sheer versatility of the Gemini models and how easily they can be integrated to create complex, interactive applications.
What I learned:
- Structured Output is a Game-Changer: Using `responseMimeType: 'application/json'` along with a detailed `systemInstruction` was incredibly powerful. It allowed me to reliably get structured data for the "Suggestions" feature without complex string parsing, making the integration seamless.
- Combining Models for Richer Experiences: The real power came from using two different models for their specialized tasks. `imagen-3.0-generate-002` was a workhorse for high-quality visuals, while `gemini-2.5-flash-preview-04-17` was perfect for fast, creative text generation. This combination enabled an app that doesn't just execute commands but collaborates with the user.
- Ease of Integration: The `@google/genai` SDK is clean and straightforward. The API for generating images and generating content is intuitive, and handling asynchronous operations within the React state model was a smooth process.
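As a rough illustration of why a JSON response removes the need for string parsing, the sketch below turns the model's response text into a typed list with a single `JSON.parse` and a few shape checks. The `{"suggestions": [...]}` shape and the `parseSuggestions` helper are assumptions; the post does not state the schema the app actually requests.

```typescript
// Assumed response shape: {"suggestions": ["...", "..."]}.
// The post does not specify the real schema, so treat this as a placeholder.
function parseSuggestions(responseText: string): string[] {
  let data: unknown;
  try {
    data = JSON.parse(responseText);
  } catch {
    // Fall back to "no suggestions" rather than crashing the UI.
    return [];
  }

  if (typeof data !== "object" || data === null) {
    return [];
  }

  const raw = (data as { suggestions?: unknown }).suggestions;
  if (!Array.isArray(raw)) {
    return [];
  }

  // Keep only non-empty strings, trimmed for display.
  return raw
    .filter((s): s is string => typeof s === "string")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}
```

Even with a JSON MIME type requested, validating the parsed value is a cheap safeguard, since the type system cannot vouch for data that arrives at runtime.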
What was surprising was how simple it was to build a feature that feels genuinely helpful and creative. The AI suggestions aren't just random; by providing context from the user's own writing, the model acts as a true assistant, elevating the user's initial ideas. This project demonstrated that modern AI APIs can be the foundation for tools that truly augment human creativity.
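One way to picture "providing context from the user's own writing" is a helper that packs the current scene into the suggestion request. This is only a sketch: `buildSuggestionRequest`, the `SuggestionContext` type, the wording of the system instruction, and the requested JSON shape are hypothetical. The `model`, `contents`, and `config` field names follow my understanding of what the `@google/genai` `generateContent` call accepts, so check them against the SDK documentation before relying on them.

```typescript
type SuggestionField = "style" | "character" | "shot";

// Hypothetical snapshot of what the user has written so far.
interface SuggestionContext {
  style: string;
  characters: string[];
  shots: string[];
}

// Assumed to mirror the parameter object passed to the SDK's generateContent call.
interface SuggestionRequest {
  model: string;
  contents: string;
  config: {
    systemInstruction: string;
    responseMimeType: "application/json";
  };
}

function buildSuggestionRequest(
  field: SuggestionField,
  currentText: string,
  context: SuggestionContext,
): SuggestionRequest {
  // Include the rest of the scene so suggestions fit the user's own material.
  const contents = [
    `Field to improve: ${field}`,
    `Current text: ${currentText}`,
    `Overall style: ${context.style}`,
    `Characters: ${context.characters.join("; ") || "(none yet)"}`,
    `Shots: ${context.shots.join(" | ") || "(none yet)"}`,
  ].join("\n");

  return {
    model: "gemini-2.5-flash-preview-04-17",
    contents,
    config: {
      // Illustrative wording only; the app's real instruction is not shown in the post.
      systemInstruction:
        'You are a storyboard assistant. Reply only with JSON of the form {"suggestions": ["...", "...", "..."]}.',
      responseMimeType: "application/json",
    },
  };
}
```

Keeping the request builder as a pure function means the context it sends can be inspected and unit-tested without calling the API.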