
PipeDream Studio: Vibe-Coding a Hyper-Personalized Workflow for AI Filmmaking

  • Mar 8
  • 7 min read

My journey into AI video generation, its potential use cases, and workflows began with an attempt to create an AI short film, which I mentioned in my last post. Having encountered some challenges on that first attempt, I sought a workflow that could bridge traditional filmmaking with my conviction that AI is ushering in a new era of "Hyper-Personalization" in many forms, including workflows. This led me to develop, or "vibe-code", my own tool: PipeDream Studio. This single application streamlines the entire process, taking a project from script to assembly cut. I initiated the build in Google AI Studio and completed the prototype in Google Antigravity.


To test the application, I created a few experimental proof-of-concept spots, which can be seen in the reel below:


Three Spec Ads created in the PipeDream Studio app using fictional brands created with Gemini.

  1. Optimism Mobile - Elements, storyboard, and video generated in PipeDream Studio. Some VFX, sound design, and final finish in Premiere Pro. Titles in Photoshop and voice-over in ElevenLabs (the third and final test).

  2. Storm-Grip Runners - Elements, storyboard, and video generated in PipeDream Studio. Sound design, film grain, color grade, and final finish in Premiere Pro. VO from another vibe-coded app in Google AI Studio (the second test).

  3. Lumina Espresso Press - Elements, storyboard, and video generated in PipeDream Studio. Sound design and final finish in Premiere Pro (the first test).


The Discovery

The initial problem I faced with my short film was determining the best way to convert the script into a workable format for an AI video generator, essentially turning the script into a series of prompts. The other challenge was maintaining consistency in character, props, and location, a common and unfortunate issue with AI video (currently, but for how long?).

My solution was to create this app to address all of these issues. I also felt it was important to keep the human in control at all times, so the app always gives the filmmaker ways to steer the creative output. PipeDream Studio is powered by Gemini, Nano Banana Pro, and Veo 3.1 (I may add further models via API calls later).

The app has six stages or pages to go from script to screen: Script, Elements, Blocking, Storyboard, Rushes, and Assembly (though there is flexibility to start from any point). I’ll quickly run through the key features for each one individually:


1. Script

On this page, you can set up your entire project, choose a preset style, and lock in your characters, locations, props, and shot-list.


  1. Upload or paste your script into the text window.

  2. The AI scans the script and extracts your characters, locations, and props, along with their descriptions.

  3. Confirm or make any changes to the extracted assets, then press Confirm and Create Assets (this sends all the extracted assets to the Elements page).

  4. Head to the Script Breakdown panel to create your shot-list (my favorite feature). Draw red rectangle boxes over parts of your script, then assign each one a camera position and location. Once all shots are defined, click Create Shot-list. This feeds into the Blocking and Storyboard pages.
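To make the hand-off concrete, here is a minimal sketch of the kind of record each boxed shot could produce. The field names are my own illustration, not PipeDream Studio's actual schema:

```python
from dataclasses import dataclass

@dataclass
class Shot:
    """One shot-list entry (hypothetical schema, for illustration only)."""
    shot_id: int
    script_excerpt: str  # the script text boxed in red
    camera: str          # assigned camera position, e.g. "wide", "close-up"
    location: str        # should match a location extracted from the script

# The Blocking and Storyboard pages would then consume a list like this:
shot_list = [
    Shot(1, "She unboxes the phone at dawn.", "close-up", "Apartment"),
    Shot(2, "She steps onto the balcony.", "wide", "Apartment"),
]
```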



2. Elements

On this page, you can start generating the elements extracted from the Script page, with the descriptions from the script loaded and ready to be used as prompts.


  1. Nano Banana Pro will create a character sheet with two headshots (one profile) and three full-body shots at different angles. You can upload reference images if you have a specific wardrobe for the characters. Generate additional sheets for different costumes and re-edit any image using the Inpaint feature.

  2. Generate a location image, with the option to upload reference images. Then, generate the view in the opposite direction to provide a background for reverse-angle shots. Generate close-up details of the environment for any close-up shots.

  3. Generate a Prop/Product sheet, also featuring reference images and Inpaint editing features. The Prop/Product sheet displays the props from four angles: top, bottom, front, and back.


The generated elements will then be available for the following pages:


3. Blocking

This page is something of a hybrid, but its main purpose is implied by its name: it attempts to block the action and give the AI spatial awareness, in order to achieve continuity in action, character, and location, a frequent pitfall of AI-generated video (currently, though things are changing rapidly as I write this!). It's still a work in progress, but it currently works by generating Nano Banana Pro storyboard grids.


  1. The shot-list created on the Script page appears here. Each shot is an individual panel with the camera position, location, and the text from the script you boxed, loaded as a prompt, which you can re-edit before generating. These will be converted into the grid frames.

  2. After reviewing all your shots, checking details, or rewriting prompts, select the images of the elements you want to include in the scene (characters, location, props, etc.). Then click Generate, and multiple grids of up to four start frames will be created. Grids of four are used because a single grid frame comes out at 1080 HD when the source image is 4K, thereby retaining more detail (originally I was using 3x3 grids).

  3. Re-edit any grid globally or re-edit individual frames with the Inpaint feature.
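The arithmetic behind the grids of four is simple: a 4K grid image (3840x2160) split 2x2 yields four full-HD 1920x1080 frames. A small sketch of the crop-box math, my own illustration rather than the app's code:

```python
def grid_crop_boxes(width, height, cols=2, rows=2):
    """Return (left, top, right, bottom) crop boxes that slice a grid
    image into individual frames, reading row by row."""
    tile_w, tile_h = width // cols, height // rows
    return [
        (c * tile_w, r * tile_h, (c + 1) * tile_w, (r + 1) * tile_h)
        for r in range(rows)
        for c in range(cols)
    ]

# A 4K grid of four yields four full-HD frames:
boxes = grid_crop_boxes(3840, 2160)  # first box is (0, 0, 1920, 1080)
```

Each box can be passed straight to an image library's crop function. A 3x3 grid of the same 4K source would drop each frame to only 1280x720, which is why the 2x2 layout retains more detail.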



4. Storyboard

What was originally the storyboard page has evolved into something more like a shot-list. Here, a list of empty start and end frames with a camera position and script text box has been loaded from the script breakdown panel on the Script page. This is where the video generation begins.


  1. Click on the Start Frame to select a frame from the Blocking Grids. A frame will be extracted from the grid image and added to the start frame for that shot. You can make further changes here with Inpainting or by adding a reference image if needed.

  2. Now generate the end frame by writing a prompt and feeding the start frame image in as a reference to maintain continuity. Alternatively, select a reference image from local storage, pick from your generated elements, choose another blocking grid frame, or use the last frame of one of your already-generated videos. Lots of options, all of which are also available for the start frames!

  3. Move to the video prompt box below to generate a video using the frames-to-video process. Prompt templates are also available including JSON templates.

  4. There is also a drop-down option to switch to Ingredients-to-Video mode if required, where the start frame is loaded as the first reference image; you then pick the remaining two ingredient images from either your elements or local storage.
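As an idea of what a JSON-style video prompt template might look like, here is a sketch; the field names are my assumptions for illustration, not the app's actual template:

```python
import json

# Hypothetical frames-to-video prompt template (illustrative field names).
prompt_template = {
    "shot": "wide tracking shot",
    "subject": "runner in Storm-Grip trainers",
    "action": "sprints through heavy rain along a neon-lit street",
    "camera_movement": "dolly alongside with slight handheld shake",
    "lighting": "cool blue dusk, wet reflections",
    "duration_seconds": 8,
}
prompt = json.dumps(prompt_template, indent=2)
```

Structured prompts like this make it easy to tweak one attribute between takes (say, the lighting) while holding everything else constant.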



Your generated videos will then appear on the next page.


5. Rushes

I gave this a traditional name because that is how I’ve been thinking of each video generation. This is where your generated videos appear for playback review, and if you’re not happy with “Take 1,” you can re-edit the prompt and generate “Take 2”!

From here, you can then select your favorite takes in the top corner of each shot; this then feeds that take into the following page’s timeline.


Rushes Page - Generated videos for each shot go down the page, and new takes appear to the right of the first. Put a tick on your favourite to load it into the Assembly timeline (from the Optimism Mobile ad).

6. Assembly

Your first opportunity to see it all come together! A very basic video editor's timeline where you can trim the head and tail of each clip loaded from the previous page. You can also rearrange the videos, and add or delete clips on the timeline. Play back your scene, then export a full-cut MP4 and/or export the assembly-cut timeline as an XML file to continue the edit in Premiere Pro or any other NLE (the process I used for the three spec ads).
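For the XML hand-off, the usual interchange choice is an FCP7-style `xmeml` file, which Premiere Pro can import. Below is a heavily simplified sketch of what such an export could look like; real exports carry far more metadata, and I'm assuming a simple `(path, in_frame, out_frame)` clip list rather than the app's actual internals:

```python
import xml.etree.ElementTree as ET

def assembly_to_xmeml(clips, fps=24, sequence_name="Assembly Cut"):
    """Build a minimal FCP7-style xmeml sequence from a clip list of
    (file_path, in_frame, out_frame) tuples. Sketch only: real exports
    include rates, durations, and link data per clip."""
    root = ET.Element("xmeml", version="4")
    seq = ET.SubElement(root, "sequence")
    ET.SubElement(seq, "name").text = sequence_name
    ET.SubElement(ET.SubElement(seq, "rate"), "timebase").text = str(fps)
    media = ET.SubElement(seq, "media")
    track = ET.SubElement(ET.SubElement(media, "video"), "track")
    playhead = 0  # running position on the sequence timeline, in frames
    for path, in_f, out_f in clips:
        item = ET.SubElement(track, "clipitem")
        ET.SubElement(item, "name").text = path
        ET.SubElement(item, "in").text = str(in_f)
        ET.SubElement(item, "out").text = str(out_f)
        ET.SubElement(item, "start").text = str(playhead)
        playhead += out_f - in_f
        ET.SubElement(item, "end").text = str(playhead)
        file_el = ET.SubElement(item, "file")
        ET.SubElement(file_el, "pathurl").text = "file://" + path
    return ET.tostring(root, encoding="unicode")

xml_out = assembly_to_xmeml([("/renders/shot01_take2.mp4", 12, 180)])
```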


Assembly Page - Quick edits and playback before exporting via XML to your preferred NLE (From the Optimism Mobile Ad)

Conclusion

This experiment demonstrated to me a genuine, emerging opportunity for businesses to leverage AI in producing rapid, cost-effective, and highly personalized digital ads, especially as video models grow in realism. It makes me wonder where we will be this time next year. For this to be effective, however, a hybridized approach is vital: for best results, the AI-generated content must be moved into Premiere Pro or another NLE for finishing touches by a human creative, who remains ultimately responsible for the final delivery.


Within the AI workflow inside PipeDream Studio, and AI video/image generators in general, consistency remains difficult to achieve. However, I've found that incorporating multiple opportunities for refinement throughout the process helps to mitigate this, even when issues are spotted late. It's crucial to refine and precisely set your start and end frames to capture the exact look you want for your shot. This ultimately cuts costs by reducing the number of videos you have to generate to get the desired result! Ideally, you want the perfect shot within a couple of attempts. Image-to-video is therefore the most effective method for maintaining creative control and lowering expenses, as you can refine the frames at a much lower cost than generating new videos.


Regarding "vibe coding": it has advanced significantly over the last year, truly opening up possibilities for hyper-personalized workflow solutions. For example, instead of paying for an expensive subscription full of features you barely use, build exactly what you require to plug the gaps or fix the bottlenecks. Each custom-built solution can be tailor-made to specific projects or adapted to a company's unique work style and culture.


As creatives globally explore diverse approaches to this technology, this project has been my take on building a functional, real-world workflow. It’s been an incredible and fun experiment in 'Vibe-Coding' a custom app tailored to my specific filmmaking needs. I’m excited to use this tool for the rest of my short film and to keep iterating on these workflow solutions as the tech evolves.

 
 

© 2026 by Robert D. Cook