Jun 12, 2026
A Video Production Pipeline Built Around a 15-Minute Constraint
How I designed a content system that produces videos quickly enough to be sustainable.
Goal
Create educational videos and shorts consistently without spending hours editing.
The original motivation was supporting content distribution for ClearFit, a career-fit project I was experimenting with. I wanted a workflow I could realistically sustain long-term.
Target:
- less than 15 minutes of effort per video
- good-quality AI voice narration
- good-quality AI-generated imagery
- visually engaging output
- inexpensive to run locally (I was curious how much quality I could get from roughly 16 GB of VRAM before reaching for paid APIs)
Approach
After testing several text-to-speech systems and image-generation workflows, I settled on a combination that produced acceptable quality on local hardware.
The central piece became a JSON specification describing the entire video.
Each file contains:
- title
- scenes
- video thumbnail
- narration
- image prompts
- animation instructions
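To make the shape of such a file concrete, here is a minimal sketch of what a specification and a basic sanity check might look like. The field names (thumbnail, image_prompt, animation, and so on) and the animation values are assumptions for illustration, not the actual schema used in the project.

```python
import json

# Hypothetical spec: every field name and value below is an illustrative
# assumption, not the real schema.
SPEC_TEXT = """
{
  "title": "Example video",
  "thumbnail": {"image_prompt": "bold title card, flat illustration"},
  "scenes": [
    {
      "narration": "First scene narration.",
      "image_prompt": "wide shot of a desk at sunrise",
      "animation": "slow_zoom_in"
    },
    {
      "narration": "Second scene narration.",
      "image_prompt": "close-up of a notebook",
      "animation": "pan_left"
    }
  ]
}
"""

REQUIRED_TOP_LEVEL = ("title", "thumbnail", "scenes")
REQUIRED_SCENE = ("narration", "image_prompt", "animation")


def validate_spec(spec: dict) -> list[str]:
    """Return a list of readable problems; an empty list means the spec looks usable."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in spec:
            problems.append(f"missing top-level field: {key}")

    scenes = spec.get("scenes", [])
    if not isinstance(scenes, list) or not scenes:
        problems.append("scenes must be a non-empty list")
        return problems

    for i, scene in enumerate(scenes):
        for key in REQUIRED_SCENE:
            # Treat empty strings the same as missing fields.
            if not scene.get(key):
                problems.append(f"scene {i}: missing {key}")
    return problems


spec = json.loads(SPEC_TEXT)
print(validate_spec(spec))
```

Checking the file before any generation step runs is cheap, and it catches a forgotten prompt before time is spent on narration or images.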
System
The JSON acts as the source of truth.
From there, local scripts:
- generate narration
- generate images
- animate backgrounds
- assemble scenes
- render the final video
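The steps above can be sketched as a small driver that walks the scenes in order. Everything here is a hypothetical outline: the stage functions are placeholders that only return the path they would write to, and the function names, file names, and spec fields are assumptions rather than the project's actual scripts.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SceneAssets:
    audio: Path
    image: Path
    clip: Path


# Placeholder stages. A real version would call a local TTS model, an image
# model, and a video tool; these only return the intended output path.
def generate_narration(text: str, out: Path) -> Path:
    return out


def generate_image(prompt: str, out: Path) -> Path:
    return out


def animate_background(image: Path, instruction: str, out: Path) -> Path:
    return out


def assemble_scene(audio: Path, background: Path, out: Path) -> Path:
    return out


def render_video(clips: list[Path], out: Path) -> Path:
    return out


def run_pipeline(spec: dict, workdir: Path) -> tuple[list[SceneAssets], Path]:
    """Run every stage per scene, then render the final video from the clips."""
    assets = []
    for i, scene in enumerate(spec["scenes"]):
        name = f"scene_{i:02d}"
        audio = generate_narration(scene["narration"], workdir / f"{name}.wav")
        image = generate_image(scene["image_prompt"], workdir / f"{name}.png")
        background = animate_background(image, scene["animation"], workdir / f"{name}_bg.mp4")
        clip = assemble_scene(audio, background, workdir / f"{name}.mp4")
        assets.append(SceneAssets(audio=audio, image=image, clip=clip))

    final = render_video([a.clip for a in assets], workdir / "final.mp4")
    return assets, final


# Hypothetical two-scene spec, used only to exercise the driver.
demo_spec = {
    "title": "Demo",
    "scenes": [
        {"narration": "One.", "image_prompt": "a desk", "animation": "slow_zoom_in"},
        {"narration": "Two.", "image_prompt": "a notebook", "animation": "pan_left"},
    ],
}
assets, final = run_pipeline(demo_spec, Path("build"))
print(final)
```

Because each stage takes plain inputs and returns a path, any one of them can be swapped for a different model or tool without touching the others.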
A single specification can produce both long-form videos and shorts.
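One way this could work is to treat each output format as a render profile applied to the same spec. The sketch below assumes a landscape 1920x1080 long-form profile, a vertical 1080x1920 shorts profile, and a hypothetical per-scene in_short flag that marks which scenes belong in the short; none of these details come from the actual system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderProfile:
    name: str
    width: int
    height: int
    flagged_scenes_only: bool  # if True, keep only scenes marked for shorts


# Assumed dimensions: landscape for long-form, vertical for shorts.
LONG_FORM = RenderProfile("long", 1920, 1080, flagged_scenes_only=False)
SHORT = RenderProfile("short", 1080, 1920, flagged_scenes_only=True)


def scenes_for(spec: dict, profile: RenderProfile) -> list[dict]:
    """Select the scenes a given profile should render."""
    if not profile.flagged_scenes_only:
        return list(spec["scenes"])
    # "in_short" is a hypothetical flag; scenes without it are skipped.
    return [s for s in spec["scenes"] if s.get("in_short", False)]


# Hypothetical spec with one scene marked for the short.
example_spec = {
    "title": "Example",
    "scenes": [
        {"narration": "Hook.", "in_short": True},
        {"narration": "Background detail."},
        {"narration": "Deeper explanation."},
    ],
}

for profile in (LONG_FORM, SHORT):
    selected = scenes_for(example_spec, profile)
    print(profile.name, f"{profile.width}x{profile.height}", len(selected), "scenes")
```

Keeping the format differences in a profile means the spec itself stays format-neutral and only the render step varies.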
Result
The workflow is now simple enough that a new video starts as a JSON file and can be rendered with a few commands.
More importantly, the production process fits the original constraint: creating content quickly, inexpensively, and with little ongoing effort.
SystemClarity