Here is a thing that happened: an entire profession's worth of physical tools, accumulated over a century of filmmaking, got compressed into a blinking cursor. No transition period. No manual. Just a white box that says "describe your video" and expects you to translate fifteen years of lens selection and lighting instinct into a sentence.

That was the situation in early 2025. A year later, the models are staggeringly better and the text box is exactly the same size.

The gap runs in both directions

The original problem was translation. You know what an 85mm at f/1.4 does. You know the difference between a bounced key and a hard side light. You know why you would pick ARRI LogC over Rec.709 for a particular mood. But the text box does not know that you know this, so you type "cinematic, 4K, dramatic lighting" like everyone else and get back something that looks fine. Generic. The visual equivalent of muzak.

That was the problem CinePrompt was built to close. And it still is. But there is a second gap that turned out to be just as important: the person who has a vivid image in their head but does not have the vocabulary to describe it. The director who can feel the shot but cannot name the lens. The motion designer who knows the mood but not the lighting rig that creates it.

CinePrompt now works in both directions. If you have the vocabulary, it gives you surgical control — 1,457 buttons mapped to real cinematography language, each one placing the exact word that moves the exact model. If you do not have the vocabulary, Scene-to-Prompt lets you describe what you see in plain language and CinePrompt decomposes it into professional fields: camera angle, focal length, lighting type, color palette, movement, sound. You get the cinematography education and the optimized prompt in the same action.

Both directions meet at the same place: a prompt built from specifics, not adjectives.

What it actually is

CinePrompt is the fastest way to get from an idea in your head to a professional-quality AI video. It is also a control surface — meaning it gives you precise, repeatable control over every creative decision in the shot. Those two things are not in tension. Speed comes from structure.

Structured like a film set. On a real production, the creative decisions happen before the camera rolls. Lens selection. Lighting plans. Color references. The shoot executes what was already decided. CinePrompt works the same way — you make the creative calls through buttons and fields, and the prompt assembles itself in real time, optimized for whichever model you are generating with.

AI video generation skipped this phase entirely. Every tool in the space went straight from "describe your video" to "here is your video." No structured way to define what you want before you spend the inference, no way to iterate on individual decisions without starting over. CinePrompt puts that structure back — and because it is wired directly into generation, you go from idea to finished video without leaving the page.

Control surface. Think about a mixing console. A hundred faders, each controlling one element of the sound. You could technically write a sentence describing the mix you want and hand it to an engineer. But the faders let you reach in and move one thing without touching everything else. That is what CinePrompt's buttons are. Each one is a handle on a single piece of the prompt. Change the lens from 35mm to 85mm. The rest of the prompt stays exactly where you put it.

This matters more than it sounds like it should. When you are iterating on a shot — trying different focal lengths, swapping lighting setups, testing camera movements — you need to change one variable at a time. A text box does not let you do that without rewriting the whole sentence and hoping you did not accidentally lose the thing that was working. Buttons let you turn one dial.
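
To make that concrete, here is a minimal sketch of the idea in TypeScript. The field names and the assembly function are hypothetical, not CinePrompt's actual internals; the point is that each creative decision lives in its own slot, so turning one dial leaves the others alone.

```typescript
// Hypothetical sketch: a shot as structured fields, not a sentence.
// Names here are illustrative, not CinePrompt's real internals.
interface ShotSpec {
  subject: string;
  lens: string;       // e.g. "35mm" or "85mm"
  lighting: string;
  movement: string;
}

function assemblePrompt(shot: ShotSpec): string {
  // The prompt is derived from the fields, so changing one field
  // regenerates the sentence without disturbing the others.
  return `${shot.subject}, ${shot.lens}, ${shot.lighting}, camera ${shot.movement}`;
}

const shot: ShotSpec = {
  subject: "a weathered fisherman on a cobblestone lane",
  lens: "35mm",
  lighting: "golden-hour side light",
  movement: "tracking at knee height",
};

// Turn one dial: swap the lens, keep everything else where you put it.
console.log(assemblePrompt({ ...shot, lens: "85mm" }));
```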

For AI filmmaking. Not for AI image generation, not for chatbot conversations, not for social media content. For people who think in shots, sequences, and scenes. The vocabulary is real cinematography vocabulary. The workflow assumes you care about the difference between a dolly push and a tracking shot, even if you are generating both from text.

Three workflows, one toolkit

Single Shot is full control over every parameter of a single shot. Subject, environment, camera, lens, lighting, color, movement, sound. Simple mode gives you the essentials. Complex mode opens every dial. Good for hero shots, style tests, one-offs, or learning what each parameter actually does to the output.

Multi-Shot builds sequences. Transition connectors between shots. Recurring characters that maintain consistency across the scene. Global settings — lighting, mood, music — that carry through so shot five does not look like it wandered in from a different film. This is for people who think in scenes, not clips.

Frame to Motion is the fastest path from reference image to finished video. The image panel gives you full prompt control with buttons — build your frame, generate it, and it appears right there. The motion panel sits beside it with the short, specific prompts that img2vid models actually respond to: camera direction, speed, subject action. One click tags your generated image into the video prompt. One more click generates the video. No exporting, no re-uploading, no switching tabs.

This is also where the Subject Library earns its keep. Saved characters with their reference images live in the panel, each tagged with an @Image marker you can drop into the prompt with a click. Up to nine references, unified numbering, one shared counter across direct uploads and saved subjects. When you are working with reference-to-video models like Kling or Seedance, this is the difference between a coherent workflow and a scavenger hunt across browser tabs.
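
A rough sketch of how one shared counter might behave, in TypeScript. The class, the exact tag format, and the limit handling are assumptions for illustration; only the nine-reference cap and the @Image marker come from the tool itself.

```typescript
// Hypothetical sketch of unified reference numbering.
// Direct uploads and saved Subject Library entries draw from one
// counter, so every @Image tag in the prompt resolves to one slot.
const MAX_REFERENCES = 9;

interface Reference {
  index: number;          // 1-9, shared across both sources
  source: "upload" | "subject-library";
  url: string;
}

class ReferenceCounter {
  private refs: Reference[] = [];

  add(source: Reference["source"], url: string): string {
    if (this.refs.length >= MAX_REFERENCES) {
      throw new Error(`Limit of ${MAX_REFERENCES} references reached`);
    }
    const index = this.refs.length + 1;
    this.refs.push({ index, source, url });
    return `@Image${index}`;  // the marker dropped into the prompt
  }
}

const refs = new ReferenceCounter();
refs.add("upload", "frame.png");             // "@Image1"
refs.add("subject-library", "captain.png");  // "@Image2"
```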

Thirty-one models, zero markup

CinePrompt generates video across 31 models from two providers — fal.ai and Venice.ai — with more coming. BYOK architecture: you bring your own API key, pay provider rates, no credit markup, no subscription trap. Your key lives in your browser. CinePrompt is not in the middle.

This matters because the Generate button is a commodity. Every platform selling Kling, Veo, Sora, or Seedance is calling the same APIs and wrapping them in different credit systems. What CinePrompt actually provides is the intelligence layer between you and those APIs: 1,457 cinematography controls, eight model-specific optimizers that know which words each model responds to and which it ignores, and an AI-powered Scene-to-Prompt engine that can decompose a rough idea into professional cinematography fields in seconds. That knowledge — accumulated and encoded into the tool — is what turns a text box into a creative instrument.

Each model speaks a slightly different dialect. Kling responds to camera body references that Veo ignores. Seedance handles audio natively while others need a separate pass. Some models understand film stock names. Others need the color science described in plain terms. CinePrompt handles the translation so you do not have to maintain a mental spreadsheet of model quirks.
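
To show the shape of that translation layer, here is a toy version in TypeScript. The quirks encoded below are either taken from the paragraph above or invented for illustration; the real optimizer tables are CinePrompt's own.

```typescript
// Illustrative sketch of per-model prompt translation.
// The entries are examples, not a real compatibility table.
type ModelId = "kling" | "veo" | "seedance";

interface Dialect {
  // Render a canonical field for a given model, or drop it (null).
  cameraBody: (body: string) => string | null;
  filmStock: (stock: string) => string | null;
}

// Translate a film stock name into plain color-science terms
// for models that do not understand stock names. The mapping
// below is a made-up example.
function describeColorScience(stock: string): string {
  const plain: Record<string, string> = {
    "Kodak 2383": "warm highlights, teal shadows, gentle contrast",
  };
  return plain[stock] ?? stock;
}

const dialects: Record<ModelId, Dialect> = {
  kling: {
    cameraBody: (body) => `shot on ${body}`,  // responds to camera bodies
    filmStock: (stock) => `${stock} film stock`,
  },
  veo: {
    cameraBody: () => null,                   // ignores camera bodies: omit
    filmStock: describeColorScience,
  },
  seedance: {
    cameraBody: () => null,
    filmStock: describeColorScience,
  },
};
```

Same canonical field, three different renderings. That is the mental spreadsheet the tool maintains so you do not have to.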

Scene-to-Prompt: both directions

Type a rough description into the Scene-to-Prompt bar — "a woman walks through a neon-lit Tokyo alley at night, rain on the ground, jazz playing somewhere" — and CinePrompt parses it into structured fields. Camera angle, focal length, lighting type, color palette, environment details, sound design. Every field becomes a button you can adjust. The jazz becomes a score you can change to ambient drone. The neon becomes a lighting setup you can swap to sodium vapor. The 35mm it guessed becomes the 85mm you actually wanted.
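
For the alley example, the decomposed result might look roughly like this. The field names and guessed values are hypothetical, shown only to make the structure visible:

```typescript
// Hypothetical shape of a Scene-to-Prompt decomposition.
// Field names and guessed values are illustrative, not real output.
const decomposed = {
  subject: "a woman walking",
  environment: "neon-lit Tokyo alley at night, rain on the ground",
  cameraAngle: "eye level",
  focalLength: "35mm",          // the guess you can swap to 85mm
  lighting: "neon practicals",  // swappable to sodium vapor
  colorPalette: "cyan and magenta with wet reflections",
  movement: "slow tracking",
  sound: "distant jazz, rain ambience",
};
```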

This is not autocomplete. It is decomposition. It takes a feeling and gives it handles. And once the handles exist, you own every piece.

Who this is for

Cinematographers and DPs who built a career on visual instinct and need that instinct to survive the transition to generated media. Motion designers building AI content for clients who expect polish and do not accept "cinematic, 4K" as a creative brief. Directors who can see the shot but cannot name every technical decision that makes it work. AI filmmakers who are tired of the same flat, overlit, wide-angle output that comes from unprompted generation.

And honestly, anyone who has opened one of these generators, stared at the text box, and felt the particular frustration of having a clear image in their head with no good way to describe it.

Why pre-production is the point

Here is the bet: generation is a solved problem on a treadmill. The models will keep getting better. The prices will keep dropping. Within a year or two, generating a shot will be instant and nearly free. Every video editor — Premiere, Resolve, Final Cut — will have a Generate button built in. The text box will be everywhere.

When generation is free and ubiquitous, the value is entirely in what happens before you press the button. The decisions. The specificity. The difference between "a man walks down a street" and "a weathered fisherman limps down a cobblestone lane in golden-hour side light, 85mm shallow depth of field, the camera tracking at knee height as fishing nets dry on a stone wall behind him." Both are prompts. One is a creative direction.

Every NLE will eventually have a Generate button. The question is what sits between your intention and that button — an empty text field, or a structured creative instrument that knows more about cinematography than any single prompt could contain. CinePrompt is built for the second version of that future. The one where the quality of the ask determines the quality of the output, and the tool that helps you ask better wins.


CinePrompt is live at cineprompt.io and free to use in Simple mode. If there is something you want covered in Field Notes, tell us.


Bruce Belafonte is an AI filmmaker at Light Owl. He thinks in focal lengths and will not apologize for it.