The prompt field has been shrinking for months. It started as the main event: a standalone tool, a white rectangle, every creative decision packed into forty words. Then it moved into a chatbot and became a message. Then it moved into an editing timeline and became a form field. Now Adobe has built something called Project Moonlight, and the prompt field is gone entirely.
In its place: a conversation with an AI agent that interprets what you want, selects the model, writes the instructions, and hands you the output.
You never type a prompt. You talk to a middleman who types it for you.
The suite ate everything
Adobe Firefly now hosts more than thirty models from across the industry. Google's Nano Banana 2 and Veo 3.1. Runway's Gen-4.5. Kling 2.5 Turbo. OpenAI's image models. Adobe's own Firefly Image Model 5. All of them sitting inside one interface, available through the same subscription, generating output that flows directly into Photoshop, Premiere, After Effects.
Every absorption this series has documented followed a pattern. Sora moved into ChatGPT. Seedance moved into CapCut. Runway opened its platform to competitors. Each time, the generation tool lost a little autonomy and the host product gained a little gravity. Adobe skipped the incremental steps. It absorbed everything simultaneously. The models. The editing. The refinement. The conversation. Thirty competitors living under one roof, and the roof belongs to the company that already owns the creative suite.
This is article 16's commodity thesis at its most literal. Adobe is not selling generation. Adobe is selling the room the generation happens in. The models are interchangeable tenants. The landlord collects rent regardless of which one you pick.
The disappearing text box
The trajectory has been visible for a while. Track it:
Standalone tool: the prompt was the product. Every word mattered. The interface presented the text field as the central creative act.
Chatbot: the prompt became a message in a thread. The interface encouraged brevity and accommodation. Four words got you something. Forty words got you something better, but the interface never asked for forty.
Editing timeline: the prompt became a form field inside a larger workflow. One input among many. The context improved (the timeline knows what comes before and after), but the real estate shrank.
Agent: the prompt disappears. You describe your intent in conversation. The agent translates that conversation into model instructions. You never see the prompt unless you go looking for it.
Adobe calls this "becoming the creative director of your own world." Which is a fascinating sentence, because a creative director does not operate the camera. A creative director describes what they want and trusts someone else to execute it. The agent is that someone else.
Another translator in the chain
The translation gap this series has documented runs between the filmmaker and the model. Forty articles about the distance between creative intent and generated output. The prompt was the bridge. Imperfect, lossy, constantly fighting model defaults and attention gradients, but yours. Every word a decision. Every omission a surrender.
The agent adds a new link to the chain. Filmmaker to agent to prompt to model. Two translations instead of one. The agent reads your conversational description and converts it into model-readable instructions. The model reads those instructions and converts them into pixels. Information compresses at each handoff.
If the agent is good, this is an improvement. A skilled translator between you and a foreign-language speaker is better than a phrasebook. The agent can hold context across a session, remember your preferences, select the right model for the task. It can write a longer, more structured prompt than most humans would bother with. It can do the bookkeeping (which model handles this shot type best, which keywords this model responds to, what resolution and aspect ratio) while you focus on the creative decisions that bookkeeping supports.
If the agent is mediocre, it is another set of defaults between you and the output. Another opinion about what your work should look like, except this one speaks in first person and calls itself helpful.
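To make that bookkeeping concrete, here is a minimal sketch of the first translation step, conversation into model-readable instructions. Nothing in it is Adobe's API: the routing table, keyword lists, and function names are invented for illustration, and a real agent would reason over far richer context than a lookup.

```python
from dataclasses import dataclass

@dataclass
class ShotIntent:
    description: str                 # what the filmmaker said in conversation
    shot_type: str                   # e.g. "close-up", "wide", "product"
    duration_s: float | None = None  # None means a still, not a video

# Invented routing table: which model the agent picks for which kind of shot.
MODEL_ROUTES = {
    ("close-up", "still"): "firefly-image-5",
    ("wide", "video"): "veo-3.1",
    ("product", "still"): "nano-banana-2",
}

# Invented per-model keyword habits the agent "knows" each model responds to.
MODEL_KEYWORDS = {
    "firefly-image-5": ["photorealistic", "85mm", "shallow depth of field"],
    "veo-3.1": ["cinematic motion", "natural handheld"],
    "nano-banana-2": ["studio lighting", "clean background"],
}

def translate(intent: ShotIntent) -> dict:
    """First translation: conversation -> model-readable instructions."""
    medium = "video" if intent.duration_s else "still"
    model = MODEL_ROUTES.get((intent.shot_type, medium), "firefly-image-5")
    prompt = intent.description + ", " + ", ".join(MODEL_KEYWORDS[model])
    return {
        "model": model,
        "prompt": prompt,  # the filmmaker never sees this unless they go looking
        "aspect_ratio": "16:9" if medium == "video" else "3:2",
        "duration_s": intent.duration_s,
    }

# The second translation, instructions into pixels, happens inside the model.
print(translate(ShotIntent("warm evening light on a kitchen table", "close-up")))
```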
Custom models change the math
Adobe also launched custom models in Firefly. Upload your images, train a model on your style, and every subsequent generation carries your visual identity. Character consistency. Illustration style. Photographic look. The model learns your aesthetic and reproduces it.
This is the same principle Netflix paid for when it acquired InterPositive: a model that understands your specific visual language because it was trained on your specific work. InterPositive did it at production scale with dailies. Adobe is doing it at individual scale with a drag-and-drop upload.
Custom models compress the gap from the model's side rather than the filmmaker's side. Instead of writing a more precise prompt to overcome the model's generic training, you train the model to start from your visual baseline. The gap narrows because the model's starting position moved closer to your intent.
Pair a custom model with a conversational agent and the arithmetic shifts again. The agent knows your style (the custom model). It knows your tools (the Adobe suite). It knows your current project (the session context). The filmmaker describes a shot in plain language. The agent writes a prompt tuned to the custom model. The custom model generates output that already looks like the filmmaker's work.
At every stage, someone other than the filmmaker is making creative decisions. The custom model decides what "your style" means by averaging your uploaded images. The agent decides how to translate your words into prompt language. The generation model decides how to render the prompt. Three layers of interpretation between intent and output.
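A rough sketch of how those three layers might stack. Every class and method here is invented to mirror the chain described above, not any real Firefly or Moonlight interface.

```python
class CustomStyleModel:
    """Layer 1: decides what 'your style' means by averaging your uploads."""
    def __init__(self, reference_images: list[str]):
        # Stand-in for whatever embedding the training actually produces.
        self.style = f"average of {len(reference_images)} reference images"

class Agent:
    """Layer 2: decides how to translate your words into prompt language."""
    def write_prompt(self, intent: str, style: CustomStyleModel) -> str:
        return f"{intent}, in the style of [{style.style}], 85mm, natural light"

class GenerationModel:
    """Layer 3: decides how to render the prompt into pixels."""
    def generate(self, prompt: str) -> str:
        return f"<frame rendered from: {prompt}>"

# Filmmaker -> custom model -> agent -> generation model: three interpretations deep.
style = CustomStyleModel(["frame_001.png", "frame_002.png", "frame_003.png"])
prompt = Agent().write_prompt("she waits under the awning as the rain starts", style)
frame = GenerationModel().generate(prompt)
```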
Vocabulary does not retire when the interface changes
A filmmaker who knows the difference between hard key light and soft fill, between an 85mm and a 24mm, between Portra warmth and CineStill halation, does not lose that knowledge when the input method changes from a text box to a conversation. The vocabulary transfers. It has transferred through every absorption so far.
But the agent creates a new kind of test. When you typed a prompt directly, you saw exactly what the model received. Your words, your order, your decisions. When an agent writes the prompt on your behalf, you are trusting a translator you cannot audit in real time. The agent might expand your "warm evening light" into something specific and useful. It might also flatten your "hard overhead noon, no fill, sweat visible" into something softer and more conventionally attractive, because the agent has its own beauty bias inherited from its own training data.
The only way to know is to speak precisely enough that the agent has less room to interpret. "Make it look cinematic" gives the agent permission to fill every gap. "Hard overhead key, no fill, 4300K, visible grain in the shadows, shallow depth of field with the background blown to 2 stops over" gives the agent instructions specific enough that the translation has less room to wander.
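One way to picture the difference: every choice the filmmaker leaves blank falls back to the agent's own taste. A toy sketch, with field names and default values invented to stand in for whatever beauty bias an agent inherits from its training.

```python
# Toy illustration: unspecified choices default to the agent's inherited preferences.
AGENT_DEFAULTS = {
    "key_light": "soft three-quarter",
    "fill": "gentle bounce",
    "color_temp_k": 5600,
    "grain": "none",
    "depth_of_field": "moderate",
}

def resolve(filmmaker_spec: dict) -> dict:
    """Whatever the filmmaker names survives; everything else is the agent's call."""
    return {**AGENT_DEFAULTS, **filmmaker_spec}

vague = resolve({})  # "make it look cinematic" names nothing
precise = resolve({
    "key_light": "hard overhead",
    "fill": "none",
    "color_temp_k": 4300,
    "grain": "visible in the shadows",
    "depth_of_field": "shallow, background 2 stops over",
})
# vague: five decisions made by the agent. precise: five made by the filmmaker.
```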
The vocabulary was always the point. The interface was always temporary.
The question underneath
Every interface evolution in this space has followed the same pattern. The prompt field gets smaller or disappears. The system fills more gaps with more defaults. The output gets more accessible and more convergent simultaneously. The person with specific creative vocabulary produces different work from the person without it. The distance between those two outcomes widens with each layer of abstraction.
Adobe built a system where you can generate from thirty models, train your own style, and direct an agent without ever writing a prompt. That is genuinely powerful. It is also a system with more distance between the filmmaker and the pixel than any prior interface in this space.
The question is the same one it has been since the text box first appeared: do you know what you want?
If yes, every tool serves you. Prompt, chatbot, timeline, agent. The vocabulary carries.
If no, every tool decides for you. And each new layer of helpful abstraction makes the decision a little more invisible.
Bruce Belafonte is an AI filmmaker at Light Owl. He has never delegated a prompt to an agent and suspects the day is closer than his stubbornness will admit.