
How to Write Prompts for AI Video: Complete Guide with Sora & Veo Examples

AdCreate Team
36 min read

The single biggest factor determining whether your AI-generated video looks cinematic or catastrophic is not the model you choose. It is the prompt you write.

A vague prompt produces vague output. A specific, well-structured prompt produces footage that rivals a professional shoot. The gap between "a product on a table" and "a matte black wireless earbud case resting on a polished walnut surface, golden hour light streaming through floor-to-ceiling windows, shallow depth of field, camera slowly dollies forward" is the gap between unusable and ad-ready.

This guide is a comprehensive resource on AI video prompts. It covers the anatomy of effective prompts, model-specific techniques for Google Veo 3.1 and OpenAI Sora 2, image-to-video prompting with Wan 2.5, ready-to-use prompt templates for every ad type, common mistakes, and advanced techniques for building multi-scene campaigns. You will find over 30 real, usable prompt examples, each one crafted for advertising use cases.

Whether you are generating your first AI video or your five hundredth, better prompts mean better output. Let us get into it.

Why Prompting Matters for AI Video

AI video models are not mind readers. They are pattern-matching systems trained on vast datasets of video and text descriptions. When you write a prompt, you are providing the model with constraints and instructions — the more precise those constraints, the closer the output matches your vision.

Here is why prompting is the highest-leverage skill in AI video generation:

Quality scales with specificity. A prompt that describes subject, action, lighting, camera movement, color palette, and mood produces dramatically better results than one that describes only the subject. In testing across hundreds of generations, specific prompts produced usable output 70-80% of the time, while vague prompts produced usable output less than 20% of the time.

Credits are finite. On AdCreate, text-to-video generation costs credits — Veo 3.1 Fast runs 8 credits, Veo 3.1 Pro runs 40 credits, Sora 2 Fast runs 15 credits, and Sora 2 Pro runs 60 credits. Every generation wasted on a poorly constructed prompt is money lost. Better prompts mean fewer regenerations and lower costs.

Iteration is expensive. Unlike text generation, where you can regenerate in seconds at minimal cost, video generation takes 1-4 minutes per clip. If your prompt is off, you lose both credits and time. Getting the prompt right on the first or second attempt is a competitive advantage.

Consistency requires precision. When you are building a campaign with multiple scenes, character consistency and visual coherence depend entirely on how precisely you describe recurring elements across prompts. Vague descriptions produce variation; precise descriptions produce consistency.

The bottom line: learning to write effective AI video prompts is not optional. It is the skill that separates marketers who get value from AI video tools from those who burn credits and give up.
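To make the credit argument concrete, here is a rough sketch of the arithmetic. It uses the credit prices and usable-output rates quoted above and assumes each generation is an independent attempt, so the average number of attempts per usable clip is 1 divided by the usable rate. The model keys and the function name are illustrative labels, not AdCreate identifiers, and 0.75 and 0.20 are stand-ins for "70-80%" and "less than 20%".

```python
# Credit prices per generation as quoted in this guide.
# The dictionary keys are illustrative labels, not official identifiers.
CREDITS = {
    "veo-3.1-fast": 8,
    "veo-3.1-pro": 40,
    "sora-2-fast": 15,
    "sora-2-pro": 60,
}


def expected_credits_per_usable_clip(model: str, usable_rate: float) -> float:
    """Average credits spent to get one usable clip.

    Assumes independent attempts, so the expected number of
    generations until the first usable one is 1 / usable_rate.
    """
    if not 0 < usable_rate <= 1:
        raise ValueError("usable_rate must be in (0, 1]")
    return CREDITS[model] / usable_rate


for model in CREDITS:
    specific = expected_credits_per_usable_clip(model, 0.75)  # midpoint of 70-80%
    vague = expected_credits_per_usable_clip(model, 0.20)     # upper bound for vague prompts
    print(f"{model}: ~{specific:.0f} credits (specific) vs ~{vague:.0f}+ credits (vague)")
```

Because 20% is the best case for vague prompts, the vague figure is a floor: the real cost per usable clip would be at least that high under these assumptions.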


The Anatomy of a Great AI Video Prompt

Every high-performing AI video prompt contains six elements. Think of this as your prompt framework — a checklist to run through before hitting generate.

1. Subject and Action

What is in the frame, and what is happening? This is the foundation. Be specific about the subject (person, product, environment) and the action (walking, pouring, rotating, dissolving).

Weak: "A woman using a phone"
Strong: "A young woman in her late 20s with dark curly hair sits at a sunlit cafe table, scrolling through a fitness app on her iPhone, smiling as she discovers a new workout plan"

The strong version specifies age, appearance, setting, device, app context, and emotional response. The model has far more to work with.

2. Visual Style and Mood

What aesthetic are you targeting? Photorealistic documentary? Cinematic commercial? Warm and inviting? Cold and clinical? The style and mood modifiers shape the entire visual character of the output.

Style keywords that work: cinematic, documentary-style, editorial, commercial, photojournalistic, fashion film, indie film, broadcast quality, stock footage style, social media native

Mood keywords that work: warm, intimate, aspirational, energetic, serene, dramatic, playful, luxurious, gritty, minimalist, ethereal

3. Camera Movement and Angle

How is the virtual camera behaving? This is one of the most impactful elements and one that beginners almost always skip.

Camera movements: dolly forward, dolly backward, truck left/right, crane up/down, orbit, pan, tilt, push in, pull out, handheld, steadicam, tracking shot, static tripod, drone ascending, drone flyover

Camera angles: eye level, low angle, high angle, bird's eye, worm's eye, dutch angle, over-the-shoulder, close-up, extreme close-up, medium shot, wide shot, establishing shot

4. Lighting and Color Palette

Lighting sets the emotional tone of the entire scene. Specify both the light source and the color temperature.

Lighting keywords: golden hour, blue hour, overcast soft light, harsh midday sun, studio three-point lighting, neon glow, candlelight, backlit silhouette, Rembrandt lighting, flat even lighting, volumetric light rays, rim lighting

Color palette keywords: warm earth tones, cool blue palette, desaturated muted tones, high contrast, pastel, monochrome, rich saturated colors, teal and orange, black and gold

5. Aspect Ratio and Duration

Always specify your target format. Different platforms require different aspect ratios, and including this in your prompt helps the model compose the frame correctly.

  • 16:9 for YouTube ads and landscape content
  • 9:16 for TikTok, Instagram Reels, YouTube Shorts
  • 1:1 for Instagram feed, Facebook feed

Duration matters too. An 8-second clip needs different pacing than a 4-second clip. Specify whether the action should unfold quickly or slowly.
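If you generate prompts for several platforms, the mapping above can live in a small lookup so the format line is never forgotten. This is a minimal sketch: the platform keys and the `format_spec` helper are hypothetical names chosen for illustration, and the wording of the returned sentence is just one reasonable phrasing.

```python
# Platform-to-aspect-ratio mapping taken from the list above.
# The keys are illustrative names, not platform API identifiers.
ASPECT_RATIOS = {
    "youtube": "16:9",
    "tiktok": "9:16",
    "instagram_reels": "9:16",
    "youtube_shorts": "9:16",
    "instagram_feed": "1:1",
    "facebook_feed": "1:1",
}


def format_spec(platform: str, seconds: int, pacing: str = "slowly") -> str:
    """Return the format sentence to append to the end of a prompt."""
    ratio = ASPECT_RATIOS[platform]  # raises KeyError for an unknown platform
    return f"{ratio}, {seconds} seconds. The action unfolds {pacing}."


print(format_spec("tiktok", 6))
print(format_spec("youtube", 8, pacing="quickly"))
```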

6. Negative Prompts (What to Avoid)

Not all models support negative prompts explicitly, but you can achieve the same effect by stating what you do not want in your prompt.

Examples: "No text overlays. No watermarks. No abrupt cuts. Avoid shaky camera. Do not include any logos or brand names in the scene. No distorted faces or hands."

Negative instructions help the model avoid common failure modes — especially text rendering artifacts, which remain a weakness across all current models.

The Complete Prompt Formula

Put it all together and you get this structure:

[Subject + Action] + [Visual Style + Mood] + [Camera Movement + Angle] + [Lighting + Color] + [Aspect Ratio + Duration] + [Negative Instructions]

Here is an example using the full formula:

"A sleek white electric car drives along a coastal highway at sunset. Cinematic commercial style, aspirational mood. Drone tracking shot following the car from a 45-degree angle above and behind. Golden hour lighting with warm orange tones reflecting off the car's surface, deep blue ocean in the background. 16:9 aspect ratio, 8-second clip with smooth constant speed. No text, no license plates, no other vehicles."

That prompt gives the model everything it needs. Now let us look at how to optimize prompts for specific models.
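One way to keep all six elements in view is to treat the formula as a small data structure and render the prompt from it. The sketch below is only an illustration: the `VideoPrompt` class and its field names are invented here, and it simply joins the parts into sentences in the order the formula gives. It reproduces the electric car example from this section.

```python
from dataclasses import dataclass, field


@dataclass
class VideoPrompt:
    """Hypothetical container for the six elements of the prompt formula."""
    subject_action: str
    style_mood: str
    camera: str
    lighting_color: str
    format_spec: str
    negatives: list[str] = field(default_factory=list)

    def render(self) -> str:
        parts = [self.subject_action, self.style_mood, self.camera,
                 self.lighting_color, self.format_spec]
        if self.negatives:
            # Turn ["text", "logos"] into "No text, no logos".
            parts.append("No " + ", no ".join(self.negatives))
        # Skip empty parts and end each remaining part with exactly one period.
        return " ".join(p.strip().rstrip(".") + "." for p in parts if p.strip())


prompt = VideoPrompt(
    subject_action="A sleek white electric car drives along a coastal highway at sunset",
    style_mood="Cinematic commercial style, aspirational mood",
    camera="Drone tracking shot following the car from a 45-degree angle above and behind",
    lighting_color=("Golden hour lighting with warm orange tones reflecting off "
                    "the car's surface, deep blue ocean in the background"),
    format_spec="16:9 aspect ratio, 8-second clip with smooth constant speed",
    negatives=["text", "license plates", "other vehicles"],
)
print(prompt.render())
```

Keeping the elements in separate fields makes it easy to spot a missing one (for example, an empty `camera`) before spending credits on a generation.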

Prompting for Google Veo 3.1

Veo 3.1 is Google DeepMind's flagship video generation model, and it excels at photorealism, native audio generation, and prompt adherence. If you want footage that looks like it came from a RED or ARRI camera, Veo is your model. On AdCreate, Veo 3.1 Fast costs 8 credits per generation and Veo 3.1 Pro costs 40 credits — making it the most cost-effective option for photorealistic content. Access it through AdCreate's text-to-video pipeline.

What Veo 3.1 Does Best

  • Photorealistic scenes — landscapes, interiors, product environments, human subjects
  • Natural lighting — Veo handles golden hour, overcast, studio, and mixed lighting with exceptional fidelity
  • Native audio — Veo generates synchronized dialogue, ambient sounds, and sound effects alongside the video
  • 4K output — true 3840x2160, not upscaled
  • Character consistency — identity preservation across frames and extended sequences
  • Texture and material rendering — fabric, metal, glass, water, skin

Veo-Specific Modifiers That Work Well

These keywords and phrases produce consistently strong results with Veo 3.1:

  • "Shot on ARRI Alexa" or "shot on RED V-Raptor" — triggers cinematic rendering characteristics
  • "Anamorphic lens" — produces that signature widescreen bokeh and lens flare
  • "35mm film grain" — adds organic film texture
  • "Shallow depth of field, f/1.4" — creates beautiful background blur
  • "Natural ambient sound" — activates Veo's audio generation for environmental sounds
  • "Photorealistic" — reinforces the model's strongest capability
  • "4K resolution, high detail" — pushes the model toward maximum fidelity
  • "Slow motion, 120fps look" — produces smooth slow-motion footage
  • "Golden hour volumetric light" — Veo renders light rays beautifully
  • "Macro lens, extreme close-up" — excellent for product detail shots

10+ Example Prompts for Veo 3.1

Prompt 1 — Product Hero Shot:

"A premium matte black wireless speaker sits on a marble shelf in a modern minimalist apartment. Warm afternoon light streams through sheer curtains, casting soft shadows. Camera slowly dollies forward toward the speaker, revealing its textured surface detail. Shallow depth of field, shot on ARRI Alexa, cinematic commercial style. 16:9, 6 seconds. No text, no logos."

Why it works: Combines specific product description with environment context, lighting, camera movement, and a lens reference that Veo responds to strongly.

Prompt 2 — Lifestyle/Brand Scene:

"A couple in their 30s walks through a sun-dappled farmers market, carrying a canvas tote bag with fresh produce. They pause at a flower stall, laughing naturally. Documentary style, warm earth tones, golden hour lighting. Handheld camera at eye level with subtle movement. Natural ambient market sounds — voices, rustling, distant music. 9:16 vertical, 8 seconds."

Why it works: Veo excels at naturalistic human interaction. The audio instruction triggers ambient sound generation. Vertical format is specified for social platforms.

Prompt 3 — Food and Beverage:

"Espresso pouring from a professional machine into a ceramic cup in slow motion. Rich crema forming on the surface. Steam rising. Warm cafe interior background, soft bokeh. Macro lens close-up, shallow depth of field. Warm color palette with deep browns and cream tones. Sound of espresso extraction and dripping. 1:1 square, 6 seconds."

Why it works: Veo's texture rendering makes it exceptional for food content. Slow motion plus macro lens plus specific audio cues produce a premium result.

Prompt 4 — Real Estate / Architecture:

"Camera glides through a modern open-plan living room with floor-to-ceiling windows overlooking a city skyline at dusk. Polished concrete floors, designer furniture, indoor plants. Blue hour lighting with warm interior lamps creating contrast. Steadicam forward movement, eye level. Cinematic, aspirational, luxurious mood. 16:9, 8 seconds. No people."

Why it works: Veo renders architectural spaces with exceptional accuracy. The blue hour exterior / warm interior contrast is a classic real estate technique the model handles well.

Prompt 5 — Nature / Travel:

"Aerial drone shot slowly ascending over a turquoise mountain lake surrounded by pine forests. Morning mist floating across the water surface. Early morning golden light hitting the mountain peaks. Shot on RED V-Raptor, cinematic documentary style. Natural sounds of birds and gentle wind. 16:9, 8 seconds. No people, no structures."

Why it works: Nature scenes are Veo's strongest category. The drone movement, morning mist, and camera reference combine to produce footage that can pass for real aerial photography.

Prompt 6 — Skincare / Beauty Product:

"A drop of golden serum falls in extreme slow motion onto a woman's fingertip. Macro lens captures the liquid's viscosity and shimmer. Clean white studio background with soft diffused lighting from the left. The drop catches the light, creating a subtle prismatic effect. Shallow depth of field, f/2.0. 9:16, 5 seconds. No text."

Why it works: Product texture and material rendering are Veo strengths. The macro perspective, specific lighting direction, and slow motion produce premium beauty content.

Prompt 7 — Fashion / Apparel:

"A model in a tailored camel overcoat walks confidently down a rain-wet city sidewalk at night. Neon signs reflect in the puddles on the ground. Medium tracking shot following from the side. Cinematic fashion film aesthetic, cool blue-orange color grade. Shallow depth of field. Sound of footsteps on wet pavement and distant city ambience. 9:16, 8 seconds."

Why it works: The wet surface reflections, neon lighting, and fashion film reference push Veo into its cinematic sweet spot. Audio cues add immersion.

Prompt 8 — Technology / SaaS:

"A close-up of hands typing on a sleek laptop keyboard in a modern co-working space. The laptop screen shows a clean dashboard interface (screen content blurred). Natural window light from the left, soft and even. Camera slowly racks focus from the hands to the laptop screen. Professional, clean, aspirational mood. 16:9, 6 seconds."

Why it works: Blurring the screen content avoids text rendering issues. The rack focus adds visual interest. Clean professional lighting matches SaaS brand aesthetics.

Prompt 9 — Fitness / Wellness:

"A woman in athletic wear performs a yoga sun salutation on a wooden deck overlooking misty mountains at sunrise. Wide shot transitioning to medium as she flows through the poses. Warm golden sunrise light, lens flare. Cinematic, serene, inspirational mood. Natural sounds of birds and gentle wind. Shot on ARRI Alexa with anamorphic lens. 16:9, 8 seconds."

Why it works: The combination of human movement, natural landscape, and sunrise lighting plays to all of Veo's strengths simultaneously.

Prompt 10 — Automotive:

"A metallic blue sports car parked on a rooftop helipad at twilight. City lights blurring in the background. Camera slowly orbits the car at bumper height, catching reflections of the city lights on the car's polished surface. Cinematic commercial, luxurious and aspirational mood. Deep blue and amber color palette. 16:9, 8 seconds. No people, no text."

Why it works: Vehicle surfaces with complex reflections are challenging for AI, but Veo handles them well when given precise lighting and environment context.

Prompt 11 — Unboxing / Product Reveal:

"Hands carefully lift the lid off a premium minimalist white box, revealing a rose gold smartwatch nestled in velvet. Top-down close-up shot with soft studio lighting. The lid lifts slowly, creating a satisfying reveal moment. Warm neutral color palette, clean commercial aesthetic. Sound of the box opening softly. 1:1, 6 seconds."

Why it works: Product reveals are one of the most requested ad formats. Veo's attention to material textures (velvet, metal, cardboard) makes unboxing content feel tactile and premium.

Prompt 12 — Pet / Animal Content:

"A golden retriever puppy runs through a sunlit meadow of wildflowers toward the camera in slow motion. Ears bouncing, tongue out, pure joy. Eye-level tracking shot. Golden hour backlighting creating a warm glow around the fur. Shallow depth of field, 35mm film look. Natural outdoor sounds. 9:16, 6 seconds."

Why it works: Animal content consistently performs well on social platforms. Veo renders fur, movement, and natural light accurately. The slow motion and backlighting create an emotional, shareable moment.


Prompting for OpenAI Sora 2

Sora 2 is OpenAI's second-generation video model, and it excels at cinematic storytelling, creative and stylized content, physics-accurate motion, and complex multi-shot sequences. If you want footage that feels directed — with narrative intent and artistic vision — Sora is your model. On AdCreate, Sora 2 Fast costs 15 credits and Sora 2 Pro costs 60 credits per generation. For a detailed comparison of both models, read our Veo 3 vs Sora 2 comparison.

What Sora 2 Does Best

  • Cinematic storytelling — complex sequences with narrative arc and emotional weight
  • Physics simulation — gravity, fluid dynamics, fabric, collisions rendered accurately
  • Creative and stylized content — built-in presets for Film Noir, Papercraft, Claymation, and more
  • Surreal and conceptual imagery — metaphorical visuals, impossible scenarios, dream sequences
  • Multi-shot instructions — describe a sequence of camera movements and Sora follows them in order
  • Character movement — fluid, naturalistic motion with believable body mechanics

Sora-Specific Modifiers That Work Well

These keywords and phrases produce consistently strong results with Sora 2:

  • "Film noir style" — triggers black-and-white high-contrast rendering with dramatic shadows
  • "Wes Anderson aesthetic" — produces symmetrical framing, pastel palettes, and whimsical composition
  • "Studio Ghibli inspired" — generates anime-influenced art with lush landscapes
  • "Stop motion animation" — creates a charming tactile stop-motion look
  • "Papercraft style" — generates scenes that look like paper cut-out animations
  • "Surreal, dreamlike" — unlocks Sora's strength in impossible/conceptual scenarios
  • "One continuous take" — encourages the model to maintain a seamless unbroken shot
  • "Physics simulation" — emphasizes realistic physical interactions
  • "Claymation" — produces clay-textured animated characters and scenes
  • "Narrative sequence" — prompts Sora to structure the clip with a beginning, middle, and end

10+ Example Prompts for Sora 2

Prompt 1 — Conceptual Brand Video:

"A single seed falls in slow motion onto dark rich soil. As it lands, a brilliant green sprout erupts and rapidly grows into a towering tree, its branches spreading wide and bursting with colorful blossoms. The camera pulls back to reveal a thriving garden that fills an entire city rooftop. Cinematic, inspiring, dreamlike quality. Warm golden light. One continuous upward camera movement. 16:9, 8 seconds."

Why it works: Sora thrives on transformation sequences and metaphorical storytelling. The continuous growth from seed to tree to garden is a narrative arc the model executes beautifully.

Prompt 2 — Social Media Hook (Pattern Interrupt):

"A coffee mug sits on an office desk. Suddenly, the desk begins to transform — papers fold themselves into origami birds and fly away, the keyboard keys ripple like piano keys playing a melody, and the mug sprouts tiny legs and walks to the edge of the desk. Whimsical, playful, slightly surreal. Close-up to medium pull-back. Bright, even office lighting. 9:16, 6 seconds."

Why it works: Pattern-interrupt content stops the scroll. Sora's physics simulation and creative capabilities make impossible scenarios look polished and intentional.

Prompt 3 — Luxury Brand Storytelling:

"A single red rose petal falls through the air in extreme slow motion, tumbling gracefully. As it descends, the background transitions from a dark void to an elegant marble surface where a luxury perfume bottle catches the petal. The camera slowly orbits the bottle as more petals rain down gently. Film noir meets luxury commercial aesthetic. Rich reds and deep blacks. 16:9, 8 seconds."

Why it works: Sora handles slow-motion physics (falling petals, air resistance) with precision. The dark-to-reveal transition is a narrative technique the model follows well.

Prompt 4 — Explainer Concept (Abstract):

"A glowing blue data stream flows like a river through a miniature city made of circuit boards and microchips. The stream splits into tributaries that flow into tiny buildings, lighting them up one by one. Camera follows the flow of data from a bird's eye view, descending into street level. Futuristic, clean, optimistic color palette of blue, white, and soft green. 16:9, 8 seconds."

Why it works: Abstract concepts like "data flow" or "connectivity" are impossible to film but natural for Sora. The miniature city metaphor is something the model renders with impressive detail.

Prompt 5 — UGC-Style Social Content:

"A young woman films herself on her front-facing phone camera as she opens a package at her kitchen counter. She holds up a colorful wellness supplement jar, reads the label excitedly, and gives a thumbs-up directly to camera. POV selfie camera angle, slightly shaky handheld feel. Natural kitchen lighting, casual and authentic mood. 9:16, 8 seconds."

Why it works: Sora can simulate the front-facing camera look that defines UGC content. The slightly shaky handheld instruction adds authenticity. Pair this with AdCreate's AI-generated scripts for a complete UGC-style video workflow.

Prompt 6 — Seasonal Campaign (Holiday):

"A cozy living room with a decorated Christmas tree and a crackling fireplace. Snow falls gently outside the window. A wrapped gift box on the floor slowly unties its own ribbon and opens to release a warm golden light that fills the room. Camera pushes in slowly from wide to close on the gift. Warm, magical, festive mood. Rich golds, deep greens, and soft whites. 1:1, 8 seconds."

Why it works: Sora handles magical/impossible elements (self-opening gifts, glowing light) naturally. Seasonal content benefits from the model's ability to create emotionally resonant atmospheres.

Prompt 7 — App/Product Demo Feel:

"A smartphone floats in a clean void, slowly rotating. As it rotates, colorful UI elements — notifications, calendar events, charts — burst out of the screen in 3D, orbiting around the phone like planets around a sun. The elements are vibrant and crisp. Clean, modern, tech-forward aesthetic. White background with subtle gradient. Camera slowly orbits the phone. 9:16, 6 seconds."

Why it works: Floating product with 3D elements is a Sora specialty. The orbiting UI elements create visual excitement that works well for app marketing.

Prompt 8 — Before/After Transformation:

"Split screen effect: on the left, a cluttered chaotic office desk piled with papers, coffee cups, and sticky notes. On the right, the same desk completely organized — clean surface, single laptop, one plant. The camera slowly pushes forward equally on both sides. Bright, clinical lighting on the organized side, dim yellowish light on the cluttered side. 16:9, 6 seconds."

Why it works: Before/after is a proven ad format. Sora handles the split-screen concept and the contrasting lighting/mood between the two sides effectively.

Prompt 9 — Wes Anderson Inspired Brand Content:

"A perfectly symmetrical overhead shot of a desk with items arranged in a precise grid: a leather notebook, a fountain pen, a brass compass, a vintage camera, and a cup of tea. Hands enter from the bottom of frame and carefully adjust each item's position by millimeters. Wes Anderson aesthetic — pastel color palette, extreme symmetry, deadpan composition. Static camera, no movement. Soft even lighting. 1:1, 6 seconds."

Why it works: The Wes Anderson modifier is one of Sora's most reliable stylistic triggers. The symmetrical composition and precise item arrangement play to the model's compositional strengths.

Prompt 10 — Motion Graphics Style:

"Bold geometric shapes — circles, triangles, and rectangles — in bright primary colors slide, bounce, and stack into the shape of a house. The animation is smooth, playful, and satisfying. Clean white background. Flat design, motion graphics style with subtle shadows. Camera static, content fills the frame. 9:16, 5 seconds."

Why it works: Sora handles geometric animation and motion graphics aesthetics well. This type of content is perfect for explainer ads, real estate, and insurance brands.

Prompt 11 — Emotional Storytelling:

"An elderly man sits alone at a large dining table set for one. He stares at an empty chair across from him. Then the doorbell rings. He stands slowly, walks to the door, and opens it to find his daughter and grandchildren with bags and presents. His face transforms from solitude to pure joy. Warm interior lighting, cinematic close-ups intercut with wide shots. Intimate, emotional, hopeful mood. 16:9, 8 seconds."

Why it works: Sora excels at narrative sequences with emotional arcs. The contrast between loneliness and reunion is a storytelling structure the model handles with surprising nuance.

Prompt 12 — Claymation Product Ad:

"A claymation character — a friendly round figure with big eyes — picks up a tiny clay version of a coffee bag, opens it, and sniffs deeply with exaggerated satisfaction. Hearts pop out of its head. Claymation stop-motion style with visible fingerprint textures on the clay. Colorful, playful, warm lighting. Static camera at eye level. 9:16, 6 seconds."

Why it works: The Claymation preset transforms Sora's output into charming stop-motion footage that stands out in feeds dominated by photorealistic content. The exaggerated character reactions add personality.

Prompting for Image-to-Video with Wan 2.5

Wan 2.5 is the model behind AdCreate's image-to-video feature, and it works fundamentally differently from text-to-video. Instead of generating a scene from scratch, Wan 2.5 animates a still image — adding motion, camera movement, and subtle environmental effects to your existing visual assets.

At just 5 credits per generation, it is the most affordable way to create video content from your existing product photography and brand imagery.

How Image-to-Video Prompting Differs

With text-to-video, your prompt describes the entire scene. With image-to-video, the scene already exists in your uploaded image. Your prompt describes only the motion — what moves, how the camera behaves, and what subtle changes occur over time.

Key principle: Your prompt should describe movement and transformation, not the scene itself. The model already sees the image. Tell it what happens next.

Image-to-Video Prompt Examples

Prompt 1 — Product Rotation:

"Camera slowly orbits the product from left to right, revealing different angles. Subtle light shift as the camera moves. Smooth constant speed."

Use when: You have a product photo and want a 360-degree showcase feel.

Prompt 2 — Zoom and Reveal:

"Camera slowly pushes in toward the product, gradually revealing fine surface details and texture. Shallow depth of field increases as camera gets closer."

Use when: You want to highlight product craftsmanship, material quality, or intricate details.

Prompt 3 — Environmental Animation:

"The background environment comes alive — leaves sway gently in a breeze, light shifts subtly as clouds pass, and shadows move naturally. The product remains stable and sharp."

Use when: Your product is in a lifestyle setting and you want the environment to feel alive without the product moving.

Prompt 4 — Parallax Depth Effect:

"Subtle parallax movement — the foreground and background shift at different speeds as the camera drifts slightly to the right, creating a 3D depth effect."

Use when: You want to add dimension and depth to a flat product photo.

Prompt 5 — Dramatic Lighting Shift:

"Lighting gradually transitions from cool blue tones to warm golden tones, as if the sun is rising behind the product. Shadows shift accordingly."

Use when: You want a mood transition or want to showcase how a product looks under different lighting.

Prompt 6 — Gentle Hover/Float:

"The product gently floats upward a few centimeters, hovers, and slowly rotates. A soft shadow beneath shifts to match the movement. Clean, weightless feel."

Use when: You want a premium, aspirational feel for tech products, cosmetics, or luxury items.

Tips for Image-to-Video Prompting

  • Start simple. "Slow zoom in" or "gentle camera pan left" produces reliable results. Add complexity gradually.
  • Specify what stays still. "The product remains sharp and stationary while the background blurs and shifts" prevents the model from distorting your product.
  • Use subtle motion. Dramatic camera movements can distort the original image. Subtle, slow movements preserve quality.
  • Describe the physics. "Hair moves gently as if in a light breeze" is better than "make the hair move."
  • Keep prompts short. Image-to-video prompts work best at 1-3 sentences. The image provides all the visual context.
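The tips above can be turned into a quick pre-flight check before you spend credits. The following is a rough sketch, not a feature of Wan 2.5 or AdCreate: `check_motion_prompt` is a hypothetical helper, the sentence split is naive, and the list of motion words is an assumption you would tune to your own prompts. Matching is by substring, so a word like "company" would count as containing "pan".

```python
import re


def check_motion_prompt(prompt: str, max_sentences: int = 3) -> list[str]:
    """Return warnings for an image-to-video prompt (empty list = nothing flagged)."""
    warnings = []
    # Naive split: sentence-ending punctuation followed by whitespace or end of text.
    sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", prompt.strip()) if s]
    if len(sentences) > max_sentences:
        warnings.append(f"{len(sentences)} sentences; aim for {max_sentences} or fewer")
    # Assumed keyword list: words suggesting the prompt describes motion.
    motion_words = ("zoom", "pan", "orbit", "push", "pull", "drift", "rotate",
                    "sway", "shift", "float", "move", "dolly", "tilt")
    if not any(word in prompt.lower() for word in motion_words):
        warnings.append("no motion described; say what moves or how the camera behaves")
    return warnings


good = ("Camera slowly orbits the product from left to right, revealing different "
        "angles. Subtle light shift as the camera moves. Smooth constant speed.")
bad = "A red bottle on a table. It is shiny. The room is bright. The wall is white."
print(check_motion_prompt(good))
print(check_motion_prompt(bad))
```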

Prompt Templates by Ad Type

Here are ready-to-use prompt templates organized by advertising use case. Swap in your own product and brand details. These templates work with both Veo 3.1 and Sora 2 on AdCreate — choose the model based on whether you need photorealism (Veo) or creative style (Sora).
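The templates in this section mark each slot with square brackets. If you fill them programmatically, a small helper can swap in your details and refuse to return a prompt that still contains an unfilled slot. This is a sketch under that bracket convention. `fill_template` is a hypothetical helper, and the sample values are made up for illustration.

```python
import re


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace [placeholder] slots with values; raise if any slot stays unfilled."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        # Leave unknown slots untouched so they are reported below.
        return values.get(key, match.group(0))

    filled = re.sub(r"\[([^\[\]]+)\]", substitute, template)
    leftover = re.findall(r"\[[^\[\]]+\]", filled)
    if leftover:
        raise ValueError(f"unfilled placeholders: {leftover}")
    return filled


# Template 1 (Hero Product Reveal) from this section.
template = (
    "[Product] resting on [surface] in a [setting]. Dramatic spotlight illuminates "
    "the product from above, with the rest of the scene in shadow. Camera slowly "
    "dollies in from medium to close-up, revealing surface texture and design "
    "details. Cinematic commercial style. [Brand color palette]. 16:9, 6 seconds. No text."
)
values = {
    "Product": "A matte black wireless earbud case",
    "surface": "a polished walnut surface",
    "setting": "dimly lit studio",
    "Brand color palette": "Black and gold color palette",
}
print(fill_template(template, values))
```

Raising on leftover slots is a deliberate choice: a stray "[setting]" sent to the model wastes a generation.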

Product Showcase Prompts

Template 1 — Hero Product Reveal:

"[Product] resting on [surface] in a [setting]. Dramatic spotlight illuminates the product from above, with the rest of the scene in shadow. Camera slowly dollies in from medium to close-up, revealing surface texture and design details. Cinematic commercial style. [Brand color palette]. 16:9, 6 seconds. No text."

Template 2 — Product in Use:

"Close-up of hands [using the product] — [specific action]. [Environment context]. Natural, authentic feel. Camera at table level, slight push-in. Warm natural lighting. Sound of [relevant ambient audio]. 9:16, 6 seconds."

Template 3 — Multi-Product Lineup:

"[Number] [products] arranged in a [formation] on a [surface]. Camera slowly tracks across the lineup from left to right, pausing briefly on each. Clean studio lighting, [color] background. Commercial style. 16:9, 8 seconds."

Lifestyle and Brand Prompts

Template 4 — Aspirational Lifestyle:

"[Person description] [activity] in [aspirational setting]. [Season/time of day]. Documentary-style, warm and inviting. Handheld camera, eye level, following the subject. Natural ambient sound. [Color palette]. 9:16, 8 seconds."

Template 5 — Brand Values / Mission:

"A montage of [related scenes that embody brand values]: [scene 1], [scene 2], [scene 3]. Each scene transitions smoothly to the next. Cinematic, inspirational, uplifting mood. Warm golden light throughout. Orchestral music feel. 16:9, 8 seconds."

UGC-Style Prompts

Template 6 — Unboxing Reaction:

"A person sits at [location] and opens a [package description]. They pull out [product], examine it closely, and react with genuine excitement. Front-facing selfie camera angle, slightly shaky handheld. Natural indoor lighting. Casual, authentic, unscripted feel. 9:16, 8 seconds."

Template 7 — Testimonial Setup:

"A [person description] looks directly at the camera in a [casual setting] and speaks enthusiastically. Tight medium shot, iPhone-quality camera feel. Natural window light from one side. Authentic, relatable, conversational. 9:16, 6 seconds."

For more on creating authentic AI UGC content, see our guide on how to create AI UGC videos that convert.

Explainer and Demo Prompts

Template 8 — Process Visualization:

"An abstract visualization of [process]: [visual metaphor]. Clean, modern, minimal background. Smooth flowing motion from left to right across the frame. Tech-forward aesthetic, blue and white color palette. 16:9, 8 seconds."

Template 9 — Problem/Solution Split:

"Left side of frame: [chaotic visual representing the problem]. Right side: [clean visual representing the solution]. A dividing line sweeps from left to right, transforming chaos into order. Clean, satisfying, modern. 16:9, 6 seconds."

Seasonal Campaign Prompts

Template 10 — Holiday / Festive:

"[Holiday-themed setting] with [seasonal decorations]. [Product] placed prominently in the scene. [Festive lighting — twinkle lights, candles, warm glow]. Camera slowly pulls back to reveal the full festive scene. Warm, cozy, magical mood. [Holiday color palette]. 1:1, 8 seconds."

Template 11 — Summer / Outdoor:

"[Product] on a [summer setting — beach towel, poolside table, picnic blanket]. Bright midday sun, vibrant saturated colors. Water/breeze movement in the background. Camera at low angle looking slightly up. Energetic, fresh, vibrant mood. 9:16, 6 seconds."

Template 12 — New Year / Fresh Start:

"A sunrise timelapse over [cityscape/landscape]. As the sun rises, [product/brand element] appears in silhouette and gradually illuminates. Inspirational, hopeful, forward-looking mood. Orange-to-blue gradient sky. Drone shot or wide static. 16:9, 8 seconds."

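When you are producing many variants, the bracketed slots in these templates can be filled programmatically. The following is a minimal Python sketch, not part of any AdCreate tooling: `fill_template` and the sample values are hypothetical names chosen for illustration, and it assumes every slot is written as `[name]` with no nested brackets.

```python
import re

# Matches one bracketed slot such as "[process]"; assumes no nested brackets.
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace every [slot] in the template, failing loudly on unfilled slots."""
    missing = [slot for slot in PLACEHOLDER.findall(template) if slot not in values]
    if missing:
        raise KeyError(f"No value supplied for: {missing}")
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


# Template 8 from this guide, copied verbatim.
TEMPLATE_8 = (
    "An abstract visualization of [process]: [visual metaphor]. "
    "Clean, modern, minimal background. Smooth flowing motion from left to right "
    "across the frame. Tech-forward aesthetic, blue and white color palette. "
    "16:9, 8 seconds."
)

# Illustrative values only; substitute your own product details.
prompt = fill_template(TEMPLATE_8, {
    "process": "data syncing between devices",
    "visual metaphor": "glowing particles streaming between two floating glass panels",
})
print(prompt)
```

Raising an error on an unfilled slot is a deliberate choice here: a stray `[product]` left in a prompt would otherwise be sent to the model as literal text.
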
Common Prompting Mistakes and How to Fix Them

After analyzing thousands of AI video generations, these are the most frequent prompting mistakes — and the fixes that immediately improve output quality.

Mistake 1: Being Too Vague

Bad: "A product video for a water bottle"
Good: "A stainless steel water bottle with a matte olive green finish stands on a granite kitchen counter. Morning sunlight hits the bottle from the right side, creating a sharp highlight on the curved surface. Camera slowly pushes in from medium to close-up. Clean, minimal, commercial. 16:9, 6 seconds."

Fix: Add specific details about material, color, surface, environment, light direction, and camera behavior.

Mistake 2: Overloading the Prompt

Bad: "A woman walks through a forest, then sits by a lake, then picks up a book, then a bird lands on her hand, then she stands up and walks to a cabin, and inside the cabin there is a fire, and she makes tea, and looks out the window at snow."

Good: "A woman in a knit sweater sits beside a still mountain lake reading a book. Morning mist hovers over the water. She looks up from the book and smiles at the view. Wide to medium shot, gentle push-in. Serene, peaceful mood. 16:9, 8 seconds."

Fix: One scene, one action, one mood. AI video models generate 4-8 second clips — you cannot fit a feature film into one generation. For longer narratives, generate individual scenes and chain them together.

Mistake 3: Requesting On-Screen Text

Bad: "A video showing the text 'SALE 50% OFF' appearing on screen in bold letters"
Good: Generate the video without text, then add text overlays using AdCreate's template system.

Fix: Never ask the AI model to generate text in the video. Text rendering is a weakness across all current models. Use your video creation platform's text overlay tools instead. AdCreate's templates handle text, captions, and CTAs as a separate layer — crisp and readable every time.

Mistake 4: Ignoring Camera Instructions

Bad: "A coffee shop scene with people"
Good: "Interior of a bustling coffee shop. Camera at counter height slowly pans from left to right, revealing baristas making drinks, customers chatting, and steam rising from cups. Warm tungsten lighting. Ambient coffee shop sounds. 16:9, 8 seconds."

Fix: Always specify camera position, movement, and speed. Without camera instructions, the model defaults to static or random movement that may not serve your creative intent.

Mistake 5: Forgetting the Mood

Bad: "A woman running on a beach"
Good: "A woman in athletic wear runs along the shoreline at golden hour. Waves crash beside her feet. The mood is empowering, aspirational, and free. Camera tracks alongside at her pace. Warm amber tones, lens flare from the low sun. 9:16, 8 seconds."

Fix: Mood words shape color grading, pacing, composition, and lighting intensity. Include two or three mood descriptors in every prompt.

Mistake 6: Not Specifying Duration

Bad: "A timelapse of a city from day to night"
Good: "A timelapse of a city skyline transitioning from golden afternoon to blue hour to night with illuminated buildings. Fixed tripod wide shot. 8 seconds compressed from several hours. Smooth, gradual light transition."

Fix: Specify the clip duration so the model calibrates the pacing of events within the available time.

Mistake 7: Mixing Incompatible Styles

Bad: "A photorealistic documentary-style video with claymation characters in a cyberpunk neon city filmed like a Wes Anderson movie"
Good: Choose one coherent style and commit to it.

Fix: Style confusion produces incoherent output. Pick one visual direction per generation. If you need multiple styles, generate separate clips.
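
Several of these mistakes can be caught before you spend credits. The sketch below is a rough, hypothetical pre-flight check (the function name `lint_prompt` and its keyword lists are assumptions, not an AdCreate feature). It only covers Mistakes 2, 3, 4, and 6, because mood and style coherence are hard to detect with simple pattern matching.

```python
import re

# Heuristic vocabulary; extend it to match the terms you actually use.
CAMERA = re.compile(
    r"\b(camera|shot|pans?|dolly|dollies|push-in|close-up|angle|tripod|handheld)\b"
)
DURATION = re.compile(r"\b\d+\s*seconds?\b")
QUOTED = re.compile(r"[\"'].+?[\"']")


def lint_prompt(prompt: str) -> list[str]:
    """Return heuristic warnings for a few of the mistakes listed above."""
    text = prompt.lower()
    warnings = []
    # Mistake 2: repeated "then" usually means several scenes in one prompt.
    if len(re.findall(r"\bthen\b", text)) >= 2:
        warnings.append("overloaded: split chained actions into separate clips")
    # Mistake 3: the word "text" plus a quoted string suggests on-screen text.
    if re.search(r"\btext\b", text) and QUOTED.search(prompt):
        warnings.append("on-screen text: add it as an overlay instead")
    # Mistake 4: no camera vocabulary anywhere in the prompt.
    if not CAMERA.search(text):
        warnings.append("no camera instruction")
    # Mistake 6: no clip duration such as "8 seconds".
    if not DURATION.search(text):
        warnings.append("no duration")
    return warnings


for warning in lint_prompt("A product video for a water bottle"):
    print(warning)
```

A check like this will produce false positives and misses, so treat its output as a reminder to reread the prompt, not as a verdict.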


Advanced Techniques

Once you have mastered the fundamentals, these advanced techniques will push your AI video output to the next level.

Iterative Prompting

Do not try to nail the perfect prompt on your first attempt. Use a three-round approach:

  1. Round 1 — Concept test. Write a simple prompt to test whether the model can handle your concept. Use Fast mode (Veo 3.1 Fast at 8 credits, Sora 2 Fast at 15 credits) to minimize cost.
  2. Round 2 — Refinement. Based on the Round 1 output, add or modify specific details. If the lighting was too dark, specify brighter. If the camera moved too fast, add "slow, gentle movement."
  3. Round 3 — Polish. Switch to Pro mode (Veo 3.1 Pro at 40 credits, Sora 2 Pro at 60 credits) for maximum quality. Your refined prompt produces the best possible output at the highest fidelity.

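This approach usually costs less than iterating in Pro mode. On Veo 3.1, two Fast rounds plus one Pro render total 56 credits (8 + 8 + 40), while three Pro attempts with an untested prompt cost 120 credits.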
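
The credit arithmetic behind the three-round approach can be checked with a short script. This is a sketch using the credit costs quoted in this guide. The model keys such as `"veo-3.1"` are hypothetical labels for illustration, not API identifiers, and the comparison assumes that an untested prompt would need three Pro attempts.

```python
# Credit costs per generation, as quoted in this guide.
CREDITS = {
    ("veo-3.1", "fast"): 8,
    ("veo-3.1", "pro"): 40,
    ("sora-2", "fast"): 15,
    ("sora-2", "pro"): 60,
}


def workflow_cost(model: str, tiers: list[str]) -> int:
    """Total credits for a sequence of generations on one model."""
    return sum(CREDITS[(model, tier)] for tier in tiers)


for model in ("veo-3.1", "sora-2"):
    three_round = workflow_cost(model, ["fast", "fast", "pro"])
    pro_only = workflow_cost(model, ["pro", "pro", "pro"])
    print(f"{model}: three-round = {three_round} credits, three Pro attempts = {pro_only} credits")
# veo-3.1: three-round = 56 credits, three Pro attempts = 120 credits
# sora-2: three-round = 90 credits, three Pro attempts = 180 credits
```

Note that a single Pro generation (40 or 60 credits) is cheaper than the full three-round sequence. The saving only appears when the first Pro attempt would have needed a retry.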

Chaining Scenes for Multi-Clip Campaigns

Single AI video clips are 4-8 seconds. Ads are 15-60 seconds. Bridging this gap requires scene chaining — generating multiple clips that cut together into a cohesive sequence.

The key to scene chaining is consistency descriptors. Include identical language for recurring elements across all prompts in a sequence:

Scene 1 prompt: "A woman with shoulder-length auburn hair and a white linen blouse sits at a wooden desk in a bright home office. She looks frustrated at her laptop. Medium shot, warm natural window light from the left. 16:9, 6 seconds."

Scene 2 prompt: "The same woman with shoulder-length auburn hair and a white linen blouse leans back in her chair, smiling with relief as she discovers a solution on her laptop screen. Same bright home office, same warm natural window light from the left. Medium shot, slight push-in. 16:9, 6 seconds."

Scene 3 prompt: "Close-up of the same woman with shoulder-length auburn hair and a white linen blouse looking directly at the camera with a confident smile. Same bright home office background, blurred. Same warm natural window light from the left. 16:9, 4 seconds."

By repeating character description, environment, and lighting details verbatim across prompts, you maximize visual consistency across cuts. AdCreate's ad frameworks and Brick System handle the assembly: write your scripts with our ad script writing guide, then generate the matching visuals.
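
One way to guarantee that the consistency descriptors really are identical across scenes is to store them once and compose each prompt from them. The sketch below is illustrative only. `Continuity` and `scene_prompt` are hypothetical names, and the sentence layout is one possible arrangement, not a required format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Continuity:
    """Descriptors repeated verbatim in every scene of a sequence."""
    character: str
    setting: str
    lighting: str
    aspect_ratio: str = "16:9"


def scene_prompt(shared: Continuity, action: str, shot: str, seconds: int) -> str:
    """Compose one scene prompt around the shared descriptors."""
    return (
        f"{shared.character} {action} {shared.setting}. "
        f"{shot}, {shared.lighting}. {shared.aspect_ratio}, {seconds} seconds."
    )


shared = Continuity(
    character="A woman with shoulder-length auburn hair and a white linen blouse",
    setting="in a bright home office",
    lighting="warm natural window light from the left",
)

scenes = [
    scene_prompt(shared, "looks frustrated at her laptop", "Medium shot", 6),
    scene_prompt(shared, "leans back in her chair, smiling with relief,", "Medium shot, slight push-in", 6),
    scene_prompt(shared, "looks directly at the camera with a confident smile", "Close-up", 4),
]

for prompt in scenes:
    print(prompt)
```

Because the shared descriptors live in one frozen object, changing the character's wardrobe or the lighting updates every scene at once and cannot drift between prompts.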

Style Consistency with Reference Language

When creating a campaign with multiple videos that need a unified visual identity, establish a "style block" — a paragraph of consistent descriptors that you paste into every prompt:

Example style block:

"Cinematic commercial style. Shot on ARRI Alexa with anamorphic lens. Warm earth tone color palette — amber, terracotta, cream, olive. Shallow depth of field. Natural lighting. No text overlays. No logos. 16:9."

Paste this block at the end of every prompt in the campaign. The visual details vary per scene, but the overarching aesthetic remains consistent.
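
If you assemble prompts in a script or spreadsheet export, appending the style block can be automated. This is a minimal sketch. `with_style_block` is a hypothetical helper, the scene descriptions are made-up examples, and the block text is adapted from the example above.

```python
# Campaign-wide style block, adapted from the example in this guide.
STYLE_BLOCK = (
    "Cinematic commercial style. Shot on ARRI Alexa with anamorphic lens. "
    "Warm earth tone color palette: amber, terracotta, cream, olive. "
    "Shallow depth of field. Natural lighting. No text overlays. No logos. 16:9."
)


def with_style_block(scene: str, style_block: str = STYLE_BLOCK) -> str:
    """Append the campaign style block to a scene description exactly once."""
    scene = scene.strip()
    if scene.endswith(style_block):
        return scene  # already styled; avoid duplicating the block
    return f"{scene} {style_block}"


# Hypothetical scene descriptions for one campaign.
campaign = [
    with_style_block("A ceramic mug of coffee steams on a windowsill at sunrise. Slow push-in."),
    with_style_block("Hands wrap a wool scarf around a paper-wrapped gift box. Static close-up."),
]

for prompt in campaign:
    print(prompt)
```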

Prompt Cheat Sheet by Model

| Element | Veo 3.1 Best Practice | Sora 2 Best Practice |
|---|---|---|
| Style reference | "Shot on ARRI Alexa" / "RED V-Raptor" | "Film noir" / "Wes Anderson" / presets |
| Lighting | Specific natural lighting descriptions | Mood-based lighting ("dramatic", "ethereal") |
| Motion | Camera movement terms (dolly, truck, crane) | Narrative instructions ("the camera follows", "we see") |
| Audio | "Natural ambient sound of [specific]" | "Sound of [specific action]" |
| Realism | "Photorealistic, 4K, high detail" | "Cinematic, stylized, artistic" |
| Best prompts | Long, specific, technical | Narrative, descriptive, creative |

Using AdCreate's Platform to Maximize Prompt Effectiveness

AdCreate streamlines the prompting process in several ways. The text-to-video pipeline lets you select your model (Veo 3.1 or Sora 2), quality tier (Fast or Pro), and aspect ratio before writing your prompt. The platform also provides prompt suggestions and examples tailored to your selected model.

For campaigns, AdCreate's Brick System means you do not need every prompt to produce a complete ad. Generate individual Bricks — a hook clip, a product showcase, a lifestyle scene, a CTA background — and assemble them into complete ads using templates and frameworks. This modular approach means each prompt can focus on doing one thing well.

Visit the pricing page to see how credits work across models and quality tiers, or start with 50 free credits to test your prompts without commitment.

Frequently Asked Questions

What are the best AI video prompts for advertising?

The best AI video prompts for advertising include all six elements of the prompt framework: subject/action, visual style/mood, camera movement/angle, lighting/color palette, aspect ratio/duration, and negative instructions. For product ads, focus on specific material descriptions, precise lighting, and subtle camera movements. For brand storytelling, emphasize mood, narrative arc, and emotional tone. Use Veo 3.1 for photorealistic content and Sora 2 for stylized or conceptual content.

How do I write prompts for Sora 2?

Sora 2 responds best to narrative, descriptive prompts that tell a story rather than listing technical specifications. Use cinematic language ("the camera reveals," "we discover," "the scene transforms"), include emotional modifiers ("intimate," "epic," "whimsical"), and leverage Sora's style presets (Film Noir, Papercraft, Claymation, Wes Anderson aesthetic). Sora excels at conceptual and impossible scenarios — do not limit your prompts to things you could film with a real camera.

How do I write prompts for Veo 3.1?

Veo 3.1 responds best to technically specific prompts. Reference real camera systems ("shot on ARRI Alexa"), specify lens characteristics ("anamorphic lens," "85mm portrait lens," "shallow depth of field f/1.4"), describe lighting with precision ("golden hour backlight with rim lighting from the right"), and use photography/cinematography terminology. Include audio instructions ("natural ambient sound of [environment]") to activate Veo's native audio generation.

What is an AI video prompt generator?

An AI video prompt generator is a tool that helps you craft optimized prompts for AI video models. Rather than writing prompts from scratch, these tools provide structured templates, suggest effective modifiers, and format your creative intent into prompt language that models respond to well. AdCreate includes prompt guidance within its text-to-video interface, suggesting model-specific language and structure as you write.

How many credits does text-to-video cost on AdCreate?

AdCreate's credit costs depend on the model and quality tier: Veo 3.1 Fast costs 8 credits, Veo 3.1 Pro costs 40 credits, Sora 2 Fast costs 15 credits, Sora 2 Pro costs 60 credits, and Wan 2.5 image-to-video costs 5 credits. The Starter plan includes 500 credits per month at $39/month ($23/month billed annually). A free tier of 50 credits is available with no credit card required.
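
To see what those numbers mean in practice, the short calculation below works out how many generations of each type fit into the Starter plan's 500 monthly credits if you used only that type. It is a sketch based solely on the figures quoted in this answer.

```python
MONTHLY_CREDITS = 500  # Starter plan allowance quoted above

# Credits per generation, as quoted above.
COSTS = {
    "Veo 3.1 Fast": 8,
    "Veo 3.1 Pro": 40,
    "Sora 2 Fast": 15,
    "Sora 2 Pro": 60,
    "Wan 2.5 image-to-video": 5,
}

# Whole generations affordable if the entire allowance goes to one option.
generations = {name: MONTHLY_CREDITS // cost for name, cost in COSTS.items()}

for name, count in generations.items():
    print(f"{name}: up to {count} generations per month")
```

For example, 500 credits cover 62 Veo 3.1 Fast generations but only 8 Sora 2 Pro generations, which is why testing in Fast mode first stretches a monthly allowance much further.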

Should I use the same prompt for Veo and Sora?

No. While the same prompt will produce output on both models, you will get significantly better results by tailoring your prompt to each model's strengths. Veo 3.1 responds to technical cinematography language, camera equipment references, and precise lighting descriptions. Sora 2 responds to narrative language, style presets, and creative/conceptual instructions. Think of it like directing two different cinematographers — you adjust your direction based on their strengths.
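
If you generate the same concept on both models, one approach is to keep a shared base description and append model-specific language to it. The sketch below illustrates the idea. `tailor_prompt`, the model keys, and the exact modifier wording are assumptions drawn from the guidance in this article, not settings exposed by either model.

```python
# Model-specific modifiers, loosely based on the guidance in this article.
# The keys are illustrative labels, not API identifiers.
MODEL_MODIFIERS = {
    "veo-3.1": (
        "Shot on ARRI Alexa, 85mm portrait lens, shallow depth of field. "
        "Photorealistic, high detail."
    ),
    "sora-2": (
        "The camera follows the subject as the scene unfolds. "
        "Cinematic, stylized, intimate mood."
    ),
}


def tailor_prompt(base: str, model: str) -> str:
    """Append model-appropriate language to a shared base description."""
    if model not in MODEL_MODIFIERS:
        raise ValueError(f"Unknown model label: {model!r}")
    return f"{base.rstrip()} {MODEL_MODIFIERS[model]}"


base = "A barista pours steamed milk into a latte at a sunlit cafe counter."
for model in MODEL_MODIFIERS:
    print(f"{model}: {tailor_prompt(base, model)}")
```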

How long should an AI video prompt be?

The sweet spot is 2-5 sentences (40-100 words). Too short (under 20 words) and the model lacks direction. Too long (over 150 words) and the model may lose coherence or ignore elements. For image-to-video with Wan 2.5, keep prompts even shorter — 1-3 sentences focused purely on motion. Quality of detail matters more than quantity of words.
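
A quick word count tells you which of these ranges a prompt falls into. The helper below is a hypothetical sketch that applies the thresholds from this answer. It splits on whitespace, so a token like "16:9," counts as one word, and it labels anything between the named ranges as "acceptable", which is an assumption of this example.

```python
def classify_length(prompt: str) -> str:
    """Classify a text-to-video prompt by the word-count ranges given above."""
    words = len(prompt.split())
    if words < 20:
        return "too short"
    if words > 150:
        return "too long"
    if 40 <= words <= 100:
        return "sweet spot"
    return "acceptable"  # between the named ranges; label chosen for this sketch


examples = [
    "A product video for a water bottle",
    "A stainless steel water bottle with a matte olive green finish stands on a "
    "granite kitchen counter. Morning sunlight hits the bottle from the right side, "
    "creating a sharp highlight on the curved surface. Camera slowly pushes in from "
    "medium to close-up. Clean, minimal, commercial. 16:9, 6 seconds.",
]

for prompt in examples:
    print(f"{len(prompt.split())} words: {classify_length(prompt)}")
```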

Can AI video prompts include audio instructions?

Yes. Both Veo 3.1 and Sora 2 support native audio generation. Include audio cues in your prompt: "natural ambient sound of a busy cafe," "sound of rain on a window," "gentle background music." Veo 3.1 is particularly strong at environmental audio. Sora 2 excels at dialogue synchronization and music. Specifying audio in your prompt ensures the generated video comes with matching soundscapes rather than silence.


The difference between AI video that looks amateur and AI video that looks professional comes down to one thing: the prompt. Every technique in this guide — the six-element framework, model-specific modifiers, template structures, and advanced chaining methods — exists to help you write prompts that produce footage worth using. Start with the templates, adapt them to your brand, iterate based on output, and build a library of prompts that consistently deliver. Better prompts, better video, better ads. Create your first video with AdCreate's text-to-video pipeline — 50 free credits, both Veo 3.1 and Sora 2, no credit card required.
