Generate video from text

Generate a video from a text prompt. Describe the scene, camera motion, and mood; the model returns a video with matching audio (unless `generate_audio` is set to false). The request is asynchronous: the endpoint responds immediately with the job `id` and a `created_at` timestamp. Poll `GET /v2/text-to-video/{id}` until the status is `completed`, then download the video from `result.video_url`. Usage is billed per second of generated video; see [Pricing](/pricing).
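The polling step can be sketched as a small loop. This is a minimal illustration, not an official client: `wait_for_video_url` and its `fetch_job` callable are hypothetical names, and the sketch assumes the status endpoint returns a JSON object with a top-level `status` field and a nested `result.video_url`. Failure statuses are not documented on this page, so the sketch relies on a timeout instead of checking for them.

```python
import time
from typing import Callable


def wait_for_video_url(
    job_id: str,
    fetch_job: Callable[[str], dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a job until its status is 'completed', then return result.video_url.

    fetch_job is any callable that performs GET /v2/text-to-video/{id}
    and returns the decoded JSON body as a dict (assumed response shape).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job(job_id)
        if job.get("status") == "completed":
            return job["result"]["video_url"]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not complete within {timeout_s}s")
        # Wait before the next poll to avoid hammering the API.
        sleep(interval_s)
```

Passing `fetch_job` and `sleep` as arguments keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.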

Authentication

`Authorization` (Bearer)
API key authentication. Send the API key as a Bearer token in the `Authorization` header.
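As a rough sketch of how the Bearer header might be attached, the snippet below builds (but does not send) a request using only the Python standard library. The base URL `https://api.example.com` is a placeholder, the `POST /v2/text-to-video` submit path is inferred from the documented polling path rather than stated on this page, and `build_submit_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Placeholder host: substitute the real API base URL.
BASE_URL = "https://api.example.com"


def build_submit_request(
    api_key: str, payload: dict, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build an authenticated JSON POST request without sending it."""
    return urllib.request.Request(
        url=f"{base_url}/v2/text-to-video",  # assumed submit path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the call.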

Request

This endpoint expects an object.
`prompt` (string, required, <=5000 characters)
Text prompt describing the desired video content
`model` (enum, required)

Model to use for generation. See Supported Models for details.

`duration` (integer, required)

Video duration in seconds. See Supported Models for available durations per model.

`resolution` (string, required)

Output video resolution. See Supported Models for available resolutions per model.

`fps` (integer, optional, defaults to 24)

Frame rate in frames per second. See Supported Models for available FPS per model and resolution.

`generate_audio` (boolean, optional, defaults to true)

Generate audio for the video. When true, the generated video will include AI-generated audio matching the scene. When false, only silent video is generated.

`camera_motion` (enum, optional)
Apply camera motion effects to the generated video.
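To illustrate how the request fields fit together, here is a minimal sketch of a payload builder that enforces the one constraint stated above (the 5000-character prompt limit) and omits optional fields so the server defaults apply. `build_payload` is a hypothetical helper, and the values `"example-model"` and `"1920x1080"` used in the usage line are placeholders; the real allowed values are listed under Supported Models.

```python
from typing import Optional

MAX_PROMPT_CHARS = 5000  # documented limit on `prompt`


def build_payload(
    prompt: str,
    model: str,
    duration: int,
    resolution: str,
    fps: Optional[int] = None,
    generate_audio: Optional[bool] = None,
    camera_motion: Optional[str] = None,
) -> dict:
    """Assemble the request object, leaving unset optional fields out."""
    if not prompt:
        raise ValueError("prompt is required")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if duration <= 0:
        raise ValueError("duration must be a positive number of seconds")

    payload = {
        "prompt": prompt,
        "model": model,
        "duration": duration,
        "resolution": resolution,
    }
    optional = {
        "fps": fps,
        "generate_audio": generate_audio,
        "camera_motion": camera_motion,
    }
    # Only send optional fields the caller set explicitly.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, `build_payload("A slow pan over a foggy harbor at dawn", "example-model", 6, "1920x1080", generate_audio=False)` would request a silent clip while leaving `fps` at its default.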

Response

Job submitted successfully
`id` (string)
Unique job identifier. Use this to poll for status.
`created_at` (datetime)
ISO 8601 timestamp of when the job was created.
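A small sketch of reading the submission response, assuming the body is a JSON object with exactly the two fields above. `SubmittedJob` and `parse_submission` are hypothetical names, and the sample values in the usage line are made up for illustration.

```python
import json
from datetime import datetime
from typing import NamedTuple


class SubmittedJob(NamedTuple):
    id: str
    created_at: datetime


def parse_submission(body: str) -> SubmittedJob:
    """Decode the submission response into a typed record."""
    data = json.loads(body)
    raw = data["created_at"]
    # datetime.fromisoformat accepts a trailing "Z" only on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    if raw.endswith("Z"):
        raw = raw[:-1] + "+00:00"
    return SubmittedJob(id=data["id"], created_at=datetime.fromisoformat(raw))
```

Calling `parse_submission('{"id": "job_123", "created_at": "2025-01-01T12:00:00Z"}')` yields a record whose `id` can be passed to the polling endpoint.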

Errors

400 Bad Request Error
401 Unauthorized Error
402 Payment Required Error
422 Unprocessable Entity Error
429 Too Many Requests Error
500 Internal Server Error
503 Service Unavailable Error
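One possible way to act on these status codes is sketched below. The split into retryable and non-retryable codes follows common HTTP client practice and is an assumption, not something this page specifies; `is_retryable` and `backoff_delays` are hypothetical helpers.

```python
from typing import List

# Assumption: rate limits and server-side failures are worth retrying,
# while client-side errors need a corrected request, key, or account balance.
RETRYABLE = frozenset({429, 500, 503})
NON_RETRYABLE = frozenset({400, 401, 402, 422})


def is_retryable(status: int) -> bool:
    """Return True if a request with this status may succeed on retry."""
    return status in RETRYABLE


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> List[float]:
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, ..., capped."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]
```

A caller might sleep for each delay in `backoff_delays(4)` between attempts while `is_retryable(status)` holds, and surface the error immediately otherwise.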