Generate video from text

Beta

Generate a video from a text prompt. Describe the scene, camera motion, and mood; the model returns a video and, by default, matching audio.

The request returns immediately with the job id and a created_at timestamp. Poll GET /v2/text-to-video/{id} until the status is completed, then download the video from result.video_url.
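
The poll-until-completed flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not an official client: fetch_job is a hypothetical caller-supplied function that performs GET /v2/text-to-video/{id} and returns the decoded JSON body, the interval and timeout values are arbitrary, and the failed status name is a guess (only completed is documented here).

```python
import time
from typing import Any, Callable, Dict


def wait_for_video(
    job_id: str,
    fetch_job: Callable[[str], Dict[str, Any]],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a job until its status is "completed" and return result.video_url.

    fetch_job is a hypothetical helper: it should call
    GET /v2/text-to-video/{id} and return the parsed JSON body.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job(job_id)
        status = job.get("status")
        if status == "completed":
            return job["result"]["video_url"]
        if status == "failed":  # assumed status name, not documented above
            raise RuntimeError(f"job {job_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not completed after {timeout_s}s")
        sleep(interval_s)
```

Passing the HTTP call in as fetch_job keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned responses.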

Authentication

Authorization (Bearer)
API key authentication
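
As a rough sketch of Bearer authentication, the helper below builds the request headers. The environment variable name VIDEO_API_KEY is hypothetical, and the JSON Content-Type is an assumption based on the endpoint expecting an object.

```python
import os
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return headers carrying the API key as a Bearer token."""
    # VIDEO_API_KEY is a hypothetical variable name chosen for this example.
    key = api_key or os.environ.get("VIDEO_API_KEY", "")
    if not key:
        raise ValueError("no API key provided")
    return {
        "Authorization": f"Bearer {key}",
        # Assumed: the request body is sent as JSON.
        "Content-Type": "application/json",
    }
```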

Request

This endpoint expects an object.
prompt (string, Required, at most 5000 characters)
Text prompt describing the desired video content

model (enum, Required)

Model to use for generation. See Supported Models for details.

Allowed values:
duration (integer, Required)

Video duration in seconds. See Supported Models for available durations per model.

resolution (string, Required)

Output video resolution. See Supported Models for available resolutions per model.

fps (integer, Optional, defaults to 24)

Frame rate in frames per second. See Supported Models for available FPS per model and resolution.

camera_motion (enum, Optional)
Apply camera motion effects to the generated video.

generate_audio (boolean, Optional, defaults to true, Beta)

Whether to generate audio for the video. When true, the generated video includes AI-generated audio matching the scene. When false, the video is silent.
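
The request fields above could be assembled along these lines. This is a sketch only: build_payload is a hypothetical helper, the non-empty prompt and positive duration checks are assumptions beyond the documented 5000-character limit, and the valid model, resolution, and camera_motion values are not listed here, so they are passed through unchecked.

```python
from typing import Any, Dict, Optional

MAX_PROMPT_CHARS = 5000  # documented limit on prompt length


def build_payload(
    prompt: str,
    model: str,
    duration: int,
    resolution: str,
    fps: int = 24,                 # documented default
    camera_motion: Optional[str] = None,
    generate_audio: bool = True,   # documented default
) -> Dict[str, Any]:
    """Build the JSON object for a text-to-video request."""
    if not prompt or len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt must be 1 to 5000 characters")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int) or duration <= 0:
        raise ValueError("duration must be a positive integer (seconds)")
    payload: Dict[str, Any] = {
        "prompt": prompt,
        "model": model,
        "duration": duration,
        "resolution": resolution,
        "fps": fps,
        "generate_audio": generate_audio,
    }
    # Optional field: only sent when the caller sets it.
    if camera_motion is not None:
        payload["camera_motion"] = camera_motion
    return payload
```

The model and resolution arguments should come from the Supported Models page; any values used while trying this out are placeholders.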

Response

Job submitted successfully
id (string)
Unique job identifier. Use this to poll for status.

created_at (datetime)
ISO 8601 timestamp of when the job was created.
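
One way to read the submission response is sketched below. SubmittedJob and parse_submit_response are hypothetical names for this example, and the exact timestamp format (for instance a trailing Z for UTC) is an assumption beyond the ISO 8601 description.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict


@dataclass(frozen=True)
class SubmittedJob:
    id: str
    created_at: datetime


def parse_submit_response(body: Dict[str, Any]) -> SubmittedJob:
    """Extract id and created_at from the decoded JSON response."""
    raw = body["created_at"]
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11,
    # so normalize it to an explicit UTC offset for older versions.
    if raw.endswith("Z"):
        raw = raw[:-1] + "+00:00"
    return SubmittedJob(id=body["id"], created_at=datetime.fromisoformat(raw))
```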

Errors

400: Bad Request Error
401: Unauthorized Error
402: Payment Required Error
422: Unprocessable Entity Error
429: Too Many Requests Error
500: Internal Server Error
503: Service Unavailable Error
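
A client might treat these status codes differently when deciding whether to retry. The sketch below assumes that 429, 500, and 503 are worth retrying with exponential backoff and that the remaining codes need a corrected request, credentials, or account balance first. That policy and the delay values are assumptions, not documented behavior.

```python
ERROR_NAMES = {
    400: "Bad Request",
    401: "Unauthorized",
    402: "Payment Required",
    422: "Unprocessable Entity",
    429: "Too Many Requests",
    500: "Internal Server Error",
    503: "Service Unavailable",
}

# Assumed policy: rate limiting and server-side failures may be transient.
RETRYABLE = frozenset({429, 500, 503})


def should_retry(status: int) -> bool:
    """Return True if a request with this status is worth retrying."""
    return status in RETRYABLE


def backoff_delay(attempt: int, base_s: float = 1.0, cap_s: float = 30.0) -> float:
    """Exponential backoff: base_s * 2**attempt seconds, capped at cap_s."""
    return min(cap_s, base_s * (2 ** attempt))


def describe_error(status: int) -> str:
    """Human-readable label for a documented error status."""
    return f"{status} {ERROR_NAMES.get(status, 'Unknown Error')}"
```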