Extend a video by generating additional frames at its beginning or end.
Audio is generated for the extended portion if the input video has audio.
Authentication
Authorization Bearer
API key authentication
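As a minimal sketch of the Bearer scheme described above, the helper below builds the request headers. The function name `build_auth_headers` is hypothetical, and the `Content-Type` value is an assumption based on the endpoint expecting an object (presumably JSON); neither is confirmed by this reference.

```python
def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return HTTP headers that carry the API key as a Bearer token."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        # Bearer scheme, as stated in the Authentication section.
        "Authorization": f"Bearer {api_key}",
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }
```

Pass the returned dictionary as the headers of the HTTP request made with whichever client you use.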
Request
This endpoint expects an object.
video_uri string Required
Input video for extending. See [Input Formats](/input-formats#video-input) for supported formats and codecs.
- Supported aspect ratios: 16:9 and 9:16
- Maximum resolution: 3840x2160 (4K)
- Minimum frame count: 73 (around 3 seconds at 24fps)
The output video preserves the input video's resolution.
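The constraints above can be checked locally before uploading. The sketch below is illustrative only: `check_input_video` is a hypothetical helper, it assumes the aspect ratio must match 16:9 or 9:16 exactly, and it assumes the 4K limit applies to portrait video as 2160x3840. The service may apply different tolerances.

```python
from fractions import Fraction

MAX_LONG_SIDE = 3840    # from "Maximum resolution: 3840x2160 (4K)"
MAX_SHORT_SIDE = 2160
MIN_FRAMES = 73         # about 3 seconds at 24fps
ALLOWED_RATIOS = {Fraction(16, 9), Fraction(9, 16)}


def check_input_video(width: int, height: int, frame_count: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    if width <= 0 or height <= 0:
        return ["width and height must be positive"]
    problems = []
    # Assumption: exact ratio match; Fraction avoids float rounding.
    if Fraction(width, height) not in ALLOWED_RATIOS:
        problems.append(f"aspect ratio {width}:{height} is not 16:9 or 9:16")
    # Assumption: portrait input is limited to 2160x3840.
    if max(width, height) > MAX_LONG_SIDE or min(width, height) > MAX_SHORT_SIDE:
        problems.append(f"resolution {width}x{height} exceeds 3840x2160")
    if frame_count < MIN_FRAMES:
        problems.append(f"{frame_count} frames is below the minimum of {MIN_FRAMES}")
    return problems
```

For example, `check_input_video(1920, 1080, 120)` returns an empty list, while a square 1024x1024 clip with 50 frames reports both an aspect-ratio and a frame-count problem.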
duration double Required 2-20
Duration in seconds to extend the video. Minimum 2 seconds, maximum 20 seconds (480 frames at 24fps).
prompt string Optional <=5000 characters
Description of what should happen in the extended portion of the video.
mode enum Optional Defaults to end
Where to extend the video:
- `end` (default): Extends the video at the end.
- `start`: Extends the video at the beginning.
Allowed values: `end`, `start`
model enum Optional Defaults to ltx-2-3-pro
Model to use for video generation.
Allowed values:
context double Optional 1-20
**Advanced parameter:** Number of seconds from the input video to use as context for the extension (maximum 20 seconds).
The model uses context frames from the input video to generate a more coherent extension.
The sum of `context` + `duration` (converted to frames using the input video's FPS) cannot exceed 505 frames (~21 seconds at 24fps). For higher-FPS inputs, the maximum total duration in seconds will be proportionally lower; for lower-FPS inputs, it will be proportionally higher.
If omitted, `context` defaults to the largest value that fits within the 505-frame limit, capped at 20 seconds.
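The frame budget is easier to see as arithmetic. The sketch below is one reading of the rule, not the service's implementation: the helper names are hypothetical, and rounding fractional frames up is an assumption, since this reference does not say how seconds are converted to frames.

```python
import math

MAX_TOTAL_FRAMES = 505       # limit on context + duration, in frames
MAX_CONTEXT_SECONDS = 20.0   # upper bound of the context parameter


def total_frames(context_s: float, duration_s: float, fps: float) -> int:
    """Frames used by context + duration at the input video's FPS."""
    # Assumption: fractional frames round up. round() guards against
    # float noise such as 505.00000000000006.
    return math.ceil(round((context_s + duration_s) * fps, 6))


def fits_frame_budget(context_s: float, duration_s: float, fps: float) -> bool:
    """True if context + duration stays within the 505-frame limit."""
    return total_frames(context_s, duration_s, fps) <= MAX_TOTAL_FRAMES


def max_context_seconds(duration_s: float, fps: float) -> float:
    """Largest context that fits the frame limit, capped at 20 seconds.

    A result below 1 (the documented minimum) means the requested
    duration leaves no room for a valid context at this FPS.
    """
    remaining = MAX_TOTAL_FRAMES - math.ceil(round(duration_s * fps, 6))
    return min(MAX_CONTEXT_SECONDS, remaining / fps)


print(total_frames(1, 20, 24))        # 504
print(fits_frame_budget(2, 20, 24))   # False (528 frames)
print(max_context_seconds(2, 12))     # 20.0 (capped)
```

At 24fps a 20-second extension leaves room for about 1 second of context, whereas at 12fps the same extension can use the full 20 seconds, which matches the note that lower-FPS inputs allow a proportionally longer total.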
Response headers
x-request-id string
Unique request identifier for tracking and debugging.