PyTorch API
For developers who want direct Python integration or custom workflows beyond ComfyUI, LTX-2 offers two paths: the native ltx-pipelines package (full control, all features) and the HuggingFace Diffusers integration (familiar API, quick start).
Repository Structure
The LTX-2 codebase is a monorepo with three packages:
Requirements
- Python >= 3.10
- CUDA > 12.7
- PyTorch ~= 2.7
See System Requirements for full hardware specifications.
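The version constraints above can be checked before installation with a small script. This is a minimal sketch, not part of LTX-2: the helper names (`parse_version`, `python_ok`, `cuda_ok`, `torch_ok`) are hypothetical, "CUDA > 12.7" is read literally as a strict inequality, and "PyTorch ~= 2.7" is read as a compatible-release specifier (at least 2.7, same major version).

```python
import sys


def parse_version(text):
    """Turn a version string such as '2.7.1+cu128' into a tuple like (2, 7, 1).

    Hypothetical helper: local build suffixes ('+cu128') and pre-release
    tags ('a0', 'dev...') are ignored.
    """
    core = text.split("+")[0]
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def python_ok(version_info=sys.version_info):
    # Python >= 3.10
    return tuple(version_info[:2]) >= (3, 10)


def cuda_ok(version):
    # CUDA > 12.7, read as strictly greater than 12.7
    return parse_version(version)[:2] > (12, 7)


def torch_ok(version):
    # PyTorch ~= 2.7, read as: at least 2.7 and still a 2.x release
    parsed = parse_version(version)
    return parsed[0] == 2 and parsed[:2] >= (2, 7)


if __name__ == "__main__":
    print("Python OK:", python_ok())
```

With PyTorch already installed, `torch.__version__` and `torch.version.cuda` can be passed to `torch_ok` and `cuda_ok`. Note that `torch.version.cuda` is `None` on CPU-only builds, so guard against that before calling `cuda_ok`.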
Installation
Download Models
A full list of available checkpoints (including spatial/temporal upscalers, distilled LoRA, IC-LoRA variants, and camera control LoRAs) is on the LTX-2 GitHub repo.
Option 1: Native Pipelines (ltx-pipelines)
The ltx-pipelines package provides the most complete feature set, including two-stage generation, IC-LoRA, and fine-grained guidance control.
Available Pipelines
Text-to-Video Example
This example generates a video using the distilled pipeline with a two-stage approach — base generation followed by upscale refinement.
Dimension constraints: Width and height must be divisible by 32. Frame
count must follow the pattern 8n + 1 for a non-negative integer n (valid
values: 1, 9, 17, 25, 33, ..., 97, 105, 113, 121, and so on).
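The dimension constraints can be checked before a generation call so that invalid shapes fail early. A minimal sketch follows; `validate_dimensions` is a hypothetical helper, not part of ltx-pipelines, and it encodes only the two rules stated above.

```python
def validate_dimensions(width, height, num_frames):
    """Return a list of problems; an empty list means the shape is valid.

    Hypothetical helper that encodes the two documented rules:
    width and height divisible by 32, frame count of the form 8n + 1.
    """
    problems = []
    if width <= 0 or width % 32 != 0:
        problems.append(f"width {width} is not a positive multiple of 32")
    if height <= 0 or height % 32 != 0:
        problems.append(f"height {height} is not a positive multiple of 32")
    if num_frames < 1 or (num_frames - 1) % 8 != 0:
        problems.append(f"num_frames {num_frames} is not of the form 8n + 1")
    return problems


if __name__ == "__main__":
    for shape in [(768, 512, 121), (770, 512, 120)]:
        issues = validate_dimensions(*shape)
        print(shape, "OK" if not issues else issues)
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem with a requested shape at once.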
Guidance Parameters
The native pipelines expose MultiModalGuiderParams for fine-grained control over generation:
Memory Optimization
For consumer GPUs, enable FP8 quantization to reduce VRAM usage by ~40% with minimal quality loss:
From the command line:
Additional tip: Set the environment variable PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True before launching Python to reduce fragmentation in PyTorch's CUDA memory allocator.
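When the variable cannot be set in the shell, it can be set from Python instead, as long as this happens before PyTorch initializes its CUDA allocator (in practice, before `import torch`). The sketch below uses only the standard library; `enable_expandable_segments` is a hypothetical helper name, and it preserves any other comma-separated allocator options that are already present.

```python
import os


def enable_expandable_segments(env=os.environ):
    """Add expandable_segments:True to PYTORCH_CUDA_ALLOC_CONF.

    Hypothetical helper. Other comma-separated allocator options that are
    already set are kept; an existing expandable_segments entry is replaced.
    """
    key = "PYTORCH_CUDA_ALLOC_CONF"
    options = [opt for opt in env.get(key, "").split(",") if opt]
    options = [opt for opt in options if not opt.startswith("expandable_segments:")]
    options.append("expandable_segments:True")
    env[key] = ",".join(options)
    return env[key]


# Run this at the very top of the entry script, before importing torch.
enable_expandable_segments()
```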
Option 2: HuggingFace Diffusers
If you’re already working within the Diffusers ecosystem, LTX-2 is available as a native pipeline:
The Diffusers integration provides a simpler interface but may not expose all
features available in the native ltx-pipelines package (e.g., IC-LoRA,
advanced guidance parameters). For full feature access, use the native
pipelines.
For more on the Diffusers integration, see the HuggingFace documentation.
Generation Parameters Reference
Resolution
Standard aspect ratios:
Higher resolutions are supported (up to 4K) but require significantly more VRAM. Start with lower resolutions for testing.
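When picking a resolution for a given aspect ratio, the exact height often does not land on a multiple of 32 (for example, 1280 wide at 16:9 gives 720, which is not divisible by 32). The sketch below rounds both sides down to the nearest valid value. `snap_resolution` is a hypothetical helper, and rounding down rather than up is an assumption made here to stay within the requested memory budget.

```python
def snap_resolution(target_width, aspect_w, aspect_h, multiple=32):
    """Return (width, height) near the requested aspect ratio.

    Hypothetical helper: both sides are rounded DOWN to a multiple of
    `multiple` (32, per the dimension constraints on this page), with a
    floor of one multiple so neither side collapses to zero.
    """
    width = max(multiple, target_width // multiple * multiple)
    exact_height = width * aspect_h // aspect_w
    height = max(multiple, exact_height // multiple * multiple)
    return width, height


if __name__ == "__main__":
    for target, (aw, ah) in [(1280, (16, 9)), (768, (3, 2)), (1000, (1, 1))]:
        print(f"{aw}:{ah} at ~{target}px wide ->", snap_resolution(target, aw, ah))
```

The snapped size deviates slightly from the requested ratio (1280x704 instead of 1280x720), which is usually acceptable for test renders.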
Frame Count & Duration
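Because frame counts must have the form 8n + 1, a target clip length rarely maps to an exact frame count. The sketch below converts between seconds and valid frame counts. Both function names are hypothetical, and the frame rates used in the demo are illustrative values only, not documented defaults.

```python
import math


def frames_for_duration(seconds, fps):
    """Return the valid frame count (8n + 1) closest to seconds * fps.

    Hypothetical helper; the caller supplies the frame rate.
    """
    raw_frames = seconds * fps
    n = max(0, math.floor((raw_frames - 1) / 8 + 0.5))  # round half up
    return 8 * n + 1


def duration_seconds(num_frames, fps):
    """Clip length in seconds for a given frame count and frame rate."""
    return num_frames / fps


if __name__ == "__main__":
    for seconds, fps in [(5, 24), (4, 25), (2, 30)]:
        frames = frames_for_duration(seconds, fps)
        print(f"{seconds}s at {fps} fps -> {frames} frames "
              f"({duration_seconds(frames, fps):.2f}s)")
```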
Sampling
What’s Next
- Prompting Guide — write prompts that produce coherent motion and audio
- LoRA Guide — customize style, motion, and character appearance
- IC-LoRA Guide — video-to-video with depth, pose, and edge control
- Trainer Documentation — train your own LoRAs