IC-LoRA

IC-LoRA (In-Context LoRA) enables precise video-to-video control in LTX-2 by conditioning generation on reference signals such as depth maps, pose skeletons, or edge maps. Unlike standard LoRAs, which modify style or effects, IC-LoRAs let you dictate the spatial structure and motion of your output with frame-level precision.

This enables workflows where you can animate depth maps into videos, retarget character motion using pose sequences, or generate videos following precise edge-based compositions.

Comparison: IC-LoRA vs LoRA

Feature	LoRA	IC-LoRA
Purpose	Style, effects, visual modifications	Structural control, motion guidance
Input	Text prompt only	Text prompt + control signal
Strength	0.5-1.5 adjustable	Always 1.0
Control	Global style influence	Frame-level spatial control
Training Data	Video datasets (single modality)	Paired video + control signals
Use Case	“Make it look like cake”	“Follow this depth map”
Combining	Easily stack multiple	Best used one control type at a time

When to use which:

Need style/effect: Standard LoRA
Need structural control: IC-LoRA
Need both: Combine them (IC-LoRA 1.0 + effect LoRA 0.5-0.8)

Available IC-LoRA Models

LTX-2 provides several official IC-LoRA control adapters, each trained for a specific control type:

Depth Control

Control video generation using depth map sequences that represent spatial relationships and 3D structure.

Model: LTX-2-19b-IC-LoRA-Depth-Control

Use Cases:

Architectural walkthroughs with camera movement following depth
Converting 3D scene depth renders into photorealistic video
Maintaining spatial consistency across generated frames
Depth-aware compositing and scene reconstruction

Control Input: Grayscale depth maps where pixel intensity represents distance (whether closer objects appear darker or lighter depends on the convention of the depth tool)

Pose Control

Guide character and human motion using skeletal pose sequences extracted from reference videos or animation.

Model: LTX-2-19b-IC-LoRA-Pose-Control

Use Cases:

Character animation and motion retargeting
Dance sequence generation from pose data
Consistent human motion across style transfers
Precise body positioning and movement control

Control Input: Pose keypoint data typically from OpenPose, MediaPipe, or similar skeleton detection systems

Canny Control

Control composition and structure using Canny edge detection, providing precise outlines and boundaries.

Model: LTX-2-19b-IC-LoRA-Canny-Control

Use Cases:

Line art to video conversion
Architectural rendering with precise edge control
Logo and graphic animation
Maintaining strict compositional structure

Control Input: Canny edge detection maps (white edges on black background)

Video Detailer

Enhance detail, quality, and resolution in video generation workflows, particularly useful for upscaling pipelines.

Model: LTX-2-19b-IC-LoRA-Detailer

Use Cases:

Multi-scale rendering workflows
Video upscaling with detail enhancement
Quality improvement in generation pipelines
Detail recovery in compressed or lower-quality sources

Control Input: Reference video frames for structural guidance

Using IC-LoRAs

In ComfyUI

ComfyUI is the recommended environment for IC-LoRA workflows because its visual, node-based interface makes it straightforward to connect a control signal, a prompt, and the model.

Setup:

Install ComfyUI-LTXVideo custom nodes
Download IC-LoRA model from Hugging Face (choose ComfyUI-compatible .safetensors)
Place in ComfyUI/models/loras/ directory
Load the ic-lora.json workflow

Basic Workflow:

Control Signal Input → IC-LoRA Loader → LTX Model → Video Output
       ↓                                    ↓
   Text Prompt ─────────────────────────────┘

Key nodes:

Load Image/Video: Load your control sequence
Apply IC-LoRA: Applies the control adapter to the model
LTX Sampler: Generates video following the control signal
Text Encoder: Provides prompt conditioning

Example workflow steps:

Load depth map or pose skeleton video
Apply IC-LoRA with strength 1.0 (recommended)
Provide descriptive text prompt
Generate video with standard LTX-2 parameters

Preparing Control Signals

The quality of your control signals directly affects IC-LoRA results. Here’s how to prepare each type:

Depth Maps

Tools:

DepthCrafter - Temporally consistent depth estimation for video
Blender/3D software - For synthetic depth renders

Best practices:

Use consistent depth range across all frames
Ensure smooth temporal transitions (avoid flickering)
Match resolution to target generation resolution

Example depth extraction:

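import numpy as np
from transformers import pipeline

# The "-hf" checkpoints are the ones packaged for the transformers pipeline API
depth_estimator = pipeline("depth-estimation", model="depth-anything/Depth-Anything-V2-Large-hf")

# Process video frame-by-frame (video_frames: an iterable of PIL images)
depth_maps = []
for frame in video_frames:
    # The pipeline returns the depth map as a PIL image under "depth"
    depth = np.asarray(depth_estimator(frame)["depth"], dtype=np.float32)
    # Normalize to 0-255 and keep as an 8-bit grayscale array
    depth_map = (depth / max(depth.max(), 1e-8) * 255).astype(np.uint8)
    depth_maps.append(depth_map)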
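
The per-frame normalization above rescales every frame by its own maximum, so the depth scale can drift between frames. A minimal sketch of clip-level normalization followed by light temporal smoothing, assuming the raw depth frames are already stacked into an array of shape (frames, height, width). The function name `normalize_depth_clip` and its `smoothing` parameter are illustrative and not part of any LTX-2 tooling:

```python
import numpy as np

def normalize_depth_clip(depth_frames, smoothing=0.8):
    """Scale a whole clip to uint8 with one shared range, then smooth over time.

    depth_frames: array-like of shape (T, H, W) with raw depth values.
    smoothing: weight given to the previous smoothed frame (0 disables smoothing).
    """
    depth = np.asarray(depth_frames, dtype=np.float32)
    lo, hi = float(depth.min()), float(depth.max())
    # One min/max for the whole clip keeps the depth scale consistent across frames.
    scaled = (depth - lo) / max(hi - lo, 1e-8)

    # Exponential moving average along the time axis to damp frame-to-frame flicker.
    smoothed = np.empty_like(scaled)
    smoothed[0] = scaled[0]
    for t in range(1, len(scaled)):
        smoothed[t] = smoothing * smoothed[t - 1] + (1.0 - smoothing) * scaled[t]

    return np.round(smoothed * 255.0).astype(np.uint8)
```

High `smoothing` values suppress flicker but make the depth lag behind fast motion, so it is worth starting low and increasing only if flicker remains.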

Pose Skeletons

Tools:

OpenPose - Multi-person pose estimation
MediaPipe Pose - Fast single-person pose
DWPose - High-quality pose estimation

Best practices:

Extract poses at consistent frame rate matching target generation
Ensure pose keypoints are detected reliably across all frames
Handle occlusions gracefully (interpolate missing keypoints)
Use skeleton visualization format (lines connecting keypoints)

Format: Typically 17-18 body keypoints, rendered as a visual skeleton overlay
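
Interpolating missing keypoints, as suggested in the best practices above, can be sketched with NumPy. This assumes keypoints are stored as a (frames, keypoints, 3) array of x, y, and confidence, which is a common but not universal layout. The function name and the `min_conf` threshold are illustrative:

```python
import numpy as np

def interpolate_missing_keypoints(keypoints, min_conf=0.3):
    """Fill low-confidence keypoints by linear interpolation over time.

    keypoints: array of shape (T, K, 3) holding (x, y, confidence) per keypoint.
    Returns a float32 array of shape (T, K, 2) with gaps filled.
    """
    kp = np.asarray(keypoints, dtype=np.float32)
    frames = np.arange(kp.shape[0])
    filled = kp[..., :2].copy()

    for k in range(kp.shape[1]):
        valid = kp[:, k, 2] >= min_conf
        if not valid.any():
            continue  # never detected reliably: leave the raw values untouched
        for axis in range(2):
            # np.interp holds the nearest valid value before the first
            # and after the last reliable detection.
            filled[:, k, axis] = np.interp(frames, frames[valid], kp[valid, k, axis])
    return filled
```

Linear interpolation is adequate for short dropouts; long occlusions may need manual cleanup, since a straight-line path rarely matches real limb motion.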

Canny Edges

Tools:

OpenCV Canny edge detection
PIL/Pillow image processing
ComfyUI Canny preprocessor nodes

Best practices:

Adjust threshold values to capture essential edges without noise
Maintain consistent edge thickness across frames
Blur input slightly before edge detection to reduce noise

Example edge extraction:

import cv2

def extract_canny(frame, low_threshold=100, high_threshold=200):
    # Convert to grayscale (assumes RGB input; use cv2.COLOR_BGR2GRAY for frames read with OpenCV)
    gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
    # Optional: slight blur to reduce noise
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    # Detect edges; the result is a single-channel uint8 map with white (255) edges on black
    edges = cv2.Canny(blurred, low_threshold, high_threshold)
    return edges
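
Choosing thresholds by hand for every clip can be tedious. One widely used heuristic, which is not something the LTX-2 documentation prescribes, derives both thresholds from the median pixel intensity. A minimal sketch, where `auto_canny_thresholds` and `sigma` are illustrative names:

```python
import numpy as np

def auto_canny_thresholds(gray, sigma=0.33):
    """Return (low, high) Canny thresholds centred on the median intensity.

    gray: single-channel 8-bit image as a NumPy array.
    sigma: relative width of the band around the median.
    """
    median = float(np.median(gray))
    low = int(max(0.0, (1.0 - sigma) * median))
    high = int(min(255.0, (1.0 + sigma) * median))
    return low, high
```

The result can be passed straight to `extract_canny(frame, low, high)`. To keep edge density consistent across frames, it is probably better to compute the thresholds once from a representative frame than to recompute them per frame.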

IC-LoRA Parameters

Control Strength

Unlike standard LoRAs where strength typically ranges from 0.5-1.0, IC-LoRAs are designed to work at full strength (1.0).

Why always 1.0?

IC-LoRAs are trained specifically to condition on control signals
Lower strengths result in ignored or inconsistent control
The prompt and control signal balance is built into the training

Exception: The Video Detailer IC-LoRA may benefit from strength adjustment (0.7-1.0) depending on desired detail level.

Resolution and Frame Rate

Match control signal resolution to generation resolution
Control FPS should match target generation FPS
For best results: 704x1216 at 24-30 FPS
IC-LoRAs work at various resolutions but quality depends on control signal clarity
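
When the control video was extracted at a different frame rate than the target, its frames need to be resampled before generation. A minimal sketch that selects, for each output frame, the latest source frame at or before its timestamp. The function name is hypothetical, and frame selection is only one simple choice; it avoids blending, which would smear skeleton or edge maps:

```python
def resample_frame_indices(num_frames, src_fps, dst_fps):
    """Return source-frame indices that cover the same duration at dst_fps."""
    if num_frames <= 0 or src_fps <= 0 or dst_fps <= 0:
        raise ValueError("num_frames, src_fps and dst_fps must be positive")
    # Number of output frames needed to span the same duration.
    out_count = max(1, round(num_frames * dst_fps / src_fps))
    indices = []
    for i in range(out_count):
        # Position of output frame i on the source timeline, rounded down.
        src = int(i * src_fps / dst_fps)
        indices.append(min(num_frames - 1, src))
    return indices
```

For example, `resample_frame_indices(60, 60, 30)` keeps every second frame, while going from 12 to 24 FPS repeats each frame twice.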

Training Custom IC-LoRAs

Create your own IC-LoRA control adapters using the LTX-Video-Trainer.

Best Practices

Control Signal Quality

Use high-quality control extraction tools
Ensure temporal consistency (smooth transitions between frames)
Match control resolution to generation resolution
Pre-process control signals to remove noise and artifacts

Prompt Alignment

Describe visual style, not control type (“ornate architecture” not “depth map shows…”)
Align prompt with control signal motion and composition
Be specific about materials, lighting, and atmosphere
Avoid contradicting the control structure

Performance Optimization

IC-LoRAs add minimal overhead (less than 10% compute)
Works with FP8 quantized models
Compatible with distilled models for faster generation
Use Video Detailer IC-LoRA for efficient upscaling

Quality Validation

Always test IC-LoRA with simple control signals first
Verify control is being respected before complex generations
Compare with and without IC-LoRA to assess control strength
Iterate on control signal quality before increasing generation complexity

IC-LoRA Troubleshooting

Control Signal Not Being Followed

Symptoms: Generated video ignores control structure

Solutions:

Verify IC-LoRA is loaded correctly (check adapter name)
Ensure strength is set to 1.0 (required for IC-LoRAs)
Check control signal format matches expected input (resolution, channels)
Validate control signal has sufficient contrast/detail
Verify model compatibility (IC-LoRA version matches base model)
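
The format and contrast checks above can be automated before a generation run. A rough sketch, assuming the control clip is available as an 8-bit NumPy array. The function name, the accepted channel counts, and the `min_contrast` threshold are illustrative assumptions, not values taken from LTX-2:

```python
import numpy as np

def check_control_signal(frames, target_height, target_width, min_contrast=10.0):
    """Return a list of human-readable problems found in a control clip.

    frames: array of shape (T, H, W) or (T, H, W, C) with 8-bit pixel values.
    """
    clip = np.asarray(frames)
    if clip.ndim not in (3, 4):
        return [f"expected 3 or 4 dimensions, got {clip.ndim}"]

    problems = []
    if clip.shape[1:3] != (target_height, target_width):
        problems.append(
            f"resolution {clip.shape[2]}x{clip.shape[1]} does not match "
            f"target {target_width}x{target_height}"
        )
    if clip.ndim == 4 and clip.shape[3] not in (1, 3):
        problems.append(f"unexpected channel count: {clip.shape[3]}")
    # Standard deviation of pixel values is a crude contrast measure;
    # a near-constant clip carries almost no structure to follow.
    if float(clip.std()) < min_contrast:
        problems.append("very low contrast: control signal may be blank")
    return problems
```

An empty list means none of these basic checks failed; it does not guarantee that the control signal is a good one.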

Temporal Flickering or Inconsistency

Symptoms: Unstable motion, frame-to-frame inconsistencies

Solutions:

Smooth control signal temporal transitions (use interpolation)
Increase inference steps (try 40-50 instead of 30)
Ensure control signal FPS matches generation FPS
Apply temporal filtering to control signal before generation
Check for abrupt changes in control signal values
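
Checking for abrupt changes can be done numerically by comparing each frame with its predecessor. A minimal sketch, assuming the control clip fits in memory as a NumPy array. The function name and the `factor` threshold are illustrative:

```python
import numpy as np

def find_abrupt_changes(frames, factor=3.0):
    """Return indices of frames that differ unusually strongly from the previous frame.

    frames: array of shape (T, H, W) or (T, H, W, C).
    factor: how many times the median frame-to-frame change counts as abrupt.
    """
    clip = np.asarray(frames, dtype=np.float32)
    if clip.shape[0] < 2:
        return []
    # Mean absolute difference between each frame and the one before it.
    diffs = np.abs(clip[1:] - clip[:-1]).reshape(clip.shape[0] - 1, -1).mean(axis=1)
    baseline = float(np.median(diffs))
    threshold = factor * max(baseline, 1e-6)
    # diffs[i] compares frame i with frame i + 1, so report i + 1.
    return [int(i) + 1 for i in np.flatnonzero(diffs > threshold)]
```

Intentional scene cuts are flagged as well, so the output is a list of frames to inspect rather than a list of errors.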

Poor Quality or Artifacts

Symptoms: Visual artifacts, degraded quality, unwanted textures

Solutions:

Improve control signal quality (better extraction tools)
Ensure control signal resolution is adequate
Adjust prompt to better describe desired style
Try Video Detailer IC-LoRA for quality enhancement
Check that control signal doesn’t have noise or compression artifacts

Control Too Strong or Rigid

Symptoms: Output looks constrained, lacks natural variation

Solutions:

IC-LoRAs are designed for 1.0 strength - this is expected behavior
Adjust your control signal to be less restrictive (e.g., lighter edges)
Use more flexible prompts that allow style variation
Consider if standard LoRA might be better for your use case

Memory Issues

Symptoms: Out of memory errors during generation

Solutions:

Use FP8 quantized models to reduce VRAM
Reduce generation resolution
Process control signals in smaller batches
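
Batching applies most naturally to control-signal preprocessing such as depth or pose extraction, where holding every frame and its intermediate results in memory at once is unnecessary. A minimal sketch of a batching helper; the name `iter_batches` is hypothetical and works on any sliceable sequence of frames:

```python
def iter_batches(frames, batch_size):
    """Yield consecutive slices of at most batch_size frames."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(frames), batch_size):
        yield frames[start:start + batch_size]
```

Each batch can then be processed and written to disk before the next one is loaded.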

Resources

Official Models

Training Resources

LTX-Video-Trainer