IC-LoRA
IC-LoRA (In-Context LoRA) enables precise video-to-video control in LTX-2 by conditioning generation on reference signals like depth maps, pose skeletons, or edge detections. Unlike standard LoRAs that modify style or effects, IC-LoRAs let you dictate the spatial structure and motion of your output with frame-level precision.
This enables workflows where you can animate depth maps into videos, retarget character motion using pose sequences, or generate videos following precise edge-based compositions.
IC-LoRA Video Walkthrough
Comparison: IC-LoRA vs LoRA
When to use which:
- Need style/effect: Standard LoRA
- Need structural control: IC-LoRA
- Need both: Combine them (IC-LoRA 1.0 + effect LoRA 0.5-0.8)
Available IC-LoRA Models
LTX-2 provides several official IC-LoRA control adapters, each trained for specific control types:
Depth Control
Control video generation using depth map sequences that represent spatial relationships and 3D structure.
Model: LTX-2-19b-IC-LoRA-Depth-Control
Use Cases:
- Architectural walkthroughs with camera movement following depth
- Converting 3D scene depth renders into photorealistic video
- Maintaining spatial consistency across generated frames
- Depth-aware compositing and scene reconstruction
Control Input: Grayscale depth maps where pixel intensity encodes distance (whether nearer objects appear darker or lighter depends on the convention; keep it consistent across frames)
Pose Control
Guide character and human motion using skeletal pose sequences extracted from reference videos or animation.
Model: LTX-2-19b-IC-LoRA-Pose-Control
Use Cases:
- Character animation and motion retargeting
- Dance sequence generation from pose data
- Consistent human motion across style transfers
- Precise body positioning and movement control
Control Input: Pose keypoint data, typically extracted with OpenPose, MediaPipe, or a similar skeleton detection system
Canny Control
Control composition and structure using Canny edge detection, providing precise outlines and boundaries.
Model: LTX-2-19b-IC-LoRA-Canny-Control
Use Cases:
- Line art to video conversion
- Architectural rendering with precise edge control
- Logo and graphic animation
- Maintaining strict compositional structure
Control Input: Canny edge detection maps (white edges on black background)
Video Detailer
Enhance detail, quality, and resolution in video generation workflows, particularly useful for upscaling pipelines.
Model: LTX-2-19b-IC-LoRA-Detailer
Use Cases:
- Multi-scale rendering workflows
- Video upscaling with detail enhancement
- Quality improvement in generation pipelines
- Detail recovery in compressed or lower-quality sources
Control Input: Reference video frames for structural guidance
Using IC-LoRAs
In ComfyUI
ComfyUI is the recommended way to run IC-LoRA workflows thanks to its visual, node-based interface.
Setup:
- Install ComfyUI-LTXVideo custom nodes
- Download the IC-LoRA model from Hugging Face (choose the ComfyUI-compatible .safetensors file)
- Place it in the ComfyUI/models/loras/ directory
- Load the ic-lora.json workflow
Basic Workflow:
Key nodes:
- Load Image/Video: Load your control sequence
- Apply IC-LoRA: Applies the control adapter to the model
- LTX Sampler: Generates video following the control signal
- Text Encoder: Provides prompt conditioning
Example workflow steps:
- Load depth map or pose skeleton video
- Apply IC-LoRA with strength 1.0 (recommended)
- Provide descriptive text prompt
- Generate video with standard LTX-2 parameters
Preparing Control Signals
The quality of your controls directly impacts IC-LoRA results. Here’s how to prepare each type:
Depth Maps
Tools:
- DepthCrafter
- Blender/3D software - For synthetic depth renders
Best practices:
- Use consistent depth range across all frames
- Ensure smooth temporal transitions (avoid flickering)
- Match resolution to target generation resolution
Example depth extraction:
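A minimal sketch of the normalization step, assuming per-frame depth arrays are already available (e.g., from DepthCrafter output or a Blender depth render). The helper name `normalize_depth_sequence` is illustrative, not part of any LTX tooling; the key point is scaling with one global range so intensity stays consistent across frames.

```python
import numpy as np

def normalize_depth_sequence(depth_frames):
    """Convert raw depth arrays into 8-bit grayscale maps using a single
    global min/max, so the same intensity means the same distance in
    every frame (avoids per-frame normalization flicker)."""
    stack = np.stack(depth_frames).astype(np.float32)
    d_min, d_max = stack.min(), stack.max()
    scaled = (stack - d_min) / max(d_max - d_min, 1e-8)
    return [(frame * 255.0).astype(np.uint8) for frame in scaled]

# Illustrative input: four frames of raw metric depth values
frames = [np.random.rand(64, 64) * 10.0 for _ in range(4)]
maps = normalize_depth_sequence(frames)
```

Normalizing per frame instead of globally would make the same object's brightness drift as the scene's depth range changes, which IC-LoRA interprets as motion.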
Pose Skeletons
Tools:
- OpenPose - Multi-person pose estimation
- MediaPipe Pose - Fast single-person pose
- DWPose - High-quality pose estimation
Best practices:
- Extract poses at consistent frame rate matching target generation
- Ensure pose keypoints are detected reliably across all frames
- Handle occlusions gracefully (interpolate missing keypoints)
- Use skeleton visualization format (lines connecting keypoints)
Format: Typically 17-18 keypoints for body skeleton rendered as visual overlay
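The occlusion-handling practice above can be sketched as simple linear interpolation over time, assuming keypoints are stored as a (frames, joints, 2) array with NaN where detection failed. The function name is illustrative, not from any pose library.

```python
import numpy as np

def interpolate_missing_keypoints(keypoints):
    """Fill NaN keypoints by linear interpolation along the time axis.
    keypoints: float array of shape (frames, joints, 2), NaN = missing."""
    kp = keypoints.copy()
    t = np.arange(kp.shape[0])
    for j in range(kp.shape[1]):          # each joint
        for c in range(2):                # x and y coordinates
            vals = kp[:, j, c]
            good = ~np.isnan(vals)
            if good.any():
                # np.interp also extrapolates flat at the sequence ends
                kp[:, j, c] = np.interp(t, t[good], vals[good])
    return kp

# Illustrative input: 3 frames, 1 joint, middle frame occluded
kp = np.full((3, 1, 2), np.nan)
kp[0, 0] = [0.0, 0.0]
kp[2, 0] = [10.0, 20.0]
filled = interpolate_missing_keypoints(kp)
```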
Canny Edges
Tools:
- OpenCV Canny edge detection
- PIL/Pillow image processing
- ComfyUI Canny preprocessor nodes
Best practices:
- Adjust threshold values to capture essential edges without noise
- Maintain consistent edge thickness across frames
- Blur input slightly before edge detection to reduce noise
Example edge extraction:
IC-LoRA Parameters
Control Strength
Unlike standard LoRAs where strength typically ranges from 0.5-1.0, IC-LoRAs are designed to work at full strength (1.0).
Why always 1.0?
- IC-LoRAs are trained specifically to condition on control signals
- Lower strengths result in ignored or inconsistent control
- The prompt and control signal balance is built into the training
Exception: The Video Detailer IC-LoRA may benefit from strength adjustment (0.7-1.0) depending on desired detail level.
Resolution and Frame Rate
- Match control signal resolution to generation resolution
- Control FPS should match target generation FPS
- For best results: 704x1216 at 24-30 FPS
- IC-LoRAs work at various resolutions but quality depends on control signal clarity
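Matching control FPS to generation FPS can be done with simple nearest-frame resampling, sketched below. The helper is illustrative, not part of the LTX tooling; for large rate changes, frame blending or optical-flow interpolation gives smoother results.

```python
def resample_frames(frames, src_fps, dst_fps):
    """Resample a control-frame list from src_fps to dst_fps by
    nearest-frame selection, preserving the clip's duration."""
    duration = len(frames) / src_fps
    n_out = round(duration * dst_fps)
    return [
        frames[min(int(i * src_fps / dst_fps), len(frames) - 1)]
        for i in range(n_out)
    ]

# Illustrative usage: 2 seconds of 30 FPS control frames -> 24 FPS
frames = list(range(60))
out = resample_frames(frames, src_fps=30, dst_fps=24)
```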
Training Custom IC-LoRAs
Create your own IC-LoRA control adapters using the LTX-Video-Trainer.
Best Practices
Control Signal Quality
- Use high-quality control extraction tools
- Ensure temporal consistency (smooth transitions between frames)
- Match control resolution to generation resolution
- Pre-process control signals to remove noise and artifacts
Prompt Alignment
- Describe visual style, not control type (“ornate architecture” not “depth map shows…”)
- Align prompt with control signal motion and composition
- Be specific about materials, lighting, and atmosphere
- Avoid contradicting the control structure
Performance Optimization
- IC-LoRAs add minimal overhead (less than 10% compute)
- Works with FP8 quantized models
- Compatible with distilled models for faster generation
- Use Video Detailer IC-LoRA for efficient upscaling
Quality Validation
- Always test IC-LoRA with simple control signals first
- Verify control is being respected before complex generations
- Compare with and without IC-LoRA to assess control strength
- Iterate on control signal quality before increasing generation complexity
IC-LoRA Troubleshooting
Control Signal Not Being Followed
Symptoms: Generated video ignores control structure
Solutions:
- Verify IC-LoRA is loaded correctly (check adapter name)
- Ensure strength is set to 1.0 (required for IC-LoRAs)
- Check control signal format matches expected input (resolution, channels)
- Validate control signal has sufficient contrast/detail
- Verify model compatibility (IC-LoRA version matches base model)
Temporal Flickering or Inconsistency
Symptoms: Unstable motion, frame-to-frame inconsistencies
Solutions:
- Smooth control signal temporal transitions (use interpolation)
- Increase inference steps (try 40-50 instead of 30)
- Ensure control signal FPS matches generation FPS
- Apply temporal filtering to control signal before generation
- Check for abrupt changes in control signal values
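The temporal-filtering suggestion above can be sketched as a sliding mean over the frame axis. This is a generic smoothing pass, not an LTX API; the window size is a tunable assumption.

```python
import numpy as np

def temporal_smooth(frames, window=3):
    """Apply a sliding temporal mean over a control-frame sequence to
    suppress frame-to-frame flicker before generation."""
    stack = np.stack(frames).astype(np.float32)
    pad = window // 2
    out = np.empty_like(stack)
    for i in range(len(stack)):
        lo, hi = max(0, i - pad), min(len(stack), i + pad + 1)
        out[i] = stack[lo:hi].mean(axis=0)  # average neighboring frames
    return list(out)

# Illustrative input: alternating black/white frames (worst-case flicker)
flicker = [np.full((8, 8), 255.0 * (i % 2)) for i in range(6)]
smoothed = temporal_smooth(flicker)
```

Note that smoothing trades temporal sharpness for stability; apply it to the control signal only, not to reference frames you want reproduced exactly.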
Poor Quality or Artifacts
Symptoms: Visual artifacts, degraded quality, unwanted textures
Solutions:
- Improve control signal quality (better extraction tools)
- Ensure control signal resolution is adequate
- Adjust prompt to better describe desired style
- Try Video Detailer IC-LoRA for quality enhancement
- Check that control signal doesn’t have noise or compression artifacts
Control Too Strong or Rigid
Symptoms: Output looks constrained, lacks natural variation
Solutions:
- IC-LoRAs are designed for 1.0 strength - this is expected behavior
- Adjust your control signal to be less restrictive (e.g., lighter edges)
- Use more flexible prompts that allow style variation
- Consider if standard LoRA might be better for your use case
Memory Issues
Symptoms: Out of memory errors during generation
Solutions:
- Use FP8 quantized models to reduce VRAM
- Reduce generation resolution
- Process control signals in smaller batches