render_track_videos
Module: GBC.utils.data_preparation.render_track_videos
This module provides essential video rendering capabilities for experimental validation and quality assessment in human-to-robot motion retargeting. The PoseRenderer class generates side-by-side comparison videos showing original human motion alongside robot motion, serving as a critical tool for validating retargeting quality, debugging pose transfer issues, and creating presentation materials for research results.
📚 Dependencies
This module requires the following Python packages:
- torch - PyTorch framework for neural networks
- numpy - Numerical computing library
- opencv-python (cv2) - Video processing and image manipulation
- trimesh - 3D mesh processing
- human_body_prior - SMPL+H body model
- body_visualizer - Human body mesh visualization
- scipy - Scientific computing (rotation mathematics)
- tqdm - Progress bars for batch processing
- PIL - Image processing
- matplotlib - Plotting and visualization
🔧 System Requirements
Critical Command-Line Tool: This module requires ImageMagick to be installed on your system for advanced image processing operations. Install using:
# Ubuntu/Debian
sudo apt-get install imagemagick
# macOS (with Homebrew)
brew install imagemagick
# CentOS/RHEL
sudo yum install ImageMagick
# Verify installation
convert --version
Note: The convert command from ImageMagick is used internally for image format conversions and batch processing operations.
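If you want to fail fast before a long rendering run, the following minimal pre-flight check (a hypothetical helper, not part of the module) verifies from Python that the convert binary is reachable:
import shutil

def check_imagemagick() -> None:
    """Raise early if ImageMagick's convert binary is not on PATH."""
    if shutil.which("convert") is None:
        raise RuntimeError("ImageMagick 'convert' not found; install it before rendering videos.")

check_imagemagick()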
🎯 Core Functionality
🎬 PoseRenderer
Module Name: GBC.utils.data_preparation.render_track_videos.PoseRenderer
Definition:
class PoseRenderer:
    def __init__(self,
                 urdf_path: str,
                 dataset_path: str,
                 mapping_table: Dict[str, str],
                 smplh_model_path: str,
                 dmpls_model_path: str,
                 poseformer_model_path: str,
                 save_path: str,
                 secondary_dir: Optional[str] = None,
                 device: str = "cuda",
                 max_single_batch: int = 128,
                 without_model: bool = False,
                 smpl_fits_dir: Optional[str] = None
                ):
📥 Initialization Parameters:
- urdf_path (str): Path to robot URDF description file
- dataset_path (str): Root directory of AMASS motion capture dataset
- mapping_table (Dict[str, str]): SMPL+H to robot joint mapping (see create_smplh.md and the illustrative example after this list)
- smplh_model_path (str): Path to SMPL+H human body model file
- dmpls_model_path (str): Path to DMPL muscle model file
- poseformer_model_path (str): Path to trained PoseTransformer model
- save_path (str): Output directory for rendered videos
- secondary_dir (Optional[str]): Specific AMASS subdirectory to process
- device (str): Computing device ("cuda" or "cpu")
- max_single_batch (int): Maximum frames processed in single batch (memory management)
- without_model (bool): Skip loading PoseTransformer model (for pre-computed actions)
- smpl_fits_dir (Optional[str]): Path to fitted SMPL+H parameters (.pt file)
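For reference, the mapping table is a plain dictionary from SMPL+H joint names to robot joint names. The snippet below is purely illustrative; all joint names are placeholders, and the authoritative naming conventions come from create_smplh.md and your robot's URDF:
your_mapping_table = {
    # SMPL+H joint name : robot URDF joint name (all names here are placeholders)
    "left_hip": "left_hip_pitch_joint",
    "left_knee": "left_knee_joint",
    "right_hip": "right_hip_pitch_joint",
    "right_knee": "right_knee_joint",
    # ... one entry per SMPL+H joint the robot should track
}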
🔧 Core Features:
- Side-by-Side Comparison: Human motion vs robot motion visualization
- Batch Processing: Efficient handling of large motion sequences
- Quality Assessment: Visual validation of retargeting accuracy
- Filtering Integration: Built-in motion smoothing and post-processing
- Multi-Format Output: Individual frames and compiled videos
🚀 Primary Rendering Methods
🎯 Complete Motion Rendering
Method Signature:
@torch.no_grad()
def render_pose(self, pose: torch.Tensor, save_path: str, fps: torch.Tensor, name: str)
📥 Input Parameters:
- pose (torch.Tensor): Human pose sequence [1, num_frames, 66] (SMPL+H axis-angle format; see the layout sketch after this list)
- save_path (str): Output directory for this specific motion sequence
- fps (torch.Tensor): Frame rate for video generation
- name (str): Motion sequence identifier
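The 66 values per frame follow the SMPL+H axis-angle convention assumed throughout this page: 3 root-orientation values followed by 21 body joints × 3 = 63 values (hand joints excluded). A quick sketch of the slicing used by later examples:
import torch

num_frames = 300
pose = torch.zeros(1, num_frames, 66)  # [batch, frames, 3 + 21*3]
root_orient = pose[..., :3]            # global root orientation (axis-angle)
pose_body = pose[..., 3:66]            # 21 body joints; this is the slice fed to the PoseTransformer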
🔧 Processing Pipeline:
- Motion Filtering: Apply temporal smoothing to robot actions
- Batch Processing: Split long sequences into chunks of at most max_single_batch frames for memory efficiency (see the sketch after this list)
- Dual Rendering: Generate human and robot visualizations
- Video Compilation: Create side-by-side comparison video
- Cleanup: Remove temporary frame directories
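As a rough illustration of the batching step (a hedged sketch; the module's internal bookkeeping may differ), a long sequence can be processed in chunks of at most max_single_batch frames, with renderer referring to a constructed PoseRenderer as in the example below:
pose_flat = pose.squeeze(0)                            # pose: [1, num_frames, 66]
for chunk in torch.split(pose_flat, renderer.max_single_batch, dim=0):
    actions = renderer.poseformer(chunk[:, 3:66])      # robot actions for this chunk
    # ... each chunk is rendered separately; frames are later compiled into one video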
💡 Usage Example:
from GBC.utils.data_preparation.render_track_videos import PoseRenderer
# Initialize renderer
renderer = PoseRenderer(
    urdf_path="/path/to/robot.urdf",
    dataset_path="/path/to/AMASS",
    mapping_table=your_mapping_table,
    smplh_model_path="/path/to/smplh/model.npz",
    dmpls_model_path="/path/to/dmpls/model.npz", 
    poseformer_model_path="/path/to/trained_model.pt",
    save_path="/output/videos",
    device="cuda"
)
# Render motion sequence
pose_sequence = torch.randn(1, 300, 66)  # 300 frames
fps = torch.tensor(30.0)
renderer.render_pose(pose_sequence, "/output/walking_motion", fps, "walking")
🎬 Pre-Computed Action Rendering
Method Signature:
def render_pose_without_model(self, pose: torch.Tensor, action: torch.Tensor, save_path: str, fps: int)
📥 Input Parameters:
- pose (torch.Tensor): Human pose sequence for visualization
- action (torch.Tensor): Pre-computed robot actions
- save_path (str): Output directory
- fps (int): Video frame rate
💡 Usage Scenario: When robot actions are already computed and you want to generate comparison videos without re-running inference.
# Use pre-computed actions
human_poses = torch.load("human_motion.pt")
robot_actions = torch.load("robot_actions.pt")
renderer.render_pose_without_model(
    pose=human_poses,
    action=robot_actions,
    save_path="/output/validation",
    fps=30
)
🤖 Robot-Only Rendering
Method Signature:
def render_pose_without_model_urdf(self, action: torch.Tensor, save_path: str, fps: int, name: str = "video")
📥 Input Parameters:
- action (torch.Tensor): Robot action sequence
- save_path (str): Output directory
- fps (int): Video frame rate
- name (str): Output video filename (without extension)
💡 Usage Scenario: Generate robot-only motion videos for demonstrations or presentations.
# Robot-only visualization
robot_actions = torch.load("generated_actions.pt")
renderer.render_pose_without_model_urdf(
    action=robot_actions,
    save_path="/output/robot_demo",
    fps=30,
    name="robot_walking"
)
🔧 Internal Rendering Components
🧑‍🎨 Human Body Rendering
Method Signature:
def render_pose_amass(self, pose: torch.Tensor, face: torch.Tensor, save_path: str, 
                     width: int = 800, height: int = 600, update_mv: bool = False)
🔧 Implementation Details:
def render_pose_amass(self, pose, face, save_path, width=800, height=600, update_mv=False):
    """Render SMPL+H human body mesh"""
    if not hasattr(self, 'mv') or update_mv:
        self.mv = MeshViewer(width=width, height=height, use_offscreen=True)
    
    # Create trimesh from SMPL+H vertices and faces
    body_mesh = trimesh.Trimesh(
        vertices=pose.detach().cpu().numpy(),
        faces=face.detach().cpu().numpy(),
        vertex_colors=np.tile(colors['grey'], (6890, 1))
    )
    
    # Render to image
    self.mv.set_static_meshes([body_mesh])
    body_image = self.mv.render(render_wireframe=False)
    cv2.imwrite(save_path, body_image)
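A minimal single-frame usage sketch, assuming get_pose_amass returns an SMPL+H body-model output exposing .v vertices and .f faces (the same access pattern used in the frame-analysis example later on this page):
# Render the human mesh for frame 0 of a [num_frames, 66] pose tensor
body_params = renderer.get_pose_amass(human_pose[0:1])
renderer.render_pose_amass(body_params.v[0], body_params.f, "/output/frames/human_0000.png")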
🤖 Robot Rendering
Method Signature:
def render_pose_robot(self, action: torch.Tensor, save_path: str, root_tf: Optional[torch.Tensor] = None)
🔧 Implementation Details:
def render_pose_robot(self, action, save_path, root_tf=None):
    """Render robot in specified joint configuration"""
    # Apply coordinate transformation for consistent viewpoint
    robot_vis_tf = torch.zeros((4, 4))
    robot_vis_tf[range(4), [1, 2, 0, 3]] = 1  # Y→X, Z→Y, X→Z transformation
    
    # Generate robot visualization (fall back to the default transform when no
    # root transform is given; `or` is ambiguous for multi-element tensors)
    img = self.robot_vis(action, root_tf if root_tf is not None else robot_vis_tf)
    cv2.imwrite(save_path, img)
🎞️ Video Compilation
Video Generation Pipeline:
# 1. Collect all rendered frames (sorted so zero-padded filenames play back in temporal order)
bm_images = sorted(glob(os.path.join(bm_save_path, "*.png")))
fk_images = sorted(glob(os.path.join(fk_save_path, "*.png")))
# 2. Calculate optimal frame dimensions
bm_img = cv2.imread(bm_images[0])
fk_img = cv2.imread(fk_images[0])
h, w, _ = bm_img.shape
h_fk, w_fk, _ = fk_img.shape
w_new = int(w * h_fk / h)  # Scale to match height
framesize = (w_new + w_fk, h_fk)
# 3. Setup video writer
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter(video_name, fourcc, fps.item(), framesize)
# 4. Combine frames side-by-side
for bm_img_path, fk_img_path in zip(bm_images, fk_images):
    bm_img = cv2.imread(bm_img_path)
    fk_img = cv2.imread(fk_img_path)
    
    # Resize to match heights
    bm_img = cv2.resize(bm_img, (w_new, h_fk))
    
    # Horizontal concatenation
    combined_frame = np.hstack((bm_img, fk_img))
    out.write(combined_frame)
out.release()
💡 Practical Applications
🧪 Experimental Validation Workflow
Quality Assessment Pipeline:
import os
from torch.utils.data import DataLoader

def validate_retargeting_quality(dataset_dir, model_path, output_dir, mapping_table):
    """Comprehensive validation of retargeting quality"""
    
    # Initialize renderer with validation configuration
    renderer = PoseRenderer(
        urdf_path="/path/to/robot.urdf",
        dataset_path=dataset_dir,
        mapping_table=mapping_table,
        smplh_model_path="/path/to/smplh/model.npz",
        dmpls_model_path="/path/to/dmpls/model.npz",
        poseformer_model_path=model_path,
        save_path=output_dir,
        device="cuda",
        max_single_batch=64  # Conservative for stability
    )
    
    # Process validation dataset
    dataloader = DataLoader(renderer.dataset, batch_size=1, shuffle=False)
    
    for i, data in enumerate(dataloader):
        if i >= 10:  # Limit for quick validation
            break
            
        pose = data['poses'].to(renderer.fk.device)
        fps = data['fps']
        motion_name = data['title'][0]
        
        # Generate comparison video
        output_path = os.path.join(output_dir, f"validation_{i:03d}_{motion_name}")
        renderer.render_pose(pose, output_path, fps, motion_name)
        
        print(f"Rendered validation video: {motion_name}")
🔍 Debug and Analysis Tools
Frame-by-Frame Analysis:
def analyze_pose_transfer_frame(renderer, human_pose, robot_action, frame_idx, output_dir):
    """Detailed analysis of specific frame"""
    
    # Generate human body mesh
    body_params = renderer.get_pose_amass(human_pose[frame_idx:frame_idx+1])
    human_img_path = f"{output_dir}/human_frame_{frame_idx}.png"
    renderer.render_pose_amass(body_params.v[0], body_params.f, human_img_path)
    
    # Generate robot pose
    robot_img_path = f"{output_dir}/robot_frame_{frame_idx}.png"
    renderer.render_pose_robot(robot_action[frame_idx], robot_img_path)
    
    # Create side-by-side comparison
    human_img = cv2.imread(human_img_path)
    robot_img = cv2.imread(robot_img_path)
    
    # Resize to match heights and combine
    h_target = min(human_img.shape[0], robot_img.shape[0])
    human_resized = cv2.resize(human_img, (int(human_img.shape[1] * h_target / human_img.shape[0]), h_target))
    robot_resized = cv2.resize(robot_img, (int(robot_img.shape[1] * h_target / robot_img.shape[0]), h_target))
    
    comparison = np.hstack([human_resized, robot_resized])
    cv2.imwrite(f"{output_dir}/comparison_frame_{frame_idx}.png", comparison)
📊 Batch Validation for Training
Training Integration Example:
class PoseTransformerTrainer:
    def __init__(self, ..., validation_renderer=None):
        self.validation_renderer = validation_renderer
    
    def validate_with_videos(self, epoch, num_samples=3):
        """Generate validation videos during training"""
        if self.validation_renderer is None:
            return
            
        self.model.eval()
        with torch.no_grad():
            sample_data = next(iter(self.test_loader))
            poses = sample_data['poses'][:num_samples]
            
            for i, pose in enumerate(poses):
                # Generate robot actions
                pose_body = pose[:, 3:66]
                actions = self.model(pose_body)
                
                # Create validation video
                output_path = f"validation/epoch_{epoch}_sample_{i}"
                fps = sample_data['fps'][i] if i < len(sample_data['fps']) else 30
                
                self.validation_renderer.render_pose_without_model(
                    pose=pose.unsqueeze(0),
                    action=actions,
                    save_path=output_path,
                    fps=int(fps)
                )
🛠️ Configuration and Advanced Usage
⚙️ Memory Management
Large Sequence Handling:
# Configure for large motion sequences
renderer = PoseRenderer(
    # ... other parameters
    max_single_batch=32,  # Reduce batch size for memory-constrained systems
    device="cuda"         # Use GPU for faster processing
)
# Monitor memory usage during rendering
import psutil
import GPUtil
def monitor_resources():
    """Monitor system resources during rendering"""
    cpu_percent = psutil.cpu_percent()
    memory_percent = psutil.virtual_memory().percent
    
    try:
        gpus = GPUtil.getGPUs()
        gpu_memory = gpus[0].memoryUtil * 100 if gpus else 0
        print(f"CPU: {cpu_percent}%, RAM: {memory_percent}%, GPU: {gpu_memory}%")
    except Exception:
        print(f"CPU: {cpu_percent}%, RAM: {memory_percent}%")
🎨 Visual Quality Optimization
High-Quality Rendering Configuration:
# High-resolution rendering setup
renderer.robot_vis = RobotVisualizer(
    urdf_path=urdf_path,
    width=1920,   # Full HD width
    height=1080,  # Full HD height
    robot_transmission=0.0,  # Fully opaque robot
    use_offscreen=True,
    device=device
)
# Enhanced human body rendering
renderer.mv = MeshViewer(
    width=1920,
    height=1080,
    use_offscreen=True
)
🔄 Custom Post-Processing
Motion Filtering Integration:
from GBC.utils.data_preparation.amass_action_converter import TrackingDataPostProcess
from GBC.utils.data_preparation.data_preparation_cfg import FilterCfg
def setup_advanced_filtering(renderer, fps):
    """Configure advanced motion filtering"""
    filter_cfg = FilterCfg(
        filter_sample_rate=fps,
        filter_cutoff=fps / 8.0,  # Aggressive smoothing
        filter_order=4,           # Higher order for smoother results
        device=renderer.fk.device
    )
    
    return TrackingDataPostProcess(filter_cfg=filter_cfg)
# Usage in rendering
def render_with_custom_filtering(renderer, pose, save_path, fps, name):
    """Render with custom motion filtering"""
    filter_processor = setup_advanced_filtering(renderer, fps)
    
    # Apply filtering to predicted actions
    pose_body = pose.squeeze(0)[:, 3:66]
    actions = renderer.poseformer(pose_body)
    filtered_actions = filter_processor.filt(actions)
    
    # Render with filtered actions
    renderer.render_pose_without_model(
        pose=pose,
        action=filtered_actions,
        save_path=save_path,
        fps=int(fps)
    )
🚨 Best Practices
✅ Validation Guidelines
- Systematic Testing: Render validation videos for multiple motion types (walking, running, dancing)
- Quality Metrics: Visually assess end-effector alignment, joint smoothness, and overall pose similarity
- Temporal Consistency: Check for jittery motion or unrealistic transitions
- Physical Feasibility: Verify robot poses respect joint limits and kinematic constraints (see the sketch after this list)
- Comparative Analysis: Generate videos for different model versions to track improvement
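For the physical-feasibility check, a simple numeric sanity test can complement visual inspection. The helper below is a hypothetical sketch; it assumes you have per-DOF lower/upper joint-limit vectors (for example, read from the URDF):
import torch

def joint_limit_compliance(actions: torch.Tensor, lower: torch.Tensor, upper: torch.Tensor) -> float:
    """Fraction of frames in which every joint stays within its limits.

    actions: [num_frames, num_dofs]; lower/upper: [num_dofs] limit vectors.
    """
    within = (actions >= lower) & (actions <= upper)  # [num_frames, num_dofs]
    return within.all(dim=-1).float().mean().item()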
🔧 Performance Optimization
Efficient Batch Processing:
# Optimize for your system configuration
BATCH_SIZE_MAP = {
    "low_memory": 16,    # Systems with <8GB GPU memory
    "medium_memory": 64, # Systems with 8-16GB GPU memory  
    "high_memory": 128   # Systems with >16GB GPU memory
}
def auto_configure_batch_size():
    """Automatically configure batch size based on available GPU memory"""
    try:
        import GPUtil
        gpus = GPUtil.getGPUs()
        if gpus:
            gpu_memory = gpus[0].memoryTotal / 1024  # Convert to GB
            if gpu_memory < 8:
                return BATCH_SIZE_MAP["low_memory"]
            elif gpu_memory < 16:
                return BATCH_SIZE_MAP["medium_memory"]
            else:
                return BATCH_SIZE_MAP["high_memory"]
    except Exception:
        pass
    return BATCH_SIZE_MAP["low_memory"]  # Conservative default
🐛 Troubleshooting
Common Issues and Solutions:
- Missing ImageMagick:
  # Error: convert command not found
  sudo apt-get install imagemagick
- Memory Issues:
  # Reduce batch size
  renderer.max_single_batch = 16
  # Clear GPU cache
  torch.cuda.empty_cache()
- Video Corruption:
  # Ensure proper codec and frame dimensions
  fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # More compatible codec
  # Verify frame dimensions are even numbers
  framesize = (width - width % 2, height - height % 2)
- Rendering Artifacts:
  # Update mesh viewer between renders
  renderer.render_pose_amass(..., update_mv=True)
  # Ensure proper coordinate transformations
  robot_vis_tf = torch.zeros((4, 4))
  robot_vis_tf[range(4), [1, 2, 0, 3]] = 1
This powerful rendering framework serves as an essential validation tool, enabling researchers to visually assess retargeting quality and create compelling demonstration materials for human-to-robot motion transfer research.