
render_track_videos

Module: GBC.utils.data_preparation.render_track_videos

This module provides essential video rendering capabilities for experimental validation and quality assessment in human-to-robot motion retargeting. The PoseRenderer class generates side-by-side comparison videos showing original human motion alongside robot motion, serving as a critical tool for validating retargeting quality, debugging pose transfer issues, and creating presentation materials for research results.

📚 Dependencies

This module requires the following Python packages:

  • torch - PyTorch framework for neural networks
  • numpy - Numerical computing library
  • opencv-python (cv2) - Video processing and image manipulation
  • trimesh - 3D mesh processing
  • human_body_prior - SMPL+H body model
  • body_visualizer - Human body mesh visualization
  • scipy - Scientific computing (rotation mathematics)
  • tqdm - Progress bars for batch processing
  • Pillow (PIL) - Image processing
  • matplotlib - Plotting and visualization

🔧 System Requirements

Critical Command-Line Tool: This module requires ImageMagick to be installed on your system for advanced image processing operations. Install using:

# Ubuntu/Debian
sudo apt-get install imagemagick

# macOS (with Homebrew)
brew install imagemagick

# CentOS/RHEL
sudo yum install ImageMagick

# Verify installation
convert --version

Note: The convert command from ImageMagick is used internally for image format conversions and batch processing operations.
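
If you want to fail fast before a long rendering run, a small pre-flight check in Python (an illustrative snippet, not part of this module) can confirm that ImageMagick's convert binary is on the PATH:

import shutil
import subprocess

def check_imagemagick() -> bool:
    """Return True if ImageMagick's `convert` binary is available on PATH."""
    if shutil.which("convert") is None:
        print("ImageMagick not found: install it before rendering videos.")
        return False
    # Print the first line of the version string for debugging
    result = subprocess.run(["convert", "--version"], capture_output=True, text=True)
    print(result.stdout.splitlines()[0])
    return True

check_imagemagick()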


🎯 Core Functionality

🎬 PoseRenderer

Module Name: GBC.utils.data_preparation.render_track_videos.PoseRenderer

Definition:

class PoseRenderer:
    def __init__(self,
                 urdf_path: str,
                 dataset_path: str,
                 mapping_table: Dict[str, str],
                 smplh_model_path: str,
                 dmpls_model_path: str,
                 poseformer_model_path: str,
                 save_path: str,
                 secondary_dir: Optional[str] = None,
                 device: str = "cuda",
                 max_single_batch: int = 128,
                 without_model: bool = False,
                 smpl_fits_dir: Optional[str] = None
                 ):

📥 Initialization Parameters:

  • urdf_path (str): Path to robot URDF description file
  • dataset_path (str): Root directory of AMASS motion capture dataset
  • mapping_table (Dict[str, str]): SMPL+H to robot joint mapping (see create_smplh.md; a placeholder example follows this list)
  • smplh_model_path (str): Path to SMPL+H human body model file
  • dmpls_model_path (str): Path to DMPL muscle model file
  • poseformer_model_path (str): Path to trained PoseTransformer model
  • save_path (str): Output directory for rendered videos
  • secondary_dir (Optional[str]): Specific AMASS subdirectory to process
  • device (str): Computing device ("cuda" or "cpu")
  • max_single_batch (int): Maximum frames processed in single batch (memory management)
  • without_model (bool): Skip loading PoseTransformer model (for pre-computed actions)
  • smpl_fits_dir (Optional[str]): Path to fitted SMPL+H parameters (.pt file)
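
For orientation, the mapping table is simply a dictionary from SMPL+H joint names to robot joint names. The entries below are hypothetical placeholders; use the joint names from your own URDF and the conventions described in create_smplh.md:

# Hypothetical SMPL+H -> robot joint mapping (placeholder joint names)
your_mapping_table = {
    "L_Hip": "left_hip_pitch_joint",
    "R_Hip": "right_hip_pitch_joint",
    "L_Knee": "left_knee_joint",
    "R_Knee": "right_knee_joint",
    "L_Shoulder": "left_shoulder_pitch_joint",
    "R_Shoulder": "right_shoulder_pitch_joint",
    "L_Elbow": "left_elbow_joint",
    "R_Elbow": "right_elbow_joint",
}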

🔧 Core Features:

  • Side-by-Side Comparison: Human motion vs robot motion visualization
  • Batch Processing: Efficient handling of large motion sequences
  • Quality Assessment: Visual validation of retargeting accuracy
  • Filtering Integration: Built-in motion smoothing and post-processing
  • Multi-Format Output: Individual frames and compiled videos

🚀 Primary Rendering Methods

🎯 Complete Motion Rendering

Method Signature:

@torch.no_grad()
def render_pose(self, pose: torch.Tensor, save_path: str, fps: torch.Tensor, name: str)

📥 Input Parameters:

  • pose (torch.Tensor): Human pose sequence [1, num_frames, 66] (SMPL+H format)
  • save_path (str): Output directory for this specific motion sequence
  • fps (torch.Tensor): Frame rate for video generation
  • name (str): Motion sequence identifier

🔧 Processing Pipeline:

  1. Motion Filtering: Apply temporal smoothing to robot actions
  2. Batch Processing: Split large sequences for memory efficiency
  3. Dual Rendering: Generate human and robot visualizations
  4. Video Compilation: Create side-by-side comparison video
  5. Cleanup: Remove temporary frame directories

💡 Usage Example:

import torch
from GBC.utils.data_preparation.render_track_videos import PoseRenderer

# Initialize renderer
renderer = PoseRenderer(
    urdf_path="/path/to/robot.urdf",
    dataset_path="/path/to/AMASS",
    mapping_table=your_mapping_table,
    smplh_model_path="/path/to/smplh/model.npz",
    dmpls_model_path="/path/to/dmpls/model.npz",
    poseformer_model_path="/path/to/trained_model.pt",
    save_path="/output/videos",
    device="cuda"
)

# Render motion sequence
pose_sequence = torch.randn(1, 300, 66) # 300 frames
fps = torch.tensor(30.0)
renderer.render_pose(pose_sequence, "/output/walking_motion", fps, "walking")

🎬 Pre-Computed Action Rendering

Method Signature:

def render_pose_without_model(self, pose: torch.Tensor, action: torch.Tensor, save_path: str, fps: int)

📥 Input Parameters:

  • pose (torch.Tensor): Human pose sequence for visualization
  • action (torch.Tensor): Pre-computed robot actions
  • save_path (str): Output directory
  • fps (int): Video frame rate

💡 Usage Scenario: When robot actions are already computed and you want to generate comparison videos without re-running inference.

# Use pre-computed actions
human_poses = torch.load("human_motion.pt")
robot_actions = torch.load("robot_actions.pt")

renderer.render_pose_without_model(
    pose=human_poses,
    action=robot_actions,
    save_path="/output/validation",
    fps=30
)

🤖 Robot-Only Rendering

Method Signature:

def render_pose_without_model_urdf(self, action: torch.Tensor, save_path: str, fps: int, name: str = "video")

📥 Input Parameters:

  • action (torch.Tensor): Robot action sequence
  • save_path (str): Output directory
  • fps (int): Video frame rate
  • name (str): Output video filename (without extension)

💡 Usage Scenario: Generate robot-only motion videos for demonstrations or presentations.

# Robot-only visualization
robot_actions = torch.load("generated_actions.pt")
renderer.render_pose_without_model_urdf(
    action=robot_actions,
    save_path="/output/robot_demo",
    fps=30,
    name="robot_walking"
)

🔧 Internal Rendering Components

🧑‍🎨 Human Body Rendering

Method Signature:

def render_pose_amass(self, pose: torch.Tensor, face: torch.Tensor, save_path: str,
                      width: int = 800, height: int = 600, update_mv: bool = False)

🔧 Implementation Details:

def render_pose_amass(self, pose, face, save_path, width=800, height=600, update_mv=False):
    """Render SMPL+H human body mesh"""
    if not hasattr(self, 'mv') or update_mv:
        self.mv = MeshViewer(width=width, height=height, use_offscreen=True)

    # Create trimesh from SMPL+H vertices and faces
    body_mesh = trimesh.Trimesh(
        vertices=pose.detach().cpu().numpy(),
        faces=face.detach().cpu().numpy(),
        vertex_colors=np.tile(colors['grey'], (6890, 1))
    )

    # Render to image
    self.mv.set_static_meshes([body_mesh])
    body_image = self.mv.render(render_wireframe=False)
    cv2.imwrite(save_path, body_image)

🤖 Robot Rendering

Method Signature:

def render_pose_robot(self, action: torch.Tensor, save_path: str, root_tf: Optional[torch.Tensor] = None)

🔧 Implementation Details:

def render_pose_robot(self, action, save_path, root_tf=None):
    """Render robot in specified joint configuration"""
    # Apply coordinate transformation for consistent viewpoint
    robot_vis_tf = torch.zeros((4, 4))
    robot_vis_tf[range(4), [1, 2, 0, 3]] = 1  # Y→X, Z→Y, X→Z transformation

    # Generate robot visualization (fall back to the default transform if none is given)
    img = self.robot_vis(action, root_tf if root_tf is not None else robot_vis_tf)
    cv2.imwrite(save_path, img)

🎞️ Video Compilation

Video Generation Pipeline:

# 1. Collect all rendered frames
bm_images = sorted(glob(os.path.join(bm_save_path, "*.png")))  # sort to keep temporal order
fk_images = sorted(glob(os.path.join(fk_save_path, "*.png")))

# 2. Calculate optimal frame dimensions
bm_img = cv2.imread(bm_images[0])
fk_img = cv2.imread(fk_images[0])
h, w, _ = bm_img.shape
h_fk, w_fk, _ = fk_img.shape
w_new = int(w * h_fk / h) # Scale to match height
framesize = (w_new + w_fk, h_fk)

# 3. Setup video writer
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter(video_name, fourcc, fps.item(), framesize)

# 4. Combine frames side-by-side
for bm_img_path, fk_img_path in zip(bm_images, fk_images):
    bm_img = cv2.imread(bm_img_path)
    fk_img = cv2.imread(fk_img_path)

    # Resize to match heights
    bm_img = cv2.resize(bm_img, (w_new, h_fk))

    # Horizontal concatenation
    combined_frame = np.hstack((bm_img, fk_img))
    out.write(combined_frame)

out.release()

💡 Practical Applications

🧪 Experimental Validation Workflow

Quality Assessment Pipeline:

import os
from torch.utils.data import DataLoader

def validate_retargeting_quality(dataset_dir, model_path, output_dir):
    """Comprehensive validation of retargeting quality"""

    # Initialize renderer with validation configuration
    renderer = PoseRenderer(
        urdf_path="/path/to/robot.urdf",
        dataset_path=dataset_dir,
        mapping_table=mapping_table,  # defined as in the initialization example above
        smplh_model_path="/path/to/smplh/model.npz",
        dmpls_model_path="/path/to/dmpls/model.npz",
        poseformer_model_path=model_path,
        save_path=output_dir,
        device="cuda",
        max_single_batch=64  # Conservative for stability
    )

    # Process validation dataset
    dataloader = DataLoader(renderer.dataset, batch_size=1, shuffle=False)

    for i, data in enumerate(dataloader):
        if i >= 10:  # Limit for quick validation
            break

        pose = data['poses'].to(renderer.fk.device)
        fps = data['fps']
        motion_name = data['title'][0]

        # Generate comparison video
        output_path = os.path.join(output_dir, f"validation_{i:03d}_{motion_name}")
        renderer.render_pose(pose, output_path, fps, motion_name)

        print(f"Rendered validation video: {motion_name}")

๐Ÿ” Debug and Analysis Toolsโ€‹

Frame-by-Frame Analysis:

def analyze_pose_transfer_frame(renderer, human_pose, robot_action, frame_idx, output_dir):
    """Detailed analysis of specific frame"""

    # Generate human body mesh
    body_params = renderer.get_pose_amass(human_pose[frame_idx:frame_idx+1])
    human_img_path = f"{output_dir}/human_frame_{frame_idx}.png"
    renderer.render_pose_amass(body_params.v[0], body_params.f, human_img_path)

    # Generate robot pose
    robot_img_path = f"{output_dir}/robot_frame_{frame_idx}.png"
    renderer.render_pose_robot(robot_action[frame_idx], robot_img_path)

    # Create side-by-side comparison
    human_img = cv2.imread(human_img_path)
    robot_img = cv2.imread(robot_img_path)

    # Resize to match heights and combine
    h_target = min(human_img.shape[0], robot_img.shape[0])
    human_resized = cv2.resize(human_img, (int(human_img.shape[1] * h_target / human_img.shape[0]), h_target))
    robot_resized = cv2.resize(robot_img, (int(robot_img.shape[1] * h_target / robot_img.shape[0]), h_target))

    comparison = np.hstack([human_resized, robot_resized])
    cv2.imwrite(f"{output_dir}/comparison_frame_{frame_idx}.png", comparison)

📊 Batch Validation for Training

Training Integration Example:

class PoseTransformerTrainer:
    def __init__(self, ..., validation_renderer=None):
        self.validation_renderer = validation_renderer

    def validate_with_videos(self, epoch, num_samples=3):
        """Generate validation videos during training"""
        if self.validation_renderer is None:
            return

        self.model.eval()
        with torch.no_grad():
            sample_data = next(iter(self.test_loader))
            poses = sample_data['poses'][:num_samples]

            for i, pose in enumerate(poses):
                # Generate robot actions
                pose_body = pose[:, 3:66]
                actions = self.model(pose_body)

                # Create validation video
                output_path = f"validation/epoch_{epoch}_sample_{i}"
                fps = sample_data['fps'][i] if i < len(sample_data['fps']) else 30

                self.validation_renderer.render_pose_without_model(
                    pose=pose.unsqueeze(0),
                    action=actions,
                    save_path=output_path,
                    fps=int(fps)
                )

🛠️ Configuration and Advanced Usage

⚙️ Memory Management

Large Sequence Handling:

# Configure for large motion sequences
renderer = PoseRenderer(
    # ... other parameters
    max_single_batch=32,  # Reduce batch size for memory-constrained systems
    device="cuda"         # Use GPU for faster processing
)

# Monitor memory usage during rendering
import psutil
import GPUtil

def monitor_resources():
    """Monitor system resources during rendering"""
    cpu_percent = psutil.cpu_percent()
    memory_percent = psutil.virtual_memory().percent

    try:
        gpus = GPUtil.getGPUs()
        gpu_memory = gpus[0].memoryUtil * 100 if gpus else 0
        print(f"CPU: {cpu_percent}%, RAM: {memory_percent}%, GPU: {gpu_memory}%")
    except Exception:
        print(f"CPU: {cpu_percent}%, RAM: {memory_percent}%")

🎨 Visual Quality Optimization

High-Quality Rendering Configuration:

# High-resolution rendering setup
renderer.robot_vis = RobotVisualizer(
    urdf_path=urdf_path,
    width=1920,              # Full HD width
    height=1080,             # Full HD height
    robot_transmission=0.0,  # Fully opaque robot
    use_offscreen=True,
    device=device
)

# Enhanced human body rendering
renderer.mv = MeshViewer(
    width=1920,
    height=1080,
    use_offscreen=True
)

🔄 Custom Post-Processing

Motion Filtering Integration:

from GBC.utils.data_preparation.amass_action_converter import TrackingDataPostProcess
from GBC.utils.data_preparation.data_preparation_cfg import FilterCfg

def setup_advanced_filtering(renderer, fps):
    """Configure advanced motion filtering"""
    filter_cfg = FilterCfg(
        filter_sample_rate=fps,
        filter_cutoff=fps / 8.0,  # Aggressive smoothing
        filter_order=4,           # Higher order for smoother results
        device=renderer.fk.device
    )

    return TrackingDataPostProcess(filter_cfg=filter_cfg)

# Usage in rendering
def render_with_custom_filtering(renderer, pose, save_path, fps, name):
    """Render with custom motion filtering"""
    filter_processor = setup_advanced_filtering(renderer, fps)

    # Apply filtering to predicted actions
    pose_body = pose.squeeze(0)[:, 3:66]
    actions = renderer.poseformer(pose_body)
    filtered_actions = filter_processor.filt(actions)

    # Render with filtered actions
    renderer.render_pose_without_model(
        pose=pose,
        action=filtered_actions,
        save_path=save_path,
        fps=int(fps)
    )

🚨 Best Practices

✅ Validation Guidelines

  1. Systematic Testing: Render validation videos for multiple motion types (walking, running, dancing)
  2. Quality Metrics: Visually assess end-effector alignment, joint smoothness, and overall pose similarity
  3. Temporal Consistency: Check for jittery motion or unrealistic transitions
  4. Physical Feasibility: Verify robot poses respect joint limits and kinematic constraints (see the sketch after this list)
  5. Comparative Analysis: Generate videos for different model versions to track improvement
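
Joint-limit checks do not require rendering at all. The sketch below is an illustrative example rather than part of this module: it reads limits directly from the URDF with Python's standard XML parser and reports frames whose actions fall outside them, assuming you know the joint order used by your action tensor.

import xml.etree.ElementTree as ET
import torch

def load_joint_limits(urdf_path):
    """Parse lower/upper limits for every limited joint in a URDF file."""
    limits = {}
    for joint in ET.parse(urdf_path).getroot().findall("joint"):
        limit = joint.find("limit")
        if limit is not None and "lower" in limit.attrib and "upper" in limit.attrib:
            limits[joint.get("name")] = (float(limit.get("lower")), float(limit.get("upper")))
    return limits

def check_joint_limits(actions: torch.Tensor, joint_names, limits, tol=1e-3):
    """Report joints whose action values leave their URDF limit range."""
    for j, name in enumerate(joint_names):
        if name not in limits:
            continue
        lower, upper = limits[name]
        out_of_range = (actions[:, j] < lower - tol) | (actions[:, j] > upper + tol)
        if out_of_range.any():
            print(f"{name}: {int(out_of_range.sum())} frames outside [{lower:.2f}, {upper:.2f}]")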

🔧 Performance Optimization

Efficient Batch Processing:

# Optimize for your system configuration
BATCH_SIZE_MAP = {
    "low_memory": 16,     # Systems with <8GB GPU memory
    "medium_memory": 64,  # Systems with 8-16GB GPU memory
    "high_memory": 128    # Systems with >16GB GPU memory
}

def auto_configure_batch_size():
    """Automatically configure batch size based on available GPU memory"""
    try:
        import GPUtil
        gpus = GPUtil.getGPUs()
        if gpus:
            gpu_memory = gpus[0].memoryTotal / 1024  # Convert MB to GB
            if gpu_memory < 8:
                return BATCH_SIZE_MAP["low_memory"]
            elif gpu_memory < 16:
                return BATCH_SIZE_MAP["medium_memory"]
            else:
                return BATCH_SIZE_MAP["high_memory"]
    except Exception:
        pass
    return BATCH_SIZE_MAP["low_memory"]  # Conservative default

๐Ÿ› Troubleshootingโ€‹

Common Issues and Solutions:

  1. Missing ImageMagick:

    # Error: convert command not found
    sudo apt-get install imagemagick
  2. Memory Issues:

    # Reduce batch size
    renderer.max_single_batch = 16

    # Clear GPU cache
    torch.cuda.empty_cache()
  3. Video Corruption:

    # Ensure proper codec and frame dimensions
    fourcc = cv2.VideoWriter_fourcc(*'mp4v') # More compatible codec
    # Verify frame dimensions are even numbers
    framesize = (width - width % 2, height - height % 2)
  4. Rendering Artifacts:

    # Update mesh viewer between renders
    renderer.render_pose_amass(..., update_mv=True)

    # Ensure proper coordinate transformations
    robot_vis_tf = torch.zeros((4, 4))
    robot_vis_tf[range(4), [1, 2, 0, 3]] = 1

This powerful rendering framework serves as an essential validation tool, enabling researchers to visually assess retargeting quality and create compelling demonstration materials for human-to-robot motion transfer research.