
render_track_videos

Module: GBC.utils.data_preparation.render_track_videos

This module provides essential video rendering capabilities for experimental validation and quality assessment in human-to-robot motion retargeting. The PoseRenderer class generates side-by-side comparison videos showing original human motion alongside robot motion, serving as a critical tool for validating retargeting quality, debugging pose transfer issues, and creating presentation materials for research results.

📚 Dependencies

This module requires the following Python packages:

  • torch - PyTorch framework for neural networks
  • numpy - Numerical computing library
  • opencv-python (cv2) - Video processing and image manipulation
  • trimesh - 3D mesh processing
  • human_body_prior - SMPL+H body model
  • body_visualizer - Human body mesh visualization
  • scipy - Scientific computing (rotation mathematics)
  • tqdm - Progress bars for batch processing
  • Pillow (PIL) - Image processing
  • matplotlib - Plotting and visualization

🔧 System Requirements

Critical Command-Line Tool: This module requires ImageMagick to be installed on your system for advanced image processing operations. Install using:

# Ubuntu/Debian
sudo apt-get install imagemagick

# macOS (with Homebrew)
brew install imagemagick

# CentOS/RHEL
sudo yum install ImageMagick

# Verify installation
convert --version

Note: The convert command from ImageMagick is used internally for image format conversions and batch processing operations.
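
If you want to fail fast before a long rendering run, a small pre-flight check in Python (an illustrative snippet, not part of this module) can confirm that ImageMagick's convert binary is on the PATH:

import shutil
import subprocess

def check_imagemagick() -> bool:
    """Return True if ImageMagick's `convert` binary is available on PATH."""
    if shutil.which("convert") is None:
        print("ImageMagick not found: install it before rendering videos.")
        return False
    # Print the first line of the version string for debugging
    result = subprocess.run(["convert", "--version"], capture_output=True, text=True)
    print(result.stdout.splitlines()[0])
    return True

check_imagemagick()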


🎯 Core Functionality

🎬 PoseRenderer

Module Name: GBC.utils.data_preparation.render_track_videos.PoseRenderer

Definition:

class PoseRenderer:
    def __init__(self,
                 urdf_path: str,
                 dataset_path: str,
                 mapping_table: Dict[str, str],
                 smplh_model_path: str,
                 dmpls_model_path: str,
                 poseformer_model_path: str,
                 save_path: str,
                 secondary_dir: Optional[str] = None,
                 device: str = "cuda",
                 max_single_batch: int = 128,
                 without_model: bool = False,
                 smpl_fits_dir: Optional[str] = None
                 ):

📥 Initialization Parameters:

  • urdf_path (str): Path to robot URDF description file
  • dataset_path (str): Root directory of AMASS motion capture dataset
  • mapping_table (Dict[str, str]): SMPL+H to robot joint mapping (see create_smplh.md; a placeholder example follows this list)
  • smplh_model_path (str): Path to SMPL+H human body model file
  • dmpls_model_path (str): Path to DMPL muscle model file
  • poseformer_model_path (str): Path to trained PoseTransformer model
  • save_path (str): Output directory for rendered videos
  • secondary_dir (Optional[str]): Specific AMASS subdirectory to process
  • device (str): Computing device ("cuda" or "cpu")
  • max_single_batch (int): Maximum frames processed in single batch (memory management)
  • without_model (bool): Skip loading PoseTransformer model (for pre-computed actions)
  • smpl_fits_dir (Optional[str]): Path to fitted SMPL+H parameters (.pt file)
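
For orientation, the mapping table is simply a dictionary from SMPL+H joint names to robot joint names. The entries below are hypothetical placeholders; use the joint names from your own URDF and the conventions described in create_smplh.md:

# Hypothetical SMPL+H -> robot joint mapping (placeholder joint names)
your_mapping_table = {
    "L_Hip": "left_hip_pitch_joint",
    "R_Hip": "right_hip_pitch_joint",
    "L_Knee": "left_knee_joint",
    "R_Knee": "right_knee_joint",
    "L_Shoulder": "left_shoulder_pitch_joint",
    "R_Shoulder": "right_shoulder_pitch_joint",
    "L_Elbow": "left_elbow_joint",
    "R_Elbow": "right_elbow_joint",
}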

🔧 Core Features:

  • Side-by-Side Comparison: Human motion vs robot motion visualization
  • Batch Processing: Efficient handling of large motion sequences
  • Quality Assessment: Visual validation of retargeting accuracy
  • Filtering Integration: Built-in motion smoothing and post-processing
  • Multi-Format Output: Individual frames and compiled videos

🚀 Primary Rendering Methods

🎯 Complete Motion Rendering

Method Signature:

@torch.no_grad()
def render_pose(self, pose: torch.Tensor, save_path: str, fps: torch.Tensor, name: str)

📥 Input Parameters:

  • pose (torch.Tensor): Human pose sequence [1, num_frames, 66] (SMPL+H format)
  • save_path (str): Output directory for this specific motion sequence
  • fps (torch.Tensor): Frame rate for video generation
  • name (str): Motion sequence identifier

🔧 Processing Pipeline:

  1. Motion Filtering: Apply temporal smoothing to robot actions
  2. Batch Processing: Split large sequences for memory efficiency
  3. Dual Rendering: Generate human and robot visualizations
  4. Video Compilation: Create side-by-side comparison video
  5. Cleanup: Remove temporary frame directories

💡 Usage Example:

import torch
from GBC.utils.data_preparation.render_track_videos import PoseRenderer

# Initialize renderer
renderer = PoseRenderer(
    urdf_path="/path/to/robot.urdf",
    dataset_path="/path/to/AMASS",
    mapping_table=your_mapping_table,
    smplh_model_path="/path/to/smplh/model.npz",
    dmpls_model_path="/path/to/dmpls/model.npz",
    poseformer_model_path="/path/to/trained_model.pt",
    save_path="/output/videos",
    device="cuda"
)

# Render motion sequence
pose_sequence = torch.randn(1, 300, 66) # 300 frames
fps = torch.tensor(30.0)
renderer.render_pose(pose_sequence, "/output/walking_motion", fps, "walking")

🎬 Pre-Computed Action Rendering

Method Signature:

def render_pose_without_model(self, pose: torch.Tensor, action: torch.Tensor, save_path: str, fps: int)

📥 Input Parameters:

  • pose (torch.Tensor): Human pose sequence for visualization
  • action (torch.Tensor): Pre-computed robot actions
  • save_path (str): Output directory
  • fps (int): Video frame rate

💡 Usage Scenario: When robot actions are already computed and you want to generate comparison videos without re-running inference.

# Use pre-computed actions
human_poses = torch.load("human_motion.pt")
robot_actions = torch.load("robot_actions.pt")

renderer.render_pose_without_model(
    pose=human_poses,
    action=robot_actions,
    save_path="/output/validation",
    fps=30
)

🤖 Robot-Only Rendering

Method Signature:

def render_pose_without_model_urdf(self, action: torch.Tensor, save_path: str, fps: int, name: str = "video")

📥 Input Parameters:

  • action (torch.Tensor): Robot action sequence
  • save_path (str): Output directory
  • fps (int): Video frame rate
  • name (str): Output video filename (without extension)

💡 Usage Scenario: Generate robot-only motion videos for demonstrations or presentations.

# Robot-only visualization
robot_actions = torch.load("generated_actions.pt")
renderer.render_pose_without_model_urdf(
    action=robot_actions,
    save_path="/output/robot_demo",
    fps=30,
    name="robot_walking"
)

🔧 Internal Rendering Components

🧑‍🎨 Human Body Rendering

Method Signature:

def render_pose_amass(self, pose: torch.Tensor, face: torch.Tensor, save_path: str,
                      width: int = 800, height: int = 600, update_mv: bool = False)

🔧 Implementation Details:

def render_pose_amass(self, pose, face, save_path, width=800, height=600, update_mv=False):
    """Render SMPL+H human body mesh"""
    if not hasattr(self, 'mv') or update_mv:
        self.mv = MeshViewer(width=width, height=height, use_offscreen=True)

    # Create trimesh from SMPL+H vertices and faces
    body_mesh = trimesh.Trimesh(
        vertices=pose.detach().cpu().numpy(),
        faces=face.detach().cpu().numpy(),
        vertex_colors=np.tile(colors['grey'], (6890, 1))
    )

    # Render to image
    self.mv.set_static_meshes([body_mesh])
    body_image = self.mv.render(render_wireframe=False)
    cv2.imwrite(save_path, body_image)

🤖 Robot Rendering

Method Signature:

def render_pose_robot(self, action: torch.Tensor, save_path: str, root_tf: Optional[torch.Tensor] = None)

🔧 Implementation Details:

def render_pose_robot(self, action, save_path, root_tf=None):
    """Render robot in specified joint configuration"""
    # Apply coordinate transformation for consistent viewpoint
    robot_vis_tf = torch.zeros((4, 4))
    robot_vis_tf[range(4), [1, 2, 0, 3]] = 1  # Y→X, Z→Y, X→Z transformation

    # Generate robot visualization (fall back to the default transform if none is given)
    img = self.robot_vis(action, root_tf if root_tf is not None else robot_vis_tf)
    cv2.imwrite(save_path, img)

🎞️ Video Compilation

Video Generation Pipeline:

# 1. Collect all rendered frames
bm_images = sorted(glob(os.path.join(bm_save_path, "*.png")))  # sort to keep temporal order
fk_images = sorted(glob(os.path.join(fk_save_path, "*.png")))

# 2. Calculate optimal frame dimensions
bm_img = cv2.imread(bm_images[0])
fk_img = cv2.imread(fk_images[0])
h, w, _ = bm_img.shape
h_fk, w_fk, _ = fk_img.shape
w_new = int(w * h_fk / h) # Scale to match height
framesize = (w_new + w_fk, h_fk)

# 3. Setup video writer
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter(video_name, fourcc, fps.item(), framesize)

# 4. Combine frames side-by-side
for bm_img_path, fk_img_path in zip(bm_images, fk_images):
    bm_img = cv2.imread(bm_img_path)
    fk_img = cv2.imread(fk_img_path)

    # Resize to match heights
    bm_img = cv2.resize(bm_img, (w_new, h_fk))

    # Horizontal concatenation
    combined_frame = np.hstack((bm_img, fk_img))
    out.write(combined_frame)

out.release()

💡 Practical Applications

🧪 Experimental Validation Workflow

Quality Assessment Pipeline:

import os
from torch.utils.data import DataLoader

def validate_retargeting_quality(dataset_dir, model_path, output_dir):
    """Comprehensive validation of retargeting quality"""

    # Initialize renderer with validation configuration
    renderer = PoseRenderer(
        urdf_path="/path/to/robot.urdf",
        dataset_path=dataset_dir,
        mapping_table=mapping_table,  # defined as in the initialization example above
        smplh_model_path="/path/to/smplh/model.npz",
        dmpls_model_path="/path/to/dmpls/model.npz",
        poseformer_model_path=model_path,
        save_path=output_dir,
        device="cuda",
        max_single_batch=64  # Conservative for stability
    )

    # Process validation dataset
    dataloader = DataLoader(renderer.dataset, batch_size=1, shuffle=False)

    for i, data in enumerate(dataloader):
        if i >= 10:  # Limit for quick validation
            break

        pose = data['poses'].to(renderer.fk.device)
        fps = data['fps']
        motion_name = data['title'][0]

        # Generate comparison video
        output_path = os.path.join(output_dir, f"validation_{i:03d}_{motion_name}")
        renderer.render_pose(pose, output_path, fps, motion_name)

        print(f"Rendered validation video: {motion_name}")

๐Ÿ” Debug and Analysis Toolsโ€‹

Frame-by-Frame Analysis:

def analyze_pose_transfer_frame(renderer, human_pose, robot_action, frame_idx, output_dir):
    """Detailed analysis of specific frame"""

    # Generate human body mesh
    body_params = renderer.get_pose_amass(human_pose[frame_idx:frame_idx+1])
    human_img_path = f"{output_dir}/human_frame_{frame_idx}.png"
    renderer.render_pose_amass(body_params.v[0], body_params.f, human_img_path)

    # Generate robot pose
    robot_img_path = f"{output_dir}/robot_frame_{frame_idx}.png"
    renderer.render_pose_robot(robot_action[frame_idx], robot_img_path)

    # Create side-by-side comparison
    human_img = cv2.imread(human_img_path)
    robot_img = cv2.imread(robot_img_path)

    # Resize to match heights and combine
    h_target = min(human_img.shape[0], robot_img.shape[0])
    human_resized = cv2.resize(human_img, (int(human_img.shape[1] * h_target / human_img.shape[0]), h_target))
    robot_resized = cv2.resize(robot_img, (int(robot_img.shape[1] * h_target / robot_img.shape[0]), h_target))

    comparison = np.hstack([human_resized, robot_resized])
    cv2.imwrite(f"{output_dir}/comparison_frame_{frame_idx}.png", comparison)

📊 Batch Validation for Training

Training Integration Example:

class PoseTransformerTrainer:
    def __init__(self, ..., validation_renderer=None):
        self.validation_renderer = validation_renderer

    def validate_with_videos(self, epoch, num_samples=3):
        """Generate validation videos during training"""
        if self.validation_renderer is None:
            return

        self.model.eval()
        with torch.no_grad():
            sample_data = next(iter(self.test_loader))
            poses = sample_data['poses'][:num_samples]

            for i, pose in enumerate(poses):
                # Generate robot actions
                pose_body = pose[:, 3:66]
                actions = self.model(pose_body)

                # Create validation video
                output_path = f"validation/epoch_{epoch}_sample_{i}"
                fps = sample_data['fps'][i] if i < len(sample_data['fps']) else 30

                self.validation_renderer.render_pose_without_model(
                    pose=pose.unsqueeze(0),
                    action=actions,
                    save_path=output_path,
                    fps=int(fps)
                )

🛠️ Configuration and Advanced Usage

⚙️ Memory Management

Large Sequence Handling:

# Configure for large motion sequences
renderer = PoseRenderer(
    # ... other parameters
    max_single_batch=32,  # Reduce batch size for memory-constrained systems
    device="cuda"         # Use GPU for faster processing
)

# Monitor memory usage during rendering
import psutil
import GPUtil

def monitor_resources():
    """Monitor system resources during rendering"""
    cpu_percent = psutil.cpu_percent()
    memory_percent = psutil.virtual_memory().percent

    try:
        gpus = GPUtil.getGPUs()
        gpu_memory = gpus[0].memoryUtil * 100 if gpus else 0
        print(f"CPU: {cpu_percent}%, RAM: {memory_percent}%, GPU: {gpu_memory}%")
    except Exception:
        print(f"CPU: {cpu_percent}%, RAM: {memory_percent}%")

🎨 Visual Quality Optimization

High-Quality Rendering Configuration:

# High-resolution rendering setup
renderer.robot_vis = RobotVisualizer(
    urdf_path=urdf_path,
    width=1920,              # Full HD width
    height=1080,             # Full HD height
    robot_transmission=0.0,  # Fully opaque robot
    use_offscreen=True,
    device=device
)

# Enhanced human body rendering
renderer.mv = MeshViewer(
    width=1920,
    height=1080,
    use_offscreen=True
)

🔄 Custom Post-Processing

Motion Filtering Integration:

from GBC.utils.data_preparation.amass_action_converter import TrackingDataPostProcess
from GBC.utils.data_preparation.data_preparation_cfg import FilterCfg

def setup_advanced_filtering(renderer, fps):
    """Configure advanced motion filtering"""
    filter_cfg = FilterCfg(
        filter_sample_rate=fps,
        filter_cutoff=fps / 8.0,  # Aggressive smoothing
        filter_order=4,           # Higher order for smoother results
        device=renderer.fk.device
    )

    return TrackingDataPostProcess(filter_cfg=filter_cfg)

# Usage in rendering
def render_with_custom_filtering(renderer, pose, save_path, fps, name):
    """Render with custom motion filtering"""
    filter_processor = setup_advanced_filtering(renderer, fps)

    # Apply filtering to predicted actions
    pose_body = pose.squeeze(0)[:, 3:66]
    actions = renderer.poseformer(pose_body)
    filtered_actions = filter_processor.filt(actions)

    # Render with filtered actions
    renderer.render_pose_without_model(
        pose=pose,
        action=filtered_actions,
        save_path=save_path,
        fps=int(fps)
    )

🚨 Best Practices

✅ Validation Guidelines

  1. Systematic Testing: Render validation videos for multiple motion types (walking, running, dancing)
  2. Quality Metrics: Visually assess end-effector alignment, joint smoothness, and overall pose similarity
  3. Temporal Consistency: Check for jittery motion or unrealistic transitions
  4. Physical Feasibility: Verify robot poses respect joint limits and kinematic constraints (see the sketch after this list)
  5. Comparative Analysis: Generate videos for different model versions to track improvement
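
Joint-limit checks do not require rendering at all. The sketch below is an illustrative example rather than part of this module: it reads limits directly from the URDF with Python's standard XML parser and reports frames whose actions fall outside them, assuming you know the joint order used by your action tensor.

import xml.etree.ElementTree as ET
import torch

def load_joint_limits(urdf_path):
    """Parse lower/upper limits for every limited joint in a URDF file."""
    limits = {}
    for joint in ET.parse(urdf_path).getroot().findall("joint"):
        limit = joint.find("limit")
        if limit is not None and "lower" in limit.attrib and "upper" in limit.attrib:
            limits[joint.get("name")] = (float(limit.get("lower")), float(limit.get("upper")))
    return limits

def check_joint_limits(actions: torch.Tensor, joint_names, limits, tol=1e-3):
    """Report joints whose action values leave their URDF limit range."""
    for j, name in enumerate(joint_names):
        if name not in limits:
            continue
        lower, upper = limits[name]
        out_of_range = (actions[:, j] < lower - tol) | (actions[:, j] > upper + tol)
        if out_of_range.any():
            print(f"{name}: {int(out_of_range.sum())} frames outside [{lower:.2f}, {upper:.2f}]")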

🔧 Performance Optimization

Efficient Batch Processing:

# Optimize for your system configuration
BATCH_SIZE_MAP = {
    "low_memory": 16,     # Systems with <8GB GPU memory
    "medium_memory": 64,  # Systems with 8-16GB GPU memory
    "high_memory": 128    # Systems with >16GB GPU memory
}

def auto_configure_batch_size():
    """Automatically configure batch size based on available GPU memory"""
    try:
        import GPUtil
        gpus = GPUtil.getGPUs()
        if gpus:
            gpu_memory = gpus[0].memoryTotal / 1024  # Convert MB to GB
            if gpu_memory < 8:
                return BATCH_SIZE_MAP["low_memory"]
            elif gpu_memory < 16:
                return BATCH_SIZE_MAP["medium_memory"]
            else:
                return BATCH_SIZE_MAP["high_memory"]
    except Exception:
        pass
    return BATCH_SIZE_MAP["low_memory"]  # Conservative default

๐Ÿ› Troubleshootingโ€‹

Common Issues and Solutions:

  1. Missing ImageMagick:

    # Error: convert command not found
    sudo apt-get install imagemagick
  2. Memory Issues:

    # Reduce batch size
    renderer.max_single_batch = 16

    # Clear GPU cache
    torch.cuda.empty_cache()
  3. Video Corruption:

    # Ensure proper codec and frame dimensions
    fourcc = cv2.VideoWriter_fourcc(*'mp4v') # More compatible codec
    # Verify frame dimensions are even numbers
    framesize = (width - width % 2, height - height % 2)
  4. Rendering Artifacts:

    # Update mesh viewer between renders
    renderer.render_pose_amass(..., update_mv=True)

    # Ensure proper coordinate transformations
    robot_vis_tf = torch.zeros((4, 4))
    robot_vis_tf[range(4), [1, 2, 0, 3]] = 1

This powerful rendering framework serves as an essential validation tool, enabling researchers to visually assess retargeting quality and create compelling demonstration materials for human-to-robot motion transfer research.