render_track_videos
Module: GBC.utils.data_preparation.render_track_videos
This module provides essential video rendering capabilities for experimental validation and quality assessment in human-to-robot motion retargeting. The PoseRenderer
class generates side-by-side comparison videos showing original human motion alongside robot motion, serving as a critical tool for validating retargeting quality, debugging pose transfer issues, and creating presentation materials for research results.
Dependencies
This module requires the following Python packages:
- torch - PyTorch framework for neural networks
- numpy - Numerical computing library
- opencv-python (cv2) - Video processing and image manipulation
- trimesh - 3D mesh processing
- human_body_prior - SMPL+H body model
- body_visualizer - Human body mesh visualization
- scipy - Scientific computing (rotation mathematics)
- tqdm - Progress bars for batch processing
- PIL - Image processing
- matplotlib - Plotting and visualization
System Requirements
Critical Command-Line Tool: This module requires ImageMagick to be installed on your system for advanced image processing operations. Install using:
# Ubuntu/Debian
sudo apt-get install imagemagick
# macOS (with Homebrew)
brew install imagemagick
# CentOS/RHEL
sudo yum install ImageMagick
# Verify installation
convert --version
Note: The convert command from ImageMagick is used internally for image format conversions and batch processing operations.
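If you want to fail fast before a long rendering run, a small pre-flight check along the following lines can confirm that convert is on the PATH; the helper name and the use of Python's subprocess module are illustrative, not part of this module's API:

import shutil
import subprocess

def check_imagemagick() -> str:
    """Return the ImageMagick version banner, or raise if `convert` is missing."""
    if shutil.which("convert") is None:
        raise RuntimeError("ImageMagick `convert` not found; install it before rendering.")
    # `convert --version` prints the version banner on the first line of stdout
    result = subprocess.run(["convert", "--version"], capture_output=True, text=True, check=True)
    return result.stdout.splitlines()[0]

print(check_imagemagick())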
Core Functionality
PoseRenderer
Class Name: GBC.utils.data_preparation.render_track_videos.PoseRenderer
Definition:
class PoseRenderer:
    def __init__(self,
                 urdf_path: str,
                 dataset_path: str,
                 mapping_table: Dict[str, str],
                 smplh_model_path: str,
                 dmpls_model_path: str,
                 poseformer_model_path: str,
                 save_path: str,
                 secondary_dir: Optional[str] = None,
                 device: str = "cuda",
                 max_single_batch: int = 128,
                 without_model: bool = False,
                 smpl_fits_dir: Optional[str] = None
                 ):
Initialization Parameters:
- urdf_path (str): Path to robot URDF description file
- dataset_path (str): Root directory of AMASS motion capture dataset
- mapping_table (Dict[str, str]): SMPL+H to robot joint mapping (see create_smplh.md; an illustrative example follows this list)
- smplh_model_path (str): Path to SMPL+H human body model file
- dmpls_model_path (str): Path to DMPL muscle model file
- poseformer_model_path (str): Path to trained PoseTransformer model
- save_path (str): Output directory for rendered videos
- secondary_dir (Optional[str]): Specific AMASS subdirectory to process
- device (str): Computing device ("cuda" or "cpu")
- max_single_batch (int): Maximum frames processed in single batch (memory management)
- without_model (bool): Skip loading PoseTransformer model (for pre-computed actions)
- smpl_fits_dir (Optional[str]): Path to fitted SMPL+H parameters (.pt file)
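For orientation, the mapping_table referenced above is a plain dictionary pairing SMPL+H joint names with robot joint names; the entries below are hypothetical placeholders, and the actual names and expected format must follow your robot's URDF and create_smplh.md:

# Hypothetical example only - joint names on both sides depend on your SMPL+H
# model and robot URDF; consult create_smplh.md for the authoritative format.
your_mapping_table = {
    "left_hip": "left_hip_pitch_joint",
    "right_hip": "right_hip_pitch_joint",
    "left_knee": "left_knee_joint",
    "right_knee": "right_knee_joint",
    "left_shoulder": "left_shoulder_pitch_joint",
    "right_shoulder": "right_shoulder_pitch_joint",
    "left_elbow": "left_elbow_joint",
    "right_elbow": "right_elbow_joint",
}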
Core Features:
- Side-by-Side Comparison: Human motion vs robot motion visualization
- Batch Processing: Efficient handling of large motion sequences
- Quality Assessment: Visual validation of retargeting accuracy
- Filtering Integration: Built-in motion smoothing and post-processing
- Multi-Format Output: Individual frames and compiled videos
Primary Rendering Methods
Complete Motion Rendering
Method Signature:
@torch.no_grad()
def render_pose(self, pose: torch.Tensor, save_path: str, fps: torch.Tensor, name: str)
Input Parameters:
- pose (torch.Tensor): Human pose sequence of shape [1, num_frames, 66] (SMPL+H format)
- save_path (str): Output directory for this specific motion sequence
- fps (torch.Tensor): Frame rate for video generation
- name (str): Motion sequence identifier
Processing Pipeline:
- Motion Filtering: Apply temporal smoothing to robot actions
- Batch Processing: Split large sequences for memory efficiency (see the sketch after this list)
- Dual Rendering: Generate human and robot visualizations
- Video Compilation: Create side-by-side comparison video
- Cleanup: Remove temporary frame directories
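As a minimal illustration of the batch-splitting step above (the remaining steps correspond to the internal methods documented later on this page), a long sequence is broken into chunks of at most max_single_batch frames; the helper name and the 27-DoF action width are arbitrary placeholders, not part of the PoseRenderer API:

import torch

def split_into_batches(actions: torch.Tensor, max_single_batch: int = 128):
    """Split a (num_frames, num_dofs) sequence into chunks of at most max_single_batch frames."""
    return list(torch.split(actions, max_single_batch, dim=0))

# A 300-frame sequence becomes chunks of 128, 128, and 44 frames
chunks = split_into_batches(torch.randn(300, 27), max_single_batch=128)
print([c.shape[0] for c in chunks])  # [128, 128, 44]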
Usage Example:
import torch
from GBC.utils.data_preparation.render_track_videos import PoseRenderer

# Initialize renderer
renderer = PoseRenderer(
    urdf_path="/path/to/robot.urdf",
    dataset_path="/path/to/AMASS",
    mapping_table=your_mapping_table,
    smplh_model_path="/path/to/smplh/model.npz",
    dmpls_model_path="/path/to/dmpls/model.npz",
    poseformer_model_path="/path/to/trained_model.pt",
    save_path="/output/videos",
    device="cuda"
)

# Render motion sequence
pose_sequence = torch.randn(1, 300, 66)  # 300 frames
fps = torch.tensor(30.0)
renderer.render_pose(pose_sequence, "/output/walking_motion", fps, "walking")
Pre-Computed Action Rendering
Method Signature:
def render_pose_without_model(self, pose: torch.Tensor, action: torch.Tensor, save_path: str, fps: int)
Input Parameters:
- pose (torch.Tensor): Human pose sequence for visualization
- action (torch.Tensor): Pre-computed robot actions
- save_path (str): Output directory
- fps (int): Video frame rate
Usage Scenario: When robot actions are already computed and you want to generate comparison videos without re-running inference.
# Use pre-computed actions
human_poses = torch.load("human_motion.pt")
robot_actions = torch.load("robot_actions.pt")

renderer.render_pose_without_model(
    pose=human_poses,
    action=robot_actions,
    save_path="/output/validation",
    fps=30
)
Robot-Only Rendering
Method Signature:
def render_pose_without_model_urdf(self, action: torch.Tensor, save_path: str, fps: int, name: str = "video")
Input Parameters:
- action (torch.Tensor): Robot action sequence
- save_path (str): Output directory
- fps (int): Video frame rate
- name (str): Output video filename (without extension)
Usage Scenario: Generate robot-only motion videos for demonstrations or presentations.
# Robot-only visualization
robot_actions = torch.load("generated_actions.pt")

renderer.render_pose_without_model_urdf(
    action=robot_actions,
    save_path="/output/robot_demo",
    fps=30,
    name="robot_walking"
)
Internal Rendering Components
Human Body Rendering
Method Signature:
def render_pose_amass(self, pose: torch.Tensor, face: torch.Tensor, save_path: str,
                      width: int = 800, height: int = 600, update_mv: bool = False)
Implementation Details:
def render_pose_amass(self, pose, face, save_path, width=800, height=600, update_mv=False):
    """Render SMPL+H human body mesh"""
    if not hasattr(self, 'mv') or update_mv:
        self.mv = MeshViewer(width=width, height=height, use_offscreen=True)

    # Create trimesh from SMPL+H vertices and faces
    body_mesh = trimesh.Trimesh(
        vertices=pose.detach().cpu().numpy(),
        faces=face.detach().cpu().numpy(),
        vertex_colors=np.tile(colors['grey'], (6890, 1))
    )

    # Render to image
    self.mv.set_static_meshes([body_mesh])
    body_image = self.mv.render(render_wireframe=False)
    cv2.imwrite(save_path, body_image)
Robot Rendering
Method Signature:
def render_pose_robot(self, action: torch.Tensor, save_path: str, root_tf: Optional[torch.Tensor] = None)
Implementation Details:
def render_pose_robot(self, action, save_path, root_tf=None):
    """Render robot in specified joint configuration"""
    # Apply coordinate transformation for consistent viewpoint
    robot_vis_tf = torch.zeros((4, 4))
    robot_vis_tf[range(4), [1, 2, 0, 3]] = 1  # axis permutation: Y→X, Z→Y, X→Z

    # Generate robot visualization (fall back to the default transform when none is given)
    img = self.robot_vis(action, root_tf if root_tf is not None else robot_vis_tf)
    cv2.imwrite(save_path, img)
Video Compilation
Video Generation Pipeline:
# 1. Collect all rendered frames (sorted so frame order is preserved)
bm_images = sorted(glob(os.path.join(bm_save_path, "*.png")))
fk_images = sorted(glob(os.path.join(fk_save_path, "*.png")))

# 2. Calculate optimal frame dimensions
bm_img = cv2.imread(bm_images[0])
fk_img = cv2.imread(fk_images[0])
h, w, _ = bm_img.shape
h_fk, w_fk, _ = fk_img.shape
w_new = int(w * h_fk / h)  # Scale human frame width to match robot frame height
framesize = (w_new + w_fk, h_fk)

# 3. Setup video writer
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter(video_name, fourcc, fps.item(), framesize)

# 4. Combine frames side-by-side
for bm_img_path, fk_img_path in zip(bm_images, fk_images):
    bm_img = cv2.imread(bm_img_path)
    fk_img = cv2.imread(fk_img_path)

    # Resize to match heights
    bm_img = cv2.resize(bm_img, (w_new, h_fk))

    # Horizontal concatenation
    combined_frame = np.hstack((bm_img, fk_img))
    out.write(combined_frame)

out.release()
Practical Applications
Experimental Validation Workflow
Quality Assessment Pipeline:
import os

from torch.utils.data import DataLoader

def validate_retargeting_quality(dataset_dir, model_path, output_dir):
    """Comprehensive validation of retargeting quality"""
    # Initialize renderer with validation configuration
    renderer = PoseRenderer(
        urdf_path="/path/to/robot.urdf",
        dataset_path=dataset_dir,
        mapping_table=mapping_table,
        smplh_model_path="/path/to/smplh/model.npz",
        dmpls_model_path="/path/to/dmpls/model.npz",
        poseformer_model_path=model_path,
        save_path=output_dir,
        device="cuda",
        max_single_batch=64  # Conservative for stability
    )

    # Process validation dataset
    dataloader = DataLoader(renderer.dataset, batch_size=1, shuffle=False)
    for i, data in enumerate(dataloader):
        if i >= 10:  # Limit for quick validation
            break

        pose = data['poses'].to(renderer.fk.device)
        fps = data['fps']
        motion_name = data['title'][0]

        # Generate comparison video
        output_path = os.path.join(output_dir, f"validation_{i:03d}_{motion_name}")
        renderer.render_pose(pose, output_path, fps, motion_name)
        print(f"Rendered validation video: {motion_name}")
Debug and Analysis Tools
Frame-by-Frame Analysis:
def analyze_pose_transfer_frame(renderer, human_pose, robot_action, frame_idx, output_dir):
    """Detailed analysis of a specific frame"""
    # Generate human body mesh
    body_params = renderer.get_pose_amass(human_pose[frame_idx:frame_idx+1])
    human_img_path = f"{output_dir}/human_frame_{frame_idx}.png"
    renderer.render_pose_amass(body_params.v[0], body_params.f, human_img_path)

    # Generate robot pose
    robot_img_path = f"{output_dir}/robot_frame_{frame_idx}.png"
    renderer.render_pose_robot(robot_action[frame_idx], robot_img_path)

    # Create side-by-side comparison
    human_img = cv2.imread(human_img_path)
    robot_img = cv2.imread(robot_img_path)

    # Resize to match heights and combine
    h_target = min(human_img.shape[0], robot_img.shape[0])
    human_resized = cv2.resize(human_img, (int(human_img.shape[1] * h_target / human_img.shape[0]), h_target))
    robot_resized = cv2.resize(robot_img, (int(robot_img.shape[1] * h_target / robot_img.shape[0]), h_target))

    comparison = np.hstack([human_resized, robot_resized])
    cv2.imwrite(f"{output_dir}/comparison_frame_{frame_idx}.png", comparison)
Batch Validation for Training
Training Integration Example:
class PoseTransformerTrainer:
    def __init__(self, ..., validation_renderer=None):
        self.validation_renderer = validation_renderer

    def validate_with_videos(self, epoch, num_samples=3):
        """Generate validation videos during training"""
        if self.validation_renderer is None:
            return

        self.model.eval()
        with torch.no_grad():
            sample_data = next(iter(self.test_loader))
            poses = sample_data['poses'][:num_samples]

            for i, pose in enumerate(poses):
                # Generate robot actions
                pose_body = pose[:, 3:66]
                actions = self.model(pose_body)

                # Create validation video
                output_path = f"validation/epoch_{epoch}_sample_{i}"
                fps = sample_data['fps'][i] if i < len(sample_data['fps']) else 30
                self.validation_renderer.render_pose_without_model(
                    pose=pose.unsqueeze(0),
                    action=actions,
                    save_path=output_path,
                    fps=int(fps)
                )
Configuration and Advanced Usage
Memory Management
Large Sequence Handling:
# Configure for large motion sequences
renderer = PoseRenderer(
    # ... other parameters
    max_single_batch=32,  # Reduce batch size for memory-constrained systems
    device="cuda"         # Use GPU for faster processing
)

# Monitor memory usage during rendering
import psutil
import GPUtil

def monitor_resources():
    """Monitor system resources during rendering"""
    cpu_percent = psutil.cpu_percent()
    memory_percent = psutil.virtual_memory().percent
    try:
        gpus = GPUtil.getGPUs()
        gpu_memory = gpus[0].memoryUtil * 100 if gpus else 0
        print(f"CPU: {cpu_percent}%, RAM: {memory_percent}%, GPU: {gpu_memory}%")
    except Exception:
        print(f"CPU: {cpu_percent}%, RAM: {memory_percent}%")
Visual Quality Optimization
High-Quality Rendering Configuration:
# High-resolution rendering setup
renderer.robot_vis = RobotVisualizer(
    urdf_path=urdf_path,
    width=1920,   # Full HD width
    height=1080,  # Full HD height
    robot_transmission=0.0,  # Fully opaque robot
    use_offscreen=True,
    device=device
)

# Enhanced human body rendering
renderer.mv = MeshViewer(
    width=1920,
    height=1080,
    use_offscreen=True
)
Custom Post-Processing
Motion Filtering Integration:
from GBC.utils.data_preparation.amass_action_converter import TrackingDataPostProcess
from GBC.utils.data_preparation.data_preparation_cfg import FilterCfg

def setup_advanced_filtering(renderer, fps):
    """Configure advanced motion filtering"""
    filter_cfg = FilterCfg(
        filter_sample_rate=fps,
        filter_cutoff=fps / 8.0,  # Aggressive smoothing
        filter_order=4,           # Higher order for smoother results
        device=renderer.fk.device
    )
    return TrackingDataPostProcess(filter_cfg=filter_cfg)

# Usage in rendering
def render_with_custom_filtering(renderer, pose, save_path, fps, name):
    """Render with custom motion filtering"""
    filter_processor = setup_advanced_filtering(renderer, fps)

    # Apply filtering to predicted actions
    pose_body = pose.squeeze(0)[:, 3:66]
    actions = renderer.poseformer(pose_body)
    filtered_actions = filter_processor.filt(actions)

    # Render with filtered actions
    renderer.render_pose_without_model(
        pose=pose,
        action=filtered_actions,
        save_path=save_path,
        fps=int(fps)
    )
Best Practices
Validation Guidelines
- Systematic Testing: Render validation videos for multiple motion types (walking, running, dancing)
- Quality Metrics: Visually assess end-effector alignment, joint smoothness, and overall pose similarity
- Temporal Consistency: Check for jittery motion or unrealistic transitions
- Physical Feasibility: Verify robot poses respect joint limits and kinematic constraints (see the sketch after this list)
- Comparative Analysis: Generate videos for different model versions to track improvement
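For the joint-limit check mentioned under Physical Feasibility, a short sketch like the one below can flag offending frames before you spend time rendering; the limit tensors and the (num_frames, num_dofs) action layout are assumptions to be replaced with values read from your URDF, not part of the PoseRenderer API:

import torch

def find_joint_limit_violations(actions: torch.Tensor,
                                lower: torch.Tensor,
                                upper: torch.Tensor) -> torch.Tensor:
    """Return indices of frames where any joint exceeds its limits.

    actions: (num_frames, num_dofs) joint targets; lower/upper: (num_dofs,) limits.
    """
    violations = (actions < lower) | (actions > upper)  # per-frame, per-joint mask
    return torch.nonzero(violations.any(dim=1)).flatten()

# Toy example with made-up limits for a 2-DoF robot
actions = torch.tensor([[0.1, 0.2], [1.5, 0.0], [-0.3, -2.0]])
bad = find_joint_limit_violations(actions,
                                  lower=torch.tensor([-1.0, -1.0]),
                                  upper=torch.tensor([1.0, 1.0]))
print(bad)  # tensor([1, 2])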
Performance Optimization
Efficient Batch Processing:
# Optimize for your system configuration
BATCH_SIZE_MAP = {
    "low_memory": 16,     # Systems with <8GB GPU memory
    "medium_memory": 64,  # Systems with 8-16GB GPU memory
    "high_memory": 128    # Systems with >16GB GPU memory
}

def auto_configure_batch_size():
    """Automatically configure batch size based on available GPU memory"""
    try:
        import GPUtil
        gpus = GPUtil.getGPUs()
        if gpus:
            gpu_memory = gpus[0].memoryTotal / 1024  # Convert MB to GB
            if gpu_memory < 8:
                return BATCH_SIZE_MAP["low_memory"]
            elif gpu_memory < 16:
                return BATCH_SIZE_MAP["medium_memory"]
            else:
                return BATCH_SIZE_MAP["high_memory"]
    except Exception:
        pass
    return BATCH_SIZE_MAP["low_memory"]  # Conservative default
Troubleshooting
Common Issues and Solutions:
- Missing ImageMagick:

  # Error: convert command not found
  sudo apt-get install imagemagick

- Memory Issues:

  # Reduce batch size
  renderer.max_single_batch = 16
  # Clear GPU cache
  torch.cuda.empty_cache()

- Video Corruption:

  # Ensure proper codec and frame dimensions
  fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # More compatible codec
  # Verify frame dimensions are even numbers
  framesize = (width - width % 2, height - height % 2)

- Rendering Artifacts:

  # Update mesh viewer between renders
  renderer.render_pose_amass(..., update_mv=True)
  # Ensure proper coordinate transformations
  robot_vis_tf = torch.zeros((4, 4))
  robot_vis_tf[range(4), [1, 2, 0, 3]] = 1
This powerful rendering framework serves as an essential validation tool, enabling researchers to visually assess retargeting quality and create compelling demonstration materials for human-to-robot motion transfer research.