robot_visualizer

Module: GBC.utils.data_preparation.robot_visualizer

This module provides a robot visualization framework for debugging, validation, and rendering in robotics pipelines. The RobotVisualizer class offers a streamlined interface: calling vis(action) returns a high-quality robot rendering, which makes it a convenient tool for motion analysis, pose validation, and visual debugging throughout the development process.

📚 Dependencies

This module requires the following Python packages:

  • torch - PyTorch framework for tensor operations
  • numpy - Numerical computing library
  • trimesh - 3D mesh processing and manipulation
  • pyrender - High-quality 3D rendering engine
  • pytorch_kinematics - Forward kinematics computations
  • PIL - Image processing (for saving results)
  • GBC.utils.base.base_fk - Robot forward kinematics integration

🎯 Core Functionality

🤖 RobotVisualizer

Module Name: GBC.utils.data_preparation.robot_visualizer.RobotVisualizer

Definition:

class RobotVisualizer:
    def __init__(self,
                 urdf_path: str,
                 use_offscreen: bool = True,
                 width: int = 800,
                 height: int = 1080,
                 robot_transmission: float = 0.1,
                 vis_world_frame_length: float = 0.1,
                 vis_world_frame_color: tuple = (0, 0, 0),
                 vis_link_frame_length: float = 0.05,
                 vis_link_frame_color: tuple = (0.5, 0.5, 0.5),
                 vis_points_size: float = 0.01,
                 vis_points_color: tuple = (0.9, 0.1, 0.1),
                 device: str = "cpu"
                 ):

📥 Initialization Parameters:

  • urdf_path (str): Path to robot URDF file with mesh references
  • use_offscreen (bool): Enable headless rendering (True for batch processing)
  • width/height (int): Output image dimensions in pixels
  • robot_transmission (float): Robot mesh transparency (0.0=opaque, 1.0=transparent)
  • vis_world_frame_length (float): Length of world coordinate frame axes
  • vis_world_frame_color (tuple): RGB color for world frame visualization
  • vis_link_frame_length (float): Length of link coordinate frame axes
  • vis_link_frame_color (tuple): RGB color for link frame visualization
  • vis_points_size (float): Radius of visualized points
  • vis_points_color (tuple): RGB color for point visualization
  • device (str): Computing device ("cpu" or "cuda")

🔧 Core Features:

  • Intuitive Interface: Direct callable object vis(action) → image
  • High-Quality Rendering: PyRender-based photorealistic visualization
  • Flexible Configuration: Customizable visual appearance and camera settings
  • Debug-Friendly: Built-in coordinate frame and point visualization
  • Batch Processing: Offscreen rendering for automated pipelines

🚀 Primary Usage: Simple Visualization

🎯 Basic Rendering

Method Signature:

def __call__(self, q, root_tf=None, **kwargs) -> np.ndarray

📥 Input Parameters:

  • q (torch.Tensor): Joint configurations [num_dofs] or [batch_size, num_dofs]
  • root_tf (torch.Tensor, optional): Root transformation matrix [4, 4]
  • kwargs: Additional visualization options (see advanced usage)

📤 Output:

  • np.ndarray: Rendered RGB image [height, width, 3] (uint8 format)

💡 Simple Usage:

from GBC.utils.data_preparation.robot_visualizer import RobotVisualizer
from PIL import Image
import torch

# Initialize visualizer
vis = RobotVisualizer(urdf_path="/path/to/robot.urdf")

# Render robot in zero pose
zero_pose = torch.zeros(vis.num_joints)
image = vis(zero_pose)

# Save result
Image.fromarray(image).save("robot_pose.png")

🔧 Batch Visualization:

# Visualize a sequence of poses
poses = torch.randn(10, vis.num_joints)  # 10 different poses
images = []

for i, pose in enumerate(poses):
    image = vis(pose)
    images.append(image)
    Image.fromarray(image).save(f"pose_{i:03d}.png")

🎨 Advanced Visualization Features

🌐 Enhanced Rendering Options

Method Signature:

def vis_robot(self, q,
              root_tf=None,
              vis_world_frame=False,
              vis_link_frame=False,
              extra_pts=None,
              extra_pyrender_objs=[],
              cam_pose=np.array([[1, 0, 0, 0], [0, 1, 0, -0.2], [0, 0, 1, 1.5], [0, 0, 0, 1]]),
              cam_yfov=np.pi * 0.4
              ) -> np.ndarray

📥 Advanced Parameters:

  • vis_world_frame (bool): Display world coordinate frame axes
  • vis_link_frame (bool/list): Show link frames (True=all, list=specific links)
  • extra_pts (np.ndarray/torch.Tensor): Additional 3D points to visualize [N, 3]
  • extra_pyrender_objs (list): Custom PyRender objects such as meshes or lights (see the sketch after this list)
  • cam_pose (np.ndarray): Camera transformation matrix [4, 4]
  • cam_yfov (float): Camera field of view in radians
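
The extra_pyrender_objs hook is not demonstrated elsewhere on this page. The sketch below assumes each list entry is a ready-made pyrender object (here a pyrender.Mesh built from a trimesh sphere) that the visualizer adds to the scene as-is; treat that format as an assumption and check the class source if your version differs.

import trimesh
import pyrender

# Hypothetical example: mark a target location with a small blue sphere
pose = torch.zeros(vis.num_joints)
sphere = trimesh.creation.icosphere(radius=0.03)
sphere.visual.vertex_colors = [0.0, 0.0, 1.0, 1.0]   # RGBA, opaque blue
sphere.apply_translation([0.3, 0.0, 1.0])             # world-frame position of the marker

marker = pyrender.Mesh.from_trimesh(sphere)
image = vis(pose, extra_pyrender_objs=[marker])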

🎯 Debug Visualization

Coordinate Frame Visualization:

# Show world coordinate frame
image = vis(pose, vis_world_frame=True)

# Show specific link frames
image = vis(pose, vis_link_frame=["l_ankle_pitch_link", "r_ankle_pitch_link"])

# Show all link frames
image = vis(pose, vis_link_frame=True)

Point Cloud Visualization:

# Visualize target points (e.g., SMPL joint positions)
target_points = torch.randn(13, 3) # 13 key joints
image = vis(pose, extra_pts=target_points)

Custom Camera Positioning:

# Front view camera
front_cam = np.array([
    [1, 0, 0, 0],
    [0, 0, -1, 0],
    [0, 1, 0, 2],
    [0, 0, 0, 1]
])

# Side view camera
side_cam = np.array([
    [0, 0, 1, 2],
    [0, 1, 0, 0],
    [-1, 0, 0, 0],
    [0, 0, 0, 1]
])

front_image = vis(pose, cam_pose=front_cam)
side_image = vis(pose, cam_pose=side_cam)
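
Hand-writing 4x4 camera matrices quickly becomes tedious. The helper below is a generic sketch (not part of the module) that builds a cam_pose following the pyrender/OpenGL convention, in which the camera looks along its local -Z axis with +Y up; the eye and target values are arbitrary examples.

def look_at(eye, target, up=(0.0, 0.0, 1.0)):
    """Build a 4x4 camera pose whose -Z axis points from eye toward target."""
    eye, target, up = (np.asarray(v, dtype=float) for v in (eye, target, up))
    z_axis = eye - target
    z_axis /= np.linalg.norm(z_axis)
    x_axis = np.cross(up, z_axis)
    x_axis /= np.linalg.norm(x_axis)
    y_axis = np.cross(z_axis, x_axis)

    cam_pose = np.eye(4)
    cam_pose[:3, 0], cam_pose[:3, 1], cam_pose[:3, 2], cam_pose[:3, 3] = x_axis, y_axis, z_axis, eye
    return cam_pose

# Camera 2 m in front of the robot, slightly elevated, looking at the torso
image = vis(pose, cam_pose=look_at(eye=(0.0, -2.0, 1.2), target=(0.0, 0.0, 0.8)))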

🔧 Internal Architecture

🏗️ Initialization Process

URDF Loading and Mesh Processing:

def load_robot(self):
    """Load robot structure and meshes from URDF (simplified)"""

    # 1. Parse URDF with pytorch_kinematics
    self.chain = pk.build_chain_from_urdf(open(self.urdf_path, "rb").read())

    # 2. Extract joint and link information
    self.link_names = self.chain.get_link_names()
    self.joint_names = self.chain.get_joint_parameter_names()

    # 3. Load every visual mesh referenced by each link
    for link_name in self.link_names:
        link = self.chain.find_link(link_name)   # look up the Link object by name
        for visual in link.visuals:
            mesh_path = os.path.join(self.urdf_dir_path, visual.geom_param[0])
            mesh = trimesh.load_mesh(mesh_path)
            # Apply transparency
            mesh.visual.vertex_colors = [0.5, 0.5, 0.5, 1 - self.robot_transmission]
            self.scene.add(pyrender.Mesh.from_trimesh(mesh))

⚙️ Forward Kinematics Integration

Pose Computation and Mesh Positioning:

def set_robot_mesh_pose(self, q, root_tf=None):
    """Update robot mesh poses from a joint configuration (simplified)"""

    # 1. Compute forward kinematics for every link
    fk = self.robot_kinematics.forward_kinematics(q)

    # 2. Apply the root transformation (identity if none is given)
    if root_tf is None:
        root_tf = torch.eye(4, device=q.device)
    for link_name, link_tf in fk.items():
        link_tf = torch.matmul(root_tf, link_tf[0])
        self.link_transform[link_name] = link_tf

    # 3. Update mesh poses in the PyRender scene
    for link_name, link_meshes in self.link_meshes.items():
        for offset, mesh_node in link_meshes:
            mesh_tf = torch.matmul(self.link_transform[link_name], offset)
            self.scene.set_pose(mesh_node, mesh_tf.cpu().numpy())

🎨 Rendering Pipeline

PyRender Scene Management:

def vis_robot(self, q, root_tf=None, vis_world_frame=False, extra_pts=None,
              cam_pose=np.array([[1, 0, 0, 0], [0, 1, 0, -0.2], [0, 0, 1, 1.5], [0, 0, 0, 1]]),
              cam_yfov=np.pi * 0.4, **kwargs):
    """Complete rendering pipeline (simplified)"""

    # 1. Update robot pose
    self.set_robot_mesh_pose(q, root_tf)

    # 2. Add debug visualizations
    if vis_world_frame:
        self.add_axes(np.eye(4), self.vis_world_frame_length)
    if extra_pts is not None:
        self.add_points(extra_pts, self.vis_points_size)

    # 3. Configure camera and lighting
    camera = pyrender.PerspectiveCamera(yfov=cam_yfov)
    self.scene.add(camera, pose=cam_pose)
    self.add_raymond_lights()

    # 4. Render to image
    renderer = pyrender.OffscreenRenderer(self.offscreen_width, self.offscreen_height)
    color_img, depth_img = renderer.render(self.scene)

    return color_img

💡 Practical Applications

🐛 Debugging and Validation

Pose Verification:

def validate_pose_retargeting(human_pose, robot_action, vis):
    """Visual validation of human-to-robot pose transfer"""

    # Render robot in the predicted pose
    robot_img = vis(robot_action, vis_world_frame=True)

    # Add human joint positions as reference points
    # (get_human_joint_positions is a user-supplied helper, e.g. an SMPL forward pass)
    human_joints = get_human_joint_positions(human_pose)
    comparison_img = vis(robot_action,
                         extra_pts=human_joints,
                         vis_world_frame=True)

    # Side-by-side comparison
    combined = np.concatenate([robot_img, comparison_img], axis=1)
    return combined

Joint Limit Validation:

def check_joint_limits(actions_sequence, vis, urdf_limits):
    """Visualize joint limit violations in an action sequence"""

    violation_images = []
    for i, action in enumerate(actions_sequence):
        # Check for limit violations
        violations = (action < urdf_limits[:, 0]) | (action > urdf_limits[:, 1])

        if violations.any():
            # Highlight problematic pose
            img = vis(action, vis_link_frame=True)
            violation_images.append((i, img))

    return violation_images
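
A quick usage sketch for the helper above: urdf_limits just needs shape [num_dofs, 2] with lower bounds in column 0 and upper bounds in column 1. The limit values and the random action sequence below are placeholders, not data from any real robot.

# Placeholder limits: +/- 90 degrees for every joint (substitute the limits from your URDF)
urdf_limits = torch.tensor([[-1.57, 1.57]] * vis.num_joints)

actions = torch.randn(100, vis.num_joints)   # hypothetical action sequence
violations = check_joint_limits(actions, vis, urdf_limits)

for frame_idx, img in violations:
    Image.fromarray(img).save(f"violation_{frame_idx:04d}.png")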

🎬 Animation and Sequence Visualization

Motion Sequence Rendering:

def create_motion_video(action_sequence, vis, output_path="motion.mp4"):
    """Generate a video from a robot action sequence"""

    import cv2

    # Set up the video writer
    height, width = vis.offscreen_height, vis.offscreen_width
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    out = cv2.VideoWriter(output_path, fourcc, 30.0, (width, height))

    for action in action_sequence:
        # Render frame and convert RGB -> BGR for OpenCV
        frame = vis(action)
        frame_bgr = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
        out.write(frame_bgr)

    out.release()
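
Typical usage, assuming actions is a [T, num_dofs] tensor produced earlier in the pipeline (a zero sequence stands in here as a placeholder):

actions = torch.zeros(90, vis.num_joints)   # placeholder: 3 s of idle pose at 30 fps
create_motion_video(actions, vis, output_path="robot_motion.mp4")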

Multi-View Rendering:

def create_multi_view_image(action, vis):
    """Generate a multi-angle view of a robot pose"""

    # Define camera poses
    cameras = {
        'front': np.array([[1, 0, 0, 0], [0, 0, -1, 0], [0, 1, 0, 2], [0, 0, 0, 1]]),
        'side': np.array([[0, 0, 1, 2], [0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1]]),
        'top': np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 3], [0, 0, 0, 1]])
    }

    views = []
    for name, cam_pose in cameras.items():
        view = vis(action, cam_pose=cam_pose, vis_world_frame=True)
        views.append(view)

    # Combine views into a 2x2 grid
    combined = np.concatenate([
        np.concatenate([views[0], views[1]], axis=1),               # Front | Side
        np.concatenate([views[2], np.zeros_like(views[2])], axis=1) # Top | Empty
    ], axis=0)

    return combined

🧪 Training Integration

Validation During Training:

class PoseTransformerTrainer:
    def __init__(self, ..., visualizer=None):
        self.vis = visualizer

    def validate(self, epoch, save_figs=True):
        """Enhanced validation with visualization"""

        # Standard validation metrics
        val_loss = self.compute_validation_loss()

        if save_figs and self.vis is not None:
            # Visualize sample predictions
            sample_batch = next(iter(self.test_loader))
            predictions = self.model(sample_batch['pose'])

            # Create comparison images
            for i in range(min(4, len(predictions))):
                # Robot pose prediction
                robot_img = self.vis(predictions[i])

                # Add human reference points
                human_joints = sample_batch['target_joints'][i]
                comparison_img = self.vis(predictions[i], extra_pts=human_joints)

                # Save comparison
                combined = np.concatenate([robot_img, comparison_img], axis=1)
                Image.fromarray(combined).save(f"val_epoch_{epoch}_sample_{i}.png")

        return val_loss

🔧 Configuration and Customization

🎨 Visual Appearance Tuning

Robot Transparency and Colors:

# Semi-transparent robot for better joint visibility
vis = RobotVisualizer(
    urdf_path=urdf_path,
    robot_transmission=0.3,                # 30% transparency
    vis_points_color=(1.0, 0.0, 0.0),      # Bright red points
    vis_link_frame_color=(0.0, 1.0, 0.0)   # Green link frames
)

High-Resolution Rendering:

# 4K rendering for publication quality
vis_hires = RobotVisualizer(
    urdf_path=urdf_path,
    width=3840,
    height=2160,
    use_offscreen=True
)

📷 Camera Configuration

Optimal Camera Positions:

# Standard humanoid robot viewing angles
CAMERA_POSES = {
    'front_view': np.array([
        [1, 0, 0, 0],
        [0, 0, -1, 0],
        [0, 1, 0, 2.5],   # 2.5 m viewing distance
        [0, 0, 0, 1]
    ]),

    'side_profile': np.array([
        [0, 0, 1, 2.5],
        [0, 1, 0, 0],
        [-1, 0, 0, 0],
        [0, 0, 0, 1]
    ]),

    'three_quarter': np.array([
        [0.707, 0, 0.707, 1.8],
        [0, 1, 0, 0],
        [-0.707, 0, 0.707, 1.8],
        [0, 0, 0, 1]
    ])
}

# Use a predefined camera
image = vis(action, cam_pose=CAMERA_POSES['three_quarter'])

🚨 Best Practices

✅ Performance Optimization

  1. Offscreen Rendering: Use use_offscreen=True for batch processing
  2. Resolution Management: Balance quality vs speed with appropriate width/height
  3. Device Selection: Use a GPU device when available for faster kinematics (see the sketch below)
  4. Memory Management: Clear scene objects after rendering in loops
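
A minimal sketch combining points 1 and 3 above, assuming a CUDA-capable machine is available; the URDF path is a placeholder:

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
vis = RobotVisualizer(
    urdf_path="/path/to/robot.urdf",   # placeholder path
    use_offscreen=True,                # headless rendering for batch jobs
    device=device                      # run forward kinematics on the GPU when available
)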

🔧 Common Usage Patterns

Debugging Workflow:

# 1. Initialize visualizer with debug features
vis = RobotVisualizer(urdf_path, vis_world_frame_length=0.2)

# 2. Quick pose check
def debug_pose(action, description=""):
img = vis(action, vis_world_frame=True, vis_link_frame=True)
Image.fromarray(img).save(f"debug_{description}.png")
return img

# 3. Validate transformations
def validate_symmetry(action, flipper):
original = vis(action)
flipped = vis(flipper(action))
comparison = np.concatenate([original, flipped], axis=1)
return comparison

Production Rendering:

# High-quality rendering for presentations
vis_production = RobotVisualizer(
    urdf_path=urdf_path,
    width=1920, height=1080,
    robot_transmission=0.0,   # Fully opaque
    use_offscreen=True
)

def render_sequence(actions, output_dir, cam_pose=CAMERA_POSES['front_view']):
    for i, action in enumerate(actions):
        img = vis_production(action, cam_pose=cam_pose)
        Image.fromarray(img).save(f"{output_dir}/frame_{i:06d}.png")

🐛 Troubleshooting

Common Issues:

  • Missing Meshes: Ensure URDF mesh paths are relative to URDF directory
  • Black Images: Check camera position and lighting setup
  • Memory Leaks: Clear extra nodes after each render in loops
  • Joint Mismatch: Verify action dimension matches vis.num_joints

This powerful visualization framework serves as an essential tool for robot motion analysis, providing intuitive visual feedback throughout the development and debugging process.


📊 Properties and Utilities

📋 Useful Properties

# Robot structure information
print(vis.robot_structure) # Complete kinematic chain description
print(vis.num_joints) # Number of controllable joints
print(vis.joint_names) # List of joint names from URDF
print(vis.link_names) # List of link names from URDF

# Frame name utilities
frame_name = vis.get_frame_name(frame_idx)
parent_frame, child_frame = vis.get_joint_frame_names(joint_idx)
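
For example, these properties make it easy to print an index-to-name table when interpreting raw action vectors:

# Map action-vector indices to joint names for easier debugging
for idx, name in enumerate(vis.joint_names):
    print(f"{idx:2d}: {name}")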

Integration Example:

# Verify action dimensions before visualization
def safe_visualize(action, vis):
    if len(action) != vis.num_joints:
        raise ValueError(f"Action dimension {len(action)} doesn't match robot DOF {vis.num_joints}")

    return vis(action)

This comprehensive visualization system provides the foundation for effective visual debugging, validation, and presentation of robot motion data across all stages of development.