play
Module: GBC.gyms.isaaclab_45.workflows.rsl_rl.play
Overview 📖
The inference and evaluation script for RL/IL agents trained with the RSL-RL framework. It loads pre-trained model checkpoints and runs them in the simulation environment for evaluation, demonstration, and video recording.
Core Features 🎯
- 🤖 Model Inference: Load and run trained RL/IL agents in inference mode
- 🎥 Video Recording: Record agent performance for analysis and demonstration
- 📊 Multi-modal Support: Handles both standard observations and reference observations
- 🔄 Checkpoint Loading: Automatic checkpoint discovery and loading
- 📤 Model Export: Export trained policies to JIT/ONNX formats for deployment
- ⚡ Performance Optimized: Runs the policy under torch.inference_mode(), which skips gradient tracking
Command Line Usage 🚀
Basic Inference Command
python play.py --task {TASK_NAME}
Video Recording Command
python play.py --task {TASK_NAME} --video --video_length 500
Custom Environment Setup
python play.py \
    --task {TASK_NAME} \
    --num_envs 64 \
    --video \
    --video_length 1000 \
    --disable_fabric
Command Line Arguments ⚙️
Core Arguments
- --task (str): Task name (must match the registration in __init__.py)
- --num_envs (int, default=64): Number of parallel environments for evaluation
Video Recording Arguments
- --video (flag): Enable video recording during inference
- --video_length (int, default=200): Length of the recorded video in steps
Performance Arguments
- --disable_fabric (flag): Disable Fabric and use USD I/O operations (slower but more compatible)
Checkpoint Loading Arguments (inherited from CLI)
- --load_run (str): Specific run directory to load the checkpoint from
- --load_checkpoint (str): Specific checkpoint file to load
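The arguments listed above can be approximated with a plain argparse parser. This is a minimal sketch for illustration only: the function name build_parser is hypothetical, and the actual script may register these options through the simulator's app launcher and shared CLI helpers rather than directly. The defaults follow the values documented above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options documented above."""
    parser = argparse.ArgumentParser(description="Play a trained RSL-RL checkpoint.")
    # Core arguments
    parser.add_argument("--task", type=str, default=None, help="Registered task name.")
    parser.add_argument("--num_envs", type=int, default=64, help="Number of parallel environments.")
    # Video recording arguments
    parser.add_argument("--video", action="store_true", help="Record a video during inference.")
    parser.add_argument("--video_length", type=int, default=200, help="Video length in steps.")
    # Performance arguments
    parser.add_argument("--disable_fabric", action="store_true", help="Use USD I/O instead of Fabric.")
    # Checkpoint selection arguments
    parser.add_argument("--load_run", type=str, default=None, help="Run directory to load from.")
    parser.add_argument("--load_checkpoint", type=str, default=None, help="Checkpoint file to load.")
    return parser


args = build_parser().parse_args(["--task", "Isaac-Humanoid-Amp-v0", "--video"])
print(args.num_envs, args.video, args.video_length)  # prints: 64 True 200
```

Flags that are not passed keep their defaults, so a bare --task invocation evaluates 64 environments without recording.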
Main Function Pipeline 🔄
Configuration and Checkpoint Loading
def main():
    # Parse environment and agent configurations
    env_cfg = parse_env_cfg(args_cli.task, device=args_cli.device, num_envs=args_cli.num_envs)
    agent_cfg = cli_args.parse_rsl_rl_cfg(args_cli.task, args_cli)
    
    # Locate checkpoint path
    log_root_path = os.path.join("logs", "rsl_rl", agent_cfg.experiment_name)
    resume_path = get_checkpoint_path(log_root_path, agent_cfg.load_run, agent_cfg.load_checkpoint)
Environment Setup
# Create evaluation environment
env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None)
# Wrap for RSL-RL compatibility
env = RslRlReferenceVecEnvWrapper(env)
Model Loading and Inference
# Load trained model
ppo_runner = OnPolicyRunnerMM(env, agent_cfg.to_dict(), log_dir=None, device=agent_cfg.device)
ppo_runner.load(resume_path)
# Get inference policy
policy = ppo_runner.get_inference_policy(device=env.unwrapped.device)
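# Get initial observations
obs, _ = env.get_observations()
ref_obs, _ = env.get_reference_observations()
# Inference step, repeated for every simulation step
with torch.inference_mode():
    actions = policy(obs, ref_obs)
    obs, ref_obs, _, _, extras = env.step(actions)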
Checkpoint Discovery 🔍
Automatic Checkpoint Location
The script automatically locates checkpoints using the following hierarchy:
logs/rsl_rl/{experiment_name}/{run_directory}/checkpoints/
Checkpoint Loading Logic
- 🎯 Specific Run: If --load_run is specified, search in that run directory
- 🔄 Latest Run: If not specified, use the most recent run directory
- 📁 Checkpoint Selection: Load the specified checkpoint or the latest one
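The selection logic above can be sketched with the standard library. This is not the get_checkpoint_path implementation used by the script; find_checkpoint is a hypothetical helper that assumes run directories sort chronologically by name (timestamp prefixes) and that checkpoint files follow the model_{iteration}.pt pattern seen in the examples. It searches the run directory recursively, so it does not depend on whether checkpoints sit directly in the run directory or in a subfolder.

```python
import re
from pathlib import Path
from typing import Optional


def find_checkpoint(
    log_root: str,
    load_run: Optional[str] = None,
    load_checkpoint: Optional[str] = None,
) -> Path:
    """Hypothetical stand-in for the checkpoint lookup described above."""
    root = Path(log_root)
    if load_run is not None:
        # Specific run: use the requested directory.
        run_dir = root / load_run
        if not run_dir.is_dir():
            raise FileNotFoundError(f"Run directory not found: {run_dir}")
    else:
        # Latest run: timestamp-prefixed names sort chronologically.
        runs = sorted(p for p in root.iterdir() if p.is_dir())
        if not runs:
            raise FileNotFoundError(f"No run directories in {root}")
        run_dir = runs[-1]

    candidates = list(run_dir.rglob("model_*.pt"))
    if load_checkpoint is not None:
        candidates = [p for p in candidates if p.name == load_checkpoint]
    if not candidates:
        raise FileNotFoundError(f"No checkpoint found in {run_dir}")

    def iteration(path: Path) -> int:
        # Compare by iteration number so model_1000.pt beats model_500.pt.
        match = re.search(r"(\d+)", path.stem)
        return int(match.group(1)) if match else -1

    return max(candidates, key=iteration)
```

Sorting by the numeric iteration rather than by file name matters here: a plain string sort would rank model_500.pt after model_1000.pt.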
Example Checkpoint Paths
# Load latest checkpoint from latest run
python play.py --task Isaac-Humanoid-Amp-v0
# Load specific run and checkpoint
python play.py --task Isaac-Humanoid-Amp-v0 --load_run "2024-01-15_10-30-00_experiment" --load_checkpoint "model_5000.pt"
Video Recording 🎥
Recording Configuration
video_kwargs = {
    "video_folder": os.path.join(log_dir, "videos", "play"),
    "step_trigger": lambda step: step == 0,  # Record from start
    "video_length": args_cli.video_length,
    "disable_logger": True,
}
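To make the interaction between step_trigger and video_length concrete, the following sketch simulates which steps end up in a clip. It is a simplified model, assuming the kwargs are consumed by a Gymnasium-style video recording wrapper; recorded_steps is a hypothetical helper and does not reproduce the wrapper's actual implementation.

```python
from typing import Callable, List


def recorded_steps(
    step_trigger: Callable[[int], bool],
    video_length: int,
    total_steps: int,
) -> List[int]:
    """Return the step indices captured under a simplified recording model."""
    captured: List[int] = []
    remaining = 0  # frames left in the clip currently being recorded
    for step in range(total_steps):
        if remaining == 0 and step_trigger(step):
            remaining = video_length  # trigger fired: start a new clip
        if remaining > 0:
            captured.append(step)
            remaining -= 1
    return captured


steps = recorded_steps(lambda step: step == 0, video_length=200, total_steps=1000)
print(steps[0], steps[-1], len(steps))  # prints: 0 199 200
```

With the trigger used above, only one clip is produced, starting at the first step and lasting video_length steps.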
Output Location
Videos are saved in the checkpoint's log directory:
logs/rsl_rl/{experiment_name}/{run_directory}/videos/play/
Video Features
- 📹 High Quality: RGB array rendering for clear visualization
- ⏱️ Configurable Length: Customize recording duration
- 📁 Organized Storage: Videos saved alongside model checkpoints
Model Export 📤
Export Functionality (Currently Commented Out)
# Export to different formats
export_model_dir = os.path.join(os.path.dirname(resume_path), "exported")
# JIT export for PyTorch deployment
export_policy_as_jit(
    ppo_runner.alg.actor_critic, 
    ppo_runner.obs_normalizer, 
    path=export_model_dir, 
    filename="policy.pt"
)
# ONNX export for cross-platform deployment
export_policy_as_onnx(
    ppo_runner.alg.actor_critic, 
    normalizer=ppo_runner.obs_normalizer, 
    path=export_model_dir, 
    filename="policy.onnx"
)
Export Output Structure
logs/rsl_rl/{experiment_name}/{run_directory}/exported/
├── policy.pt      # JIT compiled model
└── policy.onnx    # ONNX format model
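The export directory is derived from the checkpoint path, which the short sketch below reproduces with os.path only. The concrete path is an assumed example: the experiment and run names are placeholders, and the checkpoint file is assumed to sit directly in the run directory, as the output structure above implies.

```python
import os

# Assumed example path; the experiment, run, and checkpoint names are placeholders.
resume_path = os.path.join(
    "logs", "rsl_rl", "my_experiment", "2024-01-15_10-30-00", "model_5000.pt"
)

# The run directory is the folder containing the checkpoint file.
run_dir = os.path.dirname(resume_path)
export_model_dir = os.path.join(run_dir, "exported")

exported_files = {
    "jit": os.path.join(export_model_dir, "policy.pt"),
    "onnx": os.path.join(export_model_dir, "policy.onnx"),
}
print(exported_files["jit"])
print(exported_files["onnx"])
```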
Inference Loop Details ⚡
Multi-modal Observation Handling
# Get both standard and reference observations
obs, _ = env.get_observations()
ref_obs, _ = env.get_reference_observations()
# Policy uses both observation types
with torch.inference_mode():
    actions = policy(obs, ref_obs)
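The control flow of the loop can be tried without the simulator. In the sketch below, StubReferenceEnv and stub_policy are hypothetical stand-ins that use plain Python lists instead of tensors, and contextlib.nullcontext replaces torch.inference_mode() so the example runs with the standard library alone. Only the call pattern (two observation streams in, a five-element tuple back from step) follows the snippets on this page.

```python
from contextlib import nullcontext
from typing import Dict, List, Tuple

Obs = List[float]


class StubReferenceEnv:
    """Hypothetical stand-in for the wrapped environment."""

    def __init__(self, obs_dim: int = 3, ref_dim: int = 2) -> None:
        self.obs_dim = obs_dim
        self.ref_dim = ref_dim
        self.step_count = 0

    def get_observations(self) -> Tuple[Obs, Dict]:
        return [0.0] * self.obs_dim, {}

    def get_reference_observations(self) -> Tuple[Obs, Dict]:
        return [1.0] * self.ref_dim, {}

    def step(self, actions: Obs) -> Tuple[Obs, Obs, float, bool, Dict]:
        self.step_count += 1
        obs = [float(self.step_count)] * self.obs_dim
        ref_obs = [1.0] * self.ref_dim
        return obs, ref_obs, 0.0, False, {"step": self.step_count}


def stub_policy(obs: Obs, ref_obs: Obs) -> Obs:
    """Hypothetical policy: one action computed from both inputs."""
    return [sum(obs) + sum(ref_obs)]


def run_inference(env: StubReferenceEnv, num_steps: int) -> List[Obs]:
    obs, _ = env.get_observations()
    ref_obs, _ = env.get_reference_observations()
    history: List[Obs] = []
    for _step in range(num_steps):
        # The real script uses torch.inference_mode(); nullcontext is a no-op placeholder.
        with nullcontext():
            actions = stub_policy(obs, ref_obs)
            obs, ref_obs, _reward, _done, _extras = env.step(actions)
        history.append(actions)
    return history


actions = run_inference(StubReferenceEnv(), num_steps=3)
print(actions)  # prints: [[2.0], [5.0], [8.0]]
```

Both observation streams are refreshed from every step call, so the policy always sees the reference observations that match the current state.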
Performance Optimization
- 🚀 Inference Mode: The loop runs under torch.inference_mode() to reduce overhead
- 🎯 No Gradient Computation: Autograd tracking is disabled during evaluation
- ⚡ Conditional Rendering: render_mode is set to "rgb_array" only when --video is passed
Usage Examples 💡
Quick Model Evaluation
# Run latest trained model for quick evaluation
python play.py --task Isaac-Humanoid-Amp-v0 --num_envs 16
High-Quality Video Recording
# Record long demonstration video
python play.py --task Isaac-Humanoid-Amp-v0 --video --video_length 2000 --num_envs 1
Specific Checkpoint Evaluation
# Evaluate specific training checkpoint
python play.py \
    --task Isaac-Humanoid-Amp-v0 \
    --load_run "2024-01-15_experiment" \
    --load_checkpoint "model_8000.pt" \
    --video
Performance Testing
# Measure throughput with many environments (Fabric disabled, so this exercises the slower USD I/O path)
python play.py --task Isaac-Humanoid-Amp-v0 --num_envs 256 --disable_fabric
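The page does not describe a built-in throughput report, so a measurement would have to be added around the inference loop. The helper below is a hypothetical sketch of one way to do it: it times repeated calls to a step function and reports environment steps per second. The dummy_step function only sleeps; in a real measurement it would be replaced by the policy call and env.step.

```python
import time
from typing import Callable


def measure_throughput(step_fn: Callable[[], None], num_envs: int, num_steps: int) -> float:
    """Return environment steps per second over num_steps calls to step_fn."""
    start = time.perf_counter()
    for _ in range(num_steps):
        step_fn()
    elapsed = time.perf_counter() - start
    # Each call advances all parallel environments by one step.
    return num_envs * num_steps / max(elapsed, 1e-9)


def dummy_step() -> None:
    """Placeholder for one policy call plus one env.step call."""
    time.sleep(0.001)


fps = measure_throughput(dummy_step, num_envs=256, num_steps=20)
print(f"{fps:.0f} env steps/s")
```

Comparing the figure with and without --disable_fabric on the same task would show how much the USD I/O path costs.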
Troubleshooting 🔧
Common Issues
Checkpoint Not Found
# Error: No checkpoint found in logs/rsl_rl/{experiment}/...
Solution: Verify that the experiment name and run directory exist
Task Registration Error
# Error: gymnasium.error.UnregisteredEnv: No registered env with id: YourTask
Solution: Ensure the task name matches the registration in __init__.py
Video Recording Issues
# Error: Camera not enabled for video recording
Solution: Pass the --video flag, which is expected to enable cameras automatically
Performance Optimization Tips
- 🎯 Reduce Environments: Use fewer environments for video recording
- ⚡ Disable Fabric: Use --disable_fabric if experiencing rendering issues
- 📹 Shorter Videos: Reduce --video_length for faster recording
Related Components 🔗
- 🏗️ train.py: Training script that generates the checkpoints
- 📊 OnPolicyRunnerMM: Multi-modal policy runner for inference
- 🎭 RslRlReferenceVecEnvWrapper: Environment wrapper for RSL-RL compatibility
- 💾 Checkpoint Utils: Utilities for checkpoint discovery and loading
This inference script provides a complete evaluation and demonstration system for trained RL/IL agents, enabling easy assessment of training results and generation of high-quality demonstration videos.