ref_obs_term_cfg
Module: GBC.gyms.isaaclab_45.managers.ref_obs_term_cfg
This module defines the configuration framework for reference observation terms used by the RefObservationManager. It provides three core configuration classes that enable flexible and comprehensive specification of reference observation behavior, processing pipelines, and organizational structures for imitation learning tasks.
📚 Dependencies
from __future__ import annotations
import torch
from collections.abc import Callable
from dataclasses import MISSING
from typing import TYPE_CHECKING, Any, Tuple, Union, Literal
from isaaclab.utils import configclass
from isaaclab.utils.modifiers import ModifierCfg
from isaaclab.utils.noise import NoiseCfg
from isaaclab.managers.manager_base import ManagerTermBaseCfg
🏗️ Configuration Classes
📋 ReferenceObservationTermCfg
Module Name: GBC.gyms.isaaclab_45.managers.ref_obs_term_cfg.ReferenceObservationTermCfg
Purpose: Defines the configuration for individual reference observation terms, specifying data sources, processing pipelines, temporal behavior, and network integration settings.
Class Definition:
@configclass
class ReferenceObservationTermCfg(ManagerTermBaseCfg):
"""Configuration for a reference observation term."""
🔧 Core Data Specification:
- name (str | dict | None): Identifier for the reference observation term in pickle data
- func (Callable | None): Processing function to modify or transform the term
- params (dict[str, Any] | None): Parameters passed to the processing function
🎭 Symmetry and Mirroring:
- symmetry (Callable | None): Function for computing symmetrical observations (left-right mirroring)
- symmetry_params (dict[str, Any]): Parameters for the symmetry function (e.g., flipper configurations)
⚙️ Processing Pipeline:
- modifiers (list[ModifierCfg] | None): Sequential modification functions applied to observations
- noise (NoiseCfg | None): Noise model for domain randomization
- clip (tuple[float, float] | None): Value clipping bounds for outlier prevention
- scale (float | None): Linear scaling factor for normalization
🎯 Behavior Control:
- make_empty (bool): Generate empty observations (useful for debugging or placeholder terms)
- in_obs_tensor (bool): Include observation in network input tensor (False for statistics/debugging)
- is_constant (bool): Time-invariant observation flag (for statistics or static parameters)
- is_base_pose (bool): Use cumulative base pose computation from velocity integration
⏱️ Temporal Configuration:
- load_seq_delay (float): Sequence loading delay in seconds (for temporal offset effects)
- history_length (int): Number of past observations to store for temporal dependencies
- flatten_history_dim (bool): Flatten history dimensions from (N, H, D) to (N, H*D) format
🎯 Functionality: This class provides comprehensive control over individual observation terms, enabling:
- Data Source Specification: Define which data from pickle files to load
- Processing Customization: Apply custom transformations and processing functions
- Symmetry Support: Enable left-right mirroring for data augmentation
- Temporal Management: Configure history buffering and temporal delays
- Network Integration: Control inclusion in neural network inputs
- Quality Control: Apply noise, clipping, and scaling for robust training
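For illustration of the fields above, a minimal term configuration might look as follows. This is a sketch that assumes the fields behave as plain dataclass attributes; the pickle key "dof_pos", the helper scale_ref_term, and its call signature are hypothetical and not part of the documented API.
import torch
from GBC.gyms.isaaclab_45.managers.ref_obs_term_cfg import ReferenceObservationTermCfg

def scale_ref_term(data: torch.Tensor, gain: float) -> torch.Tensor:
    """Hypothetical processing function; the signature expected by the manager is assumed."""
    return data * gain

# Hypothetical term: read "dof_pos" from the pickle data, rescale it, clip
# outliers, and keep a short history buffer for temporal context.
ref_dof_pos = ReferenceObservationTermCfg(
    name="dof_pos",            # key of the term in the pickle data (assumed)
    func=scale_ref_term,       # custom processing function
    params={"gain": 1.0},      # forwarded to the processing function
    clip=(-3.0, 3.0),          # clamp extreme reference values
    scale=1.0,                 # linear scaling for normalization
    history_length=4,          # store the last 4 reference samples
    flatten_history_dim=True,  # reshape history (N, 4, D) -> (N, 4*D)
)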
📊 ReferenceObservationGroupCfg
Module Name: GBC.gyms.isaaclab_45.managers.ref_obs_term_cfg.ReferenceObservationGroupCfg
Purpose: Organizes multiple reference observation terms into logical groups with shared behavior and output formatting, enabling efficient management of related observations.
Class Definition:
@configclass
class ReferenceObservationGroupCfg:
"""Configuration for a group of reference observation terms."""
🔗 Output Organization:
- concatenate_terms (bool): Concatenate all terms in the group along the last dimension; if False, return them as a dictionary
- enable_corruption (bool): Enable corruption/noise models for all terms in the group
⏱️ Group-Level Temporal Settings:
- load_seq_delay (float): Global sequence loading delay for all terms (overrides individual term delays)
- history_length (int | None): Global history length for all terms (overrides individual settings if set)
- flatten_history_dim (bool): Global history flattening behavior for all terms
🎯 Functionality: This class enables:
- Logical Organization: Group related observations (e.g., policy vs. critic observations)
- Consistent Behavior: Apply uniform temporal and processing settings across related terms
- Output Formatting: Control whether observations are concatenated or returned as dictionaries
- Hierarchical Configuration: Override individual term settings with group-level defaults
⚠️ Important Design Note:
For PPO training with teacher coefficients, reference actions must be the first term in the group for proper critic processing. This ordering requirement ensures correct indexing in the training pipeline.
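As a sketch, a group definition might look like the following, assuming terms are declared as class attributes in the usual IsaacLab configclass style (with attribute order determining term order); the term names are illustrative assumptions.
from isaaclab.utils import configclass
from GBC.gyms.isaaclab_45.managers.ref_obs_term_cfg import (
    ReferenceObservationGroupCfg,
    ReferenceObservationTermCfg,
)

@configclass
class PolicyRefObsGroupCfg(ReferenceObservationGroupCfg):
    """Hypothetical policy-observation group."""

    # Reference actions are declared first, as required for PPO training
    # with teacher coefficients (see the design note above).
    ref_actions = ReferenceObservationTermCfg(name="target_actions")
    ref_base_lin_vel = ReferenceObservationTermCfg(name="lin_vel")

    def __post_init__(self):
        self.concatenate_terms = True   # return one concatenated tensor
        self.enable_corruption = True   # allow per-term noise models
        self.history_length = 2         # group-level override for all terms
        self.flatten_history_dim = True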
🏛️ ReferenceObservationCfg
Module Name: GBC.gyms.isaaclab_45.managers.ref_obs_term_cfg.ReferenceObservationCfg
Purpose: Top-level configuration class that specifies data sources, operational modes, and global settings for the entire reference observation system.
Class Definition:
@configclass
class ReferenceObservationCfg:
"""Main configuration for reference observation system."""
📁 Data Source Configuration:
- data_dir (list[str]): List of directories containing GBC standard format pickle files (supports recursive search)
🔄 Operational Modes:
- working_mode (Literal["recurrent", "recurrent_strict", "singular"]): Defines how reference sequences are played back:
- "recurrent": Sequences with cyclic_subseq play repeatedly; others play once then zero out
- "recurrent_strict": Only sequences with cyclic_subseq are used; others discarded
- "singular": All sequences play once through entire episode then zero out
⏱️ Global Timing:
- static_delay (float): Global static delay for all reference observation retrieval (in seconds)
🎯 Functionality: This class provides:
- System-Wide Configuration: Control global behavior of the reference observation system
- Data Source Management: Specify multiple data directories for comprehensive reference datasets
- Playback Control: Define how reference sequences are handled during episodes
- Temporal Coordination: Set global timing offsets for reference data access
📊 Working Mode Details:
"recurrent" Mode:
- Cyclic Sequences: Sequences with cyclic_subseq metadata repeat continuously
- Linear Sequences: Sequences without cyclic metadata play once, then observations become zero
- Flexible Training: Suitable for mixed datasets with both cyclic (walking) and acyclic (gestures) motions
"recurrent_strict" Mode:
- Cyclic Only: Only sequences with cyclic_subseq metadata are loaded and used
- Continuous Playback: All loaded sequences repeat continuously throughout episodes
- Locomotion Focus: Ideal for purely locomotion-based training tasks
"singular" Mode:
- Single Playback: All sequences play through once during episodes
- Episode Alignment: Sequence timing aligned with episode duration
- Task-Specific: Suitable for discrete task training or demonstration following
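A sketch of selecting a working mode at the system level is shown below; the data directories are placeholder paths, and how group configurations are attached to this class is not shown here.
from GBC.gyms.isaaclab_45.managers.ref_obs_term_cfg import ReferenceObservationCfg

# Hypothetical system-level configuration for a locomotion-only task:
# "recurrent_strict" keeps only sequences carrying cyclic_subseq metadata
# and repeats them for the full episode.
ref_obs_cfg = ReferenceObservationCfg(
    data_dir=["/data/gbc/walk", "/data/gbc/run"],  # searched recursively (placeholder paths)
    working_mode="recurrent_strict",
    static_delay=0.02,                             # 20 ms global retrieval delay
)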
🏗️ Configuration Hierarchy
ReferenceObservationCfg (System Level)
├── data_dir: Global data sources
├── working_mode: System playback behavior
├── static_delay: Global timing offset
└── [Group Configurations]
    └── ReferenceObservationGroupCfg (Group Level)
        ├── concatenate_terms: Output formatting
        ├── load_seq_delay: Group timing override
        ├── history_length: Group history override
        └── [Term Configurations]
            └── ReferenceObservationTermCfg (Term Level)
                ├── name: Data source identifier
                ├── func: Processing function
                ├── symmetry: Mirroring function
                ├── processing pipeline: noise, clip, scale
                ├── behavior: in_obs_tensor, is_constant
                └── temporal: load_seq_delay, history_length
🔧 Configuration Override Hierarchy:
- System Level: ReferenceObservationCfg sets global defaults
- Group Level: ReferenceObservationGroupCfg can override settings for all terms in the group
- Term Level: ReferenceObservationTermCfg provides the finest-grained control
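A brief sketch of this override behavior, assuming the same class-attribute style as the group example above; the term name "dof_vel" is illustrative.
from isaaclab.utils import configclass
from GBC.gyms.isaaclab_45.managers.ref_obs_term_cfg import (
    ReferenceObservationGroupCfg,
    ReferenceObservationTermCfg,
)

@configclass
class CriticRefObsGroupCfg(ReferenceObservationGroupCfg):
    """Hypothetical critic group illustrating a group-level override."""

    # The term requests a single-step history...
    ref_dof_vel = ReferenceObservationTermCfg(name="dof_vel", history_length=1)

    def __post_init__(self):
        # ...but the group-level value takes precedence for every term in
        # this group, per the override hierarchy above.
        self.history_length = 4
        self.flatten_history_dim = True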
This hierarchical configuration system enables both broad system-wide settings and fine-grained per-term customization, providing maximum flexibility for diverse imitation learning scenarios while maintaining organizational clarity and configuration consistency.