How to pre-initialize all the tensors in LeRobot models when training with Accelerate+DeepSpeed
Summary
Training LeRobot models with Accelerate+DeepSpeed Stage 3 Offload requires pre-initializing all tensors to avoid runtime errors caused by FP32 data clips that cannot be dynamically created during training. This issue arises when using optimizer offloading to NVMe devices and specific model features like XVLA.

Root Cause
Dynamic tensor creation: Tensors are typically created on-the-fly …
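
For orientation, the setup named in the summary (DeepSpeed Stage 3 with the optimizer offloaded to NVMe) might be configured roughly as follows. This is a minimal sketch and not the configuration used in the original post: the helper name `build_zero3_nvme_config` and the path `/local_nvme` are hypothetical, and only the `zero_optimization` keys shown are taken from DeepSpeed's documented config schema.

```python
import json


def build_zero3_nvme_config(nvme_path: str) -> dict:
    """Build a DeepSpeed config dict for ZeRO Stage 3 with NVMe optimizer offload.

    `nvme_path` is a placeholder for a directory on a local NVMe drive.
    """
    return {
        "zero_optimization": {
            "stage": 3,  # partition parameters, gradients, and optimizer state
            "offload_optimizer": {
                "device": "nvme",        # keep optimizer state on NVMe storage
                "nvme_path": nvme_path,  # hypothetical mount point
                "pin_memory": True,
            },
        },
    }


if __name__ == "__main__":
    # Print the config as JSON; redirect to a file to reuse it elsewhere.
    print(json.dumps(build_zero3_nvme_config("/local_nvme"), indent=2))
```

The printed JSON could be saved and referenced as the DeepSpeed config file in an Accelerate setup. A real run would need further fields (batch size, precision, and so on) that the excerpt does not specify.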