ESPADA: Execution Speedup via Semantics Aware Demonstration Data Downsampling for Imitation Learning
Seoul National University
Abstract
Behavior-cloning-based policies inherit the slow, cautious tempo of human demonstrations. ESPADA is a semantically and spatially aware framework that segments demonstrations using a VLM-LLM pipeline with 3D gripper-object relations, enabling aggressive downsampling only in non-critical segments while preserving precision-critical phases. To scale from a single annotated episode to the full dataset, ESPADA propagates segment labels via Dynamic Time Warping on dynamics-only features, achieving an approximately 2x speed-up while maintaining success rates, without extra data, architectural modifications, or any form of retraining.
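The two mechanisms named in the abstract, label propagation via Dynamic Time Warping and segment-aware downsampling, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dtw_path`, `propagate_labels`, `downsample_indices`), the Euclidean frame distance, the boolean "critical" label, and the fixed `stride` are all assumptions made for illustration, and the feature vectors stand in for whatever dynamics-only features ESPADA actually uses.

```python
import numpy as np

def dtw_path(a, b):
    """Textbook O(len(a) * len(b)) DTW; returns aligned (i, j) index pairs."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # assumed frame distance
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack from the last cell to recover the warping path.
    i, j, path = n, m, []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return path

def propagate_labels(ref_feats, ref_critical, tgt_feats):
    """Copy per-frame labels from one annotated episode to an unlabeled one."""
    labels = np.zeros(len(tgt_feats), dtype=bool)
    for i, j in dtw_path(ref_feats, tgt_feats):
        labels[j] = ref_critical[i]  # every target frame appears on the path
    return labels

def downsample_indices(critical, stride=4):
    """Keep all critical frames; keep every `stride`-th frame elsewhere."""
    keep, run = [], 0
    for t, is_critical in enumerate(critical):
        if is_critical:
            keep.append(t)
            run = 0
        else:
            if run % stride == 0:
                keep.append(t)
            run += 1
    return np.array(keep)

# Toy 1-D "dynamics" features: a transit phase (0.0) then a precise phase (1.0).
ref_feats = np.array([[0.0], [0.0], [0.0], [1.0], [1.0], [1.0]])
ref_critical = np.array([False, False, False, True, True, True])
tgt_feats = np.array([[0.0], [0.0], [0.0], [0.0], [0.0], [1.0], [1.0]])

tgt_critical = propagate_labels(ref_feats, ref_critical, tgt_feats)
kept = downsample_indices(tgt_critical, stride=2)
```

In this toy case the longer transit phase of the target episode is thinned to every second frame while both precise-phase frames are kept, which is the behavior the abstract describes at a high level. A real pipeline would also need to adjust the action targets to match the skipped frames, which this sketch does not attempt.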
BibTeX
@article{kim2026espada,
title={ESPADA: Execution Speedup via Semantics Aware Demonstration Data Downsampling for Imitation Learning},
author={Kim, Byungju and Jinu, Pahk and Lee, Chungwoo and Kim, Jaejoon and Lee, Jangha and Kim, Theo Taeyeong and Shim, Kyuhwan and Lee, Jun Ki and Zhang, Byoung-Tak},
journal={IEEE Robotics and Automation Letters},
year={2026}
}