ESPADA: Execution Speedup via Semantics Aware Demonstration Data Downsampling for Imitation Learning

Kim, Byungju; Jinu, Pahk; Lee, Chungwoo; Kim, Jaejoon; Lee, Jangha; Kim, Theo Taeyeong; Shim, Kyuhwan; Lee, Jun Ki; Zhang, Byoung-Tak

IEEE Robotics and Automation Letters (RA-L) 2026

Byungju Kim

Pahk Jinu

Chungwoo Lee

Seoul National University

Jaejoon Kim

Jangha Lee

Theo Taeyeong Kim

Kyuhwan Shim

¹Graduate School of AI & ²AI Institute, Seoul National University

Jun Ki Lee

AI Institute, Seoul National University

Byoung-Tak Zhang

¹Graduate School of AI & ²AI Institute, Seoul National University

Abstract

Behavior-cloning-based policies inherit the slow, cautious tempo of human demonstrations. ESPADA is a semantically and spatially aware framework that segments demonstrations using a VLM-LLM pipeline with 3D gripper-object relations, enabling aggressive downsampling only in non-critical segments while preserving precision-critical phases. To scale from a single annotated episode to the full dataset, ESPADA propagates segment labels via Dynamic Time Warping on dynamics-only features. It achieves approximately a 2x speed-up while maintaining success rates, without extra data, architectural modifications, or any form of retraining.
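To make the label-propagation and downsampling idea concrete, here is a minimal sketch, not the authors' implementation. It assumes each frame is summarized by a hypothetical 1-D "dynamics" feature (for example, end-effector speed), that segment labels are a boolean "precision-critical" flag per frame, and that non-critical frames are thinned by a fixed factor. The function names (`dtw_path`, `propagate_labels`, `downsample_indices`), the conservative rule for merging labels, and the factor of 2 are all illustrative assumptions.

```python
from math import inf


def dtw_path(ref, tgt):
    """Align two sequences of feature vectors with classic DTW.

    Returns the warping path as a list of (ref_index, tgt_index) pairs.
    """
    n, m = len(ref), len(tgt)

    def dist(a, b):
        # Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # cost[i][j] = minimal cumulative cost of aligning ref[:i] with tgt[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
            cost[i][j] = dist(ref[i - 1], tgt[j - 1]) + step

    # Backtrack from the last cell to recover one optimal path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def propagate_labels(ref_critical, path, tgt_len):
    """Copy per-frame labels from the annotated episode along the DTW path.

    Assumption: a target frame is marked critical if ANY aligned reference
    frame is critical (a conservative choice made for this sketch).
    """
    tgt_critical = [False] * tgt_len
    for i, j in path:
        tgt_critical[j] = tgt_critical[j] or ref_critical[i]
    return tgt_critical


def downsample_indices(critical, factor):
    """Keep every critical frame; keep every `factor`-th non-critical frame."""
    kept, run = [], 0
    for t, is_critical in enumerate(critical):
        if is_critical:
            kept.append(t)
            run = 0  # restart the stride after each critical frame
        else:
            if run % factor == 0:
                kept.append(t)
            run += 1
    return kept


# Toy data: high speed = free-space motion, low speed = precise manipulation.
ref_feats = [[1.0], [1.0], [1.0], [0.1], [0.1], [1.0], [1.0]]
ref_critical = [False, False, False, True, True, False, False]
tgt_feats = [[1.0], [1.0], [0.1], [0.1], [0.1], [1.0]]

path = dtw_path(ref_feats, tgt_feats)
tgt_critical = propagate_labels(ref_critical, path, len(tgt_feats))
kept = downsample_indices(tgt_critical, factor=2)
print(tgt_critical, kept)
```

On this toy data the slow frames of the unlabeled episode (indices 2 to 4) inherit the precision-critical label and are all kept, while the fast frames are thinned, giving kept indices `[0, 2, 3, 4, 5]`. A real pipeline would use richer dynamics features and per-segment acceleration factors; those choices are outside what this sketch shows.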

Materials

Paper | Project Page

BibTeX

@article{kim2026espada,
  title={ESPADA: Execution Speedup via Semantics Aware Demonstration Data Downsampling for Imitation Learning},
  author={Kim, Byungju and Jinu, Pahk and Lee, Chungwoo and Kim, Jaejoon and Lee, Jangha and Kim, Theo Taeyeong and Shim, Kyuhwan and Lee, Jun Ki and Zhang, Byoung-Tak},
  journal={IEEE Robotics and Automation Letters},
  year={2026}
}