Frozen vision-language-action (VLA) policies fail unpredictably on precision phases of contact-rich manipulation: their internal latent advances past a sub-goal while the end-effector remains kinematically stalled — we call this failure mode premature commitment. We detect it step-by-step in a fully self-supervised way by reading two estimators of the same quantity — normalized demo phase — off two different sensors: the action-expert pre-head hidden state (policy belief) and proprioception (physical reality). Their disagreement δ_t is the execution gap; under three natural assumptions on kinematic ground-truth, the demo manifold, and policy determinism, δ_t > 0 deterministically implies departure from the success manifold (Theorem 1). On top of the same signal we add a rank-8 residual adapter (≈ 1K parameters) that refines actions only inside the detected critical phases. We prove that the return-to-base mechanism requires both a training-time sparsity regularizer and an inference-time hysteresis hard gate — neither suffices alone — yielding bit-exact Pareto-safety and bounded-Lipschitz smoothness (Lemmas 1–3). We validate on LIBERO-Long with π_0 and π_0.5 backbones.
@article{choi2026cpd,
  title={Self-Supervised Critical Phase Detection for VLA Refinement},
  author={Choi, Chanyeok and Lee, Youngmoon},
  journal={Preprint},
  year={2026},
  note={Preprint, under review},
  keywords={vision-language-action, manipulation, critical phase detection, self-supervised learning, residual policy learning, hysteresis control, LIBERO}
}
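The gating idea described in the abstract of the critical phase detection preprint can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the gate's form, so the signed gap (belief minus proprioceptive phase), the two thresholds `high` and `low`, and the names `HysteresisGate`, `execution_gap`, and `gated_action` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HysteresisGate:
    """Two-threshold gate (hypothetical): opens above `high`, closes below `low`."""
    high: float = 0.15   # assumed opening threshold on the gap
    low: float = 0.05    # assumed closing threshold on the gap
    active: bool = False

    def update(self, gap: float) -> bool:
        # While open, stay open until the gap falls below the lower threshold.
        if self.active:
            if gap < self.low:
                self.active = False
        # While closed, open only when the gap exceeds the upper threshold.
        elif gap > self.high:
            self.active = True
        return self.active


def execution_gap(phase_belief: float, phase_proprio: float) -> float:
    """Assumed form of the gap: policy-belief phase minus proprioceptive phase."""
    return phase_belief - phase_proprio


def gated_action(base_action, residual, gate_open):
    """Return the base action object untouched when the gate is closed."""
    if not gate_open:
        return base_action
    return [a + r for a, r in zip(base_action, residual)]


# Made-up normalized phase estimates in [0, 1], one pair per control step.
belief = [0.10, 0.30, 0.55, 0.60, 0.62, 0.70]
proprio = [0.10, 0.28, 0.35, 0.50, 0.60, 0.69]

gate = HysteresisGate()
states = [gate.update(execution_gap(b, p)) for b, p in zip(belief, proprio)]
print(states)
```

With two thresholds the gate does not chatter when the gap hovers near a single cutoff, and returning the base action unchanged while the gate is closed is one simple way to keep the frozen policy's output untouched outside the detected phases.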
We study reward-poisoning attacks on cooperative multi-agent reinforcement learning, where an attacker agent participates in the same environment as the cooperative crawler agents and places high-reward lure points that redirect the crawlers off-trajectory. On a Unity 50×50 m benchmark we evaluate PPO and SAC in both single- and multi-agent settings, observing cumulative-reward drops of 18.7% (multi-agent PPO) and 20.9% (multi-agent SAC), and up to 98.1% for single-agent SAC. We argue the asymmetry is structural: PPO’s on-policy clipping locks the policy into the first sampled lure region, while SAC’s off-policy replay buffer dilutes poison samples — except at small buffer sizes.
@inproceedings{choi2025poisoning,
  title={Poisoning Attacks on Multi-Agent Reinforcement Learning Systems},
  author={Choi, Chanyeok and Cho, Jaehwan and Lee, Youngmoon},
  booktitle={IEEE-RAS International Conference on Humanoid Robots (Humanoids), Late-Breaking Report},
  year={2025},
  keywords={multi-agent reinforcement learning, adversarial attacks, reward poisoning, PPO, SAC, humanoid robots}
}
2024
ICPR
HAPtics: Human Action Prediction in Real-time via Pose Kinematics
Niaz Ahmad, Saif Ullah, Jawad Khan, Chanyeok Choi, and Youngmoon Lee
In International Conference on Pattern Recognition (ICPR), 2024
@inproceedings{ahmad2024haptics,
  title={{HAPtics}: Human Action Prediction in Real-time via Pose Kinematics},
  author={Ahmad, Niaz and Ullah, Saif and Khan, Jawad and Choi, Chanyeok and Lee, Youngmoon},
  booktitle={International Conference on Pattern Recognition (ICPR)},
  address={Kolkata, India},
  year={2024},
  doi={10.1007/978-3-031-78354-8_10},
  keywords={human pose, action prediction, vision}
}
@inproceedings{choi2024drone,
  title={Causes and Fixes of Unexpected Drone Shutoffs},
  author={Choi, Hojun and Choi, Chanyeok and Lee, Youngmoon},
  booktitle={ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED)},
  year={2024},
  doi={10.1145/3665314.3670847},
  keywords={drones, reliability, field robotics}
}
Snapbot is a manipulator-based photography system that frames, composes, and captures stylized portraits in real time, treating the human subject as a dynamic interaction partner rather than a static target.
@inproceedings{choi2024snapbot,
  title={{Snapbot}: Enabling Dynamic Human-Robot Interactions for Real-Time Computational Photography},
  author={Choi, Chanyeok and Lee, Youngmoon},
  booktitle={ACM/IEEE International Conference on Human-Robot Interaction (HRI), Late-Breaking Report},
  year={2024},
  keywords={human-robot interaction, manipulation, computational photography}
}
2023
Preprint
Leveraging Keypoints as Dynamic Centroids for Unified Representation of Human Pose and Instance Segmentation
Niaz Ahmad, Jawad Khan, Chanyeok Choi, Youngmoon Lee, and Kang G. Shin
@article{ahmad2023keypoints,
  title={Leveraging Keypoints as Dynamic Centroids for Unified Representation of Human Pose and Instance Segmentation},
  author={Ahmad, Niaz and Khan, Jawad and Choi, Chanyeok and Lee, Youngmoon and Shin, Kang G.},
  journal={Preprint},
  year={2023},
  note={Withdrawn from CVPR 2024},
  keywords={human pose, instance segmentation, keypoints}
}
ICDM-W
SSK-DNN: Semantic and Sentiment Knowledge for Incremental Text Sentiment Classification
Jawad Khan, Niaz Ahmad, Chanyeok Choi, Saif Ullah, Gyu Rin Kim, and Youngmoon Lee
In IEEE ICDM Workshop on Incremental Learning (IncrLearn), 2023
@inproceedings{khan2023ssk,
  title={{SSK-DNN}: Semantic and Sentiment Knowledge for Incremental Text Sentiment Classification},
  author={Khan, Jawad and Ahmad, Niaz and Choi, Chanyeok and Ullah, Saif and Kim, Gyu Rin and Lee, Youngmoon},
  booktitle={IEEE ICDM Workshop on Incremental Learning (IncrLearn)},
  address={Shanghai, China},
  year={2023},
  keywords={natural language processing, sentiment classification, incremental learning}
}