Efficient Policy Learning via Knowledge Distillation for Robotic Manipulation
Date
2025
Authors
Severhin, Oleksandr
Kuzmenko, Dmytro
Shvai, Nadiya
Journal Title
Journal ISSN
Volume Title
Publisher
National University of Kyiv-Mohyla Academy
Abstract
This work addresses the computational intractability of large-scale Reinforcement Learning (RL) models for robotic manipulation. While world-model-based agents such as TD-MPC2 achieve strong performance across diverse manipulation tasks, their large parameter count (e.g., 317M) hinders training and deployment on resource-constrained hardware. This research investigates Knowledge Distillation (KD), with a loss function described in [1] and [2], as the primary method for model compression. The approach trains a lightweight "student" model to mimic the behavior of a large, pre-trained "teacher" model. Unlike in supervised learning, distilling knowledge in RL is uniquely complex: the objective is to transfer a dynamic, reward-driven policy rather than a simple input-output mapping.
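The teacher-student transfer described above is commonly formulated as minimizing the divergence between the teacher's and student's action distributions over visited states. The specific loss from [1] and [2] is not reproduced in the abstract, so the following is only an illustrative sketch of a temperature-scaled KL policy-distillation loss over discrete action logits; all function names here are hypothetical.

```python
import numpy as np

def softmax(logits, tau=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / tau
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def policy_distillation_loss(student_logits, teacher_logits, tau=2.0):
    """KL(teacher || student) between softened action distributions,
    averaged over a batch of states. Higher tau exposes more of the
    teacher's relative action preferences ("dark knowledge")."""
    p = softmax(teacher_logits, tau)                 # soft teacher targets
    log_p = np.log(p + 1e-12)
    log_q = np.log(softmax(student_logits, tau) + 1e-12)
    return float(np.mean(np.sum(p * (log_p - log_q), axis=-1)))

# Usage sketch: batch of 4 states, 6 discrete actions.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 6))
student = rng.normal(size=(4, 6))
loss = policy_distillation_loss(student, teacher)
```

A perfectly matching student (identical logits) drives this loss to zero, which is why it serves as a pure imitation objective; in practice RL distillation pipelines often combine it with an environment-reward term so the student can also correct the teacher's mistakes.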
Description
Keywords
model compression, Reinforcement Learning (RL), robotic manipulation, world models / TD-MPC2, conference materials
Citation
Severhin O. Efficient Policy Learning via Knowledge Distillation for Robotic Manipulation / Severhin O., Kuzmenko D., Shvai N. // Theoretical and Applied Aspects of Software Systems Development : proceedings of the 16th International Scientific and Practical Conference, November 23-24, 2025, Kyiv / [ed. by M. M. Hlybovets, T. V. Panchenko et al. ; Faculty of Informatics, National University of Kyiv-Mohyla Academy et al.]. - Kyiv : NaUKMA, 2025. - P. 64-66.