Researchers introduce Self-Distillation Fine-Tuning (SDFT), an approach that lets models learn continually from demonstrations without catastrophic forgetting, outperforming standard supervised fine-tuning by preserving prior capabilities while acquiring new skills.
Continual learning, the ability of AI models to acquire new knowledge and skills without forgetting what they have already learned, is one of the most significant challenges in modern machine learning. A new paper by Idan Shenfeld, Mehul Damani, Jonas Hübotter, and Pulkit Agrawal introduces a promising solution called Self-Distillation Fine-Tuning (SDFT) that addresses this fundamental limitation in foundation models.
The core problem with current approaches to continual learning lies in the trade-off between acquiring new knowledge and preserving existing capabilities. Traditional methods typically fall into two categories: on-policy reinforcement learning and supervised fine-tuning (SFT). On-policy reinforcement learning can effectively reduce forgetting, but it requires explicit reward functions that are often unavailable in real-world applications. SFT, the primary alternative for learning from expert demonstrations, is inherently off-policy and suffers from catastrophic forgetting, in which new training overwrites previously learned capabilities.
SDFT presents an elegant solution by enabling on-policy learning directly from demonstrations without requiring reward functions. The method leverages in-context learning by using a demonstration-conditioned model as its own teacher. This self-distillation approach generates on-policy training signals that preserve prior capabilities while simultaneously acquiring new skills.
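The paper's exact prompting and training recipe is not reproduced here, but the core construction can be sketched in a few lines. The snippet below is a minimal illustration, assuming a Hugging Face causal language model; the choice of gpt2 and the `demo` and `prompt` strings are placeholders, not the authors' setup. The key move is that the teacher is the same model as the student, just conditioned on the demonstration in context:

```python
# Minimal sketch of SDFT's teacher/student construction (illustrative only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small stand-in; the paper targets larger foundation models
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

demo = "Q: 12 * 7 = ?\nA: 84\n"  # hypothetical expert demonstration
prompt = "Q: 9 * 6 = ?\nA:"      # new query the model should answer

# Student view: the model conditioned on the prompt alone.
student_ids = tok(prompt, return_tensors="pt").input_ids
# Teacher view: the *same* model, with the demonstration prepended in context.
teacher_ids = tok(demo + prompt, return_tensors="pt").input_ids

# Sample an on-policy response from the student policy.
with torch.no_grad():
    sample = model.generate(
        student_ids, max_new_tokens=8, do_sample=True,
        pad_token_id=tok.eos_token_id,
    )
response = sample[:, student_ids.shape[1]:]  # keep only the generated tokens
```

Because the response is sampled from the student's own distribution, the training signal stays on-policy, which is what traditional SFT on fixed expert tokens cannot offer.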
The technical implementation of SDFT is conceptually straightforward yet powerful. By treating the model as both student and teacher, it creates a self-reinforcing learning process that maintains alignment with previous capabilities while adapting to new information. This approach avoids the distribution shift inherent in traditional SFT methods, where the model's behavior changes significantly between training on old and new tasks.
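Concretely, a training step only needs to pull the student's token distribution toward the demonstration-conditioned teacher's on the student's own samples. Continuing the sketch above, the per-token KL objective below is an assumption about the loss form, not the paper's exact recipe:

```python
# Illustrative self-distillation step; the KL direction and token-level
# averaging are assumptions, not necessarily the paper's exact objective.
import torch.nn.functional as F

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def response_log_probs(ids, ctx_len):
    """Log-probabilities the model assigns to each response token."""
    logits = model(ids).logits
    # Logits at position t predict token t + 1, so slice the response region.
    return F.log_softmax(logits[:, ctx_len - 1 : ids.shape[1] - 1, :], dim=-1)

student_full = torch.cat([student_ids, response], dim=1)
teacher_full = torch.cat([teacher_ids, response], dim=1)

# Teacher targets come from the demonstration-conditioned pass; no gradient.
with torch.no_grad():
    teacher_logp = response_log_probs(teacher_full, teacher_ids.shape[1])
student_logp = response_log_probs(student_full, student_ids.shape[1])

# KL(teacher || student) on self-generated tokens: an on-policy distillation
# signal obtained without any reward function.
loss = F.kl_div(student_logp, teacher_logp,
                log_target=True, reduction="batchmean")
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Because the student is only nudged toward what it already produces when shown the demonstration, the update stays close to the model's current distribution, which is the mechanism by which the distribution shift of off-policy SFT is avoided.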
Experimental results demonstrate SDFT's superiority across multiple skill learning and knowledge acquisition tasks. The method consistently outperforms traditional SFT, achieving higher accuracy on new tasks while substantially reducing catastrophic forgetting. In sequential learning experiments, SDFT enabled a single model to accumulate multiple skills over time without performance regression—a critical advancement for building truly adaptive AI systems.
The implications of this research extend beyond academic interest. As foundation models become increasingly prevalent in real-world applications, the ability to continually learn without forgetting will be essential for deployment in dynamic environments. SDFT provides a practical path toward this goal, establishing on-policy distillation as a viable approach for continual learning from demonstrations alone.
The researchers have made their work available through arXiv:2601.19897, where they provide detailed methodology, experimental results, and analysis. This paper represents a significant step forward in addressing one of machine learning's most persistent challenges, bringing us closer to AI systems that can learn and adapt like humans—building upon past knowledge while acquiring new capabilities.
As the field continues to evolve, methods like SDFT will play a crucial role in developing more robust and adaptable AI systems. The simplicity of the approach, combined with its strong empirical performance, suggests it could see rapid adoption in both research and applied settings. Future work may explore extensions to more complex learning scenarios and integration with other continual learning techniques to further enhance performance.
