When a network is trained on a new task, gradient updates can overwrite the weights that encoded earlier knowledge, erasing prior skills. This failure mode is known as catastrophic forgetting. Kirkpatrick et al. (2017) addressed it with Elastic Weight Consolidation (EWC), which adds a penalty that slows learning on the weights most important to past tasks, letting a model retain old abilities while learning new ones.
Catastrophic forgetting is a central obstacle to continual learning and to fine-tuning models without degrading their general capabilities.
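The idea behind Elastic Weight Consolidation can be illustrated with a small sketch. It assumes the commonly cited form of the penalty, (λ/2) · Σ F_i (θ_i − θ*_i)², where θ* are the weights learned on the old task and F_i is a per-weight importance estimate (the diagonal of the Fisher information in the original paper). The function names and the numeric values below are hypothetical and chosen only for illustration; a real implementation would estimate F from data and add the penalty to the new task's loss.

```python
def ewc_penalty(params, anchor_params, fisher_diag, lam):
    """Quadratic penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i) ** 2."""
    if not (len(params) == len(anchor_params) == len(fisher_diag)):
        raise ValueError("all inputs must have the same length")
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )


def ewc_penalty_grad(params, anchor_params, fisher_diag, lam):
    """Gradient of the penalty for each weight: lam * F_i * (theta_i - theta_star_i)."""
    return [lam * f * (p - a) for p, a, f in zip(params, anchor_params, fisher_diag)]


# Hypothetical values: weight 0 was important for the old task, weight 1 was not.
theta_star = [1.0, 1.0]  # weights after training on the old task
fisher = [10.0, 0.1]     # assumed importance estimates, not measured from data
theta = [1.5, 1.5]       # current weights, both moved by the same amount

penalty = ewc_penalty(theta, theta_star, fisher, lam=1.0)
grad = ewc_penalty_grad(theta, theta_star, fisher, lam=1.0)
```

Although both weights have drifted by the same amount, the gradient of the penalty pulls the important weight back toward its old value far more strongly than the unimportant one. This is the sense in which learning is slowed selectively rather than uniformly.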