Fine-tuning takes a model that already understands language and nudges its weights on a smaller, targeted dataset. Howard & Ruder (2018) established the modern transfer-learning recipe for NLP with ULMFiT, and Devlin et al. (2019) made pre-train-then-fine-tune the dominant paradigm with BERT.
Compared with prompting, fine-tuning can bake in a behaviour permanently and cheaply at inference — at the cost of a training run and a dataset to maintain.