Sutskever et al. (2014) introduced the sequence-to-sequence framework: an encoder network compresses an input sequence into a representation, and a decoder network generates an output sequence from it. This encoder–decoder pattern underpinned neural machine translation and is the conceptual ancestor of modern generative models.
Attention and then the Transformer were developed to overcome the bottleneck of squeezing a whole input into one fixed vector.