Self-attention in the Transformer (Vaswani et al., 2017) operates over a bounded sequence of tokens; the maximum length of that sequence is the context window. Everything the model can "see" at once (instructions, prior turns, retrieved documents, and the text it is currently producing) must fit inside it.
When a conversation outgrows the window, something has to be dropped, and typically it is the oldest content. The model itself does not raise an error; it simply cannot attend to what is no longer in its input. This is why long conversations lose earlier detail and why retrieval is used to feed only the most relevant passages back in.
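
The drop-the-oldest behavior described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular system implements it: `count_tokens` is a hypothetical stand-in that counts whitespace-separated words rather than running a real tokenizer, and `fit_to_window` is an illustrative helper that always keeps the instructions and then keeps as many of the most recent turns as the remaining budget allows.

```python
from collections import deque


def count_tokens(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer: one token per word."""
    return len(text.split())


def fit_to_window(instructions: str, turns: list[str], window: int) -> list[str]:
    """Return the instructions plus the most recent turns that fit in `window` tokens."""
    budget = window - count_tokens(instructions)
    if budget < 0:
        raise ValueError("instructions alone exceed the context window")

    kept = deque()
    # Walk from newest to oldest and stop at the first turn that does not fit,
    # so everything older than that turn silently falls out of view.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if cost > budget:
            break
        kept.appendleft(turn)
        budget -= cost
    return [instructions, *kept]


turns = [
    "my name is Ada",    # 4 tokens, oldest
    "I live in London",  # 4 tokens
    "what is my name",   # 4 tokens, newest
]
context = fit_to_window("answer briefly", turns, window=10)
print(context)
```

With a 10-token window, the 2-token instructions leave room for only the two newest turns, so the script prints `['answer briefly', 'I live in London', 'what is my name']`. No error is raised, yet the turn that answers the final question is no longer in the input.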