Because a model treats all text in its context as potentially instructive, attacker-controlled text can subvert intended behaviour. Perez & Ribeiro (2022) demonstrated goal-hijacking and prompt-leaking attacks, and Greshake et al. (2023) showed indirect prompt injection, where malicious instructions are planted in web pages or documents the model later retrieves.
Prompt injection is recognised as the top security risk for LLM applications by OWASP, and defending against it is an active area with no complete solution.