Safety · Security · Updated 2026

Jailbreak

A prompt crafted to bypass a model's safety training and make it produce content it is meant to refuse.

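A jailbreak circumvents the guardrails installed by safety training. Wei et al. (2023) attribute jailbreak success to two failure modes: competing objectives, where a model's capabilities and instruction-following goals conflict with its safety goals, and mismatched generalisation, where safety training fails to cover inputs that the model's capabilities do. Zou et al. (2023) showed that automatically optimised adversarial suffixes can transfer across many models, including closed commercial ones.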

Jailbreaking differs from prompt injection: a jailbreak targets the model's own safety policy, while injection smuggles instructions through data the model processes. Both are active, unsolved security problems.
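
The difference in entry point can be sketched in a few lines of Python. This is an illustrative model only, not any vendor's API: the `Message` type, the role names, and the `PLACEHOLDER` string are assumptions made for this example, and the placeholder stands in for adversarial text rather than reproducing any.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    role: str     # hypothetical role label: "system", "user", or "tool"
    content: str


# Stand-in only; this is not a real attack string.
PLACEHOLDER = "<adversarial text>"


def jailbreak_conversation() -> list[Message]:
    """Jailbreak: the adversarial text is the user's own turn."""
    return [
        Message("system", "Follow the safety policy."),
        Message("user", PLACEHOLDER),
    ]


def injection_conversation() -> list[Message]:
    """Prompt injection: the user's request is benign, but the data
    the model is asked to process carries the adversarial text."""
    return [
        Message("system", "Summarise the retrieved page for the user."),
        Message("user", "Please summarise this page."),
        Message("tool", f"Page text ... {PLACEHOLDER} ..."),
    ]


def entry_roles(messages: list[Message]) -> list[str]:
    """Return the roles of the messages that carry the placeholder."""
    return [m.role for m in messages if PLACEHOLDER in m.content]


print(entry_roles(jailbreak_conversation()))   # ['user']
print(entry_roles(injection_conversation()))   # ['tool']
```

In the first conversation the person talking to the model is the attacker; in the second, the attacker is whoever wrote the processed data, and the user may be unaware of it.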

References

Primary, peer-reviewed and archival sources for this definition.

Jailbroken: How Does LLM Safety Training Fail?

Wei, A., Haghtalab, N., & Steinhardt, J. (2023). Advances in Neural Information Processing Systems 36 (NeurIPS 2023).

Source arXiv:2307.02483

Universal and Transferable Adversarial Attacks on Aligned Language Models

Zou, A., Wang, Z., Carlini, N., Nasr, M., Kolter, J. Z., & Fredrikson, M. (2023). arXiv preprint.

Source arXiv:2307.15043

Dictionary & encyclopedic entries

OWASP — LLM Top 10 — Prompt Injection & jailbreaking
Wikipedia — Prompt injection — Jailbreaking

Cite this entry

MultipleChat. "Jailbreak." MultipleChat AI & LLM Glossary, 2026. https://multiple.chat/ai-glossary/jailbreak

Related terms

Prompt Injection, Alignment, System Prompt
