LLM Architecture · Foundations · Updated 2026

Transformer

The neural-network architecture behind virtually every modern large language model, built on a mechanism called self-attention instead of recurrence or convolution.

The Transformer is a sequence-modelling architecture introduced by Vaswani et al. (2017). Its central idea, self-attention, lets every position in a sequence attend directly to every other position, so the model can capture long-range relationships in a single step rather than passing information along a chain as recurrent networks do.
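As a rough illustration of that idea, here is a minimal single-head sketch of scaled dot-product self-attention in NumPy. The sizes, the random projection matrices and the names `self_attention`, `w_q`, `w_k` and `w_v` are assumptions made for this example; a real Transformer adds multiple heads, masking, residual connections and learned weights.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum first for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative only).

    x: (seq_len, d_model) array with one row per position.
    w_q, w_k, w_v: (d_model, d_head) projection matrices (hypothetical, random here).
    """
    q = x @ w_q  # queries, one per position
    k = x @ w_k  # keys, one per position
    v = x @ w_v  # values, one per position
    # Entry (i, j) scores how strongly position i attends to position j.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    # Each output row is a weighted mix of the value vectors of all positions.
    return weights @ v, weights

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 5, 8, 4  # toy sizes chosen for the example
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))

out, weights = self_attention(x, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

The `weights` matrix has one row and one column per position, which is the "every position attends to every other position" property in concrete form.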

Because attention over a sequence is highly parallelisable, Transformers train far more efficiently on modern hardware than the RNNs and LSTMs they replaced. That efficiency is what made today's large language models practical to train.
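The contrast can be sketched in a few lines of NumPy. This is a toy comparison under stated assumptions, not a benchmark: `rnn_states` is a plain tanh recurrence with hypothetical random weights, and `attention_scores` skips the query and key projections to keep the point visible. The recurrent loop has to run step by step because each state depends on the previous one, while the attention scores for all pairs of positions come from one matrix product.

```python
import numpy as np

def rnn_states(x, w_h, w_x):
    """Toy tanh recurrence: state t cannot be computed before state t-1."""
    h = np.zeros(w_h.shape[0])
    states = []
    for x_t in x:  # inherently sequential over positions
        h = np.tanh(w_h @ h + w_x @ x_t)
        states.append(h)
    return np.stack(states)

def attention_scores(x):
    """All pairwise position scores from a single matrix product."""
    return x @ x.T / np.sqrt(x.shape[-1])

rng = np.random.default_rng(1)
seq_len, d_model, d_hidden = 6, 4, 3  # toy sizes chosen for the example
x = rng.normal(size=(seq_len, d_model))
w_h = rng.normal(size=(d_hidden, d_hidden))
w_x = rng.normal(size=(d_hidden, d_model))

states = rnn_states(x, w_h, w_x)  # seq_len dependent steps
scores = attention_scores(x)      # one step, no dependency between rows
print(states.shape, scores.shape)
```

Because no row of `scores` depends on another row, the whole product can be spread across the parallel units of a GPU or similar accelerator, which is the property the paragraph above refers to.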

Why it matters

ChatGPT, Claude, Gemini and Grok are all Transformer-based. Understanding attention, layers and context length — all Transformer concepts — is the foundation for almost every other term in this glossary.

References

Primary, peer-reviewed and archival sources for this definition.

Attention Is All You Need

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Advances in Neural Information Processing Systems 30 (NeurIPS 2017).

arXiv:1706.03762

Dictionary & encyclopedic entries

Wikipedia — Transformer (deep learning architecture)
IBM — Think / Topics — What is a transformer model?

Cite this entry

MultipleChat. "Transformer." MultipleChat AI & LLM Glossary, 2026. https://multiple.chat/ai-glossary/transformer

Related terms

LLM (Large Language Model), Embedding, Context Window

