LLM Architecture · Foundations · Updated 2026

Vision Transformer (ViT)

A Transformer applied directly to image patches, showing that the architecture behind language models also excels at computer vision.

Dosovitskiy et al. (2021) split an image into fixed-size patches (16x16 pixels in the paper's title configuration), flattened and linearly projected each patch into a token embedding, added position embeddings, and fed the resulting sequence to a standard Transformer encoder. With enough pre-training data, this Vision Transformer matched or beat leading convolutional networks while requiring substantially less compute to pre-train, demonstrating that self-attention is a general architecture, not a language-only one.
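
The patch-to-token step can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it assumes a 224x224 RGB input and 16x16 patches, and the function names `patchify` and `embed_patches`, the embedding width `d_model = 64`, and the random weights are hypothetical choices made for the example. In a trained ViT the projection, the class token, and the position embeddings are learned parameters.

```python
import numpy as np


def patchify(image: np.ndarray, patch_size: int = 16) -> np.ndarray:
    """Split an (H, W, C) image into a sequence of flattened patches."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("height and width must be multiples of patch_size")
    rows, cols = h // patch_size, w // patch_size
    # (rows, P, cols, P, C) -> (rows, cols, P, P, C) -> (rows * cols, P * P * C)
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(rows * cols, patch_size * patch_size * c)


def embed_patches(patches, weight, cls_token, pos_embed):
    """Project patches to tokens, prepend a class token, add position embeddings."""
    tokens = patches @ weight                                  # (N, D)
    tokens = np.concatenate([cls_token[None, :], tokens], 0)   # (N + 1, D)
    return tokens + pos_embed                                  # (N + 1, D)


# Illustrative sizes only (assumptions, not values from a trained model).
rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))
d_model = 64

patches = patchify(image, patch_size=16)       # 14 * 14 = 196 patches of 16 * 16 * 3 = 768 values
weight = rng.normal(size=(patches.shape[1], d_model))
cls_token = rng.normal(size=d_model)
pos_embed = rng.normal(size=(patches.shape[0] + 1, d_model))

tokens = embed_patches(patches, weight, cls_token, pos_embed)
print(patches.shape, tokens.shape)
```

The resulting `tokens` array plays the same role as a sequence of word embeddings in a language model: it is what a standard Transformer encoder would consume.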

ViTs are now common as image encoders in multimodal models, where image patches and text tokens are processed by the same or closely related Transformer components.

References

Primary, peer-reviewed and archival sources for this definition.

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., et al. (2021). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. International Conference on Learning Representations (ICLR 2021). arXiv:2010.11929

Dictionary & encyclopedic entries

Wikipedia — Vision transformer
IBM — Think / Topics — What is a vision transformer?

Cite this entry

MultipleChat. "Vision Transformer (ViT)." MultipleChat AI & LLM Glossary, 2026. https://multiple.chat/ai-glossary/vit

Related terms

Transformer, CNN (Convolutional Neural Network), Multimodal
