transformer architecture
The transformer architecture is a neural network design that models dependencies between sequence positions using self-attention instead of recurrence or convolutions.
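The core operation is scaled dot-product attention: each position produces a query, key, and value vector, and each output is a weighted mix of every position's values, so any two positions can interact in a single step. The NumPy sketch below illustrates the idea on random toy data; the function name, shapes, and weight matrices are illustrative assumptions, not part of any particular library:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence of token vectors."""
    q = x @ w_q  # queries: (seq_len, d_model)
    k = x @ w_k  # keys:    (seq_len, d_model)
    v = x @ w_v  # values:  (seq_len, d_model)
    scores = q @ k.T / np.sqrt(k.shape[-1])         # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ v  # each output mixes information from every position

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.normal(size=(seq_len, d_model))  # toy token embeddings
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (4, 8)
```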
A standard transformer stacks encoder and decoder blocks composed of multi-head self-attention and position-wise feed-forward layers, each wrapped with residual connections and layer normalization.
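As a rough sketch of what one such block looks like, the following PyTorch module combines nn.MultiheadAttention with a two-layer feed-forward network and wraps each sub-layer in a residual connection plus layer normalization. The class name, dimensions, and post-norm layout are illustrative choices, not a reference implementation:

```python
import torch
from torch import nn

class EncoderBlock(nn.Module):
    """One transformer encoder block: multi-head self-attention and a
    feed-forward network, each with a residual connection and layer norm."""

    def __init__(self, d_model=512, num_heads=8, d_ff=2048):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        attn_out, _ = self.attn(x, x, x)  # self-attention: q = k = v = x
        x = self.norm1(x + attn_out)      # residual + layer norm
        x = self.norm2(x + self.ff(x))    # residual + layer norm
        return x

block = EncoderBlock()
tokens = torch.randn(2, 10, 512)  # (batch, seq_len, d_model)
print(block(tokens).shape)        # torch.Size([2, 10, 512])
```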
Transformers can be specialized for different goals: encoder-only models for representation learning and classification, decoder-only models for autoregressive generation, and encoder–decoder models for sequence-to-sequence tasks.
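One hedged way to picture the difference between these variants uses PyTorch's stock building blocks: an encoder-only stack produces contextual representations, the same stack with a causal mask approximates decoder-only autoregressive attention (decoder-only models drop cross-attention), and nn.Transformer wires a full encoder–decoder together. The layer counts and tensor shapes below are arbitrary toy values:

```python
import torch
from torch import nn

d_model, num_heads = 512, 8
layer = nn.TransformerEncoderLayer(d_model, num_heads, batch_first=True)

# Encoder-only (BERT-style): contextual representations of the whole input.
encoder = nn.TransformerEncoder(layer, num_layers=6)
tokens = torch.randn(2, 10, d_model)
reps = encoder(tokens)

# Decoder-only (GPT-style), approximated here as self-attention with a causal
# mask so each position attends only to earlier positions.
causal_mask = nn.Transformer.generate_square_subsequent_mask(10)
generated = encoder(tokens, mask=causal_mask)

# Encoder–decoder (original transformer): source feeds the encoder, target
# feeds the decoder, which also attends to the encoder via cross-attention.
seq2seq = nn.Transformer(d_model, num_heads, batch_first=True)
src, tgt = torch.randn(2, 12, d_model), torch.randn(2, 10, d_model)
out = seq2seq(src, tgt)
print(reps.shape, generated.shape, out.shape)
```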
By Leodanis Pozo Ramos • Updated Oct. 21, 2025