Skip to content

transformer

A transformer is a neural network model that processes sequences using self-attention to capture long-range dependencies without recurrence or convolutions.

The standard design stacks attention and feed-forward layers with residual connections, layer normalization, and positional encodings. Common variants include encoder-only models for representation learning, decoder-only models for autoregressive generation, and encoder–decoder models for sequence-to-sequence tasks.

Transformers form the foundation of modern language and vision models because of efficient parallel training, strong scaling behavior, and adaptable decoding for tasks such as classification, translation, and text or image generation.

Tutorial

Hugging Face Transformers: Leverage Open-Source AI in Python

As the AI boom continues, the Hugging Face platform stands out as the leading open-source model hub. In this tutorial, you'll get hands-on experience with Hugging Face and the Transformers library in Python.

intermediate ai

For additional information on related topics, take a look at the following resources:


By Leodanis Pozo Ramos • Updated July 1, 2026