attention mechanism
An attention mechanism is a neural network operation that computes a weighted sum of value vectors based on the similarity between a query and a set of keys. This mechanism allows models to focus on the most relevant information in or across sequences.
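To make the query-key-value idea concrete, here's a minimal sketch of scaled dot-product attention for a single query using NumPy. The function name, array shapes, and scaling choice are illustrative assumptions, not a specific library's API:

```python
import numpy as np

def scaled_dot_product_attention(query, keys, values):
    """Return a weighted sum of value vectors.

    query:  (d_k,)    vector compared against each key
    keys:   (n, d_k)  one key per position
    values: (n, d_v)  one value per position
    """
    d_k = keys.shape[-1]
    # Similarity scores between the query and every key,
    # scaled to keep the softmax well behaved.
    scores = keys @ query / np.sqrt(d_k)
    # Softmax turns scores into attention weights that sum to 1.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # The output is the attention-weighted sum of the values.
    return weights @ values

rng = np.random.default_rng(0)
query = rng.normal(size=4)
keys = rng.normal(size=(3, 4))
values = rng.normal(size=(3, 8))
print(scaled_dot_product_attention(query, keys, values).shape)  # (8,)
```

The weights show where the model "focuses": positions whose keys are most similar to the query contribute most to the output.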
Variants include self-attention for dependencies within a single sequence and cross-attention to condition a decoder on encoder outputs. Earlier sequence-to-sequence models introduced additive attention and global or local attention to address the fixed-length context vector bottleneck, while transformers popularized multi-head self-attention, causal masking for autoregressive generation, and encoder–decoder cross-attention, as sketched below.
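As an illustration of causal masking in self-attention, here's a hedged sketch (assumed shapes and naming, not from the entry) in which each position may only attend to itself and earlier positions:

```python
import numpy as np

def causal_self_attention(x):
    """x: (n, d) sequence; each row serves as its own query, key, and value."""
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)                 # (n, n) query-key similarities
    # Mask out future positions so position i only sees positions <= i.
    mask = np.triu(np.ones((n, n), dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    # Row-wise softmax over the unmasked scores.
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x                            # (n, d) attended outputs

x = np.random.default_rng(1).normal(size=(5, 4))
print(causal_self_attention(x).shape)  # (5, 4)
```

Cross-attention follows the same pattern, except the queries come from the decoder sequence while the keys and values come from the encoder outputs.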
By Leodanis Pozo Ramos • Updated Oct. 21, 2025