large language model (LLM)

A large language model (LLM) is a neural network trained on very large text corpora using self-supervised objectives, such as next-token prediction or masked modeling. By learning to predict missing or upcoming tokens, the model develops broad language understanding and generation capabilities.
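
As a minimal sketch of next-token prediction, the following code asks a small pre-trained model to score candidate continuations of a prompt. It assumes the Hugging Face `transformers` library and PyTorch are installed, and uses the `gpt2` checkpoint purely for illustration:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # Shape: (batch, sequence, vocab)

# The logits at the last position score every vocabulary token as a
# candidate next token; the argmax is the model's top prediction.
next_token_id = logits[0, -1].argmax().item()
print(tokenizer.decode(next_token_id))  # Likely " Paris"
```

Repeating this step, appending each predicted token to the prompt, is the loop that underlies text generation.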

Modern LLMs are typically transformer-based. They're pre-trained once on large unlabeled text corpora, then adapted to downstream tasks through prompting, fine-tuning, or intermediate techniques such as instruction tuning and reinforcement learning from human feedback (RLHF).
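
To illustrate adaptation through prompting alone, the sketch below sends two different task descriptions to the same frozen model. It assumes `transformers` is installed, with `google/flan-t5-small` standing in for any instruction-tuned checkpoint:

```python
from transformers import pipeline

# One set of pre-trained weights handles different downstream tasks
# depending only on the wording of the prompt.
generator = pipeline("text2text-generation", model="google/flan-t5-small")

print(generator("Translate to German: Good morning!")[0]["generated_text"])
print(
    generator(
        "Is this review positive or negative? Review: I loved this movie."
    )[0]["generated_text"]
)
```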

They support a wide range of tasks, including text generation, summarization, question answering, translation, and code generation. However, they also inherit limitations from their training data and modeling process, including factual inaccuracies, biases, a limited context window, and failures in reasoning or coherence.
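
One of these limitations, the context window, is straightforward to observe: each model accepts only a fixed number of tokens per input. Here's a minimal sketch, assuming `transformers` is installed and again using `gpt2` as the example checkpoint:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
print(tokenizer.model_max_length)  # 1024 tokens for GPT-2

# A long document can easily exceed the window and must be truncated
# or split into chunks before the model can process it.
document = "word " * 5_000
token_count = len(tokenizer(document)["input_ids"])
if token_count > tokenizer.model_max_length:
    print(f"{token_count} tokens won't fit; truncate or chunk the input.")
```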


By Martin Breuss • Updated Oct. 10, 2025 • Reviewed by Leodanis Pozo Ramos