LLM observability
LLM observability is the practice of collecting and correlating telemetry about large language model applications across prompts, intermediate steps, tool calls, retrieval operations, and outputs. This data is then used to debug behavior and monitor quality, reliability, safety, and cost.
An LLM observability stack typically combines logs, metrics, and end-to-end traces built from hierarchical spans that link each operation involved in handling a request. It captures behavioral details such as token counts, latencies, errors, costs, and evaluation signals. It also provides access controls and retention policies so that teams can gain deep visibility into LLM behavior while respecting security and governance requirements.
By Leodanis Pozo Ramos • Updated Nov. 18, 2025