activation function
An activation function is a nonlinear mapping applied to a neuron’s weighted sum of inputs. Without it, stacked layers would collapse into a single linear transformation, so the nonlinearity is what lets neural networks model complex relationships.
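To see why stacking linear transformations alone adds no expressive power, here is a minimal sketch with one-dimensional "layers". The weights and biases (`W1`, `B1`, `W2`, `B2`) and the function names are made up for illustration and don't come from any particular library.

```python
# Hypothetical one-dimensional layer parameters, chosen only for illustration.
W1, B1 = 2.0, 1.0
W2, B2 = -3.0, 0.5

def stacked_linear(x):
    """Two linear layers applied back to back, with no activation."""
    hidden = W1 * x + B1
    return W2 * hidden + B2

def collapsed_linear(x):
    """A single linear layer that computes exactly the same function."""
    return (W2 * W1) * x + (W2 * B1 + B2)

def stacked_with_relu(x):
    """The same two layers with a ReLU between them."""
    hidden = max(0.0, W1 * x + B1)
    return W2 * hidden + B2

def is_affine_on(f, a, b):
    """Check the midpoint property that every affine function satisfies."""
    return f(a) + f(b) == 2 * f((a + b) / 2)
```

In this sketch, `stacked_linear` and `collapsed_linear` agree on every input, while `stacked_with_relu` fails the midpoint check on the interval from -1 to 1, so no single linear layer can reproduce it.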
The choice of activation affects gradient propagation, output range, sparsity of neuron responses, and training stability.
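As a rough illustration of the effect on gradient propagation, the sketch below compares the derivative of sigmoid with the derivative of ReLU. The helper names are assumptions made for this example, and only Python's standard `math` module is used.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def sigmoid_grad(x):
    """Derivative of sigmoid: s * (1 - s). It peaks at 0.25 when x == 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

# Compare the two gradients at a few sample inputs.
for x in (-10.0, 0.0, 10.0):
    print(x, sigmoid_grad(x), relu_grad(x))
```

The sigmoid gradient never exceeds 0.25 and becomes very small for inputs far from zero, while the ReLU gradient is exactly 1 for positive inputs and exactly 0 for negative ones.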
Common hidden-layer functions include ReLU, which is cheap to compute and produces sparse activations, and its variants; the bounded sigmoid and tanh functions; and newer smooth functions such as GELU and SiLU.
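The functions named above can be sketched for scalar inputs with nothing but Python's `math` module. This is a minimal illustration rather than a library implementation: the function names are chosen here, and `gelu` uses the exact formulation based on the error function instead of the tanh approximation that some frameworks offer.

```python
import math

def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes the rest."""
    return max(0.0, x)

def sigmoid(x):
    """Logistic sigmoid with outputs in (0, 1), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def tanh(x):
    """Hyperbolic tangent with outputs in (-1, 1)."""
    return math.tanh(x)

def silu(x):
    """SiLU (also called swish): the input scaled by its own sigmoid."""
    return x * sigmoid(x)

def gelu(x):
    """GELU: the input scaled by the standard normal CDF evaluated at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))
```

For example, `relu(-2.0)` returns `0.0` and `sigmoid(0.0)` returns `0.5`. In real projects, you'd normally rely on the vectorized versions that ship with your deep learning framework.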
For multiclass output, softmax converts logits into a probability distribution. In practice, watch for dead ReLU units, which output zero for every input and stop learning, and for saturation in bounded activations such as sigmoid and tanh, where gradients shrink toward zero. Choose activations that suit the task and interact well with normalization or regularization schemes.
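Here is a minimal sketch of softmax over a plain list of logits. Subtracting the largest logit before exponentiating is a common numerical-stability trick that leaves the result unchanged. The function name and the sample logits are assumptions made for this example.

```python
import math

def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    # Shift by the maximum so that math.exp() never sees a large positive value.
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a three-class problem.
probabilities = softmax([2.0, 1.0, 0.1])
print(probabilities)
```

The largest logit receives the highest probability, and the ordering of the logits is preserved in the output.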
Related Resources
Course
Building a Neural Network & Making Predictions With Python AI
In this step-by-step course, you'll build a neural network from scratch as an introduction to the world of artificial intelligence (AI) in Python. You'll learn how to train your neural network and make accurate predictions based on a given dataset.
By Leodanis Pozo Ramos • Updated June 30, 2026