Um, What Is a Kolmogorov-Arnold Network?
A Kolmogorov-Arnold Network (KAN) is an alternative to Multi-Layer Perceptrons (MLPs) inspired by the Kolmogorov-Arnold representation theorem. Instead of having learnable weights on edges and fixed activation functions on nodes, KANs have learnable univariate functions on edges and simple summation on nodes. Each edge carries a spline-parameterized function that can learn complex transformations, making KANs potentially more interpretable and parameter-efficient than traditional neural networks. For more details, see the original KAN paper.
How Do KANs Differ from Regular Neural Networks?
Traditional neural networks use linear combinations of inputs (weighted sums) followed by non-linear activation functions. KANs flip this: they apply a learnable univariate function on each edge, then simply sum the results at each node. This design is motivated by the Kolmogorov-Arnold representation theorem, which states that any multivariate continuous function can be written as a finite composition of continuous univariate functions and addition. The univariate functions in KANs are typically implemented as B-splines with learnable control points.
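To make the "functions on edges, sums on nodes" idea concrete, here is a minimal sketch of a KAN layer's forward pass. It is illustrative only and uses piecewise-linear edge functions (degree-1 splines) on a uniform grid rather than the playground's actual implementation; the function and parameter names are made up for this example.

```python
def edge_fn(x, ctrl, lo=-1.0, hi=1.0):
    # Piecewise-linear learnable function defined by control points
    # placed on a uniform grid over [lo, hi] (a degree-1 spline).
    n = len(ctrl)
    t = (min(max(x, lo), hi) - lo) / (hi - lo) * (n - 1)
    i = min(int(t), n - 2)
    frac = t - i
    return (1 - frac) * ctrl[i] + frac * ctrl[i + 1]

def kan_layer(xs, edge_ctrl):
    # edge_ctrl[j][i] holds the control points for the function on the
    # edge from input i to output node j. Each node just sums its edges.
    return [sum(edge_fn(x, edge_ctrl[j][i]) for i, x in enumerate(xs))
            for j in range(len(edge_ctrl))]
```

With control points [-1, 0, 1] on the grid [-1, 1], each edge function starts out as the identity, so `kan_layer([0.5, -0.5], ...)` with one output node returns `[0.0]`. Training would adjust the control points, reshaping each edge's function directly.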
This Is Cool, Can I Repurpose It?
Please do! We've adapted the original TensorFlow Playground to work with Kolmogorov-Arnold Networks. You're free to use it in any way that follows our Apache License. And if you have any suggestions for additions or changes, please let us know.
We've also provided some controls below to enable you to tailor the playground to a specific topic or lesson. Just choose which features you'd like to be visible below, then save, or refresh the page.
What Do All the Colors Mean?
Orange and blue are used throughout the visualization in slightly different ways, but in general orange shows negative values while blue shows positive values.
The data points (represented by small circles) are initially colored orange or blue, corresponding to negative one and positive one, respectively.
In the KAN, the lines are colored by the average output values of the learnable functions on each edge. Blue shows positive function outputs, while orange shows negative outputs. The line thickness represents the magnitude of the learned function's output.
In the output layer, the dots are colored orange or blue depending on their original values. The background color shows what the network is predicting for a particular area. The intensity of the color shows how confident that prediction is.
What Are the Key Parameters?
Learning Rate: Controls how fast the network learns during training. Higher values make the network learn faster but may cause instability or overshooting. Lower values provide more stable training but slower convergence. KANs may require different learning rates than traditional MLPs due to their unique spline-based architecture.
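The stability trade-off above can be seen on a toy problem. This sketch (not from the playground) minimizes f(c) = c² by gradient descent on a single control point: a small learning rate converges smoothly, while a learning rate above 1.0 overshoots and diverges on this function.

```python
def descend(lr, steps=20, c=1.0):
    # Gradient descent on f(c) = c^2, whose gradient is 2c.
    # Each step multiplies c by (1 - 2*lr): |1 - 2*lr| < 1 converges,
    # |1 - 2*lr| > 1 diverges.
    for _ in range(steps):
        c -= lr * 2.0 * c
    return c
```

After 20 steps, `descend(0.1)` has shrunk c toward zero, while `descend(1.1)` has blown up well past its starting value.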
Spline Degree: Sets the polynomial degree of the B-spline basis functions. Linear (degree 1) creates piecewise linear functions, while higher degrees create smoother curves. Cubic splines (degree 3) are commonly used as they provide a good balance between smoothness and computational efficiency. Higher degrees can capture more complex patterns but may be more prone to overfitting.
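For readers who want to see how degree enters the spline evaluation, here is a sketch of de Boor's algorithm, the standard way to evaluate a B-spline of degree `p` from its knot vector `t` and control points `c`. This is a generic textbook routine, not the playground's code.

```python
def de_boor(k, x, t, c, p):
    # Evaluate a degree-p B-spline at x, where k is the knot interval
    # index satisfying t[k] <= x < t[k+1]. Repeated linear interpolation
    # between control points yields the curve value; p = 1 reduces to
    # plain piecewise-linear interpolation, p = 3 gives cubic smoothness.
    d = [c[j + k - p] for j in range(p + 1)]
    for r in range(1, p + 1):
        for j in range(p, r - 1, -1):
            alpha = (x - t[j + k - p]) / (t[j + 1 + k - r] - t[j + k - p])
            d[j] = (1 - alpha) * d[j - 1] + alpha * d[j]
    return d[p]
```

With degree 1, knots [0, 0, 1, 2, 2], and control points [0, 1, 2], evaluating at x = 0.5 linearly interpolates between the first two control points and returns 0.5.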
Control Points: Sets how many control points each B-spline function on an edge uses. More control points let the spline capture finer detail but increase compute and memory usage; fewer control points enforce smoother, simpler functions, which can regularize but may underfit complex patterns.
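The cost of extra control points is easy to quantify with a back-of-the-envelope parameter count. This sketch assumes the simplest bookkeeping (one spline per edge, each with `n_ctrl` control points and no shared or residual terms), so the exact numbers in any real implementation may differ.

```python
def kan_layer_params(n_in, n_out, n_ctrl):
    # One learnable univariate function per edge; each function is
    # parameterized by n_ctrl spline control points.
    return n_in * n_out * n_ctrl

def mlp_layer_params(n_in, n_out):
    # For comparison: one weight per edge plus one bias per output node.
    return n_in * n_out + n_out
```

A 2-input, 4-output KAN layer with 8 control points per edge has 64 parameters, versus 12 for the corresponding MLP layer, which is why the control-point count is the main knob trading capacity against compute and overfitting risk.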
Initialization: Determines how spline control points are initialized at network creation. "LeCun" and "Glorot" use basis-aware initialization schemes that account for B-spline properties and help preserve variance during forward and backward passes. "Identity" creates identity or negative identity functions, starting the network close to linear transformations. Numeric values (0.1-2.0) initialize control points with random noise of varying magnitudes: smaller values (0.1-0.3) provide gentle perturbations, while larger values (1.0-2.0) create more varied initial functions that can help break symmetry but may reduce training stability.
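Two of these modes are simple enough to sketch directly. The code below is illustrative only: it shows "Identity" initialization (control points equal to their grid positions, so the spline starts out approximating φ(x) = x) and numeric noise initialization; the basis-aware "LeCun" and "Glorot" schemes are omitted, and the function name is made up for this example.

```python
import random

def init_control_points(n_ctrl, scheme, lo=-1.0, hi=1.0):
    # Uniform grid of control-point positions over [lo, hi].
    grid = [lo + (hi - lo) * i / (n_ctrl - 1) for i in range(n_ctrl)]
    if scheme == "identity":
        # Control points equal to their grid positions: the spline
        # starts close to the identity function.
        return grid
    # Numeric scheme, e.g. "0.3" for gentle perturbations or "2.0"
    # for more varied (but less stable) initial functions.
    sigma = float(scheme)
    return [random.gauss(0.0, sigma) for _ in grid]
```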
Problem Type: Determines the network's output configuration and loss function. Classification problems use categorical outputs with cross-entropy loss for predicting discrete classes (e.g., orange vs. blue points). Regression problems use continuous outputs with mean squared error loss for predicting continuous values (e.g., temperature, price).
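The two loss functions mentioned above are standard and can be written in a few lines. This is a generic sketch, not the playground's implementation (binary cross-entropy is shown; the multi-class case generalizes it).

```python
import math

def mse_loss(targets, preds):
    # Regression: mean squared error over continuous outputs.
    return sum((t - p) ** 2 for t, p in zip(targets, preds)) / len(targets)

def cross_entropy_loss(labels, probs, eps=1e-12):
    # Binary classification: labels in {0, 1}, probs = predicted
    # probability of the positive class; eps guards against log(0).
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for y, p in zip(labels, probs)) / len(labels)
```

A perfect regression prediction gives an MSE of zero, and confident correct classifications drive the cross-entropy toward zero, while confident wrong ones make it large.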
Credits
KANLab is adapted from the original TensorFlow Playground Repository created by Daniel Smilkov and Shan Carter. The KAN implementation is based on the paper "KAN: Kolmogorov-Arnold Networks" and uses B-spline interpolation for the learnable univariate functions. Many thanks to the original authors and the KAN research community for their foundational work.