Kernel Function
Compare two inputs as if they were mapped into a feature space, often without constructing that space explicitly.
Hook problem: mapping first can be expensive
A feature map says, “rewrite every input, then compare the rewritten vectors.” That is clear, but it can be wasteful when the rewritten vector is huge.
A kernel function asks for the comparison directly.
Explicit route: phi(A) · phi(B)
Kernel shortcut: K(A, B) = (A · B)^2, which gives the same number when phi is a quadratic feature map
The shortcut pays off when phi(x) has a huge or infinite number of coordinates.
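For 2D inputs, one standard feature map that matches the squared inner product is phi(x) = (x1^2, sqrt(2) x1 x2, x2^2). The sketch below checks that the explicit route and the shortcut agree. The type `Vec2`, the function names, and the coordinates chosen for A and B are illustrative assumptions, not part of the lab.

```typescript
// Illustrative 2D vector type (an assumption for this sketch).
type Vec2 = [number, number];

// Plain inner product for vectors of equal length.
function dot(u: number[], v: number[]): number {
  return u.reduce((sum, ui, i) => sum + ui * (v[i] ?? 0), 0);
}

// Explicit quadratic feature map: (x1^2, sqrt(2)*x1*x2, x2^2).
function phi(x: Vec2): number[] {
  const [x1, x2] = x;
  return [x1 * x1, Math.SQRT2 * x1 * x2, x2 * x2];
}

// Kernel shortcut: square the inner product in the original space.
function quadraticKernel(x: Vec2, z: Vec2): number {
  return dot(x, z) ** 2;
}

// Assumed coordinates for the two inputs.
const A: Vec2 = [1, 1];
const B: Vec2 = [1, 2];

const explicitRoute = dot(phi(A), phi(B)); // builds 3 features per input first
const shortcutRoute = quadraticKernel(A, B); // never builds phi
console.log(explicitRoute, shortcutRoute); // equal up to floating-point rounding
```

In two dimensions the saving is negligible; the point is only that both routes land on the same number.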
First naive idea: explicitly build every feature
The direct route is:
- compute phi(x);
- compute phi(z);
- take their inner product.
This is fine for a small quadratic map. It is painful when the feature space has thousands, millions, or infinitely many coordinates.
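To get a feel for how fast the explicit route grows, the sketch below counts the coordinates of one particular feature map: all distinct monomials of degree d in n input variables, which number C(n + d - 1, d). Choosing this homogeneous polynomial map, and the name `monomialCount`, are assumptions made for illustration.

```typescript
// Number of distinct degree-d monomials in n variables: C(n + d - 1, d).
function monomialCount(n: number, d: number): number {
  let count = 1;
  for (let i = 1; i <= d; i++) {
    // After step i, count equals C(n - 1 + i, i), so it stays an integer.
    count = (count * (n - 1 + i)) / i;
  }
  return count;
}

console.log(monomialCount(2, 2)); // 3 features for a 2D quadratic map
console.log(monomialCount(1000, 2)); // 500500
console.log(monomialCount(1000, 3)); // 167167000
```

The kernel for the same map only needs one inner product over the n original coordinates, followed by a power.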
Core invention: compare through a shortcut
A kernel function has the form:
K(x, z) = phi(x) · phi(z)
The inputs x and z stay in the original space. The function returns the inner product they would have after the feature map phi.
Interactive similarity lab
RBF kernel: K(x, z) = exp(-gamma ||x - z||^2), with gamma = 0.5. It turns nearness into similarity; far points fade toward zero.
Compare every point with the chosen anchor. Notice that each kernel encodes a different notion of "close".
Values for anchor A under the RBF kernel with gamma = 0.5:
| Point | Dot | Distance^2 | K(A, z) |
|---|---|---|---|
| A | 2 | 0 | 1 |
| B | 3 | 1 | 0.607 |
| C | 1 | 5 | 0.082 |
| D | -2 | 10 | 0.007 |
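The table above can be recomputed with a few lines of code. The page does not list the point coordinates, so the ones below are hypothetical, picked only because they reproduce the Dot and Distance^2 columns; the helper names are also assumptions.

```typescript
type Point = { a: number; b: number };

const dotProduct = (x: Point, z: Point): number => x.a * z.a + x.b * z.b;
const squaredDistance = (x: Point, z: Point): number =>
  (x.a - z.a) ** 2 + (x.b - z.b) ** 2;
const rbf = (x: Point, z: Point, gamma: number): number =>
  Math.exp(-gamma * squaredDistance(x, z));

// Assumed coordinates, chosen only to match the Dot and Distance^2 columns.
const anchor: Point = { a: 1, b: 1 }; // point A
const samples: Array<[string, Point]> = [
  ["A", { a: 1, b: 1 }],
  ["B", { a: 1, b: 2 }],
  ["C", { a: 2, b: -1 }],
  ["D", { a: -2, b: 0 }],
];

// One row per point: label, dot product, squared distance, RBF value at gamma = 0.5.
const rows = samples.map(([label, z]) => ({
  label,
  dot: dotProduct(anchor, z),
  dist2: squaredDistance(anchor, z),
  k: rbf(anchor, z, 0.5).toFixed(3),
}));
console.table(rows);
```

Note that B has the largest dot product with A but not the largest RBF value: the RBF kernel ranks by distance, so A itself scores highest.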
Valid-kernel boundary
Not every similarity score is a valid kernel for kernel methods. A valid kernel must behave like an inner product in some feature space, which means its Gram matrix (the table of K values for every pair of points) must be symmetric and positive semidefinite for every finite sample.
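The positive-semidefinite condition can be probed numerically: build the Gram matrix G for a sample and evaluate the quadratic form c^T G c, which must never be negative for a valid kernel. The sketch below is a spot check, not a proof; one negative value refutes validity on that sample, while non-negative values prove nothing. The sample points, the probe vector, and the sigmoid offset of -1 are assumptions chosen for illustration.

```typescript
type Kernel = (x: number[], z: number[]) => number;

const inner = (x: number[], z: number[]): number =>
  x.reduce((sum, xi, i) => sum + xi * (z[i] ?? 0), 0);

// Gram matrix: G[i][j] = K(points[i], points[j]).
function gramMatrix(points: number[][], kernel: Kernel): number[][] {
  return points.map((p) => points.map((q) => kernel(p, q)));
}

// Quadratic form c^T G c; a PSD matrix never makes this negative.
function quadraticForm(G: number[][], c: number[]): number {
  return G.reduce((total, row, i) => total + (c[i] ?? 0) * inner(row, c), 0);
}

// RBF kernel with gamma = 0.5.
const rbfK: Kernel = (x, z) => {
  const diff = x.map((xi, i) => xi - (z[i] ?? 0));
  return Math.exp(-0.5 * inner(diff, diff));
};

// Sigmoid (tanh) kernel with an assumed negative offset of -1.
const sigmoidK: Kernel = (x, z) => Math.tanh(inner(x, z) - 1);

const pts = [[0, 0], [1, 1]];
const probe = [1, 0]; // puts all the weight on the first point

const rbfForm = quadraticForm(gramMatrix(pts, rbfK), probe);
const sigmoidForm = quadraticForm(gramMatrix(pts, sigmoidK), probe);
console.log(rbfForm); // 1
console.log(sigmoidForm); // tanh(-1), about -0.762: negative, so not PSD here
```

With this offset, the sigmoid score of the origin with itself is already negative, which no inner product of a vector with itself can be.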
For this node, keep the practical rule: named kernels are useful because they come with known geometry and known conditions. The proof machinery belongs in a later node.
Implementation sketch
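// Assumed shape: a 2D point with coordinates a and b, as used below.
type Point = { a: number; b: number };

// RBF kernel: exp(-gamma * ||x - z||^2). Larger gamma makes similarity fade faster.
function rbfKernel(x: Point, z: Point, gamma: number): number {
  const squaredDistance = (x.a - z.a) ** 2 + (x.b - z.b) ** 2;
  return Math.exp(-gamma * squaredDistance);
}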
Common confusions
- A kernel is a function of two inputs, not a new coordinate vector.
- A kernel value measures similarity in a chosen geometry, not universal semantic similarity.
- Some kernels have parameters; changing them changes the geometry.
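As a small illustration of the last point, the sketch below scores one fixed pair of points (squared distance 5) under three RBF settings. The gamma values and the name `rbfAt` are assumptions picked for the example.

```typescript
// RBF similarity as a function of squared distance and gamma.
const rbfAt = (squaredDistance: number, gamma: number): number =>
  Math.exp(-gamma * squaredDistance);

// Same pair of points (squared distance 5), three assumed gamma values.
for (const gamma of [0.1, 0.5, 2]) {
  console.log(gamma, rbfAt(5, gamma).toFixed(3));
}
// 0.1 -> 0.607, 0.5 -> 0.082, 2 -> 0.000
```

The points never moved; only the kernel's idea of "near" changed.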
Exercises
- What does K(x, z) return?
- Why might we avoid constructing phi(x) explicitly?
- Why is the sigmoid kernel treated more cautiously than RBF?