Draft

Polynomial Kernel

Add powers and feature interactions through a dot-product shortcut instead of manually expanding every term.

concept, intermediate, machine-learning, kernels, similarity

Hook problem: interactions matter

The raw coordinates x1 and x2 might each be weak predictors on their own, while their product x1 x2 is useful. Manually adding every square, cube, and interaction quickly becomes tedious.

The polynomial kernel packages those interactions into one formula.
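A minimal sketch of that situation, using four made-up points whose label is the sign of x1 * x2. The points and the `agreement` helper are illustrative assumptions, not data from this page. Each raw coordinate agrees with the label as often as it disagrees, while the product matches it every time.

```typescript
// Hypothetical XOR-style data: label is +1 when x1 and x2 share a sign.
type Labeled = { x1: number; x2: number; label: number };

const xorPoints: Labeled[] = [
  { x1: 1, x2: 1, label: 1 },
  { x1: -1, x2: -1, label: 1 },
  { x1: 1, x2: -1, label: -1 },
  { x1: -1, x2: 1, label: -1 },
];

// Sum of feature * label over the data set.
// 0 means the feature carries no linear signal about the label;
// a large magnitude means the feature lines up with the label.
function agreement(feature: (p: Labeled) => number): number {
  return xorPoints.reduce((sum, p) => sum + feature(p) * p.label, 0);
}

const fromX1 = agreement((p) => p.x1);             // 0
const fromX2 = agreement((p) => p.x2);             // 0
const fromProduct = agreement((p) => p.x1 * p.x2); // 4
```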

Interactions create curved decisions. A degree-2 polynomial kernel can behave like a dot product over square and interaction features.

Polynomial kernel from anchor A

Point   Dot   Distance^2   K(A, z)
A         2            0   4
B         3            1   6.25
C         1            5   2.25
D        -2           10   0

First naive idea: expand the feature vector by hand

For degree 2 in two dimensions, an explicit feature map can include x1^2, x1 x2, and x2^2. With higher degrees and more input dimensions, the number of such terms grows combinatorially.
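To put a number on that growth: a full polynomial map of degree up to d over n inputs has C(n + d, d) monomials, counting the constant and linear terms as well. The helper below is an illustrative sketch, and the name `countPolynomialTerms` is an assumption rather than anything defined on this page.

```typescript
// Number of monomials of total degree <= d in n variables: C(n + d, d).
// After step k the running value equals C(n + k, k), so it stays an integer.
function countPolynomialTerms(n: number, d: number): number {
  let count = 1;
  for (let k = 1; k <= d; k++) {
    count = (count * (n + k)) / k;
  }
  return count;
}

const twoInputsDegree2 = countPolynomialTerms(2, 2);       // 6
const hundredInputsDegree3 = countPolynomialTerms(100, 3); // 176851
```

For two inputs and degree 2, the six terms are 1, x1, x2, x1^2, x1 x2, and x2^2.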

The kernel shortcut keeps the effect of those terms without forcing the page or program to list all of them.
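One way to check this claim is to write out a degree-2 feature map by hand and compare its ordinary dot product with the kernel value. The sketch below assumes two-dimensional inputs, gamma >= 0, and c >= 0; the function names and the two sample points are hypothetical.

```typescript
type Vec2 = [number, number];

// Kernel shortcut: one dot product, one power.
function polyKernel2(x: Vec2, z: Vec2, gamma = 0.5, c = 1): number {
  const dot = x[0] * z[0] + x[1] * z[1];
  return (gamma * dot + c) ** 2;
}

// Explicit degree-2 feature map whose plain dot product reproduces
// polyKernel2. The square roots require gamma >= 0 and c >= 0.
function expandDegree2(x: Vec2, gamma = 0.5, c = 1): number[] {
  const [x1, x2] = x;
  const s = Math.sqrt(2 * gamma * c);
  return [
    gamma * x1 * x1,              // square term
    Math.SQRT2 * gamma * x1 * x2, // interaction term
    gamma * x2 * x2,              // square term
    s * x1,                       // linear terms
    s * x2,
    c,                            // constant term
  ];
}

// Plain dot product; entries missing from b count as 0.
function dotProduct(a: number[], b: number[]): number {
  return a.reduce((sum, ai, i) => sum + ai * (b[i] ?? 0), 0);
}

// Hypothetical sample points.
const x: Vec2 = [1, 1];
const z: Vec2 = [2, 1];

const viaKernel = polyKernel2(x, z);
const viaExpansion = dotProduct(expandDegree2(x), expandDegree2(z));
// Both are 6.25, up to floating-point rounding.
```

The expanded vectors already have six entries for two inputs; the kernel reaches the same number without building them.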

Formal version

K(x, z) = (\gamma x^T z + c)^d

Here d is the degree, gamma scales the dot product, and c shifts it before the power is applied. Degree 1 behaves like a scaled linear kernel with a constant offset c. Higher degrees add higher-order interactions.
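As a worked instance of the formula, take the default parameters of this page's implementation sketch (gamma = 0.5, c = 1) and a dot product of 3. The dot product is just an example value.

```latex
\begin{align*}
d = 1:&\quad K(x, z) = \gamma\, x^T z + c = 0.5 \cdot 3 + 1 = 2.5 \\
d = 2:&\quad K(x, z) = (\gamma\, x^T z + c)^2 = 2.5^2 = 6.25
\end{align*}
```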

Interactive comparison

Kernel similarity lab

(gamma x · z + c)^d: adds interaction terms without writing every expanded feature. The polynomial parameters are fixed; the values here correspond to gamma = 0.5, c = 1, and d = 2.

Compare every point with the chosen anchor. Notice how each kernel means a different kind of close.

  • A -> A: similarity 4 (dot 2, distance^2 0)
  • A -> B: similarity 6.25 (dot 3, distance^2 1)
  • A -> C: similarity 2.25 (dot 1, distance^2 5)
  • A -> D: similarity 0 (dot -2, distance^2 10)

Implementation sketch

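// Polynomial kernel value from a precomputed dot product x · z.
// The defaults (gamma = 0.5, c = 1, degree = 2) reproduce the anchor-A values on this page.
function polynomialKernel(dot: number, gamma = 0.5, c = 1, degree = 2): number {
  return (gamma * dot + c) ** degree;
}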
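The sketch above takes a precomputed dot product. A variant that starts from the vectors themselves might look like the following. The coordinates are hypothetical: the page does not list them, so they were chosen only to match the dot products and squared distances reported for anchor A. The names `dot2` and `polynomialKernelFromVectors` are likewise assumptions.

```typescript
type Vec2 = [number, number];

function dot2(x: Vec2, z: Vec2): number {
  return x[0] * z[0] + x[1] * z[1];
}

function polynomialKernelFromVectors(
  x: Vec2,
  z: Vec2,
  gamma = 0.5,
  c = 1,
  degree = 2,
): number {
  return (gamma * dot2(x, z) + c) ** degree;
}

// Hypothetical coordinates consistent with the anchor-A rows:
// dot products 2, 3, 1, -2 and squared distances 0, 1, 5, 10.
const labPoints: { name: string; coords: Vec2 }[] = [
  { name: "A", coords: [1, 1] },
  { name: "B", coords: [2, 1] },
  { name: "C", coords: [2, -1] },
  { name: "D", coords: [0, -2] },
];

const anchor: Vec2 = [1, 1];

// One row of the kernel (Gram) matrix: similarity of every point to A.
const similarities = labPoints.map((p) => ({
  name: p.name,
  k: polynomialKernelFromVectors(anchor, p.coords),
}));
// A: 4, B: 6.25, C: 2.25, D: 0
```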

Common confusions

  • Polynomial kernels are not automatically better than linear kernels.
  • Higher degree can create flexible boundaries, but it can also overfit or produce very large values.
  • The parameters are part of the geometry; changing them changes the similarity scale.
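The last two points can be seen numerically. The sketch below repeats the page's formula under a different name (`polyFromDot`, an assumed name) so that it runs on its own, and uses a dot product of 3 as an example value.

```typescript
// Same formula as the implementation sketch, with explicit parameters.
function polyFromDot(dot: number, gamma: number, c: number, degree: number): number {
  return (gamma * dot + c) ** degree;
}

const exampleDot = 3; // example value; matches row B for anchor A

// Raising the degree inflates the value quickly once gamma * dot + c > 1.
const byDegree = [1, 2, 5, 10].map((d) => polyFromDot(exampleDot, 0.5, 1, d));
// 2.5, 6.25, 97.65625, 9536.7431640625

// Changing gamma or c rescales the similarity even at a fixed degree.
const gammaHalf = polyFromDot(exampleDot, 0.5, 1, 2); // 6.25
const gammaOne = polyFromDot(exampleDot, 1, 1, 2);    // 16
const noShift = polyFromDot(exampleDot, 0.5, 0, 2);   // 2.25
```

Because the same pair of points can score 2.25, 6.25, or 16 depending on gamma and c, kernel values are only comparable under one fixed parameter setting.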

Connections

The polynomial kernel generalizes the linear kernel by adding interaction features. It contrasts with the RBF kernel, which measures closeness by distance rather than powers of dot products.
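A small comparison makes the contrast concrete. It reuses the dot products and squared distances listed for anchor A on this page. The RBF form exp(-gamma * distance^2) is the standard definition, but the decay rate of 0.5 is an assumed value for illustration, and the helper names are hypothetical.

```typescript
// Rows for anchor A: dot product and squared distance to each point.
const rows = [
  { point: "A", dot: 2, dist2: 0 },
  { point: "B", dot: 3, dist2: 1 },
  { point: "C", dot: 1, dist2: 5 },
  { point: "D", dot: -2, dist2: 10 },
];

const linearK = (dot: number): number => dot;
const polynomialK = (dot: number, gamma = 0.5, c = 1, d = 2): number =>
  (gamma * dot + c) ** d;
// Assumed decay rate of 0.5, chosen only for illustration.
const rbfK = (dist2: number, gamma = 0.5): number => Math.exp(-gamma * dist2);

const compared = rows.map((r) => ({
  point: r.point,
  linear: linearK(r.dot),
  polynomial: polynomialK(r.dot),
  rbf: rbfK(r.dist2),
}));

// Which point each kernel scores as most similar to A.
function mostSimilar(key: "linear" | "polynomial" | "rbf"): string {
  return compared.reduce((best, r) => (r[key] > best[key] ? r : best)).point;
}

const topLinear = mostSimilar("linear");         // "B"
const topPolynomial = mostSimilar("polynomial"); // "B"
const topRbf = mostSimilar("rbf");               // "A"
```

The dot-product kernels rank B above A itself because B has the larger dot product with A, while the distance-based RBF kernel always gives the anchor its maximum value of 1.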

Exercises

  1. What does the degree d control?
  2. Why is manually expanding polynomial features painful?
  3. What happens when d=1?
