Fowlkes-Mallows Index | ReConcept Lab

Hook problem: focus on pairs placed together

Rand Index includes TN: pairs that are different in the reference labels and different in the clustering.

Sometimes you want a metric that focuses on the positive clustering decision: “these two items belong together.”

Fowlkes-Mallows Index uses only TP, FP, and FN.

Pair-positive balanceFMI ignores true negatives and balances pair precision with pair recall.

Pair precision0.75

TP / (TP + FP)

Pair recall0.429

TP / (TP + FN)

FMI0.567

TP / sqrt((TP + FP)(TP + FN))

TN20

not used by FMI

First naive idea: use pair precision only

Pair precision asks:

\frac{TP}{TP+FP}

That measures how trustworthy same-cluster pairs are. But it ignores same-label pairs that the clustering split apart.

Core invention: balance pair precision and pair recall

Pair recall asks:

\frac{TP}{TP+FN}

FMI combines pair precision and pair recall with a geometric mean:

FMI=\sqrt{\frac{TP}{TP+FP}\cdot\frac{TP}{TP+FN}}

The equivalent count form is:

FMI=\frac{TP}{\sqrt{(TP+FP)(TP+FN)}}

For the fixture:

FMI=\frac{3}{\sqrt{(3+1)(3+4)}}=\frac{3}{\sqrt{28}}\approx 0.567

Interactive preset lab

Clustering metric preset lab

Explanation: One mixed cluster and two split true classes make Purity look high while pair metrics reveal damage.

Purity0.875

7/8

Rand Index0.821

23/28

Adjusted Rand Index0.444

S=3, E=1

Fowlkes-Mallows Index0.567

pair P=0.75, pair R=0.429

TP3

FP1

FN4

TN20

Why `TN` is absent

TN means two items were apart in both the reference labels and the clustering. That is real agreement for RI, but it does not tell us whether predicted same-cluster groups are useful.

FMI therefore focuses on the co-clustered pair decision.

Rand pair countsRI counts both same-same and different-different agreements.

TP3

together in both

FP1

clustered together by mistake

FN4

split apart by mistake

TN20

apart in both

RI23/28 = 0.821

Edge cases

If TP + FP = 0, there are no predicted same-cluster pairs, so pair precision is unavailable.

If TP + FN = 0, there are no reference same-label pairs, so pair recall is unavailable.

In either case, render FMI as not available rather than NaN.

Implementation sketch

function fowlkesMallows(tp: number, fp: number, fn: number) {
  const denominator = Math.sqrt((tp + fp) * (tp + fn));
  return denominator === 0 ? null : tp / denominator;
}

Complexity

Once pair counts are known, FMI is O(1). Building pair counts directly is O(n^2), or it can be derived from a contingency table.

Common confusions

FMI is not classifier F1, though both balance two positive-side quantities.
FMI does not adjust for chance the way ARI does.
A clustering that merges everything can have high pair recall but weak pair precision.

Preset contrastThe same four metrics react differently to over-splitting and merging.

Fixture

Purity: 0.875

RI: 0.821

ARI: 0.444

FMI: 0.567

Perfect match

Purity: 1

RI: 1

ARI: 1

FMI: 1

Singleton over-split

Purity: 1

RI: 0.75

ARI: 0

FMI: not available

All merged

Purity: 0.375

RI: 0.25

ARI: 0

FMI: 0.5

Exercises

Why does FMI ignore TN?
What happens to FMI in the singleton over-split preset?
Which mistake hurts pair precision: FP or FN?

Graph connections : Fowlkes-Mallows Index

Hook problem: focus on pairs placed together

First naive idea: use pair precision only

Core invention: balance pair precision and pair recall

Interactive preset lab

Why TN is absent

Edge cases

Implementation sketch

Complexity

Common confusions

Exercises

Why `TN` is absent