Draft
Rand Index
Evaluate clustering by asking whether every item pair is together or apart in both partitions.
Hook problem: Purity missed a split
The fixture has high Purity: 7/8 = 0.875. But the true graph cards are split across clusters A and B, and the true tree cards are split across clusters B and C.
Rand Index changes the question from “what is each cluster’s majority label?” to “what happened to every pair of items?”
First naive idea: keep the cluster-majority score
Purity gives cluster B one point because it contains one graph card and one tree card, so either label is a majority tie.
That hides the pair mistake: Cy and Di were placed together even though their reference labels differ.
Core invention: score pair agreement
For each unordered pair, ask:
- Are the two items in the same reference class?
- Are the two items in the same predicted cluster?
This creates pair counts:
TP: same label and same cluster.FP: different labels but same cluster.FN: same label but different clusters.TN: different labels and different clusters.
together in both
clustered together by mistake
split apart by mistake
apart in both
Formal version
For n items, the number of unordered pairs is:
Rand Index is the fraction of pair decisions that agree:
For the fixture:
Interactive preset lab
Clustering metric preset lab
Explanation: One mixed cluster and two split true classes make Purity look high while pair metrics reveal damage.
7/8
23/28
S=3, E=1
pair P=0.75, pair R=0.429
Static no-JS fallback:
| TP | 3 |
|---|---|
| FP | 1 |
| FN | 4 |
| TN | 20 |
| TOTAL | 28 |
Correctness intuition
Each pair has exactly one status: TP, FP, FN, or TN. The numerator TP + TN counts exactly the pairs where both partitions make the same together/apart decision.
Limitation: true negatives can dominate
When many classes exist, most pairs may be different-label pairs. A clustering can get many TN agreements without doing a good job recovering the same-label groups.
That is why the next node adjusts for chance agreement.
Purity: 0.875
RI: 0.821
ARI: 0.444
FMI: 0.567
Purity: 1
RI: 1
ARI: 1
FMI: 1
Purity: 1
RI: 0.75
ARI: 0
FMI: not available
Purity: 0.375
RI: 0.25
ARI: 0
FMI: 0.5
Implementation sketch
function randIndex(items: { label: string; cluster: string }[]) {
let tp = 0, fp = 0, fn = 0, tn = 0;
for (let i = 0; i < items.length; i += 1) {
for (let j = i + 1; j < items.length; j += 1) {
const sameLabel = items[i].label === items[j].label;
const sameCluster = items[i].cluster === items[j].cluster;
if (sameLabel && sameCluster) tp += 1;
else if (!sameLabel && sameCluster) fp += 1;
else if (sameLabel && !sameCluster) fn += 1;
else tn += 1;
}
}
const total = tp + fp + fn + tn;
return total === 0 ? null : (tp + tn) / total;
}
Complexity
Direct pair enumeration is O(n^2) time and O(1) extra counting space. If you already have a contingency table, the same counts can be computed from cell combinations without listing every pair.
Common confusions
- Rand Index uses pair
TP/FP/FN/TN; these are not individual classifier examples. TNmeans “apart in both partitions,” not “the item is negative.”- A high RI may still need chance adjustment.
Exercises
- Which pair is the fixture’s only
FP? - Why are there four
FNpairs? - Why does RI include
TNwhile FMI does not?
Graph connections : Rand Index