Graph connections

Draft

Multidimensional Scaling

Build a low-dimensional map whose distances imitate an original pairwise-distance table.

algorithm intermediate machine-learningdimensionality-reductiondistances

Hook problem: a table is not a map

Imagine you are comparing five campus study spots: Library, Lab, Cafe, Dorm, and Gym. You do not have useful original coordinates. You only have a pairwise dissimilarity table: each number says how different two places feel for studying.

A spreadsheet can answer “how far is Library from Gym?” but it cannot give the quick visual judgment a map gives. Multidimensional Scaling, or MDS, asks for low-dimensional coordinates whose visible distances imitate that table.

A distance table wants to become a mapThe input is a symmetric dissimilarity table D, not original feature columns.
Campus study-spot dissimilarities D. Symmetric entries repeat the same promise; the diagonal is zero and ignored.
From / toLibraryLabCafeDormGym
Library01.423.23.6
Lab1.401.62.72.4
Cafe21.601.52.7
Dorm3.22.71.502.2
Gym3.62.42.72.20
layout xlayout yMDS must invent coordinates Y

Conventions: entries are nonnegative; D is symmetric; only one unordered pair with i < j contributes to stress.

Read the table as fixed input DD. Its entries are nonnegative. The diagonal is zero or ignored because an item is distance zero from itself. The table is symmetric, so Library -> Lab and Lab -> Library are the same promise. When MDS computes stress, each unordered pair is counted once, using the convention i<ji < j.

The word “dissimilarity” is intentionally broad: the numbers may come from ratings, edit distances, survey responses, graph distances, or true Euclidean measurements. MDS can still try to draw a map, but some tables cannot fit perfectly in two dimensions.

First naive idea: place the nearest pairs

The closest pair is easy. Put Library and Lab about 1.4 units apart. Then add Cafe near both of them. The pain starts when Dorm and Gym arrive: each new point must satisfy several distances at once.

Naive placement traceA deterministic trace using the same Library, Lab, Cafe, Dorm, and Gym fixture as the table.
layout xlayout y1.41.2LibraryLabCafeDormGym
Step 1/4: Read the fixed distance table

Every unordered pair has one target distance. The diagonal is ignored, and the same table will score every candidate map.

Total stress12.39

Every candidate layout is judged against the same fixed table D.

Selected pair residuals
PairTargetMapResidualSquared contribution
Library - Lab1.41.40
nearly matched
0
Library - Cafe21.22-0.78
too close / compressed
0.61

A drawing made by eye has no single way to decide which promise should win. Pull Gym away from Library, and it may become too close to Dorm. Nudge Dorm, and several other distances change.

Pair promises can fightThe naive layout satisfies one promise but compresses or stretches others.
layout xlayout y1.42.21LibraryLabCafeDormGym
Library - Labtarget: 1.4naive map: 1.4

nearly matched (0)

Library - Gymtarget: 3.6naive map: 2.15

too close / compressed (-1.45)

Dorm - Gymtarget: 2.2naive map: 1

too close / compressed (-1.2)

Notice the axis labels: layout x and layout y. These are coordinates invented for the map. They are not original measured features like “noise level” or “walking time” unless the input table was built that way and the solution happens to align with them.

Core invention: make mismatch measurable

MDS repairs the by-eye problem by giving every candidate map the same score. Place each item ii at a low-dimensional point yiy_i. Then compare the map’s distance for every pair with the original target distance.

Stress accumulates pair mismatchResidual convention: map distance - target distance.

residual = map distance - target distance. Positive means too far / overstretched; negative means too close / compressed.

Naive layout residuals
PairTargetMapResidualSquared
Library - Gym3.62.15-1.45 too close / compressed2.09
Lab - Dorm2.71.41-1.29 too close / compressed1.65
Cafe - Dorm1.50.64-0.86 too close / compressed0.74
Dorm - Gym2.21-1.2 too close / compressed1.44
Lower-stress layout residuals
PairTargetMapResidualSquared
Library - Gym3.63.73+0.13 too far / overstretched0.02
Lab - Dorm2.72.67-0.03 nearly matched0.0
Cafe - Dorm1.51.58+0.08 too far / overstretched0.01
Dorm - Gym2.21.77-0.43 too close / compressed0.19
Naive stress12.39
Improved stress0.32

The signed residual convention on this page is:

rij=δijdijr_{ij} = \delta_{ij} - d_{ij}

Positive residual means the map puts the pair too far apart, or overstretches it. Negative residual means the map puts the pair too close together, or compresses it.

MDS accumulates those mismatches as stress:

stress(Y)=i<j(δijdij)2\text{stress}(Y) = \sum_{i<j}(\delta_{ij} - d_{ij})^2

Squaring makes positive and negative residuals both count as error. It also makes large distortions hurt more than small ones.

Trace lab: from table to lower stress

The full trace uses the same five places, the same distance table, and the same residual convention. It does not simulate random optimization; it simply compares a rough layout with a deterministic lower-stress layout so the scoring idea stays visible.

MDS trace labA deterministic trace using the same Library, Lab, Cafe, Dorm, and Gym fixture as the table.
layout xlayout y1.41.2LibraryLabCafeDormGym
Step 1/5: Read the fixed distance table

Every unordered pair has one target distance. The diagonal is ignored, and the same table will score every candidate map.

Total stress12.39

Every candidate layout is judged against the same fixed table D.

Selected pair residuals
PairTargetMapResidualSquared contribution
Library - Lab1.41.40
nearly matched
0
Library - Cafe21.22-0.78
too close / compressed
0.61

The lower-stress layout is not declared “true.” It is only better under the chosen objective and fixed table DD. A different stress formula, output dimension, or dissimilarity table could prefer a different map.

Formal version

Let dijd_{ij} be the original target dissimilarity between items ii and jj. Let YY be the reconstructed coordinate table, and let yiy_i be the map point for item ii.

Symbols follow the reconstruction pipelineD names the fixed table; Y names the coordinates MDS is searching for.
  1. Dfixed pairwise distance table
  2. Ycandidate map coordinates
  3. delta_ijdistance shown by the map
  4. r_ijsigned residual
  5. stress(Y)sum of squared mismatches

The map distance for the same pair is:

δij=yiyj2\delta_{ij} = \lVert y_i - y_j \rVert_2

In words, δij\delta_{ij} is the distance the reconstructed map shows for pair i,ji, j.

The residual and stress are:

rij=δijdijr_{ij} = \delta_{ij} - d_{ij} stress(Y)=i<j(δijdij)2\text{stress}(Y) = \sum_{i<j}(\delta_{ij} - d_{ij})^2

MDS searches for coordinates where this total mismatch is small. Classical MDS has a closed-form route when the distances fit the Euclidean assumptions. Metric and nonmetric MDS optimize related stress ideas. This node focuses on the shared distance-preserving question, not those solver derivations.

Implementation state

An implementation carries only a few ideas, but most of them are pairwise:

StateWhat it meansWhy it matters
Distance matrix DDThe fixed input table of dijd_{ij} valuesEvery candidate map is judged against the same promises
Coordinates YYCurrent low-dimensional points yiy_iThese are the variables MDS changes
Map distances δij\delta_{ij}Distances measured inside the current mapThey are compared with the target table
Residuals rijr_{ij}Signed differences δijdij\delta_{ij} - d_{ij}They show which pairs are compressed or overstretched
Total stressSum of squared residualsIt gives one score for the whole layout
Stopping or inspectionIteration limit, small improvement, or human reviewLow stress still needs residual inspection
read or compute the pairwise distance matrix D;
choose output dimension k and initialize coordinates Y;
repeat: evaluate pair residuals and reduce stress;
inspect the largest remaining distortions;

Correctness intuition: useful maps still have scars

The invariant is simple: do not let the target move. The same fixed DD scores every candidate layout. If one layout has lower stress than another, it fits this objective better.

That statement is intentionally limited. It does not prove the displayed layout is globally optimal. It does not recover the original axes. It does not promise zero distortion when the table is noisy, non-Euclidean, or too rich for the chosen dimension.

Residual inspectorSelect a pair and inspect how the lower-stress map still distorts it.
layout xlayout yDorm - GymLibraryLabCafeDormGym
Target distance d_ij2.2
Map distance delta_ij1.77
Residual r_ij-0.43
Squared contribution0.19
Interpretation

too close / compressed

Invariant

Every candidate layout is scored against the same fixed D. Lower stress means better fit to this objective, not proof of globally optimal coordinates or true original axes.

This is why residuals are part of the lesson, not a debugging afterthought. A map can be helpful exactly because it summarizes many pairwise promises, but the remaining residuals say where the summary bends the truth.

Cost: every point talks to every other point

With nn items, a full distance table has about n(n1)/2n(n-1)/2 useful entries. Storing the table is O(n2)O(n^2). If the output has kk dimensions, a full stress evaluation computes every map distance, so it costs O(n2k)O(n^2 k).

The work is pairwiseA full stress scan touches every unordered pair.
Dstore/read all pair distancesO(n^2)
Ykeep n map points in k dimensionsO(nk)
delta, revaluate every unordered pairO(n^2 k)
Trepeat pairwise work across updatesT * O(n^2 k)
Classical MDS note

A closed-form classical route can require eigendecomposition, which is more expensive than one stress scan.

Fixture fit

Normalized stress of the improved map: 0.005

If an optimizer performs TT evaluations or updates, expect the pairwise part to appear repeatedly. Classical MDS can also require eigendecomposition, which is a different and often heavier cost than one stress scan.

Common confusions

ConfusionRepair
”The axes must mean original features.”MDS axes are layout coordinates. They are useful for distance reconstruction, not guaranteed feature interpretation.
”Low stress means zero stress.”Low stress means smaller accumulated mismatch. Some pair residuals may remain visible.
”MDS is just PCA.”PCA starts from coordinates and preserves high-variance directions. MDS starts from pairwise distances and preserves those distances.
”All MDS variants are the same solver.”Classical, metric, and nonmetric MDS differ in solver path or distance interpretation, but share the map-from-distances motivation.

Connections in the graph

PCA is the useful contrast: it has original coordinates and asks which directions keep variance. MDS can begin after coordinates are gone or unhelpful, as long as a meaningful pairwise table remains.

Isomap reuses the MDS layout step. Its repair is upstream: before drawing the map, it replaces raw straight-line distances with shortest-path distances on a neighbor graph.

Dimensionality reduction pathLinear projections, distance-preserving maps, supervised discriminants, and neighbor embeddings solve different pains.
PCA

Keep the directions where centered data varies most.

MDS

Place points so low-dimensional distances imitate the original distance table.

Isomap

Use neighbor-graph shortest paths before applying an MDS-style layout.

LDA

Use labels to find projections that separate class means while keeping classes tight.

QDA

Let each class keep its own covariance, creating quadratic boundaries rather than one shared projection.

SNE

Match neighbor probabilities between high and low dimensions.

t-SNE

Repair SNE's crowding problem with a heavy-tailed low-dimensional similarity.

UMAP

Build a fuzzy neighbor graph, then optimize a low-dimensional graph with similar membership strengths.

Exercises

  1. In the campus table, why is it enough to count Library - Lab once instead of also counting Lab - Library?
  2. Pick one negative residual in the inspector. What physical action on the map would make that pair less compressed, and which other pair might get worse?
  3. Why does a lower-stress MDS layout not prove that the two displayed axes are meaningful original features?

Graph connections : Multidimensional Scaling