Quadratic Discriminant Analysis
Let each class keep its own covariance shape, so a supervised classifier can draw curved decision boundaries.
Hook problem: one shared class shape is too stiff
LDA assumes classes share one covariance shape. That is a useful simplification, but it can fail when one class is stretched, another is round, and the boundary needs to bend.
Quadratic Discriminant Analysis, or QDA, drops the shared-shape assumption: each class gets its own covariance matrix.
[Diagram: LDA's shared covariance assumption versus QDA's separate covariance per class]
Important boundary: QDA is not a standard projection method
QDA is mainly a supervised classifier. It appears in this dimensionality-reduction cluster because it is the natural contrast after LDA: instead of one linear discriminant geometry, it models one covariance matrix per class and gets a quadratic decision boundary.
Core invention: separate covariance per class
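For each class c, QDA estimates its own mean mu_c and covariance matrix Sigma_c from the training points labeled c. Dropping the constant shared by every class, a class score has the form:
$$\delta_c(x) = \log \pi_c - \tfrac{1}{2} \log \lvert \Sigma_c \rvert - \tfrac{1}{2} (x - \mu_c)^\top \Sigma_c^{-1} (x - \mu_c)$$
Here pi_c is the prior probability of class c. The last two terms are the Gaussian log likelihood, which uses the class-specific covariance matrix Sigma_c. When those covariance matrices differ, the quadratic terms in x no longer cancel between classes, so equal-score curves are quadratic.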
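A minimal NumPy sketch of such a class score, assuming the usual Gaussian form with the class-independent constant dropped. The function name `qda_score` and the two example classes are illustrative assumptions, not part of the lesson.

```python
import numpy as np

def qda_score(x, mean, cov, prior):
    """Log prior plus Gaussian log likelihood, without the shared constant."""
    diff = x - mean
    # slogdet returns (sign, log|cov|) and avoids overflow in the determinant.
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    # Solve cov @ z = diff rather than forming the explicit inverse.
    mahalanobis_sq = diff @ np.linalg.solve(cov, diff)
    return np.log(prior) - 0.5 * logdet - 0.5 * mahalanobis_sq

# Hypothetical classes with the same mean: one round, one stretched along axis 0.
x = np.array([2.0, 0.0])
round_score = qda_score(x, np.zeros(2), np.eye(2), prior=0.5)
stretched_score = qda_score(x, np.zeros(2), np.diag([9.0, 1.0]), prior=0.5)
```

Both classes share a mean here, so only the covariance shapes separate them: the point lies along the stretched direction, and the stretched class receives the higher score even after paying its larger log-determinant penalty.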
Trace lab
[Interactive trace: QDA keeps a separate covariance shape for each class instead of forcing one shared oval, shown for classes whose shapes differ.]
Implementation sketch
for each class c:
    estimate prior_c, mean_c, and covariance_c
for a new point x:
    score each class with its Gaussian log likelihood plus log prior
    choose the class with the largest score
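The sketch above could be made runnable roughly as follows. This is one possible NumPy version, not a reference implementation: the names `fit_qda` and `predict_qda`, the use of class frequencies as priors, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_qda(X, y):
    """Estimate a prior, mean, and covariance for each class label in y."""
    params = {}
    for label in np.unique(y):
        Xc = X[y == label]
        params[label] = (
            len(Xc) / len(X),          # prior_c: class frequency (an assumption)
            Xc.mean(axis=0),           # mean_c
            np.cov(Xc, rowvar=False),  # covariance_c, dividing by n_c - 1
        )
    return params

def predict_qda(params, x):
    """Return the label with the largest score for a single point x."""
    best_label, best_score = None, -np.inf
    for label, (prior, mean, cov) in params.items():
        diff = x - mean
        _, logdet = np.linalg.slogdet(cov)
        mahalanobis_sq = diff @ np.linalg.solve(cov, diff)
        score = np.log(prior) - 0.5 * logdet - 0.5 * mahalanobis_sq
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Synthetic data: class 0 is round, class 1 is stretched along the first axis.
rng = np.random.default_rng(0)
X0 = rng.normal(size=(200, 2))
X1 = rng.normal(size=(200, 2)) * np.array([4.0, 1.0])
X = np.vstack([X0, X1])
y = np.repeat([0, 1], 200)

params = fit_qda(X, y)
near_origin = predict_qda(params, np.array([0.0, 0.0]))
far_along_axis = predict_qda(params, np.array([6.0, 0.0]))
```

The two classes share a center, so no straight line separates them. The point near the origin goes to the compact class and the point far along the stretched axis goes to the stretched class, which is only possible with a curved boundary.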
Correctness intuition and cost
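QDA is more flexible than LDA because each class can have its own oval shape. That flexibility costs parameters: with d features, each covariance matrix has d(d + 1)/2 free entries, and QDA estimates one such matrix per class where LDA estimates one in total. QDA therefore needs more data per class, and more careful regularization when there are many features.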
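To make the parameter cost concrete, here is a small counting sketch. It counts only covariance entries (a symmetric d x d matrix has d(d + 1)/2 free values) and ignores means and priors. The function name and the example sizes are illustrative assumptions.

```python
def covariance_parameter_count(n_features, n_classes, shared):
    """Count free covariance entries for a shared (LDA) or per-class (QDA) model."""
    per_matrix = n_features * (n_features + 1) // 2
    return per_matrix if shared else n_classes * per_matrix

lda_count = covariance_parameter_count(n_features=50, n_classes=3, shared=True)
qda_count = covariance_parameter_count(n_features=50, n_classes=3, shared=False)
print(lda_count, qda_count)  # 1275 3825
```

The count grows quadratically in the number of features and linearly in the number of classes, and each class's matrix must be estimated from that class's examples alone.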
Common confusions
- QDA does not produce a single low-dimensional embedding like PCA or t-SNE.
- QDA is supervised; labels are required.
- Curved boundaries can overfit when each class has too few examples.
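The last point has a sharp edge case: when a class has no more examples than features, its sample covariance is singular and the score cannot be computed at all. The sketch below shows this, along with one possible remedy, shrinking the estimate toward a scaled identity matrix. The helper `shrink_covariance` and the value `alpha=0.1` are assumptions chosen for illustration; other regularization schemes exist.

```python
import numpy as np

def shrink_covariance(cov, alpha):
    """Blend cov with a scaled identity that has the same average variance."""
    d = cov.shape[0]
    target = (np.trace(cov) / d) * np.eye(d)
    return (1.0 - alpha) * cov + alpha * target

rng = np.random.default_rng(0)
Xc = rng.normal(size=(3, 5))    # one class: 3 examples, 5 features
cov = np.cov(Xc, rowvar=False)  # rank is at most 3 - 1 = 2

# Smallest eigenvalue is numerically zero, so cov is not invertible.
min_eig_before = np.linalg.eigvalsh(cov).min()

shrunk = shrink_covariance(cov, alpha=0.1)
# After shrinkage every eigenvalue is strictly positive.
min_eig_after = np.linalg.eigvalsh(shrunk).min()
```

Shrinkage trades some of QDA's flexibility for stability: as `alpha` approaches 1, each class is pushed toward a round shape.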
Exercises
- What assumption does QDA relax from LDA?
- Why can QDA overfit more easily than LDA?
- Why should QDA not be described as a general-purpose visualization algorithm?