Precision | ReConcept Lab

Hook problem: trust a spam alarm or not

Your model scans 12 emails with fixed fixture subjects. You need to know whether an alarm (predicted = spam) can be trusted.

Spam alarm funnel: 12 evaluated emails become 5 spam alarms before precision applies.

All 12 evaluated emails

Each email has an actual label and a predicted label.

e1

Prize claim now

Obvious prize bait caught by the filter.

actual: spam, predicted: spam, TP: True Positive

e2

Project notes

A normal work message left in the inbox.

actual: not-spam, predicted: not-spam

e3

Receipt attached

A real receipt incorrectly flagged as spam.

actual: not-spam, predicted: spam, FP: False Positive

e4

Account alert

A fake alert slipped into the inbox.

actual: spam, predicted: not-spam

e5

Limited offer

Promotional spam correctly blocked.

actual: spam, predicted: spam, TP: True Positive

e6

Team lunch

A casual team email correctly kept.

actual: not-spam, predicted: not-spam

e7

Password reset

A requested reset email reached the user.

actual: not-spam, predicted: not-spam

e8

Urgent transfer

A scam message was missed by the filter.

actual: spam, predicted: not-spam

e9

Flight update

A useful travel update became a false alarm.

actual: not-spam, predicted: spam, FP: False Positive

e10

Crypto bonus

Suspicious bonus spam correctly caught.

actual: spam, predicted: spam, TP: True Positive

e11

Invoice approved

A business invoice correctly accepted.

actual: not-spam, predicted: not-spam

e12

Verify wallet

A phishing-style wallet email was missed.

actual: spam, predicted: not-spam

Predicted-spam alarms only (5)

These are the five messages that form the precision denominator: e1, e3, e5, e9, e10.

e1

Prize claim now

Obvious prize bait caught by the filter.

actual: spam, predicted: spam, TP: True Positive

e3

Receipt attached

A real receipt incorrectly flagged as spam.

actual: not-spam, predicted: spam, FP: False Positive

e5

Limited offer

Promotional spam correctly blocked.

actual: spam, predicted: spam, TP: True Positive

e9

Flight update

A useful travel update became a false alarm.

actual: not-spam, predicted: spam, FP: False Positive

e10

Crypto bonus

Suspicious bonus spam correctly caught.

actual: spam, predicted: spam, TP: True Positive

Accuracy says:

\frac{TP + TN}{TP + FP + TN + FN} = \frac{3 + 4}{12} = \frac{7}{12}

That gives 58.3%, but it mixes two question types:

Did the model get the whole mail stream mostly correct?
Did the model’s spam alarms stay correct?
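To make the arithmetic concrete, here is a minimal TypeScript sketch of the accuracy fraction. The `ConfusionCounts` shape and the `accuracy` helper are illustrative names, not part of the lab's code; the counts are read off the 12-email fixture (TN = 4 comes from e2, e6, e7, e11).

```typescript
// Hypothetical shape for the four confusion-matrix cells.
interface ConfusionCounts {
  tp: number;
  fp: number;
  tn: number;
  fn: number;
}

// Counts read off the 12-email fixture.
const fixtureCounts: ConfusionCounts = { tp: 3, fp: 2, tn: 4, fn: 3 };

// Accuracy divides correct predictions by every prediction, alarm or not.
function accuracy(c: ConfusionCounts): number | null {
  const total = c.tp + c.fp + c.tn + c.fn;
  return total === 0 ? null : (c.tp + c.tn) / total;
}

const fixtureAccuracy = accuracy(fixtureCounts); // 7 / 12
```

All four cells appear in the denominator, which is why this single number cannot say how trustworthy the alarms alone are.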

First naive idea: keep using all examples

If we stay with the global fraction above, those five predicted alarms sit in the denominator alongside all seven non-alarms.

Accuracy vs precision contrast: accuracy uses all predictions; precision uses only predicted-positive examples.

Accuracy

Question: How many of all emails were correct?

(7 / 12 = 58.3%)

Precision

Question: How many spam alarms were correct?

(3 / 5 = 0.6)

Why this hurts

e3 and e4 are both wrong, but they do not affect a trust metric in the same way:

e3: actual not-spam, predicted spam (FP) → hurts trust directly.
e4: actual spam, predicted not-spam (FN) → hurts recall, not spam-alarm trust.

Core invention: read only predicted positives

For precision, set the denominator to the count of predicted positives:

\text{Precision} = \frac{TP}{TP + FP}

This gives the fraction of alarms that were right:

\frac{3}{3 + 2} = \frac{3}{5} = 0.6
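The same fraction can be computed straight from the labels. The sketch below assumes a hypothetical `Email` record and hard-codes the fixture's actual and predicted labels; none of these names come from the lab's source.

```typescript
type Label = "spam" | "not-spam";

// Hypothetical record: one evaluated email with both labels.
interface Email {
  id: string;
  actual: Label;
  predicted: Label;
}

// Labels transcribed from the 12 fixture cards.
const fixture: Email[] = [
  { id: "e1", actual: "spam", predicted: "spam" },
  { id: "e2", actual: "not-spam", predicted: "not-spam" },
  { id: "e3", actual: "not-spam", predicted: "spam" },
  { id: "e4", actual: "spam", predicted: "not-spam" },
  { id: "e5", actual: "spam", predicted: "spam" },
  { id: "e6", actual: "not-spam", predicted: "not-spam" },
  { id: "e7", actual: "not-spam", predicted: "not-spam" },
  { id: "e8", actual: "spam", predicted: "not-spam" },
  { id: "e9", actual: "not-spam", predicted: "spam" },
  { id: "e10", actual: "spam", predicted: "spam" },
  { id: "e11", actual: "not-spam", predicted: "not-spam" },
  { id: "e12", actual: "spam", predicted: "not-spam" },
];

// Step 1: keep only predicted positives (TP + FP).
const alarms = fixture.filter((e) => e.predicted === "spam");

// Step 2: among those alarms, keep the ones that were right (TP).
const trusted = alarms.filter((e) => e.actual === "spam");

// Step 3: divide, guarding the empty-alarm case.
const precision = alarms.length === 0 ? null : trusted.length / alarms.length; // 3 / 5
```

Filtering first keeps the denominator visibly tied to the alarms: the seven non-alarm emails are dropped before any counting happens.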

Interactive visual demo: predicted-positive trace

Each step below processes only e1, e3, e5, e9, e10 in fixture order:

Steps 1, 2, 3, 4, 5 → e1, e3, e5, e9, e10

Precision over predicted spam alarms

e1 → TP: Prize claim now. Actual spam, Predicted spam. Cell: TP (True Positive). Running precision: 1/1 = 1.0.

Subject: Prize claim now
Current step: 1/5
Trusted alarms: 1/1
Running precision: 1.0 (100.0%)
Final fixture value: 3/5 = 0.6

Precision trace over predicted-spam emails
Step	Email	Actual	Predicted	Cell	Trusted alarms	All alarms	Precision
1	e1	spam	spam	TP	1	1	1/1 = 1.0
2	e3	not-spam	spam	FP	1	2	1/2 = 0.5
3	e5	spam	spam	TP	2	3	2/3 = 0.667
4	e9	not-spam	spam	FP	2	4	2/4 = 0.5
5	e10	spam	spam	TP	3	5	3/5 = 0.6

The static ledger below is still useful when scripts are unavailable.

Precision trace over predicted alarms (no-JS fallback)
Step	Email	Subject	Actual	Predicted	Cell	Trusted	All	Running value
1	e1	Prize claim now	spam	spam	TP	1	1	1/1 = 1
2	e3	Receipt attached	not-spam	spam	FP	1	2	1/2 = 0.5
3	e5	Limited offer	spam	spam	TP	2	3	2/3 = 0.667
4	e9	Flight update	not-spam	spam	FP	2	4	2/4 = 0.5
5	e10	Crypto bonus	spam	spam	TP	3	5	3/5 = 0.6
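The trace above can be reproduced with a short loop. This is a sketch under assumptions: `Alarm`, `TraceRow`, and `runningPrecision` are hypothetical names, and the input is just the five predicted-spam emails in fixture order.

```typescript
// Hypothetical record for one predicted-spam email.
interface Alarm {
  id: string;
  actualSpam: boolean;
}

// One row of the running-precision ledger.
interface TraceRow {
  step: number;
  id: string;
  trusted: number; // TP seen so far
  all: number; // TP + FP seen so far
  value: number; // trusted / all
}

// The five alarms in fixture order.
const alarms: Alarm[] = [
  { id: "e1", actualSpam: true },
  { id: "e3", actualSpam: false },
  { id: "e5", actualSpam: true },
  { id: "e9", actualSpam: false },
  { id: "e10", actualSpam: true },
];

function runningPrecision(items: Alarm[]): TraceRow[] {
  let trusted = 0;
  return items.map((alarm, index) => {
    if (alarm.actualSpam) trusted += 1; // a TP raises the numerator
    const all = index + 1; // every alarm raises the denominator
    return { step: all, id: alarm.id, trusted, all, value: trusted / all };
  });
}

const trace = runningPrecision(alarms);
// trace.map((r) => r.value.toFixed(3)) gives ["1.000", "0.500", "0.667", "0.500", "0.600"]
```

Each FP row leaves the numerator unchanged while the denominator still grows, which is why the value drops at steps 2 and 4.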

Correctness intuition

Each processed alarm is either TP or FP, so the denominator TP + FP exactly counts all alarm predictions.

The non-alarm examples are outside the denominator:

TN + FN, outside the denominator: e2, e4, e6, e7, e8, e11, e12
FN (missed spam), also outside: e4, e8, e12

At the end of this fixture:

TP = 3
FP = 2
precision = 3 / 5 = 0.6
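One way to check this bookkeeping is to assign every email to its confusion-matrix cell and see which cells reach the denominator. The tuple layout and the `cellOf` helper below are assumptions made for illustration, with labels transcribed from the fixture cards.

```typescript
type Cell = "TP" | "FP" | "TN" | "FN";

// [id, actually spam, predicted spam], transcribed from the fixture cards.
const fixture: [string, boolean, boolean][] = [
  ["e1", true, true], ["e2", false, false], ["e3", false, true],
  ["e4", true, false], ["e5", true, true], ["e6", false, false],
  ["e7", false, false], ["e8", true, false], ["e9", false, true],
  ["e10", true, true], ["e11", false, false], ["e12", true, false],
];

// The predicted label picks the column; the actual label picks the row.
function cellOf(actualSpam: boolean, predictedSpam: boolean): Cell {
  if (predictedSpam) return actualSpam ? "TP" : "FP";
  return actualSpam ? "FN" : "TN";
}

const byCell: Record<Cell, string[]> = { TP: [], FP: [], TN: [], FN: [] };
for (const [id, actualSpam, predictedSpam] of fixture) {
  byCell[cellOf(actualSpam, predictedSpam)].push(id);
}

// Only the predicted-positive column feeds the denominator.
const denominator = byCell.TP.length + byCell.FP.length; // 5

// Everything predicted not-spam (TN and FN) stays outside.
const outside = fixture.filter((row) => !row[2]).map((row) => row[0]);
```

Here `byCell.FN` holds e4, e8, e12 and `outside` holds e2, e4, e6, e7, e8, e11, e12, matching the lists above.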

Implementation sketch

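// Only the two predicted-positive cells are needed for precision.
interface Counts {
  tp: number; // predicted spam, actually spam
  fp: number; // predicted spam, actually not-spam
}

// Returns the fraction parts alongside the value so callers can render "3/5 = 0.6".
// value is null when there are no predicted positives (zero denominator).
function precisionFromCounts(counts: Counts): { numerator: number; denominator: number; value: number | null } {
  const denominator = counts.tp + counts.fp;
  if (denominator === 0) {
    return { numerator: counts.tp, denominator, value: null };
  }
  return { numerator: counts.tp, denominator, value: counts.tp / denominator };
}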

Complexity

If confusion-matrix counts already exist:

O(1) to compute precision from counts.
Otherwise, O(n) time and O(1) extra space to scan the n fixture emails once and count tp/fp.
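A single-pass count might look like the sketch below. `Prediction` and `countAlarms` are hypothetical names, and the two label strings are a compact encoding of the fixture (S = spam, N = not-spam, in order e1 to e12) chosen only for this example.

```typescript
// Hypothetical record: one prediction reduced to two booleans.
interface Prediction {
  actualSpam: boolean;
  predictedSpam: boolean;
}

interface Counts {
  tp: number;
  fp: number;
}

// One pass, two counters: O(n) time, O(1) extra space.
function countAlarms(predictions: Prediction[]): Counts {
  let tp = 0;
  let fp = 0;
  for (const p of predictions) {
    if (!p.predictedSpam) continue; // TN and FN never touch the denominator
    if (p.actualSpam) {
      tp += 1;
    } else {
      fp += 1;
    }
  }
  return { tp, fp };
}

// Fixture labels for e1..e12, encoded as S (spam) or N (not-spam).
const actual = "SNNSSNNSNSNS".split("");
const predicted = "SNSNSNNNSSNN".split("");
const fixture: Prediction[] = actual.map((a, i) => ({
  actualSpam: a === "S",
  predictedSpam: predicted[i] === "S",
}));

const counts = countAlarms(fixture); // { tp: 3, fp: 2 }
```

Unlike a filter-based version, this loop allocates no intermediate arrays, which is where the O(1) extra space comes from.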

Common confusions

Precision is not accuracy. Accuracy uses all 12 emails. Precision uses only 5 alarm predictions.
Precision is not recall. Recall asks how much of the actual spam was caught, TP / (TP + FN), so it has a different denominator.
High precision does not mean no missed spam. Precision stays at 3/5 even though e4, e8, and e12 are missed.
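The different denominators are easy to see side by side. In this sketch the `Cells` shape and the helper names are assumptions; the recall value is derived from the fixture counts (TP = 3, FN = 3) rather than stated in the lesson, and `moreMisses` is an invented scenario, not fixture data.

```typescript
// Hypothetical shape: the three cells that precision and recall need.
interface Cells {
  tp: number;
  fp: number;
  fn: number;
}

// Shared guard for the zero-denominator case.
function ratio(numerator: number, denominator: number): number | null {
  return denominator === 0 ? null : numerator / denominator;
}

// Precision looks down the predicted-spam column.
const precisionOf = (c: Cells): number | null => ratio(c.tp, c.tp + c.fp);

// Recall looks across the actual-spam row.
const recallOf = (c: Cells): number | null => ratio(c.tp, c.tp + c.fn);

const fixtureCells: Cells = { tp: 3, fp: 2, fn: 3 };

// Invented scenario: same alarms, ten times as many missed spam emails.
const moreMisses: Cells = { tp: 3, fp: 2, fn: 30 };

const p1 = precisionOf(fixtureCells); // 3 / 5 = 0.6
const r1 = recallOf(fixtureCells); // 3 / 6 = 0.5
const p2 = precisionOf(moreMisses); // still 0.6
const r2 = recallOf(moreMisses); // 3 / 33
```

Changing only `fn` moves recall and leaves precision untouched, because FN never enters TP + FP.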


Boundary check

Example	Why precision still reads 3/5
e3	False alarm (FP): already counted in the denominator; hurts trust.
e4	Missed spam (FN): ignored by the precision denominator.
e8	Missed spam (FN): still outside the precision denominator.
e12	Missed spam (FN): still outside the precision denominator.

Missed spam outside denominator

e4, e8, e12

Zero-denominator display

not available

Zero-denominator convention

When there are no predicted = spam outputs:

tp = 0
fp = 0
denominator tp + fp = 0
precision = null (internally)

Rendered value should be:

English: not available
Chinese: 不可用
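A rendering helper for this convention could be sketched as follows. The `Locale` keys, the `renderPrecision` name, and the three-decimal rounding for non-null values are all assumptions; only the two "not available" strings come from the text above.

```typescript
// Hypothetical locale keys for the two rendered languages.
type Locale = "en" | "zh";

// Display strings for a null precision, taken from the convention above.
const NOT_AVAILABLE: Record<Locale, string> = {
  en: "not available",
  zh: "不可用",
};

function renderPrecision(value: number | null, locale: Locale): string {
  if (value === null) {
    // Zero predicted positives: show the placeholder instead of 0 or NaN.
    return NOT_AVAILABLE[locale];
  }
  // Assumption: round to at most three decimals and drop trailing zeros.
  return String(Number(value.toFixed(3)));
}

const emptyEn = renderPrecision(null, "en"); // "not available"
const emptyZh = renderPrecision(null, "zh"); // "不可用"
const fixtureText = renderPrecision(3 / 5, "en"); // "0.6"
```

Keeping `null` as the internal value, rather than 0, stops a model with no alarms from being displayed as if every alarm had been wrong.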

Predicted-positive column: this column has only TP and FP.

TP (3)

Predicted spam and actually spam.

e1

Prize claim now

Obvious prize bait caught by the filter.

actual: spam, predicted: spam, TP: True Positive

e5

Limited offer

Promotional spam correctly blocked.

actual: spam, predicted: spam, TP: True Positive

e10

Crypto bonus

Suspicious bonus spam correctly caught.

actual: spam, predicted: spam, TP: True Positive

FP (2)

Predicted spam and actually not-spam.

e3

Receipt attached

A real receipt incorrectly flagged as spam.

actual: not-spam, predicted: spam, FP: False Positive

e9

Flight update

A useful travel update became a false alarm.

actual: not-spam, predicted: spam, FP: False Positive

Ignored by precision denominator

TN + FN outside denominator (7)

e2, e4, e6, e7, e8, e11, e12

Graph connection

Graph strip: precision is implemented as a follow-up from confusion-matrix and stays scoped to binary alarms.

confusion-matrix (implemented) → precision (implemented)

Edges and edges only

From this node:

confusion-matrix -> precision (uses)

Exercises

Why does accuracy use all 12 emails but precision use only TP + FP?
Compute the first two running precision values by hand from e1 then e3.
In a dataset with no positive predictions, what should precision render, and why?
Which of e3, e4, and e8 changes the precision denominator?
