Draft
Recall
Measure how many real positives were found by the model.
Hook problem: did we catch enough real spam?
Your model evaluated 12 emails. You care about one question:
Of the real positives, how many were caught?
Prize claim now
Obvious prize bait caught by the filter.
actual: spam, predicted: spam (TP: True Positive)
Project notes
A normal work message left in the inbox.
actual: not-spam, predicted: not-spam (TN: True Negative)
Receipt attached
A real receipt incorrectly flagged as spam.
actual: not-spam, predicted: spam (FP: False Positive)
Account alert
A fake alert slipped into the inbox.
actual: spam, predicted: not-spam (FN: False Negative)
Limited offer
Promotional spam correctly blocked.
actual: spam, predicted: spam (TP: True Positive)
Team lunch
A casual team email correctly kept.
actual: not-spam, predicted: not-spam (TN: True Negative)
Password reset
A requested reset email reached the user.
actual: not-spam, predicted: not-spam (TN: True Negative)
Urgent transfer
A scam message was missed by the filter.
actual: spam, predicted: not-spam (FN: False Negative)
Flight update
A useful travel update became a false alarm.
actual: not-spam, predicted: spam (FP: False Positive)
Crypto bonus
Suspicious bonus spam correctly caught.
actual: spam, predicted: spam (TP: True Positive)
Invoice approved
A business invoice correctly accepted.
actual: not-spam, predicted: not-spam (TN: True Negative)
Verify wallet
A phishing-style wallet email was missed.
actual: spam, predicted: not-spam (FN: False Negative)
The emails are numbered e1 to e12 in the order shown above.
actual positives = TP + FN = 6
TP ids: e1, e5, e10
FN ids: e4, e8, e12
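The cell labels on the cards can be reproduced mechanically. The following is a minimal sketch, assuming a hypothetical `Email` shape and helper `cellOf` (neither name comes from the lesson); the data mirrors the 12 cards in order.

```ts
// Hypothetical types for illustration; the lesson does not define them.
type Label = "spam" | "not-spam";
type Cell = "TP" | "TN" | "FP" | "FN";

interface Email {
  id: string;
  actual: Label;
  predicted: Label;
}

// The 12 emails from the cards, in fixture order.
const emails: Email[] = [
  { id: "e1", actual: "spam", predicted: "spam" },
  { id: "e2", actual: "not-spam", predicted: "not-spam" },
  { id: "e3", actual: "not-spam", predicted: "spam" },
  { id: "e4", actual: "spam", predicted: "not-spam" },
  { id: "e5", actual: "spam", predicted: "spam" },
  { id: "e6", actual: "not-spam", predicted: "not-spam" },
  { id: "e7", actual: "not-spam", predicted: "not-spam" },
  { id: "e8", actual: "spam", predicted: "not-spam" },
  { id: "e9", actual: "not-spam", predicted: "spam" },
  { id: "e10", actual: "spam", predicted: "spam" },
  { id: "e11", actual: "not-spam", predicted: "not-spam" },
  { id: "e12", actual: "spam", predicted: "not-spam" },
];

// Spam is the positive label: actual spam splits into TP/FN,
// actual not-spam splits into FP/TN.
function cellOf(e: Email): Cell {
  if (e.actual === "spam") return e.predicted === "spam" ? "TP" : "FN";
  return e.predicted === "spam" ? "FP" : "TN";
}

// Group email ids by confusion-matrix cell.
const idsByCell: Record<Cell, string[]> = { TP: [], TN: [], FP: [], FN: [] };
for (const e of emails) idsByCell[cellOf(e)].push(e.id);

const actualPositives = idsByCell.TP.length + idsByCell.FN.length;
```

Grouping by cell first keeps the later recall arithmetic down to two list lengths.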
First naive idea: reuse accuracy
If you still use global correctness, you compute accuracy = (TP + TN) / (TP + TN + FP + FN).
This measures overall correctness. It says nothing about coverage of real spam.
Why it hurts: the accuracy denominator dilutes the FN examples
e4, e8, and e12 are all real spam but missed by the model (FN):
- e1: caught spam (TP)
- e4: missed spam (FN)
- e5: caught spam (TP)
- e8: missed spam (FN)
- e10: caught spam (TP)
- e12: missed spam (FN)
So for recall, the denominator is TP + FN = 6, not all 12.
All 12 predictions are in scope.
Numerator = TP + TN = 7
Accuracy = 7 / 12 = 58.3%
Only actual-positive examples are in scope.
Denominator = TP + FN = 6
Numerator = TP = 3
Recall = 3/6 = 0.5
Actual negatives (6): e3, e9, e2, e6, e7, e11
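The two scopes can be put side by side in code. This is a small sketch, assuming the counts are held in a plain object; the `fp` and `tn` values are read off the cards (2 false positives, 4 true negatives).

```ts
// Counts for the 12-email fixture.
const counts = { tp: 3, tn: 4, fp: 2, fn: 3 };

// Accuracy: every prediction is in scope.
const total = counts.tp + counts.tn + counts.fp + counts.fn;
const accuracy = (counts.tp + counts.tn) / total;

// Recall: only actual positives are in scope.
const actualPositiveCount = counts.tp + counts.fn;
const recall = counts.tp / actualPositiveCount;

console.log(`accuracy = ${(accuracy * 100).toFixed(1)}%`);
console.log(`recall = ${recall}`);
```

The true negatives lift accuracy to 7/12 even though half of the real spam was missed, which is exactly what the recall denominator exposes.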
Core invention: read the actual-positive row
The matrix orientation is fixed: rows are actual, columns are predicted, and spam is the positive label.
Recall reads only one row, the actual-positive row.
For this fixture:
| actual \ predicted | predicted spam | predicted not-spam |
|---|---|---|
| actual spam | TP (3): e1, e5, e10 | FN (3): e4, e8, e12 |
Recall reads the whole actual-positive row: TP + FN = 6.
- Numerator: TP from the actual-positive row = 3 (e1, e5, e10)
- Denominator: TP + FN from the actual-positive row = 6 (e1, e5, e10, e4, e8, e12)
Recall = 3/6 = 0.5
Meaning: 3 of the 6 real spam emails were caught.
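Reading recall off the matrix can be sketched as an index into a 2x2 array. The layout below is an assumption for illustration (row 0 is actual spam, column 0 is predicted spam); the second row uses the FP and TN counts from the cards, and recall never touches it.

```ts
// Rows are actual, columns are predicted; index 0 = spam, index 1 = not-spam.
type Row = [number, number];

const confusion: [Row, Row] = [
  [3, 3], // actual spam:     [TP, FN]
  [2, 4], // actual not-spam: [FP, TN]
];

// Recall reads only the actual-positive row.
const [tp, fn] = confusion[0];
const rowRecall = tp / (tp + fn);

console.log(`recall = ${tp}/${tp + fn} = ${rowRecall}`);
```

If the orientation were transposed (rows as predicted), the same indexing would compute precision instead, which is why the orientation has to be fixed up front.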
Interactive trace: actual-positive step-by-step
The interactive demo shows only the actual-spam emails (e1, e4, e5, e8, e10, e12) in fixture order and updates the running recall after each one.
Recall over actual spam
| Step | ID | Subject | Cell | Caught | Actual positives seen | Running recall | Actual | Predicted |
|---|---|---|---|---|---|---|---|---|
| 1 | e1 | Prize claim now | TP (True Positive) | 1 | 1 | 1/1 = 1.0 | spam | spam |
| 2 | e4 | Account alert | FN (False Negative) | 1 | 2 | 1/2 = 0.5 | spam | not-spam |
| 3 | e5 | Limited offer | TP (True Positive) | 2 | 3 | 2/3 = 0.667 | spam | spam |
| 4 | e8 | Urgent transfer | FN (False Negative) | 2 | 4 | 2/4 = 0.5 | spam | not-spam |
| 5 | e10 | Crypto bonus | TP (True Positive) | 3 | 5 | 3/5 = 0.6 | spam | spam |
| 6 | e12 | Verify wallet | FN (False Negative) | 3 | 6 | 3/6 = 0.5 | spam | not-spam |
If JavaScript is unavailable, use this ledger:
| Step | ID | Subject | Actual | Predicted | Cell | Caught positives | Actual positives seen | Running recall |
|---|---|---|---|---|---|---|---|---|
| 1 | e1 | Prize claim now | spam | spam | TP | 1 | 1 | 1/1 = 1 |
| 2 | e4 | Account alert | spam | not-spam | FN | 1 | 2 | 1/2 = 0.5 |
| 3 | e5 | Limited offer | spam | spam | TP | 2 | 3 | 2/3 = 0.667 |
| 4 | e8 | Urgent transfer | spam | not-spam | FN | 2 | 4 | 2/4 = 0.5 |
| 5 | e10 | Crypto bonus | spam | spam | TP | 3 | 5 | 3/5 = 0.6 |
| 6 | e12 | Verify wallet | spam | not-spam | FN | 3 | 6 | 3/6 = 0.5 |
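The ledger can be regenerated with a short loop. This is a sketch only: `TraceRow` and `traceRecall` are hypothetical names, and the input lists just the six actual-spam emails with the model's prediction for each.

```ts
type Label = "spam" | "not-spam";

interface SpamPrediction {
  id: string;
  predicted: Label;
}

interface TraceRow {
  step: number;
  id: string;
  cell: "TP" | "FN";
  caught: number;
  seen: number;
  runningRecall: number;
}

// Actual-spam emails only, in fixture order.
const actualSpam: SpamPrediction[] = [
  { id: "e1", predicted: "spam" },
  { id: "e4", predicted: "not-spam" },
  { id: "e5", predicted: "spam" },
  { id: "e8", predicted: "not-spam" },
  { id: "e10", predicted: "spam" },
  { id: "e12", predicted: "not-spam" },
];

function traceRecall(rows: SpamPrediction[]): TraceRow[] {
  const trace: TraceRow[] = [];
  let caught = 0;
  rows.forEach((row, index) => {
    // Every row here is actual spam, so the cell is TP or FN.
    const cell: "TP" | "FN" = row.predicted === "spam" ? "TP" : "FN";
    if (cell === "TP") caught += 1;
    const seen = index + 1;
    trace.push({ step: seen, id: row.id, cell, caught, seen, runningRecall: caught / seen });
  });
  return trace;
}

const trace = traceRecall(actualSpam);
for (const r of trace) {
  console.log(`${r.step} ${r.id} ${r.cell} ${r.caught}/${r.seen} = ${r.runningRecall.toFixed(3)}`);
}
```

The denominator grows on every row, caught or missed, while the numerator grows only on TP rows.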
Missed-spam pain (what recall protects)
Recall stays honest because it measures missed real positives directly.
Account alert
A fake alert slipped into the inbox.
actual: spam, predicted: not-spam (FN: False Negative)
Urgent transfer
A scam message was missed by the filter.
actual: spam, predicted: not-spam (FN: False Negative)
Verify wallet
A phishing-style wallet email was missed.
actual: spam, predicted: not-spam (FN: False Negative)
These are real spam emails that were not caught.
Missed count: 3
Each missed email still counts in the denominator.
Denominator formula: TP + FN = 3 + 3 = 6.
Zero-denominator convention
If there are no real positives in the evaluated set:
- `TP = 0`
- `FN = 0`
- `TP + FN = 0`
- internal value: `null`
- rendered value: not available
| Case | TP | FN | Denominator (TP + FN) | Internal value | Rendered |
|---|---|---|---|---|---|
| No actual positives | 0 | 0 | 0 | null | not available |
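One way to keep the internal value and the rendered value separate is a small formatting helper. This is a sketch under assumptions: `renderRecall` is a hypothetical name, and the one-decimal percent format is an illustrative choice.

```ts
// Turns the internal recall value into display text.
// null means "no actual positives", which renders as "not available".
function renderRecall(value: number | null): string {
  if (value === null) return "not available";
  return `${(value * 100).toFixed(1)}%`;
}

console.log(renderRecall(null));
console.log(renderRecall(0.5));
```

Keeping `null` internally, rather than `0` or `NaN`, avoids reporting an empty evaluation set as a model that caught nothing.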
Implementation sketch
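```ts
// Counts from the actual-positive row of the confusion matrix.
interface RecallCounts {
  tp: number; // true positives: actual spam, predicted spam
  fn: number; // false negatives: actual spam, predicted not-spam
}

function recallFromCounts(counts: RecallCounts) {
  const denominator = counts.tp + counts.fn;
  if (denominator === 0) {
    // No actual positives: recall is undefined, so return null rather than NaN.
    return { numerator: counts.tp, denominator, value: null };
  }
  return { numerator: counts.tp, denominator, value: counts.tp / denominator };
}
```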
| Step | Numerator (tp) | Denominator (tp + fn) | Branch | Result |
|---|---|---|---|---|
| 1 | tp | tp + fn | if denominator==0 | null |
| 2 | tp | tp + fn | else | tp / (tp + fn) |
Correctness intuition
Each actual-positive example is exactly one of:
- TP: actual spam and predicted spam (caught)
- FN: actual spam and predicted not-spam (missed)
So TP + FN counts exactly the real spam emails. It must equal 6 here.
| Invariant | Value |
|---|---|
| TP + FN from final counts | 6 |
| Final TP | 3 |
| Final FN | 3 |
| Final recall | 3 / (3 + 3) = 0.5 |
Complexity
- If counts already exist: `O(1)` to compute recall from `{tp, fn}`.
- If scanning examples directly: `O(n)` time and `O(1)` extra space.
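The direct scan can be sketched as a single pass with two counters. `recallFromScan` is a hypothetical helper, and parallel label arrays are an assumed input shape.

```ts
type Label = "spam" | "not-spam";

// One pass over the examples: O(n) time, two counters of extra space.
function recallFromScan(actual: Label[], predicted: Label[]): number | null {
  if (actual.length !== predicted.length) {
    throw new Error("actual and predicted must have the same length");
  }
  let tp = 0;
  let fn = 0;
  for (let i = 0; i < actual.length; i++) {
    if (actual[i] !== "spam") continue; // TN and FP never enter recall
    if (predicted[i] === "spam") tp += 1;
    else fn += 1;
  }
  const denominator = tp + fn;
  return denominator === 0 ? null : tp / denominator;
}

// The 12-email fixture, in order e1..e12.
const actualLabels: Label[] = [
  "spam", "not-spam", "not-spam", "spam", "spam", "not-spam",
  "not-spam", "spam", "not-spam", "spam", "not-spam", "spam",
];
const predictedLabels: Label[] = [
  "spam", "not-spam", "spam", "not-spam", "spam", "not-spam",
  "not-spam", "not-spam", "spam", "spam", "not-spam", "not-spam",
];

const scannedRecall = recallFromScan(actualLabels, predictedLabels);
```

No intermediate list of actual positives is built, so memory use stays constant regardless of how many emails are scanned.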
Accuracy counts true negatives too; recall ignores TN/FP.
3 caught is not enough; denominator must be all real spam.
Precision asks: of predicted positives, how many were correct?
Recall asks: of real positives, how many were caught?
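The precision/recall contrast comes down to which count joins TP in the denominator. Here is a brief sketch on the fixture counts; the lesson does not state a precision value, so that number is derived here from the two FP cards (e3, e9).

```ts
// Fixture counts: 3 caught spam, 2 false alarms, 3 missed spam.
const cells = { tp: 3, fp: 2, fn: 3 };

// Precision: of predicted positives (TP + FP), how many were correct?
const precision = cells.tp / (cells.tp + cells.fp);

// Recall: of real positives (TP + FN), how many were caught?
const recallValue = cells.tp / (cells.tp + cells.fn);

console.log(`precision = ${precision.toFixed(3)}, recall = ${recallValue.toFixed(3)}`);
```

Both metrics share the numerator TP; only the denominator changes, from the predicted-positive column to the actual-positive row.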
Graph connection
Recall is built from the confusion-matrix actual-positive row and can be contrasted with precision.
Exercises
- Why is `7 / 12` not a recall score?
- What is the recall denominator in this fixture and why?
- Compute the running recall after `e8`.
- Explain what should render when the denominator is `0`.
- Which examples are `FN`, and why do they affect recall?
| Final reference |
|---|
| TP + FN = 3 + 3 = 6, so the final recall is 3 / 6 = 0.5, which is 50%. |
| Actual-spam set |
|---|
| e1, e4, e5, e8, e10, e12 |