When working on a machine learning application with a highly imbalanced dataset, traditional metrics like accuracy can be misleading. For example, if you're detecting a rare disease, a model with 99% accuracy might seem impressive, but if only 0.5% of patients have the disease, a model that always predicts "no disease" would still achieve 99.5% accuracy without being useful.
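A minimal sketch of this pitfall, assuming a synthetic label array with roughly 0.5% positive cases and a classifier that always predicts the negative class (the sample size, seed, and prevalence are illustrative, not taken from the text):

    import numpy as np

    # Hypothetical setup: 10,000 patients, roughly 0.5% of whom have the disease.
    rng = np.random.default_rng(0)
    y_true = (rng.random(10_000) < 0.005).astype(int)  # 1 = disease, 0 = no disease

    # A "model" that always predicts "no disease".
    y_pred = np.zeros_like(y_true)

    accuracy = (y_pred == y_true).mean()
    print(f"Accuracy: {accuracy:.3%}")  # ~99.5%, yet not a single sick patient is caught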
To better evaluate such models, we use precision and recall:
Precision: The fraction of true positive predictions among all positive predictions. It tells us how many of the predicted positive cases were actually positive.
Formula: Precision = True Positives / (True Positives + False Positives)
Recall: The fraction of true positive cases detected among all actual positive cases. It measures the model's ability to identify all relevant cases.
Formula: Recall = True Positives / (True Positives + False Negatives)
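Both formulas translate directly into code. The counts below are illustrative placeholders, not values from the text:

    # Illustrative counts for a hypothetical classifier.
    tp, fp, fn = 40, 10, 15

    precision = tp / (tp + fp)  # how many predicted positives were truly positive
    recall = tp / (tp + fn)     # how many actual positives the model found

    print(f"Precision: {precision:.2f}")  # 0.80
    print(f"Recall:    {recall:.2f}")     # 0.73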
Using a confusion matrix helps visualize these metrics (see the code sketch after the list below):
True Positives (TP): Correctly predicted positive cases.
True Negatives (TN): Correctly predicted negative cases.
False Positives (FP): Incorrectly predicted positive cases.
False Negatives (FN): Incorrectly predicted negative cases.
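As a rough sketch of how these four counts can be tallied from label and prediction arrays, and how precision and recall then fall out of them (the arrays here are made up for illustration):

    import numpy as np

    # Hypothetical ground-truth labels and model predictions (1 = positive).
    y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
    y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

    tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # predicted positive, actually positive
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))  # predicted negative, actually negative
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # predicted positive, actually negative
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # predicted negative, actually positive

    print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")      # TP=3, TN=3, FP=1, FN=1
    print(f"Precision: {tp / (tp + fp):.2f}")         # 0.75
    print(f"Recall:    {tp / (tp + fn):.2f}")         # 0.75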
For a model to be useful, both precision and recall should be reasonably high. This ensures the model not only makes accurate positive predictions but also identifies a significant portion of actual positive cases.