“In the depth of winter, I finally learned that there was in me an invincible summer.”
Albert Camus, French author of The Plague

In our past newsletters, we covered important topics in data science such as imbalanced datasets and parameters versus hyperparameters. As the COVID-19 pandemic rages around the world, it is utterly impossible to focus on anything other than the vaccines that promise to bring salvation from this public health scourge of the century. This intense yearning for an effective vaccine leads one to think about evaluation metrics in data science.

Precision, also known as positive predictive value, is the number of true positives divided by the sum of true and false positives. Recall, also termed sensitivity, is the number of true positives divided by the sum of true positives and false negatives (that sum being the actual positives). Precision-recall (PR) area under the curve (AUC) and the F1 score are considered good metrics for classification problems; the F1 score is the harmonic mean of precision and recall and conveniently ranges from 0 to 1, with values near 1 indicating high performance. These metrics are not influenced by large numbers of true negatives, which can unfairly flatter metrics such as accuracy.
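As a small illustration, these definitions can be computed directly from raw confusion-matrix counts (the counts below are made up for the sketch, not drawn from any real diagnostic test):

```python
# Precision, recall, and F1 from confusion-matrix counts.
# tp = true positives, fp = false positives, fn = false negatives.
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)  # harmonic mean of precision and recall

tp, fp, fn = 80, 20, 40
print(precision(tp, fp))     # 80 / 100 = 0.8
print(recall(tp, fn))        # 80 / 120 ≈ 0.667
print(f1_score(tp, fp, fn))  # harmonic mean of the two, ≈ 0.727
```

Note that the number of true negatives appears nowhere in these formulas, which is exactly why a flood of healthy (negative) cases cannot inflate them.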

Accuracy is defined as the number of true positives plus the number of true negatives, divided by the total population. While accuracy is easier to comprehend than the other metrics, a large number of true negatives in the population (as with a low-incidence disease such as cancer) can yield an accuracy that is misleadingly high (close to 1). Similarly, a large number of true negatives can produce a very low error rate, since the error rate is the sum of false positives and false negatives divided by the total population (which includes the many true negatives). Finally, the receiver operating characteristic area under the curve (AUROC), with its true positive rate (y-axis) and false positive rate (x-axis), is also not a good metric when the data is heavily imbalanced, because the false positive rate is driven down simply by the large number of true negatives. There is also, at times, some confusion between accuracy and AUROC.
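A brief sketch, again with hypothetical counts, of how a growing pool of true negatives inflates accuracy even while the classifier's recall stays mediocre:

```python
# Accuracy = (tp + tn) / total population.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

# A mediocre classifier: it finds only 8 of 20 actual positives,
# so recall = 8 / 20 = 0.4 regardless of how many negatives exist.
tp, fp, fn = 8, 2, 12

for tn in (20, 1000, 100000):  # add ever more true negatives
    print(tn, round(accuracy(tp, tn, fp, fn), 4))
# accuracy climbs from ~0.67 toward 1.0 as tn grows,
# yet recall never moves from 0.4
```

This is the imbalanced-data trap in miniature: the same model looks nearly perfect by accuracy once the negatives dominate, which is why PR AUC and F1 are preferred in that setting.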

As the number of people infected with SARS-CoV-2 increases exponentially, there is no longer an imbalanced data problem (as there was earlier in the pandemic) that would render some of these metrics less effective in assessing the accuracy of a test.

These and other topics are often discussed (and debated) at the many AIMed events. We at AIMed continue to present many AIMed Talks for our colleagues during this pandemic. The popular AIMed Clinician Series will return early next year, and we are planning to virtually host our annual meeting AIMed20 in December.

These are exceedingly difficult months of the pandemic, but there is light ahead with the historically expeditious vaccines. We look forward to seeing you virtually, and please stay healthy and safe!


Anthony Chang, MD, MBA, MPH, MS
Founder, AIMed
Chief Intelligence and Innovation Officer
Medical Director, The Sharon Disney Lund
Medical Intelligence and Innovation Institute (mi3)
Children’s Hospital of Orange County