Abstract Winner – Medical Imaging & Biomedical Diagnostics


In recent years, there has been growing hype surrounding the role of deep learning artificial intelligence (AI) and the future of radiologists. To date, however, no clinically relevant tool has replicated human accuracy in the detection of even a simple radiographic finding. This limitation in part reflects the current state of medical deep learning research, which often relies on relatively small datasets and generic, pre-trained neural network architectures.

In this study, we aim to present the first such example: a fully automated deep learning tool with unprecedented accuracy in the detection of intraparenchymal hemorrhage (IPH), a time-sensitive medical emergency, on non-contrast CT (NCCT) exams. We do so by customizing a convolutional neural network (CNN) for the task at hand, and by training and validating the algorithm on the largest cohort of NCCT examinations to date.


After IRB approval, a retrospective query was performed for all emergency department NCCT head examinations obtained at a single large academic institution between January 2015 and December 2016. A total of 268 IPH and 5,633 non-IPH exams were identified. Each IPH exam was manually segmented to delineate hemorrhage margins. All final diagnoses and segmentation masks were visually inspected by a board-certified radiologist.

A custom hybrid 3D/2D CNN was designed for detection and quantification of IPH. Accuracy for IPH detection was assessed on a patient-by-patient basis. Precision for quantification of IPH volume was assessed using Dice score and Pearson correlation, in addition to comparison with estimates derived using the simplified ABC/2 formula (the most commonly used proxy for estimating IPH volume from single dimensional measurements).
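As an illustration of the quantification measures named above, the following is a minimal sketch, not the study's implementation. It assumes binary segmentation masks stored as NumPy arrays and the standard ABC/2 definition (A: greatest hemorrhage diameter, B: diameter perpendicular to A, C: craniocaudal extent, all in cm). The function names and the toy masks are hypothetical.

```python
import numpy as np

def dice_score(pred_mask, true_mask):
    """Dice overlap between two binary masks: 2 * |A and B| / (|A| + |B|)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    denom = pred.sum() + true.sum()
    if denom == 0:
        return 1.0  # both masks empty; treating this as agreement is a convention
    return float(2.0 * np.logical_and(pred, true).sum() / denom)

def mask_volume_ml(mask, voxel_mm):
    """Volume of a binary mask in mL, given voxel spacing in mm per axis."""
    n_voxels = np.asarray(mask, dtype=bool).sum()
    return float(n_voxels * np.prod(voxel_mm) / 1000.0)  # 1 mL = 1000 mm^3

def abc2_volume_ml(a_cm, b_cm, c_cm):
    """Simplified ellipsoid estimate of hemorrhage volume (cm^3 == mL)."""
    return a_cm * b_cm * c_cm / 2.0

# Toy example (hypothetical data): the prediction misses one slab of the lesion.
true_mask = np.zeros((10, 10, 10), dtype=bool)
true_mask[2:6, 2:6, 2:6] = True   # 64 voxels
pred_mask = np.zeros((10, 10, 10), dtype=bool)
pred_mask[2:6, 2:6, 3:6] = True   # 48 voxels, all inside the true lesion

dice = dice_score(pred_mask, true_mask)
true_ml = mask_volume_ml(true_mask, voxel_mm=(1.0, 1.0, 1.0))
abc2_ml = abc2_volume_ml(4.0, 3.0, 2.0)

# Pearson correlation between paired per-exam volume estimates (toy values).
manual_volumes = np.array([1.0, 2.0, 3.0])
predicted_volumes = np.array([2.0, 4.0, 6.0])
pearson_r = float(np.corrcoef(manual_volumes, predicted_volumes)[0, 1])
```

Dice measures voxel-wise overlap for a single exam, whereas Pearson correlation is computed across exams on the paired volume estimates, so the two measures capture different aspects of agreement.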


Upon five-fold cross-validation, the trained algorithm demonstrated high accuracy for IPH detection, with AUC, accuracy, sensitivity, specificity, PPV, and NPV of 0.996, 0.996, 0.993, 0.996, 0.920, and 0.999, respectively. Out of 5,901 total cases, only 2 small microhemorrhages were missed (2/268 = 0.74%) with just 23 false positives (23/5633 = 0.41%).
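The detection figures can be checked against the reported counts. The sketch below assumes the counts are pooled across all cross-validation folds into a single confusion matrix (268 IPH exams with 2 missed; 5,633 non-IPH exams with 23 false positives). The function name is hypothetical, and AUC is omitted because it cannot be derived from counts alone.

```python
def detection_metrics(tp, fn, fp, tn):
    """Standard per-exam detection metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),   # fraction of IPH exams detected
        "specificity": tn / (tn + fp),   # fraction of non-IPH exams cleared
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# Counts taken from the text; pooling across folds is an assumption.
n_iph, n_non_iph = 268, 5633
missed, false_pos = 2, 23

metrics = detection_metrics(
    tp=n_iph - missed,
    fn=missed,
    fp=false_pos,
    tn=n_non_iph - false_pos,
)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under this assumption, each computed value agrees with the reported figure to within 0.001.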

CNN-based IPH quantification demonstrated a high Dice score of 0.931 and a Pearson correlation of 0.999, slightly underestimating true volume by 2.1% overall. By comparison, the simplified ABC/2 formula demonstrated a Pearson correlation of 0.954, overestimating true volume by 20.2% overall.


A fully automated, customized deep learning AI tool is highly accurate in the detection of IPH on NCCT head exams, with only 2 missed small microhemorrhages and just 23 false positives in a large dataset of 5,901 patients. The deep learning algorithm can also quantify hemorrhage volume with high correlation to manual human annotations (r = 0.999), outperforming currently available feasible alternatives. Clinically, this tool can be easily implemented to alert radiologists and triage positive cases of IPH, reducing turn-around time and facilitating improved patient outcomes.



Author: Peter Chang

Coauthor(s): Daniel S. Chow, MD

Status: Completed Work