The use of machine learning in clinical applications has grown rapidly in recent years. However, successful application and implementation of new clinical risk models has been limited. While much of the slow adoption on the clinical front lines stems from barriers to implementation, there is some evidence that the models themselves may be flawed because they are trained on incomplete information. In part, this may be because these algorithms are designed to replace rather than supplement the clinician, and so overlook the clinical judgments made during a patient's disease course. Two recent op-ed pieces have raised particular concern about the “black box” nature of machine learning algorithms. The JAMA article by Cabitza et al. in particular points to “contextual errors” that arise when training data sets do not capture the standard of care in place at the time. As a result, patients who were initially high-risk because of a known clinical risk factor, and who therefore received more aggressive treatment, were labeled “low-risk” by machine learning algorithms on the basis of their improved outcomes alone. Naively deploying such models in the clinical setting could be dangerous, because high-risk patients who were aggressively treated in the training set would be treated as low-risk by any clinician relying on the flawed model. This project aims to collect contextual information on the underlying standard of care to improve model training, both through structured data in the electronic health record (EHR) and through semantic analysis of free-text clinical notes.
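The contextual error described above can be illustrated with a small synthetic example. The sketch below is not drawn from any study: the cohort, its counts, and the function names are invented assumptions, chosen only to show how a risk factor that triggers effective treatment can look protective when treatment is ignored.

```python
# Synthetic, hypothetical cohort. Each row is
# (has_risk_factor, received_aggressive_treatment, died).
# All counts are invented purely to illustrate the mechanism.
def build_toy_cohort():
    cohort = []
    # Patients with the known risk factor are all treated aggressively,
    # and the treatment works: 5 of 100 die.
    cohort += [(True, True, True)] * 5 + [(True, True, False)] * 95
    # Patients without the risk factor receive usual care: 10 of 100 die.
    cohort += [(False, False, True)] * 10 + [(False, False, False)] * 90
    return cohort


def mortality_by_risk_factor(cohort):
    """Observed death rate per risk-factor group, ignoring treatment."""
    rates = {}
    for flag in (True, False):
        group = [row for row in cohort if row[0] == flag]
        rates[flag] = sum(1 for row in group if row[2]) / len(group)
    return rates


rates = mortality_by_risk_factor(build_toy_cohort())
print(rates)
```

In this toy cohort the group with the risk factor shows the lower observed mortality, so a model trained on outcomes alone would tend to score those patients as low-risk, even though their good outcomes depended on the extra treatment they received.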

For this project, we will pull structured data from the EHR, consisting of orders and diagnosis codes entered by the clinician, together with clinical problems abstracted from free text using natural language processing (NLP). We will primarily target orders that typically reflect the severity of a patient's illness, such as floor disposition and frequency of nursing checks. While many machine learning models have sought to create a “raw” representation of the patient from objective data such as labs, vitals, and diagnosis codes, our approach actively embraces clinician-influenced data so that models can account for the effect of clinical effort.
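As a rough illustration of how such orders might be turned into model inputs, the sketch below summarizes each patient's disposition and nursing-check orders as two simple features. The `Order` record, the order-type labels, and the ranking of dispositions are all hypothetical assumptions; a real EHR extract would have its own schema and vocabulary.

```python
from dataclasses import dataclass

# Hypothetical ranking of floor dispositions by intensity of care.
DISPOSITION_LEVEL = {"floor": 0, "step-down": 1, "icu": 2}


@dataclass
class Order:
    patient_id: str
    order_type: str  # assumed labels: "disposition" or "nursing_check"
    value: str       # disposition name, or check interval in hours


def clinical_effort_features(orders):
    """Summarize each patient's orders as simple clinical-effort features."""
    features = {}
    for order in orders:
        row = features.setdefault(
            order.patient_id,
            {"max_disposition_level": 0, "min_check_interval_h": None},
        )
        if order.order_type == "disposition":
            # Unknown dispositions fall back to the lowest level.
            level = DISPOSITION_LEVEL.get(order.value.lower(), 0)
            row["max_disposition_level"] = max(row["max_disposition_level"], level)
        elif order.order_type == "nursing_check":
            interval = float(order.value)
            current = row["min_check_interval_h"]
            # Keep the shortest interval, i.e. the most frequent checks ordered.
            if current is None or interval < current:
                row["min_check_interval_h"] = interval
    return features


example_orders = [
    Order("p1", "disposition", "floor"),
    Order("p1", "disposition", "ICU"),
    Order("p1", "nursing_check", "4"),
    Order("p1", "nursing_check", "1"),
    Order("p2", "disposition", "floor"),
]
effort = clinical_effort_features(example_orders)
print(effort)
```

The two features here (highest level of care reached and shortest interval between nursing checks) are only one possible encoding; they are meant to show that clinician-driven orders can sit alongside labs and vitals as ordinary model inputs.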

While structured data will likely capture some component of a provider’s clinical effort, the majority of a physician’s concern and associated attentiveness to a patient will likely be captured in their clinical free text. NLP has made significant strides in semantic analysis to detect emotion from free text, particularly in social media such as Facebook and Twitter. We aim to use the same models to detect both “worry” and “uncertainty” in clinician notes.
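To make the idea of scoring notes for “worry” and “uncertainty” concrete, the sketch below uses a simple keyword lexicon. This is a deliberately simplified stand-in, not the emotion-detection models referred to above: the term lists, function names, and example note are illustrative assumptions only.

```python
import re

# Hypothetical lexicons; a real project would need validated term lists
# or a trained model rather than these illustrative examples.
UNCERTAINTY_TERMS = {"possible", "possibly", "unclear", "cannot rule out", "suspect"}
WORRY_TERMS = {"concerned", "concerning", "worried", "worrisome", "monitor closely"}


def count_terms(note, terms):
    """Count whole-word, case-insensitive occurrences of each term in a note."""
    text = note.lower()
    total = 0
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        total += len(re.findall(pattern, text))
    return total


def score_note(note):
    """Return raw lexicon hit counts for worry and uncertainty."""
    return {
        "worry": count_terms(note, WORRY_TERMS),
        "uncertainty": count_terms(note, UNCERTAINTY_TERMS),
    }


example_note = "Concerned about possible sepsis; etiology unclear. Will monitor closely."
scores = score_note(example_note)
print(scores)
```

A lexicon like this ignores negation and context (for example, “no longer concerned”), which is one reason a learned semantic model may be preferable; the counts are best read as a crude baseline signal.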

As a whole, the purpose of this project is to capture physicians’ “clinical effort” more accurately throughout the treatment process. In doing so, we hope that machine learning models will be better able to account for physician-generated treatment effects when stratifying patients by risk.



Author: Michael Wang

Coauthor(s): Lindsay Busby