Aziz Nazha

Hospital systems, payers, and regulators have focused on reducing length of stay (LOS) and early readmission, with uncertain benefit. Interpretable machine learning (ML) may assist in transparently identifying the risk of important outcomes.


Retrospective cohort study of hospitalizations at a tertiary academic medical center and its branches from January 2011 to May 2018. A consecutive sample of all hospitalizations in the study period was included. Readmissions were defined as any new hospitalization for any reason for any patient within the time threshold, e.g., 30 days. Data were censored as appropriate, e.g., observation (n=261,942) or emergency department status (n=7,058) for LOS; discharge disposition of “Expired” (n=18,615) for readmission.
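The readmission definition and the censoring of "Expired" discharges can be illustrated with a minimal sketch. The record layout, the function name `flag_readmissions`, and the choice to count a gap of exactly 30 days as a readmission are assumptions made for illustration; the abstract does not describe the actual implementation.

```python
from datetime import date

def flag_readmissions(stays, window_days=30):
    """Flag each stay that is followed by a new admission within window_days.

    stays: list of (patient_id, admit_date, discharge_date, disposition).
    Returns a list aligned with stays: True/False, or None when the stay is
    censored for the readmission outcome (disposition "Expired").
    """
    # Collect the indices of each patient's stays.
    by_patient = {}
    for i, (pid, _admit, _discharge, _disp) in enumerate(stays):
        by_patient.setdefault(pid, []).append(i)

    flags = [None] * len(stays)
    for idxs in by_patient.values():
        idxs.sort(key=lambda i: stays[i][1])  # chronological by admit date
        for pos, i in enumerate(idxs):
            _, _, discharge, disposition = stays[i]
            if disposition == "Expired":
                continue  # censored: a deceased patient cannot be readmitted
            flags[i] = False
            if pos + 1 < len(idxs):
                next_admit = stays[idxs[pos + 1]][1]
                gap = (next_admit - discharge).days
                # Assumption: the threshold is inclusive (gap <= window_days).
                flags[i] = 0 <= gap <= window_days
    return flags

# Hypothetical records, for illustration only.
example_stays = [
    ("p1", date(2018, 1, 1), date(2018, 1, 5), "Home"),
    ("p1", date(2018, 1, 20), date(2018, 1, 25), "Home"),
    ("p1", date(2018, 4, 1), date(2018, 4, 3), "Expired"),
    ("p2", date(2018, 2, 1), date(2018, 2, 2), "Home"),
]
example_flags = flag_readmissions(example_stays)
```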

Explainable ML algorithms were trained on medical, sociodemographic, and institutional variables to predict readmission, length of stay (LOS), and death within 48–72 hours. Prediction performance was measured by area under the receiver operating characteristic curve (AUC), Brier score loss (BSL), which measures how well the predicted probability matches the observed probability, and other metrics. Interpretations were generated using multiple feature extraction algorithms.
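For readers unfamiliar with the two headline metrics, the following sketch computes them from first principles. It uses the standard definitions (Brier score as the mean squared difference between predicted probability and the 0/1 outcome; AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative case). The function names and the toy inputs are illustrative assumptions, not the study's code or data.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def roc_auc(y_true, y_prob):
    """AUC via pairwise comparison of positive and negative cases.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher predicted probability; ties count as half.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Toy labels and predicted probabilities, for illustration only.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
toy_bsl = brier_score(labels, probs)
toy_auc = roc_auc(labels, probs)
```

Because the Brier score is a mean squared error, it tends to be small whenever the outcome is rare, so it is best read alongside the event rate.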


The study cohort included 1,485,880 hospitalizations for 708,089 unique patients (median age of 59, QI [39, 73]; 55.6% female; 71% white). There were 211,022 thirty-day readmissions for an overall readmission rate of 14% (for patients > 65 years: 16%). Median LOS, including observation and labor and delivery patients, was 2.94 days (QI [1.67, 5.34]), or, if these patients are excluded, 3.71 days (QI [2.15, 6.51]).

Predictive performance was as follows: thirty-day readmission (AUC 0.76, BSL 0.1); LOS >5 days (AUC 0.85, BSL 0.14); death within 48–72 hours (AUC 0.91, BSL 0.001). Explanatory diagrams showed the factors that influenced each prediction.


Interpretable ML achieves state-of-the-art predictive power for hospital readmission and extended LOS, and provides locally and globally interpretable predictions.

2019-11-04 20:32:55 (GMT)

Genomic Biomarkers to Predict Response to Hypomethylating Agents in Patients with Myelodysplastic Syndromes (MDS) using Artificial Intelligence


While treatment with the hypomethylating agents (HMAs) azacitidine (AZA) and decitabine (DAC) improves cytopenias and prolongs survival in MDS patients (pts), response is not guaranteed. Identification of non-responders could prevent prolonged exposure to ineffective therapy, avoid toxicities, and decrease unnecessary costs. Some genomic abnormalities may predict response to HMAs in a small subset of pts, though this approach does not take into account the genomic heterogeneity of the disease or the association of these mutations with others.


We developed an unbiased framework to study the association of several mutations in predicting response to HMAs, analogous to the recommender systems of Netflix or Amazon, in which customers who bought products A and B are likely to buy C: pts who have mutations in genes A and B are likely to respond or not respond to HMAs.


We screened a cohort of 433 pts with MDS (per 2008 WHO criteria) who received HMAs (230 at our institution [training cohort] and 203 at multiple other academic institutions [validation cohort]) for the presence of common myeloid mutations in 29 genes, assessed by next-generation sequencing (NGS) before the start of HMA therapy. The association between mutations and response was evaluated with the Apriori market basket analysis algorithm. Association rule mining is a machine learning method that can discover hidden relationships between variables in a dataset. Rules with the highest confidence (the confidence that the association exists) and the highest lift (how strong the association is) were chosen.
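To make confidence and lift concrete, the sketch below computes both for a single rule of the form "mutations present → outcome", using the standard association-rule definitions. It is not the full Apriori algorithm, which additionally searches over candidate itemsets and prunes them by minimum support. The function name, the gene placeholders A, B, and C, and the toy records are hypothetical.

```python
def rule_metrics(records, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent.

    records: list of sets, each holding a patient's mutated genes plus an
             outcome label.
    antecedent: iterable of genes that must all be mutated.
    consequent: outcome label, e.g. "no_response".
    """
    antecedent = set(antecedent)
    n = len(records)
    n_ante = sum(1 for r in records if antecedent <= r)
    n_cons = sum(1 for r in records if consequent in r)
    n_both = sum(1 for r in records if antecedent <= r and consequent in r)

    support = n_both / n                              # P(antecedent and outcome)
    confidence = n_both / n_ante if n_ante else 0.0   # P(outcome | antecedent)
    # Lift compares confidence with the baseline outcome rate;
    # values above 1 mean the outcome is enriched when the mutations co-occur.
    lift = confidence / (n_cons / n) if n_cons else 0.0
    return support, confidence, lift

# Made-up records, for illustration only (not study data).
toy_records = [
    {"A", "B", "no_response"},
    {"A", "B", "no_response"},
    {"A", "B", "response"},
    {"A", "response"},
    {"B", "response"},
    {"C", "no_response"},
    {"C", "response"},
    {"response"},
]
toy_support, toy_confidence, toy_lift = rule_metrics(
    toy_records, {"A", "B"}, "no_response"
)
```

In this toy set, 3 of 8 records carry both A and B, and 2 of those 3 are non-responders, so the rule has a confidence of 2/3 against a baseline non-response rate of 3/8.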


Among 433 pts, 193 (45%) received AZA, 176 (40%) DAC, and 64 (15%) received HMA +/- combination. The median age was 70 years (range, 31-100) and 28% were female.
Association rules identified the following genomic combinations as being highly associated with no response: (ASXL1, NF1); (ASXL1, EZH2, TET2); (ASXL1, EZH2, RUNX1); (EZH2, SRSF2, TET2); (ASXL1, EZH2, SRSF2); (ASXL1, RUNX1, SRSF2); (ASXL1, TET2, SRSF2); (ASXL1, BCOR, RUNX1); and (SRSF2, RUNX1, BCOR), while the combination of (TET2, RUNX1, SRSF2) predicted response. When these rules were applied to the validation cohort, the accuracy of these genomic biomarkers for no response to HMA was 93%.


Genomic biomarkers can identify about a third of pts with MDS who will not respond to HMAs, with high sensitivity. Although these abnormalities are present in only a subset of pts, they can be used to tailor treatment options for these pts by offering alternative therapies to pts with lower-risk disease and by enrolling pts with higher-risk disease in clinical trials of novel combinations or proceeding to transplant without prior HMA treatment. This study highlights the complexity of interpreting genomic data with regard to response to HMAs and underscores the importance of machine learning technologies such as the recommender system algorithm in translating genomic data into useful clinical tools.