Author: Nathaniel Bischoff

Coauthor(s): Ari Cedars, MD

Status: Project Concept

The prevalence of adult congenital heart disease (ACHD) is increasing in the industrialized world. This unique group of patients is hospitalized at rates far higher than those of the general population. Clinical trials in ACHD have been largely disappointing for three reasons: significant heterogeneity, both between lesions and within anatomical groups depending on prior repair and genetic background; relatively small numbers of individuals with any given lesion; and comparatively low event rates. As a result, clinicians are left with few proven therapeutic options. To move the field forward, it will be essential to use big data, including clinical, imaging, genetic, and patient-reported variables, to individualize intervention and minimize heterogeneity in treatment effects, thereby improving the probability of conducting successful clinical trials. Furthermore, enrolling ACHD patients identified as being at high risk for clinical events will raise event rates, improving study power and decreasing the number of subjects necessary to demonstrate efficacy.
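The link between event rate and trial size can be illustrated with a rough calculation. The sketch below uses the standard normal-approximation formula for comparing two proportions; the function name, the event rates, and the 25% relative risk reduction are illustrative assumptions, not figures from this project.

```python
import math
from statistics import NormalDist


def subjects_per_arm(control_rate, relative_reduction, alpha=0.05, power=0.80):
    """Approximate subjects needed per arm to compare two event proportions.

    Normal-approximation formula, two-sided test. All inputs here are
    hypothetical; none are estimates from ACHD data.
    """
    treated_rate = control_rate * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = (control_rate * (1 - control_rate)
                + treated_rate * (1 - treated_rate))
    effect = control_rate - treated_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Same assumed treatment effect, two assumed control-arm event rates.
unselected = subjects_per_arm(control_rate=0.10, relative_reduction=0.25)
high_risk = subjects_per_arm(control_rate=0.30, relative_reduction=0.25)
print(unselected, high_risk)
```

Under these assumptions, the cohort with the higher event rate needs fewer subjects per arm to detect the same relative effect, which is the reasoning behind enriching trials with high-risk patients.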
A large volume of clinical data has been collected in the National Patient-Centered Clinical Research Network (PCORnet), representing nearly ⅓ of the total US population. The volume and heterogeneity of these data are major impediments to traditional statistical methodologies.
Using a validated algorithm, we will identify patients in PCORnet with complex congenital heart disease. Using several machine learning algorithms, including recurrent and deep neural networks, we propose to identify temporal patterns of clinical variables in PCORnet that anticipate hospitalization, emergency room visit, or death related to congenital heart disease. Training sets will be used to construct a predictive algorithm, after which the derived tool will be validated in testing sets to assess performance. This initial step will include only clinical data derived from PCORnet; in the future, other data will be added, including imaging, genetic, and patient-sourced variables derived from a mobile platform, with the goal of improving the tool’s performance.
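The train-then-validate workflow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the data are synthetic, the two features (`prior_admissions`, `abnormal_lab`) are hypothetical, and a plain logistic regression stands in for the recurrent and deep neural networks proposed here. It does not use PCORnet data or any PCORnet interface.

```python
import math
import random


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def make_synthetic_patients(n, seed=0):
    """Return (patient_id, features, event) tuples of made-up data."""
    rng = random.Random(seed)
    patients = []
    for patient_id in range(n):
        prior_admissions = rng.randint(0, 5)  # hypothetical feature
        abnormal_lab = rng.random()           # hypothetical feature in [0, 1)
        logit = -3.0 + 0.6 * prior_admissions + 1.5 * abnormal_lab
        event = 1 if rng.random() < sigmoid(logit) else 0
        patients.append((patient_id, [prior_admissions, abnormal_lab], event))
    return patients


def split_by_patient(patients, test_fraction=0.3, seed=0):
    """Shuffle, then hold out whole patients so none appears in both sets."""
    shuffled = patients[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def predict(weights, bias, x):
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def fit_logistic(rows, epochs=300, learning_rate=0.05):
    """Batch gradient descent on the logistic loss (stand-in model)."""
    n_features = len(rows[0][1])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for _, x, y in rows:
            error = predict(weights, bias, x) - y
            grad_b += error
            for j in range(n_features):
                grad_w[j] += error * x[j]
        bias -= learning_rate * grad_b / len(rows)
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, grad_w)]
    return weights, bias


def auc(scores, labels):
    """Probability that a random event case outscores a random non-event."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


patients = make_synthetic_patients(1000)
train, test = split_by_patient(patients)
weights, bias = fit_logistic(train)
test_scores = [predict(weights, bias, x) for _, x, _ in test]
test_auc = auc(test_scores, [y for _, _, y in test])
print(f"held-out AUC: {test_auc:.3f}")
```

Splitting at the patient level matters once each patient contributes many time-stamped records: if records from one patient land in both sets, the held-out performance will look better than it would on new patients.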