“We developed a technology called artificial swarm intelligence, which is all about tapping the knowledge, wisdom, and intuition of groups.”

Louis B. Rosenberg, technology writer

Deep learning models have millions of parameters and require large, curated data sets. Building accurate, robust, and unbiased models calls for heterogeneous data from many sites, yet sharing that data is difficult because data security and anonymization are hard to guarantee. This perspective from Nature’s Digital Medicine elucidates a potential solution to this difficult problem: federated learning.

The traditional centralized data lake serves as a central repository of data pooled from different sites, from which centers extract data for local training. Federated learning, a learning paradigm that addresses data governance and privacy, does not require data to be donated; instead, the algorithm is trained collaboratively, with only model parameters and gradients being shared and aggregated.
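To make the aggregation idea concrete, here is a minimal, illustrative sketch of FedAvg-style parameter averaging in Python. The function names (local_update, federated_average) and the toy training step are assumptions for illustration, not from the article; only the model weights, never the patient data, leave each site.

```python
# Minimal sketch of federated parameter aggregation (FedAvg-style),
# assuming each site trains locally and only model weights are shared.
import numpy as np

def local_update(weights, site_data, lr=0.01):
    """Stand-in for one round of local training at a site.
    In practice this would run several epochs of SGD on the site's own data."""
    gradient = np.random.randn(*weights.shape) * 0.01  # placeholder for a real gradient
    return weights - lr * gradient

def federated_average(site_weights, site_sizes):
    """Aggregate local models into a global model, weighting each
    site by the number of examples it trained on."""
    total = sum(site_sizes)
    return sum(w * (n / total) for w, n in zip(site_weights, site_sizes))

# One communication round: the server sends the global model out,
# each site trains locally, and only the updated weights come back.
global_weights = np.zeros(10)
site_sizes = [1200, 800, 500]  # examples held at each hypothetical hospital
local_models = [local_update(global_weights, None) for _ in site_sizes]
global_weights = federated_average(local_models, site_sizes)
```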

Federated learning comes in two main forms: 1) aggregation-server FL, in which a central server distributes a global model to a federation of training nodes, which in turn submit their partially trained models back to the server for aggregation; and 2) peer-to-peer FL, a decentralized strategy in which each training node exchanges its partially trained model with some or all of its peers, and each subgroup performs its own aggregation.
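The server-based form follows the averaging sketch above; the peer-to-peer form can be pictured as a gossip-style round, as in the hedged sketch below. The peer graph, uniform mixing, and toy training step are assumptions; real deployments use secure channels and more careful mixing weights.

```python
# Hedged sketch of peer-to-peer FL: no central server, each node averages
# its locally trained model with the models of its listed peers.
import numpy as np

def peer_to_peer_round(node_weights, peer_graph, train_fn):
    """One round: every node trains locally, then mixes its model with
    the models of its peers; aggregation happens at each node."""
    trained = {node: train_fn(w) for node, w in node_weights.items()}
    mixed = {}
    for node, peers in peer_graph.items():
        neighborhood = [trained[node]] + [trained[p] for p in peers]
        mixed[node] = np.mean(neighborhood, axis=0)
    return mixed

# Example: three hypothetical hospitals exchanging models in a ring.
peer_graph = {"A": ["B"], "B": ["C"], "C": ["A"]}
weights = {name: np.zeros(10) for name in peer_graph}
weights = peer_to_peer_round(weights, peer_graph, lambda w: w + 0.01)
```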

Based on the aforementioned architecture, several FL topologies can be designed: centralized (hub-and-spoke), decentralized (peer-to-peer), hierarchical (a federation with sub-federations), and hybrid hierarchical (a mix of federations and peer-to-peer).
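One way to see the difference between these topologies is as communication graphs describing who sends model updates to whom. The sketch below is purely illustrative; the node names are hypothetical.

```python
# Illustrative encodings of the FL topologies above as communication graphs.
topologies = {
    # Hub-and-spoke: every training node talks only to a central server.
    "centralized": {"server": ["site_1", "site_2", "site_3"]},
    # Peer-to-peer: nodes exchange updates directly, with no central server.
    "decentralized": {"site_1": ["site_2", "site_3"],
                      "site_2": ["site_1", "site_3"],
                      "site_3": ["site_1", "site_2"]},
    # Hierarchical: regional aggregators feed a global aggregator.
    "hierarchical": {"global_server": ["region_a", "region_b"],
                     "region_a": ["site_1", "site_2"],
                     "region_b": ["site_3", "site_4"]},
}
```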

In addition, the FL compute plan, that is, the trajectory of a model across the participating partners, can take several forms: sequential training (cyclic transfer learning), aggregation, or peer-to-peer exchange.
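The sequential (cyclic transfer) plan differs from aggregation in that a single model travels from partner to partner rather than being averaged. A minimal sketch follows; the fine_tune step and site names are assumptions for illustration.

```python
# Minimal sketch of a sequential / cyclic-transfer compute plan: one model
# is fine-tuned at each site in turn, instead of being aggregated centrally.
import numpy as np

def cyclic_transfer(initial_weights, sites, fine_tune, cycles=2):
    """Pass the model around the federation `cycles` times; each site
    trains on its own data and hands the weights to the next site."""
    weights = initial_weights
    for _ in range(cycles):
        for site in sites:
            weights = fine_tune(weights, site)
    return weights

# Example with a toy fine-tuning step.
sites = ["hospital_a", "hospital_b", "hospital_c"]
final_weights = cyclic_transfer(np.zeros(10), sites, lambda w, site: w + 0.01)
```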

This strategy of “data-driven medicine requires federated efforts” allows multiple collaborators to train their models without sharing or centralizing their sensitive medical data. In short, federated learning combines knowledge learned from non-co-located data.

The possible dividends of sharing such insights include more accurate medical image interpretation informed by many subspecialists, precision medicine drawing on larger cohorts of similar patients, and more expedient trials for drug discovery, among others.

Among the questions that remain to be answered are the logistics of creating the global model and administering such a federation. In addition, federated learning still depends on data quality, surveillance for bias, and standardization, as well as attention to data security.

Finally, federated learning can fully exploit the artificial intelligence philosophy of swarm intelligence by moving the model to the data and collecting the resulting insights safely. This is akin to the brain receiving input signals from the peripheral nervous system.

The full article can be read here