“What is real? Because unceasingly we are bombarded with pseudo-realities manufactured by very sophisticated people using very sophisticated electronic mechanisms. I do not distrust their motives; I distrust their power. They have a lot of it. And it is an astonishing power: that of creating whole universes, universes of the mind.”

Philip K. Dick, American science fiction writer

This commentary on synthetic data in health care, published in Nature Biomedical Engineering, is a good overall review of this increasingly relevant topic. Biases in algorithms often stem from the training data, so data is under more scrutiny than ever before. Biases can arise from sample selection or class imbalance, and AI models are vulnerable to new phenotypic expressions of disease. In addition, there is a paucity of large datasets that are both diverse and accurate.

One solution to these data problems is synthetic data. Synthetic data can increase the diversity of datasets and improve the robustness and adaptability of AI models. Because synthetic datasets aim to capture the original distribution of the data rather than reproduce individual records, patient privacy concerns can be lessened. Synthetic data can also improve AI algorithms through data augmentation. On the other hand, one vulnerability of generative models is information leakage, in which details of the real training records can be recovered from the model or its output.
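To make the augmentation idea concrete, here is a minimal Python sketch of one very simple way to generate synthetic records for an under-represented class. It is not the method described in the commentary, which concerns far more capable generative models. The toy data, the feature choice (age and systolic blood pressure), and the helper names `fit_gaussian` and `sample_synthetic` are all hypothetical, and the sketch assumes each feature is an independent Gaussian.

```python
import random
import statistics

def fit_gaussian(rows):
    """Estimate a per-feature mean and standard deviation from real rows."""
    columns = list(zip(*rows))
    return [(statistics.mean(c), statistics.stdev(c)) for c in columns]

def sample_synthetic(params, n, seed=0):
    """Draw n synthetic rows, treating features as independent Gaussians."""
    rng = random.Random(seed)
    return [[rng.gauss(mu, sigma) for mu, sigma in params] for _ in range(n)]

# Hypothetical toy data: [age, systolic blood pressure] for two classes.
majority = [[54, 128], [61, 135], [47, 122], [58, 131], [66, 140], [50, 125]]
minority = [[72, 155], [69, 149], [75, 160]]

# Fit the toy model to the minority class only, then sample enough
# synthetic rows to match the size of the majority class.
params = fit_gaussian(minority)
synthetic = sample_synthetic(params, len(majority) - len(minority))
balanced_minority = minority + synthetic
```

Because this toy sampler is fit to only three records, its output stays close to those records, which hints at why leakage and small training sets are a concern for generative models in general.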

Another limitation of synthetic data is that generative models are constrained by the size and quality of their training dataset. Lastly, training generative models on multi-institutional datasets can improve generalization and reduce bias, but sharing data across institutions is complex.

Deepfakes have become an issue with AI-synthesized media, and insufficient security measures leave healthcare vulnerable to them. Healthcare data can therefore be synthesized for nefarious purposes such as fraud, but synthesis can also be used to anonymize patient data for security purposes. Three metrics are used to understand the fidelity-diversity and privacy-utility trade-offs. As healthcare adopts synthetic data, regulatory oversight is urgently needed.

Read the full paper here: https://www.nature.com/articles/s41551-021-00751-8