Picture: GraceOda Flickr

Health data is exploding and threatens to overwhelm our ability to store, analyze and use it. Doctors and patients want actionable information at the point of care that assists them with shared decision making. Health care systems want data that allows them to better manage populations by identifying those at risk for a specific outcome and the risk factors that can be changed.

Medical science blazed new paths of innovation with the development of vaccines and antibiotics, and new ways of treating heart disease and cancer. Going forward, continued advances in medicine and health increasingly will be tied to the ability to interpret massive amounts of data, according to Dr. Lloyd Minor, dean of the Stanford University School of Medicine. “We are now at the point where innovation is at the algorithmic level,” he said.

In addition, all that data is a treasure trove for researchers interested in learning how to make care better, more personalized, and less wasteful.

The challenges of using EMR data for research are several:

  1. Too much data: there is more potential data available than a human can absorb and understand, and volume and variety create challenges for extracting knowledge from data. We also have to deal with “recency bias” and the availability heuristic, the tendency to assume that future events will closely resemble recent experience. Given that the amount of data in the world has increased about tenfold every two years for roughly the last three decades, there is a tendency to ignore historical trends and rely only on the more recent, more abundant data.
  2. What does it mean? Lack of semantic standards: the same test may be recorded as “Serum Na” in one system and “sodium, serum” in another.
  3. Lack of syntactic standards: data arrive as SQL, XML, HL7, or free text. Is it complete and accurate?
  4. Limited longitudinal data or fragmented data: a person’s data is not linked across various health care systems, and the available data often cover limited and incomplete time periods.
  5. Lack of consistent documentation using structured data fields. Narrative text, dictated documentation, or incomplete documentation may occur when doctors are rushed or caring for a patient who warrants additional time and attention, so the gaps may not be random. When data is missing not at random, the secondary use of the data has limitations, as the reason the data is missing might bias any inferences obtained from it.
  6. Inconsistent use of natural language processing methods to extract narrative and text data and convert it to structured data.
  7. Unrecognized duplicate or redundant data.
  8. Many important data types, such as imaging data or echocardiography results, are stored only as text documents, some only as PDFs, with limited ability to use discrete data, such as a measure of ejection fraction, for research or clinical decision support.
  9. Whose data is it, and how do I access it? Who has the decision rights to change or modify the record?
  10. Who will fill the gap between the data and the doctor, given doctors’ overworked schedules and lack of training? Are more coaches the answer?
  11. What are the ethical issues concerning the use of data, including reconciling the ethics of medicine with the ethics of data entrepreneurship?
  12. How do we define the expectation of data privacy?
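To make two of these challenges concrete, here is a minimal sketch in Python of what handling them might look like: mapping differently worded lab labels (item 2) to one canonical name, and pulling an ejection fraction out of a free-text report (item 8). Everything named here is an assumption for illustration: the synonym table, the canonical name `sodium_serum`, the function names, and the regular expression are hypothetical, and a production system would map to a standard vocabulary such as LOINC and use far more robust text processing than a single pattern.

```python
import re

# Hypothetical synonym table (illustrative only): local lab labels mapped
# to one canonical name. A real system would map to a standard vocabulary.
LAB_SYNONYMS = {
    "serum na": "sodium_serum",
    "sodium, serum": "sodium_serum",
    "na, serum": "sodium_serum",
}


def normalize_lab_name(raw):
    """Return the canonical name for a local lab label, or None if unmapped."""
    # Lowercase, trim, and collapse repeated whitespace before the lookup.
    key = re.sub(r"\s+", " ", raw.strip().lower())
    return LAB_SYNONYMS.get(key)


# Assumed pattern: matches phrases such as "EF 55%",
# "ejection fraction of 40 %", or "LVEF: 35-40%".
EF_PATTERN = re.compile(
    r"\b(?:LV)?(?:EF|ejection fraction)\b"      # the measurement name
    r"[^0-9%]{0,20}?"                           # short filler such as ": " or " of "
    r"(\d{1,2})"                                # first (or only) value
    r"(?:\s*(?:-|to)\s*(\d{1,2}))?"             # optional upper end of a range
    r"\s*%",
    re.IGNORECASE,
)


def extract_ejection_fraction(text):
    """Return the ejection fraction in percent as a float, or None.

    A reported range such as "35-40%" is reduced to its midpoint.
    """
    match = EF_PATTERN.search(text)
    if match is None:
        return None
    low = int(match.group(1))
    high = int(match.group(2)) if match.group(2) else low
    return (low + high) / 2
```

For example, `normalize_lab_name("Serum Na")` and `normalize_lab_name("sodium, serum")` both resolve to the same canonical name, and `extract_ejection_fraction("LVEF: 35-40%")` yields the midpoint of the reported range. Labels or phrasings that the table and pattern do not anticipate return `None`, which is exactly the incompleteness problem the list describes.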

As if that’s not enough to handle, here are some more challenges.

What’s more, algorithms can be deceiving, being used as weapons of math destruction.

Here are some solutions.

Where are all the sick care data scientists interested in digital data disruption?

The latest additions to our armamentarium are, of course, artificial intelligence and blockchain, and data scientists and clinicians are still trying to figure out whether and where these techniques add the most value or change patient and doctor behavior. At last count, for example, there were about 30 radiology AI companies in the Bay Area, and that has radiologists nervous or excited, depending on their individual viewpoint. Also, collaborative cloud applications requiring identity management are a growing trend.

The lack of data interoperability in the world of HIPAA continues to block progress.

The task at hand is to organize patient-academic-industry partnerships to create solutions for trusted digital ecosystems that support and facilitate the use of de-identified and secure data to generate knowledge that will improve the delivery of health care value and health care outcomes for all stakeholders.

By Arlen Meyers, MD, MBA, President and CEO of the Society of Physician Entrepreneurs.

Arlen Meyers is a professor emeritus of otolaryngology, dentistry, and engineering at the University of Colorado School of Medicine and the Colorado School of Public Health and President and CEO of the Society of Physician Entrepreneurs at www.sopenet.org. He has created several medical device and digital health companies. Most of them failed. His primary research centers on biomedical and health innovation and entrepreneurship and life science technology commercialization.

He consults for and speaks to companies, governments, colleges and universities around the world that need his expertise and contacts in the areas of bioentrepreneurship, bioscience, healthcare, healthcare IT, medical tourism (nationally and internationally), new product development, product design, and financing new ventures.

And Lisa Schilling, MD, MSPH, is a Professor of Medicine at the University of Colorado School of Medicine.