“You must understand little data before big data and shallow learning before deep learning,” said Dr. Robert Hoyt, Associate Clinical Professor in the Internal Medicine Department at Virginia Commonwealth University, during his talk “Machine Learning for non-data scientists” at the AIMed 19 conference.
Dr. Hoyt believes that while there is no easy way to learn a programming language or data science in general, medical professionals should not regard them as hurdles in their quest for artificial intelligence (AI)-driven solutions. This knowledge is important: it not only allows them to “intelligently read or review medical articles today” but also keeps them abreast of recent and upcoming developments.
In a recent interview with AIMed, Dr. Hoyt revisited some of the challenges associated with understanding data science, alternative software that the medical community can leverage, and the progress of health informatics in this new decade.
AIMed: Based on your experience teaching health informatics, what are some of the challenges medical professionals face as they learn a programming language and machine learning?
Dr. Hoyt: This is an excellent question, and I am going to broaden it a little to data science, which includes AI, machine learning (ML) and biostatistics.
The biggest challenge is that most medical professionals do not learn any data science in medical school. If they happen to have a secondary degree like a Master of Public Health or Master of Science, they may pick up a few skills. Basically, the only people who really learn data science are those who have completed a related degree. This puts medical professionals at a considerable disadvantage, as there is a huge gap between what they currently know and what they probably ought to know to be a member of a data science team.
Besides, a data scientist spends 60-80% of his or her time not running a neural network or some very interesting algorithm, but locating appropriate data, cleaning it, and figuring out how it is going to be used, what to do with missing data and so on. So my point is that no one can start learning AI or ML right away; they need some sort of background to get them ready to handle the data. It’s simplistic to say “I am going to train someone in AI or ML” without them first learning some basics.
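As a rough illustration of the preparation work described above, here is a minimal Python sketch of one small step: counting missing values and filling them in. The records, the field names (`age`, `systolic_bp`) and the choice of median imputation are hypothetical, chosen only for illustration; none of them come from the interview.

```python
from statistics import median

# Hypothetical patient records (assumed for illustration); None marks a missing value.
records = [
    {"age": 54, "systolic_bp": 130},
    {"age": None, "systolic_bp": 142},
    {"age": 61, "systolic_bp": None},
    {"age": 47, "systolic_bp": 118},
]

def count_missing(rows):
    """Count how many values are missing (None) in each field."""
    counts = {}
    for row in rows:
        for field, value in row.items():
            counts[field] = counts.get(field, 0) + (1 if value is None else 0)
    return counts

def impute_median(rows):
    """Return new rows with each missing value replaced by that field's median.

    Assumes every field has at least one observed value; the input is not modified.
    """
    fields = {field for row in rows for field in row}
    medians = {
        field: median(row[field] for row in rows if row.get(field) is not None)
        for field in fields
    }
    return [
        {field: (value if value is not None else medians[field])
         for field, value in row.items()}
        for row in rows
    ]

missing = count_missing(records)
cleaned = impute_median(records)
print(missing)
print(cleaned)
```

Median imputation is only one of several possible strategies. Whether it is appropriate depends on why the data are missing, which is exactly the kind of judgment the paragraph above says takes up most of a data scientist's time.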
AIMed: In that case, do you think it’s more practical for medical professionals to collaborate with data scientists to build an AI solution than to pick up a programming language and ML from scratch?
Dr. Hoyt: I think it’s more realistic for medical professionals to be part of a data science team. That being said, they need to at least “speak the language”, which is a concern of mine. It’s like learning a foreign language: we are not asking people to be fluent, but they should have conversational skills. So even if medical professionals are not trained to be data scientists, they need more knowledge than they currently have.
In my opinion, we are moving towards AI as an online service, so that hospitals and healthcare systems will not necessarily need additional hardware or software; all they have to do is contract with an AI provider. Nevertheless, many of these automated ML and AI platforms require you to have the data ready for analysis. The question becomes: how many medical professionals are trained, and have their data polished and curated so that it is ready for these platforms? So we really need to talk more about a broad education and not only ML.
AIMed: Is there a so-called “shortcut” or “quickest path” to picking up a programming language or ML?
Dr. Hoyt: Honestly, there is no shortcut to learning a programming language, and it’s easiest for those who already have some experience. There are many low-cost or free courses for picking up a programming language but, based on personal experience, one may need a mentor or instructor to run questions by. Learning a programming language is one thing; applying it to ML is another, as it requires different sets of packages or libraries.
Learning the ins and outs of ML is perhaps a little easier than learning a programming language, especially since there are automated ML platforms like RapidMiner, which runs five or six algorithms all at once and gives you a meaningful output. It’s great but, as I mentioned, one needs to get the data ready. In other words, AI is labor intensive and there is no quick way out: a programming language alone is not enough, and one will also need to understand higher mathematics, calculus and linear algebra.
AIMed: Some believe that using pre-programmed platforms to develop ML solutions will aggravate the AI black box problem. Is that correct, and what are your thoughts on this?
Dr. Hoyt: I don’t think that’s correct. Most people use ML to perform predictive analytics, and it’s only with deep learning and advanced neural networks that the black box issue arises. I do recommend ML software platforms for the average clinician. Programs like RapidMiner are very good because you can see exactly what processes are taking place in the background. So even though an “auto-model” is present to give you the solution, you can always trace how it arrived at that solution. As such, I really don’t think it’s an issue.
AIMed: It’s interesting we keep encouraging medical professionals to pick up data science, but have you asked whether they are open to the idea?
Dr. Hoyt: That’s a good question. It’s unclear, and I think what we need is a whole lot of data about what the industry needs, what we are currently teaching, and what the medical community knows. At my alma mater, there’s minimal health informatics and data science, and the reason given is that there are only so many hours in a day and it’s not possible to cram anything else into the curriculum. I think that’s a problem. We may need to think of new methods or strategies to include data science in medical education.
AIMed: As AI gets more prevalent in medicine and healthcare, how do you think health informatics will progress in this new decade?
Dr. Hoyt: Some health informatics instructors believe they are already teaching data science, but I think that is incorrect. A study written by Meyer and published last spring suggested that when industry asks for a healthcare data scientist, it expects that person to know a programming language, biostatistics and ML, and to have good soft skills, meaning they need to know how to present visualized data and so on. But I can guarantee most health and clinical informaticists are not being consistently taught those skills.
Clearly, the health informatics community is currently grappling with how to incorporate more data science into the curriculum. There may be an overlap between health informatics and data science, but how it all plays out remains unclear. On top of that, a multitude of new biomedical data science courses are being taught in the US, yet we are a long way from graduating enough people to meet the need.