I am a pediatric cardiologist and have cared for children with heart disease for the past three decades. In addition, I have an educational background in business and finance as well as healthcare administration and global health: I earned a Master's degree in Public Health from UCLA and taught Global Health there after completing the program.
“The measure of intelligence is the ability to change.”
This timely and noteworthy Research Letter from JAMA examines the accuracy of a generative artificial intelligence model (GPT-4) on complex diagnostic challenges rather than standardized medical examinations. Although generative AI models have performed well on a battery of tests, including the USMLE (United States Medical Licensing Examination), day-to-day clinic and emergency room sessions are full of cases that do not resemble the classic presentations found in standardized tests.
For this 2023 study, the source of the challenges was the clinico-pathologic conference series of the New England Journal of Medicine; a few cases were used to develop a standard chat prompt instructing the AI model to provide a differential diagnosis ranked by probability. Cases published between January 2021 and December 2022 were then evaluated, each run in an independent chat session. The primary outcome was whether the model's top diagnosis matched the final case diagnosis from the NEJM; a secondary outcome was whether the final case diagnosis appeared anywhere in the model's list of differential diagnoses. A composite 5-point rating scale based on the accuracy and usefulness of the model's list was applied, and although only a few raters were used, Cohen's kappa was calculated to measure agreement between two raters.
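For readers unfamiliar with the statistic, Cohen's kappa measures agreement between two raters beyond what chance alone would produce. A minimal sketch of the computation, using made-up 5-point scores (not the study's data):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_o is the
    observed agreement and p_e the agreement expected by chance."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items both raters scored identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical quality ratings from two raters on eight cases.
a = [5, 4, 4, 3, 5, 2, 1, 4]
b = [5, 4, 3, 3, 5, 2, 2, 4]
print(round(cohens_kappa(a, b), 2))  # → 0.68
```

A kappa of 1 indicates perfect agreement, 0 indicates agreement no better than chance; values around 0.6 to 0.8 are conventionally read as substantial agreement.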
The results showed that the AI model's top diagnosis agreed with the final case diagnosis in 27 of 70 cases (39%), while in 45 of 70 cases (64%) the model included the final diagnosis somewhere in its differential. Of note, in only 4 of 70 cases did the model fail to include the final case diagnosis in its differential at all (a "complete miss"). It would have been fascinating to see how clinicians would have fared in the same intellectual exercise. Overall, generative AI is a promising and necessary adjunct to clinician cognition now and into the future.
To read the full article, click here.
This fascinating topic of generative AI, along with others, will be discussed at the annual Ai-Med Global Summit, scheduled for May 29-31, 2024, in Orlando. Book your place now!
We at Ai-Med believe in changing healthcare one connection at a time. If you are interested in discussing the contents of this article or connecting, please drop me a line – [email protected]