Machine learning and health policy expert Dr. Ziad Obermeyer urges us to stop turning algorithms into ‘literal genies’

“The algorithm is a genie. You can ask for whatever you want but always be careful what you wish for,” laughs Dr. Ziad Obermeyer, Emergency Physician and the Blue Cross of California Distinguished Associate Professor at the University of California, Berkeley School of Public Health.

To illustrate how an AI algorithm can easily misread what we want it to do for us, Obermeyer tells a joke: “Brad finds a genie in a lamp. The genie asks Brad what his first wish is. Brad replies that he wants to be rich. The genie says, ‘OK, Brad, what’s your second wish?’ Brad says he’d like to maximize his happiness and spend less time commuting. So the genie organizes a global pandemic so that everyone stays home and the roads are traffic-free for an easy commute.”

Not exactly a big laugh, but reality can be even less amusing. For instance, a social media platform recently showed a user her own house burning down as the ‘highlight of the year’. “Perhaps the algorithm was trained to predict things that will get people’s attention,” explains Obermeyer. “It’s all about clicks and engagement, but we didn’t tell the ‘genie’ that we wanted good attention. All we said was, ‘Hey, genie, give me some attention!’ and that’s what the ‘genie’ did.”

In healthcare, many algorithms look ahead to the next year to identify patients who are more likely than others to get sick and generate high healthcare costs. Those patients are then selected for a population health management program, where they receive attention from a team of health professionals. As an example, Obermeyer cites two patients. Patient 1 is White, has mild depression, takes citalopram and recently had a knee replacement. He is now walking freely without assistance. Patient 2 is Black, with diabetes, heart failure and COPD. He is on insulin and seven other kinds of medication. He had two emergency room visits last year, which led to one hospitalization and a prolonged rehabilitation stay. Interestingly, these two patients received the same algorithmic score: both were deemed equally high risk and were automatically enrolled in the same population health program.
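The mechanics of such a screening tool are simple enough to sketch in a few lines of code. The sketch below is purely illustrative: the synthetic data, feature names, model choice and enrollment cutoff are assumptions made for the example, not details of the commercial algorithm Obermeyer studied.

```python
# Illustrative sketch only, not the commercial algorithm Obermeyer studied.
# A "cost-as-label" risk score: train on prior-year features, predict
# next-year spending, and auto-enroll the highest-scoring patients.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical prior-year features for a synthetic patient population.
X = pd.DataFrame({
    "age": rng.integers(18, 90, n),
    "n_chronic_conditions": rng.poisson(1.5, n),
    "er_visits_last_year": rng.poisson(0.3, n),
    "had_elective_surgery": rng.integers(0, 2, n),
})

# The label the developers chose: total healthcare cost in the following
# year. An expensive elective procedure inflates it just as much as
# chronic illness does -- this is the "catch" in the genie wish.
next_year_cost = (
    2_000 * X["n_chronic_conditions"]
    + 8_000 * X["er_visits_last_year"]
    + 25_000 * X["had_elective_surgery"]
    + rng.normal(0, 3_000, n)
)

model = GradientBoostingRegressor().fit(X, next_year_cost)

# Risk score = predicted cost; patients above a cutoff are flagged for the
# population health program. The 97th-percentile cutoff is illustrative.
risk_score = model.predict(X)
auto_enrolled = risk_score >= np.percentile(risk_score, 97)
print(f"{auto_enrolled.mean():.1%} of patients flagged for enrollment")
```

Trained this way, the model rewards anything that predicts spending, regardless of whether that spending reflects medical need.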

Obermeyer found it hard to believe that the algorithm saw these patients the same way. When he and his research team examined patients with the same algorithmic scores, they found that Black patients tended to have more chronic illnesses and biomarkers indicating poorer health. “The algorithm suggests that 18% of the high-risk patients are Black,” he says. “But what we found was that in an unbiased world, that group should have been about 50%.”

So how does it happen?

Obermeyer refers back to his ‘literal genie’ joke. When creating this particular biased algorithm, developers wanted it to predict which patients would have higher healthcare costs. This is not unreasonable, because individuals with high needs are likely to have high costs. However, as with all genie wishes, there is a catch: a seemingly healthy patient and a patient whose health is spiraling downhill can be deemed equally high risk. Patient 1 had just had a knee replacement, a procedure that incurs healthcare costs comparable to the amount Patient 2 spent on his medications, emergency room visits and hospitalization.

“This example is not the fault of the algorithm,” says Obermeyer. “It has done its job well, predicting healthcare costs extremely accurately. The problem is what we asked the algorithm to do. Healthcare cost should not be a proxy for identifying at-risk patients.”

Obermeyer and his research team contacted the company that made the algorithm. The company replicated the results on data from 3.7 million patients and expressed interest in working with the research team to fix the problem.

In the revised version, the algorithm predicts healthcare needs rather than healthcare costs. Although healthcare need is a harder target to measure, the additional work made the new algorithm 84% less biased. Obermeyer stresses that this is not affirmative action: the algorithm itself cannot differentiate between White and Black patients.

“All of the differential results concerning race come from the outcomes that we ask the algorithm to predict,” he says. “If we ask the algorithm to predict healthcare cost, it naturally introduces a racial bias in favor of people who have more access and receive better treatment from the healthcare system. Instead, when we ask the algorithm to focus on needs, it will redirect all its resources to people who need help. Those people happen to be disproportionately Black and poor. We did not build those people into the algorithm. We are just helping the algorithm to do what we wanted it to do.”
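The fix can be sketched in the same spirit: keep the features and the model, change only the label. In the sketch below, the measure of need (a count of active chronic conditions in the following year) and the synthetic data are assumptions for illustration; the source does not say exactly which need variable the revised algorithm predicts.

```python
# Illustrative sketch only: same features, same model, different label.
# Swapping the target from next-year cost to a measure of health need
# changes who gets flagged, even though race is never an input.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 10_000
X = pd.DataFrame({
    "age": rng.integers(18, 90, n),
    "n_chronic_conditions": rng.poisson(1.5, n),
    "er_visits_last_year": rng.poisson(0.3, n),
    "had_elective_surgery": rng.integers(0, 2, n),
})

# Label 1: dollars spent next year (driven partly by elective procedures).
cost_label = (
    2_000 * X["n_chronic_conditions"]
    + 25_000 * X["had_elective_surgery"]
    + rng.normal(0, 3_000, n)
)
# Label 2: illness burden next year (assumed here to be the count of
# active chronic conditions) -- a proxy for need rather than spending.
need_label = X["n_chronic_conditions"] + rng.poisson(0.2, n)

flagged = {}
for name, y in [("cost", cost_label), ("need", need_label)]:
    score = GradientBoostingRegressor().fit(X, y).predict(X)
    flagged[name] = score >= np.percentile(score, 97)  # illustrative cutoff

# The feature set is identical and contains no race variable; the flagged
# cohorts differ only because the question asked of the algorithm changed.
print("Flagged on cost but not need:", int((flagged["cost"] & ~flagged["need"]).sum()))
print("Flagged on need but not cost:", int((flagged["need"] & ~flagged["cost"]).sum()))
```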

Obermeyer’s research opened a door for him to broaden his efforts. He is now helping organizations fix their algorithmic bias and designing new algorithms to fight bias in healthcare.

“I remember that during my residency, there was a senior doctor who happened to be Black. He pointed out that the patients receiving care in the hallway were all Black, while the patients receiving care in the rooms were all White. Nobody else had noticed this. So, at times, bias can be subtle. This is what makes this field so exciting for me. All these small things, like making more datasets freely available and asking the right questions when training algorithms, make a huge difference that will impact people’s lives tremendously.”