There has been a lot of talk about the potential for AI in health, but most of the studies so far have been stand-ins for the actual practice of medicine: simulated scenarios that predict what the impact of AI could be in medical settings.
But in one of the first real-world tests of an AI tool working side by side with clinicians in Kenya, researchers showed that AI can reduce medical errors by as much as 16%.
In a study available on OpenAI.com that is being submitted to a scientific journal, researchers at OpenAI and Penda Health, a network of primary care clinics operating in Nairobi, found that an AI tool can provide a powerful assist to busy clinicians who can’t be expected to know everything about every medical condition. Penda Health employs clinicians who are trained for four years in basic health care: the equivalent of physician assistants in the U.S. The health group, which operates 16 primary care clinics in Nairobi, Kenya, has its own guidelines for helping clinicians navigate symptoms, diagnoses, and treatments, and relies on national guidelines as well. But the span of knowledge required is challenging for any practitioner.
That’s where AI comes in. “We feel it acutely because we take care of such a broad range of people and conditions,” says Dr. Robert Korom, chief medical officer at Penda. “So one of the biggest things is the breadth of the tool.”
Previously, Korom says, he and his colleague Dr. Sarah Kiptinness, head of medical services, had to create separate guidelines for each scenario that clinicians might commonly encounter—for example, guides for uncomplicated malaria cases, for malaria cases in adults, or for situations in which patients have low platelet counts. AI is ideal for amassing all of this knowledge and dispensing it in the right context at the right time.
Korom and his team built the first versions of the AI tool as a basic shadow for the clinician. If the clinician had a question about what diagnosis to provide or what treatment protocol to follow, he or she could hit a button that would pull a block of related text collated by the AI system to help the decision-making. But the clinicians were only using the feature in about half of visits, says Korom, because they didn’t always have time to read the text, or because they often felt they didn’t need the added guidance.
So Penda built an improved version of the tool, called AI Consult, which runs silently in the background of visits, essentially shadowing the clinicians’ decisions and prompting them only if they take questionable or inappropriate actions, such as overprescribing antibiotics.
“It’s like having an expert there,” says Korom—similar to how a senior attending physician reviews the care plan of a medical resident. “In some ways, that’s how [this AI tool] is functioning. It’s a safety net—it’s not dictating what the care is, but only giving corrective nudges and feedback when it’s needed.”
Penda teamed up with OpenAI to conduct a study of AI Consult to document what impact it was having on reducing errors, both in making diagnoses and in prescribing treatments, across about 40,000 patient visits. The clinicians using the AI Consult tool reduced errors in diagnosis by 16% and treatment errors by 13% compared to Penda providers who weren’t using it, with roughly 20,000 visits in each group.
The fact that the study involved thousands of patients in a real-world setting sets a powerful precedent for how AI could be effectively used in providing and improving health care, says Dr. Isaac Kohane, professor of biomedical informatics at Harvard Medical School, who looked at the study. “We need much more of these kinds of prospective studies as opposed to the retrospective studies, where [researchers] look at big observational data sets and predict [health outcomes] using AI. This is what I was waiting for.”
Not only did the study show that AI can help reduce medical errors, and therefore improve the quality of care that patients receive, but the clinicians involved viewed the tool as a useful partner in their medical education. That came as a surprise to Karan Singhal, OpenAI’s Health AI lead, who led the study. “It was a learning tool for [those who used it] and helped them educate themselves and understand a wider breadth of care practices that they needed to know about,” says Singhal. “That was a bit of a surprise, because it wasn’t what we set out to study.”
Kiptinness says AI Consult served as an important confidence builder, helping clinicians gain experience in an efficient way. “Many of our clinicians now feel that AI Consult has to stay in order to help them have more confidence in patient care and improve the quality of care.”
Clinicians get immediate feedback through a green-, yellow-, and red-light system that evaluates their clinical actions, and the company gets automatic evaluations of clinicians’ strengths and weaknesses. “Going forward, we do want to give more individualized feedback, such as, ‘You are great at managing obstetric cases, but in pediatrics, these are the areas you should look into,'” says Kiptinness. “We have many ideas for customized training guides based on the AI feedback.”
Such co-piloting could be a practical and powerful way to start incorporating AI into the delivery of health care, especially in areas of high need and few health care professionals. The findings have “shifted what we expect as standard of care within Penda,” says Korom. “We probably wouldn’t want our clinicians to be completely without this.”
The results also set the stage for more meaningful studies of AI in health care that move the practice from theory to reality. Dr. Ethan Goh, executive director of the Stanford AI Research and Science Evaluation network and associate editor of the journal BMJ Digital Health & AI, anticipates that the study will inspire similar ones in other settings, including in the U.S. “I think that the more places that replicate such findings, the more the signal becomes real in terms of how much value [from AI-based systems] we can capture,” he says. “Maybe today we are just catching mistakes, but what if tomorrow we are able to go beyond, and AI suggests accurate plans before a doctor makes mistakes to begin with?”
Tools like AI Consult may extend access to health care even further by putting it in the hands of non-medical people such as social workers, or by providing more specialized care in areas where such expertise is unavailable. “How far can we push this?” says Korom.
The key, he says, would be to develop, as Penda did, a highly customized model that accurately incorporates the workflow of the providers and patients in a given setting. Penda’s AI Consult, for example, focused on the types of diseases most likely to occur in Kenya, and the symptoms clinicians are most likely to see. If such factors are taken into account, he says, “I think there is a lot of potential there.”