Doctors have been using A.I. tools like ChatGPT for a range of tasks, including handling tedious paperwork, predicting health problems, and even improving their bedside manner. But what about a large language model (LLM), trained on medical exams, that can help them with diagnoses? Google is hoping to take A.I. into the health care mainstream with a new, medicine-specific chatbot called Med-PaLM 2, which it has been testing since April, the Wall Street Journal reported, citing people familiar with the matter.
Med-PaLM 2 is an LLM that can answer medical questions, organize information, and synthesize various kinds of data, including images and health records, according to Google’s website. Google, also the maker of the chatbot Bard, trained Med-PaLM 2 on medical licensing exams, and unsurprisingly it is the first A.I. to have produced passing answers to U.S. Medical Licensing Examination (USMLE)–style questions. Questions in the USMLE style present a patient scenario that lists symptoms, medical history, age, and other descriptors, then ask, for example, which complication is most likely. Med-PaLM 2 was able to provide long-form answers to these questions in addition to selecting from multiple-choice options.
OpenAI’s GPT-4, the successor to the model behind ChatGPT, scored similarly to Med-PaLM 2 on medical exam questions, despite not being specifically trained on them. Neither technology, however, is yet reliable enough for high-stakes use in health care.
“I don’t feel that this kind of technology is yet at a place where I would want it in my family’s health care journey,” Greg Corrado, a senior research director who worked on Med-PaLM 2, told the Wall Street Journal.
Google is currently piloting Med-PaLM 2 at the Mayo Clinic research hospital and has not announced when the chatbot could be released to the general public. Hospitals are already using ChatGPT—and have been since almost immediately after its release—and not just for quick medical questions. Doctors are using A.I. less like an encyclopedia and more like an assistant, even asking the chatbot how to handle difficult interactions, such as interventions for patients struggling with addiction.
Using A.I. templates to communicate with patients may seem like a poor substitute for human connection, but Med-PaLM 2’s responses to medical questions were actually preferred to real doctors’ responses, according to research Google published in May. Physicians compared A.I.-generated answers with physician-written ones along nine evaluation criteria and preferred the A.I.’s answers on eight of the nine.
Even if some A.I. answers are of higher quality, a 2018 survey found that the majority of patients prioritize compassion in medical care and would pay a higher fee for a more compassionate experience. A.I. cannot itself provide compassion, but doctors’ use of it to script a better bedside manner seems to be making conversations with patients smoother and gentler.
Still, many are wary that integrating A.I. into medicine too quickly and without regulation could have disastrous consequences. A.I. is prone to “hallucinations,” in which it states false information as fact; left unchecked by a person, these errors could lead to incorrect diagnoses or treatments. A.I. also has the potential to replicate and amplify biases already ingrained in the health care system if not trained carefully. The World Health Organization released a statement in May calling for caution in introducing A.I. into medicine.
“Precipitous adoption of untested systems could lead to errors by health care workers, cause harm to patients, erode trust in A.I., and thereby undermine (or delay) the potential long-term benefits and uses of such technologies around the world,” the WHO wrote.
There’s also the question of how patient data will be used once it is fed into hospital A.I. systems. Neither Google nor Microsoft trained its models on patient data, but individual hospitals could train their A.I. on patient data in the future. Google has already started using patient data from Mayo Clinic’s Minnesota headquarters for specific projects.
Patient data would generally be encrypted and inaccessible to the company, Google said, but the tech giant has caused controversy with its use of health care data in the past. In 2019, Google launched an initiative called “Project Nightingale,” in which it collected medical data from millions of Americans across 21 states without their consent. The data included patient names and other identifying information, diagnoses, lab results, and records, and Google used it internally, without doctors’ or patients’ knowledge, to provide a service to a business partner under a Business Associate Agreement.
“Careful consideration will need to be given to the ethical deployment of this technology including rigorous quality assessment when used in different clinical settings and guardrails to mitigate against overreliance on the output of a medical assistant,” Google wrote in its report on Med-PaLM.
Google did not respond to Fortune’s request for comment.