When a patient first comes through the doors of an emergency room, Ontario physician Dr. Nour Khatib says it can be a puzzle determining a diagnosis, course of treatment and what they might need to be sent home safely.
Khatib, who works in ERs at Oak Valley Health’s Markham Stouffville and Uxbridge Hospitals, is like many other physicians who are increasingly relying on artificial intelligence to make that process more efficient.
“It’s just another tool to help us give the patient the highest quality care possible,” she said.
A new study published Thursday in the journal Science may be a further step toward that.
The study looked at the emergency room performance of large language models (LLMs), which can analyze huge amounts of online information to generate human-like responses. It found that LLMs could diagnose cases as well as, or even better than, actual doctors.
But even as the technology develops, Khatib and other physicians — including this study’s author — insist that computers won’t replace the eyes, ears and skills of a trained emergency medical professional.
How can AI be used in emergency rooms?
Khatib has already been working with AI scribes, which transcribe exchanges between doctors and patients and create detailed medical notes. It's a pilot project with Oak Valley Health, done with prior consent from patients.
She says hospitals are also exploring AI-assisted self-scheduling, as well as chatbots that can help patients better understand specific illnesses.
The LLM used as part of the recent study is a specialized type known as a reasoning model, which is trained to solve complex tasks by explaining its thinking before giving a final answer. It’s already becoming “commonplace” in U.S. hospitals, says lead author Dr. Adam Rodman, a physician at Beth Israel Deaconess Medical Center in Boston.
“A reasoning model is different from your standard large language model because it has been instructed to think out loud, to solve problems like humans,” he told CBC News.
When you look at how these “reasoners” make a diagnosis, he says, it’s similar to the steps a doctor would have taken to solve a problem.
“Getting a model to think in this way,” he says, “it improves the diagnostic accuracy.”
How was AI put to the test?
The researchers carried out several trials with both real patient cases and synthetic cases using “unstructured” data from the records of an emergency department, in an effort to “mirror the high-stakes decisions” that doctors and nurses make in the ER.
They used OpenAI’s o1-preview model at a Boston emergency room during three points of patient interaction: initial triage, doctor examination in the ER and admission to the medical floor or intensive care unit. The research relied only on records data; none of the testing involved actual doctor-patient interactions, and it had no effect on real diagnoses or treatments.
With the real patient cases, Rodman says the model was asked at each stage a very narrow set of questions focused on the presentation of symptoms to produce the “most likely” diagnosis.
With the synthetic cases, he explains, the tool was also asked about the reasoning for its output as well as next steps in patient management.
Overall, Rodman’s study found that the model identified the exact diagnosis or a very close one, at times surpassing the performance of the physicians who participated in the trial at each stage of care.
“It doesn’t mean that computers can do medicine, but within this narrow task it can solve diagnoses better than humans,” says Rodman.
What could this mean for doctors and patients?
Dr. Amol Verma, an internal medicine physician and scientist at Toronto’s St. Michael’s Hospital, sees how good AI tools have become at answering medical questions and diagnosing patient cases.
But he says it’s a “false comparison” to say they are “better than doctors.”
“I don’t know a single doctor who makes all of their decisions based purely on text information,” he said.
It’s the physical examination — how someone looks, sounds and feels — that forms a diagnosis, he says.
Khatib echoes that, offering the example of a recent emergency room patient she treated.
She says the information obtained from the patient during triage provided details about symptoms aligned with an existing disease.
But her understanding of the patient’s condition changed when she listened with her stethoscope — something AI isn’t going to do.
It’s also not going to intubate a patient in an ER or put a cast on an injured limb, she says.
What challenges, concerns still exist?
Rodman admits there are limitations to his study and that more work is needed to understand how humans and machines can collaborate effectively in an emergency medical environment.
But he believes this is a first step, even though more "robust" studies and clinical trials will be needed to ensure real-world efficacy and safety.
Verma not only wants to see further evaluation of reasoning models in ERs, but also in Canadian settings.
OpenAI is an American company — something he says he finds concerning with regard to the privacy of patient information — and the study relied on a model trained on U.S. data from a largely privatized health-care system.
“It may not apply to the Canadian context,” he said.
Although this study helps make the case that a reasoning model can, in some cases, be effective at diagnosing ER patients, Khatib says any exploration of AI in hospital settings must be done responsibly, with the right people using it safely, securely and accurately.
“We are dealing with AI by putting guardrails first,” she said. “We’re not chasing AI headlines first.”




