The Next Leap for AI Scribes Provides Eyes in the Clinic

Vision-enabled artificial intelligence (AI) medical scribes could increase the accuracy of patient notes and save valuable time for clinicians

The introduction of vision-enabled artificial intelligence (AI) to medical scribes – the recording devices used by doctors to document meetings with patients in real-time – could increase the accuracy of patient notes and save valuable time for clinicians.

A Flinders University study, published in npj Digital Medicine, has found that AI medical scribes already reduce some of the administrative work that takes time away from patients, and that these devices can do more when fitted with visual recording apparatus.

Researchers from Flinders’ College of Medicine and Public Health found that a vision-enabled AI scribe, employing a combination of Google’s Gemini model and Ray-Ban Meta smart glasses, substantially improved the documentation accuracy of pharmacist-patient consultations and reduced omissions and errors in clinical notes.

“AI scribes are already helping clinicians by listening to consultations, but healthcare involves far more than spoken words,” says research author Bradley Menz, an academic pharmacist in Flinders’ College of Medicine and Public Health.

“A lot of clinically important information is visual. Important visual cues during consultations include patients’ medicine containers, prescriptions and devices, as well as their body language. When an AI system can use both what it hears and what it sees in these consultations, it captures more of the details that matter for patient care.”

In the study, 10 clinical pharmacists recorded 110 ‘mock’ medication-history interviews, which contained more than 100 different medicine containers, including tablets, capsules, injections and creams.

Researchers wore Ray-Ban Meta smart glasses to record the interviews before passing the video footage through to the AI scribe, which was developed using Google’s Gemini AI model.

An AI scribe that analysed both video and audio achieved 98% accuracy, compared with 81% when the same system processed only audio information.

A significant benefit was capturing medication strength and form, which are crucial details for safe dosing. The AI scribe with video input captured this information 97% of the time, compared with 28% for audio-only recordings.

“This is an augmented tool, not a replacement for clinical judgement,” says Mr Menz. “The clinician still needs to review and sign off the document.

“The AI scribe can include a verification step, take screenshots of medication packages, and generate a full spoken transcript, giving the health professional a much stronger basis for checking what the AI has produced.”

Senior author, Associate Professor Ashley Hopkins, says the study may point to the next stage of AI scribe usage in health care.

“AI scribes have gained traction because they reduce the burden of documentation and give clinicians more time with their patients. These findings suggest that the next step – when the scribe can see as well as hear – produces a more accurate and complete draft,” says Associate Professor Hopkins. “This means less time editing AI-documentation and even more time focusing on patient care.

“These findings suggest the next step may be that all scribe systems can interpret visual information as well as speech, which could open the door to wider clinical uses.”

The authors say the study has some limitations and underlines the need for human oversight and careful governance before these tools are adopted more broadly. The paper also highlights privacy, consent, data security and workflow integration as important issues that will need to be addressed as vision-enabled AI scribes move closer to practice.

Source: Flinders University
