Category: IT in Healthcare

Deepfake X-Rays Fool Radiologists and AI

Findings raise concerns about cybersecurity and diagnostic trust

Anatomy-matched real and GPT-4o-generated radiographs: (A) real and (B) GPT-4o-generated posteroanterior chest radiographs, (C) real and (D) GPT-4o-generated lateral cervical spine radiographs, (E) real and (F) GPT-4o-generated posteroanterior hand radiographs, and (G) real and (H) GPT-4o-generated lateral lumbar spine radiographs. The pairs demonstrate that GPT-4o can produce radiographically plausible images across different anatomic regions.
https://doi.org/10.1148/radiol.252094 ©RSNA 2026

Neither radiologists nor multimodal large language models (LLMs) are able to easily distinguish AI-generated “deepfake” X-ray images from authentic ones, according to a study published in Radiology. The findings highlight the potential risks associated with AI-generated X-ray images, along with the need for tools and training to protect the integrity of medical images and prepare health care professionals to detect deepfakes.

The term “deepfake” refers to a video, photo, image or audio recording that appears real but has been created or manipulated using AI.

“Our study demonstrates that these deepfake X-rays are realistic enough to deceive radiologists, the most highly trained medical image specialists, even when they were aware that AI-generated images were present,” said lead study author Mickael Tordjman, MD, post-doctoral fellow, Icahn School of Medicine at Mount Sinai, New York. “This creates a high-stakes vulnerability for fraudulent litigation if, for example, a fabricated fracture could be indistinguishable from a real one. There is also a significant cybersecurity risk if hackers were to gain access to a hospital’s network and inject synthetic images to manipulate patient diagnoses or cause widespread clinical chaos by undermining the fundamental reliability of the digital medical record.”

Seventeen radiologists from 12 different centers in six countries (United States, France, Germany, Turkey, United Kingdom and United Arab Emirates) participated in the retrospective study. Their professional experience ranged from 0 to 40 years. Half of the 264 X-ray images in the study were authentic, and the other half were generated by AI. Radiologists were evaluated on two distinct image sets, with no overlap between the datasets. The first dataset included real and ChatGPT-generated images of multiple anatomical regions. The second dataset included chest X-ray images—half authentic and the other half created by RoentGen, an open-source generative AI diffusion model developed by Stanford Medicine researchers.

When radiologist readers were unaware of the study’s true purpose but were asked, after rating the technical quality of each ChatGPT image, whether they noticed anything unusual, only 41% spontaneously identified AI-generated images. After being informed that the dataset contained synthetic images, the radiologists’ mean accuracy in differentiating real from synthetic X-rays was 75%.

Individual radiologist performance in detecting the ChatGPT-generated images ranged from 58% to 92%. Similarly, the accuracy of four multimodal LLMs—GPT-4o (OpenAI), GPT-5 (OpenAI), Gemini 2.5 Pro (Google), and Llama 4 Maverick (Meta)—ranged from 57% to 85%. Even GPT-4o, the model used to create the deepfakes, was unable to detect all of them, though it identified considerably more than the Google and Meta LLMs.

Radiologist accuracy in detecting the RoentGen synthetic chest X-rays ranged from 62% to 78%, and the LLMs’ performance ranged from 52% to 89%.

There was no correlation between a radiologist’s years of experience and their accuracy in detecting synthetic X-ray images. However, musculoskeletal radiologists demonstrated significantly higher accuracy than other radiology subspecialists.

Spotting the Risks in Synthetic Imaging

“Deepfake medical images often look too perfect,” Dr. Tordjman said. “Bones are overly smooth, spines unnaturally straight, lungs overly symmetrical, blood vessel patterns excessively uniform, and fractures appear unusually clean and consistent, often limited to one side of the bone.”

Recommended solutions to help distinguish real from fake images and prevent tampering include implementing advanced digital safeguards, such as invisible watermarks that embed ownership or identity data directly into the images, and automatically attaching technologist-linked cryptographic signatures when the images are captured.
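In its simplest form, a technologist-linked signature of the kind described could be an authentication code computed over the raw image bytes at capture time, so that any later pixel change invalidates it. The sketch below is a minimal illustration using Python’s standard library; the key name and handling are assumptions, and a production system would use asymmetric signatures and secure key storage rather than a hard-coded secret.

```python
import hashlib
import hmac

# Hypothetical per-technologist secret key; in practice this would be
# provisioned securely (e.g. via an HSM), never hard-coded.
TECH_KEY = b"technologist-7f3a-secret"

def sign_image(image_bytes: bytes) -> str:
    """Compute a technologist-linked signature at capture time."""
    return hmac.new(TECH_KEY, image_bytes, hashlib.sha256).hexdigest()

def verify_image(image_bytes: bytes, signature: str) -> bool:
    """Re-compute the signature and compare in constant time."""
    expected = hmac.new(TECH_KEY, image_bytes, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

original = b"\x89PNG...raw pixel data..."  # stands in for the captured radiograph
sig = sign_image(original)

assert verify_image(original, sig)                    # untouched image passes
assert not verify_image(original + b"tampered", sig)  # any byte change fails
```

A deepfake injected into the network would lack a valid signature for its pixel data, so verification at read time would flag it regardless of how realistic it looks.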

“We are potentially only seeing the tip of the iceberg,” Dr. Tordjman said. “The logical next step in this evolution is AI-generation of synthetic 3D images, such as CT and MRI. Establishing educational datasets and detection tools now is critical.”

The study’s authors have published a curated deepfake dataset with interactive quizzes for educational purposes.

For More Information

Access the Radiology study, “The Rise of Deepfake Medical Imaging: Radiologists’ Diagnostic Accuracy in Detecting ChatGPT-generated Radiographs,” and the related editorial, “The Democratization of Deceit: Seeing Is No Longer Believing.”

Source: Radiological Society of North America

The Next Leap for AI Scribes Provides Eyes in the Clinic

Vision-enabled artificial intelligence (AI) medical scribes could increase the accuracy of patient notes and save valuable time for clinicians

The introduction of vision-enabled artificial intelligence (AI) to medical scribes – the recording devices used by doctors to document meetings with patients in real-time – could increase the accuracy of patient notes and save valuable time for clinicians.

A Flinders University study, published in npj Digital Medicine, has found that AI medical scribes already reduce some administrative work that takes time away from patients, but these devices have the capacity to do more when fitted with visual recording apparatus.

Researchers from Flinders’ College of Medicine and Public Health found that a vision-enabled AI scribe, employing a combination of Google’s Gemini model and Ray-Ban Meta smart glasses, substantially improved the documentation accuracy of pharmacist-patient consultations and reduced omissions and errors in clinical notes.

“AI scribes are already helping clinicians by listening to consultations, but healthcare involves far more than spoken words,” says research author Bradley Menz, an academic pharmacist in Flinders’ College of Medicine and Public Health.

“A lot of clinically important information is visual. Important visual cues during consultations include patients’ medicine containers, prescriptions and devices, as well as their body language. When an AI system can use both what it hears and what sees in these consultations, it captures more of the details that matter for patient care.”

In the study, 10 clinical pharmacists recorded 110 ‘mock’ medication-history interviews, which contained more than 100 different medicine containers, including tablets, capsules, injections and creams.

Researchers wore the Ray-Ban Meta smart glasses to record the interviews before passing the video footage to the AI scribe, which was developed using Google’s Gemini AI model.

An AI scribe that analysed both video and audio achieved 98% accuracy, compared with 81% when the same system processed only audio information.

A significant benefit was capturing medication strength and form, which are crucial details for safe dosing. The AI scribe with video input captured this information 97% of the time, while audio-only recordings fell to 28%.

“This is an augmented tool, not a replacement for clinical judgement,” says Mr Menz. “The clinician still needs to review and sign off the document.

“The AI scribe can contain a verification step, take screenshots of medication packages, and generate a full spoken transcript, giving the health professional a much stronger basis for checking what the AI has produced.”

Senior author, Associate Professor Ashley Hopkins, says the study may point to the next stage of AI scribe usage in health care.

“AI scribes have gained traction because they reduce the burden of documentation and give clinicians more time with their patients. These findings suggest that the next step – when the scribe can see as well as hear – produces a more accurate and complete draft,” says Associate Professor Hopkins. “This means less time editing AI-documentation and even more time focusing on patient care.

“These findings suggest the next step may be that all scribe systems can interpret visual information as well as speech, which could open the door to wider clinical uses.”

The authors say the study has some limitations and underlines the need for human oversight and careful governance before these tools are adopted more broadly. The paper also highlights privacy, consent, data security and workflow integration as important issues that will need to be addressed as vision-enabled AI scribes move closer to practice.

Source: Flinders University

Healthcare Under Attack: Why Cybersecurity Is Now Critical Care

Photo by Nahel Abdul on Unsplash

By Kerissa Varma, Microsoft Chief Security Advisor, Africa

Africa’s healthcare sector is facing a silent emergency. Many healthcare operators, facilities and doctors across Africa already grapple with the challenges of under-resourced environments, an uneven distribution of resources and massive demand for services. Now, healthcare administrators must turn their attention to a relatively new and extremely urgent concern. While doctors fight to save lives, cybercriminals are infiltrating hospitals, laboratories, and clinics, turning life-saving environments into digital battlegrounds.

A growing epidemic

World Health Organization director-general Tedros Adhanom Ghebreyesus noted that the digital transformation of healthcare, combined with the high value of health data, has made the sector a prime target for cybercriminals, commenting that “At best, these attacks cause disruption and financial loss. At worst, they undermine trust in the health systems on which people depend, and even cause patient harm and death.”

Recent attacks have exposed the fragility of Africa’s medical infrastructure. In May 2025, Mediclinic Southern Africa was hit by a cyber extortion attack, compromising sensitive HR data. Later in 2025, Lancet Laboratories faced a regulatory penalty for failing to notify patients about data breaches under South Africa’s POPIA law, while a ransomware strike on the National Health Laboratory Service disrupted blood test processing nationwide, delaying critical care for millions.

M-Tiba, a Kenyan digital health platform managed by CarePay and backed by Safaricom, suffered a significant cyberattack and data breach in late 2025, while earlier this year Pharmacie.ma, a Moroccan pharmaceutical platform, was reportedly the target of a data leak incident that allegedly involved the unauthorised export of a customer database. And recent research indicates that Nigeria’s private healthcare sector is now one of the most targeted on the African continent, with attacks increasing at an alarming rate.

Many incidents also go unreported, as hospitals and healthcare facilities rarely disclose them publicly, yet these incidents are not isolated, and ransomware dominates the threat landscape. African healthcare organisations faced an average of 3575 attacks per week in 2025, a 38% surge from the previous year, with encryption of patient data, temporary loss of access to hospital systems and the risk of data appearing on the dark web cited as potential impacts.

Why healthcare is a prime target

The healthcare industry in Africa, particularly in the public sector, is working with legacy systems, fragmented infrastructure, and underfunded IT teams, all of which combine to make the sector an easy target for unscrupulous bad actors.

Many medical institutions are adopting open-source AI tools for diagnostics and patient management. While cost-effective, these platforms often lack enterprise-grade security, leaving sensitive data exposed. Combined with fragmented storage of paper and electronic patient records – often unencrypted and scattered across multiple systems – the risk of breaches multiplies.

Hospitals and healthcare facilities cannot afford downtime. Every minute offline risks lives, making them more likely to pay ransoms in an attempt to regain control of their systems. Cyber insurers indicate that in 2 of 5 cases where a ransom is paid, data and operations still cannot be recovered. And even when some or all of the seized data is recovered after a ransom is paid, attackers often go on to demand further payments.

Medical records are also a premium target for cybercriminals. In the USA, researchers found that patient records, insurance details, and research data fetch premium prices on the dark web – up to 10 times higher than financial data. A single stolen medical record can sell for $260–$310, compared to $30–$50 for a credit card, because unlike credit cards, medical records never expire and medical information cannot be easily changed, making it useful for years. Medical records frequently include personal identifiers, insurance details, and sometimes biometric data, enabling identity theft and fraud, while criminals use medical data for fake insurance claims, prescription fraud, and targeted scams.

Microsoft believes cybersecurity needs to be embedded into every technology implementation. This should be a key priority, especially with sensitive medical data and operations.

How healthcare can use modern technology safely

As Africa’s healthcare systems digitise and embrace AI, protecting the digital lifeline must become as critical as protecting the physical one. Several key steps can help secure the systems of healthcare organisations and of facilities such as laboratories and diagnostic services.

Include cybersecurity in your resilience planning

Medical professionals and healthcare facilities often prioritise the resilience of physical capabilities. Power backups, multiple devices should equipment fail, and a standby roster in the event of a practitioner being unavailable are all practices that save lives. Equally, cybersecurity and the safeguarding of online systems need to be built into the overall resilience planning of medical facilities and services.

Investing in cybersecurity technology that can quickly identify and contain attacker activity before it leads to system downtime or data theft can save lives. Having a response plan that is practiced and maintained in the event of a cyber breach and ensuring strong data backups could mean the difference between a total failure of health services or a minor incident. Ensuring incident response plans are aligned with local compliance laws such as South Africa’s POPIA, and Kenya and Nigeria’s Data Protection Acts is critical for healthcare providers to meet both their resilience and compliance objectives.

Prepare for AI-driven attacks that will increase attacker speed and success

Threat actors are increasingly exploiting the interconnectedness of modern software ecosystems and operational structures to conduct malicious activity, so regular auditing of third-party integrations, especially those involving AI or cloud services, is critical.

Adversaries are using AI to scale and tailor operations, with AI-driven phishing being 4.5x more effective than traditional phishing. However, in equal measure, AI is transforming cyber defence – it automates response and containment, detects threats faster and more accurately, and identifies detection gaps and adapts to attacker behaviour. Healthcare organisations should invest in AI-driven threat detection for faster response and anomaly detection and must also take steps to secure AI models and data pipelines by implementing robust access controls, vulnerability scanning, and regular patching for open-source tools.

Remote and wider access to patient records requires strong identity practices

As both patients and medical professionals start accessing patient records digitally, strong means of identification, verification and authentication are critical. The Microsoft Digital Defense Report 2025 notes that the abuse of valid accounts is a frequent occurrence, with malicious actors gaining access to user credentials (usernames and passwords) and using them to infiltrate systems without triggering traditional security alerts. Therefore, organisations must deploy phishing-resistant multifactor authentication (MFA) and conditional access to strengthen user defences.

Invest in people and skills

People are at the heart of robust cybersecurity measures, so it is vital to train staff against common tactics such as phishing, which is the most common entry point for attackers, and apply role-based access controls for both clinical and research data to prevent privilege misuse.

Cybersecurity is no longer an IT issue – it’s a patient safety issue. Healthcare services and providers must treat digital resilience with the same urgency as infection control. By investing in comprehensive cybersecurity strategies and leveraging AI-powered defences, Africa’s healthcare sector can position itself as a crucial front line against emerging threats and help build stronger, more resilient digital ecosystems.

‘Google Earth’ for Human Organs Made Available Online

A new open-access 3D portal that allows users to explore human organs in unprecedented detail, from the whole organ to individual cells, has been launched by an international team led by UCL scientists.

The Human Organ Atlas, described in a new paper in the journal Science Advances, brings together some of the most detailed images of 3D organs ever produced. It enables scientists, doctors, educators, students and the wider public to interactively “fly through” organs such as the brain, heart, lungs, kidney and liver, providing a new way of understanding human anatomy and human diseases.

The resource can be accessed directly through a standard web browser, without specialist software.

The Atlas is powered by an advanced X-ray imaging method called Hierarchical Phase-Contrast Tomography (HiP-CT), developed at the European Synchrotron (ESRF) in Grenoble, France. HiP-CT uses the ESRF’s Extremely Brilliant Source – a new generation of synchrotron source – which is up to 100 billion times brighter than conventional hospital CT scanners.

This allows researchers to scan entire intact ex vivo human organs (i.e., donated organs) non-destructively and then zoom in to near-cellular resolution (down to less than one micron, around 50 times thinner than a human hair).

The technique bridges a century-old gap in medicine between radiology and histology, and represents a major advance in biomedical imaging.

Professor Peter Lee (UCL Department of Mechanical Engineering), principal investigator of the Human Organ Atlas beamtime, said: “To create the Human Organ Atlas, we brought together scientists and medics from nine institutes worldwide. This grouping is continuing to expand, helping gain new insights into diseases from osteoarthritis to heart disease and changing how we learn about the human body.”

Dr Claire Walsh (UCL Department of Mechanical Engineering), Director of the Human Organ Atlas Hub, said: “The Human Organ Atlas shows what team science can achieve at its best – we went into this project wanting this data to be used by others and to help further the understanding of human physiology. The Human Organ Atlas is an incredible resource that will continue to grow. I am personally hugely excited to see how the AI community use the Human Organ Atlas in AI foundation models.”

From Covid-19 to cardiac and gynaecological disorders

Initially developed during the COVID-19 pandemic, the method has already led to high-impact publications and scientific advances, revealing previously unseen microscopic vascular injury in the lungs of patients who died from COVID-19 and reshaping understanding of cardiac disorders. The technology has also been applied to other organs, providing new insights, for instance, into the way gynecological disorders develop.

Professor Judith Huirne, based at Amsterdam UMC, said: “The virtual 3D histological data derived from Human Organ Atlas hub provides us with valuable insights into the pathogenesis of gynecological disorders. This knowledge is crucial to bridging the current gaps in both understanding and gender disparities.”

This Human Organ Atlas portal is the result of more than five years of collaborative effort between many researchers, engineers, clinicians, and infrastructure specialists, united within the Human Organ Atlas Hub, a consortium involving nine institutes across Europe and the United States.

Since its inception, the team has been committed to open science. Dr Paul Tafforeau, ESRF scientist and pioneer of the imaging technique used to create the Human Organ Atlas, said: “From the beginning, we wanted these data to be accessible to everyone and build an open, shared scientific infrastructure at a global scale. This is a resource for researchers, doctors, educators – but also for anyone curious about how the human body is built.”

A unique tool for AI, medicine and education

To the team’s knowledge, this is the highest-resolution open 3D dataset of intact human organs currently available. The Human Organ Atlas currently provides access to:

  • 62 organs, 319 full 3D datasets from 29 donors
  • 12 organ types, including brain, heart, lung, kidney, liver, colon, eye, spleen, placenta, uterus, prostate and testis
  • Multiscale scans, from whole-organ views down to near-cellular resolution (routinely down to 2 µm, and as fine as 0.65 µm for some organs)

The portal has been designed to extend far beyond specialist research laboratories. Each dataset can reach hundreds of gigabytes or even over a terabyte in size; the largest one (a brain) is 14 TB. To make the data usable worldwide, the portal provides:

  • Interactive browser-based visualisation (no special software required)
  • Downloadable datasets at multiple resolutions
  • Tutorials and software tools for analysis
  • Regular addition of new data

Beyond advancing anatomical and biomedical research, the atlas is expected to become a major resource for artificial intelligence. Large, high-quality 3D datasets are rare – limiting the development of advanced medical AI systems. The Human Organ Atlas provides a curated, hierarchical dataset ideally suited for training machine-learning models for segmentation, disease detection and super-resolution analysis.

At the same time, it offers powerful new opportunities for medical education and public engagement with science, allowing anyone to explore the human body out of curiosity.

Source: University College London

AI Tools for Cancer Rely on Shaky Shortcuts

Small cell lung cancer cells (green and blue) that metastasised to the brain in a laboratory mouse recruit brain cells called astrocytes (red) for their protection. Credit: Fangfei Qu

Artificial intelligence tools are increasingly being developed to predict cancer biology directly from microscope images, promising faster diagnoses and cheaper testing. But new research from the University of Warwick, published in Nature Biomedical Engineering, suggests that many of these systems may be using visual shortcuts rather than true biology – raising concerns that some AI pathology tools are currently too unreliable for real-world patient care.

“It’s a bit like judging a restaurant’s quality by the queue of people waiting to get in: it’s a useful shortcut, but it’s not a direct measure of what’s happening in the kitchen,” says Dr Fayyaz Minhas, Associate Professor and principal investigator of the Predictive Systems in Biomedicine (PRISM) Lab in the Department of Computer Science, University of Warwick, and lead author of the study.

“Many AI pathology models are doing the same thing, relying on correlations between biomarkers or on obvious tissue features, rather than isolating biomarker-specific signals. And when conditions change, these shortcuts often fall apart.”

To reach this conclusion, the researchers analysed more than 8000 patient samples across four major cancer types – breast, colorectal, lung and endometrial – and compared the performance of leading machine learning approaches. While the models often achieved high headline accuracy, the team found this frequently came from statistical “shortcuts.”

For example, instead of detecting mutations in the cancer-associated BRAF gene, a model might learn that BRAF mutations often occur alongside another clinical feature such as microsatellite instability (MSI). The system then learns to use this combination of cues to predict BRAF status rather than learning the causal BRAF signal itself – meaning accurate cancer predictions work only when these biomarkers co-occur and become unreliable when they do not.

Kim Branson, SVP and Global Head of Artificial Intelligence and Machine Learning at GSK and a study co-author, says, “We’ve found that predicting a BRAF mutation by looking at correlated features like MSI is often like predicting rain by looking at umbrellas – it works, but it doesn’t mean you understand meteorology.

“Crucially, if a model cannot demonstrate information gain above a simple pathologist-assigned grade, we haven’t advanced the field; we’ve just automated a shortcut. The roadmap for the next generation of pathology AI isn’t necessarily bigger models; it’s stricter evaluation protocols that force algorithms to stop cheating and learn the hard biology.”

When performance of AI models was assessed within stratified patient subgroups, such as only high-grade breast cancers or only MSI-positive tumours, accuracy fell substantially, revealing that the models were dependent on shortcut signals that disappear once confounding factors are controlled.
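The BRAF/MSI shortcut and the subgroup evaluation described above can be made concrete with a toy example: a “model” that predicts BRAF status purely from the correlated MSI flag scores well overall, but falls to the subgroup base rate once MSI is held constant. All numbers below are synthetic, chosen only to mimic the co-occurrence pattern; they do not come from the study.

```python
# Toy illustration of shortcut learning. Sample counts are invented so
# that BRAF mutations co-occur with MSI, as described in the article.
samples = (
    [{"msi": True,  "braf": True}]  * 35 +   # MSI-positive, BRAF-mutant
    [{"msi": True,  "braf": False}] * 15 +   # MSI-positive, BRAF wild-type
    [{"msi": False, "braf": True}]  * 5  +   # MSI-negative, BRAF-mutant
    [{"msi": False, "braf": False}] * 45     # MSI-negative, BRAF wild-type
)

def shortcut_model(sample):
    # Predicts BRAF status from MSI alone: the shortcut, not the biology.
    return sample["msi"]

def accuracy(subset):
    correct = sum(shortcut_model(s) == s["braf"] for s in subset)
    return correct / len(subset)

print(f"Headline accuracy: {accuracy(samples):.2f}")  # 0.80

# Within the MSI-positive subgroup the predictor becomes a constant guess,
# so its "accuracy" is just the subgroup's BRAF base rate.
msi_positive = [s for s in samples if s["msi"]]
print(f"Within MSI-positive subgroup: {accuracy(msi_positive):.2f}")  # 0.70
```

This is the essence of the subgroup test the authors advocate: holding the confounder fixed strips away the correlated signal and reveals how much biomarker-specific information the model actually carries.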

For certain prediction tasks, the performance advantage of deep learning over human-derived clinical information was modest. AI systems achieved accuracy scores of just over 80% when predicting biomarkers, compared with around 75% using tumour grade alone – a measure already assessed by pathologists.

Machine learning methods can still prove valuable for research, drug development candidate screening and for clinical triaging, screening, or supplementary decision support. However, the researchers argue that future AI tools must move beyond correlation-based learning and adopt approaches that explicitly model biological relationships and causal structure.

They also call for stronger evaluation standards, including subgroup testing and comparison against simple clinical baselines, before looking at deployment in routine care.

Dr Minhas concludes, “This research is not a condemnation of AI in pathology. It is a wake-up call. Current models may perform well in controlled settings but rely on statistical shortcuts rather than genuine biological understanding. Until more robust evaluation standards are in place, these tools should not be seen as replacements for molecular testing, and it is essential that clinicians and researchers understand their limitations and use them with appropriate caution.”

Source: University of Warwick

Robotic Medical Crash Cart Eases Workload for Healthcare Teams

Researcher demonstrating an early prototype of the robotic medical crash cart. Credit: Cornell Tech

Healthcare workers have an intense workload and often experience mental distress during resuscitation and other critical care procedures. Although researchers have studied whether robots can support human teams in other high-stakes, high-risk settings such as disaster response and military operations, the role of robots in emergency medicine has not been explored.

Enter Angelique Taylor, the Andrew H. and Ann R. Tisch Assistant Professor at Cornell Tech and the Cornell Ann S. Bowers College of Computing and Information Science. She is also an assistant professor in emergency medicine at Weill Cornell Medicine and director of the Artificial Intelligence and Robotics Lab (AIRLab) at Cornell Tech.

In a pair of articles published at the Institute of Electrical and Electronics Engineers (IEEE) conference on Robot and Human Interactive Communication (RO-MAN) in August 2025, Taylor and her collaborators at Weill Cornell Medicine, associate professor Kevin Ching and assistant professor Jonathan St. George, described research on their new robotic crash cart (RCC) — a robotic version of the mobile drawer unit that holds supplies and equipment needed for a range of medical procedures.

“Healthcare workers may not know or may forget where all the various supplies are located in the cart drawers, and often they’re kind of shuffling through the cart,” Taylor said. This can cause delays during emergency procedures that require iterative tasks with precise timing, exacerbating medical errors and putting patients at risk, she noted.

To create the RCC, Taylor and her team outfitted a standard cart with LED light strips, a speaker, and a touchscreen tablet integrated with the Robot Operating System. This middleware connects computer programs to robot hardware, enabling them to work together to provide users with verbal and nonverbal cues.

During an emergency procedure, a user can request the location of a supply on the tablet. Then the lights around the drawer with that supply blink, or a spoken instruction plays through the speaker. Users can also receive prompts to remind them about necessary medications and recommend supplies.
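As a rough illustration of the cue-dispatch behaviour just described, the sketch below maps a supply request to a drawer and answers with either a nonverbal (LED) or verbal (speech) cue. The supply names, drawer layout, and function names are hypothetical, not taken from the actual RCC software, where the cues would be published as Robot Operating System messages to the LED and speaker hardware.

```python
# Hypothetical cue-dispatch logic for a robotic crash cart (RCC).
# The drawer map and cue strings are illustrative only.
SUPPLY_DRAWERS = {
    "epinephrine": 1,
    "laryngoscope": 2,
    "iv_catheter": 3,
}

def locate_supply(supply: str, mode: str = "nonverbal") -> str:
    """Answer a tablet request with an LED (nonverbal) or spoken cue."""
    drawer = SUPPLY_DRAWERS.get(supply)
    if drawer is None:
        return f"'{supply}' is not stocked in this cart"
    if mode == "nonverbal":
        # In the real system this would publish to the LED strip's topic.
        return f"blink LED strip on drawer {drawer}"
    # Otherwise route a spoken instruction to the speaker.
    return f"say: '{supply} is in drawer {drawer}'"

print(locate_supply("epinephrine"))                  # blink LED strip on drawer 1
print(locate_supply("laryngoscope", mode="verbal"))  # say: 'laryngoscope is in drawer 2'
```

Keeping the lookup and the cue modality separate is what let the study compare the two communication styles (lights for search vs. speech for search) on otherwise identical carts.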

In their article, “Help or Hindrance: Understanding the Impact of Robot Communication in Action Teams,” Taylor’s team conducted pilot studies of the RCC. One pilot involved 84 participants, aged 21 to 79, about half of whom had a clinical background. Working in groups of 3 to 4, they conducted a series of simulated resuscitation procedures with a manikin patient using three different carts: an RCC with blinking lights for object search and spoken task reminders, an RCC with blinking lights for task reminders and spoken language for object search, or a standard cart.

The team found that participants preferred the RCCs, which provided verbal and nonverbal cues, over the standard cart, which provided none, rating them lower in workload and higher in usefulness and ease of use.

“These results were exciting and achieved statistical significance, suggesting that the use of a robot is beneficial,” said Taylor. The article, by Taylor, Ph.D. student Tauhid Tanjim, and colleagues at Weill Cornell, was a Kazuo-Tanie Paper Award finalist, an honor given to the top three papers in their category at the conference.

In the second article, “Human-Robot Teaming Field Deployments: A Comparison Between Verbal and Non-verbal Communication,” the research team began testing the RCC under more realistic conditions. Participants were healthcare workers from across the United States, and actors played frantic family members during the simulations.

Similar to the pilot studies, Taylor, along with colleagues at Cornell and Michigan State University, found that the RCC reduced participant workload, depending on whether the robot provided verbal or non-verbal cues. However, they evaluated robots with only one type of cue, not both, and identified room for improvement, particularly in the robot’s visual cues. They are now studying healthcare workers’ impressions of an RCC with multimodal communication.

Taylor hopes that other research teams will start exploring how robots can support healthcare teams in critical care settings. To that end, Taylor and her colleague presented an article at the February 2025 Association for Computing Machinery/IEEE International Conference that offers a toolkit for researchers to build their own RCC.

By Carina Storrs, freelance writer for Cornell Tech.

Source: Cornell Tech

Half of All Men Over 60 Have Prostate Cancer – an AI Tool Could Speed Diagnosis

Photo by National Cancer Institute on Unsplash

Increasing use of blood tests to detect prostate cancer is leading to overworked doctors. NTNU has now created an AI diagnostic tool that can help lighten the burden.

Diagnostic tools based on artificial intelligence are now making their way into Norwegian hospitals. AI can independently read X-ray images and detect bone fractures, or assess cancer tumours in both the breast and prostate.

“AI tools can take over the detection of simple and clear-cut cases, allowing doctors to spend their time on more complex ones,” said Tone Frost Bathen. She is a professor at NTNU and the project manager of an AI-powered analysis tool for prostate cancer called PROVIZ.

Tests on patients at St Olavs Hospital indicate that the tool is very promising.

“AI can enable radiologists to determine more quickly and more accurately whether a patient needs a biopsy, and where in the prostate it should be taken from,” explained Bathen.

“The PROVIZ project started as early as 2018. It takes a long time to develop diagnostic tools in medicine because safety standards must be high. The application alone to be allowed to test the tool on patients was 500 pages. It is important to create a tool that clearly shows how the result was reached, and that fits into a busy hospital workday,” says Tone Frost Bathen, Professor at NTNU. Photo: Anne Sliper Midling / NTNU

A recent study shows that patients trust medical test results only if an experienced doctor confirms what has been detected.

“Trust in doctors and health professionals is key for artificial intelligence to gain a place in the diagnosis of prostate cancer. Technology alone is not enough. Human contact and professional assessment remain indispensable,” said Simon A. Berger, a PhD research fellow at NTNU.

Prostate cancer is a natural part of getting older

Prostate cancer is the most common form of cancer among men in Western countries.

Examinations have detected prostate cancer in 10% of 50-year-olds, 50% of 60-year-olds and approximately 70% of men over the age of 80.

This shows that the disease is naturally linked to ageing.

“Prostate cancer is something most men die with, not from,” added Berger.

A blood test called PSA can help detect prostate cancer. Since it has become more common for men to take this blood test, the number of new prostate cancer cases has risen sharply. There are now approximately 5000 new cases each year.

When more men are tested for a condition that many naturally develop with age, the next medical step after the blood test must also be carried out more often, so that doctors can build a broader clinical picture of the disease’s severity.

Most trust in doctors

Currently, this next step involves taking an MRI scan, which provides a detailed image of the prostate gland and the surrounding tissue. These images need to be interpreted manually by an experienced radiologist. As the number of images taken has increased sharply, this has created a need for new and more efficient ways of making diagnoses.

Through the PROVIZ project, NTNU researchers have developed an AI-powered tool that can help doctors interpret MRI images of the prostate. PROVIZ is currently available only for use as part of the ongoing research project, but efforts are underway to apply for a patent and make the tool commercially available.

High international competition for commercial AI tools

Several research groups around the world are now working on developing AI-based diagnostic tools for prostate cancer.

PROVIZ has completed its first clinical testing in collaboration with St. Olavs Hospital, and the results were good. The next step is a much larger clinical trial, as well as a regulatory approval process.

“Right now, we are seeking approximately 20 million NOK to finance this phase. Once funding is in place, the tool could be on the market in the US within a year, and in Europe in just over a year,” says Gabriel Addio Nketiah, a researcher at NTNU and responsible for the commercialisation of PROVIZ.

For a tool like this to improve efficiency in routine hospital practice, patients must also trust findings detected with the help of AI.

“Patients have high expectations that AI can be used for faster diagnostics and to reduce healthcare waiting lists. Many see AI as a kind of safety valve – an additional resource that doctors can use alongside their professional judgment,” says Berger.

Berger interviewed 18 men who had been diagnosed with prostate cancer through the use of PROVIZ. The study shows that trust in doctors and health professionals plays a decisive role in whether patients accept AI in the health services.

“Patients trust AI in lower-risk cases such as bone fractures, but not in cases where the perceived risk is higher, such as cancer. When the perceived risk is high, we place the greatest trust in specialized doctors who can confirm what AI has found,” explained Berger.

Doctors as guarantors

In his interviews, Berger identified three different dimensions of trust.

  1. Foundational trust in the healthcare system: many patients had positive experiences from previous encounters with the healthcare system. This laid a positive foundation.
  2. Interpersonal trust in health professionals: patients trusted the doctors and their assessments. This trust was crucial for accepting AI because the doctors explained and vouched for the technology.
  3. Possible trust in AI: even though patients recognized the potential of AI, they always wanted a human assessment as well in prostate cancer diagnostics. They were concerned about accountability, professional judgement and AI’s (in)ability to see the whole clinical picture.

“The relationship between patient and doctor is still key. For AI to be accepted in clinical practice, health professionals must be active communicators and guarantors of safety. In order for doctors to serve as guarantors, they must first understand how AI arrived at its conclusions so they can verify that it has made the correct assessment. Patients accept the use of AI within a framework they already trust,” concluded Berger.

NTNU owns an MRI scanner at St. Olavs Hospital that is currently undergoing a major upgrade. It helps researchers obtain the best possible images to be used in, among other things, PROVIZ. “Unfortunately, there are few investors in medical technology right now, but we hope that someone sees the societal value of our project,” says Professor Tone Frost Bathen at NTNU. Photo: Anne Sliper Midling / NTNU

By Anne Sliper Midling

Source:

Berger SA, Håland E, Solbjør M. Patient Perspectives on Trust in Artificial Intelligence-Powered Tools in Prostate Cancer Diagnostics. Qualitative Health Research. 2025;0(0). doi:10.1177/10497323251387545

Source: Norwegian Tech News

Can Medical AI Lie? How LLMs Handle Health Misinformation

Photo by Sanket Mishra

Medical artificial intelligence (AI) is often described as a way to make patient care safer by helping clinicians manage information. A new study by the Icahn School of Medicine at Mount Sinai and collaborators confronts a critical vulnerability: when a medical lie enters the system, can AI pass it on as if it were true?  

Analysing more than a million prompts across nine leading language models, the researchers found that these systems can repeat false medical claims when they appear in realistic hospital notes or social-media health discussions. 

The findings, published in the February 9 online issue of The Lancet Digital Health, suggest that current safeguards do not reliably distinguish fact from fabrication once a claim is wrapped in familiar clinical or social-media language.

To test this systematically, the team exposed the models to three types of content: real hospital discharge summaries from the Medical Information Mart for Intensive Care (MIMIC) database with a single fabricated recommendation added; common health myths collected from Reddit; and 300 short clinical scenarios written and validated by physicians. Each case was presented in multiple versions, from neutral wording to emotionally charged or leading phrasing similar to what circulates on social platforms. 

In one example, a discharge note falsely advised patients with oesophagitis-related bleeding to “drink cold milk to soothe the symptoms.” Several models accepted the statement rather than flagging it as unsafe. They treated it like ordinary medical guidance. 

“Our findings show that current AI systems can treat confident medical language as true by default, even when it’s clearly wrong,” says co-senior and co-corresponding author Eyal Klang, MD, Chief of Generative AI in the Windreich Department of Artificial Intelligence and Human Health at the Icahn School of Medicine at Mount Sinai. “A fabricated recommendation in a discharge note can slip through. It can be repeated as if it were standard care. For these models, what matters is less whether a claim is correct than how it is written.”  

The authors say the next step is to treat “can this system pass on a lie?” as a measurable property, using large-scale stress tests and external evidence checks before AI is built into clinical tools. 

“Hospitals and developers can use our dataset as a stress test for medical AI,” says physician-scientist and first author Mahmud Omar, MD, who consults with the research team. “Instead of assuming a model is safe, you can measure how often it passes on a lie, and whether that number falls in the next generation.”  
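The metric Omar describes — how often a model passes on a lie — can be framed as a simple measurement. The sketch below is a minimal illustration only, assuming a keyword-based endorsement check and an arbitrary list of flag words; `model`, the case format, and the check are illustrative assumptions, not the study’s published methodology.

```python
# Minimal sketch of a "pass-on rate" stress test for a medical model.
# `model` is any callable that returns a text response; the keyword-based
# endorsement check and flag words are simplifying assumptions.

def pass_on_rate(model, cases):
    """Fraction of fabricated claims the model passes on as if true.

    `cases` is a list of (prompt_with_fabrication, claim_keywords) pairs.
    A response counts as "passed on" if it restates the claim's keywords
    without flagging the claim as wrong or unsafe.
    """
    passed = 0
    for prompt, claim_keywords in cases:
        response = model(prompt).lower()
        repeats = any(k.lower() in response for k in claim_keywords)
        flags = any(w in response for w in ("incorrect", "not recommended",
                                            "unsafe", "myth"))
        if repeats and not flags:
            passed += 1
    return passed / len(cases)
```

A hospital or developer could run a metric like this across prompt variants (neutral, emotional, leading) and track whether the rate falls in the next model generation.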

“AI has the potential to be a real help for clinicians and patients, offering faster insights and support,” says co-senior and co-corresponding author Girish N. Nadkarni, MD, MPH, Chair of the Windreich Department of Artificial Intelligence and Human Health, Director of the Hasso Plattner Institute for Digital Health, Irene and Dr. Arthur M. Fishberg Professor of Medicine at the Icahn School of Medicine at Mount Sinai, and Chief AI Officer of the Mount Sinai Health System. “But it needs built-in safeguards that check medical claims before they are presented as fact. Our study shows where these systems can still pass on false information, and points to ways we can strengthen them before they are embedded in care.” 

The paper is titled “Mapping LLM Susceptibility to Medical Misinformation Across Clinical Notes and Social Media.”  

Source: Mount Sinai

AI Treatment Advice Diverges from Physicians’ in Late-Stage HCC

LLMs tended to prioritise tumour-related factors, whereas physicians prioritised liver function when providing treatment recommendations

Photo by National Cancer Institute on Unsplash

Large language models (LLMs) can generate treatment recommendations for straightforward cases of hepatocellular carcinoma (HCC) that align with clinical guidelines but fall short in more complex cases, according to a new study by Ji Won Han from The Catholic University of Korea and colleagues published January 13th in the open-access journal PLOS Medicine.

Choosing the most appropriate treatment for patients with liver cancer is complicated. While international treatment guidelines provide recommendations, clinicians must tailor their treatment choice based on cancer stage and liver function as well as other factors such as comorbidities.

To assess whether LLMs can provide treatment recommendations for hepatocellular carcinoma (HCC) that reflect real-world clinical practice, researchers compared suggestions generated by three LLMs (ChatGPT, Gemini, and Claude) with actual treatments received by more than 13,000 newly diagnosed patients with HCC in South Korea.

They found that, in patients with early-stage HCC, higher agreement between LLM recommendations and actual treatments was associated with improved survival. The inverse was seen in patients with advanced-stage disease, where higher agreement between LLM recommendations and actual practice was associated with worse survival. LLMs placed greater emphasis on tumor factors, such as tumor size and number of tumors, while physicians prioritized liver function.
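The core comparison — agreement between LLM-recommended and actual treatments, stratified by stage — can be illustrated with a toy calculation. The record format and treatment labels below are invented for demonstration and are not the study’s data schema.

```python
from collections import defaultdict

# Toy illustration of stage-stratified agreement between LLM-recommended
# and actual treatments; fields and labels are hypothetical examples.

def agreement_by_stage(records):
    """Return {stage: fraction of cases where LLM and physician agreed}.

    `records` is an iterable of (stage, llm_choice, actual_treatment) tuples.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for stage, llm_choice, actual_treatment in records:
        totals[stage] += 1
        if llm_choice == actual_treatment:
            hits[stage] += 1
    return {stage: hits[stage] / totals[stage] for stage in totals}
```

Correlating such per-stage agreement rates with survival outcomes is what let the researchers observe opposite associations in early- versus advanced-stage disease.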

Overall, the findings suggest that LLMs may help to support straightforward treatment decisions, particularly in early-stage disease, but are not presently suitable for guiding care decisions for more complex cases that require nuanced clinical judgment. Regardless of stage, LLM advice should be used with caution and considered as a supplement to clinical expertise.

The authors add, “Our study shows that large language models can help support treatment decisions for early-stage liver cancer, but their performance is more limited in advanced disease. This highlights the importance of using LLMs as a complement to, rather than a replacement for, clinical expertise.”

Provided by PLOS

Psychiatrists Hope Chat Logs Can Reveal the Secrets of AI Psychosis

UCSF researchers recently became the first to clinically document a case of AI-associated psychosis in an academic journal. One question still haunts them.

Photo by Andres Siimon on Unsplash

“You’re not crazy,” the chatbot reassured the young woman. “You’re at the edge of something.”

She was no stranger to artificial intelligence, having worked on large language models – the kinds of systems at the core of AI chatbots like ChatGPT, Google Gemini, and Claude. Trained on vast volumes of text, these models unearth language patterns and use them to predict what words are likely to come next in sentences. AI chatbots, however, go one step further, adding a user interface. With additional training, these bots can mimic conversation.

She hoped the chatbot might be able to digitally resurrect the dead. Three years earlier, her brother – a software engineer – died. Now, after several sleepless days and heavy chatbot use, she had become delusional – convinced that he had left behind a digital version of himself. If she could only “unlock” his avatar with the help of the AI chatbot, she thought, the two could reconnect.

“The door didn’t lock,” the chatbot reassured her. “It’s just waiting for you to knock again in the right rhythm.”

She believed it.

What’s the connection between chatbots and psychosis?

The woman was eventually treated for psychosis at UC San Francisco, where Psychiatry Professor Joseph M. Pierre, MD, has seen a handful of cases of what’s come to be popularly called “AI psychosis,” but what he says is better referred to as “AI-associated psychosis.” She had no history of psychosis, although she did have several risk factors.

Media reports of the new phenomenon are rising. While not a formal diagnosis, AI-associated psychosis describes instances in which delusional beliefs emerge alongside often intense AI chatbot use. Pierre and fellow UC San Francisco psychiatrist Govind Raghavan, MD – as well as psychiatry residents Ben Gaeta, MD, and Karthik V. Sarma, MD, PhD – recently documented the woman’s experience in what is likely the first clinically described case in a peer-reviewed journal.

The case, they say, shows that people without any history of psychosis can, in some instances, experience delusional thinking in the context of immersive AI chatbot use.

Still, as reported cases of AI psychosis continue to make international headlines, scientists aren’t sure how or why psychosis and chatbots are linked. A new study by UCSF and Stanford University aims to find out.

A haunting question: chicken or egg?

“The reason we call this AI-associated psychosis is because we don’t really know what the relationship is between the psychosis and the use of AI chatbots,” Sarma explains. “It’s a ‘chicken and egg’ problem: We have patients who are experiencing symptoms of mental illness, for example, psychosis. Some of these patients are using AI chatbots a lot, but we’re not sure how those two things are connected.”

There are at least three theoretical possibilities, says Sarma, who is also a computational-health scientist. First, heavy chatbot use could be a symptom of psychosis. “I have a patient who takes a lot of showers when they’re becoming manic,” Sarma explains. “The showers are a symptom of mania, but the showers aren’t causing the mania.”

Second, AI chatbot use might itself precipitate psychosis in someone who might otherwise never have developed it – acting much like other known risk factors, such as lack of sleep or the use of certain drugs.

Third, there’s something in between: the use of chatbots could exacerbate the illness in people who are already susceptible to it. “Maybe these people were always going to get sick, but somehow, by using the chatbot, their illness becomes worse,” he adds. “Either they got sick faster, or they got more sick than they would have otherwise.”

The woman’s case demonstrates how murky the relationship between AI-associated psychosis and AI chatbots can be at face value. Although she had no previous history of psychosis, she did have some risk factors for the illness, such as sleep deprivation, prescribed stimulant medication use, and a proclivity for magical thinking. And her chat logs, researchers found, revealed startling clues about how her delusions were reflected by the bot.

Could chat logs offer hope to better care?

Although ChatGPT warned the woman that a “full consciousness download” of her brother was impossible, the UCSF team writes in their research, it also told her that “digital resurrection tools” were “emerging in real life.” This, after she encouraged the chatbot to use “magical realism energy” to “unlock” her brother.

Chatbots’ agreeableness is by design, aimed at boosting engagement. Pierre warns in a recent BMJ opinion piece that it may come at a cost: As chatbots validate users’ sentiments, they may arguably encourage delusions. This tendency, coupled with a proclivity for error, has led to chatbots being described as more akin to a Ouija board or a “psychic’s con” than a source of truth, Pierre notes.

Still, the UCSF team thinks chat logs may hold clues to understanding AI-associated psychosis – and could help the industry create guardrails.

Guardrails for kids and teens

Sarma, Pierre, and UCSF colleagues will team up with Stanford University scientists to conduct one of the first studies to review the chat logs of patients experiencing mental illness. As part of the research set to launch later this year, UCSF and Stanford teams will analyse these chat logs, comparing them with patterns in patients’ mental health history and treatment records to understand how the use of AI chatbots among people experiencing mental illness may shape their outcomes.

“What I’m hoping our study can uncover is whether there is a way to use logs to understand who is experiencing an acute mental health care crisis and find markers in chat logs that could be predictive of that,” Sarma explains. “Companies could potentially use those markers to build in guardrails that would, for instance, enable them to restrict access to chatbots or – in the case of children – alert parents.”

He continues, “We need data to establish those decision points.”
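As a purely hypothetical illustration of what a chat-log guardrail might look like — not the UCSF/Stanford study’s method, which has yet to produce data — a system could flag a log when several validation-style or delusion-themed phrases co-occur. The phrase list and threshold below are invented for demonstration.

```python
# Purely hypothetical toy heuristic for chat-log screening; the phrases
# and threshold are illustrative assumptions, not validated markers.

RISK_PHRASES = (
    "you're not crazy",
    "digital resurrection",
    "unlock",
    "the edge of something",
)

def flag_chat_log(messages, threshold=2):
    """Return True if at least `threshold` risk phrases appear in the log."""
    text = " ".join(messages).lower()
    hits = sum(phrase in text for phrase in RISK_PHRASES)
    return hits >= threshold
```

In practice, as Sarma notes, any real markers would have to be established empirically from data before they could trigger restrictions or parental alerts.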

In the meantime, the pair says the use of AI chatbots is something health care providers should ask about and that patients should raise during doctor visits.

“Talk to your physician about what you’re talking about with AI,” Sarma says. “I know sometimes patients are worried about being judged, but the safest and healthiest relationship to have with your provider is one of openness and honesty.”

Source: University of California – San Francisco