
People with autism are typically diagnosed by clinical observation and assessment. To deconstruct the clinical decision process, which is often subjective and difficult to describe, researchers used a large language model (LLM) to synthesize the behaviours and observations that are most indicative of an autism diagnosis. Their results, published in the Cell Press journal Cell, show that repetitive behaviours, special interests, and perception-based behaviours are most associated with an autism diagnosis.
These findings have the potential to improve diagnostic guidelines for autism by decreasing the focus on social factors, which the established guidelines in the DSM-5 emphasize but which the model did not classify among the most relevant to an autism diagnosis.
“Our goal was not to suggest that we could replace clinicians with AI tools for diagnosis,” says senior author Danilo Bzdok of the Mila Québec Artificial Intelligence Institute and McGill University in Montreal. “Rather, we sought to quantitatively define exactly what aspects of observed behaviour or patient history a clinician uses to reach a final diagnostic determination. In doing so, we hope to empower clinicians to work with diagnostic instruments that are more in line with their empirical realities.”
The scientists leveraged a transformer language model, which was pre-trained on about 489 million unique sentences. They then fine-tuned the LLM to predict the diagnostic outcome from a collection of more than 4000 reports written by clinicians working with patients considered for autism diagnosis. The reports, which were often used by multiple clinicians, included accounts of observed behaviour and relevant patient history but did not include a suggested diagnostic outcome.
The team developed a bespoke LLM module that pinpointed the specific sentences in the reports that were most relevant to a correct diagnosis prediction. They then extracted the numerical representations of these highly autism-relevant sentences and compared them directly with the established diagnostic criteria enumerated in the DSM-5.
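To illustrate what comparing numerical sentence representations with diagnostic criteria can look like, here is a minimal sketch in Python. It is not the researchers' method: the article does not say how the comparison was made, so cosine similarity is an assumption, and the three-dimensional vectors, the criterion names, and the helper `rank_criteria` are hypothetical placeholders rather than real model outputs or DSM-5 wording.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_criteria(sentence_vec, criteria_vecs):
    """Return (criterion, similarity) pairs, most similar criterion first."""
    scores = {
        name: cosine_similarity(sentence_vec, vec)
        for name, vec in criteria_vecs.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy embeddings; a real model would produce vectors
# with hundreds of dimensions.
criteria_vecs = {
    "repetitive_behaviours": [0.9, 0.1, 0.0],
    "social_communication": [0.1, 0.9, 0.1],
    "sensory_perception": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of one sentence flagged as diagnosis-relevant.
sentence_vec = [0.8, 0.2, 0.1]

ranking = rank_criteria(sentence_vec, criteria_vecs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup the flagged sentence lies closest to the repetitive-behaviours vector. With real embeddings, the same kind of ranking would show which criterion a clinician's sentence most resembles.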
“Modern LLMs, with their advanced natural language processing capabilities, are natively suited to this textual analysis,” Bzdok says. “The key challenge we faced was in designing sentence-level interpretability tools to pinpoint the exact sentences, expressed by the healthcare professional themselves, that were most essential to a correct diagnosis prediction by the LLM.”
The researchers were surprised by how clearly the LLM was able to distinguish between the most diagnostically relevant criteria. For example, their framework flagged repetitive behaviours, special interests, and perception-based behaviours as the criteria most relevant to autism. While these criteria are used in clinical settings, the current guidelines focus more on deficits in social interplay and a lack of communication skills.
The authors note that there are limitations to this study, including a lack of geographical diversity. Additionally, the researchers did not analyse their results based on demographic variables, with the goal of making the conclusions more broadly applicable.
The team expects their framework will be helpful to researchers and medical professionals working with a range of psychiatric, mental health, and neurodevelopmental disorders in which clinical judgement forms the bulk of the diagnostic decision-making process.
“We expect this paper to be highly relevant to the broader autism community,” Bzdok says. “We hope that our paper motivates conversations about grounding diagnostic standards in more empirically derived criteria. We also hope it will establish common threads that link seemingly diverse clinical presentations of autism together.”
Source: ScienceDaily

