UNICA IRIS Institutional Research Information System

Foundation models meet multimodal neuroimaging: A generative transformer-based framework for Alzheimer’s disease diagnosis

Zedda, Luca;Loddo, Andrea;Di Ruberto, Cecilia

2026-01-01

Abstract

Incomplete neuroimaging data remain a major challenge in Alzheimer’s disease diagnosis, as many patients undergo only a subset of the recommended imaging protocols. This work addresses the limitation with a generative transformer-based framework designed to support multimodal analysis when modalities are missing. We systematically investigate multimodal performance and fairness within a unified foundation model framework for Alzheimer’s disease classification. The framework combines structural MRI, DTI, and PET data and uses ControlNet-based diffusion models to synthesize anatomically consistent surrogate modalities when data are unavailable. These synthetic images are used exclusively as a training-time augmentation strategy for incomplete-modality settings, rather than as replacements for clinical acquisitions. Vision transformers adapted via Low-Rank Adaptation are employed for efficient feature extraction, while clinical variables are integrated through a dedicated projection module. Experimental results show that a transformer-based fusion head can improve upon simple aggregation strategies in some complex multimodal settings, achieving an F1-score of 57.8% in multiclass classification when combined with generative augmentation and clinical data. However, these benefits are not uniform: strong unimodal volumetric PET baselines remain superior in the best-case binary setting, and the effect of generative augmentation is strongly configuration-dependent, with some settings benefiting while others degrade substantially under non-selective synthetic augmentation.
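The abstract mentions vision transformers adapted via Low-Rank Adaptation. As a rough illustration of that general technique only, and not of the authors' implementation, the sketch below applies a low-rank update to a single frozen linear layer in plain Python. The class name `LoRALinear`, the layer sizes, the rank, the `alpha` scaling value, and the random initialization are all assumptions chosen for the example; none of them come from the article.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A.

    Hypothetical illustration: sizes, rank and alpha are not from the article.
    """

    def __init__(self, d_in, d_out, rank=4, alpha=8.0, seed=0):
        rng = random.Random(seed)
        # Stand-in for a frozen pretrained weight (random, for illustration).
        self.W = [[rng.gauss(0.0, 0.02) for _ in range(d_in)]
                  for _ in range(d_out)]
        # Trainable factors: A is small and random, B starts at zero, so the
        # adapted layer initially reproduces the frozen layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)]
                  for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path: W x
        update = matvec(self.B, matvec(self.A, x))  # low-rank path: B (A x)
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_parameters(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def frozen_parameters(self):
        return len(self.W) * len(self.W[0])


layer = LoRALinear(d_in=16, d_out=16, rank=2)
x = [1.0] * 16
y = layer.forward(x)
```

In this toy configuration only the two small factors would be trained (64 values) while the 256 values of the frozen weight stay fixed, which is the source of the efficiency the abstract alludes to.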

Year of publication: 2026

Keywords: Multimodal learning; Foundation models; Intelligent decision support; Neurodegenerative diseases; Alzheimer’s disease; Diffusion models; ADNI

Type: 1.1 Journal article

Files in this item:

File	Size	Format
2026_Neurocomputing_Foundation models meet multimodal neuroimaging.pdf (open access; description: full article; type: publisher's version, VoR)	3.71 MB	Adobe PDF

The metadata in IRIS UNICA are released under the Creative Commons CC0 1.0 Universal license, while the publication files are protected by copyright unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/482307
