by Paolo Soda
An artificial intelligence model capable of generating synthetic medical data that is realistic, clinically accurate and consistent, in full respect of patients' privacy: this is the promise of XGeM, the new system developed in the field of digital medicine by the College of Engineering UCBM. With its 6.77 billion parameters and over 40,000 hours of training time, the model marks a significant milestone for medical research and data ethics.
To develop effective diagnostic tools, artificial intelligence requires large amounts of clinical data. But sharing this data is often limited by regulatory constraints and the need to protect patient identities. XGeM addresses this challenge by generating synthetic radiological images and clinical reports that are indistinguishable from real ones, useful for training algorithms or simulating rare scenarios. The core technology is an approach called Latent Diffusion, which operates in a latent space shared between images and text and was trained on over 170,000 X-rays and their associated reports. The system also uses contrastive learning to align multimodal information correctly, together with a novel multi-prompt training methodology that makes it adaptable to different clinical inputs.
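To give a sense of what contrastive alignment between images and text means in practice, here is a minimal sketch of a generic, CLIP-style symmetric contrastive loss. It is not XGeM's actual training code, which the article does not describe: the function names, the toy two-dimensional embeddings and the temperature value of 0.07 are all illustrative assumptions.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Generic symmetric contrastive loss (hypothetical helper, not XGeM's code).

    image_embs[i] and text_embs[i] are assumed to be embeddings of a matched
    X-ray/report pair; every other combination is treated as a negative.
    """
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j]: scaled cosine similarity between image i and report j
    logits = [
        [sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Image-to-text direction: each image should pick out its own report.
    loss_i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: each report should pick out its own image.
    loss_t2i = -sum(
        log_softmax([logits[i][j] for i in range(n)])[j] for j in range(n)
    ) / n
    return (loss_i2t + loss_t2i) / 2

# Toy example: matched pairs point the same way, mismatched pairs do not.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy case the loss is close to zero when each image embedding lines up with its own report and much larger when the pairs are swapped. Minimizing a loss of this kind is what pulls matching images and reports together in a shared space.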
Tested on international benchmarks and subjected to a Visual Turing Test with expert radiologists, XGeM outperformed five competing models in accuracy, consistency, and realism. It is not only a demonstration of computational power, but also an invitation to collaborate: the model is open source, with code, weights, and datasets available to the scientific community.
The future potential is vast: simulating the temporal evolution of diseases, supporting new data modalities such as CT or ECG, and active learning with medical feedback. XGeM could revolutionize biomedical research, the training of healthcare professionals, and the development of new diagnostic solutions. The benefits are not only technical. Solutions like XGeM can reduce radiation exposure for doctors and patients, make diagnoses safer and more precise and, last but not least, lower the costs of diagnostic imaging technology, optimizing healthcare resources and opening up access to these tools in developing countries.
A first taste is available online through the interactive simulator (medcodim.unicampus.it), where anyone can try generating X-rays and reports. The medicine of the future already speaks the language of artificial intelligence, and XGeM could be one of its most promising voices.