March 07, 2025

Understanding rare diseases with digital twins

New EMBL-EBI project explores the use of a concept developed in aerospace engineering to support rare disease research, diagnosis, and treatment

Ellie McDonagh (left), Translational Informatics Director of Open Targets and Rahuman Sheriff (right), Senior Project Leader at EMBL-EBI. Credit: Background image EMBL, photography: Jeff Dowling/EMBL-EBI.

Rare diseases are individually uncommon, but together they affect over 300 million people worldwide. There are more than 7,000 conditions that are classed as rare diseases, and many of them don’t have approved treatments. Because each rare disease affects a relatively small number of people, diagnosis is difficult and the data are sparse.

EMBL-EBI scientists are trying to get a greater understanding of the biological mechanisms driving rare diseases by using artificial intelligence (AI) and a concept used in aerospace engineering – digital twins. The project is funded by the Chan Zuckerberg Initiative and set to run over the course of three years.

Rahuman Sheriff, Senior Project Leader at EMBL-EBI, and Ellie McDonagh, Translational Informatics Director of Open Targets, are leading this initiative, with support from the Petsalaki research group at EMBL-EBI.

Below, McDonagh and Sheriff explain their objectives, challenges, and next steps.

What are digital twins?

Rahuman Sheriff (RS): Digital twins are virtual models of real-world systems that update with live data. The concept first gained traction in industries like aerospace and automotive engineering, where manufacturers used them to monitor and improve complex machines. For example, using the digital twin of an aircraft engine, engineers can run computer simulations to test different conditions and diagnose issues to prevent failures and improve safety. Digital twins are also useful for making predictions or decisions – for example, whether to keep flying an aircraft or to ground it.

Our project aims to apply the ‘digital twins’ concept to medicine by testing if it’s possible to build digital twins for patients. Developing digital twins of patients will require integration of several cutting-edge approaches from machine learning, advanced mathematical modelling, and bioinformatic analysis of multiomics datasets from patients.

How could digital twins be used in healthcare?

Ellie McDonagh (EM): Creating a digital twin of a patient could help researchers gain a greater understanding of disease mechanisms and simulate disease trajectories and patient response to therapeutics.

Our project focuses on developing digital twins for tissues, with the purpose of understanding the differences between healthy and diseased states. We will be looking specifically at applications for rare disease research, but the models and tools that we develop should also be useful for understanding other diseases, such as cancer.

Why rare diseases?

RS: Rare diseases can be challenging for doctors to diagnose and treat, and for scientists to investigate, because each rare disease affects a small number of people. In some cases, there may only be a few known patients in a country, which makes it hard to gather enough data and develop effective treatments.

Because of the lack of data, these conditions aren’t well studied by the pharmaceutical industry. The data are often patchy, meaning that we don’t have all the data types for a patient. Our project aims to use generative AI tools to fill data gaps where possible.

For all these reasons, we need a different approach when studying rare diseases, and digital twins might be a useful tool.

What are the limitations of digital twins?

EM: Unlike engineering devices, digital twins of patients are difficult to build due to the complexity of biological systems and incomplete understanding of disease conditions.

Also, digital twins provide computational predictions that require validation in the real world. We hope that such models can unveil new insights and research avenues into the biology of disease, but these insights would always need to be verified through biological and clinical tests.

How are you going to build the rare disease digital twins?

RS: The first step is data collection, harmonised processing, and curation to ensure we have high-quality, standardised datasets. Based on this, we will begin by modelling one or two tissue types digitally.

Next, we will develop healthy digital twins for these tissues, establishing a baseline for comparison against disease states. From there, we will expand to modelling common diseases, leveraging data from larger patient cohorts to refine and validate our approach.

To build the digital twins for healthy and common disease tissues, we’re planning to use multi-omics data from public databases like the ones managed by EMBL-EBI – for example, Expression Atlas, BioStudies, DECIPHER, and PRIDE. We also hope to use controlled-access data such as those made available through the European Genome-phenome Archive, co-managed by EMBL-EBI and the Centre for Genomic Regulation in Barcelona. To enable more data access, we’re hoping to develop collaborations with rare disease consortia and patient advocacy groups.

We will run comparisons of common diseases against healthy tissues to train machine learning models that can predict dysfunction. Finally, we will use data from rare disease patients to develop the rare disease digital twins. To ensure the quality of the digital twins, during each of these steps, we will check that what the machine learning models tell us corresponds to our existing knowledge of mechanisms underlying disease.

What does success look like for your project?

EM: This is uncharted territory, so to some extent, success is creating a proof of concept – showing that these digital twins are feasible, as well as building useful datasets and models for the community.

Crucially, we want our work to be openly available for anyone to reuse and improve upon. We aim to create a public collection of mechanistic models and machine learning models that will be accessible in the BioModels public repository. We hope that the models will be reused and adapted beyond the scope of this project, by anyone interested in exploring this technology in research and in the clinic.

What are the next steps?

RS: We’re looking to partner with organisations working in the rare disease space to ensure the outcomes of this project are useful to the community. If you’re able to help, please get in touch; we would love to hear from you.

EM: We’re also currently recruiting a multi-omics data scientist (applications close on 12 March 2025) and a machine learning scientist (applications close on 16 March 2025). We encourage enthusiastic people who are keen to get involved in our project to apply! Please reach out with any questions about these roles. More roles will be advertised in the coming months.

About Chan Zuckerberg Initiative

The Chan Zuckerberg Initiative was founded in 2015 to help solve some of society’s toughest challenges — from eradicating disease and improving education, to addressing the needs of our communities. Through collaboration, providing resources and building technology, our mission is to help build a more inclusive, just and healthy future for everyone. For more information, please visit chanzuckerberg.com.

