January 23, 2025

BioChatter: making large language models accessible for biomedical research

Introducing an open-source large language model (LLM) framework designed for custom biomedical research

BioChatter: making large language models accessible for biomedical research. Credit: Karen Arnott/EMBL-EBI

Summary

  • BioChatter is an open-source Python framework for employing large language models (LLMs) in biomedical research. 
  • BioChatter can support the creation of dedicated LLM-driven solutions for biomedical use cases.
  • AI solutions like those built using BioChatter can streamline workflows for non-computational researchers, advancing areas such as personalised medicine and drug discovery.

Large language models (LLMs) have transformed how many of us work, from supporting content creation and coding to improving search engines. However, the lack of transparency, reproducibility, and customisation of LLMs remains a challenge that restricts their widespread use in biomedical research. 

What are large language models (LLMs)?

Large language models (LLMs) are artificial intelligence systems designed to process and generate human-like text by leveraging vast amounts of training data. They can perform a wide range of tasks, including text generation, language translation, summarisation, and question answering.

For biomedical researchers, optimising LLMs for a specific research question can be daunting, because it requires programming skills and machine learning expertise. Such barriers have reduced the adoption of LLMs for many research tasks, including data extraction and analysis.

A new publication in Nature Biotechnology introduces BioChatter to help overcome these limitations. BioChatter is an open-source Python framework for deploying LLMs in biomedical research, in line with open science principles. In order to address the concerns of privacy and reproducibility often associated with commercial LLMs, BioChatter offers a framework for researchers seeking transparency and flexibility in their LLM workflows.

“Large language models hold immense potential to transform biomedical research by making complex data and analysis tasks more accessible,” said Julio Saez-Rodriguez, Head of Research at EMBL’s European Bioinformatics Institute (EMBL-EBI), and Professor on leave at Heidelberg University. “However, to make the most of this technology for biomedical research, we need tools that prioritise transparency and reproducibility. BioChatter bridges this gap, allowing researchers to integrate LLM capabilities into many biomedical research tasks.”

Interfacing with biomedical knowledge graphs and software

BioChatter can be adapted to specific research areas to pull data from biomedical databases and literature. Further, instructing LLMs to use external software via the BioChatter API-calling functionality enables real-time access to up-to-date information and integration with bioinformatics tools. 

A key feature of BioChatter is its ability to integrate with BioCypher-built knowledge graphs – networks that link biomedical data such as genetic mutations, drug-disease associations, and other clinical information. These graphs help researchers analyse complex datasets, for example to identify genetic variations in disease or to understand drug mechanisms.
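
To make the idea of a knowledge graph concrete, the following is a minimal sketch in plain Python. It does not use the BioChatter or BioCypher APIs; the triples, the relation names, and the helper functions (build_graph, neighbours, drugs_for_disease_via_gene) are hypothetical placeholders chosen for illustration, and the entity names are not real biomedical data.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, relation, object).
# Placeholder names only; these are not real biomedical findings.
TRIPLES = [
    ("VARIANT_1", "located_in", "GENE_A"),
    ("GENE_A", "associated_with", "DISEASE_Y"),
    ("DRUG_X", "targets", "GENE_A"),
    ("DRUG_X", "treats", "DISEASE_Y"),
]


def build_graph(triples):
    """Index triples by subject so outgoing edges can be looked up by node."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def neighbours(graph, node, relation=None):
    """Return nodes linked from `node`, optionally filtered by relation type."""
    return [
        obj
        for rel, obj in graph.get(node, [])
        if relation is None or rel == relation
    ]


def drugs_for_disease_via_gene(graph, disease):
    """Find (drug, gene) pairs where the drug targets a gene linked to the disease."""
    hits = []
    for drug in sorted(graph):
        for gene in neighbours(graph, drug, "targets"):
            if disease in neighbours(graph, gene, "associated_with"):
                hits.append((drug, gene))
    return hits


graph = build_graph(TRIPLES)
print(drugs_for_disease_via_gene(graph, "DISEASE_Y"))
```

A two-step query like this one (drug to gene, gene to disease) is the kind of linked question that becomes easy to ask once separate data sources are connected in a single graph.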

“BioChatter is designed to lower the barriers for biomedical researchers using large language models by providing an open, transparent framework that can be adapted to different research needs,” said Sebastian Lobentanzer, Postdoctoral Researcher at the Heidelberg University Hospital and incoming Principal Investigator at Helmholtz Munich. “Our goal is to help scientists focus on their research while leaving the technical complexities to the platform.”

Real-world applications 

The next step for BioChatter is trialling its integration into life science databases. The team behind BioChatter is working closely with Open Targets, a public-private partnership that includes EMBL-EBI and uses human genetics and genomics data for systematic drug target identification and prioritisation. Integrating BioChatter into the Open Targets Platform could help streamline how users access and use biomedical data from the platform.

The team is also developing BioGather, a complementary system designed to extract information from other clinical data types, including genomics, medical notes, and images. By analysing and aligning these data types, BioGather aims to help researchers address complex problems in personalised medicine, disease modelling, and drug development.

Funding

This work was supported by funding from the European Union under grant agreement No. 101057619 and the Swiss State Secretariat for Education, Research and Innovation (SERI) under contract No. 22.00115 (JB), the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 965193 for DECIDER (JSR), awards U54AG075931 and R01DK138504 from the National Institutes of Health, and the Pelotonia Institute for Immuno-Oncology (QM).


Source article(s)

A platform for the biomedical application of large language models

Lobentanzer S., et al.

Nature Biotechnology 22 January 2025

10.1038/s41587-024-02534-3
