ML4SD: An Active Learning Framework for Boosting DBTL Cycles in Strain Design
Álvaro Gargantilla Becerra (National Centre for Biotechnology (CNB))
| Mon Sep 14, 14:00-14:30 | |
Abstract: Modern biomanufacturing relies on iterative Design-Build-Test-Learn (DBTL) cycles, yet the "Learn" phase is frequently the weakest link because predictive power for complex biological systems remains limited. Conventional approaches cannot cope with the "combinatorial explosion" of the genetic design space, particularly when engineering growth-coupled phenotypes through gene knockouts.
In this presentation, we propose ML4SD, a computational framework that integrates machine learning into the DBTL cycle to accelerate strain design. The approach uses gcSwarms, a novel binary particle swarm optimisation algorithm that generates extensive and diverse design libraries to improve machine learning model generalization. An active learning strategy that balances exploration and exploitation enables ML4SD to iteratively refine knockout recommendations whilst identifying key metabolic interventions through Shapley-based explainable AI.
ML4SD was partially validated on Nylon-6 precursor production in Pseudomonas putida, achieving a yield five times higher than that of the wild type. ML4SD has also been shown to be more data- and resource-efficient than its data source generation. Overall, this methodology establishes a robust and user-friendly platform for the autonomous development of high-performing microbial cell factories.
biochemistry, bioinformatics, systems biology, biotechnology, cheminformatics, machine learning, dynamical systems, optimization and control, biophysics
Audience: researchers in the topic
Comments: Álvaro Gargantilla Becerra is a biochemist turned bioinformatician and data scientist, specialising in systems biology. He was recently awarded his PhD in Molecular Biosciences from the Universidad Autónoma de Madrid. His academic journey began with a degree in Biochemistry, followed by a research stay in synthetic biology at Cardiff University. During his Master's programme in Industrial Biotechnology, he developed computational screening tools using genetic algorithms and neural networks. His doctoral research at CNB-CSIC focused on exploiting bacterial heterogeneity for bioprocess optimisation through machine learning (ML)-boosted DBTL cycles. This research resulted in the ML4SD framework and the gcSwarms algorithm, which have the potential to accelerate sustainable biomanufacturing through data-efficient strain design.
Seminar on Microbial Biotechnology: Developing the Conceptual Framework of the DBTL Cycle
Series comments: This seminar series focuses on developing the conceptual framework defining the Design-Build-Test-Learn (DBTL) cycle for microbial biotechnology. Bioeconomies are based on renewable land, soil and marine resources, such as crops, fish, forests and animals. Microbial biotechnology designs and builds new strains of microbes that can produce chemical building blocks, food supplements, novel medicines and fuels to replace fossil fuels and preserve biodiversity; it establishes protocols to test strain performance and develops mechanistic and machine learning models to learn their dynamic behaviour from experimental data.
Topics in the seminar series include, but are not limited to:
- Building scalable microbial Genome-Scale Metabolic Models (GSMMs);
- Developing a knowledge-based infrastructure for biotechnology;
- Developing a unified semantic ontology for microbial biotechnology;
- Developing training and community engagement in microbial biotechnology.
The seminar is scheduled to run once a month, on the second Monday, from 4:00 p.m. to 4:30 p.m. CET (please note that this webpage displays the schedule in your local time zone). Each session will feature a 15–20 minute talk followed by a short Q&A. The seminar will be followed by the ELIXIR microbial biotechnology community's monthly meeting, which runs from 4:30 p.m. to 5:00 p.m. Seminar attendees are welcome to stay, although participation in this follow-up internal meeting is not required.
Please sign up here to be kept updated on the seminar series.
We look forward to welcoming you to the seminar series and meeting you over Zoom at: elixir-europe-org.zoom.us/j/82458754139?pwd=eUt6R2gxdTZ3dU90aWFVak1SNVZtdz09.
The organizers
| Organizers: | Pablo Carbonell*, Vitor Martins dos Santos, Anil Wipat |
| *contact for this listing |
