This webpage provides an introduction to the project “From Category Theory to Enzyme Design: Unleashing the Potential of Computational Systems Chemistry”. The project is funded by the Novo Nordisk Foundation (2020-2022) as a grant under the Exploratory Interdisciplinary Synergy Programme. The project combines expertise from the Algorithmic Cheminformatics Group at IMADA, University of Southern Denmark (Daniel Merkle, Rolf Fagerberg, Jakob L. Andersen), the Fontana Lab at Harvard Medical School (Walter Fontana), and the Department of Theoretical Chemistry at the University of Vienna (Christoph Flamm).
Project Title: From Category Theory to Enzyme Design: Unleashing the Potential of Computational Systems Chemistry
Applicant: Prof. Daniel Merkle
Amount: 4.997.700 DKK
Period: 07/2020 - 12/2023
Grant Number: NNF19OC0057834
Many important questions and challenges in research, industry, and society involve large and complex networks of chemical reactions. Some examples are: understanding the regulation of metabolic networks in humans; planning and optimizing chemical synthesis in industry and research labs; modeling the fragmentation of molecular ions inside mass spectrometers; developing personalized medicine; probing hypotheses of the origins of life; and monitoring environmental pollution in air, water and soil [1,2]. In this interdisciplinary and international project we will combine existing approaches from concurrency theory (Harvard Medical School, HMS) and graph rewriting (University of Southern Denmark) in order to develop ground-breaking new computational methods for analyzing networks of chemical reactions and thereby pave the way to applying these methods in research and industry. Together with chemists (University of Vienna) we will explore the applicability based on a carefully chosen flagship project: enzyme design.
Classical approaches to synthesis typically focus on small sets of molecules and often consider one reaction at a time in a planned sequence. Systems chemistry is an emerging field that addresses the need to study networks of reactions, i.e., systems in which many distinct types of molecules can participate in many distinct chemical transformations. In such networks, reactions occur autonomously and asynchronously, without central coordination, whenever reactants are available. Systems of this kind are ubiquitous and include metabolic networks within organisms, networks of chemical production and exchange between organisms on an ecological scale, atmospheric chemistry and geochemistry, industrial production networks, and one-pot approaches to chemical synthesis. Due to the size and combinatorial complexity of these systems, it is infeasible or outright impossible to manually analyze their properties and explore their design space. The field is therefore in strong need of new computational approaches and formal analytical methods to assist in modeling and design.
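To make the idea of reactions firing "whenever reactants are available" concrete, here is a minimal Python sketch. It is an illustration only, not the project's software: molecules are plain names, the toy network and the function name `reachable_species` are hypothetical, and stoichiometry, kinetics, and molecular structure are deliberately ignored.

```python
from typing import FrozenSet, List, Set, Tuple

# A reaction is a pair (reactants, products); molecules are just names here.
Reaction = Tuple[FrozenSet[str], FrozenSet[str]]


def reachable_species(seed: Set[str], reactions: List[Reaction]) -> Set[str]:
    """Return every species that can be formed from the seed molecules.

    No schedule is imposed: a reaction contributes its products as soon as
    all of its reactants are available, and we iterate to a fixed point.
    """
    available = set(seed)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions:
            if reactants <= available and not products <= available:
                available |= products
                changed = True
    return available


# Hypothetical toy network (assumption, for illustration only).
reactions: List[Reaction] = [
    (frozenset({"A", "B"}), frozenset({"C"})),
    (frozenset({"C"}), frozenset({"D", "E"})),
    (frozenset({"E", "F"}), frozenset({"G"})),  # needs F, which is never formed
]

print(sorted(reachable_species({"A", "B"}, reactions)))
# ['A', 'B', 'C', 'D', 'E']
```

Even in this toy setting the result depends on the whole network rather than on any single reaction, which is why manual analysis stops scaling once the number of species and reactions grows.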
Any modeling approach must choose some level of abstraction, which determines the type of physical detail to be represented explicitly. For example, quantum chemistry provides a highly detailed level of resolution. While its methods are highly accurate, they are also computationally very intensive, which restricts their practical deployment to single reactions between molecules of moderate size. The analysis of several interconnected reactions, let alone of large chemical reaction networks (CRNs), is presently out of reach. It is also unclear whether the questions that arise in the context of large reaction networks necessitate such a high level of resolution. At the same time, many extant computational methods for systems chemistry and biology are formulated at the other abstraction extreme, in which the structure of molecules is represented either not at all (molecules being simply assigned proper names) or in a very rudimentary fashion that does not permit the tracking of individual atoms across a series of reactions.
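The intermediate level of abstraction can be sketched as follows: molecules are graphs whose vertices are individually numbered atoms, and a reaction is a rule that changes bond orders while leaving the atoms in place. The Python sketch below is a simplified illustration under stated assumptions; it is not the graph-rewriting formalism or software used in the project. The atom numbering, the helper names (`bond`, `apply_rule`, `valence`, `neighbours`), and the choice of example (net hydration of formaldehyde to methanediol, written as an overall atom map rather than a mechanism) are all introduced here for illustration.

```python
from typing import Dict, FrozenSet, Set

Bond = FrozenSet[int]    # an unordered pair of atom ids
Bonds = Dict[Bond, int]  # bond -> bond order


def bond(a: int, b: int) -> Bond:
    return frozenset({a, b})


# Atoms keep their identity (numeric id) throughout the reaction.
atoms = {1: "C", 2: "O", 3: "H", 4: "H", 5: "O", 6: "H", 7: "H"}

educts: Bonds = {
    bond(1, 2): 2, bond(1, 3): 1, bond(1, 4): 1,  # formaldehyde, H2C=O
    bond(5, 6): 1, bond(5, 7): 1,                 # water, H-O-H
}

# The rule lists only what changes: bond-order differences on a few atom pairs.
rule: Dict[Bond, int] = {
    bond(1, 2): -1,  # C=O becomes C-O
    bond(5, 6): -1,  # one O-H bond of water is broken
    bond(2, 6): +1,  # that hydrogen binds the former carbonyl oxygen
    bond(1, 5): +1,  # the water oxygen binds the carbon
}


def apply_rule(bonds: Bonds, changes: Dict[Bond, int]) -> Bonds:
    """Apply bond-order changes; atoms are neither created nor destroyed."""
    result = dict(bonds)
    for b, delta in changes.items():
        order = result.get(b, 0) + delta
        if order < 0:
            raise ValueError("rule does not match: bond order would be negative")
        if order == 0:
            result.pop(b, None)
        else:
            result[b] = order
    return result


def valence(atom_id: int, bonds: Bonds) -> int:
    return sum(order for b, order in bonds.items() if atom_id in b)


def neighbours(atom_id: int, bonds: Bonds) -> Set[int]:
    return {next(iter(b - {atom_id})) for b in bonds if atom_id in b}


products = apply_rule(educts, rule)

# Atom 5 (the water oxygen) can be traced into the product: it is now bonded
# to the carbon (atom 1) and to its remaining hydrogen (atom 7).
print(sorted(neighbours(5, products)))
# [1, 7]
```

A name-only model would record this step as "formaldehyde + water -> methanediol" and could not say which oxygen in the product came from water. Keeping atom identities makes that question answerable across a whole sequence of reactions, at a cost far below that of a quantum-chemical calculation.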
The proposed project builds on a new and powerful methodology that strikes a balance between chemical detail and computational efficiency. The approach lies at the intersection of classical chemistry, present-day systems chemistry and biology, computer science, and category theory. It adapts techniques from the analysis of actual (mechanistic) causality in concurrency theory to the chemical and biological setting. Because of this blend of intellectual and technical influences, we name the approach computational systems chemistry (CSC). The term “computational” emphasizes both the deployment of computational tools in the service of practical applications and the use of theoretical concepts at the foundation of computation in support of reasoning and understanding. This exploratory project aims to provide a proof of concept toward the long-term goal of tackling many significant questions in large and combinatorially complex CRNs that could not be addressed by other means. In particular, CSC shows promise for generating new technological ideas through theoretical rigor. The project is an initial step towards establishing this highly promising area, through the following specific objectives:
Integrate and unify algorithmic ideas and best practices from two existing platforms. One platform was conceived, designed, and implemented for organic chemistry by the lead PI and his group in Denmark together with the chemistry partner from the University of Vienna. The other platform draws on the theory of concurrency and was designed and implemented by the partner from Harvard Medical School and his collaborators for protein-protein interaction networks that support cellular signaling and decision-making processes. The combination is ripe with potential synergies as both platforms are formally rooted in category theory.
Demonstrate a proof-of-concept (PoC) using a biochemical driving project. The goal of this exploratory project is the analysis and design of enzymes whose catalytic site is viewed as a small (catalytic) reaction network in its own right. Such enzymes can then be used in the design of reaction networks.
Train the next generation of scientists for CSC: This will enable the transition towards a large-scale implementation of our approaches to tackle key societal challenges, such as the development of personalized medicine, the monitoring of pollution, and the achievement of a more environmentally friendly and sustainable network of industrial synthesis.
We argue that CSC is in a position today similar to where bioinformatics and computational biology were a few decades ago and that it has similarly huge potential. The long-term vision is to unleash that potential.