Case Study
MetaP – Metabolism Prediction

Summary

Metabolites can play an important role in the adverse effects of parent drugs or other xenobiotic compounds. In this case study VU (CS leader), HITeC/HHU (associate partner and implementation challenge winner), JGU, and UU have worked together on making methods and tools available for metabolite and site-of-metabolism (SOM) prediction. For that purpose we integrated and used ligand-based metabolism predictors (e.g. MetPred, enviPath, FAME, SMARTCyp) and we incorporated protein-structure and -dynamics based approaches to predict SOMs for metabolism by Cytochrome P450 enzymes (P450s). P450s metabolise ~75% of the currently marketed drugs and their active-site shape and plasticity often play an important role in determining the substrate’s SOM. It is expected that this work will be continued after the end of the project to make services available for the prediction of microbial biotransformation pathways by integrating the enviPath data and software developed in part by JGU.

During method development, model calibration and validation we used databases such as XMetDB and other open-access databases for drugs, xenobiotics and their respective metabolites. To facilitate the combined use of the metabolite prediction approaches and their outcomes, we benefited from ongoing developments in workflow management systems and we made Jupyter Notebooks available that collect and visualize results from the different available services. We illustrated the added value of having multiple predictors and our Jupyter Notebooks available in a pilot study on retrospective consensus predictions of known SOMs for drug compounds for which possible metabolite-associated toxicity was previously reported.

Objectives

The objective of this case study was to enable and facilitate metabolite prediction within the OpenRiskNet infrastructure and to evaluate and demonstrate its added value. For that purpose we integrated different tools for metabolism prediction, including tools for:

  • Ligand-based site-of-metabolism (SOM) prediction using reaction SMARTS, circular fingerprints and/or atomic reactivities;
  • QSBR (quantitative structure-biotransformation relationship) modeling of microbial biotransformation;
  • Protein-structure and -dynamics based prediction of CYP450 isoform specific binding and SOMs;
  • Predicting probabilities for specific reaction type events.

Jupyter notebooks that gather and visualize results from the available case-study services make it possible to use the tools in combination and to compare their output.

See the “Databases and tools” subsection for more details on the corresponding tools. For our comparisons of predictive (and consensus) performance we used selected compounds from the literature for which SOMs and metabolite-associated toxicity have been reported. We expect to present our results in an upcoming manuscript on tool integration, which will illustrate how using several tools together can add value over individual tools for (site-of-)metabolism prediction.

Risk assessment framework

Prediction outcomes can serve as input for other molecular structure-based AO predictors, which relates to Tier 0 (Step 1: identification of molecular structure) and Tier 1 (Step 6: mechanism of action).

Databases and tools

The table below gives an overview of the metabolite prediction tools that are integrated and have been used in this case study. During method development, model calibration, and validation, we used data from XMetDB (ref.) and other databases for drugs, xenobiotics and their respective metabolites, as available in ZINC, ChEMBL, DrugBank, EAWAG-BBD and/or the SMARTCyp and FAME suites. Integration of enviPath, a database and prediction system for microbial biotransformation of organic environmental contaminants, is still ongoing.

| Tool | Input | Output | Method |
|---|---|---|---|
| MetPred | 2D chemical structure of ligand | SOMs with reaction types for Phase I reactions | Preprocess metabolite reaction database (>100K biotransformations) using MCS. For each query compound, look up similar atom environments based on circular fingerprints and use ReactionSMARTS to identify reaction types. See (ref). |
| FAME 3 | 2D chemical structure of ligand | SOMs for Phase I, Phase II, or combined Phase I/II metabolism | Machine learning using 2D-circular-environment based atomic descriptors. See (ref). |
| SMARTCyp 2.0 | 2D chemical structure of ligand | Rank atoms (SOMs) for P450-isoform specific reactions | Combining reactivity (from database on QM calculated transition state energies) with simple 2D molecular accessibility descriptors for SOM prediction. See (ref). |
| Plasticity tools | 3D chemical structure of ligand | Prediction of most probable SOMs for P450-isoform specific reactions | Protein-structure and -dynamics based prediction of substrate binding orientations and corresponding SOM in the active site of CYP isoforms (1A2, 2D6, 3A4). See (ref). |

Technical implementation

As summarised in the table above, several services have become available in the MetaP case study. The listed services offer their functionality through RESTful APIs that are formalised according to OpenAPI specifications. The APIs are built using the Swagger toolchain, which also enables direct user interaction with the API endpoints through a browser-based user interface (the Swagger UI). In addition, MetPred and SMARTCyp offer a custom browser-based interface to their service. The APIs give access to the core features of the services as summarised above, and typically accept submissions of chemical structures in common file formats.
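As a rough illustration of what a programmatic submission to such an API could look like, the sketch below builds (but does not send) a POST request that carries a ligand structure. The base URL, the path, and the JSON field names are assumptions made for this example only; the actual endpoints and payloads are defined in each service's OpenAPI specification.

```python
import json
import urllib.request

# Hypothetical base URL: the real endpoints are defined in each service's
# OpenAPI specification and are not given in this case study.
BASE_URL = "https://metap.example.org/api"


def build_som_request(mol2_text, service="som_prediction"):
    """Build (but do not send) a POST request carrying a ligand structure.

    The payload keys "ligand_format" and "ligand" are illustrative
    assumptions, not the services' documented schema.
    """
    payload = json.dumps({"ligand_format": "mol2", "ligand": mol2_text})
    return urllib.request.Request(
        url=f"{BASE_URL}/{service}",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


# A placeholder MOL2 fragment, only to show how the payload is assembled.
request = build_som_request("@<TRIPOS>MOLECULE\nexample\n")
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) is left out on purpose, since the example URL does not point at a real service.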

API endpoint input and output data exchange is standardised to a machine-readable JSON format. Together with the OpenAPI data type definitions and JSON-LD data annotation, this ensures seamless integration of the containerised services in the OpenRiskNet infrastructure and data exchange with other services.
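To make the idea of a JSON response with JSON-LD annotation more concrete, the following sketch parses a made-up response body into per-atom propensities. The field names, the `@context` URL, and the values are all hypothetical; the real schema is given by the OpenAPI definitions of each service.

```python
import json

# Hypothetical response body. Field names, the @context URL, and all values
# are illustrative assumptions, not the services' actual schema or output.
response_text = """
{
  "@context": {"som": "https://example.org/vocab#siteOfMetabolism"},
  "service": "example-predictor",
  "predictions": [
    {"atom_index": 1, "element": "C", "propensity": 0.12},
    {"atom_index": 2, "element": "N", "propensity": 0.81}
  ]
}
"""


def parse_predictions(text):
    """Return {atom_index: propensity} from a JSON response body."""
    document = json.loads(text)
    return {p["atom_index"]: p["propensity"] for p in document["predictions"]}


propensities = parse_predictions(response_text)
print(propensities)
```

The `@context` entry is where a JSON-LD annotation would map keys to shared vocabulary terms; the parser above ignores it, since plain JSON tooling can read an annotated document unchanged.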

Service API use and interoperability of the listed services are demonstrated using a Jupyter Notebook that is freely available on GitHub. Single 3D ligand structures in Tripos MOL2 format are used as input to the various services, and the standardised JSON outputs are aggregated into a Pandas DataFrame, demonstrating interoperability. Predicted SOMs are visualized on the 2D ligand depiction using the RDKit package.
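The aggregation step can be sketched without any third-party dependency as follows. This is a minimal illustration, not the notebook's actual code: the per-service dictionaries and their values are invented, and the function name `aggregate` is hypothetical.

```python
# Hypothetical per-service results: {service name: {atom index: propensity}}.
# The values are invented for illustration and are not real predictions.
outputs = {
    "MetPred": {1: 0.10, 2: 0.80, 3: 0.60},
    "FAME 3": {1: 0.20, 2: 0.70},
    "SMARTCyp": {2: 0.90, 3: 0.40},
}


def aggregate(outputs):
    """Flatten per-service predictions into one row (dict) per atom."""
    atoms = sorted({atom for preds in outputs.values() for atom in preds})
    rows = []
    for atom in atoms:
        row = {"atom_index": atom}
        for service, preds in outputs.items():
            # None marks atoms for which a service returned no value.
            row[service] = preds.get(atom)
        rows.append(row)
    return rows


rows = aggregate(outputs)
for row in rows:
    print(row)
```

Passing `rows` to `pandas.DataFrame` would give one table row per atom and one column per service, which is the kind of layout the notebook's DataFrame aggregation suggests.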

Outcomes

In addition to the service integration of the metabolite prediction tools listed above, we have evaluated the added value of having multiple tools available and of using them in combination (via Jupyter Notebooks). The different predictors give complementary types of output. The MetPred, FAME 3, and SMARTCyp tools predict SOMs related to Phase I, Phase I/II, and Cytochrome P450 isoform specific conversion, respectively. For each (heavy) atom, a normalized propensity is written out to indicate the likelihood that the atom is a SOM. In addition, MetPred returns the most probable reaction types at predicted SOMs. Facilitated by the Jupyter Notebook that supplies and visualizes output from the different predictors (see the case study report linked below), the MetaP tools can thus help experts make decisions about metabolite formation and/or obtain input for subsequent case studies.

The added value of having multiple complementary tools available for metabolite prediction is illustrated by the Jupyter Notebook output presented in the case study report, which collects SOM predictions and MetPred predictions of Phase I reaction types (and which color-highlights atoms as predicted SOMs if their propensities are larger than a preset cutoff) for three compounds (see the Figure below). These compounds were selected because possible toxicological effects have been linked to their metabolites, and because their metabolism has been extensively studied in the literature.
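The cutoff-based highlighting and a simple form of consensus can be sketched as below. This is an assumption-laden illustration: the cutoff value, the predictor names, the propensities, and the vote-counting rule are all hypothetical, and the pilot study may have combined predictions differently.

```python
CUTOFF = 0.5  # hypothetical value; the notebook's preset cutoff is not stated here


def flagged_atoms(propensities, cutoff=CUTOFF):
    """Return the atoms whose propensity is larger than the cutoff."""
    return {atom for atom, p in propensities.items() if p > cutoff}


def consensus(outputs, min_votes=2, cutoff=CUTOFF):
    """Return atoms flagged as SOM by at least `min_votes` predictors.

    A plain vote count, used here only as an example of a consensus rule.
    """
    votes = {}
    for preds in outputs.values():
        for atom in flagged_atoms(preds, cutoff):
            votes[atom] = votes.get(atom, 0) + 1
    return sorted(atom for atom, n in votes.items() if n >= min_votes)


# Invented propensities for a three-atom toy example.
outputs = {
    "predictor_a": {1: 0.1, 2: 0.8, 3: 0.6},
    "predictor_b": {1: 0.2, 2: 0.7, 3: 0.3},
    "predictor_c": {1: 0.6, 2: 0.9, 3: 0.4},
}

print(consensus(outputs))  # → [2]
```

In this toy data only atom 2 exceeds the cutoff for more than one predictor, so it is the only consensus SOM; atoms 1 and 3 are each flagged by a single predictor.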

MetaP

Currently available services:

  • Python client for Squonk REST API
    Service type: Software
  • Chemical similarity using the Fragment Network
    Service type: Database / data source, Service
  • Predict ADME/PK with Confidence
    Service type: Application, Software, Service
  • Machine learning models for site-of-metabolism prediction
    Service type: Application, Software, Trained model, Model, Service
  • Webservice to WEKA Machine Learning Algorithms
    Service type: Trained model, Model generation tool, Model, Service
  • Interactive computing and workflows sharing
    Service type: Visualisation tool, Helper tool, Software, Analysis tool, Processing tool, Workflow tool
  • Service type: Trained model, Service
  • Service type: Application, Software, Processing tool, Trained model, Service
  • Computation research made simple and reproducible
    Service type: Database / data source, Visualisation tool, Software, Analysis tool, Service, Workflow tool

Related resources

Report
Finalization of case studies and analysis of remaining weaknesses (Deliverable 1.5)
Paul Jennings (VU), Philip Doganis, Pantelis Karatzas, Periklis Tsiros, Haralambos Sarimveis (NTUA), Lucian Farcal, Thomas Exner, Tomaz Mohoric (EwC), Atif Raza (JGU), Celine Brochot, Cleo Tebby (INERIS), Marvin Martens, Egon Willighagen, Danyel Jennen (UM)
2 Mar 2020
Abstract:
The OpenRiskNet case studies (originally outlined in Deliverable 1.3) were developed to demonstrate the modularised application of interoperable and interlinked workflows. These workflows were designed to address specific aspects required to inform on the potential of a compound to be toxic to humans and to eventually perform a risk assessment analysis. While each case study targets a specific area including data collection, kinetics modelling, omics data and Quantitative Structure Activity Relationships (QSAR), together they address a more complete risk assessment framework. Additionally, the modules here are fine-tuned for the utilisation and application of new approach methodologies (NAMs) in order to accelerate the replacement of animals in risk assessment scenarios. These case studies guided the selection of data sources and tools for integration and acted as examples to demonstrate the OpenRiskNet achievements to improve the level of the corresponding APIs with respect to harmonisation of the API endpoints, service description and semantic annotation.

Publisher: OpenRiskNet
Target audience: Risk assessors, Researchers, Nanosafety community, OpenRiskNet stakeholders, Regulators, Data modellers, Bioinformaticians
Open access: yes
Licence: Attribution 4.0 International (CC BY 4.0)
Organisations involved: EwC, JGU, UM, NTUA, VU, INERIS
Report
Case Study report - Metabolism Prediction [MetaP]
Daan Geerke (VU)
11 Dec 2019
Abstract:
Metabolites may well play an important role in adverse effects of parent drug or other xenobiotic compounds. In this case study VU (CS leader), HITeC/HHU (associate partner and implementation challenge winner), JGU, and UU have worked together on making methods and tools available for metabolite and site-of-metabolism (SOM) prediction. For that purpose we integrated and used ligand-based metabolism predictors (e.g. MetPred, enviPath, FAME, SMARTCyp) and we incorporated protein-structure and -dynamics based approaches to predict SOMs by Cytochrome P450 enzymes (P450s). P450s metabolise ~75% of the currently marketed drugs and their active-site shape and plasticity often play an important role in determining the substrate’s SOM. It is expected that this work will be continued after the end of the project to make services available for the prediction of microbial biotransformation pathways by integrating the enviPath data and software developed in part by JGU. During method development, model calibration and validation we used databases such as XMetDB and other open-access databases for drugs, xenobiotics and their respective metabolites. To facilitate the combined use of the metabolite prediction approaches and their outcomes, we benefited of ongoing development in workflow management systems and we made Jupyter Notebooks available to facilitate collection and visualization of results from the different available services. We illustrated the added value of having multiple predictors and our Jupyter notebooks available, in a pilot study on retrospective consensus predictions of known SOMs for drug compounds for which possible metabolite-associated toxicity was previously reported.
Additional materials:
Report
Related services:
JGU WEKA REST Service

Publisher: OpenRiskNet
Target audience: Risk assessors, Researchers, Students, OpenRiskNet stakeholders, Bioinformaticians
Open access: yes
Licence: Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Organisations involved: JGU, UU, VU
Presentation
Site-of-metabolism prediction in OpenRiskNet
Daan Geerke (VU)
30 Oct 2019
Abstract:
Metabolites can play an important role in adverse effects of parent drug (or other xenobiotic) compounds. During the EU-H2020 OpenRiskNet project, several partners (VU Amsterdam, HHU/HITeC Hamburg, Uppsala University, JGU Mainz) have worked together on making methods and tools available within the OpenRiskNet platform for metabolite and site-of-metabolism (SOM) prediction. For that purpose we have integrated ligand-based metabolite predictors (e.g., MetPred, FAME 3, SMARTCyp) and protein-structure and -dynamics based models to predict SOMs of Cytochrome P450 (CYP450) substrates. CYP450s metabolize ~75% of the currently marketed drugs and their active-site shape and plasticity often play an important role in determining the substrate's SOM. To facilitate the combined use of the metabolite prediction approaches and their outcomes, we made Jupyter notebooks available that gather and visualize results from the integrated services. Here we illustrate the possible added value of their combined use in the context of a pilot study on SOM prediction for compounds with known metabolite-associated toxicity. Finally, we briefly discuss related work from our laboratory on Cytochrome P450 binding affinity prediction.

Published in: OpenTox Euro Conference 2019
Publisher: OpenTox Association
Target audience: Risk assessors, Researchers, OpenRiskNet stakeholders, Bioinformaticians
Organisations involved: VU