TGX – Toxicogenomics-based prediction and mechanism identification
In this case study, a transcriptomics-based hazard prediction model for identifying specific molecular initiating events (MIEs) will be applied, using (A) top-down and (B) bottom-up approaches.
The MIEs can include, but are not limited to: (1) genotoxicity (p53 activation), (2) oxidative stress (Nrf2 activation), (3) endoplasmic reticulum stress (unfolded protein response), (4) dioxin-like activity (AhR activation), (5) HIF1 alpha activation and (6) nuclear receptor activation (e.g. for endocrine disruption).
- (A) Top-down approach: creation of prediction models based on differentially regulated genes;
- (B) Bottom-up approach: use of knowledge of stress response pathways to integrate data sets on their activation or inhibition.
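As a rough illustration of the top-down idea, the sketch below selects genes whose mean expression differs between two compound classes and then classifies a new profile by its nearest class centroid. It is a minimal sketch, not the method used in the case study: the gene names, class labels, expression values, the `min_abs_diff` threshold and the nearest-centroid classifier are all assumptions chosen for illustration, and it assumes exactly two classes.

```python
from statistics import mean

# Hypothetical toy data: log2 expression values per gene for each sample.
# Gene names, class labels and numbers are illustrative only.
training = [
    ({"GENE_A": 2.1, "GENE_B": 0.1, "GENE_C": 1.0}, "genotoxic"),
    ({"GENE_A": 1.9, "GENE_B": -0.1, "GENE_C": 1.1}, "genotoxic"),
    ({"GENE_A": 0.2, "GENE_B": 0.0, "GENE_C": 0.9}, "non-genotoxic"),
    ({"GENE_A": 0.0, "GENE_B": 0.2, "GENE_C": 1.0}, "non-genotoxic"),
]


def class_means(samples, genes):
    """Mean expression of each gene within each class."""
    by_class = {}
    for profile, label in samples:
        by_class.setdefault(label, []).append(profile)
    return {
        label: {g: mean(p[g] for p in profiles) for g in genes}
        for label, profiles in by_class.items()
    }


def select_genes(samples, min_abs_diff=1.0):
    """Keep genes whose two class means differ by at least min_abs_diff."""
    genes = sorted(samples[0][0])
    means = class_means(samples, genes)
    first, second = sorted(means)  # assumes exactly two classes
    return [
        g for g in genes
        if abs(means[first][g] - means[second][g]) >= min_abs_diff
    ]


def predict(profile, centroids):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    def distance(centroid):
        return sum((profile[g] - v) ** 2 for g, v in centroid.items())
    return min(centroids, key=lambda label: distance(centroids[label]))


signature = select_genes(training)            # differentially regulated genes
centroids = class_means(training, signature)  # one centroid per class
prediction = predict(
    {"GENE_A": 1.7, "GENE_B": 0.3, "GENE_C": 1.0}, centroids
)
```

A real analysis would normalise the raw data first, use a proper statistical test for gene selection, and estimate performance by cross-validation rather than on the training samples.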
Risk assessment framework
This case study is associated with all 3 tiers of the selected framework and in particular the following steps:
- Collection of support data;
- Identification of analogues / suitability assessment and existing data;
- Mode of Action hypothesis generation.
Use Cases Associated
This case study is associated with UC1 - Merge existing data by a common structure identifier and UC2 - Building and using a prediction model.
These two use cases are relevant for the top-down approaches:
- Reproducing the prediction models published by Herwig et al., 2016 using data from the EU-project carcinoGENOMICs;
- Advanced predictions using as much data as possible from the diXa data warehouse and other repositories giving free access to the data.
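UC1 (merging existing data by a common structure identifier) can be pictured as an outer join of records from several sources on a shared key. The sketch below is only an illustration under stated assumptions: the field name `structure_id`, the placeholder identifiers and the record contents are hypothetical, and the source does not specify which identifier scheme is used.

```python
# Hypothetical records from two sources; identifiers and field names
# are placeholders, not real compound identifiers.
expression_records = [
    {"structure_id": "ID-0001", "dataset": "study_1", "n_deg": 120},
    {"structure_id": "id-0002 ", "dataset": "study_1", "n_deg": 15},
]
annotation_records = [
    {"structure_id": "ID-0001", "name": "compound_x"},
    {"structure_id": "ID-0003", "name": "compound_z"},
]


def merge_by_structure_id(*sources):
    """Outer-join records from all sources on a normalised structure_id."""
    merged = {}
    for records in sources:
        for record in records:
            # Normalise the key so trivial formatting differences still match.
            key = record["structure_id"].strip().upper()
            entry = merged.setdefault(key, {"structure_id": key})
            entry.update(
                {k: v for k, v in record.items() if k != "structure_id"}
            )
    return merged


merged = merge_by_structure_id(expression_records, annotation_records)
```

Compounds present in only one source are kept with the fields that are available, so later steps can decide how to handle missing values; when two sources supply the same field, the later source overwrites the earlier one.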
Databases and tools
- diXa (carcinoGENOMICs, Predict-IV), TG-GATEs, EU-ToxRisk (nascent), HeCaToS (nascent), ArrayExpress/GEO BioStudies.
- top-down: data normalisation tools, prediction tools such as caret;
- bottom-up: ToxPi.
Service integration will be needed for omics databases, knowledge bases and data mining, and processing and analysis.
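For the bottom-up approach, pathway-level activation data can be combined into a single ranking score per compound. The sketch below follows the general idea behind ToxPi-style scoring (scale each pathway "slice" to the range 0 to 1 across compounds, then take a weighted average); it is not the ToxPi software itself, and the compound names, activation values and equal weights are hypothetical.

```python
def scale_unit(values):
    """Min-max scale a list of values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # no variation across compounds
    return [(v - lo) / (hi - lo) for v in values]


def toxpi_style_scores(activity, weights):
    """Weighted average of per-pathway scaled activities for each compound."""
    compounds = sorted(activity)
    total_weight = sum(weights.values())
    scores = {c: 0.0 for c in compounds}
    for pathway, weight in weights.items():
        scaled = scale_unit([activity[c][pathway] for c in compounds])
        for compound, value in zip(compounds, scaled):
            scores[compound] += weight * value / total_weight
    return scores


# Hypothetical pathway activation summaries (e.g. a reporter response).
activity = {
    "compound_x": {"p53": 3.0, "Nrf2": 1.0},
    "compound_y": {"p53": 1.0, "Nrf2": 2.0},
    "compound_z": {"p53": 0.0, "Nrf2": 0.0},
}
weights = {"p53": 1.0, "Nrf2": 1.0}  # equal weighting is an assumption

scores = toxpi_style_scores(activity, weights)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because each slice is scaled relative to the compounds in the set, the scores are only comparable within one analysis; adding or removing compounds can change them.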
Currently available services:
- A database for curated toxicogenomic datasets. Service type: Database / data source, Application, Visualisation tool, Software
- Discover your variants of interest in human omics datasets. Service type: Application, Software, Service
- Programmatically retrieve metadata from the European Genome-phenome Archive. Service type: Application, Service
- Service to run Nextflow pipelines. Service type: Workflow, Software, Service
- Interactive computing and workflows sharing. Service type: Workflow, Visualisation tool, Helper tool, Software, Analysis tool, Processing tool
- Computation research made simple and reproducible. Service type: Workflow, Database / data source, Service
This report details the work involved in federating compute and data resources between the OpenRiskNet e-infrastructure and external resources. The reference environment has been designed to handle the majority of requirements of users wishing to deploy and run services. However, specific situations demand solutions where the computation, the data, or both reside outside the OpenRiskNet e-infrastructure. This deliverable relates to Task 2.7 (Interconnecting virtual environment with external infrastructures) and Task 2.8 (Federation between virtual environments). Resource-intensive analyses, such as those performed in toxicogenomics, can have CPU, memory or disk requirements that cannot be assumed to be available across all deployment scenarios. Human sequencing data may be subject to restrictions on where it can be processed, and the sheer quantity of this data often means that it is more efficient to “bring the computation to the data”. In achieving Tasks 2.7 and 2.8, we can demonstrate how the virtual environment can utilise external infrastructure, including commercial cloud providers and data stores.