Case Study
ModelRX – Modelling for Prediction or Read Across

Summary

A training dataset is obtained from an OpenRiskNet data source. The model is then trained with OpenRiskNet modelling tools, and the resulting model is packaged into a container, documented and ontologically annotated. The model is validated following OECD guidelines. Finally, predictions can be run.
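
As a rough illustration of this workflow, the sketch below trains and serialises a model with generic Python tooling (pandas, scikit-learn, joblib). It is a minimal sketch only: the CSV file, the endpoint column and the choice of learner are placeholder assumptions, not part of the OpenRiskNet tooling.

    # Minimal sketch of the ModelRX workflow; file names, CSV layout and the
    # choice of learner are illustrative placeholders.
    import joblib
    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor

    # 1. Obtain a training dataset (a local CSV standing in for an
    #    OpenRiskNet data source) with descriptor columns and an endpoint.
    data = pd.read_csv("training_set.csv")
    X, y = data.drop(columns=["endpoint"]), data["endpoint"]

    # 2. Train the model with a generic supervised learner.
    model = RandomForestRegressor(n_estimators=100, random_state=42)
    model.fit(X, y)

    # 3. Serialise the trained model so it can be packaged into a container,
    #    documented and ontologically annotated alongside its metadata.
    joblib.dump(model, "modelrx_model.joblib")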

Objectives

  • Support similarity identification in the DataCure case study by providing tools for calculating theoretical descriptors of substances (see the sketch after this list);
  • Fill gaps in incomplete datasets and apply in silico predictive modelling approaches (read-across, QSAR) to support the final risk assessment.
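
One possible realisation of the descriptor and similarity tooling uses the open-source RDKit library, which is an assumption here rather than a tool named by the case study; the SMILES strings and descriptor choices are illustrative.

    # Theoretical descriptors and fingerprint similarity with RDKit (one
    # possible toolkit; not prescribed by the case study).
    from rdkit import Chem, DataStructs
    from rdkit.Chem import AllChem, Descriptors

    query = Chem.MolFromSmiles("CCO")      # illustrative query substance
    analogue = Chem.MolFromSmiles("CCCO")  # illustrative candidate analogue

    # Theoretical descriptors, usable as model features or similarity criteria.
    for mol in (query, analogue):
        print(Descriptors.MolWt(mol), Descriptors.MolLogP(mol), Descriptors.TPSA(mol))

    # Tanimoto similarity on Morgan fingerprints to rank candidate analogues.
    fps = [AllChem.GetMorganFingerprintAsBitVect(m, radius=2, nBits=2048)
           for m in (query, analogue)]
    print(DataStructs.TanimotoSimilarity(fps[0], fps[1]))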

Risk assessment framework

The ModelRX case study contributes at two tiers:

  • It provides computational methods to support the suitability assessment of existing data and the identification of analogues (Tier 0);
  • It provides predictive modelling functionalities, which are essential for the final risk assessment (Tier 2).

Use Cases Associated

The ModelRX case study is associated with UC2 (Building and using a prediction model), captured in the following pseudocode; an illustrative client sketch follows the list:

  1. The user selects an algorithm and a dataset within the system to induce a model. The choice of algorithm is based on the intended use within a supervised setting (classification or regression problem, size of the dataset, ability to select descriptors, etc.).
  2. The algorithm has a number of default parameters, which can be adjusted to the user's specifications.
  3. The user starts the induction process.
  4. Upon termination of the algorithm, the user receives the result in the form of a model.
  5. The user can supply an existing second dataset and apply the model to it.
  6. The results are returned as a new dataset URI.
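
The sketch below walks through the six steps as a REST client. The base URL, endpoint paths and JSON fields are hypothetical, loosely modelled on OpenTox-style modelling APIs; consult the documentation of the individual services (e.g. JaqPot) for their actual interfaces.

    # Hypothetical UC2 client; the base URL, paths and JSON fields are
    # illustrative only and do not document a real service API.
    import requests

    BASE = "https://modelling.example.org"  # placeholder service URL

    # Steps 1-3: select algorithm and training dataset, adjust the default
    # parameters, and start the induction process.
    task = requests.post(f"{BASE}/algorithm/randomforest", json={
        "dataset_uri": f"{BASE}/dataset/42",   # training dataset (step 1)
        "parameters": {"n_trees": 200},        # adjusted defaults (step 2)
    }).json()

    # Step 4: upon termination, the result is returned as a model URI.
    model_uri = task["result_uri"]

    # Steps 5-6: apply the model to a second dataset; the predictions are
    # returned as a new dataset URI.
    prediction = requests.post(model_uri, json={
        "dataset_uri": f"{BASE}/dataset/43",
    }).json()
    print(prediction["predicted_dataset_uri"])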

Databases and tools

  • JaqPot Quattro (NTUA), CPSign (UU), JGU WEKA Rest service (JGU), (Nano-)Lazar (JGU/IST).

Service integration

Modelling APIs need a high level of integration into the OpenRiskNet ecosystem, and integration with the DataCure case study is vital. On the semantic interoperability layer, training datasets must be compatible with the chosen algorithm, and prediction datasets must be compatible with the prediction model. Additionally, the generated models and datasets need to be accompanied by semantic metadata on their life cycle, thus enforcing semantic enrichment of dynamically created entities; a sketch of such metadata follows.
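
As an illustration of such life-cycle metadata, the sketch below annotates a generated model with PROV-O-style provenance serialised as JSON-LD. The vocabulary choice, URIs and timestamp are assumptions for illustration, not a mandated OpenRiskNet schema.

    # PROV-O-style life-cycle metadata for a generated model, serialised as
    # JSON-LD; URIs, timestamp and vocabulary choice are illustrative.
    import json

    model_metadata = {
        "@context": {"prov": "http://www.w3.org/ns/prov#"},
        "@id": "https://modelling.example.org/model/7",   # the generated model
        "@type": "prov:Entity",
        "prov:wasGeneratedBy": {                          # the training run
            "@type": "prov:Activity",
            "prov:used": [
                "https://modelling.example.org/dataset/42",  # training data
                "https://modelling.example.org/algorithm/randomforest",
            ],
        },
        "prov:generatedAtTime": "2018-06-01T12:00:00Z",   # placeholder timestamp
    }
    print(json.dumps(model_metadata, indent=2))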

Currently available services:

  • Generate, store and share predictive statistical and machine learning models
    Service type: Service, Data mining tool, Model, Model generation tool, Trained model, Processing tool, Analysis tool
  • Generate, store and share predictive statistical and machine learning models
    Service type: Visualisation tool, Application, Data mining tool, Model, Model generation tool, Trained model, Processing tool, Analysis tool, Workflow
  • Interactive computing and workflows sharing
    Service type: Helper tool, Visualisation tool, Processing tool, Analysis tool, Software, Workflow
  • Scientific workflows made simple
    Service type: Database / data source, Service, Workflow