Cryo-electron microscopy (Cryo-EM) has evolved extremely rapidly in recent years, with great advances in the quality of the instrumentation and in the methods used for data analysis, and it has become the technique of choice for determining the structure of macromolecules. The aim of this demonstrator project is to improve adherence to the FAIR principles in Cryo-EM from the moment of data acquisition. To this end, this demonstrator from the CSIC Instruct-ERIC Centre proposes a new method for handling Cryo-EM data that includes deposition of data and workflows from the facility to a centralised public repository (such as EMPIAR), with the possibility of updating data and workflows throughout the data analysis before submitting the final results to the Electron Microscopy Data Bank (EMDB). This data acquisition and dissemination process allows full traceability and improves the reproducibility of results by making available raw and intermediate data, as well as a JSON file fully describing the executed workflow (including the different steps of the workflow, the parameters used and the results). To allow dissemination of the executed workflow, a Scipion plugin was developed that can deposit this workflow in the EMPIAR public database.
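To make the idea of a JSON workflow description more concrete, the following is a minimal sketch in Python of how steps, parameters and results could be recorded and serialised. It is an illustration only: the field names (`steps`, `id`, `name`, `parameters`, `results`, `parents`), the step names and all parameter values are hypothetical placeholders, and they do not reproduce the actual schema written by Scipion or accepted by EMPIAR.

```python
import json


def describe_step(step_id, name, parameters, results, parents=()):
    """Build one workflow step record (hypothetical layout, not Scipion's schema)."""
    return {
        "id": step_id,
        "name": name,
        "parameters": dict(parameters),  # settings used when the step ran
        "results": list(results),        # names of the outputs it produced
        "parents": list(parents),        # ids of the steps it depends on
    }


def build_workflow():
    """Assemble an illustrative three-step workflow with placeholder values."""
    steps = [
        describe_step(1, "import_movies", {"sampling_rate": 1.0}, ["movies"]),
        describe_step(2, "motion_correction", {"patches": 5}, ["micrographs"],
                      parents=[1]),
        describe_step(3, "ctf_estimation", {"defocus_range": [5000, 50000]},
                      ["ctfs"], parents=[2]),
    ]
    return {"steps": steps}


def execution_order(workflow):
    """Return step ids ordered so that every step follows its parents."""
    steps = {step["id"]: step for step in workflow["steps"]}
    ordered, seen = [], set()

    def visit(step_id):
        if step_id in seen:
            return
        seen.add(step_id)
        for parent in steps[step_id]["parents"]:
            visit(parent)
        ordered.append(step_id)

    for step_id in steps:
        visit(step_id)
    return ordered


if __name__ == "__main__":
    workflow = build_workflow()
    # Serialise to JSON text, as would be stored alongside the deposited data.
    print(json.dumps(workflow, indent=2))
    print(execution_order(workflow))
```

Recording the parent steps alongside the parameters is what would let a reader of the deposited file reconstruct the order in which the analysis was carried out, which is the traceability property described above.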
The transfer of the data and associated workflow to the EMPIAR repository is a streaming transfer and is accompanied by a transfer of ownership of the data from the facility to the lab user, who can then continue working on the data analysis and make updates directly.
A viewer was integrated into EMPIAR that allows users to view the Scipion workflow used for the analysis of the data. A data viewer was also created and should be functional very soon; it shows what the data look like at different steps of the analysis and provides some evaluation of the quality of the data.
This demonstrator was granted funding from the EOSC-Life WP1 open call to continue working on this project and thus increase the FAIRness and the functionality of the deposition system.
This ambitious project contributes to making structural biology data and analysis tools more standardised, FAIRer and available for re-use by the scientific community.
CSIC (Carlos Oscar Sorzano, Laura del Caño, Pablo Conesa, José-María Carazo)
Diamond (Alun Ashton)
There is continuation of our project into [EOSC-life] work package one, from the last call for work package one projects, and the idea is to extend the workflow description … to produce a Common Workflow Language output. This output will point to classes defined by an Ontology.
– Carlos Oscar Sorzano