The internal Digital Life Science Call for Sensitive Data invited EOSC-Life partners to submit project proposals that offered solutions for managing sensitive data and/or sharing workflows and tools that were specifically designed to deal with sensitive data.
Why sensitive data? Data may be sensitive because of their personal nature, i.e. personal data in the sense of the GDPR (e.g. health data, biological samples and associated personal data, genetic data, individual research data), but sensitivity can also arise from biohazard concerns (e.g. Dual Use Research of Concern) or from the application of the Nagoya Protocol. Research Infrastructures need to be able to determine whether their data are sensitive, manage these data appropriately, and develop and/or use workflows and tools to deal with these data.
In the selection process, EOSC-Life technical experts first reviewed all the proposals to assess their technical impact, feasibility, and project maturity; the proposals then underwent scientific assessment. As a final step, a selection panel of both internal and external experts chose the winning project from a shortlist of proposals based on transparent selection criteria. Nine (9) proposals were submitted in response to the sensitive data call track, and one (1) was selected for funding:
Project team: Sven Twardziok (Charité), Philipp Strubel (Charité), Philip R. Kensche (DKFZ), Ivo Buchhalter (DKFZ)
Project summary:
In this project, a cloud-ready data management and processing platform was designed to send analysis workflows to the data, avoid transferring data between sites, and increase the efficiency of data management and data security. This platform reduces possible security threats posed by whole genome and clinical information data sharing.
This project specifically deployed a cloud platform for processing sensitive cancer-genomics data based on OTP, WESKit, and relevant ICGC cancer genomics workflows. The platform was designed to be able to connect to federated data repositories, such as the Federated EGA and an ICGC mirror, both of which are already being implemented on the Charité and DKFZ cloud sites.
Using the platform and publicly available cell culture datasets, the project demonstrated how sensitive genomics data can be processed in the cloud. The software stack was integrated into relevant EOSC-Life services, and the stack itself is now provided as an EOSC service to other researchers, who can use it to deploy their own sensitive genomics workspaces in the cloud. An overview of the ICGC workflows is given in the OTP documentation: https://one-touch-pipeline.gitlab.io/otp/users/Workflows.html.
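The idea of sending the workflow to the data, rather than the data to the workflow, can be illustrated with a minimal sketch. It assumes that each site offers a workflow service with a GA4GH Workflow Execution Service (WES) style interface, which the summary above does not state explicitly. The site URLs, workflow file, and parameter names are hypothetical placeholders, and nothing is actually sent over the network. The sketch only shows that one workflow description is prepared for every site, while the input path refers to data that stays local to that site.

```python
import json
from dataclasses import dataclass, field


@dataclass
class WorkflowRun:
    """Form fields of a WES-style 'run workflow' request (POST .../runs)."""
    workflow_url: str
    workflow_type: str
    workflow_type_version: str
    workflow_params: dict = field(default_factory=dict)
    tags: dict = field(default_factory=dict)

    def to_form(self) -> dict:
        # JSON-valued fields are sent as encoded strings in the request form.
        return {
            "workflow_url": self.workflow_url,
            "workflow_type": self.workflow_type,
            "workflow_type_version": self.workflow_type_version,
            "workflow_params": json.dumps(self.workflow_params, sort_keys=True),
            "tags": json.dumps(self.tags, sort_keys=True),
        }


def runs_endpoint(site_base_url: str) -> str:
    """Build the run-submission endpoint for one site (WES v1 path layout)."""
    return site_base_url.rstrip("/") + "/ga4gh/wes/v1/runs"


def plan_submissions(run: WorkflowRun, sites: list) -> list:
    """Pair the same workflow description with each site's endpoint.

    Only the workflow description travels; the input data referenced in
    workflow_params is expected to already exist at each site.
    """
    form = run.to_form()
    return [(runs_endpoint(site), dict(form)) for site in sites]


# Hypothetical example values, for illustration only.
example_run = WorkflowRun(
    workflow_url="alignment.cwl",
    workflow_type="CWL",
    workflow_type_version="v1.0",
    workflow_params={"input_dir": "/site-local/data/sample_001"},
    tags={"project": "demo"},
)
example_sites = ["https://site-a.example.org/", "https://site-b.example.org"]
example_plan = plan_submissions(example_run, example_sites)
```

In a real deployment, each planned form would be sent as an authenticated request to the corresponding site, and only the results that are permitted to leave the site would be returned.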
RIs Involved: ELIXIR