This EOSC-Life demonstrator project aims to develop an online marine genomic resource that aids community-driven annotation of marine eukaryotes and provides a focus for post-assembly genomic workflows and data access for specific groups of closely related marine organisms. The more closely two taxa are related evolutionarily, the more genomic structure and synteny they are expected to share, so it is beneficial to compare and contrast genome annotations between closely related organisms. This is especially important for communities that work on specific genomes and provide manual annotations via community platforms (e.g. Orcae). This demonstrator project addresses the lack of tools for comparing and transferring annotations and features between the sequenced genomes of closely related species.
A software tool for comparing genome annotations stored in General Feature Format (GFF) has been designed, implemented, and integrated into the Galaxy platform. In addition, a Snakemake workflow has been designed and implemented for cloud deployment, and the tool source code and Docker module will soon be made available from a GitHub repository. The tool is currently being used to develop a public-facing portal for the community annotation of species belonging to the pelagic herring fish family, Clupeidae. The genomes of eight clupeids will be available for comparison initially, but the tool could also be used to compare any two genome annotation libraries. It is also intended that the tool will automate the updating of annotations on the community annotation platform, Orcae, through the platform API. FAIR data and privacy issues surrounding data usage have been addressed.
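To illustrate the kind of comparison involved, the following is a minimal sketch in Python and not the project's tool. It assumes two GFF3 annotation sets that use the same sequence identifiers and coordinate system, and it treats two features as the same only when their sequence, type, coordinates, and strand match exactly. The names `Feature`, `parse_gff3`, and `compare_annotations` are hypothetical and chosen for illustration only.

```python
from collections import namedtuple

# Hypothetical minimal feature key: GFF3 columns 1, 3, 4, 5 and 7.
Feature = namedtuple("Feature", "seqid type start end strand")


def parse_gff3(text):
    """Parse GFF3 text into a set of Feature tuples.

    Comment and directive lines (starting with '#') are skipped, and
    parsing stops at an embedded '##FASTA' section.
    """
    features = set()
    for line in text.splitlines():
        if line.startswith("##FASTA"):
            break
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            # Not a well-formed nine-column GFF3 feature line.
            continue
        seqid, _source, ftype, start, end, _score, strand, _phase, _attrs = cols
        features.add(Feature(seqid, ftype, int(start), int(end), strand))
    return features


def compare_annotations(a, b):
    """Return the features shared by both sets and those unique to each."""
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}
```

Exact coordinate matching only makes sense for two annotations of the same assembly. Comparing annotations between different species, as the project intends, would additionally need a mapping between the genomes (for example from an alignment or synteny analysis), which this sketch does not attempt.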
Especially… the emphasis on open science and FAIR data was something that I think myself, mainly dealing with DNA sequence data where we just upload it to these major databases and that’s all we do, we don’t need to do anything else; so it was important to understand better about what the principles were and how the EU need to try and maintain the metadata, and for it to be machine-readable.
– Cymon J. Cox