This video is part of the openHPI course clean-IT: Towards Sustainable Digital Technologies.
Cheaper, faster, and more scalable DNA sequencing is driving exponential growth of genomic data across the tree of life. Large and diverse datasets broaden the possibilities for biological research, but their size and complexity make them difficult to exploit fully. Ganon is a software tool for approximate DNA-to-DNA matching: it searches for very small ‘DNA needles’ in very large ‘DNA haystacks’. This is computationally challenging, but it becomes feasible with a specialized probabilistic data structure that processes large amounts of DNA sequence and enables efficient matching. Additionally, ganon can update already indexed data incrementally: new genome sequences can be incorporated into an existing index in a fraction of the time needed to rebuild it, drastically reducing the computational cost. Compared to similar approaches, ganon builds indices up to 50 times faster and is the only software able to update them, cutting redundant, energy-consuming computations from hours to minutes.
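To make the idea of a probabilistic index with incremental updates more concrete, here is a minimal Python sketch. It is not ganon's implementation: the description above does not name the data structure, so a plain Bloom filter over k-mers (short substrings of length k) is assumed here as a stand-in. The class name `KmerBloomFilter`, its parameters, and the example sequences are all hypothetical and chosen only for illustration.

```python
import hashlib


class KmerBloomFilter:
    """Toy Bloom filter over the k-mers of reference sequences (illustrative only)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3, k=8):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.k = k
        self.bits = bytearray(num_bits // 8 + 1)

    def _kmers(self, sequence):
        # Slide a window of length k over the sequence.
        for i in range(len(sequence) - self.k + 1):
            yield sequence[i:i + self.k]

    def _positions(self, kmer):
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add_sequence(self, sequence):
        # Incremental update: only the new sequence is hashed,
        # bits set by earlier sequences are left untouched.
        for kmer in self._kmers(sequence.upper()):
            for pos in self._positions(kmer):
                self.bits[pos // 8] |= 1 << (pos % 8)

    def _contains(self, kmer):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(kmer)
        )

    def match_fraction(self, read):
        # Fraction of the read's k-mers found in the filter. A threshold
        # below 1.0 tolerates mismatches, which makes the matching approximate.
        kmers = list(self._kmers(read.upper()))
        if not kmers:
            return 0.0
        return sum(self._contains(kmer) for kmer in kmers) / len(kmers)


# Build an index from one made-up reference ("haystack") ...
index = KmerBloomFilter()
index.add_sequence("ACGTACGTTAGCCGATAGGCTTAACGGATCCA")
needle_hit = index.match_fraction("TAGCCGATAGGC")

# ... then add a second reference later without rebuilding the index.
index.add_sequence("GGGTTTAAACCCGGGTTTAAACCC")
new_hit = index.match_fraction("TTTAAACCCGGG")
old_hit = index.match_fraction("TAGCCGATAGGC")
```

A Bloom filter never misses a k-mer that was inserted, but it can report a k-mer that was never inserted (a false positive). In this sketch, adding a sequence costs time proportional to the length of the new sequence only, which illustrates why updating an index can be much cheaper than rebuilding it.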
Vitor C. Piro is a postdoctoral researcher at the Hasso-Plattner-Institut in Potsdam, Germany, with a PhD in Bioinformatics from the Freie Universität Berlin. He has experience in biological data analysis and scientific software development for bioinformatics, with a focus on microbial community analysis, environmental data, taxonomic classification, and metagenomics. Piro is interested in all areas and steps of microbiome and DNA analysis, but especially in developing efficient algorithms that bridge computer science, bioinformatics applications, and data analysis. A summary of his work and publications can be found on GitHub.