Dear all,
The next seminar will take place on Wednesday; please see the
details below. We look forward to seeing you all!
Title: Efficient Search of Microbial Genomes via Phylogenetic
Compression
Speaker: Karel Břinda (INRIA)
Date and time: Wednesday 20/12/2023 - 17:20
Location: MFF UK, Malostranské nám. 25, lecture hall S3 (3rd floor)
https://bioinformatics.cuni.cz/seminar/
Best wishes,
Petr Danecek
-------------------------------
Karel Břinda (INRIA)
Efficient Search of Microbial Genomes via Phylogenetic
Compression
Comprehensive collections approaching millions of sequenced genomes
have become central information sources in the life sciences.
However, the rapid growth of these collections makes it effectively
impossible to search these data using tools such as BLAST and its
successors. In this talk, we will present a new technique called
phylogenetic compression, which uses evolutionary history to guide
compression and efficiently search large collections of microbial
genomes using existing algorithms and data structures. We will show
that, when applied to modern diverse collections approaching
millions of genomes, lossless phylogenetic compression improves the
compression ratios of assemblies, de Bruijn graphs, and k-mer
indexes by one to two orders of magnitude. Additionally, we will
present a pipeline for a BLAST-like search over these
phylogeny-compressed reference data and demonstrate that it can align
genes, plasmids, or entire sequencing experiments against all
bacteria sequenced up to 2019 on ordinary desktop computers within a
few hours. Phylogenetic compression has broad applications in
computational biology and may provide a fundamental design principle
for future genomics infrastructure.