Center for Microbiome Innovation Faculty Member Pavel Pevzner and research team unveil new algorithm designed for metagenome assembly in long-read DNA sequencing
Pavel Pevzner, a University of California San Diego computer science professor and faculty member of the UC San Diego Center for Microbiome Innovation (CMI), and a team of researchers at UC San Diego, the Dairy Research Center, St. Petersburg State University and the Bioinformatics Institute (St. Petersburg) have unveiled a new algorithm designed for metagenome assembly in long-read DNA sequencing. The work appears in a paper published in Nature Methods.
Studying complex microbial communities, such as the bacteria in the human gut, is important for understanding the mechanisms of various human diseases and may help researchers discover new antibiotic treatments. Today, standard short-read sequencing approaches provide only a limited view of environmental bacteria, because the recovered bacterial genomes are highly fragmented. Long-read sequencing technologies have substantially improved researchers' ability to sequence the genomes of isolated bacteria, compared with earlier fragmented short-read assemblies. However, assembling complex metagenomic datasets remains difficult even for state-of-the-art long-read assemblers.
In “metaFlye: scalable long-read metagenome assembly using repeat graphs,” Pavel Pevzner and his team unveil the metaFlye algorithm, which addresses important challenges in long-read metagenomic assembly, such as uneven bacterial composition and intra-species heterogeneity. The researchers performed long-read sequencing of the sheep microbiome and applied metaFlye to reconstruct over 60 complete or nearly complete bacterial genomes. They also demonstrated long-read assembly of human microbiomes, which enables the discovery of novel biosynthetic gene clusters that encode biomedically important natural products.
The researchers benchmarked the metaFlye algorithm against existing state-of-the-art long-read assemblers using simulated, mock, and real bacterial community datasets. The results demonstrated that recently introduced long-read technologies overcome limitations of short-read sequencing, and that metaFlye can reconstruct complete bacterial genomes from complex environmental communities, providing significant improvements over other popular methods.
Additional co-authors include: Mikhail Kolmogorov, Derek M. Bickhart, Bahar Behsaz, Alexey Gurevich, Mikhail Rayko, Sung Bong Shin, Kristen Kuhn, Jeffrey Yuan, Evgeny Polevikov, and Timothy P.L. Smith.
The full paper is available in Nature Methods.
About Center for Microbiome Innovation at University of California San Diego:
The UC San Diego Center for Microbiome Innovation leverages the university’s strengths in clinical medicine, bioengineering, computer science, the biological and physical sciences, data sciences, and more to coordinate and accelerate microbiome research. We also develop methods for manipulating microbiomes for the benefit of human and environmental health. Learn more at cmi.ucsd.edu/ and follow @CMIDigest