Oral Presentation Australian Society for Microbiology Annual Scientific Meeting 2022

Using assembly graphs to identify bacteriophages associated with gastrointestinal diseases (82504)

Vijini Mallawaarachchi 1, Michael Roach 1, Bhavya Nalagampalli Papudeshi 1, Robert Edwards 1
  1. Flinders Accelerator for Microbiome Exploration, Flinders University, Bedford Park, SA 5042, Australia

Microbial communities found within the human gut have a strong influence on human health. Gastrointestinal diseases such as inflammatory bowel disease (IBD) are driven by intestinal bacteria and viruses [1]. Viruses that infect bacteria, known as bacteriophages, play a key role in modulating the bacterial communities residing within the human gut [2,3]. However, identifying and characterizing novel bacteriophages in the gut microbiome remains a challenge.

High-throughput sequencing technologies have paved the way for metagenomics to study uncultivated microbial and viral communities. A variety of tools exist to identify viral sequences in metagenomic data [4-7]. These tools often rely on sequence similarity, nucleotide composition, and the presence of viral genes or proteins. Most existing tools consider each sequence individually and determine whether it is of viral origin. Because viral assembly is challenging, viral genomes are often fragmented across several sequences [8], and tools that classify each sequence in isolation may not produce optimal results [9].
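As an illustration of the nucleotide composition signal mentioned above, the following minimal sketch computes a normalised k-mer frequency profile for a single sequence. It is not taken from any of the cited tools; the function name `kmer_profile` and the default `k=4` are assumptions chosen for the example.

```python
from collections import Counter
from itertools import product


def kmer_profile(sequence, k=4):
    """Return normalised k-mer frequencies in lexicographic ACGT order.

    Hypothetical helper for illustration only; k-mers containing
    characters other than A, C, G, T (for example N) are ignored.
    """
    sequence = sequence.upper()
    # All 4^k possible k-mers, in a fixed order so profiles are comparable.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


# Toy sequence, made up for the example.
profile = kmer_profile("ACGTACGT", k=2)
print(len(profile))
```

Profiles of this kind can be compared between sequences, which is one way a composition-based classifier might represent its input.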

Metagenomic assemblers build a structure known as the assembly graph by overlapping reads to produce longer sequences called contigs [10]. Genomes typically correspond to long paths within the assembly graph, and contigs that are connected to one another are more likely to belong to the same genome [11]. Hence, the assembly graph retains connectivity and neighbourhood information even when an assembly is fragmented. Previous studies have used assembly graphs in taxonomy-independent metagenomic binning [12-14], mostly to identify bacterial species. This work explores the use of assembly graphs and machine learning techniques to identify viral sequences from fragmented assemblies. Specifically, we demonstrate the identification of bacteriophage genomes within samples collected from patients with IBD.
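The idea that connected contigs are likely to share a genome can be sketched by grouping contigs into connected components of the assembly graph. This is a simplified illustration, not the method presented in this work: the graph is treated as undirected, and the contig names, the links, and the function name `connected_components` are all hypothetical.

```python
from collections import defaultdict


def connected_components(contigs, links):
    """Group contigs that are reachable from one another in the graph."""
    # Build an undirected adjacency map from the list of contig links.
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in contigs:
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited contig.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


# Hypothetical toy assembly graph: contig names and links are made up.
contigs = ["c1", "c2", "c3", "c4", "c5"]
links = [("c1", "c2"), ("c2", "c3"), ("c4", "c5")]
components = connected_components(contigs, links)
print(components)
```

In this toy graph, the fragments `c1`, `c2` and `c3` end up in one group and `c4` and `c5` in another, so each group could be assessed as a candidate genome rather than as separate sequences.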

  1. Kostic A. D. et al. (2014) The microbiome in inflammatory bowel disease: current status and the future ahead. Gastroenterology, 146(6), 1489-1499.
  2. Maronek M. et al. (2020) Phages and Their Role in Gastrointestinal Disease: Focus on Inflammatory Bowel Disease. Cells, 9(4), 1013.
  3. Gogokhia L. et al. (2019) Expansion of bacteriophages is linked to aggravated intestinal inflammation and colitis. Cell Host Microbe, 25, 285–299.e8.
  4. Akhter S. et al. (2012) PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies. Nucleic Acids Research, 40(16), e126.
  5. Jurtz V. I. et al. (2016) MetaPhinder-identifying bacteriophage sequences in metagenomic data sets. PLoS ONE, 11, e0163111.
  6. Ren J. et al. (2020) Identifying viruses from metagenomic data using deep learning. Quantitative Biology, 8, 64–77.
  7. Nayfach S. et al. (2021) CheckV assesses the quality and completeness of metagenome-assembled viral genomes. Nature Biotechnology, 39, 578–585.
  8. Smits S. L. et al. (2014) Assembly of viral genomes from metagenomes. Frontiers in Microbiology, 5, 714.
  9. Johansen J. et al. (2022) Genome binning of viral entities from bulk metagenomics data. Nature Communications, 13, 965.
  10. Nurk S. et al. (2017) metaSPAdes: a new versatile metagenomic assembler. Genome Research, 27, 824–834.
  11. Barnum T.P. et al. (2018) Genome-resolved metagenomics identifies genetic mobility, metabolic interactions, and unexpected diversity in perchlorate-reducing communities. ISME Journal, 12, 1568–1581.
  12. Mallawaarachchi V. et al. (2020) GraphBin: refined binning of metagenomic contigs using assembly graphs. Bioinformatics, 36(11), 3307–3313.
  13. Xue H. et al. (2022) RepBin: Constraint-based Graph Representation Learning for Metagenomic Binning. AAAI Conference on Artificial Intelligence (AAAI 2022).
  14. Mallawaarachchi V. et al. (2022) MetaCoAG: Binning Metagenomic Contigs via Composition, Coverage and Assembly Graphs. 26th International Conference on Research in Computational Molecular Biology (RECOMB 2022).