Recovery of gene haplotypes from a metagenome

Sam Nicholls, Wayne Aubrey, Arwyn Edwards, Kurt de Grave, Sharon Huws, Leander Schietgat, Andre Soares, Christopher Creevey, Amanda Clare

Research output: Working paper


Abstract

Elucidation of population-level diversity of microbiomes is a significant step towards a complete understanding of the evolutionary, ecological and functional importance of microbial communities. Characterizing this diversity requires the recovery of the exact DNA sequence (haplotype) of each gene isoform from every individual present in the community. To address this, we present Hansel and Gretel: a freely available data structure and algorithm, providing a software package that reconstructs the most likely haplotypes from metagenomes. We demonstrate recovery of haplotypes from short-read Illumina data for a bovine rumen microbiome, and verify our predictions are 100% accurate with long-read PacBio CCS sequencing. We show that Gretel’s haplotypes can be analyzed to determine a significant difference in mutation rates between core and accessory gene families in an ovine rumen microbiome. All tools, documentation and data for evaluation are open source and available via our repository: https://github.com/samstudio8/gretel

Original language: English
Publisher: bioRxiv
Publication status: Published - 13 Jan 2018

Keywords

  • metagenome
  • haplotypes
  • long read sequencing
  • algorithm
