Metagenomics is the study of environmental samples in which multiple microbial genomes are analyzed at the same time, in direct contrast to isolating and cultivating individual species before sequencing their genomes. Its primary advantage is that it allows the discovery and study of microbial genomes that would otherwise be intractable because the organisms are difficult to cultivate. A single gram of environmental sample (e.g., soil, seawater, or bovine stomach contents) can contain hundreds to thousands of unique microbial species. The popularity and utility of metagenomics have increased dramatically with the advent of next-generation sequencing (NGS).
While the source of material can be extremely varied, the procedure is relatively simple: microbial DNA is isolated directly from the environmental sample without any intervening laboratory cultivation steps. The DNA is then used to generate standard sequencing libraries, which are sequenced to obtain coverage that is as broad as possible across the entire ‘metagenome’ present in the sample. The main challenge of this application is the subsequent alignment and assembly of sequences that come from the many different genomes in the sample. As an alternative to this ‘comprehensive coverage’ method, it is also possible to focus on just the ribosomal RNA genes (16S rDNA in bacteria and archaea, 18S rDNA in eukaryotes) by using sequence-specific PCR primers. Concentrating the sequencing power on this relatively small region of the genome gives a more accurate picture of which species are present in the sample, at the expense of the more comprehensive view of the genomic sequence that the standard method provides.
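The trade-off between the two approaches can be illustrated with some rough arithmetic. The sketch below is only a back-of-the-envelope estimate: the read count, read length, species count and 5 Mb genome size are illustrative assumptions rather than values from any particular study, the 1,500 bp figure is the approximate length of a 16S gene, and reads are assumed to be spread evenly across species, which real samples rarely are. Targeted protocols also often amplify only part of the gene, so the targeted figure is a simplification.

```python
def mean_coverage(total_reads, read_length, target_size, n_species):
    """Mean fold coverage per species, assuming reads are split evenly
    across species (a simplifying assumption)."""
    total_bases = total_reads * read_length
    return total_bases / (target_size * n_species)


# Illustrative assumptions, not values taken from the text above.
TOTAL_READS = 10_000_000   # reads in one hypothetical run
READ_LENGTH = 150          # bases per read
N_SPECIES = 1_000          # species in the sample
GENOME_SIZE = 5_000_000    # bp, assumed size of a bacterial genome
RRNA_GENE_SIZE = 1_500     # bp, approximate length of a 16S gene

shotgun = mean_coverage(TOTAL_READS, READ_LENGTH, GENOME_SIZE, N_SPECIES)
targeted = mean_coverage(TOTAL_READS, READ_LENGTH, RRNA_GENE_SIZE, N_SPECIES)

print(f"Comprehensive method: {shotgun:.2f}x per genome")
print(f"rRNA-targeted method: {targeted:.0f}x per rRNA gene")
```

Under these assumptions the same run yields well under 1x coverage of each genome but roughly a thousandfold coverage of each rRNA gene, which is why the targeted method is better suited to asking which species are present.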
Primer/review: Wooley, JC et al. (2010) A Primer on Metagenomics. PLoS Computational Biology 6(2): e1000667. doi:10.1371/journal.pcbi.1000667
Seawater example: Venter, JC et al. (2004) Environmental Genome Shotgun Sequencing of the Sargasso Sea. Science 304(5667): 66-74
Cow rumen example: Hess, M et al. (2011) Metagenomic Discovery of Biomass-degrading Genes and Genomes From Cow Rumen. Science 331(6016): 463-467
Key Platform Characteristics
|# of reads||Very Important||Unlike other “genomic sequencing” applications, metagenomics is also a “counting” application as a large number of reads will help determine the relative abundances of the various microbes.|
|Read length||Very Important||Longer reads can help with alignment of the sequence and with differentiating between closely related species. However, significant results have been achieved with the shorter-read technologies that are capable of 100-150 base reads.|
|Error rate||Important||Lower error rates are especially critical for de novo sequencing assembly and for differentiating closely related species.|
|Paired-end reads||Important||Paired-end reads can increase the effective read length which can aid in sequence assembly.|
|Mate-pair reads||Nice to have||Like paired-end reads, mate-pair reads can increase the effective read length, but the larger insert sizes aren’t as relevant for the smaller microbial genomes.|
|Multiplexing||Nice to have||As the number of reads per run increases on the leading platforms, multiplexing can be helpful in trying to maximize the number of samples per run.|
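The “counting” aspect noted for the number of reads can be sketched in a few lines. This is a minimal illustration, not a real analysis pipeline: it assumes each read has already been assigned to a taxon by some upstream classifier, the function names are hypothetical, and the taxon labels and counts are made up. It treats read assignment as simple binomial sampling, so the standard error of an estimated abundance shrinks with the square root of the number of reads.

```python
import math
from collections import Counter


def relative_abundance(read_assignments):
    """Fraction of reads assigned to each taxon."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def abundance_standard_error(fraction, n_reads):
    """Binomial standard error of an estimated abundance (assumes reads
    are independent draws from the community)."""
    return math.sqrt(fraction * (1.0 - fraction) / n_reads)


# Made-up taxon assignments for ten reads.
reads = ["taxon_A"] * 6 + ["taxon_B"] * 3 + ["taxon_C"] * 1
abundances = relative_abundance(reads)
print(abundances)

# A taxon at 1% abundance, estimated from increasing numbers of reads.
for n_reads in (10_000, 1_000_000):
    se = abundance_standard_error(0.01, n_reads)
    print(f"{n_reads:>9} reads: 1% abundance estimated to +/- {se:.5f}")
```

Going from ten thousand to one million reads narrows the uncertainty on a rare taxon about tenfold. In practice, read counts also depend on factors such as genome size and rRNA gene copy number, so raw fractions like these are only a first approximation of true abundance.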