Whole Genome vs Exome Sequencing
For the past few years there has been a debate raging over the use of whole genome sequencing vs. exome sequencing. Whole genome sequencing (WGS), as the name implies, attempts to sequence the entirety of the genome. Because current sequencing platforms struggle with technically challenging regions of the genome (high GC content, large repeat regions, centromeres, telomeres, etc.), in reality WGS covers only 95% to 98% of the genome. Exome sequencing, sometimes called ‘whole exome sequencing’ (WES), instead focuses on just the protein-coding sequences.
Advantages of exome sequencing
If WES covers less of the genome than WGS, why would anyone ever choose it over WGS? Primarily to save money and time. Even though WES samples are typically sequenced to a higher depth (100X vs 30X), the reads are focused on only ~2% of the genome, so less overall sequence is needed, leading to lower costs. This focus is achieved through an enrichment or pulldown process in which DNA or RNA baits hybridize with the protein-coding portion of the genome, isolating it from the non-coding portion. The amount of sequence needed for a 100X exome sample is ~5-6 Gb, substantially less than the ~90 Gb needed for a 30X whole genome. The smaller data volume also leads to lower data storage costs and quicker, cheaper and easier data analysis. And since the coding region of the genome has been characterized to a substantially higher degree, advocates of WES feel there’s a better chance of interpreting variants in a meaningful way.
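The yield figures above follow from simple arithmetic: the amount of sequence needed is roughly the size of the target multiplied by the mean depth. The short Python sketch below reproduces them. It assumes a human genome size of about 3 Gb (a round number not given in the text) and ignores off-target reads and duplicates, so treat it as a back-of-the-envelope estimate rather than a planning tool. The function name `required_yield_gb` is purely illustrative.

```python
# Rough sequencing-yield estimate: bases needed = target size x mean depth.
GENOME_SIZE_BP = 3.0e9   # assumption: approximate human genome size
EXOME_FRACTION = 0.02    # ~2% of the genome, as stated in the text


def required_yield_gb(target_bp: float, depth: float) -> float:
    """Return the sequence (in gigabases) needed to cover target_bp at the given mean depth."""
    return target_bp * depth / 1e9


wgs_gb = required_yield_gb(GENOME_SIZE_BP, 30)                     # 30X whole genome
wes_gb = required_yield_gb(GENOME_SIZE_BP * EXOME_FRACTION, 100)   # 100X exome

print(f"WGS at 30X:  ~{wgs_gb:.0f} Gb")
print(f"WES at 100X: ~{wes_gb:.0f} Gb")
print(f"WGS needs ~{wgs_gb / wes_gb:.0f}x more sequence")
```

Under these assumptions the exome needs about 6 Gb and the whole genome about 90 Gb, so WGS requires roughly 15 times more sequence per sample despite its lower depth.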
Advantages of whole genome sequencing
Despite the cost and speed advantages of WES, there are many proponents of WGS. The enrichment steps involved in WES lead to non-uniform coverage, generating both ‘hot spots’ with too much coverage (a waste of sequencing power) and regions with too little coverage (leading to missed variant calls). For example, a region dense with SNPs can interfere with the capture process, as the enrichment baits may not hybridize as efficiently. Because WGS doesn’t require an up-front enrichment step, it generates much more uniform coverage of the genome. Another benefit of WGS is that it can take advantage of longer reads. Since the majority of human exons are shorter than 200 bp, anything longer than 2×100 paired-end reads for WES will essentially be wasted. The longer reads available for whole genome sequencing allow for better determination of copy number variations, rearrangements and other structural variations, attributes that are especially important in cancer studies.
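To make the uniformity argument concrete, here is a minimal Python sketch that summarizes a list of per-base depths: the mean depth, the fraction of positions below a minimum depth (where variants may be missed) and the fraction in ‘hot spots’ well above the mean (where sequencing is wasted). The thresholds (`min_depth=20`, `hot_factor=2.0`), the function name and the depth values are all assumptions made up for illustration; they are not real data or an established standard.

```python
from statistics import mean


def coverage_summary(depths, min_depth=20, hot_factor=2.0):
    """Summarize per-base depths: mean, fraction under-covered, fraction in hot spots."""
    avg = mean(depths)
    # Positions too shallow to call variants reliably (threshold is an assumption).
    frac_low = sum(d < min_depth for d in depths) / len(depths)
    # Positions with far more reads than needed (more than hot_factor x the mean).
    frac_hot = sum(d > hot_factor * avg for d in depths) / len(depths)
    return {"mean": avg, "frac_low": frac_low, "frac_hot": frac_hot}


# Made-up per-base depths, for illustration only (not real data).
even_depths = [28, 30, 31, 29, 32, 30, 27, 33, 30, 30]      # uniform, WGS-like
uneven_depths = [5, 8, 12, 40, 45, 250, 260, 60, 10, 110]   # uneven, enrichment-like

print(coverage_summary(even_depths))
print(coverage_summary(uneven_depths))
```

In this toy example the uneven sample has a higher mean depth, yet 40% of its positions fall below the minimum depth and 20% sit in hot spots, which is the pattern WGS proponents point to when they argue that mean depth alone overstates how well an enriched sample is covered.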
What does the future hold?
While there is no clear consensus as to what’s best, with strong proponents on both sides, Illumina has added new life to the debate with the launch of its HiSeq X Ten sequencing platform and the substantial reduction in the cost of generating whole genome sequencing data that it brings. With real-world prices of whole genome sequencing ranging between $1500 and $2000 on AllSeq’s marketplace, WES doesn’t have as strong a price advantage anymore. This is leading to a rise in the popularity of WGS. Once Illumina lifts the restrictions on what samples and applications can be run on the HiSeq X Ten (something for which it has not yet given a timetable), exome samples could benefit from the same lower sequencing costs, and perhaps public opinion will swing back in favor of WES.