There’s been a lot of (justifiable) excitement recently about the arrival of the $1000 genome, the drop in sequencing costs in general, and what it means in the research and clinical space. The focus has mostly been on genome sequencing, but over time the transcriptome is likely to yield a much more complex and interesting picture.
Apart from Illumina’s new HiSeq X platform (due to brilliant/evil marketing – your choice), pretty much all sequencing platforms can measure the transcriptome using one of the many forms of RNA-Seq. But they’re not all equally good at it.
The (non-existent) ideal RNA-Seq platform would have the following characteristics, more or less in order of importance:
- Lots of reads – at least tens of millions, ideally hundreds of millions; more reads mean greater sensitivity
- Long reads – the longer the read, the easier it is to measure splice isoforms
- Inexpensive platform to operate
- Inexpensive platform to buy
- Quick runs
- High quality reads
The MiSeq, PGM and 454 aren’t great as they really don’t produce enough reads. The various iterations of the HiSeq seem like decent candidates and a lot of great transcriptome studies have been completed on these platforms. But the latest iteration, HiSeq 2500 with v4 chemistry, generates around 4 billion reads per run. Unless you’ve got a LOT of samples, it’s probably a bit of overkill.
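To put the "overkill" in numbers, here is a rough sketch of how many samples a ~4 billion read run could hold. The per-sample depths are my own illustrative assumptions (picked from the tens-to-hundreds-of-millions range in the list above), not platform specs, and the function name is made up for this example.

```python
# Rough sketch: how many samples fit in one run at a given per-sample depth.
# The run size comes from the post (~4 billion reads, HiSeq 2500 v4);
# the per-sample depths below are illustrative assumptions only.

HISEQ_2500_V4_READS = 4_000_000_000


def samples_per_run(reads_per_run: int, reads_per_sample: int) -> int:
    """Whole number of samples that fit in a run at the requested depth."""
    if reads_per_sample <= 0:
        raise ValueError("reads_per_sample must be positive")
    return reads_per_run // reads_per_sample


if __name__ == "__main__":
    for depth in (10_000_000, 30_000_000, 100_000_000):
        n = samples_per_run(HISEQ_2500_V4_READS, depth)
        print(f"{depth // 1_000_000}M reads/sample: about {n} samples per run")
```

Even at 100M reads per sample you would need dozens of samples to fill a run, which is the point: great for big cohorts, awkward for a small experiment.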
But two platforms seem to fall in the sweet spot – the Ion Torrent Proton with the PI chip and the Illumina NextSeq 500. The running costs for the two platforms are roughly the same – $1000 will give a bit over 80M reads on the PI and ~100M on the NextSeq. The NextSeq has the edge in read length (2×150 bp paired-end rather than 100-200 bp single reads on the PI), but the PI wins out on runtime, generating results in 3-4 hours. If Thermo ever launches the PII chip, the cost per 100M reads would drop to ~$300. Given how well the Proton competes in this space, it’s a little surprising that it isn’t marketed more heavily for transcriptome sequencing.
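For anyone who wants to compare these on a common scale, the figures quoted above can be turned into a cost per million reads. This is just back-of-the-envelope arithmetic: the inputs are the rough numbers from this post ("a bit over 80M" is rounded to 80M, and the PII figure is a projection for a chip that hasn't launched), and the dictionary and function names are made up for illustration.

```python
# Back-of-the-envelope cost per million reads, using the rough figures
# quoted in the post. These are approximations, not vendor pricing.

# platform -> (run cost in USD, reads in millions)
PLATFORMS = {
    "Proton PI": (1000, 80),               # "a bit over 80M", rounded down
    "NextSeq 500": (1000, 100),            # ~100M reads for $1000
    "Proton PII (projected)": (300, 100),  # ~$300 per 100M, if it launches
}


def cost_per_million_reads(cost_usd: float, reads_millions: float) -> float:
    """Dollars spent per million reads."""
    if reads_millions <= 0:
        raise ValueError("reads_millions must be positive")
    return cost_usd / reads_millions


if __name__ == "__main__":
    for name, (cost, reads) in PLATFORMS.items():
        per_million = cost_per_million_reads(cost, reads)
        print(f"{name}: ${per_million:.2f} per million reads")
```

On these numbers the PI and NextSeq land within a few dollars per million reads of each other, so the choice comes down to read length versus turnaround time rather than running cost.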
The Honorable Mention
PacBio is kind of a special case – it doesn’t produce anywhere near enough reads for ‘counting’ applications, but its ultra-long reads (approaching an average of 10 kb) are uniquely suited to reading entire transcripts in one go, making it easy to tell exactly which isoform is present. The method, which is still being rolled out, is called Iso-Seq. The shorter-read platforms can attempt this, but it involves creating a mini-assembly of each transcript, and there’s only so much info you can glean that way.