Guest Post: Sequencing for the microbiological masses
Despite being in its relative infancy, genome sequencing (and the technologies that drive it) has become central to much of the molecular biology that we take for granted. In this guest post, Nick Tucker takes a closer look at the past, present and potential future of DNA sequencing as it becomes cheaper and more readily available.
Mining databases of microbial genomes has rapidly become a routine part of experimental work for microbiologists the world over. It must be impossible for this year’s new intake of PhD students to imagine a world without them. But let’s reminisce for a moment, just to see how far we’ve come.
If you needed to know the sequence of a piece of DNA in the 1970s, you had two choices: you could use either Maxam and Gilbert’s chemical sequencing or Sanger and Coulson’s chain termination method. These produced read lengths of around 100 base pairs (bp), so you can imagine how long it took to complete the first sequenced genome, that of bacteriophage ΦX174, which has 5386 bp (and was published ten months before I was born).
Fluorescence-based sequencing, developed from the Sanger method, has been the mainstay of DNA sequencing for over 30 years. It was used to produce the first genome sequence of a free-living organism, Haemophilus influenzae, in 1995 and the first draft of the human genome in 2001. DNA sequencing on this scale is expensive and labour intensive, so these projects were generally restricted to a few centres dotted around the world, such as the Wellcome Trust Sanger Institute in Cambridgeshire.
As a result of these large projects, everything began to change, as these graphs show. In 2001, the cost of 1 Mbp (1 million bp) of DNA sequence was approximately US$5000; now, it’s less than ten cents. The driver for this cost reduction has been competition between companies and platforms such as Illumina, Roche 454, PacBio, Ion Torrent and SOLiD, each with its own sequencing method.
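To put those two figures side by side, here is a rough back-of-envelope sketch in Python. The per-Mbp costs are the ones quoted above (with ten cents taken as an upper bound); the 5 Mbp genome and 30x coverage in the example are purely illustrative assumptions, and the function names are made up for this sketch. It counts raw sequence only and ignores library preparation, labour and the cost of the instrument itself.

```python
# Per-Mbp costs as quoted in the post (USD).
COST_PER_MBP_2001_USD = 5000.0
COST_PER_MBP_NOW_USD = 0.10  # "less than ten cents", so an upper bound


def fold_reduction(old_cost: float, new_cost: float) -> float:
    """How many times cheaper the new cost is than the old one."""
    if new_cost <= 0:
        raise ValueError("new_cost must be positive")
    return old_cost / new_cost


def raw_sequence_cost(genome_mbp: float, coverage: float, cost_per_mbp: float) -> float:
    """Cost of the raw sequence alone: genome size x coverage x price per Mbp."""
    return genome_mbp * coverage * cost_per_mbp


if __name__ == "__main__":
    # Hypothetical example: a 5 Mbp bacterial genome at 30x coverage.
    genome_mbp, coverage = 5.0, 30.0

    fold = fold_reduction(COST_PER_MBP_2001_USD, COST_PER_MBP_NOW_USD)
    then = raw_sequence_cost(genome_mbp, coverage, COST_PER_MBP_2001_USD)
    now = raw_sequence_cost(genome_mbp, coverage, COST_PER_MBP_NOW_USD)

    print(f"Fold reduction per Mbp: at least {fold:,.0f}x")
    print(f"Raw sequence cost at 2001 prices: ${then:,.2f}")
    print(f"Raw sequence cost at current prices: ${now:,.2f}")
```

Taking the quoted figures at face value, that is a drop of at least 50,000-fold per Mbp, which is why the per-base cost has largely stopped being the limiting factor for a single bacterial genome.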
These platforms can generate millions of DNA reads, each between 50 and 1000 bp long, in just a few hours. It is now possible to generate data for ~100 draft bacterial genome assemblies on an Illumina GA2 sequencer in a single run. However, given the high cost of these machines, the old model of having a few large, well-funded sequencing hubs still applies. BBSRC clearly agrees with this, as demonstrated by its investment in The Genome Analysis Centre (TGAC). Having said that, many smaller university departments are also investing in these technologies, as can be seen in this excellent map, maintained by Nick Loman at the University of Birmingham.
So where do we go from here? The answer appears to be the advent of scaled-down benchtop sequencers. There are at least three offerings in labs around the world: Illumina’s MiSeq, Life Technologies’ Ion Torrent and Roche’s GS Junior. These instruments are much cheaper to buy and run than their bigger brothers and have the ability to put the genomics revolution directly into the hands of individual research groups: sequencing for the microbiological masses.
Imagine that you isolate a new strain of Streptomyces that produces a novel antibiotic compound: you can now sequence that strain in less than a week and start identifying the genes required to make this new compound (just like Microbelog’s Matt has been doing). It remains to be seen what impact these benchtop machines may have on clinical medicine; we’re a long way from having sequencers in clinical labs and surgeries. That said, the recent crowdsourcing effort to sequence and analyse the E. coli O104:H4 outbreak strain in Germany demonstrates how quickly such data can become available to clinicians and researchers alike.
One of the big questions is this: is the traditional model of funding a few centralised genomics hubs still valid? In 1995, sequencing the H. influenzae genome was a large genomics project. It required many highly trained people, from microbiologists and molecular biologists to bioinformaticians. This is now a relatively small task that can be run by one or two people in less than a week using a benchtop sequencer. It should be noted, however, that much of the expertise (both lab and computational) required for this work has been developed and disseminated by large sequencing centres like the Sanger Institute.
At the recent SGM Autumn meeting in York, it became clear that many projects now contend with sequencing as many as 100 strains in a single study. This is clearly beyond the capability of benchtop instruments and individual labs. However, if you just want to sequence a couple of unusual strains, you could be in for a bit of a wait (and a lot of paperwork) to get these done at a large sequencing centre. There’s no hanging about with a benchtop instrument.
So, do I think the future of DNA sequencing is in benchtop machines? Well, we’re taking delivery of a GS Junior shortly, so ask me in a couple of months…
Image credit: Matt Hutchings