The $999 Genome Is Less Important Than What We Do With It

For years, cheap genome sequencing has been rendered useless by the fact that computers couldn’t do anything helpful with the data. That is starting to change, and a real genomics revolution is at hand.

Innovation in DNA sequencing methods has advanced so rapidly since the end of the Human Genome Project in 2003 that, last month, Illumina Inc. and Life Technologies Inc. (the leading companies in the field) each announced new products that can sequence a genome for $1,000 in a single day. That’s 3 million times cheaper than during the Human Genome Project.

The announcements from these companies were followed by healthy gains in their stock prices. Yet the economics point the other way: the more productive you make these machines, the fewer machines customers need. The more productive you make the reagents that drive these machines, the less often customers reorder. The only guarantee in this story is that the customer will happily enjoy plummeting prices while sequencing companies quietly compete themselves out of business.

The industry has heralded the promise of the genome for years, but if the bellwether companies are commoditizing themselves into oblivion, where is the value?

The human genome is big: about 3 billion units called bases. When companies sequence a genome, they try to read all those bases at least 30 times, if not 45, or even 100. That’s to be sure they’ve read each base enough times to correct for errors in sequencing. Once you have this raw data, you need to identify the places where this genome is different from a so-called “normal” genome. That’s a massive “big data” problem that combines parallel cloud computing and storage with biology. And when you’re done with that, you’re left with somewhere around 3 million genetic “abnormalities” per genome, that is, places where you see variation as compared to the canonical human genome that was sequenced in the 1990s.
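To make the idea of repeated reads concrete, here is a toy sketch in Python. It is an illustration under simplifying assumptions, not how a production pipeline works: the function names, the six-base "reference," and the sample reads are all made up for this example, and it ignores read alignment, quality scores, and the fact that humans carry two copies of each chromosome. It shows only how reading each position several times lets a majority vote absorb an occasional sequencing error before the result is compared to the reference.

```python
from collections import Counter

# Figures from the article: about 3 billion bases, each read about 30 times.
GENOME_SIZE = 3_000_000_000
COVERAGE = 30
bases_read = GENOME_SIZE * COVERAGE  # total bases the sequencer must read


def consensus_base(observed_bases):
    """Return the base seen most often among the reads at one position."""
    return Counter(observed_bases).most_common(1)[0][0]


def call_variants(reference, pileup):
    """List positions where the consensus differs from the reference.

    reference: string of bases for the reference genome (hypothetical toy data)
    pileup: one string per position, holding every base read at that position
    """
    variants = []
    for position, (ref_base, observed) in enumerate(zip(reference, pileup)):
        called = consensus_base(observed)
        if called != ref_base:
            variants.append((position, ref_base, called))
    return variants


# Toy data: position 2 contains a single misread ("T") that the majority
# vote corrects; position 4 differs from the reference in every read.
reference = "ACGTAC"
pileup = ["AAAA", "CCCC", "GGTG", "TTTT", "GGGG", "CCCC"]
variants = call_variants(reference, pileup)
```

In this toy case only position 4 is reported as a variant. Scaling the same comparison to 90 billion raw bases per genome is what turns it into the computing and storage problem described above.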

So which of those aberrations matter? How does one go from 3 million genetic variants to a diagnosis that drives a single choice about patient treatment, and do so accurately, reliably, and reproducibly every time?

You may have heard that this problem, dubbed “genome interpretation,” is the bottleneck in genome sequencing technology: We can sequence the genes, but that doesn’t mean we know what they mean. The data sets are so large and unwieldy that the pace of computer technology can’t keep up with the pace of sequencing. The $1,000 genome requires a “$1 million interpretation.”

But that was last year’s problem. Enabled by a new breed of cloud-based, big-data software companies built for clinical diagnostics, molecular labs across the country are using whole and partial genome sequencing to automate and operationalize real diagnostics. Physicians now have access to accurate diagnoses twice as fast and at one-twentieth of the cost. All of these changes are driven by advances from software companies, whose innovation over the last 18 months has begun to outpace that of the sequencing companies.

While legitimate questions about reimbursement and regulatory pathways exist, software that interprets this data is the bridge to bringing sequencing to the clinic. Powered by cheap sequencing and high-octane software, the promise of genomics is about to come true: a revolution in the diagnostics industry with faster, cost-effective tests that benefit physicians and the patients they treat.

  • Eebbjj

    That case is an exciting glimpse of what the future may hold, but the process was far from reliable and reproducible... According to JAMA, 90% of doctors in 2011 reported being uncomfortable dealing with genomic data. Until we begin to bridge that gap, stories like the twins' will continue to be limited to research environments like Baylor.

  • Jesse Paquette

    For whole-genome sequencing to provide an advantage over small, targeted molecular tests, there needs to be at least one clinical-quality database that connects -omic variants (i.e. "abnormalities" in gene/exon expression, SNPs, somatic mutations, translocations, etc.) with patient outcomes (disease, prognosis, therapeutic response), with copious links to original research literature.

    The tools on the user interface side are a simple issue compared to the variant->outcome database.  And if the variant->outcome DB were publicly funded and open-access (fat chance), the rates of innovation in genome-wide clinical interpretation/personalized medicine would explode.

    One more interesting issue: if a certain -omic variant increases drug response in cell lines and/or mice, is there enough evidence to assume it would also increase drug response in a human with that variant?  There will soon be MDs (oncologists) faced with that choice for cancer patients with no good therapy options.  And lawyers who might love to sue the pants off developers of software that recommends the therapy...