The office of Jeff Hammerbacher at Mount Sinai's Icahn School of Medicine sits in the middle of one of the most stark economic divides in the nation. To Hammerbacher’s south are New York City’s posh Upper East Side townhouses. To the north, the barrios of East Harlem.
What's below in the basement may be what's most interesting: Minerva, a humming supercomputer installed last year that's named after the Roman goddess of wisdom and medicine.
It’s rare to find a supercomputer in a hospital, even a major research center and medical school like Mount Sinai. But it’s also rare to find people like Hammerbacher, a sort of human supercomputer who is best known for launching Facebook’s data science team and, later, co-founding Cloudera, a top Silicon Valley "big data" software company where he is chief scientist today. After moving to New York this year to dive into a new role as a researcher at Sinai’s medical school, he is setting up a second powerful computing cluster, called Demeter, based on Cloudera’s software, and building tools to better store, process, and mine data and to build models from it. "They generate a pretty good amount of data," he says of the hospital’s existing electronic medical record system and its data warehouse, which stored 300 million new "events" last year. "But I would say they are only scratching the surface."
Combined, the circumstances make for one of the most interesting experiments happening in hospitals right now—one that gives a peek into the future of health care in a world where the amount of data about our own health, from our genomes to our Jawbone tracking devices, is exploding.
"What we’re trying to build is a learning health care system," says Joel Dudley, director of biomedical informatics for the medical school. "We first need to collect the data on a large population of people and connect that to outcomes."
To imagine what the hospital of the future could look like at Mount Sinai, picture how companies like Netflix and Amazon and even Facebook work today. These companies gather data about their users, then run that data through predictive models and recommendation systems they’ve developed. Those systems usually take into account a person’s past history, maybe his or her history in other places on the web, and the history of "similar" users to make a best guess about the future: what a person wants to buy or see, or what advertisement might entice them.
Through real-time data mining on a large scale, on massive computers like Minerva, hospitals could eventually operate in similar ways, both to improve health outcomes for individual patients who enter Mount Sinai’s doors and to make new discoveries about how to diagnose, treat, and prevent diseases at a broader, public health scale. "It’s almost like the Hadron Collider approach," Dudley says. "Let’s throw in everything we think we know about biology and let’s just look at the raw measurements of how these things are moving within a large population. Eventually the data will tell us how biology is wired up."
Dudley glances at his screen to show a very early inkling of what "big data" could mean for health care and medical research.
On it (see the figure above) is a visualization of the health data of 30,000 Sinai patients who have volunteered to share their information with researchers. He points out, in color, three separate clusters of the people who have Type 2 diabetes. What we're looking at could be an entirely new notion of a highly scrutinized disease. "Why this is interesting is we could really be looking at Type 2, Type 3, and Type 4 diabetes," says Dudley. "Right now, we have very coarse definitions of disease which are not very data-driven." (Patients on the map are grouped by how closely related their health data is, based on clinical readings like blood sugar and cholesterol.)
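To make the idea of grouping patients by "how closely related their health data is" a little more concrete, here is a minimal sketch in Python. It is not Sinai's method, which the article does not describe. The four patients, their readings, and the function names are all invented for illustration. The sketch rescales each reading so that blood sugar and cholesterol count equally, then finds each patient's closest neighbor by straight-line distance.

```python
import math
from statistics import mean, pstdev

# Hypothetical readings, not real patient data:
# (fasting blood sugar in mg/dL, total cholesterol in mg/dL)
patients = {
    "A": (95.0, 180.0),
    "B": (100.0, 185.0),
    "C": (180.0, 240.0),
    "D": (175.0, 250.0),
}

def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) for col in columns]
    return [
        tuple((x - m) / s for x, m, s in zip(row, mus, sigmas))
        for row in rows
    ]

names = list(patients)
scaled = dict(zip(names, standardize(list(patients.values()))))

def nearest(name):
    """Return the other patient whose standardized readings are closest."""
    others = [n for n in names if n != name]
    return min(others, key=lambda n: math.dist(scaled[name], scaled[n]))

for name in names:
    print(name, "is most similar to", nearest(name))
```

A real analysis would involve many more measurements per patient and a proper clustering or dimensionality-reduction method; the point here is only that "similarity" can be reduced to a distance between rows of numbers.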
From this map and others like it, Dudley might be able to pinpoint genes that are unique to diabetes patients in the different clusters, giving new ways to understand how our genes and environments are linked to disease, symptoms, and treatments. In another configuration of the map, Dudley shows how racial and ethnic genetic differences may define different patterns of a disease like diabetes—and ultimately, require different treatments.
These are just a handful of small examples of what could be done with more data on patients in one location, combined with the power to process it. In the same way Facebook shows the social network, this data set is the clinical network. (The eventual goal is to enroll 100,000 patients in what’s called the BioMe platform to explore the possibilities in having access to massive amounts of data.) "There’s nothing like that right now—where we have a sort of predictive modeling engine that’s built into a health care system," Dudley says. "Those methods exist. The technology exists, and why we’re not using that for health care right now is kind of crazy."
While Sinai’s goal is to use these methods to bring about more personalized diagnoses and treatments for a wide variety of diseases, such as cancer or diabetes, and improve patient care in the hospital, there are basic challenges that need to be overcome to make this vision achievable.
Almost every web company was born swimming in easily harvested and mined data about users, but in health care, the struggle has long been simpler: get health records digitized and keep them private, but make them available to individual doctors, insurers, billing departments, and patients when they need them. There’s not even a hospital’s version of a search engine for all its data yet, says Hammerbacher, and given the state of the slow-moving world of health care today, making predictions that would prevent disease could be just the icing on the cake. "Simply centralizing the data and making it easily available to a broad base of researchers and clinicians will be a powerful tool for developing new models that help us understand and treat disease," he says.
Sinai is starting to put some of these ideas into clinical practice at the hospital. For example, in a hint of more personalized medicine that could come one day, the FDA is beginning to issue labels for some medicines that dictate different doses for patients who have a specific genetic variant (or perhaps explain that they should avoid the medicine altogether). The "Clipmerge" software that the hospital is now beginning to use makes it easier for doctors to quickly search for and be notified of these kinds of potential interactions on an electronic medical record form.
On the prediction side, the hospital has already built a predictive model called PACT into its electronic medical record system. It is used to predict the likelihood that a discharged patient will come back to the hospital within 30 days (the new health care law creates financial incentives for hospitals to reduce their 30-day readmission rate). Based on the prediction, a high-risk patient at the medical center might now actually receive different care, such as being assigned a post-care coordinator.
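The article does not say what goes into PACT, so the following Python sketch is only an illustration of the general pattern: a model turns a few facts about a patient into a readmission probability, and a cutoff decides who gets extra follow-up. The input fields, weights, intercept, and threshold are all hypothetical, and a simple logistic score stands in for whatever model the hospital actually uses.

```python
import math

# Hypothetical weights and inputs; the article does not describe
# PACT's actual variables or coefficients.
WEIGHTS = {
    "prior_admissions": 0.6,    # admissions in the past year
    "chronic_conditions": 0.4,  # number of chronic diagnoses
    "lives_alone": 0.8,         # 1 if yes, 0 if no
}
INTERCEPT = -3.0
HIGH_RISK_THRESHOLD = 0.5  # assumed cutoff for extra follow-up

def readmission_risk(patient):
    """Logistic score: map a weighted sum of inputs to a probability."""
    z = INTERCEPT + sum(WEIGHTS[key] * patient[key] for key in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def care_plan(patient):
    """Choose a discharge plan based on the predicted risk."""
    if readmission_risk(patient) >= HIGH_RISK_THRESHOLD:
        return "assign post-care coordinator"
    return "standard discharge"

low = {"prior_admissions": 0, "chronic_conditions": 1, "lives_alone": 0}
high = {"prior_admissions": 3, "chronic_conditions": 2, "lives_alone": 1}

print(care_plan(low))
print(care_plan(high))
```

In practice the weights would be fitted to historical discharge records rather than chosen by hand, and the threshold would reflect how many coordinators the hospital can staff.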
Eventually, there will be new kinds of data that can be put in mineable formats and linked to electronic patient records, from patient satisfaction surveys and doctors’ clinical notes to imaging data from MRI scans, Dudley says.
Right now, for example, the growing volumes of data generated from people’s fitness and health trackers are interesting on the surface, but it’s hard to glean anything meaningful for individuals. But when the data from thousands of people are mined for signals and links to health outcomes, Dudley says, it’s likely to prove valuable in understanding new ways to prevent disease or detect it at the earliest signs.
A major limitation to this vision is the hospital’s access to all of these new kinds of data. There are strict federal laws that govern patient privacy, which can make doctors loath to experiment with ways to gather it or unleash it. And there are many hoops to jump through today in transferring patient data from one hospital or doctor to another, let alone from all the fitness trackers floating around. If patients start demanding more control over their own health data and voluntarily provide it to doctors, as Dudley believes patients will start to do, privacy could become a concern in ways people don’t expect or foresee today, just as it has on the Internet.
One thing is clear: As the health care system comes under pressure to cut costs and implement more preventative care, these ideas will become more relevant. Says Dudley: "A lot of people do research on computers, but I think what we’re hoping for is that we’re going to build a health care system where complex models ... are firing on an almost day-to-day basis. As patients are getting information about them put in the electronic medical record system there will be this engine in the background."