Can machine-learning techniques identify disease-carrying species and predict epidemics?
In April 2014, just after world health officials identified a series of suspicious deaths in Guinea as an outbreak of Ebola, 10 ecologists, 4 veterinarians, and an anthropologist traveled to a Guinean village named Meliandou. Theirs was a detective mission to determine how this outbreak began. How had "patient zero," a 2-year-old boy named Emile, contracted the Ebola virus?
Because we believe people catch Ebola through contact with infected animals, ecologists have long sought the animal "reservoirs" that harbor the virus and pass it along (often without getting sick themselves). With every new outbreak of a zoonotic disease like Ebola, scientists race to identify the reservoirs so that public health officials can determine the method of transmission and perhaps prevent more "spillover events," in which the disease flows from animal reservoirs to people. Such is today's post hoc, reactive model of dealing with outbreaks.
In Meliandou, the Ebola detectives interviewed villagers, studied primate populations in nearby forests, and collected bats in nets. In December 2014, they published a paper hypothesizing that little Emile had contracted Ebola from a colony of insect-eating bats that lived in a hollow tree near where the local children often played. But the tree had caught fire before the team arrived in the village, and the bats were gone, so the investigators couldn't say for sure.
Because most previous research on Ebola reservoirs has focused on fruit bats, the team's findings may prompt scientists to study this insectivorous bat species, and may cause health officials to stay alert in areas where these bats live in close proximity to people. But these are rearguard maneuvers against a brutal opponent: The current Ebola epidemic has killed more than 11,200 people in West Africa to date, and health officials are still fighting to end it. Is there a way to go on the offensive against Ebola and other zoonotic diseases? Can we predict outbreaks before they occur?
In my research as a disease ecologist at the Cary Institute of Ecosystem Studies, in Millbrook, N.Y., I use computer modeling and machine learning to predict which wild species are capable of causing future outbreaks. My models create "caricatures" of likely reservoirs, revealing the suite of features that distinguish the unusual species that can harbor microbes dangerous to humans. I then use algorithms to sort through hundreds or thousands of species that have never been checked for zoonotic diseases, and calculate the probability that any given species is a disease reservoir based on its similarity to that caricature. The models give us a list of suspects.
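To make the idea of scoring species against a "caricature" concrete, here is a minimal sketch in Python. It is an illustration only, not the models used in this research: it swaps in a plain logistic regression, and the species names, trait names, and trait values are all made up for the example. The learned weights play the role of the caricature (which traits set known reservoirs apart), and the predicted probability ranks unchecked species as suspects.

```python
import math

# Hypothetical traits and toy values, invented for illustration (not real data).
TRAIT_NAMES = ["litters_per_year", "body_mass_kg", "range_index"]

# Species already checked: (trait vector, 1 if known reservoir else 0).
labeled = {
    "species_a": ([4.0, 0.03, 0.90], 1),
    "species_b": ([3.5, 0.05, 0.80], 1),
    "species_c": ([1.0, 0.40, 0.20], 0),
    "species_d": ([0.8, 0.60, 0.30], 0),
}
# Species never checked for zoonotic diseases.
unlabeled = {
    "species_x": [3.8, 0.04, 0.85],
    "species_y": [0.9, 0.50, 0.25],
}


def column_stats(rows):
    """Per-trait mean and standard deviation, used to put traits on one scale."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return means, stds


def standardize(row, means, stds):
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000, l2=0.01):
    """Batch gradient descent for L2-regularized logistic regression."""
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * (g / n + l2 * wj) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Fit on the checked species.
raw_X = [traits for traits, _ in labeled.values()]
y = [label for _, label in labeled.values()]
means, stds = column_stats(raw_X)
X = [standardize(row, means, stds) for row in raw_X]
w, b = fit_logistic(X, y)

# Score the unchecked species and rank them as suspects.
probs = {}
for name, traits in unlabeled.items():
    x = standardize(traits, means, stds)
    probs[name] = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
ranked = sorted(probs, key=probs.get, reverse=True)

# The weights are the "caricature": sign and size show which traits matter.
for trait, weight in zip(TRAIT_NAMES, w):
    print(f"{trait}: weight {weight:+.2f}")
for name in ranked:
    print(f"{name}: estimated reservoir probability {probs[name]:.2f}")
```

With only four labeled species, this toy model says nothing about real animals; a realistic analysis would need far more species and traits, a more flexible model, and validation on species held out from training.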
My colleagues and I do this work in the spirit of scientific inquiry, and also with an urgent sense of purpose. Infectious diseases are on the rise around the world, and the U.S. Agency for International Development reckons that about 75 percent of new diseases are zoonotic. If we can predict which species may carry infections capable of jumping to humans, we can monitor the potential hot spots where people interact with these creatures. I hope that one day biologists will forecast disease outbreaks the way meteorologists forecast the weather, with one major difference: A meteorologist can't stop a storm front, but we may be able to prevent outbreaks.