Saturday, November 1, 2014

Anonymity Proves Elusive as DNA Analysis Gains Precision and Speed

Animation of the structure of a section of DNA. The bases lie horizontally between the two spiraling strands. (Photo credit: Wikipedia)

BY GINA KOLATA


Not so long ago, people who provided DNA for research were told their privacy was assured. Their DNA sequences were available on Web sites, yes, but they did not include names or other identifiers. These were research databases, scientists said, not like the forensic ones kept by the F.B.I.

But lately geneticists have been given hints that subjects in fact could sometimes be identified by their DNA alone. In January, a researcher at the Whitehead Institute, which is affiliated with the Massachusetts Institute of Technology, managed to track down five people selected at random from a database using only their DNA, ages and the states in which they lived. And he did it in just hours. He also found relatives – a total of close to 50 people.

This month an international group of nearly 80 researchers, patient advocates, universities and organizations like the National Institutes of Health in the United States announced that it wants to consolidate the world’s databases of DNA and other genetic information, making data easier for researchers to retrieve and share. But the security and privacy of the study subjects are paramount concerns, said Dr. David Altshuler of the Broad Institute of Harvard University and M.I.T., a leader of the group.

“The problems are not yet solved in any general way,” Dr. Altshuler said.

In 2008, David W. Craig, a geneticist at TGen, a research institute in Phoenix, Arizona, theoretically proved that a particular person’s DNA could be found amid a mass of other samples. His method involved using combinations of hundreds of thousands of DNA markers.

The N.I.H. quickly responded, moving all genetic data from the studies it financed behind Internet firewalls.

But another sort of genetic data – so-called RNA expression profiles that show patterns of gene activity – was still public. Such data could not be used to identify people, or so it was thought.

Then Eric E. Schadt of Mount Sinai School of Medicine in New York discovered that RNA expression data could be used not only to identify someone but also to learn a great deal about that person. “We can create a profile that reflects your weight, whether you are diabetic, how old you are,” Dr. Schadt said. He and a colleague were also able to tell if a person is infected with viruses, like HPV or H.I.V., that change the activity of genes.

Then, this year, Yaniv Erlich, a genetics researcher at the Whitehead Institute, used a new computational tool he had invented to identify by name five people from their DNA, which he had randomly selected from a research database containing the genes of 1,000 people.

Experts were startled. “We are in what I call an awareness moment,” said Eric D. Green of the National Human Genome Research Institute at the National Institutes of Health in Bethesda, Maryland.

Research subjects who share their DNA may risk a loss of not just their own privacy but also that of their children and grandchildren, who will inherit many of the same genes, said Mark B. Gerstein, a Yale University professor who studies large genetic databases.

George Church, a Harvard geneticist, said there appears to be no technical solution to the issue of DNA privacy. “If you believe you can just encrypt terabytes of data or anonymize them, there will always be people who hack through that,” Dr. Church said.

People who provide genetic information, he said, should simply be informed that a loss of privacy is likely, rather than unlikely.


Taken from TODAY Saturday Edition, June 29, 2013