New Delhi, Jan. 18: A promise of anonymity in genomic research may be misleading.
A new study has shown that it is possible to determine the identities of people who have anonymously contributed genetic material for research, exposing gaps in existing security structures for personal genetic information.
A team of US scientists has combined genomic information with Internet search tools and public databases to identify nearly 50 individuals in America who had provided their genetic material as participants in genomic studies.
The researchers first used genetic markers found on the Y-chromosome, which passes from father to son, to infer the surnames of the individuals, and then used several public databases to determine their identities. The study appears today in the US journal Science.
“Our results show it is possible to identify some individuals who may have wished to remain anonymous,” said Yaniv Erlich, principal investigator at the Whitehead Institute for Biomedical Research at the Massachusetts Institute of Technology and senior author of the study.
Erlich and his colleagues have not revealed the identities of these individuals but shared their methodology of breaching privacy with senior staff at the US National Institutes of Health before the publication of the research.
The researchers hope the study will stimulate debate and research on policy guidelines and security procedures to protect the privacy of genetic information that can potentially reveal health information about individuals.
“We don’t want people to stop donating genetic material and we want the public sharing of data to continue,” Erlich told The Telegraph. “This study was intended to help policy makers and individuals appreciate the benefits and the risks of releasing genetic data.”
The Whitehead researchers used genetic markers on the Y-chromosome called short tandem repeats, or Y-STRs, which can provide information on paternal lineages when combined with recreational genetic genealogy databases.
The scientists used the two largest public genetic genealogy databases, www.ysearch.org and www.smgf.org, which are free of charge and equipped with search engines.
The Y-STR genetic markers correlate with surnames because surnames, in most human cultures, move down from fathers to sons — just as the Y chromosome does.
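The surname-inference idea can be illustrated with a minimal sketch. It is not the researchers' actual method: the matching rule (summed difference in repeat counts), the `max_distance` threshold, the function names and every record in `TOY_DATABASE` are assumptions made up for illustration. Only the marker labels follow the usual Y-STR naming style.

```python
from collections import Counter

# Toy genealogy records: (surname, {Y-STR marker: repeat count}).
# All surnames and repeat counts here are invented for illustration.
TOY_DATABASE = [
    ("SurnameA", {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}),
    ("SurnameA", {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 10}),
    ("SurnameB", {"DYS393": 12, "DYS390": 22, "DYS19": 15, "DYS391": 10}),
    ("SurnameC", {"DYS393": 14, "DYS390": 23, "DYS19": 16, "DYS391": 11}),
]


def marker_distance(query, record):
    """Sum of absolute repeat-count differences over shared markers.

    Returns None when the two profiles have no marker in common.
    """
    shared = query.keys() & record.keys()
    if not shared:
        return None
    return sum(abs(query[m] - record[m]) for m in shared)


def infer_surname(query, database, max_distance=1):
    """Return the surname with the most close matches, or None.

    A record counts as a close match when its distance to the query
    is at most max_distance (a hypothetical threshold).
    """
    votes = Counter()
    for surname, profile in database:
        distance = marker_distance(query, profile)
        if distance is not None and distance <= max_distance:
            votes[surname] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# A hypothetical Y-STR profile taken from an "anonymous" genome.
query = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
print(infer_surname(query, TOY_DATABASE))  # prints "SurnameA"
```

In this toy setting, a profile with no close match returns `None`, which reflects the point that surname inference works only for some individuals.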
Erlich and his colleagues first tested their methodology on the genome of a known individual, the US genome sequencing pioneer Craig Venter. Combining Y-STR data with surname tracing and other public databases, they correctly identified him.
In the next phase of the study, the researchers traced the identities of nearly 50 individuals, first by inferring surnames, then by using databases of state records, dates of birth, addresses and other demographic information.
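Why an inferred surname becomes identifying once demographic details are added can be shown with a small sketch. The record fields (`birth_year`, `state`), the function name and all the records below are hypothetical, chosen only to show how each extra attribute shrinks the pool of candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicRecord:
    # Hypothetical fields; real public records differ by source.
    person_id: str
    surname: str
    birth_year: int
    state: str


# Invented records for illustration only.
TOY_RECORDS = [
    PublicRecord("p1", "SurnameA", 1950, "StateX"),
    PublicRecord("p2", "SurnameA", 1950, "StateY"),
    PublicRecord("p3", "SurnameA", 1972, "StateX"),
    PublicRecord("p4", "SurnameB", 1950, "StateX"),
    PublicRecord("p5", "SurnameC", 1950, "StateX"),
]


def narrow_candidates(records, surname=None, birth_year=None, state=None):
    """Keep records matching every attribute that was supplied."""
    matches = []
    for record in records:
        if surname is not None and record.surname != surname:
            continue
        if birth_year is not None and record.birth_year != birth_year:
            continue
        if state is not None and record.state != state:
            continue
        matches.append(record)
    return matches


# Each added attribute reduces the number of remaining candidates.
by_surname = narrow_candidates(TOY_RECORDS, surname="SurnameA")
by_surname_year = narrow_candidates(
    TOY_RECORDS, surname="SurnameA", birth_year=1950
)
by_all = narrow_candidates(
    TOY_RECORDS, surname="SurnameA", birth_year=1950, state="StateX"
)
print(len(by_surname), len(by_surname_year), len(by_all))  # prints "3 2 1"
```

In the toy data, the surname alone leaves three candidates, while surname, year of birth and state together leave exactly one.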
Indian scientists involved in genomic research said India lacked rich public databases with similar demographic and personal information about its citizens.
“This (breach of privacy) can be done only when highly organised databases with information about the population are available,” said a senior scientist at the Institute of Genomics and Integrative Biology, New Delhi, who was not connected with the US study.
Biologists caution that the utility of genomic information for predicting health outcomes of individuals is still limited.