Genomic Privacy Project

Re-Identification of DNA through an Automated Linkage Process

by Bradley Malin and Latanya Sweeney


This work demonstrates how seemingly anonymous DNA database entries can be linked to publicly available health information to uniquely identify the persons who are the subjects of the data. The linkage succeeds even though the DNA records carry no explicit identifiers such as name, address, or Social Security number, and no additional fields of personal information. The software program REID (Re-Identification of DNA) iteratively uncovers unique occurrences in visit-disease patterns across data collections; these occurrences reveal inferences about the identities of the patients who are the subjects of the DNA. Using real-world data, REID established identifiable linkages in 33-100% of the 10,886 cases explicitly surveyed over 8 gene-based diseases.
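To make the idea of iteratively uncovering unique visit patterns concrete, here is a minimal Python sketch. It is not the REID program itself. The function name, the record layout, and the matching rule are all assumptions made for illustration: each record is reduced to the set of hospitals visited for one disease, a DNA record is treated as compatible with an identified health record when its hospital set is contained in that record's set, and every DNA subject is assumed to appear exactly once in the identified collection.

```python
def link_unique_patterns(dna, clinical):
    """Link de-identified DNA records to identified health records.

    Hypothetical illustration only, not the published REID algorithm.
    dna:      dict mapping a DNA record id to a frozenset of hospitals.
    clinical: dict mapping a patient identity to a frozenset of hospitals.
    Returns a dict mapping DNA record ids to patient identities.
    """
    dna_left = dict(dna)            # DNA records not yet linked
    clinical_left = dict(clinical)  # identities not yet claimed
    links = {}

    progress = True
    while progress:
        progress = False
        for dna_id, pattern in list(dna_left.items()):
            # Identities whose visits cover every hospital in the DNA pattern.
            candidates = [
                person
                for person, visits in clinical_left.items()
                if pattern <= visits
            ]
            if len(candidates) == 1:
                # A unique candidate is an identifying linkage. Removing both
                # records can make other DNA records unique on the next pass.
                person = candidates[0]
                links[dna_id] = person
                del dna_left[dna_id]
                del clinical_left[person]
                progress = True
    return links


# Toy data, invented for this example.
dna = {
    "dna1": frozenset({"H1"}),
    "dna2": frozenset({"H1", "H2"}),
    "dna3": frozenset({"H3"}),
}
clinical = {
    "alice": frozenset({"H1", "H2"}),
    "bob": frozenset({"H1"}),
    "carol": frozenset({"H3"}),
    "dave": frozenset({"H3"}),
}

links = link_unique_patterns(dna, clinical)
print(links)
```

In this toy run, "dna2" is unique on the first pass and links to "alice"; once "alice" is removed, "dna1" has only "bob" left and links on the second pass. "dna3" stays unlinked because "carol" and "dave" share the same pattern, which shows how some cases remain unidentifiable.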

Keywords: DNA privacy, genetic privacy, privacy technology

Malin, B. and Sweeney, L. Re-Identification of DNA through an Automated Linkage Process. Proceedings, Journal of the American Medical Informatics Association (AMIA). Washington, DC: Hanley & Belfus, Inc. Nov 2001; 423-427. Also available on MEDLINE.

Copyright © 2011. President and Fellows Harvard University.