Data Anonymization

Foundations of Privacy Protection from a Computer Science Perspective

by Latanya Sweeney


Organizations often release and receive person-specific data with all explicit identifiers, such as name, address, and telephone number, removed, on the assumption that privacy is maintained because the resulting data look anonymous. In most of these cases, however, the remaining data can be used to reidentify individuals, either by linking or matching the data to other databases or by looking at unique characteristics found in the fields and records of the database itself. When these less apparent aspects are taken into account, each released record can be altered so that it maps to many possible people, providing a level of anonymity that the record-holder determines. The greater the number of candidates per record, the more anonymous the data.
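The idea of "candidates per record" can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not the method of the paper: the field names (`zip`, `birth_year`, `sex`, `diagnosis`), the sample values, and the helper functions are all hypothetical, and candidates are counted within the released table itself as a simple stand-in for the wider population. Truncating the ZIP code is one assumed way of altering records so that each maps to more candidates.

```python
from collections import Counter

# Hypothetical released records with explicit identifiers already removed.
# All field names and values are invented for illustration.
released = [
    {"zip": "02138", "birth_year": 1965, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1965, "sex": "F", "diagnosis": "flu"},
    {"zip": "02138", "birth_year": 1971, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_year": 1971, "sex": "M", "diagnosis": "asthma"},
]

# Assumed set of fields that could be linked to other databases.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def candidates_per_record(records, fields):
    """For each record, count the records sharing its values on `fields`."""
    keys = [tuple(r[f] for f in fields) for r in records]
    counts = Counter(keys)
    return [counts[k] for k in keys]


def anonymity_level(records, fields):
    """Smallest candidate count over all records (0 for an empty release)."""
    return min(candidates_per_record(records, fields), default=0)


def generalize_zip(record, digits=3):
    """Return a copy of `record` with the ZIP code masked after `digits` characters."""
    out = dict(record)
    zip_code = record["zip"]
    out["zip"] = zip_code[:digits] + "*" * (len(zip_code) - digits)
    return out


if __name__ == "__main__":
    print(anonymity_level(released, QUASI_IDENTIFIERS))
    generalized = [generalize_zip(r) for r in released]
    print(anonymity_level(generalized, QUASI_IDENTIFIERS))
```

In this toy table every record is unique on the assumed linking fields, so each has a single candidate. After the ZIP codes are masked to three digits, every record shares its values with one other record, so the smallest candidate count rises from 1 to 2.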

L. Sweeney. Foundations of Privacy Protection from a Computer Science Perspective. Proceedings, Joint Statistical Meeting, AAAS, Indianapolis, IN. 2000.

Copyright © 2011. President and Fellows Harvard University.