Datafly Anonymization

Datafly: a system for providing anonymity in medical data

by Latanya Sweeney



We present a computer program named Datafly that maintains anonymity in medical data by automatically generalizing, substituting, inserting and removing information as appropriate without losing many of the details found within the data. Decisions are made at the field and record level at the time of database access, so the approach can be used on the fly in role-based security within an institution, and in batch mode for exporting data from an institution. Often organizations release and receive medical data with all explicit identifiers, such as name, address, phone number, and Social Security number, removed in the incorrect belief that patient confidentiality is maintained because the resulting data look anonymous; however, we show that in most of these cases, the remaining data can be used to re-identify individuals by linking or matching the data to other databases or by looking at unique characteristics found in the fields and records of the database itself. When these less apparent aspects are taken into account, each released record can be made to ambiguously map to many possible people, providing a level of anonymity which the user determines.
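To make the idea of generalizing fields until each released record maps to many possible people more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions and not the Datafly program itself: the function names (`mask_suffix`, `anonymize`), the choice to generalize by masking trailing characters, the rule of generalizing the field with the most distinct values, and the parameter `k` (the minimum number of records that must share each released combination of values) are all hypothetical choices made for this example.

```python
from collections import Counter


def mask_suffix(value, level):
    """Generalize a string by replacing its last `level` characters with '*'."""
    level = min(level, len(value))
    return value[:len(value) - level] + "*" * level


def anonymize(records, fields, k):
    """Generalize `fields` until every released combination occurs at least k times.

    Hypothetical sketch: records are dicts of strings. Records whose
    combination is still rarer than k after full generalization are removed.
    Returns the released records and the generalization level used per field.
    """
    levels = {f: 0 for f in fields}
    if not records:
        return [], levels
    max_level = {f: max(len(r[f]) for r in records) for f in fields}
    index = {f: i for i, f in enumerate(fields)}

    while True:
        # Build the generalized view of every record at the current levels.
        view = [tuple(mask_suffix(r[f], levels[f]) for f in fields) for r in records]
        counts = Counter(view)
        if all(counts[v] >= k for v in view):
            break
        # Fields that can still be generalized one more step.
        candidates = [f for f in fields if levels[f] < max_level[f]]
        if not candidates:
            break
        # Generalize the field that currently has the most distinct values.
        target = max(candidates, key=lambda f: len({v[index[f]] for v in view}))
        levels[target] += 1

    # Remove any records whose combination is still too rare.
    released = [dict(zip(fields, v)) for v in view if counts[v] >= k]
    return released, levels


if __name__ == "__main__":
    sample = [
        {"zip": "02138", "birth_year": "1964"},
        {"zip": "02139", "birth_year": "1964"},
        {"zip": "02141", "birth_year": "1965"},
        {"zip": "02142", "birth_year": "1965"},
    ]
    released, levels = anonymize(sample, ["zip", "birth_year"], k=2)
    print(levels)
    for row in released:
        print(row)
```

In this sketch the explicit identifiers are assumed to have been removed already; the remaining fields are coarsened only as far as needed, which mirrors the goal of keeping as much detail as possible while letting the user set the level of anonymity.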

L. Sweeney. Datafly: a system for providing anonymity in medical data. Database Security, XI: Status and Prospects, T. Lin and S. Qian (eds), Elsevier Science, Amsterdam, 1998.

Copyright © 2011. President and Fellows Harvard University.