We present a computer program named
Datafly that maintains anonymity in medical data by
automatically generalizing, substituting, and
removing information as appropriate without losing
many of the details found within the data. Decisions
are made at the field and record level at the time of
database access, so the approach can be used on the
fly in role-based security within an institution, and in
batch mode for exporting data from an institution.
Often organizations release and receive medical data
with all explicit identifiers, such as name, address
and phone number, removed in the incorrect belief
that patient confidentiality is maintained because the
resulting data look anonymous; however, we show that
the remaining data can often be used to re-identify
individuals by linking or matching the data to other
databases or by looking at unique characteristics
found in the fields and records of the database itself.
When these less apparent aspects are taken into
account, each released record can be made to map
ambiguously to many possible people, providing
a level of anonymity determined by the user.
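
To make the idea of a user-determined level of anonymity concrete, the following is a minimal sketch and not the Datafly implementation. It assumes two hypothetical fields (a five-digit ZIP code and a birth year), a hypothetical parameter k for the minimum number of records that must share each released combination of values, and a simple heuristic of coarsening whichever field currently has the most distinct values. The function names, the generalization levels, and the sample data are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical maximum generalization level for each field.
MAX_LEVEL = {"zip": 5, "year": 2}


def generalize_zip(zip_code, level):
    """Replace the last `level` characters of a ZIP code with '*'."""
    keep = max(len(zip_code) - level, 0)
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def generalize_year(year, level):
    """Level 0 keeps the year, level 1 keeps the decade, level 2 hides it."""
    if level == 0:
        return str(year)
    if level == 1:
        return str(year)[:3] + "*"
    return "*"


def generalize(record, levels):
    """Apply the current generalization levels to one (zip, year) record."""
    zip_code, year = record
    return (generalize_zip(zip_code, levels["zip"]),
            generalize_year(year, levels["year"]))


def anonymize(records, k):
    """Coarsen fields until every released combination occurs at least k times.

    Returns the released records and the generalization levels used.
    Assumed heuristic: always coarsen the field with the most distinct values.
    """
    levels = {"zip": 0, "year": 0}
    if not records:
        return [], levels
    while True:
        released = [generalize(r, levels) for r in records]
        if min(Counter(released).values()) >= k:
            return released, levels
        candidates = [f for f in levels if levels[f] < MAX_LEVEL[f]]
        if not candidates:
            raise ValueError("fewer than k records; cannot reach the requested level")
        distinct = {
            "zip": len({r[0] for r in released}),
            "year": len({r[1] for r in released}),
        }
        field = max(candidates, key=lambda f: distinct[f])
        levels[field] += 1


# Illustrative data only; these are not records from any real database.
sample = [("02138", 1964), ("02139", 1964), ("02141", 1965), ("02142", 1967)]
released, levels = anonymize(sample, k=2)
for row in released:
    print(row)
```

With k set to 2, each released row in this sample is shared by at least two of the original records, so no row points to a single person on these two fields alone. A larger k trades more detail for more ambiguity.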
L. Sweeney. Guaranteeing anonymity when sharing medical data, the Datafly system. Proceedings, Journal of the American Medical Informatics Association (AMIA). Washington, DC: Hanley & Belfus, Inc., 1997.