Keywords: data anonymity, data privacy, text de-identification
Citation:
L. Sweeney, Replacing Personally-Identifying Information in Medical Records, the Scrub System. In: Cimino, JJ, ed. Proceedings, Journal of the American Medical Informatics Association. Washington, DC: Hanley & Belfus, Inc, 1996:333-337.
(This paper was awarded First Prize at AMIA 1996.)
Abstract
We define a new approach to locating and replacing personally-identifying information in unrestricted text that extends beyond straight search-and-replace procedures, and we provide techniques for minimizing risk to patient confidentiality. The straightforward approach of global search and replace properly located only 30-60% of all personally-identifying information that appeared explicitly in letters between physicians and notes written by clinicians within a pediatric database. By contrast, our Scrub system found 99-100% of these references. Scrub uses detection algorithms that employ templates and specialized knowledge of what constitutes a name, address, phone number and so forth.
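The contrast the abstract draws between global search-and-replace and template-based detection can be illustrated with a minimal sketch. This is not the Scrub system's code: the function names, the placeholder labels, and the three regular-expression templates below are hypothetical and chosen only for illustration, and a real detector would need far richer knowledge of names, addresses and other identifiers.

```python
import re

def naive_scrub(text, known_values):
    """Global search-and-replace: only exact, pre-listed strings are caught."""
    for value in known_values:
        text = text.replace(value, "[REDACTED]")
    return text

# Hypothetical templates: each encodes a little knowledge of what a kind of
# identifier looks like, so it can match values that were never listed.
TEMPLATES = [
    ("PHONE", re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b")),
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    ("NAME", re.compile(r"\b(?:Dr|Mr|Mrs|Ms)\.\s+[A-Z][a-z]+(?:\s+[A-Z][a-z]+)?")),
]

def template_scrub(text):
    """Replace every span matching a template with a label for its type."""
    for label, pattern in TEMPLATES:
        text = pattern.sub(f"[{label}]", text)
    return text

# A made-up clinical note used only to exercise the two functions.
note = "Dr. Lang saw Katie Smith on 3/14/95; call 617-555-0199."

print(naive_scrub(note, ["Katie Smith"]))
print(template_scrub(note))
```

In this sketch the search-and-replace pass removes only the one name it was given and leaves the clinician's name, the date and the phone number in place, while the templates catch those three but miss the unlisted patient name. That gap is why a practical system would combine templates with name lists and other specialized knowledge, as the abstract describes.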
Paper: 5 pages in PS or PDF.
Last modified 2/2003 by latanya@dataprivacylab.org