De-identifying Text
Abstract
We define a new approach to locating and replacing personally-identifying information in unrestricted text that extends beyond straight search-and-replace procedures, and we provide techniques for minimizing risk to patient confidentiality. The straightforward approach of global search and replace properly located no more than 30-60% of all personally-identifying information that appeared explicitly in letters between physicians and notes written by clinicians within a pediatric database. On the other hand, our Scrub system found 99-100% of these references. Scrub uses detection algorithms that employ templates and specialized knowledge of what constitutes a name, address, phone number and so forth.
Citation:
Latanya Sweeney.
Replacing Personally-Identifying Information in Medical Records, the Scrub System.
In: Cimino, JJ, ed. Proceedings, Journal of the American Medical Informatics Association (AMIA).
Washington, DC: Hanley & Belfus, Inc, 1996:333-337. (This paper was awarded First Prize at AMIA 1996.)
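To make the abstract's contrast concrete, the following is a minimal Python sketch, not the Scrub system itself. It compares a global search-and-replace pass, which can only remove strings known in advance, with a template-based pass that recognizes the shape of a phone number or a titled name. The function names, the two regular-expression templates, and the placeholder labels are assumptions made for illustration; the abstract does not describe Scrub's actual templates or knowledge sources.

```python
import re


def naive_scrub(text, known_values):
    """Global search-and-replace: removes only exact strings supplied in advance."""
    for value in known_values:
        text = text.replace(value, "[REDACTED]")
    return text


# Illustrative templates only (assumed for this sketch, not taken from the paper).
# Each one encodes a little knowledge of what a kind of identifier looks like.
TEMPLATES = [
    # US-style phone number such as 617-555-0142 or 617.555.0142
    ("PHONE", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
    # A courtesy title followed by one or two capitalized words
    ("NAME", re.compile(r"\b(?:Dr|Mr|Mrs|Ms)\.?\s+[A-Z][a-z]+(?:\s+[A-Z][a-z]+)?")),
]


def template_scrub(text):
    """Template-based pass: replaces anything that matches a known pattern."""
    for label, pattern in TEMPLATES:
        text = pattern.sub(f"[{label}]", text)
    return text
```

For a hypothetical note such as "John Smith was seen today. Call Dr. Alvarez at 617-555-0142.", `naive_scrub(note, ["John Smith"])` leaves the physician's name and the phone number in place, while `template_scrub(note)` catches both but misses the untitled "John Smith". Neither pass alone is sufficient in this toy case, which is consistent with the abstract's point that detection needs several templates plus specialized knowledge of each kind of identifier.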