Data Anonymization Project

Towards the optimal suppression of details when disclosing medical data, the use of sub-combination analysis

by Latanya Sweeney, Ph.D.

Abstract

Sharing medical data with researchers, economists, policy makers, administrators and other secondary viewers immediately raises the tension between the recipient’s needs and disclosure risk. Finding the optimal balance between suppressing enough detail to maintain confidentiality and preserving the specificity the recipient requires for the data to remain useful is quite difficult. We present a new computational technique based on stepwise consideration of all sub-combinations of sensitive fields. This technique can be used within the Datafly or m-Argus architectures to help achieve optimal disclosure, and we show that doing so provides more specific data than Datafly would normally release and improves the confidentiality of results from m-Argus.
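
The abstract does not spell out the procedure, so the following is only a minimal sketch of what a stepwise pass over all sub-combinations of sensitive fields might look like, not the paper's actual algorithm. The function name `rare_subcombinations`, the threshold `k`, and the sample records are all hypothetical and chosen for illustration. For every subset of the given fields, from single fields up to the full set, the sketch counts how often each combination of values occurs and flags the combinations that appear in fewer than `k` records.

```python
from collections import Counter
from itertools import combinations


def rare_subcombinations(records, fields, k):
    """Map each field subset to the value tuples seen in fewer than k records.

    Illustrative assumption: a value combination shared by fewer than k
    records is treated as a candidate for suppression or generalization.
    """
    flagged = {}
    # Step through subset sizes 1, 2, ..., len(fields).
    for size in range(1, len(fields) + 1):
        for subset in combinations(fields, size):
            # Count how many records share each value tuple for this subset.
            counts = Counter(tuple(r[f] for f in subset) for r in records)
            rare = sorted(v for v, n in counts.items() if n < k)
            if rare:
                flagged[subset] = rare
    return flagged


# Hypothetical sample data, not taken from the paper.
records = [
    {"zip": "02139", "birth_year": 1960, "sex": "F"},
    {"zip": "02139", "birth_year": 1960, "sex": "F"},
    {"zip": "02139", "birth_year": 1961, "sex": "M"},
    {"zip": "02138", "birth_year": 1961, "sex": "M"},
]

flagged = rare_subcombinations(records, ["zip", "birth_year", "sex"], k=2)
for subset, values in flagged.items():
    print(subset, values)
```

Checking every subset grows exponentially with the number of fields, so a sketch like this is only practical when the set of sensitive fields is small.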

Keywords: data anonymity, data privacy, re-identification, data fusion, privacy

Citation:
Latanya Sweeney. Towards the optimal suppression of details when disclosing medical data, the use of sub-combination analysis. Proceedings, MEDINFO 98. International Medical Informatics Association. Seoul, Korea. North-Holland, 1998.

Summer 2003 Data Privacy Lab [De-identification Project]