
Introduction to Data Privacy

Privacy

In the most general sense, privacy reflects the ability of a person, organization, government, or other entity to control its own space, where the concept of space (or privacy space) takes on different contexts and therefore leads to different interpretations of privacy. For example, privacy space can be one’s physical property, such as enjoying the privacy of one’s home against invasion. Privacy space can refer to one’s body, such as bodily privacy requiring informed consent for medical treatment. Privacy space can relate to one’s computer, such as protection from email spam. Privacy space can relate to one’s interactions on the Web, such as enforcing promises made in stated privacy policies.

Data Privacy

When privacy space refers to the fragments of data a person leaves behind while moving through daily life, the notion of privacy is called data privacy, and many of the other perspectives on privacy and privacy space fail to apply. One of the biggest differences stems from the fact that the originators of the data lack the ability to control how the data are used even though they are the subjects of the data. Historically, this conflict has been resolved by policies and laws. But today’s technically empowered society can capture and share unprecedented amounts of person-specific data, so that what was once considered private information, not by policy but by the absence of technology, may now be publicly available. Old policies seem outdated and new uses for data are emerging. Society feels pressured into a binary choice: privacy or no privacy. The goal of data privacy is to ease this tension so that a binary choice is not the only option; instead, a continuum of choices is realized through technical options known as privacy technology.

Data privacy divides into two major areas. The first, data anonymity, involves methods for de-identifying data so that the results can be shared with scientific assurances of anonymity while the data remain practically useful for many worthy purposes; data anonymity is a compromise position. The second, policy specification and enforcement (becoming known as privacy rights management), involves developing languages for specifying privacy preferences and permitted uses for shared data, along with methods for enforcing the specified terms. This latter area is commonly associated with Internet privacy and typically resembles consent models.
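To make the data anonymity idea concrete, the following sketch (in Python, with hypothetical field names and records) illustrates generalization-based de-identification: explicit identifiers are dropped, quasi-identifiers such as ZIP code and age are coarsened, and the size of the smallest group of records sharing identical quasi-identifier values is checked. It is an illustration of the general approach, not a complete anonymity method.

    from collections import Counter

    EXPLICIT_IDENTIFIERS = {"name", "ssn"}      # dropped outright before release
    QUASI_IDENTIFIERS = ("zip", "age", "sex")   # coarsened, then checked

    def generalize(record):
        # Return a copy with explicit identifiers removed and quasi-identifiers coarsened.
        out = {k: v for k, v in record.items() if k not in EXPLICIT_IDENTIFIERS}
        out["zip"] = record["zip"][:3] + "**"        # 5-digit ZIP -> 3-digit prefix
        decade = (record["age"] // 10) * 10
        out["age"] = f"{decade}-{decade + 9}"        # exact age -> 10-year range
        return out

    def smallest_group_size(records):
        # Size of the smallest group of records sharing identical quasi-identifier values.
        groups = Counter(tuple(r[q] for q in QUASI_IDENTIFIERS) for r in records)
        return min(groups.values())

    raw = [
        {"name": "A. Smith", "ssn": "000-00-0001", "zip": "15213", "age": 34, "sex": "F", "diagnosis": "flu"},
        {"name": "B. Jones", "ssn": "000-00-0002", "zip": "15217", "age": 37, "sex": "F", "diagnosis": "asthma"},
        {"name": "C. Lee",   "ssn": "000-00-0003", "zip": "15289", "age": 31, "sex": "F", "diagnosis": "flu"},
    ]

    released = [generalize(r) for r in raw]
    print(released)
    print("smallest group size:", smallest_group_size(released))

Coarsening the quasi-identifiers until every such group is reasonably large is the intuition behind formal notions such as k-anonymity.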


[Figure] Data privacy has two major areas, data anonymity and privacy rights management, each enacted on either a single data source (a database) or a network of data sources (ubiquitous). The figure shows the contributions of traditional areas of computer science and IS.

Data Privacy is not Computer Security

Authentication and encryption are traditional tools of computer security. Authentication helps provide privacy by ensuring that no intruder breaks into a system to obtain data, and encryption additionally ensures that no eavesdropper on the communication channel receives data.

The privacy problems causing the largest commotion in the press these days are not about intruders breaking into computers or eavesdroppers grabbing communication packets. Public concern over privacy relates to the volume of information captured and shared freely. Solutions to these kinds of problems require examining the stored values themselves and the direct and indirect inferences that can be drawn when the information is combined with other available data. These are data privacy problems.

Example. A profile of a subject is stored on a computer by a person or another computer that has authenticated with a login and password and is authorized to read and write data on the machine. The communication pipeline is encrypted to prevent eavesdropping, and the process is reversed when the profile is read from the machine. All of these are computer security protections, yet the content of the profile can still re-identify the individual, and that is a data privacy problem.
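The following sketch (again in Python, with hypothetical records) shows how such re-identification can occur: the stored profile contains no name, yet joining its quasi-identifiers with a publicly available roster, such as a voter list, singles out the subject, regardless of how well the storage and the communication channel were secured.

    # Hypothetical profile stored and transmitted under full computer security controls.
    stored_profile = {"zip": "15213", "birth_date": "1975-07-22", "sex": "F",
                      "diagnosis": "hypertension"}

    # Hypothetical publicly available roster, e.g. a purchasable voter registry.
    public_roster = [
        {"name": "Alice Example", "zip": "15213", "birth_date": "1975-07-22", "sex": "F"},
        {"name": "Bob Sample",    "zip": "15217", "birth_date": "1962-01-05", "sex": "M"},
    ]

    keys = ("zip", "birth_date", "sex")
    matches = [p["name"] for p in public_roster
               if all(p[k] == stored_profile[k] for k in keys)]

    if len(matches) == 1:
        # A unique match re-identifies the profile despite authentication and encryption.
        print(f"Profile re-identified as {matches[0]}: {stored_profile['diagnosis']}")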

