Topics in Privacy

Topics in Privacy (TIP) consists of weekly discussions and brainstorming sessions on all aspects of privacy. Discussions are often inspired by a real-world privacy problem being faced by the lead discussant, who may be from industry, government, or academia. Practice talks and presentations on specific techniques and topics are also common.

The following schedule and descriptions are tentative. Topics are usually not posted earlier than the week before.

Schedule (Most Recent Only)

No.  Date    Speaker/Discussant                               Description
1    9/14    Daniel Wilson, Intel Corporation (CMU PhD 2004)  Simultaneous Tracking and Activity Recognition (STAR) Using Simple Sensors
2    9/21    Orna Raz, Post Doc (CMU PhD 2004)                Helping Everyday Users Find Anomalies in Data Feeds
3    9/28    Jason Hong, Carnegie Mellon Faculty              Research Directions in Privacy-Sensitive Ubiquitous Computing
4    10/5    Alessandro Acquisti, Carnegie Mellon Faculty     Privacy, Economics, and Immediate Gratification: Theory and Evidence
5    10/12   Michael Shamos, Carnegie Mellon Faculty          Two Weeks to E-Day: The State of E-Voting
6    10/21*  Cynthia Dwork, Microsoft Research                Towards Privacy in Public Databases
7    10/26   Attendees                                        Brainstorming about Privacy and Privacy Technology
8    11/2    Mike Reiter, Carnegie Mellon Faculty             Recent Progress in Anonymous Communication
9    11/9    Stephen E. Fienberg, Carnegie Mellon Faculty     Statistical vs. Privacy-preserving Approaches to Confidentiality
10   11/16   David Farber, Carnegie Mellon Faculty            National Security and Privacy Technology
11   11/23   Lea Kissner, Carnegie Mellon                     Private and Threshold Set-Intersection
12   12/3*   Jon Kleinberg, Cornell University                Cascading Behavior in Social Networks
* Special date change that week.

Abstracts of Talks and Discussions

  1. Simultaneous Tracking and Activity Recognition (STAR) Using Simple Sensors
    Automatic health monitoring helps enable independent living for the elderly by providing specific information to caregivers -- an important goal as an unprecedented portion of the world population enters old age. In past research I have introduced the simultaneous tracking and activity recognition (STAR) problem, whose solution provides this key information. I have shown that it is possible to provide simultaneous room-level tracking and recognition of simple activities using information from many minimally invasive sensors commonly found in home security systems. Future work (in collaboration with Intel Research Seattle) will focus on advanced activity recognition using gloves that can recognize what they touch. Tools such as intelligent environments and wearable sensors can help keep the elderly from being institutionalized, improving overall quality of life for millions of people.
    [Presenter: Daniel Wilson]

  2. Helping Everyday Users Find Anomalies in Data Feeds
    Much of the software people use for everyday purposes incorporates elements developed and maintained by someone other than their integrator. These elements include not only code but also data feeds. Although everyday information systems are not mission critical, they must be dependable enough for practical use. This is limited by the dependability of the incorporated elements. It is particularly difficult to evaluate the dependability of data feeds. The specifications of data feeds are often even sketchier than the specifications of software components, the data feeds may be changed by their proprietors, and everyday users of data feeds only have enough knowledge about the application domain to support their own usage. These factors inhibit many dependability enhancement techniques, which require a model of proper behavior for failure detection, preferably in the form of specifications.

    In this talk I present a partial solution to this problem---CUES, Checking User Expectations about Semantics. CUES is a method and a prototype implementation for making user expectations precise and for checking these precise expectations. CUES treats the precise expectations as a proxy for missing specifications. It checks the precise expectations to detect semantic anomalies---data feed behavior that does not adhere to these expectations.

    Three case studies and a validation study, all with real-world data, provide evidence of the practicality and usefulness of CUES.  The studies indicate that a user of CUES gets substantial benefit for a modest investment of time and effort. In addition to automated detection of anomalies, the benefit often includes a better understanding of the user's own expectations, of the data feeds, and of existing and missing documentation.
    [Presenter: Orna Raz]

  3. Research Directions in Privacy-Sensitive Ubiquitous Computing
    In the first part of this talk, Dr. Hong will go over some of his past work in understanding and developing privacy-sensitive ubicomp systems, primarily focusing on location privacy. He will also draw on interesting themes and connect them with his current research here at CMU in developing and deploying privacy-sensitive ubicomp systems.

    The rest of the talk will be devoted to brainstorming and discussion of some potentially new areas in the intersection of HCI, systems, privacy, and security.
    [Presenter: Jason Hong]

  4. Privacy, Economics, and Immediate Gratification: Theory and Evidence
    Dichotomies between privacy attitudes and actual behavior have been noted in the literature but not yet fully explained. Individuals often claim to be highly concerned about their personal privacy, but few adopt technologies to protect it, and many release personal information in exchange for small rewards. We apply lessons from the research on behavioral economics to understand the individual decision-making process with respect to privacy. We show that it is unrealistic to expect individual rationality in this context. Models of self-control problems and immediate gratification offer more realistic descriptions of the decision process and are more consistent with currently available data. In particular, we show why individuals who may genuinely want to protect their privacy might not do so because of psychological distortions well documented in the behavioral literature; we show that these distortions may affect not only ‘naive’ individuals but also ‘sophisticated’ ones; and we show that this may also occur when individuals perceive the risks from not protecting their privacy as significant. Finally, we present results from a survey of individual privacy attitudes and behavior that we conducted in May 2004 to test our theoretical model.
    [Presenter: Alessandro Acquisti]

  5. Two Weeks to E-Day: The State of E-Voting
    It is two weeks before the U.S. presidential election; what is the state of electronic voting machines?  Will there be another Florida disaster or will the U.S. be lucky if just one state has a problem?  In this talk, Michael Shamos will bring us up-to-date on the state of affairs regarding legal actions and challenges to electronic voting machines and how these problems are likely to impact the upcoming election.

    Dr. Shamos' widely cited paper on voting system examination, written in 1993, is available at: https://www.cpsr.org/conferences/cfp93/shamos.html

    Information about Dr. Shamos' E-voting class is available at: https://euro.ecom.cmu.edu/program/courses/tcr17-803/
    [Presenter: Michael Shamos]

  6. Towards Privacy in Public Databases
    We will describe definitions and algorithmic results for the "census problem". Informally, in a census individual respondents give private information to a trusted (and trustworthy) party, who publishes a sanitized version of the data. There are two fundamentally conflicting requirements: privacy for the respondents and utility of the sanitized data. Unlike in the study of secure function evaluation, in which privacy is preserved to the extent possible given a specific functionality goal, in the census problem privacy is paramount; intuitively, things that cannot be learned "safely" should not be learned at all.

    The definition of privacy and the requirements for a safe sanitization are important contributions of this work.  Our definition of privacy formalizes the notion of protection from being brought to the attention of others -- one's privacy is maintained to the extent that one blends in with the crowd.  Our definition of a safe sanitization emulates the definition of semantic security for a cryptosystem and says, roughly, that an adversary given access to the sanitized data is not much more able to compromise privacy than an adversary who is not given access to the sanitization.

    Joint work with Shuchi Chawla, Frank McSherry, Adam Smith, and Hoeteck Wee.
    [Presenter: Cynthia Dwork, Microsoft Research]

  7. Brainstorming about Privacy and Privacy Technology
    At least once each semester, we devote a TIP session to brainstorming and reflecting on the hottest issues in privacy and on the many new and provocative ideas posed by speakers in earlier TIPs during the semester. In this session, open and candid discussion is particularly welcome. Among the topics are: privacy-preserving data mining, privacy in electronic voting, economics of privacy, privacy in ubiquitous computing, formal protection models, privacy-preserving surveillance, and any other topic of interest to those in attendance. All are welcome to come, listen, and participate.

  8. Recent Progress in Anonymous Communication
    We summarize three recent results in anonymous communication technologies.
    1. A technique, called "fragility", that discourages mix administrators from selectively disclosing input-to-output message correspondences (e.g., for profit), presuming that the mix administrators are motivated to protect the privacy of some communications through the mix (e.g., their own).  This is joint work with X. Wang (Indiana).
    2. A simulation-based analysis of timing attacks on mix-based systems, and new techniques for better defending against them.  This is joint work with B. Levine (UMass), C. Wang (CMU), and M. Wright (UMass).
    3. A new method for implementing an anonymous message service using a technique called "private push and pull".  This technique permits the oblivious insertion of labeled messages into a distributed database, and the oblivious retrieval of messages that match a provided keyword.  This is joint work with L. Kissner, A. Oprea, D. Song (CMU) and K. Yang (Google).
    The talk will assume far less prior knowledge in anonymous communication than this abstract does.
    [Presenter: Mike Reiter]

  9. Statistical vs. Privacy-preserving Approaches to Confidentiality
    We begin by describing some of the principles underlying the "statistical approach" to confidentiality and disclosure limitation, especially the risk-utility tradeoff, and we explain by example our statistical notion of utility. We then contrast this with "privacy preserving" methods now popular in the data mining literature. This will serve as a prelude to a brainstorming discussion about the viability of the notion of "selective revelation" that has been proposed as a fundamental component of the DARPA-led Total Information Awareness Program, or Terrorism Information Awareness program as it is now known.
    [Presenter: Stephen E. Fienberg]

  10. National Security and Privacy Technology
    For the past two years, the Markle Foundation has undertaken a series of studies on National Security through its panel on National Security. The results have had significant impact on the structure of DHS and the current Intelligence debate. Three key recent documents and the URL to the main web page:

    The proposed direction raises a great many privacy issues, which we will brainstorm in this meeting.
    [Presenter: David Farber]

  11. Private and Threshold Set-Intersection
    In this talk we consider the problem of privately computing the set-intersection (private matching) of sets, as well as several variations on this problem: cardinality set-intersection, threshold set-intersection, and over-threshold set-intersection. Cardinality set-intersection is the problem of determining the size of the intersection set, without revealing the actual set. In threshold set-intersection, only the elements which appear at least a threshold number t times in the players' private inputs are revealed. Over-threshold set-intersection is a variation on threshold set-intersection in which not only the threshold set is revealed, but also the popularity of each element in the set. Let there be n >= 2 players, c < n dishonestly colluding, each with a private input set of k elements.

    Our protocols for two parties are comparable to those of Freedman, Nissim, and Pinkas, except in the malicious case, in which we achieve security without the use of the expensive cut-and-choose technique. Freedman, Nissim, and Pinkas also address the problem of private set-intersection for multiple parties, giving a protocol secure against honest-but-curious parties with O(n^2k) total communication complexity, but no protocol secure against malicious adversaries. We give a protocol for this scenario secure against honest-but-curious parties with O(cnk) total communication complexity, and against malicious parties with communication complexity O(n^2k^2).

    There was no previous efficient protocol for the problems of cardinality set-intersection (for n >= 3 players or n >= 2 malicious players), threshold set-intersection, or over-threshold set-intersection. We give the first efficient protocols for these problems without generic secure circuit computation.
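
    To make the four problem variants above concrete, the following is a minimal, non-private Python sketch of what each one is supposed to output. It is not the protocol from the talk: it computes the results in the clear, with no cryptography, purely to illustrate the definitions. The function names and the sample inputs are hypothetical, and the sketch assumes that an element's "popularity" is the number of players whose input set contains it.

```python
from collections import Counter

# Reference (non-private) outputs for the set-intersection variants.
# Assumption: each player's input is a set, and the "popularity" of an
# element is the number of players whose input contains it.

def intersection(inputs):
    """Elements held by every player (assumes at least one player)."""
    return set.intersection(*map(set, inputs))

def cardinality(inputs):
    """Size of the intersection, without listing its elements."""
    return len(intersection(inputs))

def popularity(inputs):
    """Map each element to the number of players who hold it."""
    return Counter(x for s in inputs for x in set(s))

def threshold(inputs, t):
    """Elements that appear in at least t players' inputs."""
    return {x for x, count in popularity(inputs).items() if count >= t}

def over_threshold(inputs, t):
    """Threshold set together with the popularity of each element."""
    return {x: count for x, count in popularity(inputs).items() if count >= t}

# Hypothetical inputs for n = 3 players.
players = [{"a", "b", "c"}, {"b", "c", "d"}, {"c", "d", "e"}]

print(intersection(players))       # elements common to all three players
print(cardinality(players))        # size of that intersection
print(threshold(players, 2))       # elements held by at least two players
print(over_threshold(players, 2))  # the same elements with their counts
```

    A private protocol would aim to reveal only these outputs to the players, and nothing else about each other's inputs.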

    Note: This is a mathematically-oriented talk. Background math and cryptography will be covered at the beginning of the talk. A draft of the paper is available here.
    [Presenter: Lea Kissner]

  12. Cascading Behavior in Social Networks
    The working metaphor for the Web in its early days was that of a 'universal encyclopedia,' a repository containing vast amounts of human knowledge. More recently, our view has shifted to incorporate more dynamic forms of information and a more explicit 'time axis' -- the Web as a current-awareness medium, supporting on-line news, chat, discussion, commentary, and emerging media such as weblogs.

    Many factors have contributed to this evolution of Web information; a crucial one is the way in which technological and social networks have become intertwined, enabling new topics, ideas, and behaviors to cascade along an underlying social network of interaction that parallels the hyperlinks of the Web itself.  In this talk, I will discuss some of the current approaches to the challenges that are raised by these issues, including new algorithms to analyze the rich array of temporal phenomena that abound in current information networks, and search techniques that can handle the temporal fluctuations and bursty dynamics of on-line topics. 
    [Presenter: Jon Kleinberg]

Fall 2004 Data Privacy Laboratory [LIDAP@dataprivacylab.org]