Topics in Privacy

Topics in Privacy (TIP) consists of weekly discussions and brainstorming sessions on all aspects of privacy. Discussions are often inspired by a real-world privacy problem being faced by the lead discussant, who may be from industry, government, or academia. Practice talks and presentations on specific techniques and topics are also common.

The following schedule and descriptions are tentative. Topics are usually not posted earlier than the week before.

Schedule Fall 2011

Date  Speaker/Discussant  Description
9/21  Alessandro Acquisti, Carnegie Mellon  Privacy and Behavioral Economics: The Control Paradox and Other Studies
9/28  Dean Gallant, Harvard; Micah Altman, Harvard; Bob Gellman, Washington, DC  Revisions to the Federal Policy for the Protection of Human Subjects ("Common Rule"): a discussion of proposed changes and implications
10/5  Salil Vadhan, Harvard  Data Privacy Responses to Proposed Changes to the Common Rule
10/12  Mary McThomas, Brown University  The Dual System of Privacy Rights in the United States
10/19  group discussion  Harmonizing Responses to Proposed Changes to the Common Rule
10/26  no meeting, remote joint work  Submission of Data Privacy and Related Responses to Proposed Changes to Common Rule
11/2  Micah Altman, Harvard  Recommendations to Census on Improving Researcher Access to Census Data
11/9  John Wilbanks, Sage Bionetworks  Commons-based Approaches to Privacy
11/16  Herbert Burkert, University of St. Gallen, Switzerland  EU-Directive 95/46/EC - The Mechanics
11/23  no meeting  Thanksgiving Break
11/30  Charles Allen, Harvard; Joe Pato, HP and MIT  India's National ID Program
12/14  Latanya Sweeney, Harvard University  Reidentification of a Research Dataset

Abstracts of Talks and Discussions

  1. Privacy and Behavioral Economics: The Control Paradox and Other Studies

    How do we make decisions about the privacy and security of our personal information? In this talk, I will first highlight how research in behavioral economics can help us make sense of apparent inconsistencies in privacy (and security) decision-making. Then, I will present results from a number of experiments conducted at Carnegie Mellon University, including a recent study of the complex relationship between privacy and control. Normatively, privacy is often associated with an individual's control over her personal information. However, in a series of experiments, we investigated the effect that granting more control over information revelation has on individuals' propensities to share sensitive data. We found that real or perceived control over information publication increases the likelihood that individuals will disclose sensitive information, even when the objective risks associated with such disclosures actually increase. We call this the privacy control paradox. Our findings highlight that merely granting users more "control" over their personal information or privacy settings does not guarantee that they will be able to find a desirable balance between information disclosure and information protection. In fact, technologies that make individuals feel more in control over the publication of personal information may potentially carry the unintended consequence of eliciting more risky disclosures.

    Discussant: Alessandro Acquisti, Visiting Scholar in the Data Privacy Lab; Associate Professor at Carnegie Mellon University


  2. Revisions to the Federal Policy for the Protection of Human Subjects ("Common Rule"): a discussion of proposed changes and implications

    The U.S. Office for Human Research Protections (OHRP) has proposed a set of sweeping changes to the federal regulations that govern research involving human subjects (the "Common Rule"), in the form of an Advance Notice of Proposed Rulemaking (ANPRM), and is soliciting comments from investigators, Institutional Review Boards (IRBs), and any other interested parties.

    These comments are due by October 26, 2011.

    This TIP session will provide an opportunity to discuss the ANPRM, its potential effect on the research community, and possible constructive suggestions that could be offered. While some of these changes will be viewed favorably by the research community, particularly by researchers in the social and behavioral sciences, others (including increased consent and data security requirements) could potentially make some kinds of research more difficult or even impossible.

    Comments are solicited, and encouraged, from individuals. Harvard's Office of the Vice Provost for Research is also working on a response, as are other entities at the university and many scientific and professional organizations.

    The areas addressed in the ANPRM include

    • Changes to the IRB review process, including
      • changes in the types of research considered "exempt" or "expedited";
      • changes to continuing review requirements for low-risk research; and
      • changes to recommendations regarding the administrative review of exempt studies
    • Changes to criteria for IRB approval of proposed research
    • A mandate for single IRB review for multi-site studies
    • A mandate to require consent for use of non-identified biospecimens
    • Greater specificity regarding the content and form of informed consent
    • Extension of HIPAA data security requirements to all human subjects data, not just "protected health information"
    • Extension of Common Rule requirements to all research at any institution receiving federal funding for human subjects research
    • Creation of a single web site for mandated reporting of adverse events and unanticipated problems in human subjects research
    • Coordination of guidance from federal regulatory entities regarding human subjects research

    (More details of the ANPRM can be found at http://www.hhs.gov/ohrp/humansubjects/anprm2011page.html )

    For those particularly interested, the following commentary may be useful as background.

    Discussants:
      Dean Gallant, Assistant Dean for Research Policy and Administration, Harvard FAS
      Micah Altman, Social and Information Scientist, Institute for Quantitative Social Science, Harvard University
      Bob Gellman, Privacy and Information Policy Consultant, Washington, DC


  3. Data Privacy Responses to Proposed Changes to the Common Rule

    As discussed last week, the U.S. Office for Human Research Protections has proposed sweeping changes to the rules that govern human subjects research. The Data Privacy Lab and those involved in data privacy research can offer unique and critical insight into the proposed rules by sharing and operationalizing what we know about de-identification and re-identification risks and remedies, and by making sure the proposed changes build on, rather than pave over, data privacy research. In this TIP session, we will look at proposed changes specific to de-identification and re-identification and offer some sample responses for open discussion and brainstorming. Final responses to the proposed changes are due October 26, 2011, but no further TIPs are currently reserved for this topic.

    Discussant: Salil Vadhan, Vicky Joseph Professor of Computer Science and Applied Mathematics, Harvard University


  4. The Dual System of Privacy Rights in the United States

    The right to privacy in the United States covers an array of issues. Although based on the same right, these different issues are treated dissimilarly by the courts. I argue that the distinguishing feature that gives us the greatest foothold into understanding the right to privacy in the United States is whether the activity or protection being claimed is justified in terms of property or liberty. The first category, proprietary privacy, covers such issues as medical records, wiretapping and owning one's own image. The second category, decisional privacy, involves making decisions about intimate matters and covers such issues as the right to die, same-sex marriage, and abortion. The two categories are treated differently, giving greater or lesser protection to individuals based on the type of privacy. Decisional privacy has evolved more slowly towards constitutionalization. By constitutionalization, I mean the transition away from group-based social norms to an individual-based and universally-applied "right." The slower evolution of decisional privacy rights has led to a situation in which those matters involving the liberty to determine intimate matters are much more likely to be limited by community standards and social norms. To examine these differences, I track higher court cases in two subject areas: marital privacy (as a representative of decisional privacy) and conversational privacy (as a representative of proprietary privacy). The resulting comparison demonstrates how discrepancies in the way the two categories of privacy are handled impact the protections granted by the right to privacy, thus leading to a dual system of privacy rights in the United States.

    Discussant: Mary McThomas, Assistant Professor of Political Science at Mississippi State University and a Visiting Assistant Professor at Brown University.


  5. Harmonizing Responses to Proposed Changes to the Common Rule

    Comments on the proposed revisions to the federal policy for the protection of human subjects ("Common Rule") are due on 10/26/2011. The proposed changes are quite sweeping and could have a chilling effect on privacy research. We have already devoted two TIP sessions to this topic, and this will be our final session on it. In the first session, Dean Gallant, Bob Gellman, and Micah Altman introduced the critical issues and sweeping changes proposed; see above. In the second session, Salil Vadhan outlined particular questions relevant to data privacy and differential privacy; see above. Finally, in this session we will look at some draft responses and discuss submission strategies for various professionals and professional groups. We will review a wiki set up for harmonizing responses related to data privacy.


  6. Submission of Data Privacy and Related Responses to Proposed Changes to Common Rule

    Responses were submitted! In terms of contributions related to TIP sessions: About 50 data privacy researchers and supporters, primarily from computer science, medical informatics, public policy, and law, from institutions across the United States, joined the Data Privacy Lab in submitting a joint response to the proposed changes to the Common Rule. Two national privacy groups, the Electronic Privacy Information Center (EPIC) and Patient Privacy Rights, also supported the submission. These responses broadly address the fitness and appropriateness of HIPAA and the emerging field of data privacy. A third complementary and coordinated response, led by Salil Vadhan and supported by additional researchers, drew on recent advances in understanding data privacy from a theoretical computer science perspective. Harvard University had its own submission, and it was harmonized with the data privacy submissions. Micah Altman and Dean Gallant were active on numerous other submissions from research and professional organizations. Copies of all these submitted responses are available here.


  7. Recommendations to Census on Improving Researcher Access to Census Data

    The purpose of this session is to generate ideas for improving access to research data produced, collected or disseminated by the U.S. Bureau of the Census ("Census"). Highlights from the discussion will inform ongoing conversations with Census, and may result in a future paper or research report.

    KEY QUESTIONS

    On the privacy side...
    Are there ways in which Census could make confidential data available in a more usable form and still provide adequate protection? Do current mechanisms offer adequate protection for privacy? What are the vulnerabilities? What are privacy solutions?

    On the utility side...
    How has the variety of ways in which Census has historically shared data affected research using census data? How can Census better support the scientific need for replication, integration of multiple data sources, citation, and provenance? Are current practices sufficient to ensure long-term (archival) access? What can Census do to make data, and especially confidential data, more easily discoverable, accessible, and usable in the short and long term?

    MOTIVATION FOR INCREASED DATA USE BY SOCIAL SCIENTISTS

    Social scientists in the US have relied heavily on data collated and distributed by the U.S. Census Bureau. Some of the Census Bureau's best-known data products include the Decennial Census, American Community Survey, Census of Governments, Economic Census, and TIGER geography. In addition, Census distributes data for other agencies, such as the Bureau of Labor Statistics and the Centers for Disease Control, and provides data collection that supports public data produced by other government agencies, including the National Science Foundation, National Center for Science and Engineering Statistics.

    However, the evidence base in social science research is shifting rapidly. As Gary King writes in Science: "Massive increases in the availability of informative social science data are making dramatic progress possible in analyzing, understanding, and addressing many major societal problems. Yet the same forces pose severe challenges to the scientific infrastructure supporting data sharing, data management, informatics, statistical methodology, and research ethics and policy, and these are collectively holding back progress." (PDF) Research that requires integrating large census collections with many other forms of data from different sources is increasingly important. And policies, mechanisms and infrastructure for data discovery, attribution, provenance, access, and privacy are pivotal.

    BACKGROUND ON CENSUS DATA ACCESS

    Census offers access to confidential and private information on businesses and people in a number of ways:

    1. Public use files are created for some studies. (See links to distribution channels below.)

    2. Synthetic data is experimentally used to provide access to one data product (link).

    3. Interactive disclosure limitation is used on American FactFinder for some user-driven tabulations. (link)

    4. Research Data Centers (RDCs) (link) offer a restricted environment in which some confidential data held by the Bureau can be directly analyzed.

    However, confidential data is not guaranteed to be available through any of these channels. Most of the confidential data that is made available is available only through an RDC, which requires large usage fees, detailed proposals, and an extensive vetting process, and is currently limited to only seven physical locations in the U.S.

    Census distributes data through a number of different channels and interfaces, including:

    1. American FactFinder (link), which offers a novice-friendly user interface to dynamically generated tables and maps for many of the largest Census products

    2. The DataWeb (link), which offers access to maps, tables, analyses, and extracts from a wide variety of census products and data from other agencies

    3. Center for Economic Studies Research Data Centers (link), offering remote desktop access to selected restricted data.

    4. Web-ftp (link) offering file download capability

    5. Dynamic custom visualization and analysis systems, such as the LED/LEHD tools (link) and the Synthetic Longitudinal Business Database (link)

    In addition, organizations also repackage and distribute many Census data collections, see:

    1. Inter-University Consortium for Political and Social Research (ICPSR), hosted at the University of Michigan (link)

    2. National Historical Geographic Information System (NHGIS) (link)

    Discussant: Micah Altman, Social and Information Scientist, Institute for Quantitative Social Science, Harvard University


  8. Commons-based Approaches to Privacy

    This talk will cover the background and alpha release of Portable Legal Consent (PLC). PLC is being developed as a tool to allow patients to tell the doctors, researchers, and companies that are experimenting on them that they, the patients, own the rights to the data generated from their bodies. When a patient chooses PLC, it is a statement that the patient wants the data to be shared broadly in the public domain, to serve scientific progress as a whole, regardless of the particular individual or institution that makes the breakthrough. PLC is voluntary, opt-in, and will be submitted for IRB approval in the first quarter of 2012.

    For a sample of an informed consent form that uses the commons-based approach, see http://weconsent.us/consentform. Archived txt and PDF.

    Discussant: John Wilbanks, Senior Fellow at the Kauffman Foundation and a Fellow at Lybba.org. John Wilbanks works on open content, open data, and open innovation systems. He's worked at Harvard Law School, MIT's Computer Science and Artificial Intelligence Laboratory, the World Wide Web Consortium, the US House of Representatives, and Creative Commons, as well as starting and selling a bioinformatics company. He sits on the Board of Directors for Sage Bionetworks, iCommons, and 1DegreeBio.


  9. EU-Directive 95/46/EC - The Mechanics

    The presentation attempts to provide insight into the conceptual world of the EU Directive, the normative-analytical framework that follows from that concept, and the way the framework drives the application of data protection laws in the Member States.

    The EU Directive (PDF) and the EU Commission's Communication on a reform of the Directive (PDF).

    Discussant: Herbert Burkert, currently Visiting Professor at Harvard Law School and a fellow at the Berkman Center for Internet and Society. He teaches Internet Law, Media Law, and Constitutional Law at the University of St. Gallen (Switzerland), where he is also the President of the Research Center for Information Law. He has worked as an adviser to the German and Swiss governments, the European Commission, the Council of Europe, and the OECD on issues of access to government information, data protection, and internet policies.


  10. India's National ID Program

    Charles Allen, a Harvard Law student, is investigating the current and potential legal consequences of Aadhar, India's new national ID program. In preparation for his visit to India, he would like to use a TIP session to brainstorm about issues. The TIP discussion will begin with an introduction to the system by Joe Pato, senior architect and researcher at HP and MIT.

    Aadhar (meaning "foundation") is a revolutionary effort to provide 12-digit identification numbers, linked to a database of biometric fingerprint and retina records, to all of India's 1.2 billion citizens. While the effort is still in its infancy, its proponents see potential to provide India's underclass of over 400 million with the benefits of a rapidly-growing globalized economy, including access to bank accounts and credit, cell phones, and more efficiently-distributed government benefits. The planned system is intended to be both accurate—using biometric data to create unique and difficult-to-forge records—and scalable, using dispersed field teams to reach even the country's poorest and most remote citizens.

    While Aadhar's goals are noble and far-reaching, the program raises significant legal questions. First, India has comparatively weak privacy laws, and has a history of pushing even those limits with extensive citizen surveillance. Putting the world's largest database of personal information in its hands raises significant civil liberty concerns. Second, India's system of government welfare, currently managed largely through jobs and in-kind grants of food and supplies, stands to be replaced by cash transfers. While the ID system will lead to more accountability for these payments, it will raise questions about what individuals are actually guaranteed by their government, and where the boundaries of state intervention in local markets actually lie. Third, as the effort is being run by a group of private partners under government supervision, there will be ongoing conflicts between the more entrepreneurial managers of the program and the entrenched bureaucracy within which it is intended to operate. If Aadhar reaches its intended scale, India will require fundamental changes to its regulatory state.

    Charles Allen's resulting paper will explore these anticipated issues, as well as potential challenges to hypothetical applications of Aadhar in election law, human resources management, and taxation.

    Discussants: Charles Allen, Harvard Law School and Joe Pato, HP and MIT.

    Charles Allen is a second-year student at Harvard Law School, focusing on international business, trade, and development law. Prior to starting at Harvard, he worked for two years for an international development consulting firm in Washington, DC, managing projects for USAID and the US State Department. While his projects were global and multidisciplinary in focus, his most intensive work was on a monitoring and evaluation effort in Baghdad, Iraq. He completed his undergraduate studies at Georgetown University's School of Foreign Service, with a concentration in International Politics and a certificate in International Development.

    Joe Pato is an HP Labs Distinguished Technologist and a Visiting Scientist at MIT. Among numerous outstanding achievements, he chaired an advisory committee to the U.S. government that provided a comprehensive assessment of biometrics and the role of government in its development. Results appear in the National Academies Press report, Biometric Recognition: Challenges and Opportunities.


  11. Reidentification of a Research Dataset

    Harvard researchers recorded Facebook social network data and cultural preferences from a cohort of undergraduates over their four years in the College, and linked it with student room assignments, demographics, and other school information. This longitudinal enriched network dataset could prove an invaluable asset for social scientists studying social networks, social stratification, homophily, and culture. The researchers intended to make an anonymized version of their dataset publicly available in the IQSS Dataverse Network so that other social science researchers could benefit from it. Indeed, this was also an expectation of the National Science Foundation, which funded the work in part. Working with the Harvard Institutional Review Board (IRB), the IQSS, and the Berkman Center Cyberlaw Clinic, the researchers engaged in due diligence, making reasonable but ad hoc decisions about how to anonymize the data and how to control the data flow. The resulting data had no explicit identifiers and no explicit dates or ZIP codes, which were known vulnerabilities. However, Prof. Sweeney expressed concerns about the possibility of student re-identifications because the dataset did not use a scientific approach to de-identification. She vowed to work with the researchers to assess actual reidentification risks and then to develop remedies so the data can be shared widely with privacy guarantees. The researchers have not shared the data widely, pending a satisfactory long-term resolution for sharing usable data with a community of scholars. This talk reports on a series of experiments to answer the question "How many Harvard students could actually be re-identified in the data?" The answer is more than 85% of the 1600 students. These findings offer significant insight into reidentification risks and de-identification standards used for research, including HIPAA.

    Latanya Sweeney, PhD, is a Visiting Professor and Scholar at Harvard University; Distinguished Career Professor of Computer Science, Technology and Policy in the School of Computer Science at Carnegie Mellon University; and Director and founder of the Data Privacy Lab. The Lab works with real-world stakeholders to solve today's privacy technology problems. Her work involves creating technologies and related policies with guarantees of privacy protection while allowing society to collect and share person-specific information for many worthy purposes. Her work has received awards from numerous organizations, including the American Psychiatric Association, the American Medical Informatics Association, and the Blue Cross Blue Shield Association. The American College of Medical Informatics inducted her as a Fellow in 2006. Dr. Sweeney received her PhD in computer science from the Massachusetts Institute of Technology in 2001. She joined the faculty of Carnegie Mellon as an Assistant Professor in 1998. More information about Dr. Sweeney is available at her website dataprivacylab.org/people/sweeney.




Copyright © 2014. President and Fellows Harvard University.