Data Privacy Lab  

Harvard University

The Politics of Personal Data

Gov 1430


Lab 2. Social Security Numbers Online


In this lab, you will work to gather and report information for an overall class experiment related to Social Security numbers online.

Suggested Readings


A goal of this lab is to locate resumes on the Web that contain Social Security numbers and then, for each resume found, send an automated email message alerting the subject that they are needlessly placing themselves at risk of identity theft.

Years earlier, Professor Sweeney ran an automated system that did this, named the Identity Angel Project. Professor Sweeney asserts that back then the system, with the help of the media, was able to rid the Web of thousands of online resumes containing Social Security numbers. Over time, Professor Sweeney has noticed the return of these resumes, which are accessible through simple search queries.

Part 1. Locating Resumes Online with Social Security Numbers (Individual Activity)

The first part of this lab enlists your help in tracking down resumes and notifying the subject of the resume about its presence. Go to the course wiki for information about this activity. Contact Sean Hooley or Professor Sweeney if you cannot access the course wiki.

Activity. Individual survey (Due Tuesday, 2/24)

Submit your survey answers for Part 1. This is an individual submission. Send your individual answers to Professor Sweeney. You also have a group assignment this week; see Part 2 below.

Part 2. Assessing the Situation (Group Activity)

Assume we have a modernized version of Professor Sweeney's old program that accomplishes the following.

  1. Identifies "real" resumes from an online search

  2. Harvests an email address from each resume, if one is present.

  3. Locates a Social Security number in a resume, if present.

  4. Works with resumes stored in PDF and HTML formats, not just text.

  5. Sends an email message to the person who is the subject of the resume, alerting them to the location where the resume with their SSN was found.
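As a rough illustration of steps 2 and 3 only, the sketch below scans plain resume text for an email address and an SSN-like string. It is not Professor Sweeney's program. The function name `scan_resume`, the two regular expressions, and the choice to report only the last four digits are all assumptions made for this example. It does not attempt step 1 (deciding whether a page is a "real" resume), step 4 (PDF or HTML input), or step 5 (sending mail).

```python
import re

# Assumed layout: SSNs written as 123-45-6789 or 123 45 6789. Real resumes may
# use other layouts, so treat this pattern as a starting point only.
SSN_RE = re.compile(r"\b(\d{3})[- ](\d{2})[- ](\d{4})\b")

# Deliberately simple email pattern; it is not a full address parser.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scan_resume(text):
    """Report the first email address and whether an SSN-like string appears."""
    email = EMAIL_RE.search(text)
    ssn = SSN_RE.search(text)
    return {
        "email": email.group(0) if email else None,
        "has_ssn": ssn is not None,
        # Keep only the last four digits so the scan never stores a full number.
        "ssn_masked": f"XXX-XX-{ssn.group(3)}" if ssn else None,
    }
```

A nine-digit pattern alone will also match strings that are not Social Security numbers, so any real experiment would need to estimate how often this kind of match is a false positive.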

Meet professors who have hypotheses that question Professor Sweeney's vision.

System Not Effective

Professor Alice asserts you can never tell the public because then identity thieves will build the same system.

Professor Bob asserts that the effort is useless because permanent archives will always exist anyway.

Professor Cathy asserts that the effort is useless because no one will respond to the email because it will be like spam.

Don't Bother

Professor David asserts that online resumes, more than any other form of identity theft, make the country vulnerable to economic warfare.

Professor Ellen asserts that Google alone can solve the problem by not answering these kinds of search queries.

Professor Frank asserts that it is illegal to have a repository of Social Security numbers so no one will use them for fear of being caught.

Social Security Numbers

Professor Gail asserts that Social Security numbers are just like fingerprints and we don't have a problem with fingerprints.

Professor Harry asserts that no other country has a similar problem, even those countries with national ID numbers.

Professor Irene asserts that the problem can be solved by making everyone's name and Social Security number public.

Professor Jack asserts that the problem can be solved by replacing uses of Social Security numbers with fingerprints.

Professor Sweeney wonders whether you can prove or disprove the claims of these professors. Your task for next week is to design an experiment that, when carried out, will reveal the answer. You do not have to do the experiment you design; just describe it in enough detail that its credibility, utility, and effectiveness in answering the question can be evaluated.

Activity. Experimental Design

Divide into groups of 4-5 people. Select an assertion made by one of the professors. The goal of your group is to report on an experiment that could test that professor's assertion. Choose a group leader who will be responsible for writing your experimental design and making the presentation next week. In your group, brainstorm possible approaches and settle on a design. You do not have to actually do the experiment for Tuesday, just design it. You should include enough credible evidence in your write-up to support an expected outcome for your experiment. Then, on Tuesday, your group leader will make a 5-minute presentation describing your group's plan. Your group will also submit a short (3-5 page) document describing your experiment.

Be sure to be in a group of people that you have not been with previously. Be sure that the group leader is someone who has not been a group leader previously. Be sure to send the names of the people in your group and the name of the group leader to Sean Hooley by the end of class today.

Submit a short paper (3-5 pages) from your group describing the experiment you have designed. Make a 5-minute presentation of your experimental design. Be sure to include the names of all group members on the paper. (This is in addition to the individual submission in Part 1.)

In this course you will write scientific papers. The scientific paper you will write for this assignment will have five basic parts: an Abstract, Introduction, Methods, Results, and Discussion. These should be the headings in your write-up. The Abstract section should be a one-paragraph summary stating what you intend to show with the experiment, how, and the expected outcome. The Introduction describes why your experiment is important. You should not assume the person reading the paper knows anything about this assignment. Include references and use authoritative sources to make your points. You should not make any sweeping or unsubstantiated statements in your writing. End the Introduction with a description of the hypothesis you will test. The Methods section is where you describe the experiment you would do to test the hypothesis. In the Results section, you will likely not have results for the experiment itself, though you may have preliminary results that provide an expected outcome for your experiment. Use the Discussion section to explain why this test would prove or disprove the hypothesis. Include at least one statement about the limits of your approach.


The supplement contains make-up material for this Lab and optional information.

Copyright © 2013-15 President and Fellows Harvard University | Data Privacy Lab