The Thinkers: Data privacy drives CMU expert's work
The Thinkers [This monthly series will highlight people from Western Pennsylvania who are at the forefront of new ideas in their fields.]
Monday, December 26, 2005
By Mark Roth
When Latanya Sweeney clicked on a computer link recently that she thought would take her to a student newspaper Web site, she quickly realized with a groan that she had been scammed.
Her computer had been infected by a virus, and she had to purge every file before she could use it again.
That's aggravating enough for an everyday user, but for a leading expert on computer privacy, it's embarrassing.
Dr. Sweeney runs the Data Privacy Lab at Carnegie Mellon University and teaches in the school's prestigious Institute for Software Research International.
The invasion of her computer shows how vigilant people must be these days in the never-ending struggle to benefit from the Internet's life-changing technology while also protecting their privacy.
That challenge has sent her lab in two parallel directions.
One is to demonstrate the vulnerability of personal information and give people tools to protect their data. The other is to find ways that the government can conduct surveillance for potential terrorists without invading citizens' privacy.
One of her early experiments showed that it was easy to identify people based on limited information.
When the federal Health Insurance Portability and Accountability Act, called HIPAA, was being formulated, many experts thought it could keep medical records confidential by removing names, addresses and Social Security numbers.
But that often still left dates of birth, genders and ZIP codes on the records, and Dr. Sweeney was able to show that 87 percent of people could be "re-identified" by comparing that information with publicly available databases like voter registration lists.
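The linking idea can be illustrated with a short sketch. This is not Dr. Sweeney's actual code or data; every record, name and function below is invented for illustration. It simply matches "anonymous" medical records to a voter list on the three fields that were left behind, and counts a person as re-identified only when exactly one voter fits.

```python
from collections import defaultdict

# Hypothetical "de-identified" medical records: names removed, but
# date of birth, gender and ZIP code left in place.
medical_records = [
    {"dob": "1961-03-14", "gender": "F", "zip": "15213", "diagnosis": "asthma"},
    {"dob": "1975-11-02", "gender": "M", "zip": "15217", "diagnosis": "diabetes"},
    {"dob": "1980-06-30", "gender": "F", "zip": "15232", "diagnosis": "migraine"},
]

# Hypothetical public voter registration list (all names invented).
voter_list = [
    {"name": "Alice Example", "dob": "1961-03-14", "gender": "F", "zip": "15213"},
    {"name": "Bob Sample", "dob": "1975-11-02", "gender": "M", "zip": "15217"},
    {"name": "Carol Placeholder", "dob": "1980-06-30", "gender": "F", "zip": "15232"},
    {"name": "Dana Invented", "dob": "1980-06-30", "gender": "F", "zip": "15232"},
]


def linking_key(record):
    """The three fields shared by both data sets."""
    return (record["dob"], record["gender"], record["zip"])


def reidentify(records, voters):
    """Return (name, diagnosis) pairs for records that match exactly one voter."""
    names_by_key = defaultdict(list)
    for voter in voters:
        names_by_key[linking_key(voter)].append(voter["name"])

    matches = []
    for record in records:
        candidates = names_by_key.get(linking_key(record), [])
        if len(candidates) == 1:  # unique match: the record is re-identified
            matches.append((candidates[0], record["diagnosis"]))
    return matches


matches = reidentify(medical_records, voter_list)
for name, diagnosis in matches:
    print(name, "->", diagnosis)
```

In this made-up example, two of the three records point to a single voter. The third stays ambiguous because two voters share the same birth date, gender and ZIP code.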
That led to a software tool called Identity Angel, which figures out how much personal information about a person is on the Internet, and whether someone else could use that information to fraudulently obtain a credit card in that person's name.
All that most major credit card companies require is an accurate name, address, Social Security number and date of birth.
Identity Angel showed that many people provide the first three pieces of information, and sometimes birth dates as well, on resumes that can be found online. In many cases, they may not be aware that these resumes, which they might have created for job searches while in college, have been posted on the Web, Dr. Sweeney said.
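How Identity Angel does its searching is not described here, but the basic check can be sketched. The example below is an assumption-laden illustration, not the lab's software: the function name, the patterns and the resume are all invented, and a real tool would need far more careful matching (the date pattern, for instance, would also catch dates that are not birth dates).

```python
import re

# The four items the article says most major credit card companies require.
REQUIRED = {"name", "address", "ssn", "date_of_birth"}

# Simplified, hypothetical patterns: an SSN written as 123-45-6789 and
# a date written as MM/DD/YYYY.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
DATE_PATTERN = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")


def exposed_fields(text, name, address):
    """Return which of the four required items appear in the given text."""
    found = set()
    lowered = text.lower()
    if name.lower() in lowered:
        found.add("name")
    if address.lower() in lowered:
        found.add("address")
    if SSN_PATTERN.search(text):
        found.add("ssn")
    if DATE_PATTERN.search(text):
        found.add("date_of_birth")
    return found


# An invented resume with an invalid, made-up Social Security number.
resume = """Pat Fictional
100 Made Up Street, Pittsburgh, PA
SSN: 000-12-3456
Objective: software internship
"""

found = exposed_fields(resume, "Pat Fictional", "100 Made Up Street")
missing = REQUIRED - found
print("exposed:", sorted(found))
print("still needed by a fraudster:", sorted(missing))
```

Here the invented resume gives away three of the four items, leaving only the birth date to be found elsewhere.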
Once her program showed how vulnerable people were, she said, "we then wanted to see if we could make things better. So every time we would find people we would try to e-mail them and say, 'Hey, you've got a resume online; maybe you should try to take it off.' "
Software like Identity Angel shows that "it's very difficult for most people to understand the risk of information they may give away, and not just on the Internet."
Rapid rise toward CMU
When Latanya Arvette Sweeney gives away information about herself, it reveals an unusual upbringing, a meteoric rise through elite schools, a 10-year side trip and a pioneering place as one of the first African-American women in her field.
Dr. Sweeney was born in Tennessee to a very young mother and a father who soon left for the armed services. She was adopted and raised by her great-grandparents in Nashville.
As the only child in the household, "I got their undivided attention," she said, "and they had a strong belief in education."
When she was just a teenager, her great-grandparents died, but she was able to get a scholarship to the elite Dana Hall Schools in Wellesley, Mass.
There, she graduated with honors, and encountered her first computer, a Wang word processor.
That led to her admission in 1979 to the Massachusetts Institute of Technology, where she began to study computer science and electrical engineering.
"It was a tremendous culture shock," Dr. Sweeney recalled. "I was used to being the only black person in my classes, but MIT's social norms were so different."
She didn't have any problems with fellow students, but she did suffer a pattern of discrimination from one professor.
He consistently gave her subpar marks, sometimes saying she provided too little explanation for her answers and other times saying she provided too much.
Then the professor taught the "resistor code" -- the meaning of the color bands that mark electrical resistors in a circuit. To help students memorize the order, the professor provided an old mnemonic phrase in which the first letter of each word corresponded to a color.
The phrase? "Black boys rape only young girls but Violet gives willingly."
Experiences like that led Dr. Sweeney to drop out of MIT. She then formed her own computer software and sales company, which she operated for a decade.
After that real-world experience, she returned to school at Harvard University, graduating cum laude in computer science in 1995. She then went back to MIT to earn both her master's and doctoral degrees.
On that second trip through MIT, "I had a good time," she said. "By then, I wasn't taking any more crap from anyone, and they really seemed to take note of that."
She was hired by Carnegie Mellon in 2002.
New surveillance techniques
When Dr. Sweeney was attending MIT as a graduate student, she got a fellowship from the National Library of Medicine, and in gratitude, she volunteered to help several Boston hospitals protect the privacy of their medical records.
That led her down a path that has resulted in her Carnegie Mellon lab developing several privacy-protecting software programs that could prove useful in America's war on terror.
She said much of the debate on this issue, like the fierce arguments recently over extending the USA Patriot Act, portrays terrorism surveillance and privacy as an either-or proposition. But it doesn't have to be that way, Dr. Sweeney said.
The Data Privacy Lab has developed several computer programs that would allow the government to watch for suspicious behavior without compromising citizens' privacy.
After Sept. 11, 2001, for instance, some government officials touted the use of video surveillance technology to monitor suspicious behavior and identify potential terrorists. But that raised concern about the government spying on people who were doing nothing wrong.
Dr. Sweeney's lab has developed a program that can alter the images of people's faces so that investigators can conduct surveillance without initially being able to identify who they are watching.
The program creates a brand-new facial image, either by combining the features of two similar faces, or by converting the features into a "clown face," neither of which can be recognized by facial recognition software.
If authorities then spot suspicious behavior, they can seek a court order to reveal the actual faces.
Another program developed by the lab would allow government agents to track possible acts of bioterrorism, such as a spike in people seeking treatment for certain diseases, but it would keep the information about individual patients anonymous until something suspicious had been detected.
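A rough sketch shows how such monitoring could work in principle. It is not the lab's program: the visit data, the function names and the simple statistical rule (flag a day that exceeds the earlier average by two standard deviations) are all assumptions made for illustration. The point is that the watcher sees only daily totals, never patient identities.

```python
from statistics import mean, stdev

# Hypothetical clinic visits for one set of symptoms. The "patient" field
# stands in for identifying details that the monitor never reads.
visits = [
    {"patient": f"hypothetical-{day}-{i}", "day": day}
    for day, n in enumerate([2, 3, 2, 3, 9], start=1)
    for i in range(n)
]


def anonymous_daily_counts(records):
    """Reduce visit records to per-day totals, dropping all patient details."""
    counts = {}
    for record in records:
        counts[record["day"]] = counts.get(record["day"], 0) + 1
    return [counts[day] for day in sorted(counts)]


def is_spike(daily_counts, threshold_sd=2.0):
    """Flag the latest day if it exceeds the earlier mean by threshold_sd deviations."""
    *history, today = daily_counts
    limit = mean(history) + threshold_sd * stdev(history)
    return today > limit


counts = anonymous_daily_counts(visits)
spike = is_spike(counts)
print("daily totals:", counts)
print("suspicious spike:", spike)
```

Only when a spike is flagged would anyone have grounds to ask for the underlying patient records.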
At this point, the government is not funding full-scale development of these technologies, but Dr. Sweeney is convinced that day will come.
"It's inevitable for society to adopt anti-terrorism provisions," she said. "And our type of technology is the only way for society to enjoy the benefit of those increased pursuits while at the same time maintaining the civil liberties that are the cornerstone of the country."
Position: Associate professor of computer science, technology and policy; director, Data Privacy Lab, Carnegie Mellon University
Education: Harvard University, bachelor's degree in computer science, cum laude, 1995; Massachusetts Institute of Technology, master's and doctorate in computer science, 1997 and 2001
Previous position: Owner and president, CESS Inc. computer software and support firm, 1981 to 1991
Awards: Recognition award for anonymity-preserving software, 2004 Workshop on Privacy-Enhancing Technologies, Toronto; privacy leadership award, 2002, Blue Cross Blue Shield Association of Michigan; privacy advocate award, 2001, American Psychiatric Association
Publications: Nearly 30 refereed papers for journals and workshops, including award-winning papers on how DNA can be used to re-identify individuals and how computer software can make individual information anonymous.