Talks on Technology Science (ToTS) and Topics in Privacy (TIP)

Spring 2016

This is the schedule of weekly talks on Technology Science from expert researchers, public interest groups, and others on the social impact of technology and its unforeseen consequences.

Join us on most Monday afternoons 2:30-4 PM at CGIS Knafel, Room K354 (1737 Cambridge St, Cambridge, MA).

Date | Speakers | Topic
1/25 | No talks
2/1 | Dave Choffnes, Northeastern University | ReCon: Identifying and Controlling Privacy Leaks from Mobile Devices
2/8 | Cathy O'Neil | [Rescheduled due to weather] Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
2/16 (Tue) | Eran Toch, Tel Aviv University | [Room K262] Inferential Privacy: Definition and Applications to Genetic and Location Analysis
2/22 | Azer Bestavros, Boston University | Web-based Multi-Party Computation with Application to Anonymous Aggregate Compensation Analytics
2/29 | Meredith Broussard, New York University | Artificial Intelligence for Investigative Reporting
3/7 | Susan Crawford, Harvard Law School, and Holly St. Clair, Commonwealth of Massachusetts | [Cancelled] How Data and Technology Can Help Improve Government
3/14 | Break
3/21 | TBD | TBD
3/28 | Aran Khanna, Harvard University | How to Scrape the Internet: Uncovering Privacy Leaks and Holding Tech Companies Accountable
4/4 | Luke Stark | Visceral Privacy and Intuitive Toxicology
4/11 | Alex Rosenblat, Data & Society | A Case Study of Uber’s Drivers: How employment structures and hierarchies emerge through software
4/18 | Chris Vickery | Down the Rabbit Hole: My Foray into the World of Data Breaches, Security, and Privacy
4/25 | Cathy O'Neil | Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
5/3 (Tue) | Students in Data Science to Save the World | Save the World Poster Session [9:30 - 11AM @ CGIS Knafel 1st floor]
5/10 (Tue) | Karissa McKelvey | [Cancelled] PublicBits: Breaking Down Open Data Silos [10 - 11:30 AM]
5/16 | Alexandra Mateescu | Police Body-Worn Cameras: Negotiating Between Accountability and Privacy
5/23 | Virgilio Almeida | The Role of Data Science in Cyberspace Governance

Descriptions

   
1/25 No talks
2/1

ReCon: Identifying and Controlling Privacy Leaks from Mobile Devices

Mobile systems have become increasingly popular for providing ubiquitous Internet access; however, recent studies demonstrate that software running on these systems extensively tracks and leaks users' personally identifiable information (PII). I argue that these privacy leaks persist in large part because mobile users have little visibility into PII leaked through the network traffic generated by their devices, and have poor control over how, when and where that traffic is sent and handled by third parties.

In this talk, I describe ReCon, a cross-platform system that reveals PII leaks and gives users control over them without requiring any special privileges or custom OSes. Specifically, our key observation is that PII leaks must occur over the network, so we implement our system in the network using a software middlebox. We then use a machine learning approach to efficiently and accurately detect users' PII without knowing a priori the content that is PII. Further, we develop techniques to block, obfuscate, or ignore the PII leak, by displaying leaks via a visualization tool and letting the user decide how the system should act on transmitted PII. I discuss the design and implementation of the system and evaluate its methodology with measurements from controlled experiments and flows from a user study with more than 100 participants. In addition to revealing and controlling PII leaks, we are using our machine-learning-based techniques to automatically identify and block malware based on network behaviors.

Speakers: David Choffnes is an assistant professor in the College of Computer and Information Science at Northeastern University. His research is primarily in the areas of distributed systems and networking, with a recent focus on mobile systems and privacy. Much of his work entails crowdsourcing measurement and performance evaluation of Internet systems by deploying software to users at the scale of tens or hundreds of thousands of users. He earned his PhD from Northwestern (not in the northwest), and completed a postdoc at the University of Washington (in the northwest) prior to joining Northeastern (both in the northeast and northwest). He sees no reason why this should at all be confusing. He is a co-author of three textbooks, and his research has been supported by the NSF, Google, the Data Transparency Lab, VidScale, M-Lab, and a Computing Innovations Fellowship.

2/8

[Rescheduled due to weather] Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy

Far too many algorithms in use today are being used as weapons against populations, whether they are consumers, workers, prisoners, or teachers. I'll talk about a few which I consider the worst kind - and which I call Weapons of Math Destruction - namely, those that are opaque, widespread, and powerful enough to cause tremendous destruction through feedback loops. I will also discuss suggestions for data scientists, policy makers, and the public for how to combat them.

Speakers: Cathy O’Neil earned a Ph.D. in math from Harvard, was a postdoc at the MIT math department, and a professor at Barnard College where she published a number of research papers in arithmetic algebraic geometry. She then switched over to the private sector, working as a quant for the hedge fund D.E. Shaw in the middle of the credit crisis, and then for RiskMetrics, a risk software company that assesses risk for the holdings of hedge funds and banks. She left finance in 2011 and started working as a data scientist in the New York start-up scene, building models that predicted people’s purchases and clicks. She wrote Doing Data Science in 2013 and launched the Lede Program in Data Journalism at Columbia in 2014. She is a weekly guest on the Slate Money podcast and is currently writing a book about the dark side of big data called Weapons of Math Destruction: how big data increases inequality and threatens democracy coming out September 2016 with Random House.

2/16 (Tue)

Inferential Privacy: Definition and Applications to Genetic and Location Analysis

We will have this talk at CGIS Knafel, Room K262.

Massive data sets and sophisticated learning algorithms are routinely used by companies and governments to predict people’s behavior and to deduce key properties of individuals. While the ability to infer detailed knowledge about people can be tied to many privacy threats, existing models of statistical disclosure are limited in modeling the threat and in suggesting effective solutions. In many cases, the data about each individual is processed separately, ruling out solutions such as anonymization and attribute obfuscation. Therefore, we generalize the concept of inferential privacy, and suggest Decision Theory as a solid foundation that can model inferential attacks. In addition, we use Principal Component Analysis (PCA) to quantify the potential privacy loss through the ability of an attacker to infer an individual’s properties. We show initial results from two studies, one in the field of location databases and one in the field of genomic databases, in which we empirically describe an upper bound to privacy loss. Finally, the talk will discuss several privacy-enhancing technologies that rely on inferential privacy, and demonstrate how they can be applied to real-world problems.

Speakers: Prof. Eran Toch is a member of the Faculty of Engineering at Tel Aviv University, in the Department of Industrial Engineering. Eran’s research group works on usable privacy and security, and is currently running several projects funded by agencies such as the Israeli Science Foundation (ISF), Horizon 2020, DARPA, and the Israel Ministry of Science. Prior to joining Tel Aviv University, Eran was a postdoctoral fellow at Carnegie Mellon University’s School of Computer Science, and he has a Ph.D. from the Technion – Israel Institute of Technology.

2/22

Web-based Multi-Party Computation with Application to Anonymous Aggregate Compensation Analytics

In this talk, I will describe the definition, design, implementation, and deployment of a simple multi-party computation protocol and supporting web-based infrastructure. The protocol and infrastructure constitute a software application that allows groups of cooperating parties, such as companies or other organizations, to perform computation over their respective data assets for the purpose of obtaining aggregate analytics without the need to communicate or share their private data sets. The application was developed specifically to support a Boston Women’s Workforce Council (BWWC) study of the gender wage gap among employers within the Greater Boston Area. The application was deployed successfully to collect aggregate statistical data pertaining to compensation levels across genders and demographics at a number of participating organizations. Time permitting, I will summarize our experience with the rollout of this application, lessons learned, and future plans to extend the capabilities we developed by incorporating them into a programming environment that leverages popular cloud platforms for computing analytics over large private data sets.

This work was done at Boston University in collaboration with Andrei Lapets, Eric Dunton, Kyle Holzinger, and Frederick Jansen. It was funded in part by NSF awards #1414119 and #1430145 and by the Initiative on Cities at Boston University.
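As a rough illustration of how parties can obtain an aggregate without revealing their individual inputs, here is a minimal sketch of a secure sum using additive secret sharing. This is a common textbook technique and an assumption on our part; the abstract does not say which protocol the BWWC application uses. The function names, the modulus, and the sample figures are all hypothetical.

```python
import secrets

# Hypothetical modulus; it only needs to exceed any possible true total.
PRIME = 2**61 - 1


def split_into_shares(value, n_parties):
    """Split `value` into n random-looking shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def aggregate(private_values):
    """Simulate a secure sum: no single share reveals any party's input."""
    n = len(private_values)
    # Each party i splits its private value and sends share j to party j.
    all_shares = [split_into_shares(v, n) for v in private_values]
    # Each party j adds up the shares it received and publishes only that sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Combining the published partial sums yields the total of all inputs.
    return sum(partial_sums) % PRIME


# Made-up compensation totals held by three hypothetical employers.
payroll_totals = [120_000, 95_000, 143_500]
print(aggregate(payroll_totals))  # 358500
```

Each individual share is uniformly random, so a party learns nothing about another party's input from the share it receives; only the final combined total is meaningful. A real deployment would also need authenticated channels and protection against colluding parties, which this sketch does not attempt.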

Speakers: Azer Bestavros (PhD 1992, Harvard University) is a Professor in the CS Department at Boston University, which he joined in 1991 and chaired from 2000 to 2007. He is the Founding Director of the BU Hariri Institute for Computing, which was set up in 2010 to "create and sustain a community of scholars who believe in the transformative potential of computational perspectives in research and education." His research contributions include pioneering the web push content distribution model adopted years later by industry, seminal work on Internet traffic characterization, game-theoretic approaches to cloud resource management, and safety certification of networked systems and software. Supported by over $20M of awards from government and industry, his work yielded 18 PhD theses, 6 issued patents, 3 startups, and hundreds of refereed papers that are cited over 15,000 times. His current research is focused on mechanism design for efficient and secure cloud computing. Azer served as chair of the IEEE Computer Society TC on the Internet and as chair or member of the editorial boards of major computer systems and networking conferences and journals. Currently, he co-chairs the Research, Education, and Outreach Committee of the MGHPCC, and is a board member of the Cloud Computing Caucus, a non-profit, non-partisan coalition of industry and key government stakeholders, focused on raising awareness and educating lawmakers and the public on issues associated with cloud computing. Azer received a number of distinguished service awards and best paper awards, most notably the ACM Sigmetrics Inaugural Test of Time Award for technical contributions "whose impact is still felt 10-15 years after initial publication." In 2010, he received the United Methodist Scholar Teacher Award in recognition of "outstanding dedication and contributions to the learning arts and to the institution" at Boston University.

2/29

Artificial Intelligence for Investigative Reporting

Data journalism is the practice of finding stories in numbers and using numbers to tell stories. It’s also one of the most cutting-edge fields in journalism, popularized by sites such as Fivethirtyeight.com, Vox, and The Upshot. In this talk, Meredith Broussard will demystify data journalism. She will show some examples of simple and complex data storytelling projects, including data visualizations, infographics, and news apps. She will discuss how she used artificial intelligence to investigate textbook shortages in Philadelphia public schools and will outline her new tool for investigating campaign finance data. Finally, she will discuss how journalists leverage big data for social good.

Speakers: Meredith Broussard is an assistant professor at the Arthur L. Carter Journalism Institute of New York University and a research fellow at Columbia University’s Tow Center for Digital Journalism. Her current research focuses on artificial intelligence in investigative reporting. A former features editor at the Philadelphia Inquirer, she has also worked as a software developer at AT&T Bell Labs and the MIT Media Lab. Her features and essays have appeared in The Atlantic, Harper’s, Slate, The Washington Post, and other outlets. She holds a BA from Harvard University and an MFA from Columbia University. Follow her on Twitter @merbroussard or contact her via meredithbroussard.com.

3/7

[Cancelled] How Data and Technology Can Help Improve Government

This will be a brainstorming session on the ways data and technology can improve local government. The session begins with a presentation by Susan Crawford. Her recent book The Responsive City highlights the promising intersection of government and data through vivid case studies featuring municipal pioneers and big data success stories from Boston, Chicago, New York, and more. She explores topics including:

  • Building trust in the public sector and fostering a sustained, collective voice among communities
  • Using data-smart governance to preempt and predict problems while improving quality of life
  • Creating efficiencies and saving taxpayer money with digital tools
  • Spearheading these new approaches to government with innovative leadership

Holly St. Clair will respond and provide a few words about her thoughts and vision for the State of Massachusetts.

Then, the remainder of the session will be spent brainstorming ideas for how data and technology can help improve government. What are some low-hanging opportunities?

Speakers: Susan Crawford is a professor at Harvard Law School and a co-director of the Berkman Center. She is the author of Captive Audience: The Telecom Industry and Monopoly Power in the New Gilded Age, co-author of The Responsive City: Engaging Communities Through Data-Smart Governance, and a contributor to Medium.com’s Backchannel. She served as Special Assistant to the President for Science, Technology, and Innovation Policy (2009) and co-led the FCC transition team between the Bush and Obama administrations. She also served as a member of Mayor Michael Bloomberg’s Advisory Council on Technology and Innovation and is now a member of Mayor Bill de Blasio’s Broadband Task Force. Ms. Crawford was formerly a (Visiting) Stanton Professor of the First Amendment at Harvard’s Kennedy School, a Visiting Professor at Harvard Law School, and a Professor at the University of Michigan Law School (2008-2010). As an academic, she teaches Internet law and communications law. She was a member of the board of directors of ICANN from 2005 to 2008 and is the founder of OneWebDay, a global Earth Day for the internet that takes place each Sept. 22. She has been named one of Politico’s 50 Thinkers, Doers and Visionaries Transforming Politics in 2015; one of Fast Company’s Most Influential Women in Technology (2009); an IP3 Awardee (2010); one of Prospect Magazine’s Top Ten Brains of the Digital Future (2011); and one of TIME Magazine’s Tech 40: The Most Influential Minds in Tech (2013). Ms. Crawford received her B.A. and J.D. from Yale University.

Holly St. Clair is the Director of Enterprise Level Data Management, overseeing the Commonwealth of Massachusetts's activities in data management, data analysis, research, and public access to data. Ms. St. Clair has pioneered the use of advanced decision support tools in Metropolitan Boston, managing a variety of projects that use scenario modeling, community indicators, and innovative meeting formats to engage stakeholders in dialogue about policy choices. She has an excellent track record in public sector innovation and is recognized by Planetizen as one of the Leading Thinkers and Innovators in the field of Urban Planning and Technology.

3/14 Break
3/21 TBD
3/28

How to Scrape the Internet: Uncovering Privacy Leaks and Holding Tech Companies Accountable

What’s really going on with your data at tech companies like Facebook or Venmo? Learn to build software that empowers users to discover for themselves the consequences of the digital footprint they are leaving behind. Over the course of the seminar, Aran Khanna will teach you how he was able to build simple computer programs to expose privacy risks at major tech companies, which resulted in a viral public response and changes in popular products. You can build your own tools to uncover vulnerabilities in your favorite website or app and use that knowledge to hold tech companies accountable.

Speakers: Aran is currently a member of the class of 2016 at Harvard College, where he studies computer science and mathematics. On campus Aran conducts independent research in data privacy and security under the guidance of the former Federal Trade Commission CTO, Latanya Sweeney. He has had multiple papers published in Harvard’s Technology Science journal, one of which has been taught in university classrooms across the U.S. Aran’s work has also been featured in media outlets around the world, including The Guardian, Time, Forbes, The BBC and many more. Additionally, Aran is a contributor to the Huffington Post and Business Insider.

4/4

Visceral Privacy and Intuitive Toxicology

Much current scholarship examining online privacy and surveillance suggests either legal or narrowly technical solutions to contemporary anxieties around data tracking, aggregation, and analytics. In these discussions, human privacy is not always understood as an embodied, subjective phenomenon. In this talk, I will argue for a new focus on the visceral aspects of privacy as a prompt for both research and design, and explore how online privacy can be supported by what I term “data visceralizations”: representations of information that engage multiple senses including touch, smell, taste, and emotional arousal. Through comparisons with an analogous concept from environmental science, “intuitive toxicology” (interventions for mobilizing visceral responses to warn against contaminated air, water, and food), I argue for making data more visceral in novel privacy-preserving and privacy-enhancing technologies.

Speaker: Luke Stark is completing his PhD in the Department of Media, Culture, and Communication at New York University under the supervision of Helen Nissenbaum. His dissertation project, “That Signal Feeling: Psychology and Interaction Design from Smartphones to the Anxious Seat,” explores the genealogy of how psychological theories of emotion have been built into the design of interactive systems. Luke holds both an Honours BA in History & English and an MA in History from the University of Toronto. Luke is a 2015-16 Doctoral Fellow at the NYU Center for the Humanities, and has been generously funded by the National Science Foundation, New York University’s Provost’s Global Research Initiative, the Intel Science and Technology Center for Social Computing, and Microsoft Research. Learn more at https://starkcontrast.co.

4/11

A Case Study of Uber’s Drivers: How employment structures and hierarchies emerge through software


Uber manages a large, disaggregated workforce that delivers a relatively standardized experience to passengers while simultaneously promoting drivers as entrepreneurs whose work is characterized by freedom, flexibility, and independence. Uber, like other companies in the on-demand economy, uses its identity as a platform and a technology company to elude legal responsibility for its drivers as a traditional employer. It claims to provide a “lead generation application” for drivers to connect with passengers, but this neutral branding of its role as an intermediary belies the important employment structures and hierarchies that emerge through its software application. Through a 9-month empirical research study of Uber driver experiences, my colleague Luke Stark and I found that Uber does exert significant control over how drivers do their jobs, but this control is structured to be indirect. The opacity of control is achieved through a range of semi-automated managerial functions, foremost among them algorithmic labor logistics management, driver surveillance and the rating system, and performance targets and policies that limit the choices drivers can make to optimize their individual earnings on the system. Our conclusions are twofold: first, the information and power asymmetries and opacities produced by the Uber application are fundamental to its ability to structure indirect control over its workers; second, the rhetorical invocation of technology and algorithms is used within the on-demand economy to structure corporate relationships to labor. Our study of the Uber driver experience points to the need for greater attention to the role of platform disintermediation in shaping power relations and communications between employers and workers.

Speakers: Alex Rosenblat is a researcher and technical writer at Data & Society. Her areas of research include socio-technical systems, the intersection of technology and labor, and the social and civil rights implications of emergent technologies, such as police body-worn cameras. She currently examines the relationship between semi-automated systems and labor management, with a particular focus on the on-demand economy. She holds an MA in Sociology from Queen’s University in Canada, and a BA in History from McGill University. Twitter: @mawnikr

4/18

Down the Rabbit Hole: My Foray into the World of Data Breaches, Security, and Privacy

From the voter registration records of 191 million Americans to the passwords of 3.3 million Hello Kitty fans and the private litigation strategies of California’s largest insurance companies, I’ve found it all. Locating publicly exposed databases is my specialty, and it has led to some interesting situations. Whether you think I’m a hacker or a hero, I guarantee you’ll enjoy hearing about my adventures.

Speaker: Chris Vickery, hacker / hero

4/25

Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy

Far too many algorithms in use today are being used as weapons against populations, whether they are consumers, workers, prisoners, or teachers. I'll talk about a few which I consider the worst kind - and which I call Weapons of Math Destruction - namely, those that are opaque, widespread, and powerful enough to cause tremendous destruction through feedback loops. I will also discuss suggestions for data scientists, policy makers, and the public for how to combat them.

Speakers: Cathy O’Neil earned a Ph.D. in math from Harvard, was a postdoc at the MIT math department, and a professor at Barnard College where she published a number of research papers in arithmetic algebraic geometry. She then switched over to the private sector, working as a quant for the hedge fund D.E. Shaw in the middle of the credit crisis, and then for RiskMetrics, a risk software company that assesses risk for the holdings of hedge funds and banks. She left finance in 2011 and started working as a data scientist in the New York start-up scene, building models that predicted people’s purchases and clicks. She wrote Doing Data Science in 2013 and launched the Lede Program in Data Journalism at Columbia in 2014. She is a weekly guest on the Slate Money podcast and is currently writing a book about the dark side of big data called Weapons of Math Destruction: how big data increases inequality and threatens democracy coming out September 2016 with Random House.

5/3 (Tue) Save the World Poster Session [9:30 - 11AM @ CGIS Knafel 1st floor]
5/10 (Tue)

[Cancelled] PublicBits: Breaking Down Open Data Silos [10 - 11:30 AM]

Dat is an open source, decentralized data-sharing tool for versioning and syncing data. Inspired by the best parts of Git and BitTorrent, Dat shares data through a free, redundant network that assures the integrity and openness of data. In this talk, we will introduce Dat along with the “PublicBits” distributed network, inspired by BitTorrent but adapted for open data, which will increase the availability and redundancy of data by connecting people who want to download data with those who already have it. PublicBits users may offer their unused bandwidth to mirror their previously-downloaded datasets for others as a public service. This way, if the original source goes offline, the data will still be downloadable. We have started building a search engine called PublicBits.org that will crawl data sources and convert the world's data sources from centralized and siloed servers to a new decentralized data network.

Speakers: Karissa McKelvey is a software developer, speaker, inventor, and activist supporting an equitable web. Formerly a research scientist at Indiana University, her work studying online political communication resulted in multiple peer-reviewed papers and press in outlets such as NPR and the Wall Street Journal. In addition to being an experienced software and web developer, she has successfully led and implemented diverse projects throughout her career in academia, non-profits, and industry. In her spare time, she teaches beginning programmers, volunteers her time in social movements, and plays the trumpet.

5/16

Police Body-Worn Cameras: Negotiating Between Accountability and Privacy

Since early 2014, the rapid adoption of police body-worn cameras across the country has garnered significant attention as the cameras became the latest police reform technology, built on promises of greater accountability and transparency even as civil rights groups have cautioned that “police-operated cameras are no substitute for broader reforms of policing practices.” Body-worn cameras, it has been argued, would eliminate the need for the fortuitous happenstance of a cell-phone wielding bystander by recording continuously and approximating the field of view of police officers on the ground. But premised on the assumption of constant or near-constant recording, body-worn cameras have raised concerns around the interlacing of accountability tools with surveillance, and the array of practical considerations regarding the management, storage, and redaction of audio-visual data, as well as the implications of making that data public. According to one estimate, major city police departments that regularly deploy body-worn cameras are likely generating more than 10,000 hours of video data a week. This has prompted the question, are the costs and attendant risks to privacy worth it? In this talk, I will lay out the current landscape of body-worn camera programs, to ask: if accountability is the primary justification for camera adoption, how should individual and social costs, particularly towards marginalized people, be weighed and assessed in relation to the as-yet unknown benefits?

Speaker: Alexandra Mateescu is a researcher at Data & Society, working under its Data & Fairness initiative with a focus lately on the on-the-ground impact of criminal justice technologies. In February 2015, she co-authored a primer that brought together literature and existing empirical work on body-worn cameras. She has also worked on issues of police body-worn cameras, social media surveillance, and incarceration technologies for the 2015 Data & Civil Rights: A New Era of Policing and Justice conference hosted by Data & Society together with the Leadership Conference and Upturn. She holds a BA and MA in Cultural Anthropology from the University of Chicago.

5/23

The Role of Data Science in Cyberspace Governance

In this talk I discuss problems related to cyberspace governance. I show how data-oriented techniques can be used to analyze public and private policies for different platforms, such as Uber and Facebook. Data analysis can be an important tool for policy-makers to evaluate and propose evidence-based policies for cyberspace governance.

Speaker: Virgilio Almeida is a full professor in the Computer Science Department at the Federal University of Minas Gerais (UFMG), Brazil. His areas of research interest include large-scale distributed systems, Internet governance, social computing, autonomic computing, and performance modeling and analysis. He received a Ph.D. degree in Computer Science from Vanderbilt University, an MS in Computer Science from the Pontifical Catholic University in Rio de Janeiro, and a BS in Electrical Engineering from UFMG, Brazil. He was a visiting professor at Boston University, the Technical University of Catalonia (UPC) in Barcelona, and the Polytechnic Institute of NYU, and held visiting appointments at the Santa Fe Institute, Hewlett-Packard Research Laboratory, and Xerox Research Center.

He is a former National Secretary for Information Technology Policies of the Ministry of Science, Technology and Innovation of Brazil (2011 to 2015). He is the chair of the Brazilian Internet Governance Committee (CGI.br). He was the chair of NETmundial, Global Multistakeholder Conference on the Future of Internet Governance, that was held in Sao Paulo in 2014.

He has published over 150 technical papers and co-authored five books on performance modeling of computer systems, including "Performance By Design" (2004), "Capacity Planning for Web Services" (2002), and "Scaling for E-business" (2000), published by Prentice Hall. He has supervised more than 50 PhD theses and MSc dissertations. Prof. Almeida is a full member of the Brazilian Academy of Sciences.

He is currently a Visiting Professor at the School of Engineering and Applied Sciences at Harvard University and a Fellow at the Berkman Center for Internet & Society.

Prior Sessions

Fall 2015 | Spring 2014 | Fall 2013 | Spring 2013 | Fall 2012 | Spring 2012 | Fall 2011

Copyright © 2012-2015. President and Fellows Harvard University.