Technology in Government (TIG) | Topics in Privacy (TIP)
Topics are usually not posted earlier than the week before.
Michael Fertik founded Reputation.com with the belief that people and businesses have the right to control and protect their online reputation and privacy. Credited with pioneering the field of online reputation management (ORM), Fertik is lauded as the world's leading cyberthinker in digital privacy and reputation. Michael was most recently named Entrepreneur of the Year by TechAmerica, an annual award given by the technology industry trade group to an individual it believes embodies the entrepreneurial spirit that made the U.S. technology sector a global leader.
He is a member of the World Economic Forum Agenda Council on the Future of the Internet and a recipient of the World Economic Forum Technology Pioneer 2011 Award; through his leadership, the Forum named Reputation.com a Global Growth Company in 2012. Fertik is an industry commentator with guest columns in Harvard Business Review, Reuters, Inc.com and Newsweek. He frequently appears on national and international television and radio, including BBC, Good Morning America, Today Show, Dr. Phil, CBS Early Show, CNN, Fox, Bloomberg, and MSNBC. He is also co-author of the bestselling books "Wild West 2.0" and "The Reputation Economy" (Crown, forthcoming in 2013).
Fertik founded his first Internet company while at Harvard College. He received his JD from Harvard Law School.
This is a two-part discussion. Part I: How can we know how much racial animus costs a black candidate if few will admit
such socially unacceptable attitudes to surveys? I suggest a new source to proxy an
area's prejudice: Google search queries. I compare the proxy -- the percent of an area's
Google searches that include racially charged language -- to Barack Obama's 2008
and 2012 vote shares, controlling for the vote share of the 2004 Democratic presidential candidate,
John Kerry. Previous research using a similar specification, but with survey-based proxies
for racial attitudes, yielded little evidence that racial attitudes affected Obama.
An area's racially charged search rate, in contrast, is a robust negative predictor of
Obama's vote share. Continuing racial animus in the United States appears to have
cost Obama roughly four percentage points of the national popular vote in both 2008
and 2012, giving his opponent the equivalent of a home-state advantage nationally.
Part II: Google searches prior to an election can be used to predict turnout in different parts of the United States.
Change in October search volume for "vote/voting" over a four year period explains 20-40 percent of state-level change in turnout rates.
The predictive power is little affected by changes in registration rates or early votes over the same period. This information might prove useful in predicting candidate performance beyond what is contained in polls.
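For readers who want to see the shape of the Part I specification, a minimal sketch in Python follows. It is an illustration only, not the author's code; the file and column names are invented, and the unit of analysis is simply called an "area".

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical area-level table: one row per area, with Obama's 2008 vote
    # share, Kerry's 2004 vote share, and the share of the area's Google
    # searches that contain racially charged language.
    areas = pd.read_csv("area_level_votes_and_searches.csv")

    # Obama's vote share regressed on the search-based proxy, controlling for
    # the 2004 Democratic vote share; the abstract reports a robust negative
    # coefficient on the proxy.
    fit = smf.ols(
        "obama_share_2008 ~ racially_charged_search_rate + kerry_share_2004",
        data=areas,
    ).fit()
    print(fit.summary())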
Seth Stephens-Davidowitz is an economics PhD student at Harvard University.
Society is capturing and constructing data on scales that could not possibly have been imagined previously,
so much so that many of these data flows are never-ending streams containing huge volumes of information.
These are not merely what others refer to as "big data"; they include the largest of these streams,
"very big data". In order for society to reap many of the potential benefits from very big data,
we have to rethink the way researchers engage with data. For example, researchers have routinely
worked with small, self-contained datasets that can be copied and shared online with ease.
But you cannot just copy voluminous continuous real-time streaming data. Researchers need
new tools, methods and practices. Attempting to use historical approaches is like trying
to get a sip of water, not from a water fountain, but from a hydrant! In this TIG session,
we brainstorm on the tools, methods, workflows, and protections needed.
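As one concrete, hypothetical example of such a practice (not a specific Dataverse feature), statistics over a never-ending stream can be computed incrementally, touching each record once, rather than copying the data first:

    def running_mean(stream):
        """Consume an unbounded iterable of numeric observations one at a
        time, yielding the mean of everything seen so far; memory use stays
        constant no matter how long the stream runs."""
        count, total = 0, 0.0
        for value in stream:
            count += 1
            total += value
            yield total / count

    # Usage with any live feed exposed as an iterator, e.g. a generator that
    # reads from a socket or message queue:
    # for current_mean in running_mean(sensor_feed()):
    #     update_dashboard(current_mean)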
Discussant: Dr. Merce Crosas is the Director of Product Development for the Dataverse,
an open-source application for sharing, citing, archiving, discovering and analyzing
research data, created by the Institute of Quantitative Social Science (IQSS) at Harvard
and used in installations throughout the world, including most recently at Harvard Library
to assist researchers with curation and management of research data. The Dataverse at
IQSS currently houses the world's largest collection of social science research data,
hosting more than 51,000 studies comprising 719,000 files.
As very big data sets become increasingly common, the Dataverse Network application
will need to support them to continue fulfilling its mission of making data accessible
for research. In this discussion we'll brainstorm on technologies, workflows
and architecture that can help to make this possible. Should computations run locally
or in the cloud? Will new non-relational databases be useful (Cassandra, MongoDB, Neo4j,
etc.)? Should the application support different types of storage? What are the use cases for different
data types and for different types of researchers? ... and what questions are we missing?
Discussant: Michael Bar-Sinai is a software engineer and a PhD student at Ben-Gurion
University of the Negev, Israel. He became interested in the moral implications of
software systems after developing an evaluation system for a human resources department.
He later insisted that the system never be used.
In the wake of the 2008 real estate market crash and ensuing Great Recession, there have
been growing calls for greater transparency and accountability in the US mortgage market.
The policy response to these calls has taken the form of new and expanded public-use
government mortgage datasets comprising individual loan-level data. Fannie Mae and
Freddie Mac, the Securities and Exchange Commission, and the new Consumer Financial
Protection Bureau have all been tasked with launching enhanced mortgage datasets
for public consumption, which could include data on individual borrowers' credit scores,
income, age, race, and gender.
Though the expansion in publicly available government data on mortgage lending promises
to enhance research on and supervision of market activity, including the incidence of
discriminatory lending, the data may also introduce new privacy concerns. These concerns
include an increased risk - and resulting harm - of re-identifying individual borrowers
from government mortgage data. Government agencies therefore face a trade-off between
releasing the most useful mortgage data and protecting consumer privacy. How should
government agencies with public data mandates address these privacy issues,
particularly in a world where government and non-government data on individuals
is increasingly prevalent?
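To make the re-identification concern concrete, the sketch below shows, with invented file and column names, how a public loan-level file that shares a few quasi-identifiers with an outside dataset could be linked back to named individuals. It illustrates the risk in general; it is not an analysis of any actual release.

    import pandas as pd

    public_loans = pd.read_csv("public_mortgage_extract.csv")    # hypothetical
    outside_list = pd.read_csv("commercial_marketing_list.csv")  # hypothetical

    # Fields present in both files, none of which is a direct identifier.
    quasi_identifiers = ["census_tract", "age", "income_band"]

    linked = public_loans.merge(outside_list, on=quasi_identifiers, how="inner")

    # A quasi-identifier combination that yields exactly one linked row points
    # to a single borrower and a single named person: a candidate re-identification.
    counts = linked.groupby(quasi_identifiers).size()
    print((counts == 1).sum(), "loan records link to exactly one named individual")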
Discussant: Sarah Ellis is a Master in Public Policy student at Harvard's Kennedy
School of Government. She has previously worked on mortgage data policy at the
Consumer Financial Protection Bureau. Her capstone research project focuses on
addressing privacy risk in public use mortgage data.
References:
Executives in technology, retail, marketing and other industries like to say that data is "the new oil" or, at least, the fuel that powers the Internet economy. It is a metaphor that casts consumers as natural resources with no say over the valuable commodities that companies extract from them. Yet this data extraction is often opaque to consumers and largely unregulated. To give readers some insight into the data economy, The New York Times last year published an investigative series, called "You for Sale," in which we examined different industries that collect, analyze, use and sell information about consumers. The series helped prompt separate investigations by the United States House of Representatives, U.S. Senate, Government Accountability Office, and the Federal Trade Commission. This year, the series won an award for personal finance reporting from the Society of American Business Editors and Writers.
Discussant: Natasha Singer is a reporter in the Sunday Business section of The New York Times where she covers the business of consumer data. She was previously a reporter in the Business section covering the pharmaceutical industry and professional medical ethics. In 2010, she was a member of a team of New York Times reporters whose series on cancer was a finalist for a Pulitzer Prize in explanatory reporting.
@natashanyt
References:
Statistical agencies that disseminate public use microdata, i.e., data on individual records, seek to do so in ways that (i) protect the confidentiality of data subjects' identities and sensitive attributes, (ii) support a wide range of analyses, and (iii) are easy for secondary data users to work with. One approach is to release data sets in which confidential values are replaced with draws from statistical models. These models are estimated so as to preserve as much structure in the data as possible. This approach has been used to release public use data in, for example, the Longitudinal Business Database, the Survey of Income and Program Participation, OnTheMap, and the American Community Survey group quarters data. In this talk, I review the ideas underpinning the approach, including a discussion of recent applications and methods for generating simulated data sets. I highlight open research areas related to disclosure risk estimation and utility assessment. In these contexts, I also describe some of the research activities of the Triangle Census Research Network (sites.duke.edu/tcrn/), an NSF-sponsored research center developing new methodology for dissemination of federal (and other) statistical data.
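A drastically simplified sketch of the idea, with invented column names, appears below; real releases such as the synthetic Longitudinal Business Database rely on far richer models and on formal disclosure-risk and utility checks.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    confidential = pd.read_csv("confidential_microdata.csv")  # hypothetical file

    # Model the sensitive attribute from attributes that will be released as-is.
    fit = smf.ols("payroll ~ employees + industry_code", data=confidential).fit()

    # Replace observed payroll with draws from the fitted model: predicted mean
    # plus noise with the estimated residual variance, so the released values
    # preserve the modeled relationships without disclosing the originals.
    rng = np.random.default_rng(0)
    synthetic = confidential.copy()
    synthetic["payroll"] = fit.predict(confidential) + rng.normal(
        0.0, np.sqrt(fit.scale), size=len(confidential)
    )
    synthetic.to_csv("public_use_synthetic.csv", index=False)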
Discussant: Jerry Reiter is the Mrs. Alexander Hehmeyer Associate Professor of Statistical Science at Duke University. He received his PhD in statistics from Harvard University in 1999. He works extensively with the U.S. Census Bureau and other federal agencies on methods for protecting confidentiality in public use data and on methods for handling missing/faulty data. He supervised the creation of the synthetic Longitudinal Business Database, the first establishment-level, unrestricted public use database on business establishments in the U.S.
Our goal is to create more inclusive, meaningful civic participation to strengthen democratic governance and individual liberty. Voting is currently plagued by problems, including declining participation, voter disaffection, and tribulations at the polls. We propose a groundbreaking framework that we believe will provide results that will be more truly representative of the electorate. The system behind our approach is more dynamic, agile and flexible than existing systems and will permit the entry of new participants. It leverages traditional paper election systems with random sampling and modern end-to-end user-verified audit capabilities. Each voter can confirm that her vote was counted correctly. Anyone can verify the election results. These mechanisms increase accuracy, integrity, and confidentiality, while also encouraging voter participation and engagement. Our approach enables the will of the people to be expressed at much lower cost and with more validity and much greater discernment. These new practices will enhance democracy through innovations in governance, better participation in decision making, and improved self-determination and collective action.
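To give a feel for the receipt-checking idea behind end-to-end verifiability, here is a drastically simplified sketch; it is not the Punchscan or Scantegrity protocol (which also protect ballot secrecy and support a full public tally audit), and all codes are invented.

    import hashlib

    def commit(receipt_code: str, salt: str) -> str:
        """Hash commitment to a receipt; the salt is given only to that voter."""
        return hashlib.sha256(f"{salt}:{receipt_code}".encode()).hexdigest()

    # Election authority: publish one commitment per cast ballot on a public
    # bulletin board.
    cast_ballots = [("R-1042", "salt-a"), ("R-7731", "salt-b")]  # (receipt, salt)
    bulletin_board = {commit(code, salt) for code, salt in cast_ballots}

    # Voter: using her own receipt and salt, she checks that her ballot is
    # included in the published record; anyone can recompute the posted tally.
    assert commit("R-1042", "salt-a") in bulletin_board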
Discussants: Dr. Richard T. Carback III (https://carback.us/rick) is a Principal Member of Technical Staff in the Network and Information Concepts group at Charles Stark Draper Laboratories (https://draper.com/). He has over a decade of R&D experience in computer security and was a key technical contributor to the Punchscan (https://punchscan.org) and Scantegrity (https://scantegrity.org/) voting systems. Punchscan was the first end-to-end voter verifiable system implementation used in a binding election, which was held at the University of Ottawa in 2007. Scantegrity was the first such system to be used in a binding public election, held in Takoma Park, Maryland, in 2009.
Deborah Hurley received the Namur Award of the International Federation of Information Processing in recognition of outstanding contributions, with international impact, to awareness of social implications of information technology. She is the author of Pole Star: Human Rights in the Information Society, "Information Policy and Governance" in Governance in a Globalizing World, and other publications. At the Organization for Economic Cooperation and Development, in Paris, France, she was responsible for drafting, negotiation and adoption of the OECD Guidelines for the Security of Information Systems. Hurley is Chair, Board of Directors, Electronic Privacy Information Center (EPIC). She directed the Harvard University Information Infrastructure Project and carried out a Fulbright study in Korea.
A Double Feature:
Discussant:
Robert Gellman is a privacy and information policy consultant in Washington, D.C., specializing in health
confidentiality policy, privacy and data protection, and Internet privacy. Clients have included federal
agencies, Fortune 500 companies, trade associations, advocacy groups, foreign governments, and others.
A graduate of the Yale Law School, Gellman served for 17 years as chief counsel to the Subcommittee
on Government Information in the House of Representatives. He maintains a webpage with many documents
and other useful resources at https://www.bobgellman.com. He is coauthor of "Online Privacy:
A Reference Handbook," published by ABC-CLIO in 2011.
Emmanual Guillory is Staff Assistant to Rep. Joe Barton (TX-6) and talks about what life is like working in Congress.
Multiple-Criteria Decision-Making methods are commonly used when a complex decision, involving sometimes conflicting criteria, has to be made. Many structured methods have been developed. The AHP (Analytic Hierarchy Process) is one of the most used in practice and has been applied in fields such as public health, evaluation, e-government, and defense analysis. We argue that, besides the classical AHP issues (such as rank reversal or the lack of normative foundations), special care has to be taken to deal with the psychological distortions that affect the AHP pairwise comparisons.
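As a point of reference for the discussion, here is a minimal sketch of the pairwise-comparison step at the heart of AHP: priority weights are read off the principal eigenvector of a reciprocal judgment matrix, and a consistency ratio flags contradictory judgments. The judgment matrix below is invented.

    import numpy as np

    # A[i, j] = how much more important criterion i is judged than criterion j
    # (Saaty's 1-9 scale); the matrix is reciprocal by construction.
    A = np.array([
        [1.0, 3.0, 5.0],
        [1/3, 1.0, 2.0],
        [1/5, 1/2, 1.0],
    ])

    eigenvalues, eigenvectors = np.linalg.eig(A)
    k = np.argmax(eigenvalues.real)
    weights = np.abs(eigenvectors[:, k].real)
    weights /= weights.sum()                     # priority vector

    n = A.shape[0]
    ci = (eigenvalues.real[k] - n) / (n - 1)     # consistency index
    ri = 0.58                                    # Saaty's random index for n = 3
    print("weights:", weights.round(3), "consistency ratio:", round(ci / ri, 3))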
Discussant:
Christine Choirat is the Director of Research and an Associate Professor of Quantitative Methods at the School of Economics and Business Administration of the University of Navarra (Pamplona, Spain).
www.nytimes.com/2012/04/29/business/on-the-data-sharing-trail-try-following-the-breadcrumbs.html
www.nytimes.com/2012/07/22/business/acxiom-consumer-data-often-unavailable-to-consumers.html
www.nytimes.com/2012/07/25/technology/congress-opens-inquiry-into-data-brokers.html
www.nytimes.com/2012/10/11/technology/senator-opens-investigation-of-data-brokers.html
www.nytimes.com/2012/08/19/business/electronic-scores-rank-consumers-by-potential-value.html
www.nytimes.com/2012/11/18/technology/your-online-attention-bought-in-an-instant-by-advertisers.html
www.nytimes.com/2012/10/14/technology/do-not-track-movement-is-drawing-advertisers-fire.html
www.nytimes.com/2012/12/19/technology/ftc-opens-an-inquiry-into-data-brokers.html
www.nytimes.com/2013/02/03/technology/consumer-data-protection-laws-an-ocean-apart.html
www.nytimes.com/2013/02/03/business/q-and-a-with-viviane-reding.html