
Topic 2. Interplay with data sharing.

What are the best practices for a researcher conducting a re-identification experiment when engaging with those who provided the data? What are the researcher's responsibilities to communities of people who may be affected by the re-identification, or to individuals who were not in the study? What changes if participants consented to participate, or if they are unaware of the dataset? How does a researcher involved in re-identification compare to commercial entities that may be exploiting the data? Examples for consideration include research datasets, government-mandated datasets, and industry contest data.


Post 1
A researcher who is trying to determine whether the data is identifiable is not using the
data for monetary gain, but they still run the risk of publishing their methods to those
who would misuse those methods. So we would want the data to be modified before
methods were published.

Post 2
A researcher should directly contact those people included in the data to seek
their consent before publishing. A researcher is not exploiting the data for
commercial gain.

Post 3
If the researcher is interested in publishing the data, they should be in direct contact
with those people whose data is involved, regardless of whether or not they knew
beforehand that they were in the set. If people may be affected and are in the dataset,
they should be contacted (within reason); for those who may be affected but are not in
the study, a very brief form of contact or publishing should be used.

If a researcher is "exploiting" the data for publishing purposes, they may be gaining from
the data, but with the explicit purpose of correcting the system and protecting those
people whom they are "exploiting".

Post 4
Data should be released in aggregate form (e.g., "we were able to reidentify 13 out of 20
health records"). Specifics can be given from researchers to interested entities only if
those entities sign a legally binding agreement not to abuse the data.

Your responsibility to the community is to make it clear that data can be reidentified.

Post 5
This seems like a very slight expansion on Topic 1. My answer still stands (good
luck guessing which one it was).

Post 6
Those who are in the dataset should be contacted. Doing less than that would border on exploitation.

Post 7
Same as Topic 1

Post 8
When publishing any documents on the research work, researchers have a
responsibility to keep the re-identified data as secure as possible (i.e., they shouldn't
release all the data to the general public). The responsibility to the community is to
make it clear that the information currently being released by entities is
re-identifiable.

Post 9
Publicly, research data should only be released on the aggregate scale—specifics and
data points that contribute to re-identification should be redacted. All subjects should
be informed of their participation and the findings of the study, as well as the methods
used. If other researchers wish to use the data, they should follow criteria mandated by
HIPAA in redacting data and limiting its use through mutual agreement, in addition to
the above standards of informing research subjects.

Post 10
At some point a re-identifier should use their information to correct problems of privacy violation.
Commercial entities aren't held to the same standards, but they also don't have the same weight of truth in their presentations. Researchers are in the business of sharing information; corporations are just trying to use it to profit.

Post 11
The researcher might be able to directly contact the people who
have their data exposed and alert them to this problem.
Communities not affected by the exposure should not exploit those
exposed. If participants are unaware of the vulnerabilities, then
they might be exploited without knowing how or why. Researchers
have better intentions than commercial entities.

Post 12
I think the important thing is to talk to those who provided the data and perhaps to
those who may be affected by the re-identification without necessarily opting in. If
they consent to be part of the study, that changes the dynamic. A researcher isn't
necessarily doing this all for profit. This is one of the debates relating to the NSA: is it
better to have the government hold the information or some commercial entity that
doesn't have the same responsibility...

Post 13
Basic principles same as Topic 1.
As for the comparison to commercial entities, they have a greater incentive to use the data for their own commercial advantage. After all, they use the data to increase their commerce, whereas researchers seek to publish and publicize their methods for the common good.



IQSS  |    Data Privacy Lab  |    Silent Spring Institute   |    Northeastern University