Canada’s main private-sector privacy law, the Personal Information Protection and Electronic Documents Act (PIPEDA), is currently under revision by the Office of the Privacy Commissioner. A discussion paper published by the Commissioner suggests a variety of potential solutions to new privacy issues catalyzed by the accelerating collection, retention, use, and disclosure of personal data, often described as ‘big data.’ Along with improved consent practices and new ethical assessments, de-identification is one of the proposed solutions. We agree that risk-based de-identification can be used effectively to protect privacy in big data contexts.
A variety of sophisticated new data liberation technologies give users access to data while masking or erasing the identity of the data source, using de-identification techniques such as tokenization or anonymization. Used together with automated risk analysis tools, de-identification allows both ongoing use of data and protection of individual privacy.
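To make the idea concrete, here is a minimal sketch of how tokenization and a simple automated risk check might fit together. It is an illustration under stated assumptions, not a description of any particular product: the key, the field names (`name`, `postal_prefix`, `birth_year`), and the helper functions `tokenize` and `smallest_group_size` are all hypothetical. The risk measure shown is the size of the smallest group of records sharing the same indirect identifiers, one common way of estimating re-identification risk.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret key; in practice it would be stored separately from the data.
SECRET_KEY = b"example-key-not-for-production"


def tokenize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed token (HMAC-SHA256, truncated)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifier values.

    A small value means some individuals are nearly unique in the data
    and therefore easier to re-identify.
    """
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


# Illustrative records; the field names are assumptions made for this sketch.
records = [
    {"name": "Alice", "postal_prefix": "K1A", "birth_year": 1980},
    {"name": "Bob", "postal_prefix": "K1A", "birth_year": 1980},
    {"name": "Carol", "postal_prefix": "M5V", "birth_year": 1975},
]

# Tokenize the direct identifier, then measure the residual risk.
deidentified = [{**r, "name": tokenize(r["name"])} for r in records]
k = smallest_group_size(deidentified, ["postal_prefix", "birth_year"])
```

In this sample the third record is the only one with its combination of postal prefix and birth year, so the smallest group has a single member. Replacing the name alone does not remove the risk, which is why a risk-based approach measures what remains after tokenization.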
De-identification does have limits at present. Most current technologies focus on the protection of text records. Given the proliferation of recording technologies (such as smartphone cameras, Google Glass, or drones), future privacy-bolstering technologies will need to adapt to different kinds of content and to an individual’s rights in that content. For example:
- Video privacy: Does an individual consent to be photographed or filmed? If not, privacy-bolstering technology could allow the image to be masked or erased.
- Audio privacy: Does an individual consent to be recorded? If not, privacy-bolstering technology could allow the relevant part of the recording to be masked or erased.
Beyond technological limitations, however, ethical discussions about the use of de-identified data also need to take place. Effective de-identification may completely conceal individual identities, but the ways in which data is used have broad impacts on society. Conclusions drawn on the basis of big data analytics affect marketing decisions, media coverage, and corporate and government policy. Apart from the question of individual privacy, we suggest that big data should be seen as a common good. Just as corporations need a “social license” to exploit publicly owned natural resources, they should also be required to engage in meaningful public consultation about the uses of big data.