It's been five months since the General Data Protection Regulation (GDPR) came into effect in the European Union, setting strict rules on the collection and processing of people’s personal data.
Last week, CyLab's Norman Sadeh, a professor in the Institute for Software Research in the School of Computer Science and co-director of the Privacy Engineering program, spoke about privacy, artificial intelligence (AI), and the challenges at the intersection of the two at the International Conference of Data Protection and Privacy Commissioners (ICDPPC).
CyLab: Can you give us a taste of what the International Conference of Data Protection and Privacy Commissioners was all about?
Sadeh: As its name suggests, ICDPPC is the big international conference where once a year regulators and other key stakeholders from around the world come together to discuss privacy regulation and broader challenges associated with privacy.
You served as a panelist on one of the plenary panels titled, "Right vs. Wrong." What exactly was discussed?
Sadeh: This panel was aimed at broadening the scope of privacy discussions beyond just regulation and addressing deeper, more complex ethical issues related to the use and collection of data. I discussed how our research has shown that getting the full benefits of existing regulations, whether GDPR or the recently passed California Consumer Privacy Act, is hampered by the complex cognitive and behavioral limitations that people have. I talked about the technologies our group has been developing to help users make better informed privacy decisions and overcome these limitations.
How exactly do you define what's right vs. what's wrong?
Sadeh: When people discuss ethics, they generally refer to a collection of principles that include basic expectations of trustworthiness, transparency, fairness, and autonomy. As you can imagine, there is no single definition out there and this list is not exhaustive.
In my presentation, I discussed the principles and methodologies our group uses to evaluate and fine-tune technologies we develop, and how we ultimately ask ourselves whether a user is better off with a given configuration of one or more technologies. This often involves running human subject studies designed to isolate and quantify the effects of those technologies.
Examples of privacy technologies we have been developing range from technologies to nudge users to more carefully reflect on privacy decisions they need to make, to machine learning techniques to model people’s privacy preferences and help them configure privacy settings. They also include technologies to automatically answer privacy questions users may have about a given product or service.
Can you talk about the context in which this conference took place? What kinds of privacy regulations have we seen go into effect this year, and what other regulations might we see in the future?
Sadeh: This conference took place in a truly unique context. People’s concerns about privacy have steadily increased over the past several years, from the Snowden revelations of a few years ago to the Cambridge Analytica fiasco exposed earlier this year. People have come to realize that privacy is not just about having their data collected for the sake of sending them better targeted ads, but that it goes to the core of our democracy and how various actors are using data they collect to manipulate our opinions and even influence our votes. The widespread use of artificial intelligence and how it can lead to bias, discrimination, and other challenges is also of increasing concern to many.
A keynote presentation at the conference by Apple CEO Tim Cook, as well as messages from Facebook's Mark Zuckerberg and Alphabet's Sundar Pichai, also suggest that big tech may now be in favor of a sweeping US federal privacy law that would share some similarities with the EU GDPR. While the devil is in the details, such a development would mark a major shift in the way data collection and use practices are regulated in the US, where many technologies are by and large unregulated today.
How does your research inform some of these types of discussions?
Sadeh: Research is needed on many fronts, from developing a better understanding of how new technologies negatively impact people’s expectations of privacy to how we can mitigate the risks associated with undesirable inferences made by data mining algorithms.
At the conference, I focused on some of the research we have conducted on modeling people’s privacy preferences and expectations and how we have been able to develop technologies that can assist users in making better informed decisions at scale.
How do you address the scale at which data is collected and the complexity of the value chains along which data travels?
Sadeh: Regulatory measures by themselves, such as requiring more transparent privacy policies and offering users more control and more privacy settings, are important but not sufficient to empower users to regain control over their data at scale. I strongly believe that our work on using AI to build privacy assistants can ultimately make a very big difference here.
People are just unable to read the privacy policies and configure the settings associated with the many technologies with which they interact on a typical day. There is a need for intelligent assistants that can help them zoom in on those issues they care about, can answer questions they have, and can help them configure settings.
What do you see as the main challenges for privacy in the age of AI and IoT?
Sadeh: A first issue is the scale at which data is collected and the diverse ways in which it is used. A second challenge has to do with the difficulty of controlling the inferences that can be made by machine learning algorithms. A third challenge in the context of IoT is that we don’t have any mechanisms today to even advertise the presence of these technologies (think cameras with computer vision or home assistants), let alone expose privacy settings to people who come in contact with these technologies.
For instance, how is a camera supposed to allow users to opt in or opt out of facial expression recognition technology, if the user does not even know the camera is there, doesn’t know that facial expression recognition algorithms are processing the footage, and has no user interface to opt in or out?
If I had to identify one final challenge, I would emphasize the need to help and train developers to do a better job when it comes to adopting privacy-by-design practices, from data minimization practices all the way to more transparent data practice disclosures.