Computer Science Thesis Oral

  • Gates Hillman Centers
  • Traffic21 Classroom 6501
  • SHAYAK SEN
  • Ph.D. Student
  • Computer Science Department
  • Carnegie Mellon University

Accountable Information Use in Data-Driven Systems

Increasingly, decisions and actions affecting people's lives are determined by automated systems processing personal data.  Excitement about these systems has been accompanied by serious concerns about their opacity and the threats that they pose to privacy, fairness, and other values.  Recognizing these concerns, it is important to make real-world automated decision-making systems accountable for privacy and fairness by enabling them to detect and explain violations of these values. System maintainers may leverage such accounts to repair systems to avoid future violations with minimal impact on the utility goals.

In this dissertation, we provide a basis for developing accounting tools for information use in data-driven systems. We use the term "accounting tools" to refer to mechanisms that explain the behavior of the system under study. These explanations increase trust in the functioning of the system, allowing us to verify not only that it makes the right decisions but also that it makes them for justifiable reasons. Further, explanations can be used to support detection of privacy and fairness violations and to explain how they came about. We can then leverage this understanding to repair systems to avoid future violations.

Our approach to accounting for information use in complex data-driven systems involves answering two questions: (influence) Which factors were influential in determining outcomes? and (interpretation) What do these factors mean? We first present key results on measuring the causal influence of factors in data-driven systems. We then examine the following settings: (i) systems with potential indirect use of information, (ii) convolutional neural networks, and (iii) large data processing pipelines. For each setting, we demonstrate how influence and interpretation combine to account for information use.


Thesis Committee:
Anupam Datta (Chair)
Jaime Carbonell
Matt Fredrikson
Sriram Rajamani (Microsoft Research)
Jeannette Wing (Columbia University)
