Despite decades of research into developing abstract security advice and improving interfaces, users still struggle to make security and privacy decisions. In particular, users often make security and privacy decisions that they are unsure about, that are based on misunderstandings of reality, or that do not reflect their preferences. Prior work suggests that a major cause of these problems is that users do not have the necessary contextual information about themselves and about greater ecosystems to make decisions.

To better support users' decisions, I propose building just-in-time data about a user's own behaviors and situations into security and privacy interfaces. When considered relative to data about greater ecosystems, a single user's own data can help the user make decisions that are objectively more secure or private, that he or she feels more confident about, that reflect a greater awareness of risks, and that better match the user's preferences.

I will examine this premise through one security case study and two privacy case studies. First, I will test the effectiveness of giving users feedback on precisely what they are doing wrong in creating a password. This approach leverages data I have collected through detailed analyses of both the semantic structure and guessability of large data sets of passwords, in addition to my studies of password-strength meters. Second, to counteract users' privacy misunderstandings related to online tracking and help them make privacy decisions that better match their preferences, I will examine the impact of visualizing different abstractions of how a user's own web browsing has been tracked. These visualizations rely on "tracking the trackers," but also build on my qualitative understanding of how users perceive privacy tradeoffs in the context of online behavioral advertising. Third, I will provide average consumers with an interactive database that merges data I have collected about over 6,000 U.S. financial institutions' privacy practices with data about those institutions' branch locations. Using this interactive database, I will test whether surfacing this large-scale, comparative privacy information impacts users' willingness to consider switching banks, as well as how a willingness to find a more privacy-protective bank interacts with the logistical barriers of actually switching.

Thesis Committee:
Lorrie Faith Cranor (Chair)
Alessandro Acquisti (Heinz)
Lujo Bauer (ECE/CyLab)
Jason Hong (HCII)
Michael Reiter (University of North Carolina at Chapel Hill)

Copy of Proposal Document

People make a range of everyday decisions about how and whether to share content with different people, across different platforms and services. These sharing decisions can encompass complex preferences and a variety of access-control dimensions. I plan to explore how varied conceptions of privacy, as well as the sharing “ecosystem” created by the platforms and services with which, and people with whom, one shares can shape these sharing decisions. I also plan to look at how the affordances that shape these decisions can fall short, resulting in regret or self-censorship. Drawing on prior and proposed work, I plan to propose and examine improved modalities for sharing on social network sites (SNSs).

Thesis Committee:
Lorrie Faith Cranor (Chair)
Lujo Bauer (CyLab/ECE)
Laura Dabbish (HCII/Heinz)
Moira Burke (Facebook)

Copy of Proposal Document

In this talk, I will present three techniques that we have built in my group to detect and predict software defects. These techniques leverage software text including manual pages, code comments, and commit messages. The first, called DASE (Document-Assisted Symbolic Execution), leverages natural language processing techniques to extract input constraints automatically from software documents. It then uses the extracted input constraints to guide symbolic execution to improve automated software testing. DASE detects more bugs and covers more code than existing symbolic execution techniques. In addition, I will talk about two of our latest defect prediction techniques---personalized defect prediction and online defect prediction for imbalanced data---which build models to predict defective software file changes. I will also share our experience and lessons learnt in applying and improving the defect prediction techniques in industry.
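The core idea behind DASE, extracting input constraints from documentation to guide testing, can be illustrated with a toy sketch. The man-page fragment, option names, and constraint phrasing below are all invented for illustration; the real system uses natural language processing over actual manual pages and feeds the extracted constraints to a symbolic execution engine.

```python
import re

# Hypothetical man-page fragment (invented for this sketch; DASE parses
# real manual pages and code comments with NLP techniques).
MAN_PAGE = """
-n NUM    print NUM lines; NUM must be between 1 and 1024
-w WIDTH  wrap at WIDTH columns; WIDTH must be between 20 and 200
"""

def extract_range_constraints(text):
    """Pull 'X must be between LO and HI' constraints out of doc text."""
    pattern = re.compile(r"(\w+) must be between (\d+) and (\d+)")
    constraints = {}
    for name, lo, hi in pattern.findall(text):
        constraints[name] = (int(lo), int(hi))
    return constraints

constraints = extract_range_constraints(MAN_PAGE)
# A symbolic-execution engine could seed its path conditions with these
# bounds, e.g. assume(1 <= NUM <= 1024), steering exploration away from
# trivially invalid inputs and toward deeper program paths.
```

In the actual technique, pruning the invalid-input space is what lets the symbolic executor spend its budget on semantically meaningful paths, which is where the reported gains in bug detection and coverage come from.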

Lin Tan is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of Waterloo. She received her Ph.D. from the University of Illinois, Urbana-Champaign in 2009. She is a recipient of the Ontario Early Researcher Award (ERA) and an NSERC Discovery Accelerator Supplements (DAS) Award. Lin has served on the IEEE TSE Editor-in-Chief Evaluation/Selection Committee, and is a program co-chair of ICSME-ERA 2015, ICSE-NIER 2017, and MSR 2017. Her co-authored papers have been nominated for Best Paper Awards at ASE 2013 and ASPLOS 2010, and were selected as an IEEE Micro Top Pick in 2006.

Prejudices are the biased views we hold of other people.  Our prejudices play a role, both implicitly and explicitly, in every social situation we encounter.  They tell us whom to talk to and whom to stay away from, whom to befriend and whom to bully, whom to treat with reverence and whom to view with disgust.  Prejudices play a role in less mundane social processes as well. In particular, genocide is often motivated by negative prejudices against particular social groups.

Despite the omnipresent impact of prejudice on our lives, existing theory and methods for understanding it are lacking. Theoretical models tend to focus heavily on either the cognitive or the social dimensions of prejudice, rather than on a joint socio-cognitive explanation.  Methodologically, research on prejudice is based largely on small-scale laboratory experiments and survey data that is costly to obtain.

The first portion of this thesis develops a new mathematical theory of prejudice that incorporates both its cognitive and social dimensions.  The theory provides a parsimonious explanation of prejudice at the individual, group and culture-wide levels.  The second part of this thesis develops two new tools to extract prejudices from existing text corpora, an approach that can provide broader data than laboratory experiments at a much lower cost than surveys.

The final part of this thesis applies the theory and tools I develop to two case studies. In the first, I use a corpus of Twitter data relevant to the Eric Garner and Michael Brown tragedies. I focus on providing a better understanding of the “paradox of race” in America today, where few Americans are explicitly racist yet racial inequalities are as strong as ever. In the second case study, I use both Twitter and newspaper data from the “Arab Spring” in an attempt to better understand the socio-cognitive web of prejudices present in the Arab World.

Thesis Committee:
Kathleen M. Carley (Chair)
Eric Xing
Jason Hong (HCII)
Lynn Smith-Lovin (Duke University)

Copy of Proposal Document

In an effort to improve security by preventing users from picking weak passwords, system administrators set policies: sets of requirements that passwords must meet. Guidelines for policies have been published by various groups but this guidance has not been empirically verified. In fact, our research group and others have discovered it to be inaccurate.

The goal of this thesis is to provide a metric for evaluating the security of password policies that improves on previous machine-learning approaches. We make several major contributions to passwords research. First, we develop a guess calculator framework that automatically learns a model of adversary guessing from a training set of prior data mixed with samples, and applies this model to a set of test passwords. Second, we find several enhancements to the underlying grammar that increase the power of the learning algorithm and improve guessing efficiency over previous approaches. Third, we use the guess calculator framework to study the guessability of passwords under various policies and provide methodological and statistical guidance for conducting these studies and analyzing the results. While much of this thesis focuses on an offline-attack threat model in which an adversary can make trillions of guesses, the methods used here can also be applied to an online-attack model, where the attacker can make only a small number of guesses before being locked out by the authentication system.
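To make the grammar-based guessing idea concrete, here is a toy sketch in the style of probabilistic context-free grammar (PCFG) password models: passwords are split into character-class segments, segment probabilities are learned from training data, and a password's "guess number" is its rank when guesses are tried in descending probability. The training passwords are invented, and the thesis's actual guess calculator framework is far more sophisticated than this.

```python
from collections import Counter
from itertools import groupby, product

def segments(pw):
    """Split a password into (class, string) runs, e.g. 'love123' -> L:'love', D:'123'."""
    def cls(c):
        return "L" if c.isalpha() else "D" if c.isdigit() else "S"
    return [(c, "".join(g)) for c, g in groupby(pw, key=cls)]

def train(passwords):
    """Learn structure probabilities and per-slot terminal probabilities."""
    structs = Counter()
    terms = {}  # (class, length) -> Counter of observed strings
    for pw in passwords:
        segs = segments(pw)
        structs[tuple((c, len(s)) for c, s in segs)] += 1
        for c, s in segs:
            terms.setdefault((c, len(s)), Counter())[s] += 1
    return structs, terms

def guess_number(pw, structs, terms):
    """Rank of pw when the grammar's outputs are guessed most-probable-first.
    Assumes pw is producible by the trained grammar."""
    n = sum(structs.values())
    def prob(candidate):
        segs = segments(candidate)
        p = structs[tuple((c, len(s)) for c, s in segs)] / n
        for c, s in segs:
            bucket = terms[(c, len(s))]
            p *= bucket[s] / sum(bucket.values())
        return p
    guesses = set()
    for struct in structs:
        pools = [list(terms[slot]) for slot in struct]
        for combo in product(*pools):
            guesses.add("".join(combo))
    ranked = sorted(guesses, key=prob, reverse=True)
    return ranked.index(pw) + 1

structs, terms = train(["love123", "hate456", "cat12", "dog12"])
```

Note that the grammar generalizes beyond the training set: it can also guess unseen recombinations such as "love456", which is one reason grammar-based models outperform plain dictionary attacks.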

Thesis Committee:
Lorrie Cranor (Chair)
Lujo Bauer (ECE/CyLab)
Nicolas Christin (ECE/CyLab)
Paul C. Van Oorschot (Carleton University)

Copy of Thesis Document

Online deliberation seeks to improve group decision making by accessing diverse expertise and experience, informing and marshaling evidence in a fruitful exchange of ideas. Successful deliberation environments can bring great benefits, such as broadening participation, tapping a greater range of knowledge, testing ideas against each other, and fostering appreciation of other views. However, for large and complex problem spaces that generate extensive discussion, it is difficult to navigate through the contributions and understand the connections between elements of the problem space. Approaches for dynamically structuring a set of contributed ideas, including adding navigation links between related ideas, can help address this. They allow participants and readers to explore the space of ideas and contributions in a way that supports each user’s goals and respects their cognitive capacity to evaluate both ideas and the connections between them, without the full cognitive effort that unsorted list presentations demand. The proposed work focuses on methods for adding such dynamic structure and evaluating the effectiveness of such steps.

The research will integrate insights from computer science and social psychology with latent variable modeling techniques in order to identify the necessary structure in textual data to enable personalized information extraction, summarization, and presentation. The proposed work includes evaluation of algorithms for detecting and adding structural links, experiments to measure specific decisions and algorithms used in structuring and presenting the set of contributions, and staged experimental studies simulating certain aspects of potential deployment environments to measure the value to users of adding such structure as well as to gain and incorporate feedback on the underlying theories. Through these activities, the work will systematically explore the effects of design decisions on participation, navigability and the nature of the deliberation.
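One simple baseline for the structural-link detection the proposal evaluates is lexical similarity between contributions: link two posts when their TF-IDF cosine similarity exceeds a cutoff. The sketch below is a hypothetical illustration, not the proposed algorithms; the documents, tokenization, and threshold are invented.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build a sparse TF-IDF vector (dict) for each document."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    vecs = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed idf; terms appearing in every document get weight 0.
        vecs.append({t: tf[t] * math.log((1 + n) / (1 + df[t])) for t in tf})
    return vecs

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def navigation_links(docs, threshold=0.2):
    """Propose links between pairs of contributions above a similarity cutoff."""
    vecs = tfidf_vectors(docs)
    return [(i, j) for i in range(len(docs)) for j in range(i + 1, len(docs))
            if cosine(vecs[i], vecs[j]) >= threshold]

DOCS = [
    "widen the bike lane on main street",
    "bike lane safety on main street",
    "fund the city library",
]
```

In a real deliberation platform the latent variable models mentioned above would replace this bag-of-words baseline, but the output, a set of candidate navigation links between related contributions, has the same shape.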

Thesis Committee:
James D. Herbsleb (Chair)
Carolyn P. Rosé
Daniel B. Neill (Heinz)
Thomas W. Malone (MIT)

Organizations can design themselves to increase their assurance of continued mission(s) execution in contested cyber environments.

How do organizations assess command-level effects of cyber attacks? Leaders need a way of assuring themselves that their organization, people, and information technology (IT) can continue their missions in a contested cyber environment. To do this, leaders should: 1) require that assessments be more than analogical or anecdotal; 2) demand the ability to rapidly model their organizations; 3) identify their organization’s structural vulnerabilities; and 4) be able to forecast mission assurance scenarios.

I present a rapid data-to-modeling and assessment approach for organizations to develop and analyze complex models of their people, resources, tasks, knowledge, beliefs, and other characteristics that impact the ability of the organization to continue its mission(s). I integrate graph-theoretic techniques from social network analysis research with meta-network analysis, supporting objective analysis across multiple dimensions rather than simple compliance with risk management frameworks. I demonstrate the migration of these models into agent-based dynamic simulations and examine the impacts of the three most common effects of contested cyber environments: loss of confidentiality, integrity, and availability. I find that most attacks are in the nuisance range and that only multi-pronged targeted attacks or severe stochastic attacks cause meaningful failure.
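A minimal sketch of the meta-network idea: model who can do which tasks, treat an availability attack as removing an agent, and measure mission assurance as the fraction of tasks still covered. The organization, agents, and tasks below are invented; the thesis's models span many more node and link types.

```python
# Hypothetical agent-to-task assignments (invented for this sketch).
ASSIGNMENTS = {
    "alice": {"plan", "approve"},
    "bob": {"plan", "execute"},
    "carol": {"execute", "report"},
}

def task_coverage(assignments, unavailable=frozenset()):
    """Fraction of all tasks still performable by an available agent.
    Removing an agent models a loss-of-availability attack."""
    all_tasks = set().union(*assignments.values())
    covered = set().union(*(tasks for agent, tasks in assignments.items()
                            if agent not in unavailable))
    return len(covered) / len(all_tasks)

def most_critical(assignments):
    """The agent whose loss degrades task coverage the most:
    a structural vulnerability of the organization."""
    return min(assignments,
               key=lambda a: task_coverage(assignments, {a}))
```

Even this toy version exhibits the qualitative finding above: losing a single redundant agent ("bob", whose tasks others can also perform) is a nuisance, while only the loss of agents with unique capabilities causes mission tasks to go uncovered.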

Though performance along multiple measures often decreases during attacks, organizations can put structural and procedural mitigations in place to improve their resilience to these events. Through simulations, I show that structural and functional mitigations are feasible and effective at reducing the impacts of contested cyber environments on an organization’s performance. I find that organizations can design for resiliency and provide guidelines on how to do so.

Thesis Committee:
Kathleen M. Carley (Chair)
Virgil Gligor
Juergen Pfeffer
Robert Elder (Lt. Gen. USAF Ret)
John Graham (COL, USA, USMA)

Draft Copy of Thesis Document

Programmers often need to backtrack while coding. Here, “backtracking” refers to when programmers go back at least partially to an earlier state either by removing inserted code or by restoring removed code. For example, when some newly added feature does not work as imagined, the programmer might have to backtrack and try something else. When learning an unfamiliar API, programmers often need to try some sequence of object instantiation and method calls, run the program, and backtrack if the result is not as expected. A series of three empirical studies were conducted in order to better understand the backtracking behavior of programmers. The results indicated that backtracking is prevalent in programming, and programmers often face challenges when backtracking. For example, they had difficulties when trying to find all the relevant parts of code to be backtracked or when trying to restore some code they had deleted that later turned out to be needed.

However, programmers have only very limited support for backtracking in today’s tools. The linear undo command can only undo the most recent changes, and it loses the undone changes as soon as the programmer makes a single new change after invoking the undo command. Version control systems such as Subversion and Git can also be used for backtracking, but only when the desired code has already been committed to the repository. Furthermore, the results from the empirical studies showed that 38% of all backtracking instances are performed manually without any tool support, and 9.5% are selective, meaning they could not have been performed using the conventional undo command.

To help programmers backtrack more easily and accurately, a novel selective undo mechanism for code editors was devised and implemented as an IDE plug-in called AZURITE. The core idea is to combine the following mechanisms into a coherent programming tool: a selective undo mechanism for code editors, novel visualizations of the coding history, and a code change history search. AZURITE retains the full fine-grained code change history, and its selective undo mechanism allows users to select and undo one or more isolated edit operations, while appropriately detecting and handling conflicting operations. The visualizations and history search are the user interfaces that help users select the desired edit operations to be backtracked and express what they remember about the code changes they want to revert. In a controlled lab experiment, programmers using AZURITE completed typical backtracking tasks twice as fast as the control group.
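The essence of selective undo, reverting one past edit while later edits stay in place, can be sketched in a few lines. The data model below (each edit as an offset-based text replacement) is invented for illustration and is much simpler than AZURITE's fine-grained history: undoing edit i applies its inverse after shifting its region past later edits, and refuses when a later edit overlaps the same region (a conflict).

```python
class History:
    """Toy text buffer that records replacements and supports selective undo."""

    def __init__(self, text=""):
        self.text = text
        self.edits = []  # list of (offset, old_text, new_text)

    def replace(self, offset, old, new):
        assert self.text[offset:offset + len(old)] == old
        self.text = self.text[:offset] + new + self.text[offset + len(old):]
        self.edits.append((offset, old, new))

    def selective_undo(self, i):
        """Undo only edit i, leaving later non-conflicting edits applied."""
        off, old, new = self.edits[i]
        start, end = off, off + len(new)  # region edit i produced
        for later_off, later_old, later_new in self.edits[i + 1:]:
            later_end = later_off + len(later_old)
            if later_off < end and start < later_end:
                raise ValueError("conflict: a later edit touches this region")
            if later_off <= start:
                # Later edit entirely before our region: shift by its size delta.
                delta = len(later_new) - len(later_old)
                start += delta
                end += delta
        # Re-apply the inverse edit at the shifted position.
        assert self.text[start:end] == new
        self.text = self.text[:start] + old + self.text[end:]
        del self.edits[i]
```

For example, after replacing "hello" with "goodbye" and then "world" with "moon", selectively undoing only the first edit yields "hello moon", which a linear undo command cannot do. AZURITE's visualizations and history search exist precisely to help users pick which edit operations to feed into a mechanism like this.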

Thesis Committee:
Brad A. Myers (Chair/HCII)
Jonathan Aldrich
Christian Kästner
Emerson Murphy-Hill (North Carolina State University)

Copy of Thesis Document

Organizations undergo deliberate change routinely all over the world. Despite this accumulated experience, these changes rarely produce desired outcomes, resulting frequently in needless and exorbitant expense. These failures occur, in part, because the standard methods for proposing organizational change rely on extensive interviews with self-selected and agenda-driven actors. Because of this weakness, I propose leveraging extant empirical data to instantiate computational organizational theory (COT) models of organizational behavior in a multi-level modeling framework.

This multi-level modeling framework will use the relationships between individuals, groups, resources, knowledge, and tasks to instantiate stochastic agent-based models. These models are used to evaluate current organizational performance, to examine potential futures for the organization, and to explore counterfactual scenarios. Unlike prior models, the represented agents are not all from the same organizational level; levels include individuals, teams, divisions, and the organization as a whole. Each agent reacts to its local state to address perceived weaknesses. These reactions will, in turn, alter the local context of other agents, who will then react themselves. This cycle of reactions will produce the organization’s aggregate behavior. Thus, the organization’s behavior will emerge from decision-making at multiple levels of the organization.
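The reaction cycle described above can be sketched as a simple simulation loop. Everything here, the two-level agent roster, the "workload shedding" reaction, and the delegation links, is invented to show the shape of the cycle, not the proposed COT models.

```python
# Hypothetical agents at two levels: an individual and a team-level agent.
# "delegate" points at the agent that absorbs shed work.
AGENTS = [
    {"capacity": 5, "workload": 9, "delegate": 1},   # individual
    {"capacity": 10, "workload": 2, "delegate": 0},  # team-level agent
]

def step(agents):
    """One cycle: every agent reacts to its current local state."""
    for agent in agents:
        if agent["workload"] > agent["capacity"]:
            # Overloaded agents shed excess work to their delegate,
            # altering that agent's local context, which it reacts to
            # on the next cycle.
            shed = agent["workload"] - agent["capacity"]
            agent["workload"] -= shed
            agents[agent["delegate"]]["workload"] += shed

def simulate(agents, cycles):
    """Run the reaction cycle; aggregate behavior emerges from local reactions."""
    for _ in range(cycles):
        step(agents)
    return [a["workload"] for a in agents]
```

Even in this toy, no agent computes the organization-wide outcome: the final workload distribution emerges from repeated local reactions across levels, which is the property the multi-level framework exploits.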

To test this new multi-level method’s flexibility, I will apply the framework to two distinct organizational contexts: a) air traffic control processes for the Federal Aviation Administration (FAA), and b) a multinational corporation’s acquisition of a competitor. Through demonstration in these two applied contexts, I hope to show the general utility of the framework and to model organizational behavior and potential dysfunction more realistically.

Thesis Committee:
Kathleen Carley (Chair)
Linda Argote (Tepper)
Brandy Aven (Tepper)
Giuseppe (Joe) LaBianca (University of Kentucky)

Copy of Proposal Document

Software-intensive systems are increasingly expected to operate under changing conditions, including varying user needs and workloads, fluctuating resource capacity, and degraded parts. Furthermore, considering the scale of systems today, the high availability demanded of them, and the fast pace at which conditions change, it is not viable to rely mainly on humans to reconfigure and change systems as needed. Self-adaptive systems aim to address this problem by incorporating mechanisms that allow them to change their behavior and structure to adapt to changes in themselves and in their environment.

Most self-adaptive systems have some form of closed-loop control that monitors the state of the system and its environment, decides if and how the system should be changed, and performs the adaptation if necessary. They rely on a set of adaptation tactics they can use to deal with different conditions. For example, adding a server is a tactic that can deal with increased load in the system, and revoking permissions from a user is a suitable tactic for protecting the system from an insider attack. Although some adaptation tactics are fast and could be considered instantaneous, others are not; that is, they exhibit tactic latency.

Current self-adaptive systems tend to be reactive and myopic. Typically, they adapt in response to changes without anticipating what the subsequent adaptation needs will be. This can result in a suboptimal sequence of adaptations. Furthermore, when deciding how to adapt, they ignore tactic latency, and consequently are not able to favor a fast tactic over a slower one, or to account for the delay before the tactic produces its effect.

I propose an approach that improves self-adaptation effectiveness by integrating timing considerations in a three-pronged approach:

  • latency awareness: explicitly considers how long tactics take to execute, accounting for the delay in producing their effect.
  • proactivity: leverages predictions of the future states of the environment to start adaptation tactics with the necessary lead time, and to avoid unnecessary adaptations.
  • concurrent tactic execution: exploits non-conflicting tactics to speed up adaptations, and to complement slow tactics with others that can produce intermediate results sooner.

Thesis Committee:
David Garlan (Chair)
Mark Klein (SEI)
Claire Le Goues
Sam Malek (George Mason University)

Copy of Proposal Document
