At the core of Machine Learning (ML) analytics is often an expert-suggested model, whose parameters are refined by iteratively processing a training dataset until convergence. The completion time (i.e. convergence time) and quality of the learned model not only depends on the rate at which the refinements are generated but also the quality of each refinement. While data-parallel ML applications often employ a loose consistency model when updating shared model parameters to maximize throughput, the accumulated error may seriously impact the quality of refinements and thus delay completion time, a problem that usually gets worse with scale. Although more immediate propagation of updates reduces the accumulated error, this strategy is limited by physical network bandwidth.

In this talk, I will present Bosen, a system that maximizes the network communication efficiency to minimize such inconsistency error by fully utilizing the inter-machine network bandwidth under a given budget and prioritizing updates that are most significant to convergence. Experiments on various ML applications showed 2-3X improvements in convergence time compared to the previous state-of-the-art synchronization mechanism.

*Presented in Partial Fulfillment of the CSD Speaking Skills Requirement. *

# CSD

Motivated by these observations, we formalize axioms that any node classification algorithm should obey and propose NetConf which satisfies these axioms and handles arbitrary network effects (homophily/heterophily) at scale. Our contributions are: (1) Axioms: We state axioms that any node classification algorithm should satisfy; (2) Theory: NetConf is grounded in a Bayesian-theoretic framework to model uncertainties, has a closed-form solution and comes with precise convergence guarantees; (3) Practice: Our method is easy to implement and scales linearly with the number of edges in the graph. On experiments using real world data, we always match or outperform BP while taking less processing time.

*Presented in Partial Fulfillment of the CSD Speaking Skills Requirement. *

Neural networks are machine learning models represented by continuous programs of many parameters. Those parameters are generally optimized via gradient descent and related methods. However, those methods are indeed limited to tuning parameters of a given program. Choosing the program itself is an open problem. Programs are generally chosen by expert judgement, for computational convenience or by brute force search.

I present "nonparametric neural networks", a framework for jointly optimizing program parameters and structure. Under this framework, we alternate between random local changes to the program and gradient-based parameter tuning and thus search over both components jointly. The search is enabled by defining a connected, continuous space over all program-parameter pairs, by penalizing programs according to their size, and by several optimization tricks.

*Presented in Partial Fulfillment of the CSD Speaking Skills Requirement. *

The theory of approximation algorithms has seen great progress since 90's, and the optimal approximation ratio was revealed for many fundamental combinatorial optimization problems. Despite this success for individual problems, our understanding is not complete to have a unified understanding of each class of problems. One of the most notable exceptions is an important subclass of CSPs called MAX CSP, where there is a simple semidefinite programming based algorithm provably optimal for every problem in this class under the Unique Games Conjecture. Such a complete understanding is not known for other basic classes such as coloring, covering, and graph cut problems.

This thesis tries to expand the frontiers of approximation algorithms, with respect to the range of optimization problems as well as mathematical tools for algorithms and hardness. We show tight approximabilities for various fundamental problems in combinatorial optimization beyond Max CSP. It consists of the following five parts:

1. CSPs: We introduce three variants of MAX CSP, called Hard CSP, Balance CSP, and Symmetric CSP. Our results show that current hardness theories for Max CSP can be extended to its generalizations (Hard CSP, Balance CSP) to prove much stronger hardness, or can be significantly simplified for a special case (Symmetric CSP).

2. Applied CSPs: Numerous new optimization problems have emerged since the last decade, as computer science has more actively interacted with other fields. We study three problems called Unique Coverage, Graph Pricing, and decoding LDPC codes, motivated by networks, economics, and error-correcting codes. Extending tools for Max CSP, we show nearly optimal hardness results or integrality gas for these problems.

3. Coloring: We study complexity of hypergraph coloring problems when instances are promised to have a structure much stronger than admitting a proper 2-coloring, and prove both algorithmic and hardness results. For both algorithms and hardness, we give unifed frameworks that can be used for various promises.

4. H-Transversal: We study H-Transversal, where given a graph G and a fixed "pattern" graph H, the goal is to remove the minimum number of vertices from G to make sure it does not include H as a subgraph. We show an almost complete characterization of the approximability of H-Transversal depending on properties of H. One of our algorithms reveals a new connection between path transversal and graph partitioning.

5. We also study various cut problems on graphs, where the goal is to remove the minimum number of vertices or edges to cut desired paths or cycles. We present a general tool called "length-control dictatorship tests" to prove strong hardness results under the Unique Games Conjecture, which allow us to prove improved hardness results for multicut, bicut, double cut, interdiction, and firefigher problems.

**Thesis Committee: **

Venkatesan Guruswami (Chair)

Anupam Gupta

Ryan O'Donnell

R. Ravi

Ola Svensson (EPFL)

Concurrent programming presents a challenge to students and experts alike because of the complexity of multithreaded interactions and the difficulty to reproduce and reason about bugs. Stateless model checking is a concurrency testing approach which forces a program to interleave its threads in many different ways, checking for bugs each time. This technique is powerful, in principle capable of finding any nondeterministic bug in finite time, but suffers from exponential explosion as program size increases. Checking an exponential number of thread interleavings is not a practical or predictable approach for programmers to find concurrency bugs before their project deadlines.

In this thesis, I propose to make stateless model checking more practical for human use by way of several new techniques. I have built Landslide, a stateless model checker specializing in student projects for undergraduate operating systems classes. Landslide includes a novel algorithm for automatically managing multiple state spaces according to their estimated completion times, which I will show quickly finds bugs should they exist and also quickly verifies correctness otherwise. I will evaluate Landslide's suitability for inexpert use by presenting the results of many semesters providing it to students in 15-410, CMU's Operating System Design and Implementation class. Finally, I will explore broader impact by extending Landslide to test some real-world programs and to be used by students at other universities.

**Thesis Committee: **

Garth Gibson (Chair)

David Eckhardt

Brandon Lucia

Haryadi Gunawi (University of Chicago)

Copy of Thesis Summary

Duolingo has become the #1 education app on both Google Play and the Apple App Store. Since its public launch five years ago, the company has created three independent apps on multiple platforms and attracted over 170 million users worldwide.

Much of this success can be attributed to a data-centric and incremental approach to improving the app. Everything is A/B tested. Though user retention and daily active users are fairly straightforward to measure, a challenge that has presented itself over the years is how to measure how well people learn a language on our app.

*Presented in Partial Fulfillment of the CSD Speaking Skills Requirement *

*Motivation:* Most methods for reconstructing response networks from high throughput data generate static models which cannot distinguish between early and late response stages.

*Results:* We present TimePath, a new method that integrates time series and static datasets to reconstruct dynamic models of host response to stimulus. TimePath uses an Integer Programming formulation to select a subset of pathways that, together, explain the observed dynamic responses. Applying TimePath to study human response to HIV-1 led to accurate reconstruction of several known regulatory and signaling pathways and to novel mechanistic insights. We experimentally validated several of TimePaths predictions highlighting the usefulness of temporal models.

*Presented in Partial Fulfillment of the CSD Speaking Skills Requirement *

Though immutability has been long-proposed as a way to prevent bugs in software, little is known about how to make immutability support in programming languages effective for software engineers. We designed a new formalism that extends Java to support transitive class immutability, the form of immutability for which there is the strongest empirical support, and implemented that formalism in a tool called Glacier. We applied Glacier successfully to two real-world systems. We also compared Glacier to Javas final in a user study of twenty participants. We found that even after being given instructions on how to express immutability with final, participants who used final were unable to express immutability correctly, whereas almost all participants who used Glacier succeeded. We also asked participants to make specific changes to immutable classes and found that participants who used final all incorrectly mutated immutable state, whereas almost all of the participants who used Glacier succeeded. Glacier represents a promising approach to enforcing immutability in Java and provides a model for enforcement in other languages.

*Presented in Partial Fulfillment of the CSD Speaking Skills Requirement*

Given a fixed graph H with k vertices, H-Transversal is the problem to remove the minimum number of vertices from a large graph G so that G does not have H as a subgraph. This problem captures many fundamental combinatorial optimization problems as special cases, especially when H is a clique, a cycle, or a path.

We study how approximability of H-Transversal changes depending on H. A simple greedy algorithm achieves k-approximation whenever H has k vertices. We show that if H is 2-vertex-connected, we cannot have better than a (k-1)-approximation algorithm unless NP is contained in RP. We also prove that among 1-vertex-connected graphs, both k-Path Transversal and k-Star Transversal admit an O(log k)-approximation algorithm. The algorithm for k-Path Transversal reveals an interesting connection between H-Transversal and graph partitioning.

*Presented in Partial Fulfillment of the CSD Speaking Skills Requirement*

Given a k-ary predicate P, a random instance of a constraint satisfaction problem with predicate P (CSP(P)) is a set of m constraints on n variables. Each constraint is P applied to k random literals. Such an instance is unsatisfiable with high probability when m >> n. Refutation is certifying that a given randomly chosen instance is unsatisfiable. This task has applications in cryptography, hardness of approximation, and learning theory. This thesis studies refutation using sum-of-squares (SOS) proof systems. SOS is a sequence of increasingly powerful proof systems parameterized by degree: the higher the degree, the more powerful the proof system. On the other hand, finding an SOS proof requires time exponential in the degree.

First, we consider refutation via constant-degree SOS proofs, which can be found in polynomial time. We show that the number of constraints required for constant-degree SOS refutation of CSP(P) is determined by a complexity parameter C(P), the minimum t for which there is no t-wise uniform distribution over {0,1}^k supported on satisfying assignments to P. With Allen and O'Donnell, we proved that constant-degree SOS can refute CSP(P) when m = O~(n^{C(P)/2}). With Kothari, Mori, and O'Donnell, we showed a nearly matching lower bound.

More generally, we consider delta-refutation, certifying that at most a 1-delta fraction of constraints can be simultaneously satisfied. We also consider SOS proofs with arbitrary, possibly superconstant, degree. With Allen and O’Donnell, we showed that if every *t*-wise uniform distribution is delta-far from every distribution supported on satisfying assignments to *P*, then constant-degree SOS can (delta-o(1))-refute CSP(P) with m = O~(n^{t/2}). For such *P*, this can be extended using a result of Raghavendra, Rao, and Schramm to obtain (delta-o(1))-refutation with m = \Delta n constraints via degree-O~(n/Delta^{2/(t-2)}) SOS. With Kothari, Mori, and O’Donnell, we proved that (delta+o(1))-refutation of CSP(P) with Delta n constraints requires SOS degree Omega~(n/Delta^{2/(t-1)}) when there exists a t-wise uniform distribution that is delta-close to being supported on satisfying assignments to *P*. These results establish a three-way trade-off among number of constraints, SOS degree, and refutation strength delta that is tight up to logarithmic factors.

**Thesis Committee:**

Anupam Gupta (Co-Chair)

Ryan O’Donnell (Co-Chair)

Alan Frieze

Boaz Barak (Harvard University)

Eli Ben-Sasson (Technion -- Israel Institute of Technology)