# Seminar on Theory of Computing - Seminář z teoretické informatiky

**Organizers:** Ondřej Čepek, Pavel Hubáček, Petr Kolman, Michal Koucký, Jiří Sgall, Pavel Veselý

**Contact:** <sgall@iuuk.mff.cuni.cz>

**Lecture code:** NTIN102 - 0/2 Z

**Time:** Tue 12:00-14:00

**Place:** S8, Malá Strana

A seminar presenting ongoing research projects of participants and related recent progress within theoretical computer science. Attendees are expected to present their research or some (recent) exciting results, and follow-up discussion is expected. The seminar starts with pizza at 12:00, which will be provided for the participants, followed by the actual talk at 12:20.

## Spring semester 2024

This semester, we plan again a mix of talks on new results of the participants and expositions of recent papers in TCS by students. The first seminar will take place on February 20, 2024.

### Next program

### 9. 4. 2024

**Pavel Hrubeš (Czech Academy of Sciences): Sum-of-squares composition formulas**

A classical problem of Hurwitz asks for a characterisation of identities of the form (x_1^2+..+x_n^2)(y_1^2+...+y_m^2)= f_1^2+...+f_s^2, where f_1,...,f_s are bilinear in x and y. I will overview several aspects of this problem, including the elegant Hurwitz-Radon theorem, and discuss its connection to complexity theory

### Papers to be covered

**Antonios Antoniadis, Ruben Hoeksma, Sándor Kisfaludi-Bak, Kevin Schewior: Online Search for a Hyperplane in High-Dimensional Euclidean Space.** IPL 2022.

**Nikhil Bansal, John Kuszmaul, William Kuszmaul: A Nearly Tight Lower Bound for the d-Dimensional Cow-Path Problem. **IPL 2023.

**Calvin Beideman, Karthekeyan Chandrasekaran, Weihang Wang: Approximate minimum cuts and their enumeration. **SOSA 2023.

** Li Chen, Rasmus Kyng, Maximilian Probst Gutenberg, Sushant Sachdeva: ****A Simple Framework for Finding Balanced Sparse Cuts via APSP.** SOSA 2023.

**Péter Hajnal, Zhihao Liu, György Turán: ****Nearest neighbor representations of Boolean functions. **Information and Computation, vol. 285, Part B, May 2022.

(Jelena Glišić)**Michael A. Bender, Martín Farach-Colton, John Kuszmaul, William Kuszmaul: Modern Hashing Made Simple.** SOSA 2024.

**Joakim Blikstad, Ola Svensson, Radu Vintan, David Wajc: Simple and Asymptotically Optimal Online Bipartite Edge Coloring.** SOSA 2024.

**Alexander Golovnev, Siyao Guo, Spencer Peters, Noah Stephens-Davidowitz: Revisiting Time-Space Tradeoffs for Function Inversion.** CRYPTO 2023.

** Or Zamir: ****The Wrong Direction of Jensen's Inequality Is Algorithmically Right.** ICALP 2023.

(possibly Petr Chmel in Spring 2024)

### Previous program

### 2. 4. 2024

**Petr Kolman: Approximating Spanning Tree Congestion on Graphs with Polylog Degree**

Given a graph $G$ and a spanning tree $T$ of $G$, the congestion of an edge $e\in E(T)$, with respect to $G$ and $T$, is the number of edges $uv$ in $G$ such that the unique path in $T$ connecting the vertices $u$ and $v$ traverses the edge $e$. Given a connected graph $G$, the {\em spanning tree congestion} problem is to construct a spanning tree $T$ that minimizes its maximum edge congestion.

It is known that the problem is NP-hard, and that {\em every} spanning tree is an $n/2$-approximation, but it is not even known whether an $o(n)$-approximation is possible in polynomial time; by $n$ we denote the number of vertices in the graph $G$.

We consider the problem on graphs with maximum degree bounded by $polylog(n)$ and describe an $o(n)$-approximation algorithm; note that even on this restricted class of graphs the spanning tree congestion can be of order $n\cdot polylog(n)$.

### 26. 3. 2024

**Ian Mertz (Warwick University): How to Reuse Space**

Space and time are two of the oldest and most fundamental resources in computer science, both in theory and in practice. However, the relationship between space and time efficiency—known in the language of complexity as L and P, respectively—has stood unresolved since the beginnings of the field.

The central candidate for showing the weakness of L, known as Tree Evaluation, relies on an idea known as composition: the intuitive belief that memory cannot serve two different purposes, such as storage and computation, at once. This strategy, of showing that two unrelated tasks are no easier done together than separately, is a long-standing technique in designing hard functions, and continues to be an active area of study in many different contexts.

In this talk we show that this logic fails in the world of space: memory can be used for both storage and computation at the same time, and thus, contrary to expectations, Tree Evaluation can in fact be solved nearly in L. We explore this new paradigm, which we term "reusing space", from existing techniques to potential connections throughout theoretical computer science.

### 19. 3. 2024

**Vishwas Bhargava (University of Waterloo): Multivariate Multipoint Evaluation over Finite Fields**

Polynomials are widely used throughout computer science, ranging from heavily applied algorithms in signal processing to purely theoretical endeavors in complexity theory. What makes polynomials so applicable across domains, and what are the fundamental problems linked with them? In this talk, we will address these questions.

The main focus of this talk is on Multivariate Multipoint Evaluation (MME), tackling the question: How can we efficiently evaluate a multivariate polynomial with N coefficients at N different points? A straightforward quadratic (N^2 time) algorithm involves iteratively evaluating the polynomial at each input. Obtaining faster algorithms than the naive approach (ideally nearly linear, i.e., N^{1+o(1)} time) for this problem is a natural problem in computational algebra and is a direct generalization of the Discrete Fourier Transform! Furthermore, fast algorithms for multipoint evaluation have implications for polynomial factorization, matrix rigidity, and modular composition. We will discuss recent progress, including a complete resolution of the MME problem over finite fields.

We will conclude by providing a broad overview of the exciting applications of polynomials in various domains and exploring some interesting research directions related to the theme of algebraic methods.

### 12. 3. 2024

**Richard Hladík (ETH): Universal optimality in graph problems via beyond-worst-case heaps**

In this talk, we will discuss how beyond-worst-case heaps lead to new insights about decades-old graph problems. We first consider the problem of sorting a set of items having an unknown total order by doing binary comparisons of the items, given the outcomes of some pre-existing comparisons. We present a simple algorithm with a running time of O(m + n + lg T), where n, m, and T are the number of items, the number of pre-existing comparisons, and the number of total orders consistent with the outcomes of the pre-existing comparisons, respectively. The algorithm does O(lg T) comparisons. Our time and comparison bounds are the best possible up to constant factors. The best previous algorithm with a bound of O(lg T) on the number of comparisons has a time bound of O(n^2.5 + lg T) and is much more complicated. Our algorithm uses heaps with a certain beyond-worst-case property, analogous to the working set property of splay trees.

We then shift our attention to Dijkstra's algorithm and the problem of listing vertices of a weighted graph in the order of their distance from a source vertex. Using the exact same techniques, we show that Dijkstra's algorithm paired with a heap with the working set property is universally optimal for this problem. Universal optimality is a powerful beyond-worst-case performance guarantee for graph algorithms that informally states that a single algorithm performs as well as possible for every single graph topology. Our first result can also be viewed through this lens.

Joint work with Bernhard Haeupler, John Iacono, Václav Rozhoň, Robert Tarjan, Jakub Tětek.

### 5. 3. 2024

**Péter Hajnal, Zhihao Liu, György Turán: ****Nearest neighbor representations of Boolean functions. **Information and Computation, vol. 285, Part B, May 2022.

presented by Jelena Glišić

A nearest neighbor representation of a Boolean function is a set of positive and negative prototypes in Rnsuch that the function has value 1 on an input iff the closest prototype is positive. For k-nearest neighbor representation the majority classification of the kclosest prototypes is considered. The nearest neighbor complexity of a Boolean function is the minimal number of prototypes needed to represent the function. We give several bounds for this measure. Separations are given between the cases when prototypes can be real or are required to be Boolean. The complexity of parity is determined exactly. An exponential lower bound is given for mod 2 inner product, and a linear lower bound is given for its k-nearest neighbor complexity. The results are proven using connections to other models such as polynomial threshold functions over {1, 2}. We also discuss some of the many open problems arising.

### 27. 2. 2024

**Josef Tkadlec: Amplifiers of selection for both Birth-death and death-Birth updating**

Moran process is a classic discrete-time random process that models how entities such as opinions or viruses propagate through graphs. In each step, the current vertex coloring is updated according to a certain random rule. There are two versions of the Moran process called Birth-death updating and death-Birth updating. The two versions behave very differently. For example, under Birth-death updating many graphs function as so-called "amplifiers of selection", whereas for death-Birth updating only a handful of amplifiers are known. Curiously, none of those few death-Birth amplifiers are

amplifiers under Birth-death updating. In this work, we identify a new family of graphs that function as amplifiers under both versions of the Moran process.

Joint work with Jakub Svoboda, Soham Joshi, and Krishnendu Chatterjee.

### 20. 2. 2024

**Jiří Fink: Matchings extend to long cycles in hypercubes**

In 1993, Ruskey and Savage conjectured that every matching of the hypercube can be extended into a Hamilton cycle. The conjecture is known to be true if the matching is perfect or it has size at most a quadratic in the dimension. It is also known that every matching can be extended into a 2-factor. We will discuss how to extend a matching into a single cycle visiting at least 2/3-fraction of all vertices of the hypercube. The proof uses induction where the base case is computer assisted.

Joint work with Torsten Mütze

### 16. 1. 2024

**Václav Rozhoň (ETH and MIT): Work-Efficient Parallel Derandomization: Chernoff-like Concentrations via Pairwise Independence**

We present a novel technique for work-efficient parallel derandomization, for algorithms that rely on the concentration of measure bounds such as Chernoff, Hoeffding, and Bernstein inequalities. Our method increases the algorithm's computational work and depth by only polylogarithmic factors. Before our work, the only known method to obtain parallel derandomization with such strong concentrations was by the results of [Motwani, Naor, and Naor FOCS'89; Berger and Rompel FOCS'89], which perform a binary search in a $k$-wise independent space for $k=\poly(\log n)$. However, that method blows up the computational work by a high $\poly(n)$ factor and does not yield work-efficient parallel algorithms. Their method was an extension of the approach of [Luby FOCS'88], which gave a work-efficient derandomization but was limited to algorithms analyzed with only pairwise independence. Pushing the method from pairwise to the higher $k$-wise analysis resulted in the $\poly(n)$ factor computational work blow-up. Our work can be viewed as an alternative extension from the pairwise case, which yields the desired strong concentrations while retaining work efficiency up to logarithmic factors.

Our approach works by casting the problem of determining the random variables as an iterative process with $\poly(\log n)$ iterations, where different iterations have independent randomness. This is done so that for the desired concentrations, we need only pairwise independence inside each iteration. In particular, we model each binary random variable as a result of a gradual random walk, and our method shows that the desired Chernoff-like concentrations about the endpoints of these walks can be boiled down to some pairwise analysis on the steps of these random walks in each iteration (while having independence across iterations). Hence, we can fix the randomness of each iteration efficiently before proceeding to the next.

Joint work with Mohsen Ghaffari, Christoph Grunau (FOCS 2023).

### 9. 1. 2024

**Dmytro Gavinski (Czech Academy of Sciences): Patterned non-determinism in communication complexity**

*exhaustive*witness-searching problem has an efficient deterministic protocol too (this part would only be claimed during the talk, a more detailed exposition might be presented some other time).

### 19. 12. 2023

**Chakraborty, Vinodchandran, Meel: Distinct Elements in Streams: An Algorithm for the (Text) Book. ESA 2022.**

presented by Nicolaos Matsakis

Given a data stream A=⟨a_1,a_2,…,a_m⟩ of m elements where each a_i∈[n], the Distinct Elements problem is to estimate the number of distinct elements in A.Distinct Elements has been a subject of theoretical and empirical investigations over the past four decades resulting in space optimal algorithms for it. All the current state-of-the-art algorithms are, however, beyond the reach of an undergraduate textbook owing to their reliance on the usage of notions such as pairwise independence and universal hash functions. We present a simple, intuitive, sampling-based space-efficient algorithm whose description and the proof are accessible to undergraduates with the knowledge of basic probability theory.

### 12. 12. 2023

**Ninad Rajgopal (U. of Warwick): On the Power of Interactive Proofs for Learning**

We study interactive proof systems for verifying the results of agnostic learning tasks, which was recently initiated by Goldwasser, Rothblum, Shafer and Yehudayoff [GRSY21], as PAC-verification. Here, for a given target hypothesis class H and for any unknown n-variate Boolean function f, the verifier who only gets random example access to f (over the uniform distribution) interacts with an untrusted prover who has query access to f, to output a hypothesis that is as close to f as an optimal hypothesis from H, up to a small error. Such proof systems enable a verifier to delegate an agnostic learning task to an untrusted prover, with the goal being one of verification using much fewer random examples and/or running time, than actually performing learning (using queries) from scratch. Typically, we require such interactive proofs to be doubly efficient, where the honest prover is also an efficient algorithm.

In this paper, we study delegation of learning for certain Boolean hypothesis classes that are central to learning theory. Firstly, we show an interactive proof for learning all the \eps-heavy Fourier coefficients of a function using poly(1/\eps) many examples (i.e., Goldreich-Levin theorem), whereas the prover uses poly(n/\eps) many membership queries, giving a verification speed-up that is independent of the input length n. We also show a similar input-independent separation for the hypothesis class of all k-Juntas. Next, we show such interactive proofs for learning the class AC^0[2] building on the work by Carmosino-Impagliazzo-Kabanets-Kolokolova [CIKK17], where the verifier uses quasi-polynomially many random examples. In contrast, this class has been notoriously resistant beyond trivial brute force learning, even for constructing realisable learners (without a prover) using random examples. Finally, we show that if we do not insist on doubly-efficient proof systems, then the model becomes trivial; even P/Poly can be learnt using O(1) many examples.

This is joint work with Tom Gur, Mohammad Mahdi Jahanara, Mohammad Mahdi Khodabandeh, Bahar Salamatian, and Igor Shinkar.

### 5. 12. 2023

**Tomáš Hons: Structural limits and inverse problems**

The theory of graph convergence and limits was developed for study of large graphs. Inspired by analysis, the idea is to assign a certain limit object to a sequence of growing graphs that captures important properties of the sequence. However, when seeking the right definition of the limit object, the following inverse problem needs to be considered: given a limit object L (i.e. satisfying the definition), is there a sequence of graphs that converges to L? In the talk, we introduce several notions of graph convergence, the corresponding limit objects, and the state of their inverse problem. Moreover, we mention a certain simplified question that deals with approximation of vertices within a limit instead of the limit itself.

**David Mikšaník: Coloring graphs with bounded clique number**A well-known theorem of Brooks states that a graph G of maximum degree Delta≥3 can be colored by Delta colors unless G contains a clique of size Delta+1. It is natural to ask what additional assumptions guarantee that G can be colored with Delta-1 colors. Borodin and Kostochka (1976) conjectured that when Delta≥9, it is sufficient to assume only that G does not contain a clique of size Delta. In 1999, Reed showed that this conjecture holds for graphs with maximum degree at least 10^14.

In this talk, I will suggest possible ways to improve Reed's result by decreasing the constant 10^14 to hundreds or lower. This topic is part of my dissertation thesis and this year's GAUK grant proposal, both under the supervision of Zdeněk Dvořák.

### 28. 11. 2023

**Max Deppert, Matthias Kaul, Matthias Mnich: ****A (3/2 + ε)-Approximation for Multiple TSP with a Variable Number of Depots. **ESA 2023.

presented by Petr Martínek

One of the most studied extensions of the famous Traveling Salesperson Problem (TSP) is the Multiple TSP: a set of m ≥ 1 salespersons collectively traverses a set of n cities by m non-trivial tours to minimize the total length of their tours. This problem can also be considered to be a variant of Uncapacitated Vehicle Routing, where the objective is to minimize the sum of all tour lengths. When all m tours start from and end at a single common depot, then the metric Multiple TSP can be approximated equally well as the standard metric TSP, as shown by Frieze (1983). The metric Multiple TSP becomes significantly harder to approximate when there is a set D of d ≥ 1 depots that form the starting and end points of the m tours. This paper gives the first efficient approximation algorithm for Multiple TSP with a variable number d of depots that yields a better-than-2 approximation. Their algorithm runs in time (1/ε)^O(dlog d) ⋅ n^O(1), and produces a (3/2+ε)-approximation with constant probability. For the graphic case, they obtain a deterministic 3/2-approximation in time 2^d ⋅ n^O(1).

### 21. 11. 2023

**Arturo Merino (U. Saarland): Traversing combinatorial polytopes via optimization**

We present a new framework that exploits combinatorial optimization for efficiently generating a large variety of combinatorial objects based on graphs, matroids, posets, and polytopes. Our method relies on a simple and versatile algorithm for computing a Hamilton path on the skeleton of any 0/1-polytope conv(X), where X ⊆ {0,1}^n. The algorithm uses as a black box any algorithm that solves the classical linear optimization problem, and the resulting delay, i.e., the running time per visited vertex on the Hamilton path, is only a polylogarithmic factor larger than the running time of the optimization algorithm. When X encodes a particular class of combinatorial objects, then traversing the skeleton of the polytope conv(X) along a Hamilton path corresponds to listing the combinatorial objects by local change operations, i.e., we obtain Gray code listings.

As a concrete example application of our framework, we obtain an O(T⋅log n) delay algorithm for the vertex enumeration problem on 0/1 polytopes, where T is the time needed to solve a linear program and n is the dimension of the polytope. This improves upon the 25-year-old O(T⋅n) delay algorithm due to Bussieck and Lübbecke.

This is joint work with Torsten Mütze (U. of Warwick).

### 14. 11. 2023

**Emmett Breen, Renee Mirka, Zichen Wang, David P. Williamson: Revisiting Garg's 2-Approximation Algorithm for the k-MST Problem in Graphs. **SOSA 2023.

presented by Josef Matějka

Garg (STOC'05) gave a polynomial-time 2-approximation algorithm for finding a Minimum Spanning Tree spanning k vertices (k-MST). This paper greatly simplifies the algorithm to achieve Garg's result.

### 7. 11. 2023

**Charlotte Hoffmann (ISTA): Certifying Giant Nonprimes**

GIMPS and PrimeGrid are large-scale distributed projects dedicated to searching giant prime numbers, usually of special forms like Mersenne and Proth primes. The numbers in the current search-space are millions of digits large and the participating volunteers need to run resource-consuming primality tests. Once a candidate prime N has been found, the only way for another party to independently verify the primality of N used to be by repeating the expensive primality test. To avoid the need for a second recomputation of each primality test, these projects have recently adopted certifying mechanisms that enable efficient verification of performed tests. However, the mechanisms presently in place only detect benign errors and there is no guarantee against adversarial behavior: a malicious volunteer can mislead the project to reject a giant prime as being non-prime.

In this talk, I will present a practical, cryptographically-sound mechanism for certifying the non-primality of Proth numbers. That is, a volunteer can – parallel to running the primality test for N – generate an efficiently verifiable proof at a little extra cost, certifying that N is not prime.

Joint work with Pavel Hubáček, Chethan Kamath, and Krzysztof Pietrzak.

### 31. 10. 2023

**Pavel Veselý (CUNI): Massively Parallel Algorithms for Clustering in High Dimension**

We discuss algorithms for k-Median and k-Means clustering in high-dimensional Euclidean spaces that run in the Massively Parallel Computation (MPC) model, which captures key aspects of MapReduce-style frameworks. In MPC, we have a set of machines, each with a local memory that is sublinear in the input size, and computation proceeds in alternating rounds of communication and sequential (but local) computation. We focus on the fully scalable case in which the local memory may be substantially smaller than the number of clusters k. We present new algorithms that take O(1) rounds and achieve O(1)-bicriteria approximation (meaning we output O(k) clusters), improving upon a polylogarithmic approximation. Our approach relies on a connection between k-clustering and Facility Location, for which we design an MPC O(1)-round O(1)-approximation.

Joint work with Artur Czumaj, Guichen Gao, Shaofeng H.-C. Jiang, and Robert Krauthgamer.

### 24. 10. 2023

**Tomasz Kociumaka, Adam Polak: Bellman-Ford Is Optimal for Shortest Hop-Bounded Paths. **ESA 2023.

presented by Jelena Glišić

This paper is about the problem of finding a shortest s-t path using at most h edges in edge-weighted graphs. The Bellman-Ford algorithm solves this problem in O(hm) time, where m is the number of edges. We show that this running time is optimal, up to subpolynomial factors, under popular fine-grained complexity assumptions. More specifically, we show that under the APSP Hypothesis the problem cannot be solved faster already in undirected graphs with nonnegative edge weights. This lower bound holds even restricted to graphs of arbitrary density and for arbitrary h ∈ O(√m). Moreover, under a stronger assumption, namely the Min-Plus Convolution Hypothesis, we can eliminate the restriction h ∈ O(√m). In other words, the O(hm) bound is tight for the entire space of parameters h, m, and n, where n is the number of nodes. Our lower bounds can be contrasted with the recent near-linear time algorithm for the negative-weight Single-Source Shortest Paths problem, which is the textbook application of the Bellman-Ford algorithm.

### 17. 10. 2023 - CANCELLED

### 10. 10. 2023

**Petr Kučera (CUNI): Binary Constraint Trees and Structural Decomposability**

A binary constraint tree (BCT, Wang and Yap 2022) is a normalized binary CSP whose constraint graph is a tree. A BCT constraint is a constraint represented with a BCT where some of the variables may be hidden (i.e. existentially quantified and used only for internal representation). Structured decomposable negation normal forms (SDNNF) were introduced by Pipatsrisawat and Darwiche (2008) as a restriction of decomposable negation normal forms (DNNF). Both DNNFs and SDNNFs were studied in the area of knowledge compilation. In this paper we show that the BCT constraints are polynomially equivalent to SDNNFs. In particular, a BCT constraint can be represented with an SDNNF of polynomial size and, on the other hand, a constraint that can be represented with an SDNNF, can be represented as a BCT constraint of polynomial size. This generalizes the result of Wang and Yap (2022) that shows that a multivalued decision diagram (MDD) can be represented with a BCT. Moreover, our result provides a full characterization of binary constraint trees using a language that is well studied in the area of knowledge compilation. It was shown by Wang and Yap (2023) that a CSP on n variables of domain sizes bounded by d that has treewidth k can be encoded as a BCT on O(n) variables with domain sizes O(d^{k+1}). We provide an alternative reduction for the case of binary CSPs. This allows us to compile any binary CSP to an SDNNF of size that is parameterized by d and k.

### 3. 10. 2023

**Luca Trevisan (Bocconi U.): Approximating Boolean Constraint Satisfaction Problems in Random, Semi-Random, and Worst-Case Instances**

We know several techniques to round Semidefinite Programming relaxations of graph optimization problems and of Boolean constraint satisfaction problems in which each constraint involves at most two variables, and these rounding techniques lead to good approximation algorithms. For problems involving hyper-graphs, for Boolean constraint satisfaction problems with three or more variables per constraints, and for polynomial optimization problems with polynomials of degree higher than two, we lack techniques to round Semidefinite Programs (or other convex relaxations) or to analyze their average-case performance.

We present two new techniques to analyze random, semi-random and worst-case instances of problems such as Max 3SAT, Max 3XOR, and degree 3 polynomial optimization. One technique gives new bounds to certify unsatisfiability of random 3SAT and certified upper bounds for random Max 3XOR and for random instances of degree 3 polynomial optimization, and it extends to a certain semi-random generative model (in which instances are produced with a mix of random choices and worst- case choices). The other technique provides a new rounding scheme for worst-case instances of Max 3XOR and homogeneous degree 3 polynomial optimization.

Joint work with Tommaso D’Orsi, Tim Hsieh, Pravesh Kothari, and Lucas Pesenti