Joint IOL and COGA research seminar 2020–2021

This is a list of talks from the seminar of the Combinatorial Optimization & Graph Algorithms (COGA) group at Technische Universität Berlin during 2020–2021, when the seminar was organized jointly with our group. For talks outside this period, please see the official homepage.

Tobias Harks (Uni Augsburg)
Coordinates:

Title. Machine-Learned Prediction Equilibrium for Dynamic Traffic Assignment

Abstract.

We study a dynamic traffic assignment model, where agents base their instantaneous routing decisions on real-time delay predictions. We formulate a mathematically concise model and derive properties of the predictors that ensure a dynamic prediction equilibrium exists. We demonstrate the versatility of our framework by showing that it subsumes the well-known full information and instantaneous information models, in addition to admitting further realistic predictors as special cases. We complement our theoretical analysis by an experimental study, in which we systematically compare the induced average travel times of different predictors, including a machine-learning model trained on data gained from previously computed equilibrium flows, both on a synthetic and a real road network.

Linda Kleist (TU Braunschweig)
Coordinates:

Title. Training Neural Networks is even harder (own results)

Abstract.

Methods and applications of Machine Learning have made impressive progress in recent years. One such approach is neural networks: brain-inspired learning models that can be trained to perform complex and diverse tasks. In order to deliver good results, the training of such a network is crucial.

Given a neural network, training data, and a threshold, finding weights for the neural network such that the total error is below the threshold is known to be NP-hard. In this talk, we determine the algorithmic complexity of this fundamental problem precisely, by showing that it is complete for the complexity class existential theory of the reals (ER).

This means that, up to polynomial time reductions, the problem is equivalent to deciding whether a system of polynomial equations and inequalities with integer coefficients and real unknowns has a solution. If, as widely expected, ER is strictly larger than NP, our work implies that the problem of training neural networks is not even in NP.
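
For readers who have not seen it, the decision problem underlying ER can be written in one line: given a quantifier-free Boolean formula Φ of polynomial equations and inequalities with integer coefficients, decide whether

\[
  \exists x_1, \dots, x_n \in \mathbb{R} : \; \Phi(x_1, \dots, x_n).
\]

A tiny hypothetical instance (for illustration only, not from the paper) is \( \Phi(x,y) \equiv (x^2 + y^2 = 1) \wedge (x^3 y - 2xy \ge 0) \).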

Neural networks are usually trained using some variation of backpropagation. The result of this paper offers an explanation of why techniques commonly used to solve large instances of NP-complete problems do not seem to be of use for this task. Examples of such techniques are SAT solvers, IP solvers, local search, and dynamic programming, to name a few general ones.

The talk is based on joint work with Mikkel Abrahamsen and Tillman Miltzow.

Laura Vargas Koch (ETH Zürich)
Coordinates:

Title. Convergence of a Packet Routing Model to Flows Over Time (own results)

Abstract.

The mathematical approaches for modeling dynamic traffic can roughly be divided into two categories: discrete packet routing models and continuous flow over time models. Despite very active research on models in both categories, the connection between these approaches has so far been poorly understood. We establish this connection by specifying a (competitive) packet routing model, which is discrete in terms of flow and time, and by proving its convergence to flows over time with deterministic queuing. More precisely, we prove that the limit of the convergence process, when decreasing the packet size and time step length in the packet routing model, constitutes a flow over time with multiple commodities. The convergence result implies the existence of approximate equilibria in the competitive version of the packet routing model. This is of significant interest as exact pure Nash equilibria, similar to almost all other competitive models, cannot be guaranteed in the multi-commodity setting.

Moreover, the introduced packet routing model with deterministic queuing is very application-oriented as it is based on the network loading module of the agent-based transport simulation MATSim. As the present work is the first mathematical formalization of this simulation, it provides a theoretical foundation and an environment for provable mathematical statements for MATSim.

This is joint work with Leon Sering and Theresa Ziemke.

Valentin Kirchner (TU Berlin)
Coordinates:

Title. Optimisation with Squared Lasso Penalty (own result)

Abstract.

The lecture offers a concise overview of a master’s thesis on the penalised linear regression problem with squared lasso penalty. We derive its dual and optimality conditions and connect it to the well-known lasso problem. Based on the optimality conditions we study the solution path and briefly discuss an exact and an approximate homotopy algorithm.
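
As a point of reference (notation assumed here, not taken from the thesis), the problem presumably takes the form

\[
  \min_{\beta \in \mathbb{R}^p} \; \tfrac{1}{2}\lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1^2,
\]

which differs from the classical lasso only in that the \(\ell_1\) penalty \(\lambda\lVert\beta\rVert_1\) is squared.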

Arturo Merino (COGA)
Coordinates:

Title. Efficient generation of elimination trees and Hamilton paths on graph associahedra (survey)

Abstract.

An elimination tree for a connected graph G is a rooted tree on the vertices of G obtained by choosing a root x and recursing on the connected components of G−x to produce the subtrees of x. Elimination trees appear in many guises in computer science and discrete mathematics, and they are closely related to centered colorings and tree-depth. They also encode many interesting combinatorial objects, such as bitstrings, permutations and binary trees. We apply the recent Hartung-Hoang-Mütze-Williams combinatorial generation framework to elimination trees, and prove that all elimination trees for a chordal graph G can be generated by tree rotations using a simple greedy algorithm. This yields a short proof for the existence of Hamilton paths on graph associahedra of chordal graphs. Graph associahedra are a general class of high-dimensional polytopes introduced by Carr, Devadoss, and Postnikov, whose vertices correspond to elimination trees and whose edges correspond to tree rotations. As special cases of our results, we recover several classical Gray codes for bitstrings, permutations and binary trees, and we obtain a new Gray code for partial permutations. Our algorithm for generating all elimination trees for a chordal graph G can be implemented in time O(m+n) per generated elimination tree, where m and n are the number of edges and vertices of G, respectively. If G is a tree, we improve this to a loopless algorithm running in time O(1) per generated elimination tree. We also prove that our algorithm produces a Hamilton cycle on the graph associahedron of G, rather than just a Hamilton path, if the graph G is chordal and 2-connected. Moreover, our algorithm characterizes chordality, i.e., it computes a Hamilton path on the graph associahedron of G if and only if G is chordal. This is joint work with Jean Cardinal (ULB) and Torsten Mütze (Warwick).
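
To make the recursive definition concrete, here is a naive Python enumerator built directly from it (pick a root, recurse on the components of G−x); it is only an illustration and nowhere near the O(m+n)-per-object algorithm of the talk:

    from itertools import product

    def components(vertices, edges):
        """Connected components of the subgraph induced by `vertices`."""
        vertices = set(vertices)
        adj = {v: set() for v in vertices}
        for u, v in edges:
            if u in vertices and v in vertices:
                adj[u].add(v)
                adj[v].add(u)
        seen, comps = set(), []
        for s in vertices:
            if s in seen:
                continue
            stack, comp = [s], set()
            while stack:
                v = stack.pop()
                if v not in comp:
                    comp.add(v)
                    stack.extend(adj[v] - comp)
            seen |= comp
            comps.append(frozenset(comp))
        return comps

    def elimination_trees(vertices, edges):
        """Yield (root, parent) for every elimination tree of the connected
        graph induced by `vertices`; `parent` maps each non-root vertex to
        its parent. Exponential memory, illustration only."""
        vertices = frozenset(vertices)
        for root in vertices:
            rest = vertices - {root}
            if not rest:
                yield root, {}
                continue
            # one elimination subtree per connected component of G - root;
            # each subtree's root becomes a child of `root`
            comps = components(rest, edges)
            for choice in product(*(list(elimination_trees(c, edges)) for c in comps)):
                parent = {}
                for sub_root, sub_parent in choice:
                    parent.update(sub_parent)
                    parent[sub_root] = root
                yield root, parent

    # The path 1-2-3 has exactly 5 elimination trees (the Catalan number C_3),
    # matching the vertices of the corresponding associahedron.
    for root, parent in elimination_trees({1, 2, 3}, [(1, 2), (2, 3)]):
        print("root:", root, "parents:", parent)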

Svenja Griesbach (DISCO)
Coordinates:

Title. Book Embeddings of Nonplanar Graphs with Small Faces in Few Pages (own work)

Abstract.

An embedding of a graph in a book, called book embedding, consists of a linear ordering of its vertices along the spine of the book and an assignment of its edges to the pages of the book, so that no two edges on the same page cross. The book thickness of a graph is the minimum number of pages over all its book embeddings. For planar graphs, a fundamental result is due to Yannakakis, who proposed an algorithm to compute embeddings of planar graphs in books with four pages. Our main contribution is a technique that generalizes this result to a much wider family of nonplanar graphs, which is characterized by a biconnected skeleton of crossing-free edges whose faces have bounded degree. Notably, this family includes all 1-planar and all optimal 2-planar graphs as subgraphs. We prove that this family of graphs has bounded book thickness, and as a corollary, we obtain the first constant upper bound for the book thickness of optimal 2-planar graphs.

This is joint work with Michael A. Bekos, Giordano Da Lozzo, Martin Gronemann, Fabrizio Montecchiani, and Chrysanthi Raftopoulou.

David Weckbecker (TU Darmstadt)
Coordinates:

Title. Fractionally Subadditive Maximization under an Incremental Knapsack Constraint (own result)

Abstract.

We consider the problem of maximizing a fractionally subadditive function under a knapsack constraint that grows over time. An incremental solution to this problem is given by an order in which to include the elements of the ground set, and the competitive ratio of an incremental solution is defined by the worst ratio over all capacities relative to an optimum solution of the corresponding capacity. We present an algorithm that finds an incremental solution of competitive ratio at most max{3.293·√M, 2M}, under the assumption that the values of singleton sets are in the range [1, M], and we give a lower bound of max{2.449, M} on the attainable competitive ratio. In addition, we establish that our framework captures potential-based flows between two vertices, and we give a tight bound of 2 for the incremental maximization of classical flows with unit capacities.
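
In symbols (notation mine, not the paper's): if π is the incremental order, π_C denotes its maximal prefix that fits into capacity C, f is the fractionally subadditive objective, and OPT(C) is an optimal solution of capacity C, then the upper bound states

\[
  \rho(\pi) \;=\; \sup_{C > 0} \frac{f(\mathrm{OPT}(C))}{f(\pi_C)} \;\le\; \max\{\,3.293\sqrt{M},\; 2M\,\}.
\]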

Kevin Schewior (Uni Köln)
Coordinates:

Title. Stochastic Probing with Increasing Precision (own result)

Abstract.

We consider a selection problem with stochastic probing. There is a set of items whose values are drawn from independent distributions. The distributions are known in advance. Each item can be tested repeatedly. Each test reduces the uncertainty about the realization of its value. We study a testing model, where the first test reveals if the realized value is smaller or larger than the median of the underlying distribution. Subsequent tests allow one to further narrow down the interval in which the realization is located. There is a limited number of possible tests, and our goal is to design near-optimal testing strategies that maximize the expected value of the chosen item. We study both identical and non-identical distributions and develop polynomial-time algorithms with constant approximation factors in both scenarios.

This is joint work with Martin Hoefer and Daniel Schmand.
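
For illustration, a minimal simulation of the testing model described above, assuming access to the item's quantile function (inverse CDF); the helper names are mine, not the paper's:

    def probe(value, quantile, num_tests):
        """Simulate `num_tests` tests on one item. After t tests the realized
        `value` is known to lie between two quantiles that are 2**-t apart.
        `quantile(p)` is the inverse CDF of the item's distribution."""
        lo, hi = 0.0, 1.0
        for _ in range(num_tests):
            mid = (lo + hi) / 2          # first iteration: mid = 0.5, the median
            if value <= quantile(mid):
                hi = mid                 # value lies in the lower half
            else:
                lo = mid                 # value lies in the upper half
        return quantile(lo), quantile(hi)   # interval containing `value`

    # Example with a Uniform[0, 1] item, whose quantile function is the identity:
    print(probe(0.63, lambda p: p, 3))   # -> (0.625, 0.75)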

Javier Cembrano (DISCO)
Coordinates:

Title. Multidimensional Apportionment through Discrepancy Theory (own result)

Abstract.

Deciding how to allocate the seats of a house of representatives is one of the most fundamental problems in the political organization of societies, and it has been widely studied for over two centuries. The idea of proportionality is at the core of most approaches to tackle this problem, and this notion is captured by the divisor methods, such as the Jefferson/D’Hondt method. In a seminal work, Balinski and Demange extended the single-dimensional idea of divisor methods to the setting in which the seat allocation is simultaneously determined by two dimensions, and proposed the so-called biproportional apportionment method. The method, currently used in several electoral systems, is however limited to two dimensions, and the question of extending it is considered an important problem both in theory and in practice.

In this work we initiate the study of multidimensional proportional apportionment. We first formalize a notion of multidimensional proportionality that naturally extends that of Balinski and Demange. By means of analyzing an appropriate integer linear program, we are able to prove that, in contrast to the two-dimensional case, the existence of multidimensional proportional apportionments is not guaranteed, and deciding their existence is NP-complete. Interestingly, our main result asserts that it is possible to find approximate multidimensional proportional apportionments that deviate from the marginals by a small amount. The proof arises through the lens of discrepancy theory, mainly inspired by the celebrated Beck-Fiala Theorem. We finally evaluate our approach using data from the 2021 Chilean Constitutional Convention election.

This is joint work with José Correa (UChile) and Victor Verdugo (UOH).

Moritz Buchem (Maastricht University)
Coordinates:

Title. Additive approximation schemes for load balancing problems (own result)

Abstract.

We formalize the concept of additive approximation schemes and apply it to load balancing problems on identical machines. Additive approximation schemes compute a solution with an absolute error in the objective of at most εh, for a suitable parameter h and any given ε > 0. We consider the problem of assigning jobs to identical machines with respect to common load balancing objectives such as makespan minimization, the Santa Claus problem (on identical machines), and the envy-minimizing Santa Claus problem. For these settings we present additive approximation schemes for h = pmax, the maximum processing time of the jobs.

Our technical contribution is two-fold. First, we introduce a new relaxation based on integrally assigning slots to machines and fractionally assigning jobs to the slots. We refer to this relaxation as the slot-MILP. While it has a linear number of integral variables, we identify structural properties of (near-)optimal solutions, which allow us to compute those in polynomial time. The second technical contribution is a local-search algorithm which rounds any given solution to the slot-MILP, introducing an additive error on the machine loads of at most ε·pmax.

This is joint work with Lars Rohwedder (EPFL), Tjark Vredeveld (UM) and Andreas Wiese (UChile).
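
In symbols, for a minimization objective such as the makespan, an additive approximation scheme with h = pmax returns, for any ε > 0, a schedule with

\[
  \mathrm{ALG} \;\le\; \mathrm{OPT} + \varepsilon \cdot p_{\max};
\]

for the maximization objectives (the two Santa Claus variants) the error term is subtracted instead. Note that for the makespan, where OPT ≥ pmax always holds, this additive guarantee implies a multiplicative (1+ε)-approximation, and it is much stronger whenever pmax is small compared to OPT.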

Boro Sofranac (IOL)
Coordinates:

Title. An Algorithm-Independent Measure of Progress for Linear Constraint Propagation (own results)

Abstract.

Propagation of linear constraints has become a crucial sub-routine in modern Mixed-Integer Programming (MIP) solvers. In practice, iterative algorithms with tolerance-based stopping criteria are used to avoid problems with slow or infinite convergence. However, these heuristic stopping criteria can pose difficulties for fairly comparing the efficiency of different implementations of iterative propagation algorithms in a real-world setting. Most significantly, the presence of unbounded variable domains in the problem formulation makes it difficult to quantify the relative size of reductions performed on them. In this work, we develop a method to measure—independently of the algorithmic design—the progress that a given iterative propagation procedure has made at a given point in time during its execution. Our measure makes it possible to study and better compare the behavior of bounds propagation algorithms for linear constraints. We apply the new measure to answer two questions of practical relevance (a minimal propagation sketch follows the list below):

  1. We investigate to what extent heuristic stopping criteria can lead to premature termination on real-world MIP instances.
  2. We compare a GPU-parallel propagation algorithm against a sequential state-of-the-art implementation and show that the parallel version is even more competitive in a real-world setting than originally reported.
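
For context, here is a minimal sketch of one round of bound tightening for a single constraint a·x ≤ b over box domains, the textbook step that such iterative propagation algorithms repeat (an illustration, not the paper's GPU algorithm; all bounds are assumed finite, and it is precisely the unbounded case discussed above that makes progress hard to quantify):

    def propagate_leq(a, b, lb, ub):
        """One round of bound tightening for a constraint a.x <= b over box
        domains [lb[j], ub[j]] (all bounds assumed finite). Returns new bounds."""
        lb, ub = lb[:], ub[:]
        # minimal activity: every term at its smallest possible value
        minact = sum(aj * (lb[j] if aj > 0 else ub[j]) for j, aj in enumerate(a))
        for j, aj in enumerate(a):
            if aj == 0:
                continue
            # minimal activity of all terms except the j-th
            rest = minact - aj * (lb[j] if aj > 0 else ub[j])
            if aj > 0:
                ub[j] = min(ub[j], (b - rest) / aj)
            else:
                lb[j] = max(lb[j], (b - rest) / aj)
        return lb, ub

    # x1 + 2*x2 <= 2 with x1, x2 in [0, 10] tightens to x1 <= 2, x2 <= 1:
    print(propagate_leq([1, 2], 2, [0, 0], [10, 10]))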

Sophie Huiberts
Coordinates:

Title. Combinatorial Diameter of Random Polyhedra

Abstract.

The long-standing polynomial Hirsch conjecture asks if the combinatorial diameter of any polyhedron can be bounded by a polynomial of the dimension and number of facets. Inspired by this question, we study the combinatorial diameter of two classes of random polyhedra.

We prove nearly-matching upper and lower bounds, assuming that the number of facets is very large compared to the dimension.

Khai Van Tran (COGA)
Coordinates:

Title. A Faster Algorithm for Quickest Transshipments via an Extended Discrete Newton Method

Abstract.

The Quickest Transshipment Problem is to route flow as quickly as possible from sources with supplies to sinks with demands in a network with capacities and transit times on the arcs. It is of fundamental importance for numerous applications in areas such as logistics, production, traffic, evacuation, and finance. More than 25 years ago, Hoppe and Tardos presented the first (strongly) polynomial-time algorithm for this problem. Their approach, as well as subsequently derived algorithms with strongly polynomial running time, are hardly practical as they rely on parametric submodular function minimization via Megiddo's method of parametric search. In this talk we present a considerably faster algorithm for the Quickest Transshipment Problem that instead employs a subtle extension of the Discrete Newton Method. This improves the previously best known running time of Õ(m^4 k^14) to Õ(m^2 k^5 + m^3 k^3 + m^3 n), where n is the number of nodes, m the number of arcs, and k the number of sources and sinks.

Frieder Smolny (COGA)
Coordinates:

Title. Computational experiments and multiscale optimization (own results)

Abstract.

In this talk we will see how to solve a network flow problem by a coarsening technique, and speculate whether this technique can be carried over to other optimization problems. Then we will witness the struggles of experimentally evaluating an algorithm with many parameters.

Daniel Schmidt (COGA)
Coordinates:

Title. Restricted Adaptivity in Stochastic Scheduling (own results)

Abstract.

We consider the stochastic scheduling problem of minimizing the expected makespan on m parallel identical machines. While the (adaptive) list scheduling policy achieves an approximation ratio of 2, any (non-adaptive) fixed assignment policy has performance guarantee 𝛺(log m/log log m). Although the performance of the latter class of policies is worse, there are applications, e.g. in surgery scheduling, in which non-adaptive policies are desired.

We introduce two classes of policies, namely 𝛿-delay and 𝜏-shift policies, whose degree of adaptivity can be controlled by a parameter. Using the fixed assignment policy induced by LEPT as a basis, we present a policy belonging to both classes for which we show that it is an O(log log m)-approximation for reasonably bounded parameters. In other words, an improvement on the performance of any fixed assignment policy can be achieved by allowing a small degree of adaptivity. Moreover, we provide a matching lower bound for any 𝛿-delay and 𝜏-shift policy when the respective parameter is of the order of the expected makespan of an optimal non-anticipatory policy.

This is joint work with Guillaume Sagnol.

Max-Georg Schorr (IOL)
Coordinates:

Title. Evolution of Boosting

Abstract.

The question of whether a bunch of weak learners can be converted into a strong learner arose in the late 1980s. It was answered affirmatively by Schapire and led to the most famous approach: AdaBoost. This algorithm is well known for its good performance in practice, and many variants of it exist. In this talk, we will look at the application of linear and mixed-integer optimization and derive corresponding LP and MIP formulations (a vanilla AdaBoost sketch follows the list below). In particular, we will look at:

  • Properties of AdaBoost including possible drawbacks and approaches to overcome them
  • Column generation as an efficient way to solve LP formulations
  • LPBoost and IPBoost
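
To fix ideas before the optimization formulations, here is a vanilla AdaBoost sketch with one-feature threshold stumps (the textbook version, an illustration rather than the talk's LP/MIP material):

    import numpy as np

    def adaboost(X, y, rounds=20):
        """Minimal AdaBoost with threshold stumps. X: (n, d) array,
        y: labels in {-1, +1}. Returns a list of weighted stumps."""
        n, d = X.shape
        w = np.full(n, 1.0 / n)          # example weights
        ensemble = []                    # entries: (alpha, feature, threshold, sign)
        for _ in range(rounds):
            best = None
            for j in range(d):           # brute-force best weak learner
                for t in np.unique(X[:, j]):
                    for s in (1, -1):
                        pred = s * np.where(X[:, j] <= t, 1, -1)
                        err = w[pred != y].sum()
                        if best is None or err < best[0]:
                            best = (err, j, t, s, pred)
            err, j, t, s, pred = best
            if err >= 0.5:               # no stump better than chance
                break
            alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
            ensemble.append((alpha, j, t, s))
            if err < 1e-12:              # perfect stump, nothing left to boost
                break
            w *= np.exp(-alpha * y * pred)   # re-weight: boost misclassified points
            w /= w.sum()
        return ensemble

    def predict(ensemble, X):
        score = sum(a * s * np.where(X[:, j] <= t, 1, -1)
                    for a, j, t, s in ensemble)
        return np.sign(score)

    # Toy usage: 1-D points, positive iff x > 0.5
    X = np.array([[0.1], [0.4], [0.6], [0.9]])
    y = np.array([-1, -1, 1, 1])
    print(predict(adaboost(X, y, rounds=5), X))   # -> [-1. -1.  1.  1.]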

Theresa Ziemke
Coordinates:

Title. Nash flows over time in MATSim? (own result)

Arturo Merino (COGA)
Coordinates:

Title. Greedy strategies for exhaustive generation (survey)

Abstract.

Day to day we frequently encounter different kinds of combinatorial objects, e.g., permutations, binary strings, binary trees, spanning trees of a graph, or bases of a matroid. Usually, there are three fundamental tasks for these objects: counting, random sampling, and exhaustive generation. While for the first two there are powerful, well-understood general approaches like generating functions or Markov chains, such approaches have been lacking for exhaustive generation.

In this talk, we will discuss greedy strategies for exhaustive generation. This is a very natural attempt to provide a general approach for exhaustive generation as greedy strategies are a widely used algorithmic paradigm. In particular, we will discuss:

  • Various classical examples of Greedy based approaches in exhaustive generation.
  • An introduction to the zig-zag framework by Hartung, Hoang, Mütze, and Williams, which provides a common lens for many greedy strategies.
  • Some greedy strategies that appear to not fit into the zig-zag framework, including a new greedy approach for generating spanning trees of any given connected graph.

Paco Criado (IOL)
Coordinates:

Title. Borsuk’s problem (survey)

Abstract.

In 1932, Borsuk proved that every 2-dimensional convex body can be divided into three parts, each of strictly smaller diameter than the original. He also asked whether the same holds in every dimension d: is it true that every d-dimensional convex body can be divided into d+1 pieces of strictly smaller diameter?

The answer to this question is positive in 3 dimensions (Perkal 1947), but in 1993, Kahn and Kalai constructed polytopal counterexamples in dimension 1325. Nowadays, we know that the answer is “no” for all dimensions d≥64 (Bondarenko and Jenrich 2013). It remains open for every dimension between 4 and 63.

In this talk we introduce the problem, show the proofs for dimensions 2 and 3, and present computational ideas on how to extend these solutions to dimension 4.

Christoph Hertrich (COGA) [homepage]
Coordinates:

Title. Tackling Neural Network Expressivity via (virtual Newton) polytopes (open problem)

Abstract.

I will talk about how to translate two open problems in the context of ReLU neural network (NN) expressivity into the polytope world:

  1. Can constant-depth NNs compute any continuous piecewise linear (CPWL) function? Smallest open special case: Can 3-layer NNs compute the maximum of 5 numbers?
  2. Can polynomial-size NNs precisely compute the objective value of the Minimum Spanning Tree problem (or other combinatorial optimization problems) for arbitrary real-valued edge weights?

To achieve these translations, I will introduce an isomorphism between the two semigroups (convex CPWL functions, pointwise addition) and (polytopes, Minkowski sum), known as “support functions” (in convex geometry) and “Newton polytopes” (in tropical geometry). Extending this concept to arbitrary (not necessarily convex) CPWL functions leads to the (in my opinion beautiful) construction of “virtual” polytopes, that is, the group of formal Minkowski differences of two polytopes.
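
Concretely, the isomorphism rests on two standard facts about support functions:

\[
  h_P(x) = \max_{p \in P} \langle p, x \rangle, \qquad h_{P+Q} = h_P + h_Q,
\]

so Minkowski sum turns into pointwise addition; and the maximum function itself is the support function of a simplex,

\[
  \max\{x_1, \dots, x_n\} = h_{\Delta_n}(x), \qquad \Delta_n = \mathrm{conv}\{e_1, \dots, e_n\},
\]

i.e., in the terminology above, the simplex is the Newton polytope of the maximum function, which is what makes polytopal arguments available for open problem 1.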

Martin Skutella (COGA)
Coordinates:

Title. A simple proof of the Moore–Hodgson Algorithm for minimizing the number of late jobs (own results)

Abstract.

The Moore–Hodgson Algorithm minimizes the number of late jobs on a single machine. That is, it finds an optimal schedule for the classical problem where jobs with given processing times and due dates have to be scheduled on a single machine so as to minimize the number of jobs completing after their due date. Several proofs of the correctness of this algorithm have been published. We present a new short proof.
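
For reference, the classical algorithm itself: process jobs in earliest-due-date (EDD) order and, whenever the current job would complete late, discard the longest job scheduled so far. A short Python rendering of this textbook rule (an illustration, not the talk's proof):

    import heapq

    def moore_hodgson(jobs):
        """Moore-Hodgson: minimize the number of late jobs on one machine.
        `jobs` is a list of (processing_time, due_date) pairs."""
        kept, late = [], []   # `kept` is a max-heap (by processing time)
        t = 0                 # completion time of the kept jobs
        for p, d in sorted(jobs, key=lambda job: job[1]):   # EDD order
            heapq.heappush(kept, (-p, p, d))
            t += p
            if t > d:   # the kept jobs cannot all finish on time:
                _, q, qd = heapq.heappop(kept)   # drop the longest one
                t -= q
                late.append((q, qd))
        on_time = [(p, d) for _, p, d in kept]
        return on_time, late

    # Three unit jobs due at time 2: at most two can be on time.
    print(moore_hodgson([(1, 2), (1, 2), (1, 2)]))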

Max Klimm (COGA/DO)
Coordinates:

Title. Static and dynamic pricing of identical items (own result)

Abstract.

We consider the problem of maximizing the expected revenue from selling k identical items to n unit-demand buyers who arrive sequentially with independent and identically distributed valuations. We show bounds on the gap between the revenue achievable with dynamic prices and that achievable with a static price. These bounds are tight for all values of k and n and vary depending on a regularity property of the underlying distribution. The bound for regular distributions increases in n and tends to 1/(1 − k^k/(e^k k!)). Our upper bounds are obtained via a classic relaxation of the pricing problem known as the ex-ante relaxation, and a novel use of Bernstein polynomials to approximate the revenue from static pricing. This technique allows for a uniform and lossless analysis across all parameter values, both in the finite regime and in the limit. (Joint work with Paul Dütting and Felix Fischer)
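
A quick numerical look at the limiting bound stated above (helper name mine):

    import math

    def limit_gap(k):
        """Limiting bound, as n grows, on the dynamic-vs-static revenue gap
        for regular distributions: 1 / (1 - k^k / (e^k * k!))."""
        return 1.0 / (1.0 - k**k / (math.e**k * math.factorial(k)))

    for k in (1, 2, 5, 20):
        print(k, round(limit_gap(k), 4))

For k = 1 this recovers the familiar e/(e−1) ≈ 1.582, and the bound tends to 1 as k grows: with many identical items, a single static price loses almost nothing.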

Maximilian Stahlberg (COGA/DO)
Coordinates:

Title. Robust conic optimization in Python (own result)

Abstract.

I present an extension of the Python-embedded optimization modeling language PICOS that makes key results from the fields of robust and distributionally robust optimization available to application developers. While the Python ecosystem already offers multiple frameworks concerned with mathematical optimization, library support for optimization under uncertainty has been sparse in comparison. My talk opens with a brief introduction to the modeling interface provided by PICOS, then leads into a discussion of the recent additions that allow uncertainty in the data to be modelled explicitly and accounted for, assuming worst-case outcomes, during the solution process. In particular, I present an illustrative application in linear signal estimation where distributionally robust optimization can be seen to perform quite well.

Thomas Kerdreux (IOL)
Coordinates:

Title. Set Curvature in Machine Learning

Abstract.

From global strong convexity assumptions to local Łojasiewicz gradient inequalities, functional structures are ubiquitous in Machine Learning. They successfully describe convergence properties of optimization methods, regret bounds of online methods, and generalization bounds. However, much less attention has been dedicated to similar structures in the optimization constraint sets or the online learner's action sets.

Here, we first motivate such an investigation from the perspective of projection-free methods (e.g., Conditional Gradients in offline optimization). We then describe how global and local set uniform convexity, i.e., a natural quantitative notion of upper curvature, can help obtain faster convergence rates or regret bounds for online or bandit algorithms. In doing so, we explore some useful characterizations of these sets, leveraging tools developed to study Banach spaces. We finally provide some connections between uniform convexity and upper bounds on some Rademacher constants.

Valentin Hartmann
Coordinates:

Title. Differential Privacy for Machine Learning

Abstract.

Since its invention in 2006, differential privacy (DP) has become the de facto standard for releasing aggregate information about data in a privacy-preserving way. It has not only found its way into companies such as Google, Microsoft, and Apple, but is also used by government agencies such as the U.S. Census Bureau. In this talk, I will give an introduction to the concept of DP. I will further describe the most important building blocks that can be used to construct differentially private machine learning models that are safe against, e.g., training data extraction attacks.

Daniel Schmidt (Waldschmidt) (COGA)
Coordinates:

Title. Scheduling under Contact Restrictions—A Problem Arising in Pandemics (own results)

Abstract.

In pandemic times, prohibiting close proximity between people is crucial to control the spread of the virus. This is particularly interesting in test and vaccination centers, where each person should keep sufficient distance from other people during their transits to and from their designated rooms to prevent possible infections. To model this as a machine scheduling problem, each job has a transit time before and after its processing time. The task is to find a conflict-free schedule of minimum makespan in which the transit times of no two jobs on machines in close proximity (captured by an undirected graph) intersect. In this talk, we discuss hardness as well as approximation results for the case of identical jobs.

Guillaume Sagnol (COGA)
Coordinates:

Title. Greedy Batch-Scheduling (own results)

Abstract.

In this short talk, we will discuss some ongoing work on the Min Weighted Sum Bin Packing problem. The goal is to assign n items (with weight wj and size sj) to bins, without exceeding the bin capacities, where assigning item j to the i-th bin incurs a cost of i·wj. This can also be understood as a (batch-)scheduling problem, in which all jobs have unit processing time, the machine can process batches of jobs satisfying a knapsack constraint, and the goal is to minimize the weighted sum of completion times.

A PTAS is known for this problem, but its astronomical running time makes it useless in practice. Apart from that, the First-Fit algorithm that considers items in non-increasing order of wj/sj is known to have a tight performance guarantee of 2, and there is a greedy algorithm with a performance guarantee of 4 for a slight generalization of this problem, in which the feasible batches are given explicitly by a family of subsets of items. We will show that in the considered case (batches defined implicitly by a knapsack constraint), the performance guarantee of the greedy algorithm can be improved to 2.
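
A sketch of the First-Fit rule with the tight guarantee of 2 mentioned above, written for the plain bin-capacity special case (an illustration; sizes assumed positive):

    def first_fit(items, capacity):
        """First-Fit for Min Weighted Sum Bin Packing. `items` holds (w, s)
        pairs; assigning an item of weight w to the i-th bin costs i * w,
        so items that are heavy per unit of size should land in early bins."""
        bins = []  # each bin is [remaining_capacity, list_of_items]
        for w, s in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
            for b in bins:                 # first bin with enough room
                if b[0] >= s:
                    b[0] -= s
                    b[1].append((w, s))
                    break
            else:                          # no bin fits: open a new one
                bins.append([capacity - s, [(w, s)]])
        cost = sum(i * w for i, (_, content) in enumerate(bins, start=1)
                   for w, _ in content)
        return bins, cost

    print(first_fit([(3, 2), (2, 2), (1, 1)], capacity=3))   # cost 8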

Mathieu Besançon
Coordinates:

Title. Differentiable Optimization & Integration with Automatic Differentiation (survey)

Abstract.

Differentiable optimization aims at bringing constrained convex optimization to differentiable programming, allowing the computation of solution derivatives with respect to arbitrary input parameters. After an overview of some of the advances in that direction, we will see how to integrate differentiable optimization as a layer in differentiable models and its interplay with automatic differentiation, and conclude with some comparisons with other approaches.

Sven Jäger
Coordinates:

Title. Online Scheduling of Deterministic and Stochastic Jobs on Unrelated Machines

Abstract.

We consider the problem of scheduling jobs on parallel machines so as to minimize the sum of weighted job completion times. We assume that each job must be completely processed on one machine and that the time required may depend arbitrarily on the job and the machine. Furthermore, the jobs arrive online over time. In the first part of the talk, we consider preemptive scheduling of deterministic jobs. I will use dual fitting to prove that a simple greedy algorithm for this problem is 4-competitive. The second part is about non-preemptive scheduling of stochastic jobs. Here, a similar greedy algorithm for the assignment of jobs to machines, combined with known policies for scheduling the jobs assigned to each machine (Schulz, 2008), leads to stochastic online policies with performance guarantees depending on an upper bound 𝛥 for the squared coefficients of variation of the random variables. These improve upon the previously best known guarantees (Gupta, Moseley, Uetz, & Xie) when 𝛥 is large enough (for the simplest variant considered, when 𝛥 ≥ 0.815). This includes the important case of exponentially distributed processing times.

Philipp Warode
Coordinates:

Title. Parametric Computation of Minimum Cost Flows (own results)

Abstract.

Minimum cost flows have numerous applications, e.g., electrical flows, gas flows, or equilibrium flows in traffic networks. In the classical minimum cost flow problem we are given flow-dependent cost functions on the edges and demand values on the vertices. In this setting we want to find a flow with minimal costs satisfying the demands. However, in practice, demand and supply values are not constant but may vary or be uncertain. Therefore, we want to analyze minimum cost flows as a function of the demand values. More specifically, we consider a setting where the demands in the network are given as a function depending on a one-dimensional parameter 𝜆.

We analyze these functions theoretically and use the insights to develop an output-polynomial algorithm that computes the minimum cost flow function explicitly for piecewise-quadratic edge costs and demands that are piecewise-linear functions of 𝜆. Further, we can generalize this approach to compute approximate solutions for arbitrary convex cost functions. The algorithm is tested on real-world gas and traffic networks.

This is joint work with Max Klimm and Per Joachims.

Shpresim Sadiku (IOL)
Coordinates:

Title. Neural Network Approximation Theory (own results)

Abstract.

We review classical and modern results in approximation theory of neural networks. First, the density of neural networks within different function spaces under various assumptions on the activation function is considered. Next, lower and upper bounds on the order of approximation with neural networks are given based on the input dimension, the number of neurons and a parameter quantifying the smoothness of the target function. Lastly, a family of compositional target functions for which the curse of dimensionality can be overcome using deep neural networks is examined.

Kartikey Sharma (IOL)
Coordinates:

Title. Robust Optimization and Learning (own results)

Abstract.

Robust Optimization is a popular paradigm for optimisation under uncertainty. It assumes that the uncertainty is specified by a set and aims to optimise the worst-case objective over all elements in the set.

However, in many problems the uncertainty changes over time as more information comes in. In this project, our goal is to develop an adaptive learning algorithm that can simultaneously learn the uncertainty set and solve the optimisation problem. We achieve this via statistical updates and by leveraging online learning algorithms.

This is joint work with Kevin-Martin Aigner, Kristin Braun, Frauke Liers and Sebastian Pokutta.

Davide Lofano (COGA)
Coordinates:

Title. Contractibility vs Collapsibility (survey & own results)

Abstract.

Since the beginning of topology, one of the most used approaches to study an object has been to triangulate it. Many invariants have been introduced over the years, the two most important surely being homotopy and homology. However, the direct computation of these invariants is infeasible for triangulations with a large number of faces. This is why methods to reduce the number of faces of a complex without changing its homology and homotopy type are particularly important. In this talk, we will focus on various ways to see if a complex is contractible, starting from collapsibility and then trying to generalize the concept.

Arturo Merino
Coordinates:

Title. Efficient generation of rectangulations via permutation languages (own results)

Abstract.

Generic rectangulations are tilings of the unit square by rectangles such that no 4 rectangles share a corner. Generic rectangulations are a fundamental object in computer science: they not only appear in practice (e.g. in VLSI circuit design) and theory (e.g. being closely related to binary search trees), but also arise within many other fundamental combinatorial objects, such as permutations. In this work, we present a versatile algorithmic framework for exhaustively generating a large variety of different classes of generic rectangulations. Our algorithms work under very mild assumptions and apply to a considerable number of interesting rectangulation classes known from the literature, such as generic rectangulations, diagonal rectangulations, S-equivalent rectangulations, 1-sided/area-universal rectangulations, and their guillotine variants. The resulting generation algorithms are efficient, in most interesting cases even loopless or constant amortized time, i.e., each new rectangulation is generated in constant time in the worst case or on average, respectively.

In this talk we will show:

  • A simple introduction to rectangulations and some of their interesting classes,
  • Our algorithmic generation framework for rectangulation classes and
  • Correctness proof (sketch) of the framework by relating rectangulations to permutations.

This is joint work with Torsten Mütze.

Boro Sofranac (IOL)
Coordinates:

Title. Accelerating Domain Propagation: an Efficient GPU-Parallel Algorithm over Sparse Matrices (own results)

Abstract.

Fast domain propagation of linear constraints has become a crucial component of today’s best algorithms and solvers for mixed integer programming and pseudo-boolean optimization to achieve peak solving performance. Irregularities in the form of dynamic algorithmic behaviour, dependency structures, and sparsity patterns in the input data make efficient implementations of domain propagation on GPUs and, more generally, on parallel architectures challenging. This is one of the main reasons why domain propagation in state-of-the-art solvers is single-threaded. In this paper, we present a new algorithm for domain propagation which (a) avoids these problems and allows for an efficient implementation on GPUs, and is (b) capable of running propagation rounds entirely on the GPU, without any need for synchronization or communication with the CPU. We present extensive computational results which demonstrate the effectiveness of our approach and show that ample speedups are possible on practically relevant problems: on state-of-the-art GPUs, our geometric mean speed-up for reasonably-large instances is around 10x to 20x and can be as high as 195x on favorably-large instances.

Alejandro Carderera (Georgia Tech & IOL)
Coordinates:

Title. Local Acceleration of Conditional Gradients (own results)

Abstract.

Conditional gradients (CG) constitute a class of projection-free first-order algorithms for smooth convex optimization. As such, they are frequently used in solving smooth convex optimization problems over polytopes, for which the computational cost of projections is prohibitive. We will discuss two topics in this talk. First of all, CG algorithms do not enjoy the optimal convergence rates achieved by projection-based first-order accelerated methods; achieving such globally-accelerated rates is information-theoretically impossible. To address this issue, we present Locally Accelerated Conditional Gradients – an algorithmic framework that couples accelerated steps with conditional gradient steps to achieve local acceleration on smooth strongly convex problems. Our approach does not require projections onto the feasible set, but only on (typically low-dimensional) simplices, thus keeping the computational cost of projections at bay. Further, it achieves optimal accelerated local convergence.

The second topic in this talk deals with the setting where first and second-order information about the function is costly to compute, and so we would like to make as much primal progress as possible per oracle call. Constrained second-order convex optimization algorithms are the method of choice when a high accuracy solution to a problem is needed, due to their local quadratic convergence. These algorithms require the solution of a constrained quadratic subproblem at every iteration. In this setting, we present the Second-Order Conditional Gradient Sliding (SOCGS) algorithm, which uses a projection-free algorithm to solve the constrained quadratic subproblems inexactly and uses inexact Hessian oracles (subject to an accuracy requirement). When the feasible region is a polytope, and a lower bound on the primal gap is known, the algorithm converges quadratically in primal gap after a finite number of linearly convergent iterations. Once in the quadratic regime the SOCGS algorithm requires O(log log (1/ε)) first-order and inexact Hessian oracle calls and O(log (1 / ε) ⋅ log log (1/ε)) linear minimization oracle calls to achieve an ε-optimal solution.

Rico Raber (COGA)
Coordinates:

Title. Multidimensional Packing under Convex Quadratic Constraints (own results)

Abstract.

We consider a general class of binary packing problems with multiple convex quadratic knapsack constraints. We present three constant-factor approximation algorithms based upon three different algorithmic techniques: (1) a rounding technique tailored to a convex relaxation in conjunction with a non-convex relaxation; (2) a greedy strategy; and (3) a randomized rounding method. The practical performance is tested by a computational study of the empirical approximation of these algorithms for problem instances arising in the context of real-world gas transport networks.
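
In generic form (notation mine, not the paper's), the problems considered are presumably of the shape

\[
  \max\; w^\top x \quad \text{s.t.} \quad x^\top A_i x + b_i^\top x \le c_i \;\; (i = 1, \dots, m), \quad x \in \{0,1\}^n,
\]

with each A_i positive semidefinite, so that every knapsack constraint is convex quadratic.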

Elias Wirth (IOL)
Coordinates:

Title. Learning Relations From Data With Conditional Gradients (own results)

Abstract.

The learning of relations, dynamics, and governing equations from (sampled) data is a prevalent problem in various scientific fields. In the absence of noise, this task can often be solved using the Buchberger-Möller Algorithm, see Möller and Buchberger (1982). In general, however, when we do not have direct access to aforementioned equations, we have access to them indirectly only via noisy sampling. Known methods which extract relations from noisy data are the Approximate Vanishing Ideal Algorithm (AVI) and Vanishing Component Analysis (VCA), see Heldt et al. (2009) and Livni et al. (2013), respectively. These algorithms are often computationally inefficient and only allow for restricted recovery of relations from data. We present a new AVI algorithm and two variants utilizing Conditional Gradient Descent or the Frank-Wolfe Algorithm, see, e.g., Frank and Wolfe (1956) or Levitin and Polyak (1966). Our method is more general, easily regularizable, and can also extract non-homogeneous relations. We then present the efficacy and versatility of our approach for three learning problems: reconstruction of governing equations from data, classification, and regression.

Stephan Wäldchen
Coordinates:

Title. Understanding Neural Network Decisions is Hard: From Probabilistic Prime Implicants to Arc Bending (own results)

Abstract.

This talk is a basic overview of my PhD projects. I focus on a project on making neural networks interpretable. If a neural network outputs a classification result for a high-dimensional input, we would like to know whether a small set of the input parameters was already decisive for the classification. This problem is formalised by introducing a probabilistic variant of prime implicants. We show that this problem is hard for the class NP^PP and remains NP-hard for any non-trivial approximation. We present a heuristic solution strategy for real-world applications that is built on Assumed Density Filtering, a technique to propagate multivariate normal distributions through neural networks. To justify the use of this heuristic, we show that in fact no sensible family of distributions can be invariant under ReLU neural network layers.

Thomas Kerdreux (IOL)
Coordinates:

Title. The artification of the so-called A.I. art and the creative industry (discussion)

Abstract.

In Computer Vision, algorithms are now capable of “intelligent” automatic processing of images. These possibilities have recently been recycled in many creative processes by A.I. artists, whose outputs have sometimes reached hammer prices in art auctions. This emergence of A.I. art looks like an artification, i.e., a societal process by which a practice becomes commonly acknowledged as an art form. Art history abounds with such examples, e.g., the transition from craftsmanship to solo painters in the Renaissance.

Hence, rather than assessing the artist’s narratives, this point of view allows us to understand A.I. art as a figurehead of the broader ecosystem of the creative industry, which has indeed benefited greatly from recent advances in Computer Vision.

We discuss here the possible intertwinements between A.I. art, the use of Computer Vision in the next generation of creative software, the creative industry, and broader societal consequences.

This is ongoing work with Louis Thiry and Léa Saint-Raymond at Ecole Normale Supérieure in Paris.

Khai Van Tran (TU Berlin)
Coordinates:

Title. Improved Bounds on the Competitive Ratio for Symmetric Rendezvous-on-the-Line with Unknown Initial Distance (own results)

Abstract.

During the COGA Retreat 2019 we started research on the Symmetric Rendezvous-on-the-Line Problem with Unknown Initial Distance: two robots are placed on the real line at an unknown initial distance from each other. They do not know the location of the other robot and do not necessarily share the same sense of “left” and “right” on the line. They move with constant speed over the line and try to achieve rendezvous using the same (randomized) search strategy in as little time as possible.

In a March 2020 talk (given in the COGA Research Seminar) we presented a family of search strategies that performed well in simulations, but we were unable to determine their exact competitive ratio. In this talk we will present a novel family of search strategies whose precise competitive ratio can be computed using generating functions.

This is work in progress by Guillaume, Khai Van, Martin, and Max.

Christoph Hertrich (TU Berlin) [homepage]
Coordinates:

Title. Computing the Maximum Function with ReLU Neural Networks (own results)

Abstract.

This talk deals with an open problem concerning the expressivity of feedforward neural networks (NNs) with rectified linear unit (ReLU) activations. While it has been shown that every continuous, piecewise linear (CPWL) function can be exactly represented by such an NN with logarithmic depth (with respect to the input dimension), it is still unknown whether there exists any CPWL function that cannot be represented with three layers. Based on a yet unproven theoretical assumption, we show a computer-aided proof of non-existence of a 3-layer ReLU NN that computes the maximum of five scalar input numbers. We achieve this partial result by converting the existence question into a linear programming feasibility problem. This is joint work in progress with Amitabh Basu, Marco Di Summa, and Martin Skutella.
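
For context, the standard two-layer identity behind the logarithmic-depth construction mentioned above: with σ(z) = max{0, z},

\[
  \max\{x, y\} \;=\; \sigma(x - y) + \sigma(y) - \sigma(-y),
\]

since σ(y) − σ(−y) = y. Nesting this identity computes the maximum of 2^k numbers with depth growing like k; the open question is whether the maximum of five numbers can still be squeezed into three layers.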

Cyrille Combettes (Georgia Tech & IOL) [homepage]
Coordinates:

Title. Frank–Wolfe with New and Practical Descent Directions (own results)

Abstract.

The Frank-Wolfe algorithm has become a popular constrained optimization algorithm because it is simple and projection-free, and it has been successfully applied to a variety of real-world problems. Its main drawback, however, lies in its convergence rate, which can be excessively slow due to naive descent directions. In this talk, we provide a brief review and propose two methods that significantly improve its performance: one for minimizing a convex objective and one for minimizing a (non)convex finite-sum objective. The first method speeds up the Frank-Wolfe algorithm by better aligning the descent direction with that of the negative gradient via a matching pursuit-style subroutine. Although the idea is reasonably natural, its advantage over the state-of-the-art Frank-Wolfe algorithms is quite considerable. The second method blends stochastic Frank-Wolfe algorithms with adaptive gradients, which set entry-wise step-sizes that automatically adjust to the geometry of the problem. Computational experiments on convex and nonconvex objectives demonstrate the advantage of our approach.
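
For reference, the vanilla Frank-Wolfe loop that both methods build on; its only access to the feasible region is a linear minimization oracle (`lmo` below), which is what makes it projection-free (a minimal sketch, not the talk's improved variants):

    import numpy as np

    def frank_wolfe(grad, lmo, x0, steps=200):
        """Vanilla Frank-Wolfe. `grad(x)` returns the gradient at x and
        `lmo(g)` returns argmin_{v in C} <g, v> over the feasible set C."""
        x = x0.astype(float)
        for t in range(steps):
            v = lmo(grad(x))                 # descent vertex from the oracle
            gamma = 2.0 / (t + 2)            # classic open-loop step size
            x = (1 - gamma) * x + gamma * v  # convex combination stays feasible
        return x

    # Example: minimize ||x - p||^2 over the probability simplex; the LMO
    # picks the vertex e_i of the most negative gradient coordinate.
    p = np.array([0.2, 1.3, -0.1])
    lmo = lambda g: np.eye(len(g))[np.argmin(g)]
    x = frank_wolfe(lambda x: 2 * (x - p), lmo, np.ones(3) / 3)
    print(np.round(x, 3))   # close to the projection (0, 1, 0)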

Talks by earlier visitors

Gonzalo Muñoz (Universidad de O’Higgins, Chile) [homepage]

Coordinates:

Title. A Sample Size-Efficient Polyhedral Encoding and Algorithm for Deep Neural Network Training Problems

Abstract.

Deep Learning has received much attention lately; however, results regarding the computational complexity of training deep neural networks have only recently been obtained. Moreover, all current training algorithms for DNNs with optimality guarantees possess a poor dependency on the sample size, which is typically the largest parameter. In this work, we show that the training problems of large classes of deep neural networks with various architectures admit a polyhedral representation whose size is linear in the sample size. This provides the first theoretical training algorithm with provable worst-case optimality guarantees whose running time is polynomial in the size of the sample.

Ariel Liebman (Monash University, Australia) [homepage]

Coordinates:

Title. Optimisation, Machine Learning and AI for Rapid Grid Decarbonisation

Abstract.

The national and transcontinental electricity grids of today are based on devices such as coal furnaces, steam turbines, copper and steel wires, electric transformers, and electromechanical power switches that have remained unchanged for 100 years. Almost imperceptibly, however, the components and operational management of this great machine, the grid, have begun to change irreversibly. This is fortunate, as climate science tells us we must reduce CO2 emissions from the energy sector to zero by 2050, and to 50% of current levels by 2030, if we are to prevent dangerous climate change in a future world that is over 1.5 degrees hotter than today. Utility-scale wind and solar PV farms as large as coal, gas, and nuclear generators are now being deployed more cheaply than it is possible to build and operate generators using older technologies; in some cases, these new technologies can even be cheaper than merely the operating costs of the older ones. In addition, low-cost rooftop solar PV has enabled consumers to become self-suppliers and contributors to the supply of energy for their neighbours. Moreover, the “dumb” grid of the past is becoming “smarter”, enabled by a combination of ubiquitous low-cost telecommunication and programmable devices at the edge of the grid such as smart meters, smart PV inverters, smart air conditioners, and home energy management systems. The final component is the electrification of the private transport system, which will finally eliminate the need for fossil fuels. The implication is that it is now necessary to rapidly replan and reinvest in the energy system at rates, and in ways, that are unprecedented in the history of industrial civilisation. While the majority of the hardware technology already exists, the missing piece of the puzzle is the new computer science technology, particularly the Optimisation, Machine Learning, Forecasting and Data Analytics methods needed to plan and operate this rapidly transforming system.

In this talk I will describe a range of ways in which we and others are enhancing existing computer science tools in Optimisation, AI, ML, and other areas in order to better operate and plan the existing power system. I will focus on identifying emerging research opportunities in areas that are needed to complete the transformation to a cheaper, smarter, and zero-carbon energy system.

Masashi Sugiyama (RIKEN-AIP/University of Tokyo, Japan) [homepage]

Coordinates:

Title. Introduction of RIKEN Center for Advanced Intelligence Project

Abstract.

RIKEN is one of Japan’s largest fundamental-research institutions. The RIKEN Center for Advanced Intelligence Project (AIP) was created in 2016 to propose and investigate new machine learning methodologies and algorithms, and to apply them to societal problems. AIP covers a wide range of topics, from generic AI research (machine learning, optimization, applied math, etc.) and goal-oriented AI research (material, disaster, cancer, etc.) to AI-in-society research (ethics, data circulation, laws, etc.). In the first half of my talk, I will briefly introduce our center’s activities and collaboration strategies.

Then, in the latter half, I will talk about the research activities in my team, i.e., machine learning from imperfect information. Machine learning has been successfully used to solve various real-world problems. However, we are still facing many technical challenges: for example, we want to train machine learning systems without big labeled data, and we want to reliably deploy machine learning systems in noisy environments. I will give an overview of our recent advances in tackling these problems.