Joint IOL and COGA research seminar

List of talks for the joint weekly seminar with the Combinatorial Optimization & Graph Algorithms (COGA) group at TU Berlin in Fall 2020. See the official homepage for more information and contact Christoph Hertrich for Zoom links and other questions; for questions about this page, please use the contact address below.

We meet every Thursday at 14:00 in room MA 517 at TUB when the pandemic situation allows, otherwise online.

Until further notice, the seminar takes place online.

Mathieu Besançon (IOL)
Coordinates:

Title. Differentiable Optimization & Integration with Automatic Differentiation (survey)

Abstract.

Differentiable optimization aims at bringing constrained convex optimization to differentiable programming, allowing the computation of solution derivatives with respect to arbitrary input parameters. After an overview of some of the advances in that direction, we will see how to integrate differentiable optimization as a layer in differentiable models, discuss its interplay with automatic differentiation, and conclude with some comparisons with other approaches.
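As a minimal illustration of solution differentiation (a toy sketch, not the methods surveyed in the talk): for an equality-constrained quadratic program whose linear term depends on a scalar parameter θ, the derivative of the optimal solution with respect to θ follows from differentiating the KKT system. All data and names below are made up.

```python
# Toy sketch: differentiate the solution of an equality-constrained QP
#     min_x 0.5 x^T Q x + c(theta)^T x   s.t.  A x = b,   with c(theta) = c0 + theta * d
# with respect to theta via the KKT (implicit function) system. Hypothetical data.
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 2
M = rng.standard_normal((n, n))
Q = M @ M.T + n * np.eye(n)          # positive definite objective matrix
A = rng.standard_normal((m, n))      # equality constraints A x = b
b = rng.standard_normal(m)
c0, d = rng.standard_normal(n), rng.standard_normal(n)

K = np.block([[Q, A.T], [A, np.zeros((m, m))]])   # KKT matrix

def solve_qp(theta):
    """Primal solution x*(theta) from the KKT linear system."""
    rhs = np.concatenate([-(c0 + theta * d), b])
    return np.linalg.solve(K, rhs)[:n]

# Implicit differentiation: K [dx/dtheta; dnu/dtheta] = [-d; 0]
dx_dtheta = np.linalg.solve(K, np.concatenate([-d, np.zeros(m)]))[:n]

# Check against finite differences (agreement up to roundoff)
eps = 1e-6
fd = (solve_qp(0.5 + eps) - solve_qp(0.5 - eps)) / (2 * eps)
print(np.max(np.abs(dx_dtheta - fd)))
```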

Sven Jäger (COGA)
Coordinates:

Title. Online Scheduling of Deterministic and Stochastic Jobs on Unrelated Machines

Abstract.

We consider the problem of scheduling jobs on parallel machines so as to minimize the sum of weighted job completion times. We assume that each job must be completely processed on one machine and that the time required may depend arbitrarily on the job and the machine. Furthermore, the jobs arrive online over time. In the first part of the talk, we consider preemptive scheduling of deterministic jobs. I will use dual fitting to prove that a simple greedy algorithm for this problem is 4-competitive. The second part is about non-preemptive scheduling of stochastic jobs. Here, a similar greedy algorithm for the assignment of jobs to machines, combined with known policies for scheduling the jobs assigned to each machine (Schulz, 2008), leads to stochastic online policies whose performance guarantees depend on an upper bound 𝛥 on the squared coefficients of variation of the processing times. These improve upon the previously best known guarantees (Gupta, Moseley, Uetz, & Xie) when 𝛥 is large enough (for the simplest variant considered, when 𝛥 ≥ 0.815). This includes the important case of exponentially distributed processing times.
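To fix ideas, here is a toy, non-preemptive greedy assignment rule in the spirit described: each arriving job is placed on the machine where it increases the total weighted completion time the least, with each machine sequencing its jobs in WSPT order. This simplified sketch ignores release dates and is not the algorithm analyzed in the talk.

```python
# Toy greedy assignment sketch (hypothetical instance; simplified, non-preemptive,
# release dates ignored; not the algorithm or analysis from the talk).

def wsum_completion(jobs):
    """Total weighted completion time of one machine's jobs under WSPT order.
    jobs: list of (processing_time, weight)."""
    jobs = sorted(jobs, key=lambda pw: pw[0] / pw[1])   # WSPT: smallest p/w first
    t, total = 0.0, 0.0
    for p, w in jobs:
        t += p
        total += w * t
    return total

def greedy_assign(arrivals, num_machines):
    """arrivals: list of (weight, [p_1j, ..., p_mj]) in order of arrival."""
    machines = [[] for _ in range(num_machines)]
    for w, p in arrivals:
        # increase in weighted completion time if job is placed on machine i
        costs = [wsum_completion(machines[i] + [(p[i], w)]) - wsum_completion(machines[i])
                 for i in range(num_machines)]
        best = min(range(num_machines), key=lambda i: costs[i])
        machines[best].append((p[best], w))
    return machines, sum(wsum_completion(m) for m in machines)

arrivals = [(2.0, [3.0, 5.0]), (1.0, [4.0, 1.0]), (3.0, [2.0, 6.0])]
print(greedy_assign(arrivals, num_machines=2)[1])
```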

Philipp Warode
Coordinates:

Title. Parametric Computation of Minimum Cost Flows (own results)

Abstract.

Minimum cost flows have numerous applications, e.g., electricity and gas flows or equilibrium flows in traffic networks. In the classical minimum cost flow problem we are given flow-dependent cost functions on the edges and demand values on the vertices. In this setting we want to find a flow with minimal costs satisfying the demands. However, in practice, demand and supply values are not constant but may vary or be uncertain. Therefore, we want to analyze minimum cost flows as a function of the demand values. More specifically, we consider a setting where the demands in the network are given as a function depending on a one-dimensional parameter 𝜆. We analyze these functions theoretically and use the insights to develop an output-polynomial algorithm that computes the minimum cost flow function explicitly for piecewise-quadratic edge costs and demands that are piecewise-linear functions of 𝜆. Further, we can generalize this approach to compute approximate solutions for arbitrary convex cost functions. The algorithm is tested on real-world gas and traffic networks. This is joint work with Max Klimm and Per Joachims.
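As a brute-force baseline that illustrates what "the minimum cost flow as a function of 𝜆" means (not the output-polynomial algorithm from the talk): for a tiny uncapacitated network with quadratic edge costs and node balances linear in 𝜆, the optimal flow for each 𝜆 can be read off a KKT linear system. The network and data below are made up.

```python
# Brute-force baseline sketch: quadratic edge costs c_e(x) = 0.5*a_e*x^2 + b_e*x,
# no capacities, node balances b(lam) = b0 + lam*db (summing to zero). The min-cost
# flow is an equality-constrained QP; we solve its KKT system on a grid of lam values.
import numpy as np

# Nodes 0,1,2; edges (0,1), (1,2), (0,2). Rows = nodes, columns = edges.
B = np.array([[ 1,  0,  1],
              [-1,  1,  0],
              [ 0, -1, -1]], dtype=float)
a = np.array([2.0, 1.0, 3.0])            # quadratic cost coefficients per edge
bl = np.array([0.0, 1.0, 0.5])           # linear cost coefficients per edge
b0, db = np.array([1.0, 0.0, -1.0]), np.array([1.0, 0.0, -1.0])

Bred = B[:-1]                            # drop one conservation row (balances sum to 0)
K = np.block([[np.diag(a), Bred.T], [Bred, np.zeros((2, 2))]])

def min_cost_flow(lam):
    rhs = np.concatenate([-bl, (b0 + lam * db)[:-1]])
    x = np.linalg.solve(K, rhs)[:3]      # negative entries = flow against edge orientation
    return x, np.sum(0.5 * a * x**2 + bl * x)

for lam in np.linspace(0.0, 2.0, 5):
    x, cost = min_cost_flow(lam)
    print(f"lambda={lam:.1f}  flow={np.round(x, 3)}  cost={cost:.3f}")
```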

Shpresim Sadiku (IOL)
Coordinates:

Title. Neural Network Approximation Theory (own results)

Abstract.

We review classical and modern results in approximation theory of neural networks. First, the density of neural networks within different function spaces under various assumptions on the activation function is considered. Next, lower and upper bounds on the order of approximation with neural networks are given based on the input dimension, the number of neurons and a parameter quantifying the smoothness of the target function. Lastly, a family of compositional target functions for which the curse of dimensionality can be overcome using deep neural networks is examined.
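Two representative bounds of the kind surveyed, stated informally and with constants and logarithmic factors omitted (standard results from the literature, not specific claims of the talk):

```latex
% (1) Curse of dimensionality: for target functions f of smoothness s on [0,1]^d,
%     ReLU networks \Phi with n nonzero weights achieve, up to log factors,
\begin{align*}
  \| f - \Phi \|_{L^\infty([0,1]^d)} &= O\!\left( n^{-s/d} \right), \\
% (2) Barron class: if the Fourier moment
%     C_f = \int \lVert \omega \rVert \, \lvert \hat f(\omega) \rvert \, d\omega
%     is finite, one hidden layer with n neurons already gives a dimension-independent rate
  \| f - \Phi_{\text{shallow}} \|_{L^2(\mu)} &= O\!\left( C_f \, n^{-1/2} \right).
\end{align*}
```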

Kartikey Sharma (IOL)
Coordinates:

Title. Robust Optimization and Learning (own results)

Abstract.

Robust Optimization is a popular paradigm for optimization under uncertainty. It assumes that the uncertainty is specified by a set and aims to optimize the worst-case objective over all elements in the set. However, in many problems the uncertainty changes over time as more information becomes available. In this project, our goal is to develop an adaptive learning algorithm that can simultaneously learn the uncertainty set and solve the optimization problem. We achieve this through statistical updates and by leveraging online learning algorithms. This is joint work with Kevin-Martin Aigner, Kristin Braun, Frauke Liers and Sebastian Pokutta.

Davide Lofano (COGA)
Coordinates:

Title. Contractibility vs Collapsibility (survey & own results)

Abstract.

Since the beginnings of topology, one of the most common approaches to studying an object has been to triangulate it. Many invariants have been introduced over the years, the two most important surely being homotopy and homology. However, the direct computation of these invariants is infeasible for triangulations with a large number of faces. This is why methods that reduce the number of faces of a complex without changing its homology and homotopy type are particularly important. In this talk, we will focus on various ways to check whether a complex is contractible, starting from collapsibility and then trying to generalize the concept.

Arturo Merino (COGA)
Coordinates:

Title. Efficient generation of rectangulations via permutation languages (own results)

Abstract.

Generic rectangulations are tilings of the unit square by rectangles such that no 4 rectangles share a corner. Generic rectangulations are a fundamental object in computer science: they not only appear in practice (e.g., in VLSI circuit design) and in theory (e.g., they are closely related to binary search trees), but also arise within many other fundamental combinatorial objects such as permutations. In this work, we present a versatile algorithmic framework for exhaustively generating a large variety of different classes of generic rectangulations. Our algorithms work under very mild assumptions and apply to a considerable number of interesting rectangulation classes known from the literature, such as generic rectangulations, diagonal rectangulations, S-equivalent rectangulations, 1-sided/area-universal rectangulations, and their guillotine variants. The resulting generation algorithms are efficient, in most interesting cases even loopless or constant amortized time, i.e., each new rectangulation is generated in constant time in the worst case or on average, respectively.

In this talk we will present:

  • A brief introduction to rectangulations and some of their interesting classes,
  • Our algorithmic generation framework for rectangulation classes, and
  • A sketch of the correctness proof of the framework, relating rectangulations to permutations.

This is joint work with Torsten Mütze.

Boro Sofranac (IOL)
Coordinates:

Title. Accelerating Domain Propagation: an Efficient GPU-Parallel Algorithm over Sparse Matrices (own results)

Abstract.

Fast domain propagation of linear constraints has become a crucial component of today’s best algorithms and solvers for mixed integer programming and pseudo-Boolean optimization to achieve peak solving performance. Irregularities in the form of dynamic algorithmic behaviour, dependency structures, and sparsity patterns in the input data make efficient implementations of domain propagation on GPUs and, more generally, on parallel architectures challenging. This is one of the main reasons why domain propagation in state-of-the-art solvers is single-threaded. In this paper, we present a new algorithm for domain propagation which (a) avoids these problems and allows for an efficient implementation on GPUs, and (b) is capable of running propagation rounds entirely on the GPU, without any need for synchronization or communication with the CPU. We present extensive computational results which demonstrate the effectiveness of our approach and show that ample speedups are possible on practically relevant problems: on state-of-the-art GPUs, our geometric mean speed-up for reasonably large instances is around 10x to 20x and can be as high as 195x on favorably large instances.
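To make the propagation step concrete, here is a plain sequential sketch of one round of activity-based bound tightening for constraints Ax ≤ b stored in CSR format. This is the textbook rule only; the contribution of the talk is an efficient GPU-parallel organization of such rounds, which the sketch does not attempt.

```python
# Sequential sketch of one round of activity-based domain propagation for A x <= rhs
# over a sparse CSR matrix (standard bound-tightening rule; not the GPU algorithm).
import numpy as np
from scipy.sparse import csr_matrix

def propagate_round(A, rhs, lb, ub):
    """One propagation round; returns (possibly) tightened copies of lb and ub."""
    lb, ub = lb.copy(), ub.copy()
    A = csr_matrix(A)
    for i in range(A.shape[0]):
        start, end = A.indptr[i], A.indptr[i + 1]
        cols, vals = A.indices[start:end], A.data[start:end]
        # minimal activity of row i under the current bounds
        min_act = np.sum(np.where(vals > 0, vals * lb[cols], vals * ub[cols]))
        for j, a in zip(cols, vals):
            # minimal activity of the row excluding variable j
            resid = min_act - (a * lb[j] if a > 0 else a * ub[j])
            bound = (rhs[i] - resid) / a
            if a > 0:
                ub[j] = min(ub[j], bound)   # a*x_j <= rhs_i - resid
            else:
                lb[j] = max(lb[j], bound)   # inequality flips when dividing by a < 0
    return lb, ub

A = np.array([[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]])
rhs = np.array([4.0, 2.0])
lb, ub = np.zeros(3), np.full(3, 10.0)
print(propagate_round(A, rhs, lb, ub))
```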

Alejandro Carderera (Georgia Tech & IOL)
Coordinates:

Title. Local Acceleration of Conditional Gradients (own results)

Abstract.

Conditional gradients (CG) constitute a class of projection-free first-order algorithms for smooth convex optimization. As such, they are frequently used to solve smooth convex optimization problems over polytopes, for which the computational cost of projections is prohibitive. We will discuss two topics in this talk. First of all, CG algorithms do not enjoy the optimal convergence rates achieved by projection-based first-order accelerated methods; achieving such globally accelerated rates is information-theoretically impossible. To address this issue, we present Locally Accelerated Conditional Gradients – an algorithmic framework that couples accelerated steps with conditional gradient steps to achieve local acceleration on smooth strongly convex problems. Our approach does not require projections onto the feasible set, but only onto (typically low-dimensional) simplices, thus keeping the computational cost of projections at bay. Further, it achieves optimal accelerated local convergence. The second topic of this talk deals with the setting where first- and second-order information about the function is costly to compute, so we would like to make as much primal progress as possible per oracle call. Constrained second-order convex optimization algorithms are the method of choice when a high-accuracy solution to a problem is needed, due to their local quadratic convergence. These algorithms require the solution of a constrained quadratic subproblem at every iteration. In this setting, we present the Second-Order Conditional Gradient Sliding (SOCGS) algorithm, which uses a projection-free algorithm to solve the constrained quadratic subproblems inexactly and uses inexact Hessian oracles (subject to an accuracy requirement). When the feasible region is a polytope and a lower bound on the primal gap is known, the algorithm converges quadratically in primal gap after a finite number of linearly convergent iterations. Once in the quadratic regime, the SOCGS algorithm requires O(log log (1/ε)) first-order and inexact Hessian oracle calls and O(log (1/ε) ⋅ log log (1/ε)) linear minimization oracle calls to achieve an ε-optimal solution.
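As an illustration of why the projections needed here remain cheap: projecting onto a (low-dimensional) probability simplex can be done with a standard sort-and-threshold routine. The sketch below shows that routine in isolation; it is a well-known algorithm, not code from the talk.

```python
# Euclidean projection onto the probability simplex {x >= 0 : sum(x) = 1} via the
# standard sort-and-threshold rule. Projections onto such (typically low-dimensional)
# simplices are cheap, which keeps the per-iteration projection cost small.
import numpy as np

def project_to_simplex(y):
    u = np.sort(y)[::-1]                     # sort in decreasing order
    css = np.cumsum(u)
    ks = np.arange(1, len(y) + 1)
    active = u + (1.0 - css) / ks > 0        # indices still "active"
    k = ks[active][-1]                       # largest active index
    tau = (css[k - 1] - 1.0) / k             # threshold
    return np.maximum(y - tau, 0.0)

x = project_to_simplex(np.array([0.5, 1.2, -0.3]))
print(x, x.sum())                            # nonnegative entries summing to 1
```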

Rico Raber (COGA)
Coordinates:

Title. Multidimensional Packing under Convex Quadratic Constraints (own results)

Abstract.

We consider a general class of binary packing problems with multiple convex quadratic knapsack constraints. We present three constant-factor approximation algorithms based upon three different algorithmic techniques: (1) a rounding technique tailored to a convex relaxation in conjunction with a non-convex relaxation; (2) a greedy strategy; and (3) a randomized rounding method. The practical performance is tested in a computational study of the empirical approximation ratios of these algorithms on problem instances arising in the context of real-world gas transport networks.

Elias Wirth (IOL)
Coordinates:

Title. Learning Relations From Data With Conditional Gradients (own results)

Abstract.

The learning of relations, dynamics, and governing equations from (sampled) data is a prevalent problem in various scientific fields. In the absence of noise, this task can often be solved using the Buchberger-Möller Algorithm, see Möller and Buchberger (1982). In general, however, we do not have direct access to the aforementioned equations, but only indirect access through noisy samples. Known methods that extract relations from noisy data are the Approximate Vanishing Ideal Algorithm (AVI) and Vanishing Component Analysis (VCA), see Heldt et al. (2009) and Livni et al. (2013), respectively. These algorithms are often computationally inefficient and only allow for restricted recovery of relations from data. We present a new AVI algorithm and two variants utilizing Conditional Gradient Descent, also known as the Frank-Wolfe Algorithm, see, e.g., Frank and Wolfe (1956) or Levitin and Polyak (1966). Our method is more general, easily regularizable, and can also extract non-homogeneous relations. We then demonstrate the efficacy and versatility of our approach on three learning problems: reconstruction of governing equations from data, classification, and regression.

Stephan Wäldchen
Coordinates:

Title. Understanding Neural Network Decisions is Hard – From Probabilistic Prime Implicants to Arc Bending (own results)

Abstract.

This talk is a basic overview of my PhD projects. I focus on a project on making neural networks interpretable. If a neural network outputs a classification result for a high-dimensional input, we would like to know whether a small set of the input parameters was already decisive for the classification. This problem is formalised by introducing a probabilistic variant of prime implicants. We show that this problem is hard for the complexity class NP^PP and remains NP-hard for any non-trivial approximation. We present a heuristic solution strategy for real-world applications that is built on Assumed Density Filtering, a technique to propagate multivariate normal distributions through neural networks. To justify the use of this heuristic, we show that in fact no sensible family of distributions can be invariant under ReLU neural network layers.
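For intuition on the Assumed Density Filtering step, here is a one-dimensional sketch: the mean and variance of a ReLU applied to a Gaussian have closed forms, and ADF continues with a Gaussian matched to those two moments. These are standard moment identities; the talk's method propagates full multivariate normal distributions through entire layers, which is not reproduced here.

```python
# 1-D sketch of an ADF step through a ReLU: for x ~ N(mu, sigma^2), compute the exact
# mean and variance of max(0, x) in closed form and continue with a Gaussian matched
# to these two moments.
import numpy as np
from scipy.stats import norm

def adf_relu(mu, sigma):
    alpha = mu / sigma
    phi, Phi = norm.pdf(alpha), norm.cdf(alpha)
    mean = mu * Phi + sigma * phi                          # E[max(0, x)]
    second = (mu**2 + sigma**2) * Phi + mu * sigma * phi   # E[max(0, x)^2]
    return mean, np.sqrt(second - mean**2)

# Sanity check against Monte Carlo
mu, sigma = 0.3, 1.5
samples = np.maximum(0.0, np.random.default_rng(0).normal(mu, sigma, 1_000_000))
print(adf_relu(mu, sigma), (samples.mean(), samples.std()))
```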

Thomas Kerdreux (IOL)
Coordinates:

Title. The artification of so-called A.I. art and the creative industry (discussion)

Abstract.

In Computer Vision, algorithms are now capable of “intelligent” automatic processing of images. These possibilities have recently been recycled in many creative processes by A.I. artists, whose outputs have sometimes reached hammer prices in art auctions. This emergence of A.I. art looks like an artification, i.e., a societal process by which a practice becomes commonly acknowledged as an art form. Art history abounds with such examples, e.g., the transition from craftsmanship to solo painters in the Renaissance. Hence, rather than assessing the artists’ narratives, this point of view allows us to understand A.I. art as a figurehead of the broader ecosystem of the creative industry, which has greatly benefited from recent advances in Computer Vision. We discuss here the possible intertwinements between A.I. art, the use of Computer Vision in the next generation of creative software, the creative industry, and broader societal consequences. This is ongoing work with Louis Thiry and Léa Saint-Raymond at the École Normale Supérieure in Paris.

Khai Van Tran (TU Berlin)
Coordinates:

Title. Improved Bounds on the Competitive Ratio for Symmetric Rendezvous-on-the-Line with Unknown Initial Distance (own results)

Abstract.

During the COGA Retreat 2019, we started research on the Symmetric Rendezvous-on-the-Line Problem with Unknown Initial Distance: Two robots are placed on the real line with unknown initial distance. They do not know the location of the other robot and do not necessarily share the same sense of “left” and “right” on the line. They move with constant speed along the line and try to achieve rendezvous, using the same (randomized) search strategy, in as little time as possible. In a March 2020 talk (given in the COGA Research Seminar), we presented a family of search strategies that performed well in simulations, but we were not able to determine their exact competitive ratio. During this talk, we will present a novel family of search strategies whose precise competitive ratio can be computed using generating functions. This is work in progress by Guillaume, Khai Van, Martin, and Max.

Christoph Hertrich (TU Berlin)
Coordinates:

Title. Computing the Maximum Function with ReLU Neural Networks (own results)

Abstract.

This talk deals with an open problem concerning the expressivity of feedforward neural networks (NNs) with rectified linear unit (ReLU) activations. While it has been shown that every continuous, piecewise linear (CPWL) function can be exactly represented by such an NN with logarithmic depth (with respect to the input dimension), it is still unknown whether there exists any CPWL function that cannot be represented with three layers. Based on a yet unproven theoretical assumption, we give a computer-aided proof of the non-existence of a 3-layer ReLU NN that computes the maximum of five scalar input numbers. We achieve this partial result by converting the existence question into a linear programming feasibility problem. This is joint work in progress with Amitabh Basu, Marco Di Summa, and Martin Skutella.
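A tiny sketch of the known positive direction: the maximum of two numbers is exactly representable with one hidden ReLU layer, and composing this gadget pairwise computes the maximum of n numbers with depth logarithmic in n, which is the flavor of the upper bound mentioned above. This is the standard construction, not the talk's computer-aided LP argument about three layers.

```python
# max(a, b) with one hidden ReLU layer, since max(a, b) = relu(a - b) + relu(b) - relu(-b);
# pairwise composition of this gadget yields max of n numbers with depth O(log n).
import numpy as np

relu = lambda z: np.maximum(0.0, z)

def max2(a, b):
    # hidden layer: three ReLU units; output layer: fixed linear combination
    h = relu(np.array([a - b, b, -b]))
    return h @ np.array([1.0, 1.0, -1.0])

def max_all(xs):
    xs = list(xs)
    while len(xs) > 1:                   # pairwise reduction: depth grows like log2(n)
        paired = [max2(xs[i], xs[i + 1]) for i in range(0, len(xs) - 1, 2)]
        leftover = xs[len(xs) - len(xs) % 2:]   # keep an unpaired last element, if any
        xs = paired + leftover
    return xs[0]

print(max2(3.0, 7.5), max_all([4.0, -1.0, 9.0, 2.0, 5.0]))
```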

Cyrille Combettes (Georgia Tech & IOL)
Coordinates:

Title. Frank–Wolfe with New and Practical Descent Directions (own results)

Abstract.

The Frank-Wolfe algorithm has become a popular constrained optimization algorithm because it is simple and projection-free, and it has been successfully applied to a variety of real-world problems. Its main drawback, however, lies in its convergence rate, which can be excessively slow due to naive descent directions. In this talk, we provide a brief review and propose two methods that significantly improve its performance: one for minimizing a convex objective and one for minimizing a (non)convex finite-sum objective. The first method speeds up the Frank-Wolfe algorithm by better aligning the descent direction with that of the negative gradient via a matching pursuit-style subroutine. Although the idea is reasonably natural, its advantage over state-of-the-art Frank-Wolfe algorithms is quite considerable. The second method blends stochastic Frank-Wolfe algorithms with adaptive gradients, which set entry-wise step-sizes that automatically adjust to the geometry of the problem. Computational experiments on convex and nonconvex objectives demonstrate the advantage of our approach.
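For reference, a minimal vanilla Frank-Wolfe loop over the probability simplex, i.e., the baseline whose naive descent directions the two proposed methods improve upon (illustrative only, with a made-up least-squares objective):

```python
# Vanilla Frank-Wolfe over the probability simplex for min_x 0.5*||A x - b||^2.
# Each iteration calls a linear minimization oracle (here: pick the best vertex)
# and moves toward it; no projection onto the feasible set is ever needed.
import numpy as np

rng = np.random.default_rng(0)
m, n = 30, 10
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)

x = np.ones(n) / n                       # start at the barycenter
for t in range(200):
    grad = A.T @ (A @ x - b)
    v = np.zeros(n)
    v[np.argmin(grad)] = 1.0             # LMO over the simplex: best vertex
    gamma = 2.0 / (t + 2)                # standard step size
    x = (1 - gamma) * x + gamma * v      # convex combination stays feasible
print(0.5 * np.linalg.norm(A @ x - b) ** 2, x.min() >= 0, abs(x.sum() - 1) < 1e-9)
```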