Traditionally, data was collected and used to validate a model hypothesis, which in turn, once verified, formed the basis for decision-making and optimization. With the abundance of data and computational power we can, for the first time, afford an integrated perspective of data, learning, and decisions, where data is directly considered in the decision-making problems through state-of-the-art machine learning. Overcoming the traditional approach of optimizing against a proxy provides faster and more relevant decisions as well as a more direct risk-mitigation in decision making.
We are interested in the development of new algorithms at the intersection of optimization and machine learning. A particular focus is on integrating discrete aspects (as decisions can be discrete and continuous), which have traditionally been ignored as they often lead to very hard (almost always NP-hard) non-convex optimization problems. However, in recent years solvers for such discrete optimization problems (e.g., SCIP) got very fast and at the same time non-convex optimization problems have become more important due to their additional expressive power.
In the context of industry problems, we complement our foundational research with application-oriented research. To date, we have successfully deployed our algorithms in more than 20 real-world projects. The applications of interest span various areas, including logistics and supply chain management, manufacturing, predictive analytics, big data, digital services, energy, transportation, and medical and healthcare systems.
Due to our tight integration into the ZIB’s larger AI research ecosystem we can tackle a wide variety of research questions.
Checkout out select ongoing and past projects.
A very important family of first-order methods for constrained smooth convex optimization are so-called Conditional Gradient algorithms (aka Frank-Wolfe algorithms). These algorithms are projection-free and access the feasible region by means of a so-called linear optimization oracle. The latter oracle is where the link to discrete optimization naturally comes into play.
Standard variants of conditional gradient algorithms solve a linear optimization problem in each iteration. Solving one linear optimization problem per iteration is often unsatisfactory due to the huge computational cost when the underlying linear program is either large or stems from a hard discrete optimization problem. We are interested in modified variants of conditional gradient algorithms that further reduce the computational requirements by reusing already collected information in previous iterations.
Explainability and Interpretability
The aforementioned conditional gradient methods have the additional advantage that they typically represent their iterates as (relatively) sparse convex combinations of extremal points of the feasible region. This allows for solutions with a higher level of explainability and interpretability which is often necessary when deploying algorithms in real-world contexts.
We are similarly interested in understanding how explainability and interpretability can be incorporated into mixed-integer programming. While it is understood quite well from a theoretical perspective how such explanations of optimality can be obtained in principle (e.g., by outputting the whole branch-and-bound tree including cutting-planes), such solutions are not viable as those certificates can be extremely large. Rather, we aim for short and simple explanations for most problems.
Learning and Discrete Optimization
In view of the above, a natural question to ask is whether we can also use machine learning approaches to improve solving discrete optimization problems. These approaches can be separated into the following categories:
- Learning across problems: We solve similar problems and we want to learn from previous problems, e.g., what heuristics or branching rules to use. This can be done offline and then applied to future problems (of a potentially similar type).
- Learning within a problem: Quickly gather information at the beginning, and use it to improve the rest of the solution process. Typically use an online and even one-shot learning component.
Moreover, we often consider combinations and modifications of the above. Initial approaches here are quite promising. Also, discrete optimization methods recently gained attention in verifying neural networks and in the context of adversarial attacks.
A key component for solving an optimization problem efficiently is the size of its formulation. The theory of extended formulations is concerned with the following question: which optimization problems can be efficiently solved via Linear and Semidefinite Programming (or more general Conic Programming such as, e.g., Hyperbolic Programming) by means of a small formulation? An important aspect of extended formulations is that the obtained results are unconditional: while we only know that many important optimization problems are computationally hard assuming the famous P ≠ NP conjecture which conjectures that a certain type of problem does not admit an efficient solution procedure, results obtained via extended formulations establish universal hardness (for a given solution technique such as, e.g., Linear Programming), irrespective of the P vs. NP question.
The underlying key idea is that by lifting the optimization problem into a higher dimensional space, the number of required constraints (or variables) can be reduced. For example, on the left, we see that the 8-gon in the plane requires 8 inequalities, while writing it in a three-dimensional space, we only need 6 inequalities.
Alternative Mode(l)s of Computation
Today’s SoCs and FPGAs are powerful customizable hardware that allows to accelerate and implement complex algorithms for specialized loads. While specialized/customized hardware is quite common in the context of deep learning (e.g., by means of GPUs or Google’s TPUs which one might think of as custom ASICs) no such custom implementations exist for discrete methods (such as e.g., Mixed-Integer Programming algorithms or Constraint Programming algorithms) or heuristics commonly used in discrete methods. Similarly, quantum computing is an extended model of computation promising speedups for some types of problems and there is a huge need to understand for which problems such speedups can be obtained.
Customized accelerators. We are interested in designing custom hardware (via FPGAs) to significantly speed-up, e.g., discrete optimization algorithms by implementing specialized operations/functions. On a higher level, such functions include specific operations commonly used in, e.g., Mixed-Integer Programming algorithms. On a lower level such functions could include indirect memory access operations, which are one of the main bottlenecks in today’s simplex implementations.
Edge computing. We are also interested in the edge computing regime, where we want to deploy complex algorithms in-situ. An example in this context is e.g., the deployment of the SCIP on a Raspberry Pi but more complex setups with customized SoCs are conceivable; both bare metal and full system setups. Some major advantages include that data can remain in-situ and does not need to be transferred to a data center alleviating potential privacy concerns and reducing latency, e.g., for time-critical predictive maintenance applications.
Quantum computing. We are also interested in understanding which optimization and learning algorithms can be transferred either into a hybrid or purely quantum computing setting, with the aim of harnessing quantum speedups for optimization and learning.