Generative AI for Scalable Autonomous Driving Systems
The project aims to shape the next generation of highway autonomy for heavy trucks by combining long-range multimodal data, controllable generative models, a closed-loop virtual gym, and vision-language model (VLM) embedded reasoning. Rather than treating autonomy as a pure perception or planning problem, it advances generative simulation, reinforcement learning, chain-of-thought reasoning and language-guided representation learning as a unified research agenda.
🧑🎓 IOL Project Members
palladin (at) zib.de
ghilotti (at) zib.de
gao (at) zib.de
brucker (at) zib.de
🪙 Funding
This project is funded by Torc Robotics from March 2026 to March 2029.

🔬 Project Description
Safe highway autonomy for heavy trucks remains an open challenge: long braking distances demand scene understanding over hundreds of metres for anticipatory planning and safe braking margins. Existing driving research, however, primarily covers urban scenes, with perception effectively limited to short ranges of up to 100 metres.
The Generative AI for Scalable Autonomous Driving Systems project is a collaboration between ZIB, TU Berlin and Torc Robotics, bringing together experts in AI, applied mathematics, computer science and autonomous driving. The project aims to develop scalable AI methods for highway autonomy of heavy trucks, enabling long-range perception systems, controllable driving scenario generation, closed-loop training and vision-language model (VLM) embedded reasoning.
Controllable Generative Models
This line of work contributes to controllable scenario generation and realism evaluation, combining diffusion, flow matching, autoregressive and hybrid architectures over vectorised scenes, bird's-eye grids and sparse representations. The objective is to show that generative models can target specific families of challenging long-range situations, such as cut-ins at 300-800 metres, occluded slow vehicles, construction zones and adverse weather, while remaining realistic and diverse.
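To illustrate how such a conditional generator can be trained, here is a minimal numpy sketch of the conditional flow-matching objective: the model regresses a velocity field along straight-line paths from noise to data, conditioned on a scenario tag. The toy linear "model", `scene_dim`, and the one-hot "cut-in" condition are illustrative assumptions, not the project's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_loss(model, x1, cond, rng):
    """Conditional flow-matching loss: regress the model's velocity
    prediction onto the straight-line velocity x1 - x0."""
    x0 = rng.standard_normal(x1.shape)         # noise sample
    t = rng.uniform(size=(x1.shape[0], 1))     # random time in [0, 1]
    xt = (1.0 - t) * x0 + t * x1               # point on the linear path
    v_target = x1 - x0                         # velocity of that path
    v_pred = model(xt, t, cond)
    return float(np.mean((v_pred - v_target) ** 2))

# Toy stand-in "model": a linear map over [state, time, condition].
scene_dim, cond_dim, batch = 8, 3, 16
W = rng.standard_normal((scene_dim + 1 + cond_dim, scene_dim)) * 0.1

def toy_model(xt, t, cond):
    feats = np.concatenate([xt, t, cond], axis=1)
    return feats @ W

x1 = rng.standard_normal((batch, scene_dim))   # "real" scene vectors
cond = np.tile([1.0, 0.0, 0.0], (batch, 1))    # hypothetical "cut-in" tag
loss = flow_matching_loss(toy_model, x1, cond, rng)
```

At sampling time the learned velocity field is integrated from noise under the chosen condition, which is what lets the generator target a specific scenario family.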
Closed-Loop Simulations
The project aims to build a closed-loop virtual gym for heavy-truck highway autonomy that combines real data and generative models in a unified framework. The gym will replay, augment and perturb recorded scenes, allowing agents to experience counterfactual and adversarial situations. The environment will also support reinforcement learning and related methods with multi-objective criteria over safety, efficiency, comfort and rule compliance, as well as systematic robustness evaluation. The objective is to demonstrate how generative models and structured scenario curricula can improve closed-loop robustness and sample efficiency.
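As a sketch of how the four criteria above could be combined for training, the function below scalarises safety, efficiency, comfort and rule compliance into a single reward. All weights, thresholds and normalisation constants are illustrative assumptions, not values from the project.

```python
def truck_reward(headway_m, speed_kmh, jerk, rule_violations,
                 weights=(1.0, 0.3, 0.2, 2.0)):
    """Hypothetical weighted scalarisation of multi-objective criteria."""
    w_safe, w_eff, w_comf, w_rule = weights
    safety = min(headway_m / 300.0, 1.0)   # reward long braking margins
    efficiency = speed_kmh / 80.0          # normalised to ~highway speed
    comfort = -abs(jerk)                   # penalise harsh manoeuvres
    compliance = -float(rule_violations)   # penalty per rule violation
    return (w_safe * safety + w_eff * efficiency
            + w_comf * comfort + w_rule * compliance)

# A safe, smooth state should score higher than a tailgating one.
good = truck_reward(headway_m=320, speed_kmh=78, jerk=0.1, rule_violations=0)
bad = truck_reward(headway_m=40, speed_kmh=78, jerk=1.5, rule_violations=1)
```

In practice the weights themselves become tuning targets, which is one reason multi-objective formulations (rather than a single fixed scalarisation) are part of the agenda.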
Vision Language Driving Models
This research line analyses and develops a framework for VLM-embedded reasoning in modular autonomous driving stacks. Large VLMs will produce textual scene descriptions, intent hypotheses and action rationales that improve perception, prediction and planning modules through embeddings and alignment objectives. The goal is to show that aligning internal states with language-based reasoning signals yields driving stacks that are more robust, more data-efficient and more interpretable at test time.
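One common way to realise such an alignment objective is a symmetric InfoNCE loss between driving-state embeddings and text embeddings, sketched below in numpy: matched (state, caption) pairs on the diagonal should score higher than mismatched pairs. The loss form, dimensions and random embeddings are assumptions for illustration, not the project's actual objective.

```python
import numpy as np

def alignment_loss(state_emb, text_emb, temperature=0.1):
    """Symmetric InfoNCE over cosine-similarity logits."""
    s = state_emb / np.linalg.norm(state_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = s @ t.T / temperature

    def xent(lg):
        # cross-entropy with the matched pair (the diagonal) as the label
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    # state->text and text->state directions, averaged
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(1)
text = rng.standard_normal((4, 32))                   # caption embeddings
aligned = text + 0.01 * rng.standard_normal((4, 32))  # near-matched states
shuffled = text[[1, 2, 3, 0]]                         # mismatched pairing
low, high = alignment_loss(aligned, text), alignment_loss(shuffled, text)
```

Well-aligned pairs drive the loss toward zero, so the gradient pulls the stack's internal states toward the language-based reasoning signal.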
