Featured publications

Publications by theme

Agentic systems
Name and link to paper Authors  About  Year
Christos Ziakas, Amir Bar, Alessandra Russo GVP-WM grounds video-generated plans into feasible action sequences using a pre-trained action-conditioned world model via video-guided latent collocation. 2026
AI safety and evaluations
Name and link to paper Authors  About  Year 
Christos Ziakas, Nicholas Loo, Nishita Jain, Alessandra Russo Red-Bandit is a red-teaming framework that adapts online to identify and exploit LLM failure modes under specific attack styles (e.g., manipulation, slang) by selecting among a set of parameter-efficient LoRA experts. 2025
Andrew M Bean, Nabeel Seedat, Shengzhuang Chen, Jonathan Richard Schwarz This paper presents an item-centric method that selects benchmark subsets via task properties, enabling efficient, interpretable and robust evaluation of large language models. 2025
Marco Gutierrez, Xinyi Leng, Hannah Cyberey, Jonathan Richard Schwarz, Ahmed Alaa, Thomas Hartvigsen BenchAlign leverages limited performance data and pairwise rankings to produce interpretable, preference-aligned benchmarks that accurately rank unseen models and better predict real-world utility. 2026
Lukas Thede, Stefan Winzeck, Zeynep Akata, Jonathan Richard Schwarz CapTrack is a capability-centric framework for analysing forgetting in LLMs that combines a behavioural taxonomy with an evaluation suite built on established benchmarks and targeted adaptations. 2026
Model training and reasoning
Name and link to paper Authors  About  Year 

Shengzhuang Chen, Xu Ouyang, Michael Arthur Leopold Pearce, Thomas Hartvigsen, Jonathan Richard Schwarz

An approach to data mixture selection for language models that balances computational cost and performance using Bayesian optimization. 2025

Daniel Furelos-Blanco, Charles Pert, Frederik Kelbel, Alex F. Spies, Alessandra Russo, Michael Dennis

ATLAS tackles a key reinforcement learning challenge by co-designing task and environment curricula: it automatically generates solvable yet challenging training pairs that dramatically outperform random sampling, especially in complex settings where viable combinations are rare. 2026

Christos Ziakas, Alessandra Russo

VITA is a test-time adaptation method that improves both generalization and temporal reasoning of VLMs for zero-shot goal-conditioned value function estimation. 2026

Mohammad Albinhassan, Pranava Madhyastha, Alessandra Russo

SEM-CTRL is a controlled decoding framework that guides and enforces rich semantic constraints on an LLM at inference time, guaranteeing correctness and enabling small LLMs to outperform frontier models without training. 2026
Societal impact
Name and link to paper Authors  About  Year 
Felix Steffek and Mihoko Sumida Cambridge University Press, XVIII, 243 pp. 2025

Holli Sargeant, Ahmed Izzidien, Felix Steffek

This paper addresses a critical gap in legal analytics by developing and applying a novel taxonomy for topic classification of summary judgment cases in the United Kingdom. 2025