Georgios Batzolis

I am a postdoctoral researcher at the Department of Engineering, University of Cambridge, working with Prof. Mark Girolami. I previously completed my PhD at the Department of Applied Mathematics and Theoretical Physics (DAMTP), University of Cambridge, under the supervision of Prof. Carola-Bibiane Schönlieb.

My current research focuses on diffusion language models (DLMs), text-generation methods that aim to close the gap with autoregressive large language models (LLMs). In our recent work, CoBit (Continuous Bitstream Diffusion), we run diffusion over binarized text. As measured by the GenPPL–entropy protocol, CoBit advances the state of the art among open-source DLMs.

In parallel, I study the connection between Riemannian geometry and diffusion models. A central theme is constructing data-driven Riemannian metrics that uncover the intrinsic geometric structure of data. These metrics yield distance functions that better reflect semantic relationships between data points and enable optimization directly on the data manifold, paving the way for novel methods in controllable generation and inverse problem solving.

Publications

arXiv 2026

CoBit: Language Modeling with Bitstream Diffusion, Georgios Batzolis, Mark Girolami, Luca Ambrogioni. arXiv preprint arXiv:2605.07013, 2026
[PDF] [Code]

SPIGM @ ICML 2026

Information-Guided Noise Allocation for Efficient Diffusion Training, Gabriel Raya, Bac Nguyen, Georgios Batzolis, Yuhta Takida, Dejan Stancevic, Naoki Murata, Chieh-Hsin Lai, Yuki Mitsufuji, Luca Ambrogioni. SPIGM Workshop, ICML 2026
[arXiv]

ICML 2025

Score-based Pullback Riemannian Geometry, Willem Diepeveen*, Georgios Batzolis*, Zakhar Shumaylov, Carola-Bibiane Schönlieb. ICML 2025
[Published Version] [Code]

ICML 2024

Diffusion Models Encode the Intrinsic Dimension of Data Manifolds, Jan Stanczuk*, Georgios Batzolis*, Teo Deveney, Carola-Bibiane Schönlieb. ICML 2024
[Published Version] [arXiv] [Official Code] [Demo/Clean Code] [Official Webpage]

FoDS 2024

CAFLOW: conditional autoregressive flows, Georgios Batzolis, Marcello Carioni, Christian Etmann, Soroosh Afyouni, Zoe Kourtzi, Carola-Bibiane Schönlieb. Foundations of Data Science, AIMS, 2024
DOI: 10.3934/fods.2024028
[arXiv] [Official Code]

arXiv 2023

Variational Diffusion Auto-encoder: Latent Space Extraction from Pre-trained Diffusion Models, Georgios Batzolis*, Jan Stanczuk*, Carola-Bibiane Schönlieb. arXiv preprint arXiv:2304.12141, 2023
[PDF]

AAAI 2022

How to distribute data across tasks for meta-learning?, Alexandru Cioba, Michael Bromberg, Qian Wang, Ritwik Niyogi, Georgios Batzolis, Jezabel Garcia, Da-shan Shiu, Alberto Bernacchia. AAAI Conference on Artificial Intelligence, 2022
[PDF]

arXiv 2022

Non-uniform diffusion models, Georgios Batzolis, Jan Stanczuk, Carola-Bibiane Schönlieb. arXiv preprint arXiv:2207.09786, 2022
[PDF]

arXiv 2021

Conditional image generation with score-based diffusion models, Georgios Batzolis, Jan Stanczuk, Carola-Bibiane Schönlieb, Christian Etmann. arXiv preprint arXiv:2111.13606, 2021
[PDF]

Education

University of Cambridge

(Oct 2020 - Mar 2025)

Department of Applied Mathematics and Theoretical Physics (DAMTP)
PhD in Machine Learning

University of Cambridge

(Oct 2016 - Oct 2020)

Department of Engineering
MEng in Information and Computer Engineering, First Class Honours

Work Experience

University of Cambridge

Postdoctoral Researcher (Mar 2025 - present)

Google DeepMind

Research Collaboration (Feb 2024 - Sep 2025)

Huawei Technologies Research & Development (UK) Ltd

AI Research Internship (Jul 2022 - Oct 2022)

MediaTek Research UK Ltd

AI Research Internship (Jun 2020 - Oct 2020)

Contact

Feel free to reach out via email, Google Scholar, or LinkedIn.