Théo Ladune


PhD in signal processing from the University of Rennes, France.

My research interests include image/video coding using learning-based methods.

View My GitHub Profile

Add me on LinkedIn

Google Scholar

Our open-source learned video coder AIVC

Short Bio

I defended my PhD on “Design of Learned Video Coding Schemes” in October 2021. My PhD focused on novel deep learning-based coding schemes for image and video compression, with the objective of leveraging the high abstraction capacity of neural networks to achieve more efficient compression than traditional codecs.

Updates

Publications

AIVC: Artificial Intelligence based Video Codec


T. Ladune, P. Philippe, IEEE ICIP 2022

Paper

This paper introduces AIVC, an end-to-end neural video codec. It is based on two conditional autoencoders, MNet and CNet, for motion compensation and coding. AIVC learns to compress videos under any coding configuration through a single end-to-end rate-distortion optimization, and it offers performance competitive with the video coder HEVC under several established test conditions. A comprehensive ablation study evaluates the benefits of the different modules composing AIVC. The implementation is made available here.
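
The single rate-distortion optimization mentioned above boils down to minimizing a loss of the form L = R + λ·D. Below is a minimal PyTorch-style sketch of such an objective; the function name and details are illustrative, not AIVC's actual implementation.

```python
import torch

def rate_distortion_loss(x: torch.Tensor, x_hat: torch.Tensor,
                         rate_bits: torch.Tensor, lmbda: float) -> torch.Tensor:
    """Single training objective L = R + lambda * D for the whole codec."""
    n_pixels = x.shape[-2] * x.shape[-1]
    distortion = torch.mean((x - x_hat) ** 2)  # D: reconstruction error (MSE)
    rate_bpp = rate_bits / n_pixels            # R: estimated bits per pixel
    return rate_bpp + lmbda * distortion
```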

Design of Learned Video Coding Schemes


T. Ladune, PhD Thesis

Manuscript

This thesis proposes the design of a learned video coder to leverage the promising abilities of neural networks. Due to the novelty of this field, the design of the proposed coding scheme starts from a blank page, and each element of the coder is thoroughly considered. In the end, the proposed learned coder is shown to achieve performance equivalent to modern video coding standards in a realistic video coding setup.

Conditional Coding and Variable Bitrate for Practical Learned Video Coding


T. Ladune, P. Philippe, W. Hamidouche, L. Zhang, O. Déforges, IEEE CVPR 2021, CLIC Workshop

Paper

This paper introduces a practical learned video codec. Conditional coding and quantization gain vectors are used to provide flexibility to a single encoder/decoder pair, which is able to compress video sequences at a variable bitrate. The flexibility is leveraged at test time by choosing the rate and GOP structure to optimize a rate-distortion cost. Under the CLIC21 video test conditions, the proposed approach shows performance on par with HEVC.
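
As a rough illustration of the quantization gain vectors, the sketch below applies a learned channel-wise scale to the latents before rounding and an inverse scale at decoding, so that a single encoder/decoder pair covers several target rates. The class name and the discrete set of rate indices are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class GainUnit(nn.Module):
    """Learned per-channel gains for variable-rate coding (illustrative)."""

    def __init__(self, n_channels: int, n_rates: int):
        super().__init__()
        # One gain vector per target rate, plus an inverse gain for decoding.
        self.gain = nn.Parameter(torch.ones(n_rates, n_channels))
        self.inv_gain = nn.Parameter(torch.ones(n_rates, n_channels))

    def quantize(self, y: torch.Tensor, rate_idx: int) -> torch.Tensor:
        # Scale latents channel-wise, then round to the nearest integer
        # (training would replace round with a differentiable proxy).
        g = self.gain[rate_idx].view(1, -1, 1, 1)
        return torch.round(y * g)

    def dequantize(self, y_hat: torch.Tensor, rate_idx: int) -> torch.Tensor:
        ig = self.inv_gain[rate_idx].view(1, -1, 1, 1)
        return y_hat * ig
```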

Conditional Coding for Flexible Learned Video Compression


T. Ladune, P. Philippe, W. Hamidouche, L. Zhang, O. Déforges, ICLR 2021, Neural Compression Workshop

Paper / Video presentation / Slides

This paper introduces a novel framework for end-to-end learned video coding. Image compression is generalized through conditional coding to exploit information from reference frames, allowing intra and inter frames to be processed with the same coder. The system is trained by minimizing a rate-distortion cost, with no pre-training or proxy loss. Its flexibility is assessed under three coding configurations (All Intra, Low-delay P and Random Access), where it achieves performance competitive with the state-of-the-art video codec HEVC.
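
A minimal sketch of the conditional coding idea, assuming a PyTorch-style model: instead of coding an explicit residual, both the encoder and the decoder receive information from the reference frame, and an intra frame is simply the case where the reference is uninformative. Layer sizes and module names below are illustrative.

```python
import torch
import torch.nn as nn

class ConditionalCoder(nn.Module):
    """Toy conditional autoencoder: code x given a reference frame x_ref."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Conv2d(6, 64, 5, stride=2, padding=2)   # sees [x, x_ref]
        self.ref_features = nn.Conv2d(3, 32, 5, stride=2, padding=2)
        self.decoder = nn.ConvTranspose2d(64 + 32, 3, 5, stride=2,
                                          padding=2, output_padding=1)

    def forward(self, x: torch.Tensor, x_ref: torch.Tensor) -> torch.Tensor:
        # Encoder sees both the frame to code and its reference.
        y = self.encoder(torch.cat([x, x_ref], dim=1))
        y_hat = torch.round(y)  # quantization (inference-time rounding)
        # Decoder is conditioned on features extracted from the reference.
        cond = self.ref_features(x_ref)
        return self.decoder(torch.cat([y_hat, cond], dim=1))
```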

Optical Flow and Mode Selection for Learning-based Video Coding


T. Ladune, P. Philippe, W. Hamidouche, L. Zhang, O. Déforges, IEEE MMSP 2020

Paper / Video presentation / Slides

This work received the best paper award at the MMSP 2020 conference.

This paper introduces a new method for inter-frame coding based on two complementary autoencoders: MOFNet and CodecNet. MOFNet computes and conveys the optical flow and a pixel-wise coding mode selection. The optical flow is used to compute a prediction of the frame to code. The coding mode selection enables competition between direct copy of the prediction and transmission through CodecNet.

The proposed coding scheme is assessed under the Challenge on Learned Image Compression 2020 (CLIC20) P-frame coding conditions, where it is shown to perform on par with the state-of-the-art video codec ITU/MPEG HEVC. Moreover, the possibility of copying the prediction makes it possible to learn the optical flow in an end-to-end fashion, i.e., without relying on pre-training or a dedicated loss term.
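
The competition between copying the prediction and transmitting through CodecNet can be pictured as a per-pixel blend. In the sketch below (function names and flow conventions are illustrative), the reference frame is warped with the optical flow and the resulting prediction is mixed with the CodecNet output according to a soft mask.

```python
import torch
import torch.nn.functional as F

def warp(x_ref: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Backward-warp x_ref (N,C,H,W) with a dense flow field (N,2,H,W)."""
    n, _, h, w = x_ref.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(x_ref)  # (2,H,W), (x,y) order
    coords = grid.unsqueeze(0) + flow                      # displaced positions
    # Normalize coordinates to [-1, 1] as expected by grid_sample.
    coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid_n = torch.stack((coords_x, coords_y), dim=-1)     # (N,H,W,2)
    return F.grid_sample(x_ref, grid_n, align_corners=True)

def reconstruct(x_ref, flow, alpha, codecnet_out):
    prediction = warp(x_ref, flow)  # motion-compensated prediction
    # alpha in [0,1] per pixel: 1 -> copy the prediction, 0 -> use CodecNet.
    return alpha * prediction + (1.0 - alpha) * codecnet_out
```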

ModeNet: Mode Selection Network For Learned Video Coding


T. Ladune, P. Philippe, W. Hamidouche, L. Zhang, O. Déforges, IEEE MLSP 2020

Paper / Video presentation / Slides

We propose a mode selection network (ModeNet) to enhance deep learning-based video compression. Inspired by traditional video coding, ModeNet's purpose is to enable competition among several coding modes. The proposed ModeNet learns and conveys a pixel-wise partitioning of the frame, used to assign each pixel to its best-suited coding mode.
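
As a toy sketch of this idea, the module below looks at the frame to code and its reference and outputs a one-channel soft partitioning of the frame. The architecture is purely illustrative; since the real ModeNet conveys the mask in the bitstream, its rate cost would enter the end-to-end optimization as well.

```python
import torch
import torch.nn as nn

class TinyModeNet(nn.Module):
    """Toy mode-selection network producing a pixel-wise soft mask."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 5, stride=2, padding=2,
                               output_padding=1),
            nn.Sigmoid(),  # mask in [0, 1], one value per pixel
        )

    def forward(self, x: torch.Tensor, x_ref: torch.Tensor) -> torch.Tensor:
        # The mask assigns each pixel to a coding mode (e.g. copy vs. code).
        return self.net(torch.cat([x, x_ref], dim=1))
```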

Binary Probability Model for Learning Based Image Compression


T. Ladune, P. Philippe, W. Hamidouche, L. Zhang, O. Déforges, IEEE ICASSP 2020

Paper / Video presentation / Slides

We propose to enhance learned image compression systems with a richer probability model for the latent variables. Previous works model the latents with a Gaussian or a Laplace distribution. Inspired by binary arithmetic coding, we propose to signal the latents with three binary values and one integer, each coded with its own probability model.
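
One possible reading of the three-binary-values-plus-one-integer decomposition is sketched below; the exact split and the probability models attached to each symbol in the paper may differ, so this zero flag / sign / greater-than-one flag / remainder scheme is purely illustrative.

```python
import torch

def decompose(y_hat: torch.Tensor):
    """Split quantized latents into three binary symbols and one integer.

    Each symbol would be entropy-coded with its own probability model,
    in the spirit of binary arithmetic coding.
    """
    is_zero = y_hat == 0                        # binary symbol 1
    sign = y_hat > 0                            # binary symbol 2 (nonzeros only)
    abs_gt_one = y_hat.abs() > 1                # binary symbol 3 (nonzeros only)
    remainder = (y_hat.abs() - 2).clamp(min=0)  # integer, coded when |y| > 1
    return is_zero, sign, abs_gt_one, remainder

print(decompose(torch.tensor([-3, -1, 0, 2])))
```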

Talks