I defended my PhD on “Design of Learned Video Coding Schemes” in October 2021. My PhD focused on novel deep learning-based coding schemes for image and video compression. The objective was to leverage the high abstraction capacity of neural networks to achieve more efficient compression than traditional codecs.
📄 June 2022: Paper accepted @ IEEE ICIP 2022 introducing our learned video codec.
📄 June 2022: Paper describing our solutions for the video coding track of CLIC 22 is available.
🎓 October 2021: I received my PhD in signal processing from the University of Rennes, France. My thesis manuscript is available here.
🖥️ September 2021: Open-source release of our learned video coders: AIVC
🥇 June 2021: Two papers accepted at the Challenge of Learned Image Coding @ CVPR 2021. One of them won the challenge.
👨🏫 May 2021: I gave a talk at the Neural Compression Workshop @ ICLR 2021. I introduced conditional coding, an enhanced way of exploiting decoder-side information.
🥇 September 2020: I received the best paper award @ IEEE MMSP 2020
📄 May 2020: Paper presented @ IEEE MLSP 2020 on learned inter-frame coding.
📄 May 2020: Paper presented @ IEEE ICASSP 2020 on learned image coding.
T. Ladune, P. Philippe, IEEE ICIP 2022
This paper introduces AIVC, an end-to-end neural video codec. It is based on two conditional autoencoders, MNet and CNet, for motion compensation and coding. AIVC learns to compress videos using any coding configuration through a single end-to-end rate-distortion optimization. Furthermore, it offers performance competitive with the recent video coder HEVC under several established test conditions. A comprehensive ablation study is performed to evaluate the benefits of the different modules composing AIVC. The implementation is made available here.
T. Ladune, PhD Thesis
This thesis proposes to design a learned video coder, to leverage the promising abilities of neural networks. Due to the novelty of this field, the design of the proposed learned coding scheme starts from a blank page. The different elements of the coder are thoroughly considered. In the end, it is shown that the proposed learned coder is able to achieve performance equivalent to modern video coding standards in a realistic video coding setup.
T. Ladune, P. Philippe, W. Hamidouche, L. Zhang, O. Déforges, IEEE CVPR 2021, CLIC Workshop
This paper introduces a practical learned video codec. Conditional coding and quantization gain vectors are used to provide flexibility to a single encoder/decoder pair, which is able to compress video sequences at a variable bitrate. The flexibility is leveraged at test time by choosing the rate and GOP structure to optimize a rate-distortion cost. Using the CLIC21 video test conditions, the proposed approach shows performance on par with HEVC.
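The test-time selection described above can be sketched as a search over candidate settings. This is a minimal illustration, not the codec's actual implementation: the `Candidate` fields, the `rd_cost` and `select_best` helpers, the Lagrange multiplier value and all distortion and rate figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothetical encoder setting and its measured outcome."""
    rate_index: int      # rate point of the variable-rate codec (assumed name)
    gop: str             # GOP structure label (assumed name)
    distortion: float    # e.g. mean squared error of the decoded sequence
    rate: float          # e.g. bits per pixel


def rd_cost(c: Candidate, lmbda: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return c.distortion + lmbda * c.rate


def select_best(candidates, lmbda):
    """Return the candidate with the lowest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c, lmbda))


# Made-up measurements, for illustration only.
candidates = [
    Candidate(0, "low-delay", distortion=40.0, rate=0.05),
    Candidate(1, "low-delay", distortion=25.0, rate=0.10),
    Candidate(1, "random-access", distortion=24.0, rate=0.08),
]
best = select_best(candidates, lmbda=100.0)
```

With these toy numbers the third candidate has the lowest cost. A real encoder would obtain the distortion and rate by actually compressing the sequence with each setting.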
T. Ladune, P. Philippe, W. Hamidouche, L. Zhang, O. Déforges, ICLR 2021, Neural Compression Workshop
This paper introduces a novel framework for end-to-end learned video coding. Image compression is generalized through conditional coding to exploit information from reference frames, which allows intra and inter frames to be processed with the same coder. The system is trained through the minimization of a rate-distortion cost, with no pre-training or proxy loss. Its flexibility is assessed under three coding configurations (All Intra, Low-delay P and Random Access), where it is shown to achieve performance competitive with the state-of-the-art video codec HEVC.
T. Ladune, P. Philippe, W. Hamidouche, L. Zhang, O. Déforges, IEEE MMSP 2020
This work received the best paper award at the MMSP 2020 conference.
This paper introduces a new method for inter-frame coding based on two complementary autoencoders: MOFNet and CodecNet. MOFNet aims at computing and conveying the Optical Flow and a pixel-wise coding Mode selection. The optical flow is used to perform a prediction of the frame to code. The coding mode selection enables competition between direct copy of the prediction and transmission through CodecNet.
The proposed coding scheme is assessed under the Challenge on Learned Image Compression 2020 (CLIC20) P-frame coding conditions, where it is shown to perform on par with the state-of-the-art video codec ITU/MPEG HEVC. Moreover, the possibility of copying the prediction makes it possible to learn the optical flow in an end-to-end fashion, i.e. without relying on pre-training and/or a dedicated loss term.
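The competition between copying the prediction and transmitting through CodecNet can be pictured as a per-pixel blend. The sketch below is only an illustration under assumptions: `alpha` stands for a pixel-wise mode map in [0, 1] (1 meaning "copy the prediction"), `codec_output` is a stand-in for what CodecNet would reconstruct, and the blending formula is one plausible way to combine the two, not necessarily the paper's exact formulation.

```python
def blend_modes(prediction, codec_output, alpha):
    """Combine two reconstructions pixel by pixel.

    prediction   -- motion-compensated prediction (2D list of floats)
    codec_output -- stand-in for the CodecNet reconstruction (2D list of floats)
    alpha        -- pixel-wise mode map in [0, 1]; 1 copies the prediction,
                    0 keeps the codec output (assumed convention)
    """
    return [
        [a * p + (1.0 - a) * c for p, c, a in zip(p_row, c_row, a_row)]
        for p_row, c_row, a_row in zip(prediction, codec_output, alpha)
    ]


# Toy 2x2 frame with made-up values.
prediction = [[10.0, 20.0], [30.0, 40.0]]
codec_output = [[12.0, 18.0], [33.0, 44.0]]
alpha = [[1.0, 0.0], [0.5, 1.0]]

reconstruction = blend_modes(prediction, codec_output, alpha)
```

Pixels where `alpha` is 1 cost nothing to transmit beyond the mode map and the flow, which is the intuition behind letting the two modes compete.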
T. Ladune, P. Philippe, W. Hamidouche, L. Zhang, O. Déforges, IEEE MLSP 2020
We propose a mode selection network (ModeNet) to enhance deep learning-based video compression. Inspired by traditional video coding, ModeNet's purpose is to enable competition among several coding modes. The proposed ModeNet learns and conveys a pixel-wise partitioning of the frame, used to assign each pixel to the most suitable coding mode.
T. Ladune, P. Philippe, W. Hamidouche, L. Zhang, O. Déforges, IEEE ICASSP 2020
We propose to enhance learned image compression systems with a richer probability model for the latent variables. Previous works model the latents with a Gaussian or a Laplace distribution. Inspired by binary arithmetic coding, we propose to signal the latents with three binary values and one integer, with different probability models.
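To make the idea of signalling a latent with three binary values and one integer concrete, here is one possible decomposition of a quantized latent. It is an assumption made for illustration, since the text above does not specify which flags are used: a "non-zero" flag, a "magnitude greater than one" flag, a sign flag, and an integer for the remaining magnitude. The function names `decompose` and `recompose` are hypothetical.

```python
def decompose(y: int):
    """Split an integer latent into three binary flags and one integer.

    Assumed layout (not taken from the paper):
      nonzero   -- 1 if y != 0
      gt_one    -- 1 if |y| > 1
      negative  -- 1 if y < 0
      remainder -- |y| - 2 when |y| > 1, else 0
    """
    magnitude = abs(y)
    nonzero = int(magnitude > 0)
    gt_one = int(magnitude > 1)
    negative = int(y < 0)
    remainder = magnitude - 2 if gt_one else 0
    return nonzero, gt_one, negative, remainder


def recompose(nonzero: int, gt_one: int, negative: int, remainder: int) -> int:
    """Invert decompose()."""
    if not nonzero:
        return 0
    magnitude = remainder + 2 if gt_one else 1
    return -magnitude if negative else magnitude


# Round-trip check on a small range of latent values.
assert all(recompose(*decompose(y)) == y for y in range(-5, 6))
```

In an actual codec, each flag and the remainder would be entropy coded with its own probability model; that part is omitted here.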