TLZMC: Hierarchical B-Frame Video Coding without Motion Coding

Authors: David Alexandre¹, Hsueh-Ming Hang², Wen-Hsiao Peng³
Affiliations: ¹Electrical Engineering and Computer Science International Graduate Program, ²Institute of Electronics, ³Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan

Abstract:

Typical video compression systems consist of two main modules: motion coding and residual coding. This general architecture is adopted by classical coding schemes (such as the international standards H.265 and H.266) and by deep learning-based coding schemes. We propose a novel B-frame coding architecture based on two-layer Conditional Augmented Normalizing Flows (CANF). Its distinguishing feature is that it transmits no motion information. Our proposed idea of video compression without motion coding offers a new direction for learned video coding. Our base layer is a low-resolution image compressor that replaces the full-resolution motion compressor. The low-resolution coded image is merged with the warped high-resolution images to generate a high-quality image, which serves as the conditioning signal for the enhancement-layer image coding in full resolution. One advantage of this architecture is significantly reduced computational complexity, because the motion information compressor is eliminated. In addition, we adopt a skip-mode coding technique to reduce the number of transmitted latent samples. The rate-distortion performance of our scheme is slightly lower than that of the state-of-the-art learned B-frame coding scheme, B-CANF, but higher than that of other learned B-frame coding schemes. However, compared to B-CANF, our scheme saves 45% of the multiply-accumulate operations (MACs) for encoding and 27% of the MACs for decoding.
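
The two-layer data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every function name and parameter below (`downsample`, `quantize`, `merge`, `code_b_frame`, `base_step`, `enh_step`) is hypothetical. The learned CANF codecs are replaced by plain uniform quantization, the merging step by a simple average, and the conditional enhancement coding by a quantized residual. The warped reference frames are taken as given inputs; how they are obtained without transmitting motion is outside the scope of this sketch.

```python
from typing import List, Tuple

Frame = List[List[float]]  # toy grayscale frame as a 2-D list


def downsample(frame: Frame, factor: int = 2) -> Frame:
    """Average-pool by `factor`; frame sides must be divisible by it."""
    h, w = len(frame), len(frame[0])
    return [
        [
            sum(frame[y + dy][x + dx]
                for dy in range(factor) for dx in range(factor)) / factor ** 2
            for x in range(0, w, factor)
        ]
        for y in range(0, h, factor)
    ]


def upsample(frame: Frame, factor: int = 2) -> Frame:
    """Nearest-neighbour upsampling by `factor`."""
    h, w = len(frame) * factor, len(frame[0]) * factor
    return [[frame[y // factor][x // factor] for x in range(w)] for y in range(h)]


def quantize(frame: Frame, step: float) -> Frame:
    """Assumption: uniform quantization stands in for a learned codec."""
    return [[round(v / step) * step for v in row] for row in frame]


def merge(base_up: Frame, warped_a: Frame, warped_b: Frame) -> Frame:
    """Assumption: a plain average stands in for the merging step."""
    return [
        [(b + wa + wb) / 3.0 for b, wa, wb in zip(rb, ra, rw)]
        for rb, ra, rw in zip(base_up, warped_a, warped_b)
    ]


def code_b_frame(target: Frame, warped_a: Frame, warped_b: Frame,
                 factor: int = 2, base_step: float = 8.0,
                 enh_step: float = 2.0) -> Tuple[Frame, Frame]:
    """Return (conditioning signal, reconstruction) for one toy B-frame."""
    # Base layer: code a low-resolution version of the target frame.
    base_rec = quantize(downsample(target, factor), base_step)
    # Merge the upsampled base-layer image with the warped references.
    cond = merge(upsample(base_rec, factor), warped_a, warped_b)
    # Enhancement layer (toy): code the full-resolution frame given `cond`
    # as a quantized residual; the real scheme uses conditional coding.
    residual = quantize(
        [[t - c for t, c in zip(rt, rc)] for rt, rc in zip(target, cond)],
        enh_step,
    )
    recon = [[c + r for c, r in zip(rc, rr)] for rc, rr in zip(cond, residual)]
    return cond, recon
```

In this toy setting the conditioning signal has the same resolution as the target frame, and the reconstruction error is bounded by half of `enh_step`. The better the conditioning signal, the smaller the residual the enhancement layer has to code, which is the intuition behind building a high-quality conditioning image from the base layer and the warped references.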


Code: https://github.com/nycu-clab/tlzmc-cvpr

[Figure: (a) TLZMC; (b) Variant of TLZMC: TLZMC+]

COMPUTATIONAL COMPLEXITY

RD RESULTS