In our blog series Meet the Minds Inventing the Future of Video, we go behind the scenes to find out more about some of Ateme’s brightest minds and what they’ve been working on. In part four of the series, we introduce Research and Development Engineer Marwa Tarchouli. She spoke to us about Neural Video Compression and the future of better-quality viewing experiences.
What is your role at Ateme?
I am a PhD student in the CTO office at Ateme, working in collaboration with the engineering research institute INSA. My PhD work focuses on neural video compression models. I study their different architectures and develop solutions to overcome their limitations.
What specifically have you been working on at Ateme?
Lately, I’ve been working on a solution to a hardware limitation, memory saturation, that neural video codecs run into when coding high-resolution sequences. I discuss this in more detail in my paper, “Patch-based Image Coding with End-to-End Learned Codec using Overlapping,” published at the 12th International Conference on Digital Image Processing and Pattern Recognition (DPPR 2022).
What is neural video compression?
Neural video compression models are based on deep learning architectures. These include generative adversarial networks (GANs), transformers, and recurrent neural networks (RNNs). RNNs are used to capture the temporal dependencies between frames, meaning the way the content of earlier frames relates to the current frame.
The models I’m working on are mainly built on the auto-encoder architecture. In this case, an encoder transforms the input frame into a more compact representation called the latent representation. This representation is then quantized, compressed even further, and sent to the decoder side, where the frame is reconstructed.
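To make the encode, quantize, decode chain concrete, here is a minimal Python sketch. It is not a neural network: the `encode` and `decode` functions are hypothetical stand-ins (pair averaging and sample repetition) chosen only to show where the compact latent representation and the quantization step sit in the pipeline. The function names, the step size, and the sample values are all assumptions made for illustration, and the toy encoder assumes an even number of samples.

```python
def encode(frame):
    """Toy 'encoder': average neighbouring pairs, halving the number of values."""
    return [(frame[i] + frame[i + 1]) / 2 for i in range(0, len(frame) - 1, 2)]


def quantize(latent, step):
    """Map each latent value to an integer symbol; this is where information is lost."""
    return [round(v / step) for v in latent]


def dequantize(symbols, step):
    """Turn integer symbols back into approximate latent values."""
    return [q * step for q in symbols]


def decode(latent):
    """Toy 'decoder': repeat each latent value to restore the original length."""
    out = []
    for v in latent:
        out.extend([v, v])
    return out


# One row of pixel values standing in for an input frame (assumed data).
frame = [10.0, 12.0, 50.0, 54.0, 200.0, 198.0, 90.0, 94.0]
step = 4.0

symbols = quantize(encode(frame), step)        # integers that would be sent to the decoder
reconstruction = decode(dequantize(symbols, step))

print(symbols)
print(reconstruction)
```

In a learned codec, the encoder and decoder would be trained networks and the integer symbols would typically go through an entropy coder before transmission. The sketch only mirrors the order of the steps.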
What industry challenges does this address?
The video industry needs to deliver high-quality images to keep viewers engaged. At the same time, with the dramatic increase in video traffic on the internet, it also needs to reduce video data as much as possible to avoid bottlenecks.
Neural video compression aims to find the best trade-off between the quality of the reconstructed video and the bitrate used to transmit it.
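One common way to express this trade-off is a single cost that adds the rate to a weighted distortion term, so that a weight (often written lambda) decides how much quality matters relative to bits. The Python sketch below assumes that formulation, uses mean squared error as the distortion measure, and compares two made-up candidate encodings. The function names, bit counts, and sample values are all hypothetical.

```python
def mse(original, reconstructed):
    """Mean squared error between two equally long lists of samples."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def rd_cost(rate_bits, distortion, lam):
    """Rate-distortion cost: bits spent plus lambda times the distortion."""
    return rate_bits + lam * distortion


original = [10.0, 20.0, 30.0, 40.0]

# Two hypothetical encodings of the same samples (assumed numbers).
coarse = {"rate_bits": 8, "recon": [12.0, 18.0, 32.0, 38.0]}   # cheap, less accurate
fine = {"rate_bits": 16, "recon": [10.0, 20.0, 31.0, 40.0]}    # costly, more accurate


def best(lam):
    """Return the name of the candidate with the lowest cost for a given lambda."""
    costs = {
        name: rd_cost(c["rate_bits"], mse(original, c["recon"]), lam)
        for name, c in (("coarse", coarse), ("fine", fine))
    }
    return min(costs, key=costs.get)


print(best(1.0))    # small lambda: bits dominate the cost
print(best(10.0))   # large lambda: distortion dominates the cost
```

With a small weight the cheaper encoding wins, and with a large weight the more accurate one does, which is the trade-off in miniature.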
What have you achieved in this field at Ateme?
As mentioned earlier, as part of my PhD work I proposed a solution to a hardware limitation that neural video codecs face. When coding high-resolution frames, these codecs run into a memory saturation problem that makes compression impossible. My solution is to split the video frame into patches and code each patch separately, which remedies the memory saturation problem. The patches overlap, which eliminates border artifacts. The method also codes patches in parallel to make the most of the available memory and thereby reduce coding time.
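The general idea of overlapping patch coding can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method from the paper: `toy_codec` is a hypothetical stand-in for encoding and decoding one patch, overlapping regions are merged by simple averaging (the paper may blend them differently), and a thread pool stands in for parallel coding. The patch size, overlap, and frame contents are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def patch_starts(length, patch, overlap):
    """Start offsets so neighbouring patches share `overlap` samples and cover `length`."""
    stride = patch - overlap
    starts = list(range(0, max(length - patch, 0) + 1, stride))
    if starts[-1] + patch < length:
        starts.append(length - patch)   # extra patch so the far border is covered
    return starts


def split(frame, patch, overlap):
    """Cut a 2D frame (list of rows) into overlapping square patches."""
    h, w = len(frame), len(frame[0])
    return [
        (y, x, [row[x:x + patch] for row in frame[y:y + patch]])
        for y in patch_starts(h, patch, overlap)
        for x in patch_starts(w, patch, overlap)
    ]


def toy_codec(item):
    """Hypothetical stand-in for coding one patch: round values to multiples of 4."""
    y, x, block = item
    return y, x, [[round(v / 4) * 4 for v in row] for row in block]


def merge(decoded, h, w):
    """Rebuild the frame, averaging each pixel over all patches that cover it."""
    total = [[0.0] * w for _ in range(h)]
    count = [[0] * w for _ in range(h)]
    for y, x, block in decoded:
        for dy, row in enumerate(block):
            for dx, v in enumerate(row):
                total[y + dy][x + dx] += v
                count[y + dy][x + dx] += 1
    return [[total[r][c] / count[r][c] for c in range(w)] for r in range(h)]


# An 8x8 gradient standing in for a high-resolution frame (assumed data).
frame = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
patches = split(frame, patch=4, overlap=2)

# Each patch is coded independently, so the work can be spread across workers.
with ThreadPoolExecutor() as pool:
    decoded = list(pool.map(toy_codec, patches))

reconstruction = merge(decoded, len(frame), len(frame[0]))
print(len(patches), "patches coded")
```

Because each patch is small, the memory needed at any one time depends on the patch size and the number of parallel workers rather than on the full frame resolution.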
What does neural video compression change for viewers?
The goal is to reduce video data as much as possible while maintaining the best possible video quality. Deep learning techniques are a promising direction for achieving this, and the field of neural video compression is advancing quickly. Based on my research and work, I expect that it will soon be much more common for viewers to enjoy high-quality video, even during peak viewing times.