Title

3D signal compression and processing (VCIP 2022 Tutorial)


Abstract

With the rapid spread of 3D sensing and capture devices such as LiDAR and Kinect, the amount of available 3D data is increasing rapidly, creating greater demand than ever for 3D signal processing. Unlike image and video signals, which have a widely accepted pixel representation on regular grids, 3D signals have more diverse representations, including RGB-D, voxels, point clouds, meshes and implicit representations, each of which has its own pros and cons.

In this tutorial, we will give an overview of recent progress in 3D signal processing, focusing on the two prevalent 3D representations, i.e., RGB-D and point clouds. For RGB-D, we will introduce state-of-the-art depth map super-resolution and monocular depth estimation techniques, which take advantage of RGB images to improve the capture of depth maps. For point clouds, we will introduce the MPEG and AVS standardization of point cloud compression, as well as advanced point cloud processing techniques, which address the irregularity of point clouds and effectively model 3D geometry with octrees, graphs and deep neural networks.


Speakers

Xianming Liu is a Professor with the School of Computer Science and Technology, Harbin Institute of Technology (HIT), Harbin, China. He received the B.S., M.S., and Ph.D. degrees in computer science from HIT in 2006, 2008 and 2012, respectively. In 2011, he spent half a year at the Department of Electrical and Computer Engineering, McMaster University, Canada, as a visiting student, where he then worked as a post-doctoral fellow from December 2012 to December 2013. He worked as a project researcher at the National Institute of Informatics (NII), Tokyo, Japan, from 2014 to 2017. He has published over 60 papers in international conferences and journals, including top IEEE journals such as T-IP, T-CSVT, T-IFS, T-MM and T-GRS, and top conferences such as CVPR, IJCAI and DCC. He is the recipient of the IEEE ICME 2016 Best Student Paper Award.

Homepage: http://homepage.hit.edu.cn/xmliu

Yuanchao Bai received the B.S. degree in software engineering from Dalian University of Technology, Dalian, China, in 2013, and the Ph.D. degree in computer science from Peking University, Beijing, China, in 2020. He was a visiting student at the National Institute of Informatics, Tokyo, Japan, from 2017 to 2018, and a postdoctoral researcher at Peng Cheng Laboratory, Shenzhen, China, from 2020 to 2022. He is now an assistant professor at Harbin Institute of Technology, Harbin, China. His research interests include image/video compression and processing, point cloud processing and graph signal processing.

Wenbo Zhao received the B.S., M.S., and Ph.D. degrees from Harbin Institute of Technology, Harbin, China, in 2012, 2014 and 2020, respectively. He is currently a postdoctoral researcher at Peng Cheng Laboratory, Shenzhen, China. His current research interests include 3D reconstruction, mesh denoising and point cloud processing.

Zhenyu Li received the B.S. degree in computer science from Harbin Institute of Technology, Harbin, China, in 2021. He is now a postgraduate student at Harbin Institute of Technology, Harbin, China. His research interests include 3D reconstruction, scene perception, and scene understanding.


Outline

  • Introduction
  • RGB-guided Depth Super-resolution
    • Problem Formulation
    • Filtering based methods
    • Priors based methods
    • Learning based methods
    • Discussion
  • Learning scene structure from cameras
    • Depth estimation from monocular images
    • 3D object detection from cameras
    • Unified scene understanding
  • Recent Progress of MPEG point cloud compression standardization
    • Introduction
    • Geometry coding
    • Attribute coding
    • Conclusion
  • Deep learning-based point cloud compression
    • Introduction
    • Deep Generative Models
    • Learning-based PCC methods
    • Conclusion
  • Q & A