End-to-End Rate-Distortion Optimized 3D Gaussian Representation

ECCV 2024
Henan Wang1, Hanxin Zhu1, Tianyu He2, Runsen Feng1, Jiajun Deng3, Jiang Bian2, Zhibo Chen1
1University of Science and Technology of China
2Microsoft Research Asia
3The University of Adelaide

The caption displays both the file size and PSNR.

Abstract

3D Gaussian Splatting (3DGS) is an emerging technique with remarkable potential in 3D representation and image rendering. However, the substantial storage overhead of 3DGS significantly impedes its practical applications. In this work, we formulate compact 3D Gaussian learning as an end-to-end Rate-Distortion Optimization (RDO) problem and propose RDO-Gaussian, which achieves flexible and continuous rate control. RDO-Gaussian addresses two main issues in current schemes: 1) Unlike prior work that minimizes the rate under a fixed distortion, we introduce dynamic pruning and entropy-constrained vector quantization (ECVQ), which optimize the rate and distortion at the same time. 2) Previous works treat the colors of every Gaussian equally, while we model the colors of different regions and materials with learnable numbers of parameters. We verify our method on both real and synthetic scenes, showing that RDO-Gaussian reduces the size of the 3D Gaussian representation by over 40× and surpasses existing methods in rate-distortion performance.
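As a rough illustration of what treating compression as a rate-distortion problem means, the sketch below scores a few operating points with a Lagrangian cost of the form D + λR and picks the cheapest one for a given λ. The function names, the candidate points, and the λ values are all hypothetical and are not taken from the paper; in the actual method the trade-off is optimized end to end during training rather than selected from a fixed list.

```python
# Minimal sketch of a Lagrangian rate-distortion trade-off.
# All numbers below are made-up illustrative values, not results from the paper.

def rd_cost(rate, distortion, lam):
    """Joint cost D + lam * R for one operating point."""
    return distortion + lam * rate


def best_operating_point(points, lam):
    """Pick the (rate, distortion) pair with the lowest joint cost."""
    return min(points, key=lambda p: rd_cost(p[0], p[1], lam))


# Hypothetical candidates: (rate in MB, distortion as mean squared error).
candidates = [(5.0, 0.0040), (10.0, 0.0025), (20.0, 0.0018), (40.0, 0.0015)]

# A large lam penalizes rate heavily, a small lam favors quality.
low_rate = best_operating_point(candidates, lam=1e-3)
balanced = best_operating_point(candidates, lam=1e-4)
high_quality = best_operating_point(candidates, lam=1e-6)
```

Sweeping λ moves the selected point along the rate-distortion curve, which is the sense in which a single weight can give continuous rate control.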

Method

The training pipeline of RDO-Gaussian is illustrated above. Our optimization process comprises four main components: Gaussian pruning, adaptive spherical harmonics (SHs) pruning, entropy-constrained vector quantization (ECVQ), and rendering. First, the input Gaussians are pruned by learned Gaussian masks. Next, adaptive SH masks are applied to the SHs of each Gaussian, so that different Gaussians can have different SH degrees. The covariance and color parameters are then quantized by ECVQ to obtain a more compact representation. Finally, rendering is performed on the pruned and quantized 3D Gaussians.
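The ECVQ step can be sketched in its textbook form: each vector is assigned to the codeword that minimizes squared error plus λ times the codeword's code length in bits, instead of simply the nearest codeword. The function name `ecvq_assign`, the two-entry codebook, and the probabilities below are assumptions made for illustration; the paper's training-time formulation (how probabilities are learned and how gradients flow through the assignment) is not shown here.

```python
import math


def ecvq_assign(vector, codebook, probs, lam):
    """Return the index of the codeword minimizing distortion + lam * bits.

    vector   -- sequence of floats to quantize
    codebook -- list of codewords, each the same length as `vector`
    probs    -- estimated usage probability of each codeword
    lam      -- weight on the rate term (lam = 0 gives plain nearest-neighbor VQ)
    """
    best_index, best_cost = -1, math.inf
    for i, (codeword, p) in enumerate(zip(codebook, probs)):
        if p <= 0.0:
            continue  # an unused codeword has infinite code length
        distortion = sum((v - c) ** 2 for v, c in zip(vector, codeword))
        rate_bits = -math.log2(p)  # ideal code length under an entropy coder
        cost = distortion + lam * rate_bits
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index


# Hypothetical two-entry codebook: entry 0 is common, entry 1 is rare.
codebook = [(0.0, 0.0), (1.0, 1.0)]
probs = [0.9, 0.1]
x = (0.6, 0.6)

nearest = ecvq_assign(x, codebook, probs, lam=0.0)     # ignores rate
rate_aware = ecvq_assign(x, codebook, probs, lam=0.5)  # prefers the cheap codeword
```

With λ = 0 the vector goes to its nearest codeword (index 1), while with λ = 0.5 the cheaper, more probable codeword (index 0) wins despite its larger error. That shift is what lets the quantizer trade quality for bits.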

Result

The rate-distortion curves of our method on all four datasets are depicted above. The x-axis shows the rate of the compressed Gaussian representation (on a logarithmic scale), and the y-axis shows the corresponding quality metric (SSIM, PSNR, or LPIPS). Both are averaged over the scenes in each dataset. We also mark the performance of 3DGS (implemented by ourselves) and other 3DGS compression works (from pre-trained models or papers). Compared to 3DGS, our proposed method achieves a compression ratio of over 40× without severe degradation of quality.

Visual Comparisons

Ours (Size: 43.1MB) vs. 3DGS [Kerbl 2023] (Size: 1335MB)
Ours (Size: 38.7MB) vs. 3DGS [Kerbl 2023] (Size: 1329MB)
Ours (Size: 12.3MB) vs. 3DGS [Kerbl 2023] (Size: 550MB)
Ours (Size: 8.11MB) vs. 3DGS [Kerbl 2023] (Size: 349MB)
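
For reference, the per-scene compression ratios implied by the four size pairs listed above can be worked out directly. The snippet is only arithmetic on the sizes shown on this page; the variable and function names are illustrative.

```python
# Sizes in MB as listed in the visual comparisons: (ours, 3DGS baseline).
pairs_mb = [(43.1, 1335.0), (38.7, 1329.0), (12.3, 550.0), (8.11, 349.0)]


def compression_ratio(compressed_mb, original_mb):
    """How many times smaller the compressed model is than the original."""
    return original_mb / compressed_mb


ratios = [compression_ratio(ours, base) for ours, base in pairs_mb]

for (ours, base), ratio in zip(pairs_mb, ratios):
    print(f"{base:g}MB -> {ours:g}MB: {ratio:.1f}x")
```

For these four scenes the ratios fall between roughly 31× and 45×. The over-40× figure quoted in the results is a dataset-level statement, so individual scenes can land on either side of it.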

Citation

If you want to cite our work, please kindly use:
@inproceedings{wang2024rdogaussian,
  title={End-to-End Rate-Distortion Optimized 3D Gaussian Representation},
  author={Wang, Henan and Zhu, Hanxin and He, Tianyu and Feng, Runsen and Deng, Jiajun and Bian, Jiang and Chen, Zhibo},
  booktitle={European Conference on Computer Vision},
  year={2024}
}