[Figure: rendering comparison. Each caption displays both the file size and PSNR; the file sizes shown are 43.1 MB, 1335 MB, 38.7 MB, and 1329 MB.]
3D Gaussian Splatting (3DGS) has emerged as a technique with remarkable potential in 3D representation and image rendering. However, the substantial storage overhead of 3DGS significantly impedes its practical application. In this work, we formulate compact 3D Gaussian learning as an end-to-end Rate-Distortion Optimization (RDO) problem and propose RDO-Gaussian, which achieves flexible and continuous rate control. RDO-Gaussian addresses two main issues in current schemes: 1) Unlike prior work that minimizes the rate under a fixed distortion, we introduce dynamic pruning and entropy-constrained vector quantization (ECVQ), which optimize the rate and distortion at the same time. 2) Previous works treat the colors of every Gaussian equally, while we model the colors of different regions and materials with learnable numbers of parameters. We verify our method on both real and synthetic scenes, showing that RDO-Gaussian reduces the size of the 3D Gaussian representation by over 40× and surpasses existing methods in rate-distortion performance.
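For readers unfamiliar with rate-distortion optimization, the joint objective described above is conventionally written in Lagrangian form. The expression below is the generic textbook formulation, not the exact loss from the paper; the symbols D, R, and λ are notation assumed here for illustration.

```latex
\mathcal{L} \;=\; D \;+\; \lambda R, \qquad \lambda \ge 0
```

Here D is the rendering distortion, R is the estimated rate of the compressed representation, and λ sets the trade-off between them. Under this reading, training with different values of λ yields different points on the rate-distortion curve, which is one way to obtain continuous rate control.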
The training pipeline of RDO-Gaussian is illustrated above. Our optimization process comprises four main components: Gaussian pruning, adaptive spherical harmonics (SH) pruning, entropy-constrained vector quantization (ECVQ), and rendering. First, the input Gaussians are pruned by learned Gaussian masks. Next, adaptive SH masks are applied to the SHs of each Gaussian, so that different Gaussians can use different SH degrees. The covariance and color parameters are then quantized by ECVQ to obtain a more compact representation. Finally, rendering is performed on the pruned and quantized 3D Gaussians.
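The ECVQ step can be illustrated with a minimal sketch of the classic entropy-constrained assignment rule: each vector is mapped to the codeword that minimizes squared error plus λ times the codeword's ideal code length. This is a generic illustration, not the authors' implementation; the function name `ecvq_assign`, the toy codebook, and the probabilities are all hypothetical.

```python
import math

def ecvq_assign(vectors, codebook, probs, lam):
    """Assign each vector to the codeword minimising distortion + lam * rate.

    Hypothetical helper for illustration only (not the paper's code).
    vectors, codebook: lists of equal-length float lists.
    probs: usage probability of each codeword.
    lam: rate-distortion trade-off weight.
    Returns (indices, total_bits).
    """
    # Ideal code length in bits of each codeword under an entropy coder.
    rate_bits = [-math.log2(max(p, 1e-12)) for p in probs]
    indices = []
    total_bits = 0.0
    for v in vectors:
        # Squared-error distortion plus the weighted rate of each codeword.
        costs = [
            sum((a - b) ** 2 for a, b in zip(v, c)) + lam * r
            for c, r in zip(codebook, rate_bits)
        ]
        k = min(range(len(costs)), key=costs.__getitem__)
        indices.append(k)
        total_bits += rate_bits[k]
    return indices, total_bits

# Toy data (assumed): two codewords, the first far more probable (cheaper).
vectors = [[0.0, 0.0], [0.9, 0.9], [0.55, 0.55]]
codebook = [[0.0, 0.0], [1.0, 1.0]]
probs = [0.9, 0.1]

print(ecvq_assign(vectors, codebook, probs, lam=0.0))  # nearest-neighbour VQ
print(ecvq_assign(vectors, codebook, probs, lam=0.1))  # rate-aware assignment
```

With `lam=0.0` the rule reduces to plain nearest-neighbour vector quantization. With `lam=0.1` the third vector switches to the more probable codeword, accepting slightly higher distortion in exchange for fewer bits.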
The rate-distortion curves of our method on all four datasets are depicted above. The x-axis shows the rate of the compressed Gaussian representation on a logarithmic scale, and the y-axis shows the corresponding quality metric (SSIM, PSNR, or LPIPS). Both are averaged over the scenes in each dataset. We also mark the performance of 3DGS (our own implementation) and of other 3DGS compression works (taken from pre-trained models or the respective papers). Compared to 3DGS, our method achieves a compression ratio of over 40× without severe quality degradation.
@inproceedings{wang2024rdogaussian,
title={End-to-End Rate-Distortion Optimized 3D Gaussian Representation},
author={Wang, Henan and Zhu, Hanxin and He, Tianyu and Feng, Runsen and Deng, Jiajun and Bian, Jiang and Chen, Zhibo},
booktitle={European Conference on Computer Vision},
year={2024}
}