While the quality of novel-view images has improved dramatically with 3D Gaussian Splatting,
extracting specific objects from scenes remains challenging.
Isolating the individual 3D Gaussian primitives that belong to each object and handling occlusions in scenes remain far from solved.
We propose a novel object extraction method based on two key principles:
(1) object-centric reconstruction through removal of irrelevant primitives; and
(2) leveraging generative inpainting to compensate for missing observations caused by occlusions.
For pruning, we propose to remove irrelevant Gaussians by measuring how close each one is to its K nearest neighbors and removing those that are statistical outliers.
Importantly, these distances must take into account the actual spatial extent that each Gaussian covers, so we propose to use Wasserstein distances.
For inpainting, we employ an off-the-shelf diffusion-based inpainter combined with occlusion reasoning, utilizing the 3D representation of the entire scene.
Our findings highlight the crucial synergy between proper pruning and inpainting, both of which significantly enhance extraction performance.
We evaluate our method on a standard real-world dataset and introduce a synthetic dataset for quantitative analysis.
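
The pruning principle described above can be illustrated with a minimal sketch. It relies on the standard closed-form 2-Wasserstein distance between two Gaussians, which depends on both their means and their covariances. Everything else is an assumption made for illustration and is not taken from the method itself: the function names (`sqrtm_psd`, `w2_distance`, `prune_outliers`), the brute-force neighbor search, and the outlier rule (mean K-nearest-neighbor distance above the global mean plus `alpha` standard deviations).

```python
import numpy as np


def sqrtm_psd(mat):
    """Square root of a symmetric positive semi-definite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def w2_distance(mean_a, cov_a, mean_b, cov_b):
    """Closed-form 2-Wasserstein distance between two Gaussians."""
    root_b = sqrtm_psd(cov_b)
    cross = sqrtm_psd(root_b @ cov_a @ root_b)
    sq = np.sum((mean_a - mean_b) ** 2) + np.trace(cov_a + cov_b - 2.0 * cross)
    return np.sqrt(max(sq, 0.0))


def prune_outliers(means, covs, k=3, alpha=2.0):
    """Return a boolean mask of Gaussians to keep.

    Assumed rule (hypothetical): a Gaussian is an outlier when its mean
    distance to its k nearest neighbors exceeds mean + alpha * std of
    that score over all Gaussians.
    """
    n = len(means)
    dist = np.zeros((n, n))
    # Brute-force pairwise distances; fine for a sketch, too slow for real scenes.
    for i in range(n):
        for j in range(i + 1, n):
            d = w2_distance(means[i], covs[i], means[j], covs[j])
            dist[i, j] = dist[j, i] = d
    # After sorting each row, column 0 is the self-distance (0), so skip it.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    score = knn.mean(axis=1)
    return score <= score.mean() + alpha * score.std()


# Toy usage: a tight cluster of 20 Gaussians plus one far-away floater.
rng = np.random.default_rng(0)
means = np.vstack([rng.normal(scale=0.1, size=(20, 3)), [[10.0, 10.0, 10.0]]])
covs = np.tile(0.01 * np.eye(3), (21, 1, 1))
keep = prune_outliers(means, covs, k=3, alpha=2.0)
```

Because the covariance term enters the distance, two Gaussians with the same center but different sizes are not treated as identical, which is the property that plain Euclidean distance between centers lacks.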