BUOL: A Bottom-Up Framework with Occupancy-aware Lifting for Panoptic 3D Scene Reconstruction From A Single Image
This is an official release of the paper BUOL: A Bottom-Up Framework with Occupancy-aware Lifting for Panoptic 3D Scene Reconstruction From A Single Image
Tao Chu, Pan Zhang, Qiong Liu, Jiaqi Wang
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR 2023)
The results of BUOL on each dataset are shown below; PRQ_th and PRQ_st denote PRQ over thing and stuff classes, respectively. We have released the pretrained models.
| Dataset | PRQ | RSQ | RRQ | PRQ_th | PRQ_st | Download |
|---|---|---|---|---|---|---|
| 3D-FRONT | 54.05 | 63.72 | 83.14 | 49.77 | 73.34 | front3d.pth |
| Matterport3D | 14.54 | 45.91 | 31.08 | 11.02 | 25.09 | matterport3d.pth |
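For reference, the panoptic reconstruction metrics decompose per class as PRQ = RSQ × RRQ, analogous to PQ = SQ × RQ in 2D panoptic segmentation (the table reports class averages, so the product does not hold exactly on the averaged numbers). A minimal sketch of the per-class computation, not the official evaluation code, with made-up match data:

```python
def panoptic_reconstruction_quality(matched_ious, num_fp, num_fn):
    """Per-class PRQ/RSQ/RRQ from matched instance pairs (a sketch).

    matched_ious: IoU of each true-positive prediction/ground-truth match.
    num_fp, num_fn: unmatched predictions / unmatched ground truths.
    """
    tp = len(matched_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:
        return 0.0, 0.0, 0.0
    rsq = sum(matched_ious) / tp if tp else 0.0  # reconstruction quality: mean IoU over TPs
    rrq = tp / denom                             # recognition quality
    return rsq * rrq, rsq, rrq                   # PRQ = RSQ * RRQ


# Hypothetical class with two matched instances, one FP and one FN:
prq, rsq, rrq = panoptic_reconstruction_quality([0.8, 0.6], num_fp=1, num_fn=1)
```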
Create the conda environment:
```
conda create -n buol -y
conda activate buol
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge -y
```

Install MinkowskiEngine:
```
git clone https://github.com/NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
python setup.py install --blas_include_dirs=${CONDA_PREFIX}/include --blas=openblas --force_cuda
```

Install PyMarchingCubes:
```
git clone https://github.com/xheon/PyMarchingCubes.git
cd PyMarchingCubes
git clone https://gitlab.com/libeigen/eigen.git
python setup.py install
```

Install the other dependencies:
```
pip install yacs fvcore
pip install opencv-python
conda install -c conda-forge openexr-python -y
pip install pyexr
pip install matplotlib
pip install plyfile
pip install loguru
pip install scipy
```

Download front3d.pth, put it at models/front3d.pth, and run:
```
python demo.py
```

Download the datasets and put them in datasets/<dataset_name> with the directory structure shown below. Then set GPUS (e.g. GPUS: (0, 1, 2, 3)) and MODEL.EVAL: False in the config file, and train with multiple GPUs:
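The exact config schema is defined by the repository's YAML files; the fields named above might look like the following (the nesting is an assumption, check configs/front.yaml for the real layout):

```yaml
GPUS: (0, 1, 2, 3)             # use (0,) for single-GPU testing
MODEL:
  EVAL: False                  # set True for testing
  WEIGHTS: models/front3d.pth  # set to a model path for testing
```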
```
python -m torch.distributed.launch --nproc_per_node=4 main.py --cfg configs/front.yaml
```

Download a released model or train one yourself, then set MODEL.WEIGHTS to the model path. Set GPUS: (0,) and MODEL.EVAL: True in the config file, and test with a single GPU:
```
python main.py --cfg configs/front.yaml
```

3D-FRONT is a synthetic indoor dataset. We process it in the same way as Dahnert et al. (Panoptic 3D Scene Reconstruction from a Single RGB Image); the processed data can be downloaded or generated from there.
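Each frame's 2D segmentation packs pre-mapped semantics and instance ids into the last axis. A hedged numpy sketch of how such a map could be consumed; in practice you would np.load the .npz file, but its internal key names are not documented here, so this builds a synthetic array instead (the HxWx2 axis order is also an assumption):

```python
import numpy as np

# Toy stand-in for segmap_<frame_id>.mapped.npz
# (channel 0: pre-mapped semantics, channel 1: instances).
segmap = np.zeros((240, 320, 2), dtype=np.int64)
segmap[50:100, 60:120, 0] = 7  # semantic class 7 for one object (hypothetical id)
segmap[50:100, 60:120, 1] = 1  # instance id 1 on the same pixels

semantics, instances = segmap[..., 0], segmap[..., 1]

# Per-instance semantic label: majority class over that instance's pixels.
instance_labels = {}
for inst_id in np.unique(instances[instances > 0]):
    mask = instances == inst_id
    instance_labels[int(inst_id)] = int(np.bincount(semantics[mask]).argmax())
```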
```
front3d/
└── <scene_id>/
    ├── rgb_<frame_id>.png                  # Color image: 320x240x3
    ├── depth_<frame_id>.exr                # Depth image: 320x240x1
    ├── segmap_<frame_id>.mapped.npz        # 2D segmentation: 320x240x2, with 0: pre-mapped semantics, 1: instances
    ├── geometry_<frame_id>.npz             # 3D geometry: 256x256x256x1, truncated (unsigned) distance field at 3cm voxel resolution and 12-voxel truncation
    ├── segmentation_<frame_id>.mapped.npz  # 3D segmentation: 256x256x256x2, with 0: pre-mapped semantics & instances
    └── weighting_<frame_id>.mapped.npz     # 3D weighting mask: 256x256x256x1
```

Matterport3D is a real-world indoor dataset. We follow Dahnert et al. to preprocess it. In addition, we generate the depth maps and room masks by rendering the 3D scenes instead of using the original versions.
```
matterport/
└── <scene_id>/
    ├── <image_id>_i<frame_id>.png                    # Color image: 320x240x3
    ├── <image_id>_segmap<frame_id>.mapped.npz        # 2D segmentation: 320x240x2, with 0: pre-mapped semantics, 1: instances
    ├── <image_id>_intrinsics_<camera_id>.png         # Intrinsics matrix: 4x4
    ├── <image_id>_geometry<frame_id>.npz             # 3D geometry: 256x256x256x1, truncated (unsigned) distance field at 3cm voxel resolution and 12-voxel truncation
    ├── <image_id>_segmentation<frame_id>.mapped.npz  # 3D segmentation: 256x256x256x2, with 0: pre-mapped semantics & instances
    └── <image_id>_weighting<frame_id>.npz            # 3D weighting mask: 256x256x256x1
matterport_depth_gen/
└── <scene_id>/
    └── <position_id>_d<frame_id>.png                 # Depth image: 320x240x1
matterport_room_mask/
└── <scene_id>/
    └── <position_id>_rm<frame_id>.png                # Room mask: 320x240x1
```

```
@inproceedings{chu2023buol,
  title={BUOL: A Bottom-Up Framework With Occupancy-Aware Lifting for Panoptic 3D Scene Reconstruction From a Single Image},
  author={Chu, Tao and Zhang, Pan and Liu, Qiong and Wang, Jiaqi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4937--4946},
  year={2023}
}
```

This project is licensed under the MIT License; see the LICENSE file for details.
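As background, the per-image 4x4 intrinsics together with a depth map are what let 2D features be lifted to camera-space 3D points, the geometric step underlying occupancy-aware lifting. A minimal numpy sketch, not the repository's implementation (the intrinsic values are made up):

```python
import numpy as np

# Hypothetical 4x4 intrinsics: fx, fy, cx, cy sit in the top-left 3x3 block.
K = np.array([[277.0,   0.0, 160.0, 0.0],
              [  0.0, 277.0, 120.0, 0.0],
              [  0.0,   0.0,   1.0, 0.0],
              [  0.0,   0.0,   0.0, 1.0]])

depth = np.full((240, 320), 2.0)  # toy depth map: 2 m everywhere

# Back-project every pixel to camera space:
#   X = (u - cx) * z / fx,  Y = (v - cy) * z / fy,  Z = z
v, u = np.indices(depth.shape)
fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
points = np.stack([(u - cx) * depth / fx,
                   (v - cy) * depth / fy,
                   depth], axis=-1)  # shape (240, 320, 3)
```

The resulting points would then be voxelized into the 256^3 grid that the 3D geometry and segmentation targets use.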

