
MoDA: Modeling Deformable 3D Objects from Casual Videos

IJCV 2024


Installation

We tested our method with PyTorch 1.10 and CUDA 11.3 (cu113).

```shell
# clone repo
git clone https://github.com/ChaoyueSong/MoDA.git --recursive
cd MoDA

# create conda env
conda env create -f misc/moda.yml
conda activate moda

# install pytorch3d, kmeans-pytorch
pip install -e third_party/pytorch3d
pip install -e third_party/kmeans_pytorch

# install detectron2
python -m pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
```
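As a quick sanity check after installation, a small helper (a sketch; the helper name and the exact version-matching rule are our own, based on the tested torch 1.10 + cu113 combination above) can confirm the environment matches the tested configuration. In practice the two arguments would come from `torch.__version__` and `torch.version.cuda`:

```python
def matches_tested_combo(torch_version: str, cuda_version: str) -> bool:
    """Return True if the installed versions match the tested torch 1.10 + cu113 combo."""
    return torch_version.startswith("1.10") and cuda_version.startswith("11.3")

# Example: a torch 1.10 / CUDA 11.3 environment passes the check.
print(matches_tested_combo("1.10.0", "11.3"))  # True
```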

Data preparation

For the casual-human (adult7) and casual-cat (cat-pikachiu) sequences used in this work, you can download the pre-processed data as in BANMo. Please check the BANMo repository for the license of these data.

```shell
# (~8G for each)
bash misc/processed/download.sh cat-pikachiu
bash misc/processed/download.sh human-cap
```
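After the downloads finish, a small check can confirm the per-sequence directories are in place. This is only a sketch: the `JPEGImages`/`Annotations` layout under `database/DAVIS/` is an assumption based on BANMo's pre-processed data format, and the helper name is our own.

```python
def missing_banmo_dirs(existing_paths, seqname):
    """Return the BANMo-style directories expected for `seqname` that are absent.

    `existing_paths` is a list of directory paths (e.g. collected via os.walk).
    The JPEGImages/Annotations layout is an assumption based on BANMo's format.
    """
    expected = [
        f"database/DAVIS/JPEGImages/Full-Resolution/{seqname}",
        f"database/DAVIS/Annotations/Full-Resolution/{seqname}",
    ]
    return [e for e in expected
            if not any(p.startswith(e) for p in existing_paths)]
```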

For AMA and Synthetic data, please check here.

To use your own videos, or to pre-process raw videos into our format, please follow these instructions.

PoseNet weights

Download the pre-trained PoseNet weights for humans and quadrupeds.

```shell
mkdir -p mesh_material/posenet && cd "$_"
wget $(cat ../../misc/posenet.txt)
cd ../../
```
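The `wget $(cat ../../misc/posenet.txt)` call treats `misc/posenet.txt` as a whitespace-separated list of download URLs. A small parser (a sketch; the function name and the defensive `http` filter are our own) mirrors that behavior, which can be handy when scripting the download in Python instead:

```python
def parse_weight_urls(text: str):
    """Split the contents of misc/posenet.txt into individual download URLs.

    Mirrors `wget $(cat ../../misc/posenet.txt)`, which splits the file on
    whitespace; filtering for an http prefix is a defensive assumption.
    """
    return [tok for tok in text.split() if tok.startswith("http")]
```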

Training

1. cat-pikachiu (casual-cat)

```shell
# We store images as lines of pixels following BANMo.
# This only needs to run once per sequence; data are stored in
# database/DAVIS/Pixel
python preprocess/img2lines.py --seqname cat-pikachiu

# Training
bash scripts/template.sh 0,1 cat-pikachiu 10001 "no" "no"
# argv[1]: gpu ids separated by comma
# argv[2]: sequence name
# argv[3]: port for distributed training
# argv[4]: use_human, pass "" for human, "no" for others
# argv[5]: use_symm, pass "" to force x-symmetric shape

# Extract articulated meshes and render
bash scripts/render_mgpu.sh 0 cat-pikachiu logdir/cat-pikachiu-e120-b256-ft2/params_latest.pth \
  "0 1 2 3 4 5 6 7 8 9 10" 256
# argv[1]: gpu id
# argv[2]: sequence name
# argv[3]: weights path
# argv[4]: video ids separated by space
# argv[5]: resolution for marching cubes (256 by default)
```
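The weights path passed to `render_mgpu.sh` follows a fixed pattern visible in both training examples in this README: `logdir/<seqname>-e120-b256-ft2/params_latest.pth`. A tiny helper (our own; the `-e120-b256-ft2` suffix matches the run names shown here, and other epoch/batch settings would presumably produce a different directory name) makes that pattern explicit:

```python
def latest_checkpoint(seqname: str, logdir: str = "logdir") -> str:
    """Build the weights path used by render_mgpu.sh for the runs in this README."""
    return f"{logdir}/{seqname}-e120-b256-ft2/params_latest.pth"

print(latest_checkpoint("cat-pikachiu"))
# logdir/cat-pikachiu-e120-b256-ft2/params_latest.pth
```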

2. adult7 (casual-human)

```shell
python preprocess/img2lines.py --seqname adult7
bash scripts/template.sh 0,1 adult7 10001 "" ""
bash scripts/render_mgpu.sh 0 adult7 logdir/adult7-e120-b256-ft2/params_latest.pth \
  "0 1 2 3 4 5 6 7 8 9" 256
```
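The fourth argument to `render_mgpu.sh` is a space-separated list of video ids ("0 1 2 3 4 5 6 7 8 9" above, one id per video in the sequence). A one-liner helper (our own naming) generates it for a given video count:

```python
def vid_id_arg(num_videos: int) -> str:
    """Space-separated video ids, e.g. the "0 1 2 ... 9" argument to render_mgpu.sh."""
    return " ".join(str(i) for i in range(num_videos))

print(vid_id_arg(10))  # 0 1 2 3 4 5 6 7 8 9
```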

TODO

  • Initial code release.
  • Code cleaning and further checking.
  • Release the pretrained models.

Citation

```
@article{song2024moda,
  title={Moda: Modeling deformable 3d objects from casual videos},
  author={Song, Chaoyue and Wei, Jiacheng and Chen, Tianyi and Chen, Yiwen and Foo, Chuan-Sheng and Liu, Fayao and Lin, Guosheng},
  journal={International Journal of Computer Vision},
  pages={1--20},
  year={2024},
  publisher={Springer}
}
```

Acknowledgments

We thank the authors of BANMo for their code and data.
