This is the official repository for our CVPR 2024 paper RoDLA: Benchmarking the Robustness of Document Layout Analysis Models. For more results and benchmarking details, please visit our project homepage.
We introduce RoDLA, a benchmark for evaluating the robustness of Document Layout Analysis (DLA) models. RoDLA is a large-scale benchmark containing 450,000+ documents with diverse layouts and contents, together with a set of evaluation metrics that facilitate the comparison of different DLA models. We hope RoDLA can serve as a standard benchmark for the robustness evaluation of DLA models.
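To give a rough intuition for robustness scoring, the sketch below averages the relative mAP drop of a model across perturbed copies of a dataset. This is a simplified illustration only, not the exact metric definitions used in the paper; the function name and inputs are hypothetical.

```python
def mean_performance_drop(clean_map, perturbed_maps):
    """Average relative mAP drop across perturbed dataset copies.

    Illustrative only -- not the paper's exact robustness metrics.
    clean_map: mAP on the clean dataset (e.g. 0.90).
    perturbed_maps: mAP values on each perturbed copy.
    """
    drops = [(clean_map - m) / clean_map for m in perturbed_maps]
    return sum(drops) / len(drops)

# Relative drops of 10%, 20%, 30% average to ~0.2
print(mean_performance_drop(0.90, [0.81, 0.72, 0.63]))
```

A lower average drop indicates a model whose detection quality degrades less under document perturbations.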
- Perturbation Benchmark Dataset
- PubLayNet-P
- DocLayNet-P
- M6Doc-P
- Perturbation Generation and Evaluation Code
- RoDLA Model Checkpoints
- RoDLA Model Training Code
- RoDLA Model Evaluation Code
1. Clone the repository

```shell
git clone https://github.com/yufanchen96/RoDLA.git
cd RoDLA
```

2. Create a conda virtual environment

```shell
# create virtual environment
conda create -n RoDLA python=3.7 -y
conda activate RoDLA
```

3. Install benchmark dependencies
- Install Basic Dependencies
```shell
pip install torch==1.10.2+cu113 torchvision==0.11.3+cu113 torchaudio==0.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
pip install -U openmim
mim install mmcv-full==1.5.0
pip install timm==0.6.11 mmdet==2.28.1
pip install Pillow==9.5.0
pip install opencv-python termcolor yacs pyyaml scipy
```

- Install ocrodeg Dependencies
```shell
git clone https://github.com/NVlabs/ocrodeg.git
cd ./ocrodeg
pip install -e .
```

- Compile CUDA operators
```shell
cd ./model/ops_dcnv3
sh ./make.sh
python test.py
```

- You can also install the operator using `.whl` files
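Once the dependencies are installed, it can help to confirm that the pinned packages resolved to the expected versions before running the benchmark. A minimal check, assuming the package names and versions from the install commands above:

```python
from importlib.metadata import version, PackageNotFoundError

def check_versions(expected):
    """Report whether installed package versions match the pins.

    expected: mapping of package name -> pinned version string.
    Returns "ok", the mismatching version, or "missing" per package.
    """
    report = {}
    for pkg, want in expected.items():
        try:
            got = version(pkg)
            # Compare ignoring local build tags such as "+cu113"
            report[pkg] = "ok" if got.startswith(want.split("+")[0]) else got
        except PackageNotFoundError:
            report[pkg] = "missing"
    return report

# Pins taken from the install commands in this README
print(check_versions({
    "torch": "1.10.2+cu113",
    "mmcv-full": "1.5.0",
    "mmdet": "2.28.1",
    "timm": "0.6.11",
}))
```

Any entry reported as "missing" or with an unexpected version is worth reinstalling before compiling the CUDA operators.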
Download the RoDLA dataset from Google Drive to the desired root directory.
Alternatively, prepare the dataset yourself as follows:
```shell
cd ./perturbation
python apply_perturbation.py \
    --dataset_dir ./publaynet/val \
    --json_dir ./publaynet/val.json \
    --dataset_name PubLayNet-P \
    --output_dir ./PubLayNet-P \
    --pert_method all \
    --background_folder ./background \
    --metric all
```

After dataset preparation, the perturbed dataset structure would be:
```
.desired_root
└── PubLayNet-P
    ├── Background
    │   ├── Background_1
    │   │   ├── psnr.json
    │   │   ├── ms_ssim.json
    │   │   ├── cw_ssim.json
    │   │   ├── val.json
    │   │   ├── val
    │   │   │   ├── PMC538274_00004.jpg
    │   │   │   ...
    │   ├── Background_2
    │   ...
    ├── Rotation
    ...
```

To evaluate the model on a perturbed subset, run:

```shell
cd ./model
python -u test.py configs/publaynet/rodla_internimage_xl_publaynet.py \
    checkpoint_dir/rodla_internimage_xl_publaynet.pth \
    --work-dir result/rodla_internimage_publaynet/Speckle_1 \
    --eval bbox \
    --cfg-options data.test.ann_file='PubLayNet-P/Speckle/Speckle_1/val.json' \
    data.test.img_prefix='PubLayNet-P/Speckle/Speckle_1/val/'
```

- Modify the configuration file under `configs/_base_/datasets` to specify the dataset path
- Run the following command to train the model with 4 GPUs
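The evaluation command above overrides the test annotation file and image prefix via `--cfg-options` for one perturbation at one severity level. When sweeping all perturbations, it can be convenient to generate those overrides programmatically; a small sketch, where the perturbation names and severity numbering are inferred from the example paths and should be treated as assumptions:

```python
def cfg_options(root, pert, level):
    """Build the --cfg-options overrides for one perturbed subset.

    root: dataset root, e.g. "PubLayNet-P" (from the example above).
    pert: perturbation name, e.g. "Speckle" (assumed naming).
    level: severity level, e.g. 1 (assumed numbering).
    """
    sub = f"{root}/{pert}/{pert}_{level}"
    return [
        f"data.test.ann_file='{sub}/val.json'",
        f"data.test.img_prefix='{sub}/val/'",
    ]

print(cfg_options("PubLayNet-P", "Speckle", 1))
```

The two strings it prints match the `--cfg-options` arguments in the evaluation command above, so the helper can be looped over perturbations and levels to drive repeated `test.py` runs.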
```shell
sh dist_train.sh configs/publaynet/rodla_internimage_xl_2x_publaynet.py 4
```

If you find this code useful for your research, please consider citing:
```bibtex
@inproceedings{chen2024rodla,
  title={RoDLA: Benchmarking the Robustness of Document Layout Analysis Models},
  author={Yufan Chen and Jiaming Zhang and Kunyu Peng and Junwei Zheng and Ruiping Liu and Philip Torr and Rainer Stiefelhagen},
  booktitle={CVPR},
  year={2024}
}
```