
DepthART: Monocular Depth Estimation as Autoregressive Refinement Task

Bulat Gabdullin, Nina Konovalova, Nikolay Patakin, Dmitry Senushkin, Anton Konushin


Installation

Clone the repository, then install Python and the requirements:

git clone https://github.com/AIRI-Institute/DepthART
cd DepthART
conda create -n depthART python=3.10.14
conda activate depthART
pip install -r requirements.txt

Datasets

Hypersim

hypersim/
├── final_train_split.csv
├── final_val_split.csv
└── scenes/
    ├── scene_001/
    │   ├── images/
    │   │   ├── image_0001.jpg
    │   │   ├── image_0002.jpg
    │   │   └── ...
    │   └── depth/
    │       ├── depth_0001.h5
    │       ├── depth_0002.h5
    │       └── ...
    ├── scene_002/
    │   ├── images/
    │   └── depth/
    └── ...
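Given the layout above, images and depth maps within a scene can be paired by their numeric index. A minimal sketch, assuming the `image_XXXX.jpg` / `depth_XXXX.h5` naming shown in the tree (the helper name `pair_hypersim_frames` is ours, not from the repository):

```python
from pathlib import Path
import tempfile

def pair_hypersim_frames(scene_dir: Path):
    """Pair image_XXXX.jpg files with depth_XXXX.h5 files by shared index."""
    pairs = []
    for img in sorted((scene_dir / "images").glob("image_*.jpg")):
        idx = img.stem.split("_")[-1]  # e.g. "0001"
        depth = scene_dir / "depth" / f"depth_{idx}.h5"
        if depth.exists():
            pairs.append((img, depth))
    return pairs

# Demo on a throwaway directory that mimics the layout above.
root = Path(tempfile.mkdtemp())
scene = root / "scenes" / "scene_001"
(scene / "images").mkdir(parents=True)
(scene / "depth").mkdir(parents=True)
for i in (1, 2):
    (scene / "images" / f"image_{i:04d}.jpg").touch()
    (scene / "depth" / f"depth_{i:04d}.h5").touch()

pairs = pair_hypersim_frames(scene)
```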

ETH3D

eth3d/
├── samples_test.pth
├── samples_train.pth
└── data/
    ├── images/
    │   ├── image_0001.png
    │   ├── image_0002.png
    │   └── ...
    └── depth/
        ├── depth_0001.png
        ├── depth_0002.png
        └── ...

The root folder eth3d/ contains:

  • samples_test.pth: file with sample information for the test set.
  • samples_train.pth: file with sample information for the training set (if available).

The data/ subfolder contains:

  • images/: folder with original images in PNG format.
  • depth/: folder with corresponding depth maps in PNG format.
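The `samples_*.pth` index files are standard PyTorch-serialized objects readable with `torch.load`. Their exact record structure is repository-specific; the round-trip sketch below assumes a list of per-sample dicts purely for illustration:

```python
import os
import tempfile

import torch

def load_samples(path):
    """Load a samples_*.pth index file (a PyTorch-serialized object)."""
    return torch.load(path)

# Round-trip demo with a dummy index file of the assumed shape.
tmp = os.path.join(tempfile.mkdtemp(), "samples_test.pth")
torch.save([{"image": "images/image_0001.png",
             "depth": "depth/depth_0001.png"}], tmp)
samples = load_samples(tmp)
```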

IBIMS

ibims1/
├── imagelist.txt
├── rgb/
│   ├── image_0001.png
│   ├── image_0002.png
│   └── ...
├── depth/
│   ├── image_0001.png
│   ├── image_0002.png
│   └── ...
├── mask_invalid/
│   ├── image_0001.png
│   ├── image_0002.png
│   └── ...
├── mask_transp/
│   ├── image_0001.png
│   ├── image_0002.png
│   └── ...
├── calib/
│   ├── image_0001.txt
│   ├── image_0002.txt
│   └── ...
├── edges/
│   ├── image_0001.png
│   ├── image_0002.png
│   └── ...
├── mask_table/
│   ├── image_0001.png
│   ├── image_0001.txt
│   ├── image_0002.png
│   ├── image_0002.txt
│   └── ...
├── mask_floor/
│   ├── image_0001.png
│   ├── image_0001.txt
│   ├── image_0002.png
│   ├── image_0002.txt
│   └── ...
└── mask_wall/
    ├── image_0001.png
    ├── image_0001.txt
    ├── image_0002.png
    ├── image_0002.txt
    └── ...
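In this layout, `imagelist.txt` names each sample once, and every subfolder holds one file per name. A hedged sketch of resolving the per-modality paths for each entry (the subfolder list and the helper `ibims_record` are our assumptions based on the tree above):

```python
from pathlib import Path
import tempfile

# Subfolders assumed to hold one PNG per name in imagelist.txt.
PNG_DIRS = ["rgb", "depth", "mask_invalid", "mask_transp", "edges"]

def ibims_record(root: Path, name: str) -> dict:
    """Collect per-modality file paths for one imagelist.txt entry."""
    rec = {d: root / d / f"{name}.png" for d in PNG_DIRS}
    rec["calib"] = root / "calib" / f"{name}.txt"
    return rec

# Demo on a throwaway directory.
root = Path(tempfile.mkdtemp())
(root / "imagelist.txt").write_text("image_0001\nimage_0002\n")
names = (root / "imagelist.txt").read_text().split()
records = [ibims_record(root, n) for n in names]
```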

NYUv2

nyuv2/
├── data/
│   ├── nyuv2_test.pkl
│   ├── nyuv2_train.pkl
│   ├── raw_depth_test.pkl
│   ├── raw_depth_train.pkl
│   └── samples_test_0_01.pth
├── images/
│   ├── image_0001.jpg
│   ├── image_0002.jpg
│   └── ...
└── depth/
    ├── depth_0001.pkl
    ├── depth_0002.pkl
    └── ...

TUM

tum/
├── data/
│   ├── sample_0001.h5
│   ├── sample_0002.h5
│   └── ...
├── samples_test.pth
├── samples_train.pth (if available)
└── samples_val.pth (if available)

The data/ directory contains:

  • One HDF5 file (.h5) per sample in the dataset. Each file contains:
    • an 'image' dataset: the RGB image;
    • a 'depth' dataset: the corresponding depth map.
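A minimal sketch of reading one such sample with `h5py`, assuming only the `'image'` and `'depth'` dataset keys described above (shapes and dtypes in the demo are illustrative):

```python
import os
import tempfile

import h5py
import numpy as np

def read_tum_sample(path):
    """Read one TUM sample file: an 'image' and a 'depth' dataset."""
    with h5py.File(path, "r") as f:
        return np.asarray(f["image"]), np.asarray(f["depth"])

# Round-trip demo with a synthetic sample of the described structure.
path = os.path.join(tempfile.mkdtemp(), "sample_0001.h5")
with h5py.File(path, "w") as f:
    f.create_dataset("image", data=np.zeros((4, 4, 3), dtype=np.uint8))
    f.create_dataset("depth", data=np.ones((4, 4), dtype=np.float32))
image, depth = read_tum_sample(path)
```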

Evaluation

  1. Download the datasets and set each <DATASET>_PATH in config/environment.yaml to the corresponding dataset path.
  2. To exclude a dataset from evaluation, comment it out in core/datasets/eval/all.yaml.
  3. To evaluate DepthART, download the model.safetensors checkpoint and set model.model.ckpt_path in depthART.yaml to PATH_TO_CHECKPOINT/model.safetensors, then run:
python tools/eval.py --config-name=eval_depthART.yaml
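The exact keys in config/environment.yaml are defined by the repository; following the <DATASET>_PATH pattern from step 1, the file would look roughly like this (all paths illustrative, and only HYPERSIM_PATH is named explicitly elsewhere in this README):

```yaml
# config/environment.yaml -- illustrative sketch, not the shipped file.
HYPERSIM_PATH: /data/hypersim
ETH3D_PATH: /data/eth3d
IBIMS_PATH: /data/ibims1
NYUV2_PATH: /data/nyuv2
TUM_PATH: /data/tum
```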

Training

  1. Download the Hypersim dataset and set HYPERSIM_PATH in config/environment.yaml to your dataset path. Also download final_train_split.csv and final_val_split.csv and place them in your dataset directory.
  2. Download the VQ-VAE and VAR checkpoints and save them as ./vae_ch160v4096z32.pth and ./var_d16.pth respectively.
  3. Run training:
bash tools/dist_train.sh train_depthART.yaml

Citation

If you find this work useful for your research, please cite our paper:

@article{gabdullin2024depthart,
  title={DepthART: Monocular Depth Estimation as Autoregressive Refinement Task},
  author={Gabdullin, Bulat and Konovalova, Nina and Patakin, Nikolay and Senushkin, Dmitry and Konushin, Anton},
  journal={arXiv preprint arXiv:2409.15010},
  year={2024}
}