EC-Depth: Exploring the consistency of self-supervised monocular depth estimation in challenging scenes

University of Science and Technology of China
arXiv 2023

*Indicates Equal Contribution

The first-stage training framework of EC-Depth. In the first stage, we propose weak-to-strong image perturbations to construct image triplets. Following the consistency regularization paradigm, we design a perturbation-invariant depth consistency loss to propagate supervision signals from standard scenes to challenging scenes.
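
As a rough illustration of the first-stage idea, the sketch below builds an image triplet and scores how far the depth predicted for the perturbed images drifts from the depth predicted for the original. It is not the loss used in EC-Depth: the darkening perturbation, the log-depth L1 distance, and every function name (`perturb`, `make_triplet`, `log_l1`, `depth_consistency_loss`) are assumptions made for this example, and plain nested lists stand in for image and depth tensors.

```python
import math


def perturb(image, strength):
    """Hypothetical perturbation: darken every pixel by `strength` in [0, 1)."""
    return [[pixel * (1.0 - strength) for pixel in row] for row in image]


def make_triplet(image, weak=0.2, strong=0.6):
    """Return (original, weakly perturbed, strongly perturbed) images."""
    return image, perturb(image, weak), perturb(image, strong)


def log_l1(depth_a, depth_b):
    """Mean absolute difference between the log-depths of two depth maps."""
    diffs = [
        abs(math.log(a) - math.log(b))
        for row_a, row_b in zip(depth_a, depth_b)
        for a, b in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)


def depth_consistency_loss(depth_clean, depth_weak, depth_strong):
    """Penalize perturbed-image depth that deviates from clean-image depth.

    Assumption: the clean-image prediction serves as the fixed reference, so
    supervision flows from the standard scene to its perturbed versions.
    """
    return log_l1(depth_weak, depth_clean) + log_l1(depth_strong, depth_clean)
```

In a training loop, the three depth maps would come from running the same depth network on the three images of a triplet. The loss is zero only when the predictions are identical across perturbation strengths.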

The second-stage training framework of EC-Depth. In the second stage, we leverage the Mean Teacher paradigm to generate pseudo-labels for self-distillation. In particular, we propose a depth consistency-based filter (DC-Filter) and a geometric consistency-based filter (GC-Filter) to filter out unreliable pseudo-labels.
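
The second-stage pipeline can be sketched in a similar spirit. The exponential-moving-average update is the standard Mean Teacher rule. The two filters are only stand-ins for DC-Filter and GC-Filter, whose exact criteria are not given here. This example assumes a pixel is kept when the teacher's pseudo-label agrees, within a relative threshold, with (a) the teacher's prediction on a perturbed input and (b) a depth value reprojected from an adjacent frame, supplied as an input. All names and thresholds are hypothetical, and flat lists stand in for per-pixel depth maps.

```python
def ema_update(teacher, student, momentum=0.99):
    """Mean Teacher rule: teacher weights follow an EMA of student weights."""
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * student[name]
        for name in teacher
    }


def relative_agreement(depth_a, depth_b, threshold):
    """Per-pixel flag: True where the relative depth difference is small."""
    return [abs(a - b) / max(a, b) < threshold for a, b in zip(depth_a, depth_b)]


def filter_pseudo_labels(pseudo, perturbed_pred, reprojected,
                         dc_thr=0.1, gc_thr=0.1):
    """Keep pixels that pass both assumed consistency checks."""
    # Stand-in for DC-Filter: agreement across image perturbations.
    dc_mask = relative_agreement(pseudo, perturbed_pred, dc_thr)
    # Stand-in for GC-Filter: agreement with depth from an adjacent frame.
    gc_mask = relative_agreement(pseudo, reprojected, gc_thr)
    return [dc and gc for dc, gc in zip(dc_mask, gc_mask)]


def masked_l1(student_pred, pseudo, mask):
    """Self-distillation loss averaged over the retained pixels only."""
    kept = [abs(s - p) for s, p, m in zip(student_pred, pseudo, mask) if m]
    return sum(kept) / len(kept) if kept else 0.0
```

Requiring both checks to pass is a conservative choice: fewer pixels supervise the student, but a pseudo-label that fails either check never contributes to the loss.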

Abstract

Self-supervised monocular depth estimation holds significant importance in the fields of autonomous driving and robotics. However, existing methods are typically trained and tested on standard datasets, overlooking the impact of various adverse conditions prevalent in real-world applications, such as rainy days. As a result, it is commonly observed that these methods struggle to handle these challenging scenarios. To address this issue, we present EC-Depth, a novel self-supervised two-stage framework to achieve robust depth estimation. In the first stage, we propose depth consistency regularization to propagate reliable supervision from standard to challenging scenes. In the second stage, we adopt the Mean Teacher paradigm and propose a novel consistency-based pseudo-label filtering strategy to improve the quality of pseudo-labels, further improving both the accuracy and robustness of our model. Extensive experiments demonstrate that our method achieves accurate and consistent depth predictions in both standard and challenging scenarios, surpassing existing state-of-the-art methods on KITTI, KITTI-C, DrivingStereo, and NuScenes-Night benchmarks.

BibTeX

@article{zhu2023ecdepth,
  title={EC-Depth: Exploring the consistency of self-supervised monocular depth estimation in challenging scenes},
  author={Song, Ziyang and Zhu, Ruijie and Wang, Chuxin and Deng, Jiacheng and He, Jianfeng and Zhang, Tianzhu},
  journal={arXiv preprint arXiv:2310.08044},
  year={2023}
}