Self-supervised monocular depth estimation is of significant importance to autonomous driving and robotics. However, existing methods are typically trained and evaluated on clear, sunny datasets, overlooking the adverse conditions commonly encountered in real-world deployment, such as rain, low visibility, and motion blur. As a result, they often struggle in challenging scenarios and produce artifacts. To address this issue, we propose ER-Depth, a novel two-stage self-supervised framework for robust depth estimation. In the first stage, we introduce a perturbation-invariant depth consistency regularization that propagates reliable supervision from standard to challenging scenes. In the second stage, we adopt the Mean Teacher paradigm for self-distillation and present a novel consistency-based pseudo-label filtering strategy to improve pseudo-label quality. Extensive experiments demonstrate that our method is highly robust in challenging scenarios while maintaining strong performance in standard scenes, significantly outperforming existing state-of-the-art methods on the challenging KITTI-C, DrivingStereo, and NuScenes-Night benchmarks. Project page: https://ruijiezhu94.github.io/ERDepth_page. \end{abstract}
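To make the two consistency ideas above concrete, here is a minimal NumPy sketch: an L1-style consistency measure between depth maps predicted from a clean image and its perturbed counterpart (the flavor of regularizer stage one describes), and a relative-agreement mask (the flavor of pseudo-label filtering stage two describes). Function names, the relative threshold, and the toy depth maps are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def depth_consistency(depth_a, depth_b):
    """Mean absolute difference between two depth maps (illustrative loss)."""
    return float(np.mean(np.abs(depth_a - depth_b)))

def consistency_mask(depth_a, depth_b, rel_thresh=0.05):
    """Keep pixels where two predictions agree within a relative threshold.
    A hypothetical stand-in for consistency-based pseudo-label filtering."""
    return np.abs(depth_a - depth_b) / np.maximum(depth_a, 1e-6) < rel_thresh

# Toy depth maps standing in for network predictions on a clean image
# and on a perturbed (e.g. rain- or blur-augmented) version of it.
rng = np.random.default_rng(0)
d_clean = rng.uniform(1.0, 80.0, size=(4, 4))
d_perturbed = d_clean + rng.normal(0.0, 0.5, size=(4, 4))

loss = depth_consistency(d_clean, d_perturbed)  # stage-1-style regularizer
mask = consistency_mask(d_clean, d_perturbed)   # stage-2-style filtering
```

In the actual framework these quantities would be computed on student/teacher network outputs and back-propagated; the sketch only shows the shape of the consistency signal.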
@article{zhu2023ecdepth,
title={EC-Depth: Exploring the consistency of self-supervised monocular depth estimation in challenging scenes},
author={Song, Ziyang and Zhu, Ruijie and Wang, Chuxin and Deng, Jiacheng and He, Jianfeng and Zhang, Tianzhu},
journal={arXiv preprint arXiv:2310.08044},
year={2023}
}