2PCNet: Two-Phase Consistency Training for Day-to-Night Unsupervised Domain Adaptive Object Detection
CVPR 2023

[Figure: overview of 2PCNet]

2PCNet overcomes the error propagation present in the Mean Teacher framework for nighttime scenes.

Abstract

Object detection at night is a challenging problem due to the absence of night image annotations. Despite several domain adaptation methods, achieving high-precision results remains an issue. False-positive error propagation is still observed in methods using the well-established student-teacher framework, particularly for small-scale and low-light objects. This paper proposes a two-phase consistency unsupervised domain adaptation network, 2PCNet, to address these issues. In the first phase, the network takes high-confidence bounding-box predictions from the teacher and appends them to the student's region proposals. In the second phase, the teacher re-evaluates this combined set, resulting in a combination of high- and low-confidence pseudo-labels. The night images and pseudo-labels are scaled down before being used as input to the student, providing stronger small-scale pseudo-labels. To address errors that arise from low-light regions and other night-related attributes in images, we propose a night-specific augmentation pipeline called NightAug. This pipeline applies random augmentations, such as glare, blur, and noise, to daytime images. Experiments on publicly available datasets demonstrate that our method outperforms state-of-the-art methods by 20%, and also outperforms supervised models trained directly on the target data.
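The scale-down step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the helper names `scale_boxes` and `scaled_size`, the `(x1, y1, x2, y2)` pixel box format, and the factor of 0.5 are all assumptions made for illustration. The point is only that the image size and the pseudo-label boxes must be scaled by the same factor so that they stay aligned.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


def scale_boxes(boxes: List[Box], factor: float) -> List[Box]:
    """Scale pseudo-label boxes by the same factor applied to the image."""
    if not 0.0 < factor <= 1.0:
        raise ValueError("factor must be in (0, 1]")
    return [
        (x1 * factor, y1 * factor, x2 * factor, y2 * factor)
        for x1, y1, x2, y2 in boxes
    ]


def scaled_size(width: int, height: int, factor: float) -> Tuple[int, int]:
    """Return the image size after scaling down, at least 1 pixel per side."""
    return max(1, round(width * factor)), max(1, round(height * factor))


if __name__ == "__main__":
    pseudo_labels = [(100.0, 40.0, 180.0, 120.0)]
    print(scaled_size(1280, 720, 0.5))
    print(scale_boxes(pseudo_labels, 0.5))
```

A box that covers 80 by 80 pixels in the full-size night image covers 40 by 40 pixels after scaling by 0.5, so the student sees the same object as a smaller-scale example.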

Framework

[Figure: overview of the 2PCNet framework]

2PCNet consists of a student network and a teacher network. The student is trained on both labelled daytime images, which have been augmented with NightAug, and unlabelled nighttime images. The teacher is the exponential moving average (EMA) of the student and provides matched pseudo-labels for the unsupervised loss. We combine the high-confidence pseudo-labels from the teacher in phase 1 with the low-confidence pseudo-labels derived from the student's RPN proposals. This allows matched pseudo-labels to be produced in phase 2 for the weighted consistency loss, which guides the student to learn nighttime features.
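The two mechanisms described here, the EMA teacher and the region set handed to the teacher in phase 2, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's code: weights are modelled as a dictionary of floats, and the function names, the decay of 0.999, and the confidence threshold of 0.8 are hypothetical values chosen for the example.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2) format
Detection = Tuple[Box, float]            # (box, confidence score)


def ema_update(
    teacher: Dict[str, float],
    student: Dict[str, float],
    decay: float = 0.999,  # assumed value, not taken from the source
) -> Dict[str, float]:
    """Return teacher weights moved towards the student by an EMA step."""
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


def phase_two_regions(
    teacher_detections: List[Detection],
    student_proposals: List[Box],
    conf_threshold: float = 0.8,  # assumed value, not taken from the source
) -> List[Box]:
    """Build the region set that the teacher re-evaluates in phase 2.

    High-confidence teacher boxes from phase 1 are appended to the
    student's region proposals, as described in the text above.
    """
    confident = [
        box for box, score in teacher_detections if score >= conf_threshold
    ]
    return list(student_proposals) + confident


if __name__ == "__main__":
    teacher = ema_update({"w": 1.0}, {"w": 0.0}, decay=0.9)
    print(teacher)

    detections = [((0.0, 0.0, 10.0, 10.0), 0.95), ((5.0, 5.0, 8.0, 8.0), 0.3)]
    proposals = [(1.0, 1.0, 4.0, 4.0)]
    print(phase_two_regions(detections, proposals))
```

In a real detector the teacher would then score every region in this combined set, and those scores would feed the weighted consistency loss; that step depends on the detection head and is left out of the sketch.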

Results

BDD100K Dataset

[Figure: results on the BDD100K dataset]

SHIFT Dataset

[Figure: results on the SHIFT dataset]

Citation