Due to the costliness of labelled data in real-world applications, semi-supervised object detectors, underpinned by pseudo labelling, are appealing. However, handling confusing samples is nontrivial: discarding valuable confusing samples would compromise the model generalisation while using them for training would exacerbate the confirmation bias issue caused by inevitable mislabelling. To solve this problem, this paper proposes to use confusing samples proactively without label correction. Specifically, a virtual category (VC) is assigned to each confusing sample such that they can safely contribute to the model optimisation even without a concrete label. It is attributed to specifying the embedding distance between the training sample and the virtual category as the lower bound of the inter-class distance. Moreover, we also modify the localisation loss to allow high-quality boundaries for location regression. Extensive experiments demonstrate that the proposed VC learning significantly surpasses the state-of-the-art, especially with small amounts of available labels.
|Teitl||Computer Vision – ECCV 2022|
|Golygyddion||Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner|
|Nifer y tudalennau||17|
|Dynodwyr Gwrthrych Digidol (DOIs)|
|Statws||Cyhoeddwyd - 2022|
|Enw||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|