Long-tailed visual recognition with deep models: A methodological survey and evaluation

Yu Fu, Liuyu Xiang, Yumna Zahid, Guiguang Ding, Tao Mei, Qiang Shen, Jungong Han*

*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

15 Citations (SciVal)
59 Downloads (Pure)


In the real world, large-scale datasets for visual recognition typically exhibit a long-tailed distribution, where only a few classes contain adequate samples but the others have (much) fewer samples. With the advancement of data-hungry deep models for visual recognition, the low-tail power-law data distribution that biases the model training has attracted significant attention. When training with long-tailed data, the majority classes dominate the training procedure, resulting in poor performance on instance-scarce classes. To tackle this problem, numerous strategies, such as re-sampling, cost-sensitive loss, meta-learning and transfer learning, have been proposed. This paper systematically reviews contemporary approaches to the long-tailed visual recognition task and categorizes these methods by the stage at which they are applied: training, fine-tuning, or inference. Furthermore, we categorize training-stage methods into data augmentation, re-sampling strategy, cost-sensitive loss, multiple experts, and transfer learning. Next, we compare methods comprehensively on balanced test-set performance on long-tailed benchmarks and on robustness under diverse test distributions, using metrics including top-1 accuracy, per-class accuracy, multi-class ROC AUC and Expected Calibration Error (ECE). Finally, we outline the challenges in this field and future research trends. Our review and findings can serve as a tutorial for researchers working in the field of open-world deep learning.
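As an illustration of two of the metrics named in the abstract, the following is a minimal sketch of per-class accuracy and ECE in plain Python. It is not taken from the paper: the function names, the equal-width binning and the default of 10 bins are assumptions based on the common textbook definitions, and the paper's own evaluation protocol may differ.

```python
from collections import defaultdict


def per_class_accuracy(labels, preds):
    """Average of the per-class accuracies, so each class counts equally
    regardless of how many samples it has (assumed definition)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p in zip(labels, preds):
        total[y] += 1
        correct[y] += int(y == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


def expected_calibration_error(confidences, hits, n_bins=10):
    """ECE with equal-width confidence bins (assumed binning scheme).

    confidences: predicted probability of the chosen class, in [0, 1]
    hits:        1 if the prediction was correct, else 0
    """
    bins = defaultdict(list)
    for conf, hit in zip(confidences, hits):
        # A confidence of exactly 1.0 is clamped into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    n = len(confidences)
    ece = 0.0
    for members in bins.values():
        avg_conf = sum(c for c, _ in members) / len(members)
        acc = sum(h for _, h in members) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += len(members) / n * abs(acc - avg_conf)
    return ece


# Toy long-tailed case: four head-class samples, one tail-class sample,
# and a model that always predicts the head class.
labels = [0, 0, 0, 0, 1]
preds = [0, 0, 0, 0, 0]
top1 = sum(int(y == p) for y, p in zip(labels, preds)) / len(labels)
balanced = per_class_accuracy(labels, preds)
```

In this toy case the top-1 accuracy is 4 out of 5 while the per-class accuracy is only one half, which shows why a class-averaged metric is informative when the tail class is rare.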

Original language: English
Pages (from-to): 290-309
Number of pages: 20
Early online date: 07 Sept 2022
Status: Published - 14 Oct 2022
