TY - JOUR
T1 - Joint Cross-Modal and Unimodal Features for RGB-D Salient Object Detection
AU - Huang, Nianchang
AU - Liu, Yi
AU - Zhang, Qiang
AU - Han, Jungong
N1 - Funding Information:
Manuscript received January 13, 2020; revised June 3, 2020; accepted July 14, 2020. Date of publication July 24, 2020; date of current version July 30, 2021. This work was supported in part by the National Natural Science Foundation of China under Grants 61773301 and 61876140 and in part by the China Postdoctoral Support Scheme for Innovative Talents under Grant BX20180236. (Corresponding authors: Qiang Zhang; Jungong Han.) Nianchang Huang, Yi Liu, and Qiang Zhang are with the Key Laboratory of Electronic Equipment Structure Design, Ministry of Education, Xidian University, Xi’an, Shaanxi 710071, China, and also with the Center for Complex Systems, School of Mechano-Electronic Engineering, Xidian University, Xi’an, Shaanxi 710071, China.
Publisher Copyright:
© 1999-2012 IEEE.
PY - 2020/7/24
Y1 - 2020/7/24
N2 - RGB-D salient object detection is a fundamental task in computer vision. Most existing models focus on efficient ways of fusing the complementary information from RGB and depth images to improve saliency detection. However, in many real-life cases, where one of the input images has poor visual quality or contains abundant saliency cues, fusing cross-modal features does not improve detection accuracy compared with using unimodal features alone. In view of this, a novel RGB-D salient object detection model is proposed that simultaneously exploits the cross-modal features from the RGB-D images and the unimodal features from the input RGB and depth images. To this end, a Multi-branch Feature Fusion Module is presented to capture the cross-level and cross-modal complementary information between RGB-D images, as well as the cross-level unimodal features from the RGB images and the depth images separately. On top of that, a Feature Selection Module is designed to adaptively select highly discriminative features for the final saliency prediction from the fused cross-modal features and the unimodal features. Extensive evaluations on four benchmark datasets demonstrate that the proposed model outperforms the state-of-the-art approaches by a large margin.
KW - RGB-D
KW - multi-branch feature fusion and feature selection
KW - saliency detection
UR - http://www.scopus.com/inward/record.url?scp=85106071220&partnerID=8YFLogxK
U2 - 10.1109/TMM.2020.3011327
DO - 10.1109/TMM.2020.3011327
M3 - Article
SN - 1941-0077
VL - 23
SP - 2428
EP - 2441
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
M1 - 9147051
ER -