Progressively real-time video salient object detection via cascaded fully convolutional networks with motion attention

Qingping Zheng, Ying Li, Ling Zheng, Qiang Shen

Research output: Contribution to journal › Article › peer-review

20 Citations (Scopus)
131 Downloads (Pure)

Abstract

Semantics and motion are two essential cues for video salient object detection. Most existing deep-learning-based approaches extract semantic features using a single fully convolutional network with simple stacked encoders. They model the motion patterns of video objects by feeding two consecutive frames simultaneously into a convolutional LSTM network or a weight-sharing fully convolutional network. However, such approaches either produce a coarse predicted saliency map or incur significant computational overhead. In this paper, we present a novel approach based on cascaded fully convolutional networks with motion attention (abbreviated as CFCN-MA) to achieve real-time saliency detection in videos. Our key idea is to construct a two-stage cascade of fully convolutional networks that refines the saliency map from coarse to fine. We devise an optical flow-based motion attention mechanism to improve the prediction accuracy of the initial fully convolutional networks, using the popular FlowNet2-SD model, which is efficient and effective at recognizing the motion patterns of distinctive objects in videos. This method can obtain a fine saliency map with a refined region of interest. Moreover, we propose an attention-guided intersection-over-union loss (abbreviated as AIoU) to supervise the CFCN-MA model in learning a saliency map with both clear edges and complete structure. Our approach is evaluated on three popular benchmark datasets, namely DAVIS, ViSal and FBMS. Experimental results demonstrate that our method outperforms many state-of-the-art techniques while meeting the real-time demand at 27 fps.
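
The abstract names an attention-guided IoU loss but does not give its formula. The sketch below is therefore not the authors' AIoU. It is a generic per-pixel weighted soft IoU loss, shown only to illustrate how an attention map could steer an IoU-style objective. The function name `weighted_soft_iou_loss`, the `attention` weight map, and the `eps` smoothing term are all assumptions made for this illustration.

```python
from typing import Sequence

# A saliency map is represented here as a 2-D grid of floats (rows of pixels).
Map = Sequence[Sequence[float]]


def weighted_soft_iou_loss(pred: Map, target: Map, attention: Map,
                           eps: float = 1e-7) -> float:
    """Return 1 - weighted soft IoU between a prediction and a mask.

    pred:      predicted saliency values in [0, 1]
    target:    ground-truth mask values in {0, 1}
    attention: non-negative per-pixel weights (hypothetical; for example,
               larger values where errors should be penalised more)
    eps:       small constant that avoids division by zero
    """
    intersection = 0.0
    union = 0.0
    for p_row, t_row, a_row in zip(pred, target, attention):
        for p, t, a in zip(p_row, t_row, a_row):
            intersection += a * p * t          # soft overlap
            union += a * (p + t - p * t)       # soft union
    return 1.0 - (intersection + eps) / (union + eps)


if __name__ == "__main__":
    pred = [[1.0, 1.0]]      # second pixel is a false positive
    target = [[1.0, 0.0]]
    uniform = [[1.0, 1.0]]
    focused = [[1.0, 3.0]]   # weight the false-positive pixel more heavily
    print(weighted_soft_iou_loss(pred, target, uniform))
    print(weighted_soft_iou_loss(pred, target, focused))
```

In this toy case, raising the weight on the false-positive pixel increases the loss from roughly 0.5 to roughly 0.75. That is the general effect an attention map has on such an objective: mistakes in highly weighted regions cost more.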

Original language: English
Pages (from-to): 465-475
Number of pages: 11
Journal: Neurocomputing
Volume: 467
Early online date: 19 Oct 2021
Digital Object Identifiers (DOIs)
Status: Published - 07 Jan 2022
