Abstract
Semantics and motion are two essential cues for video salient object detection. Most existing deep-learning-based approaches extract semantic features using only one fully convolutional network with simple stacked encoders. They model the motion patterns of video objects by feeding two consecutive frames simultaneously into a convolutional LSTM network or a weight-sharing fully convolutional network. However, such approaches either produce a coarse predicted saliency map or incur significant computational overhead. In this paper, we present a novel approach based on cascaded fully convolutional networks with motion attention (abbreviated as CFCN-MA) to achieve real-time saliency detection in videos. Our key idea is to construct twofold fully convolutional networks so that the saliency map is obtained from coarse to fine. We devise an optical-flow-based motion attention mechanism to improve the prediction accuracy of the initial fully convolutional networks, using the popular FlowNet2-SD model, which is efficient and effective at recognizing the motion patterns of distinctive objects in videos. This method yields a fine saliency map with a refined region of interest. Moreover, we propose an attention-guided intersection-over-union loss (abbreviated as AIoU) to supervise the CFCN-MA model in learning a saliency map with both clear edges and complete structure. Our approach is evaluated on three popular benchmark datasets, namely DAVIS, ViSal and FBMS. Experimental results demonstrate that our method outperforms many state-of-the-art techniques while meeting the real-time demand at 27 fps.
Original language | English |
---|---|
Pages (from-to) | 465-475 |
Number of pages | 11 |
Journal | Neurocomputing |
Volume | 467 |
Early online date | 19 Oct 2021 |
DOIs | |
Publication status | Published - 07 Jan 2022 |
Keywords
- Cascaded fully convolutional networks
- Motion attention
- Optical flow
- Video salient object detection
Fingerprint
Dive into the research topics of 'Progressively real-time video salient object detection via cascaded fully convolutional networks with motion attention'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Ser Cymru: Reconstruction of Missing Information in Optical Remote Sensing Images Based on Deep Learning and Knowledge Interpolation
Shen, Q. (PI)
01 Oct 2020 → 28 Feb 2023
Project: Externally funded research