Abstract
Recent years have witnessed a big leap in automatic visual saliency detection, attributed to advances in deep learning, especially Convolutional Neural Networks (CNNs). However, inferring the saliency of each image part separately, as most CNN-based methods do, inevitably leads to an incomplete segmentation of the salient object. In this paper, we describe how to use the part-object relations endowed by the Capsule Network (CapsNet) to solve problems that fundamentally hinge on relational inference for visual saliency detection. Concretely, we put in place a two-stream strategy, termed Two-Stream Part-Object RelaTional Network (TSPORTNet), to implement CapsNet, aiming to reduce both the network complexity and the possible redundancy during capsule routing. Additionally, taking into account the correlations of capsule types from the preceding training images, a correlation-aware capsule routing algorithm is developed for more accurate capsule assignments at the training stage, which also speeds up the training dramatically. By exploring part-object relationships, TSPORTNet produces a capsule wholeness map, which in turn aids multi-level features in generating the final saliency map. Experimental results on five widely used benchmarks show that our framework consistently achieves state-of-the-art performance.
Original language | English |
---|---|
Article number | 9334445 |
Pages (from-to) | 3688-3704 |
Number of pages | 17 |
Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
Volume | 44 |
Issue number | 7 |
Early online date | 22 Jan 2021 |
DOIs | |
Publication status | Published - 01 Jul 2022 |
Keywords
- Object detection
- Routing
- Feature extraction
- Streaming media
- Training
- Task analysis
- Saliency detection
- Salient object detection
- capsule network
- part-object relationships
- RECOGNITION
- MODEL
- IMAGE
- ATTENTION
- EXTRACTION
- Neural Networks, Computer
- Object Attachment
- Benchmarking
- Algorithms