SiamCDA: Complementarity- and distractor-aware RGB-T tracking based on Siamese network

Tianlu Zhang, Xueru Liu, Qiang Zhang, Jungong Han

Research output: Contribution to journal › Article › peer-review



Recent years have witnessed the widespread use of Siamese networks for RGB-T tracking, motivated by their remarkable success in RGB object tracking. Despite running faster than real time, existing RGB-T Siamese trackers suffer from low accuracy and poor robustness compared to other state-of-the-art RGB-T trackers. To address these issues, this paper develops a new complementarity- and distractor-aware RGB-T tracker based on a Siamese network, referred to as SiamCDA. It comprises several modules. First, a feature pyramid network (FPN) is incorporated into the Siamese network to capture the cross-level information within the unimodal features extracted from the RGB or thermal images. Next, a complementarity-aware multi-modal feature fusion module (CA-MF) is specially designed to capture the cross-modal information between RGB features and thermal features. In the final bounding box selection phase, a distractor-aware region proposal selection module (DAS) further enhances the robustness of the tracker. Beyond these technical modules, we also build a large-scale, diverse synthetic RGB-T tracking dataset containing more than 4831 pairs of synthetic RGB-T videos and 12K synthetic RGB-T images. Extensive experiments on three RGB-T tracking benchmark datasets demonstrate the outstanding performance of the proposed tracker, with a tracking speed of over 37 frames per second (FPS).
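The abstract names a fusion step and a proposal selection step without giving their internals. The sketch below is only a toy illustration of those two ideas, not the paper's CA-MF or DAS modules. It rests on stated assumptions: features are plain lists of floats, modality weights come from a softmax over mean absolute activation, and distractors are suppressed by penalizing proposals far from the previous target center. The names fuse_features, select_proposal, and penalty are hypothetical.

```python
import math


def fuse_features(rgb, thermal):
    """Blend two equal-length feature vectors with softmax weights.

    Assumption: each modality's weight is derived from its mean absolute
    activation. This is a stand-in, not the CA-MF module from the paper.
    """
    if not rgb or len(rgb) != len(thermal):
        raise ValueError("feature vectors must be non-empty and equal in length")
    e_rgb = sum(abs(v) for v in rgb) / len(rgb)
    e_th = sum(abs(v) for v in thermal) / len(thermal)
    m = max(e_rgb, e_th)  # subtract the max for numerical stability
    w_rgb = math.exp(e_rgb - m)
    w_th = math.exp(e_th - m)
    total = w_rgb + w_th
    w_rgb, w_th = w_rgb / total, w_th / total
    return [w_rgb * r + w_th * t for r, t in zip(rgb, thermal)]


def select_proposal(proposals, prev_center, penalty=0.01):
    """Return the proposal with the best distance-penalized score.

    proposals: list of (score, (cx, cy)) tuples.
    Assumption: a distractor tends to lie far from the previous target
    center, so its score is reduced in proportion to that distance.
    This is a stand-in, not the DAS module from the paper.
    """
    def adjusted(proposal):
        score, (cx, cy) = proposal
        dist = math.hypot(cx - prev_center[0], cy - prev_center[1])
        return score - penalty * dist

    return max(proposals, key=adjusted)
```

For example, select_proposal([(0.9, (100, 100)), (0.8, (10, 10))], (10, 10)) prefers the lower-scoring proposal at (10, 10), because the higher-scoring one lies far from the previous target position.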
Original language: English
Pages (from-to): 1403-1417
Number of pages: 15
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Issue number: 3
Early online date: 09 Apr 2021
Publication status: Published - 01 Mar 2022


  • RGB-T tracking
  • complementarity-aware fusion
  • distractor-aware region proposal selection
  • large-scale synthetic dataset
  • siamese network

