TY - JOUR
T1 - The Effectiveness of a Simplified Model Structure for Crowd Counting
AU - Gao, Xingen
AU - Chen, Lei
AU - Chao, Fei
AU - Chang, Xiang
AU - Gao, Xinghang
AU - Jiang, Huali
AU - Liu, Li
AU - Zhang, Hongyi
N1 - Publisher Copyright: © 1963-2012 IEEE.
PY - 2025/3/26
Y1 - 2025/3/26
N2 - Crowd counting, a method for measuring crowd sizes, has seen significant advancements with deep learning techniques, which have proven highly effective for accurate estimation. However, improvements in these methods’ accuracy are frequently achieved at the cost of more intricate model architectures. This paper discusses how to construct high-performance crowd counting models using only simple structures. We propose the Fuss-Free Structure, a simple and efficient architecture with a backbone network and multi-scale feature fusion. It exhibits notable adaptability: replacing its components does not lead to a substantial decline in performance. The multi-scale feature fusion structure is an uncomplicated design that consists of three distinct pathways, each featuring only a focus transition module. It combines the features from these pathways directly by concatenation. With appropriately selected components, the proposed structure was trained and evaluated on four public datasets, achieving accuracy that rivals that of existing complex models. Furthermore, a comprehensive evaluation is conducted by replacing the backbones of various models, such as CCTrans and the proposed structure, with different networks, including MobileNet-v3, ConvNeXt-Tiny, and Swin-Transformer-Small. The experimental results further indicate that excellent crowd counting performance can be achieved with the proposed simple structure.
AB - Crowd counting, a method for measuring crowd sizes, has seen significant advancements with deep learning techniques, which have proven highly effective for accurate estimation. However, improvements in these methods’ accuracy are frequently achieved at the cost of more intricate model architectures. This paper discusses how to construct high-performance crowd counting models using only simple structures. We propose the Fuss-Free Structure, a simple and efficient architecture with a backbone network and multi-scale feature fusion. It exhibits notable adaptability: replacing its components does not lead to a substantial decline in performance. The multi-scale feature fusion structure is an uncomplicated design that consists of three distinct pathways, each featuring only a focus transition module. It combines the features from these pathways directly by concatenation. With appropriately selected components, the proposed structure was trained and evaluated on four public datasets, achieving accuracy that rivals that of existing complex models. Furthermore, a comprehensive evaluation is conducted by replacing the backbones of various models, such as CCTrans and the proposed structure, with different networks, including MobileNet-v3, ConvNeXt-Tiny, and Swin-Transformer-Small. The experimental results further indicate that excellent crowd counting performance can be achieved with the proposed simple structure.
KW - crowd counting
KW - focus transition module
KW - Fuss-Free Network
KW - multi-scale feature fusion
KW - simplified structure
UR - http://www.scopus.com/inward/record.url?scp=105002349194&partnerID=8YFLogxK
U2 - 10.1109/TIM.2025.3554288
DO - 10.1109/TIM.2025.3554288
M3 - Article
AN - SCOPUS:105002349194
SN - 0018-9456
VL - 74
JO - IEEE Transactions on Instrumentation and Measurement
JF - IEEE Transactions on Instrumentation and Measurement
M1 - 5023411
ER -