TY - JOUR
T1 - Where to Prune: Using LSTM to Guide Data-Dependent Soft Pruning
AU - Ding, Guiguang
AU - Zhang, Shuo
AU - Jia, Zizhou
AU - Zhong, Jing
AU - Han, Jungong
N1 - Funding Information:
Manuscript received March 6, 2019; revised April 18, 2020 and July 29, 2020; accepted October 2, 2020. Date of publication November 16, 2020; date of current version November 20, 2020. This work was supported by the Natural Science Foundation of China under Grant 61925107 and Grant U1936202. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Ming-Ming Cheng. (Corresponding author: Jungong Han.) Guiguang Ding, Zizhou Jia, and Jing Zhong are with the School of Software, Tsinghua University, Beijing 100084, China (e-mail: [email protected]).
Publisher Copyright:
© 1992-2012 IEEE.
PY - 2020/11/20
Y1 - 2020/11/20
N2 - While convolutional neural networks (CNNs) have achieved overwhelming success in various vision tasks, their heavy computational cost and storage overhead limit practical use on mobile or embedded devices. Recently, compressing CNN models has attracted considerable attention, where pruning CNN filters, also known as channel pruning, has gained great popularity due to its high compression rate. In this paper, a new channel pruning framework is proposed that can significantly reduce the computational complexity while maintaining sufficient model accuracy. Unlike most existing approaches, which seek to-be-pruned filters layer by layer, we argue that choosing appropriate layers for pruning is more crucial, as it can yield a greater reduction in complexity with a smaller drop in performance. To this end, we utilize a long short-term memory (LSTM) network to learn the hierarchical characteristics of a network and generate a global pruning scheme. On top of it, we propose a data-dependent soft pruning method, dubbed Squeeze-Excitation-Pruning (SEP), which does not physically prune any filters but instead selectively excludes some kernels from the forward and backward computations according to the pruning scheme. Compared with hard pruning, our soft pruning better retains the capacity and knowledge of the baseline model. Experimental results demonstrate that our approach still achieves comparable accuracy even when reducing floating-point operations (FLOPs) by 70.1% for VGG and 47.5% for ResNet-56.
AB - While convolutional neural networks (CNNs) have achieved overwhelming success in various vision tasks, their heavy computational cost and storage overhead limit practical use on mobile or embedded devices. Recently, compressing CNN models has attracted considerable attention, where pruning CNN filters, also known as channel pruning, has gained great popularity due to its high compression rate. In this paper, a new channel pruning framework is proposed that can significantly reduce the computational complexity while maintaining sufficient model accuracy. Unlike most existing approaches, which seek to-be-pruned filters layer by layer, we argue that choosing appropriate layers for pruning is more crucial, as it can yield a greater reduction in complexity with a smaller drop in performance. To this end, we utilize a long short-term memory (LSTM) network to learn the hierarchical characteristics of a network and generate a global pruning scheme. On top of it, we propose a data-dependent soft pruning method, dubbed Squeeze-Excitation-Pruning (SEP), which does not physically prune any filters but instead selectively excludes some kernels from the forward and backward computations according to the pruning scheme. Compared with hard pruning, our soft pruning better retains the capacity and knowledge of the baseline model. Experimental results demonstrate that our approach still achieves comparable accuracy even when reducing floating-point operations (FLOPs) by 70.1% for VGG and 47.5% for ResNet-56.
KW - Deep learning
KW - computer vision
KW - image classification
KW - model compression
UR - http://www.scopus.com/inward/record.url?scp=85096889014&partnerID=8YFLogxK
U2 - 10.1109/TIP.2020.3035028
DO - 10.1109/TIP.2020.3035028
M3 - Article
C2 - 33186105
SN - 1941-0042
VL - 30
SP - 293
EP - 304
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
M1 - 9258919
ER -
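
The abstract describes the SEP gate only at a high level. Below is a minimal, hypothetical PyTorch sketch, not taken from the paper or from this record, of what such a data-dependent soft-pruning gate could look like: a squeeze-excitation block scores channels from the input, and the lowest-scoring channels are zeroed out rather than physically removed. The names SEPGate, prune_ratio, and reduction are illustrative assumptions; in the paper, the per-layer pruning ratio would come from the LSTM-generated scheme.

# Hypothetical sketch of a Squeeze-Excitation-Pruning (SEP) style gate:
# channels are masked (soft-pruned) in forward/backward passes instead of
# being deleted, so the baseline capacity can later be restored.
import torch
import torch.nn as nn

class SEPGate(nn.Module):
    def __init__(self, channels: int, prune_ratio: float, reduction: int = 16):
        super().__init__()
        self.num_pruned = int(channels * prune_ratio)   # channels to mask (assumed from the pruning scheme)
        self.squeeze = nn.AdaptiveAvgPool2d(1)          # global average pooling ("squeeze")
        self.excite = nn.Sequential(                    # bottleneck MLP ("excitation")
            nn.Linear(channels, max(channels // reduction, 1)),
            nn.ReLU(inplace=True),
            nn.Linear(max(channels // reduction, 1), channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        scores = self.excite(self.squeeze(x).view(b, c))   # data-dependent channel scores
        if self.num_pruned > 0:
            # Zero the lowest-scoring channels instead of removing their filters.
            idx = scores.topk(self.num_pruned, dim=1, largest=False).indices
            mask = torch.ones_like(scores).scatter_(1, idx, 0.0)
            scores = scores * mask
        return x * scores.view(b, c, 1, 1)                  # gate/re-weight channels

# Usage example: gate a conv layer's 64-channel output with a 50% prune ratio.
gate = SEPGate(channels=64, prune_ratio=0.5)
y = gate(torch.randn(2, 64, 32, 32))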