Stripe noise in multispectral remote sensing images, which can result from instrument instability, slit contamination, and light interference, significantly degrades image quality and impairs high-level visual tasks. Because adjacent sensors respond to the same ground object with different gains and offsets, the local consistency of homogeneous regions in striped images is damaged, which gives stripe noise its structural characteristics. This damage appears as increased differences between columns of the remote sensing image. Destriping can therefore be viewed as a process of improving the local consistency of homogeneous regions and the global uniformity of the whole image. In recent years, convolutional neural network (CNN)-based models have been introduced to destriping tasks and have achieved strong results thanks to their powerful representation ability. To leverage both CNNs and the structural characteristics of stripe noise, we propose a multi-scaled column-spatial correction network (CSCNet) for remote sensing image destriping, in which the local structural characteristics of stripe noise and the global contextual information of the image are both explored at multiple feature scales. More specifically, a column-based correction module (CCM) and a spatial-based correction module (SCM) are designed to improve local consistency and global uniformity from the perspectives of column correction and full-image correction, respectively. Moreover, a feature fusion module based on the channel attention mechanism is introduced to obtain discriminative features from the different modules and scales. We compare the proposed model against both traditional and deep learning methods on simulated and real remote sensing images. The results indicate that CSCNet effectively removes image stripes and outperforms state-of-the-art methods in both qualitative and quantitative assessments.
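The link between per-sensor gains and offsets and the increased differences between columns can be illustrated with a minimal NumPy sketch. It is not the simulation protocol or the metric used for CSCNet. It assumes a simple linear degradation in which each column is scaled by its own gain and shifted by its own offset. The function names `add_column_stripes` and `mean_column_difference` and the parameters `gain_std` and `offset_std` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def add_column_stripes(image, gain_std=0.05, offset_std=0.02, seed=0):
    """Apply an assumed per-column gain and offset to a 2-D image.

    Each column j becomes gain[j] * image[:, j] + offset[j], mimicking
    adjacent sensors that respond differently to the same ground object.
    """
    rng = np.random.default_rng(seed)
    n_cols = image.shape[1]
    gains = 1.0 + rng.normal(0.0, gain_std, size=n_cols)
    offsets = rng.normal(0.0, offset_std, size=n_cols)
    # Broadcast the per-column parameters across all rows.
    return image * gains[np.newaxis, :] + offsets[np.newaxis, :]


def mean_column_difference(image):
    """Mean absolute difference between the means of adjacent columns."""
    col_means = image.mean(axis=0)
    return float(np.abs(np.diff(col_means)).mean())


# A perfectly homogeneous region: every pixel has the same value.
clean = np.full((64, 64), 0.5)
striped = add_column_stripes(clean)

print("clean  :", mean_column_difference(clean))
print("striped:", mean_column_difference(striped))
```

Under these assumptions the homogeneous patch has no column-to-column difference before degradation and a clearly nonzero one afterwards, while each striped column stays constant along its length. That is the loss of local consistency that a destriping method aims to reverse.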