TY - JOUR
T1 - Enhancing Dongba Pictograph Recognition Using Convolutional Neural Networks and Data Augmentation Techniques
AU - Li, Shihui
AU - Nguyen, Lan Thi
AU - Chansanam, Wirapong
AU - Iam-On, Natthakan
AU - Boongoen, Tossapon
N1 - Publisher Copyright:
© 2025 by the authors.
PY - 2025/5/31
Y1 - 2025/5/31
N2 - The recognition of Dongba pictographs presents significant challenges due to the pitfalls in traditional feature extraction methods, classification algorithms’ high complexity, and generalization ability. This study proposes a convolutional neural network (CNN)-based image classification method to enhance the accuracy and efficiency of Dongba pictograph recognition. The research begins with collecting and manually categorizing Dongba pictograph images, followed by these preprocessing steps to improve image quality: normalization, grayscale conversion, filtering, denoising, and binarization. The dataset, comprising 70,000 image samples, is categorized into 18 classes based on shape characteristics and manual annotations. A CNN model is then trained using a dataset that is split into training (with 70% of all the samples), validation (20%), and test (10%) sets. In particular, data augmentation techniques, including rotation, affine transformation, scaling, and translation, are applied to enhance classification accuracy. Experimental results demonstrate that the proposed model achieves a classification accuracy of 99.43% and consistently outperforms other conventional methods, with its performance peaking at 99.84% under optimized training conditions—specifically, with 75 training epochs and a batch size of 512. This study provides a robust and efficient solution for automatically classifying Dongba pictographs, contributing to their digital preservation and scholarly research. By leveraging deep learning techniques, the proposed approach facilitates the rapid and precise identification of Dongba hieroglyphs, supporting the ongoing efforts in cultural heritage preservation and the broader application of artificial intelligence in linguistic studies.
AB - The recognition of Dongba pictographs presents significant challenges due to the pitfalls in traditional feature extraction methods, classification algorithms’ high complexity, and generalization ability. This study proposes a convolutional neural network (CNN)-based image classification method to enhance the accuracy and efficiency of Dongba pictograph recognition. The research begins with collecting and manually categorizing Dongba pictograph images, followed by these preprocessing steps to improve image quality: normalization, grayscale conversion, filtering, denoising, and binarization. The dataset, comprising 70,000 image samples, is categorized into 18 classes based on shape characteristics and manual annotations. A CNN model is then trained using a dataset that is split into training (with 70% of all the samples), validation (20%), and test (10%) sets. In particular, data augmentation techniques, including rotation, affine transformation, scaling, and translation, are applied to enhance classification accuracy. Experimental results demonstrate that the proposed model achieves a classification accuracy of 99.43% and consistently outperforms other conventional methods, with its performance peaking at 99.84% under optimized training conditions—specifically, with 75 training epochs and a batch size of 512. This study provides a robust and efficient solution for automatically classifying Dongba pictographs, contributing to their digital preservation and scholarly research. By leveraging deep learning techniques, the proposed approach facilitates the rapid and precise identification of Dongba hieroglyphs, supporting the ongoing efforts in cultural heritage preservation and the broader application of artificial intelligence in linguistic studies.
KW - deep learning
KW - Dongba pictographs
KW - data augmentation
KW - convolutional neural network
KW - digital humanities
KW - image processing
KW - image classification
UR - http://www.scopus.com/inward/record.url?scp=105006533613&partnerID=8YFLogxK
U2 - 10.3390/info16050362
DO - 10.3390/info16050362
M3 - Article
SN - 2078-2489
VL - 16
JO - Information
JF - Information
IS - 5
M1 - 362
ER -