TY - GEN
T1 - CymruFluency - A Fusion Technique and a 4D Welsh Dataset for Welsh Fluency Analysis
AU - Bali, Arvinder Pal Singh
AU - Tam, Gary K.L.
AU - Siris, Avishek
AU - Andrews, Gareth
AU - Lai, Yukun
AU - Tiddeman, Bernie
AU - Ffrancon, Gwenno
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026/1/2
Y1 - 2026/1/2
N2 - Welsh is a linguistically rich yet under-resourced minority language. Despite its cultural significance, automated fluency assessment remains largely unexplored due to limited datasets and tools. Existing models focus on high-resource languages, leaving Welsh without sufficient multi-modal resources. To address this, we introduce CymruFluency, the first 4D dataset for Welsh fluency assessment, capturing both audio and 3D lip movements with expert-annotated fluency scores. Building on this, we propose a multi-modal fluency classification framework that combines audio features (mel spectrograms) and manually annotated 3D lip landmarks. Our fusion approach significantly improves fluency prediction over unimodal models, emphasizing the critical role of 3D lip dynamics in Welsh learning. This research advances minority language processing by integrating articulatory features into fluency evaluation, offering a powerful tool for Welsh language learning, assessment, and preservation. Project page: https://github.com/arvinsingh/CymruFluency.
UR - https://www.scopus.com/pages/publications/105027201227
U2 - 10.1007/978-3-032-07343-3_8
DO - 10.1007/978-3-032-07343-3_8
M3 - Conference Proceeding (ISBN)
AN - SCOPUS:105027201227
SN - 9783032073426
T3 - Lecture Notes in Computer Science
SP - 96
EP - 108
BT - Advanced Concepts for Intelligent Vision Systems - 22nd International Conference, ACIVS 2025, Proceedings
A2 - Blanc-Talon, Jacques
A2 - Delmas, Patrice
A2 - Takahashi, Hiroki
A2 - Yasuhiro, Minami
PB - Springer Nature
T2 - 22nd International Conference on Advanced Concepts for Intelligent Vision Systems, ACIVS 2025
Y2 - 28 July 2025 through 30 July 2025
ER -