Nguyen, Le Minh (2022) Improving Luxembourgish Speech Recognition with Cross-Lingual Speech Representations. Master thesis, Voice Technology (VT).
Abstract
Luxembourgish is a West Germanic language spoken by roughly 390,000 people, mainly in Luxembourg. It remains one of Europe's under-described and under-resourced languages and has not been extensively investigated in the context of speech recognition. We explore self-supervised multilingual learning of Luxembourgish speech representations for use in the downstream speech recognition task. This thesis project improves on our previous work on Luxembourgish wav2vec 2.0 models in a monolingual and transfer learning context. Our experiments show that learning cross-lingual representations is essential for low-resource languages such as Luxembourgish. Learning cross-lingual representations and rescoring the output transcriptions with a language model, while using only 4 hours of labelled speech, achieves a word error rate of 15.1%, improving on the previous best result for Luxembourgish speech recognition by 33.1% relative and 7.5 percentage points absolute. Increasing the amount of labelled speech to 14 hours yields a significant performance gain, resulting in a 9.3% word error rate.
Item Type: | Thesis (Master) |
---|---|
Name supervisor: | Nayak, S. and Coler, M.L. |
Date Deposited: | 09 Sep 2022 08:53 |
Last Modified: | 09 Sep 2022 08:53 |
URI: | https://campus-fryslan.studenttheses.ub.rug.nl/id/eprint/223 |