Improving Luxembourgish Speech Recognition with Cross-Lingual Speech Representations

Nguyen, Le Minh (2022) Improving Luxembourgish Speech Recognition with Cross-Lingual Speech Representations. Master thesis, Voice Technology (VT).

Abstract

Luxembourgish is a West Germanic language spoken by roughly 390,000 people, mainly in Luxembourg. It remains one of Europe's under-described and under-resourced languages and has not been extensively investigated in the context of speech recognition. We explore self-supervised multilingual learning of Luxembourgish speech representations for the downstream speech recognition task. This thesis project improves on our previous work on Luxembourgish wav2vec 2.0 models in monolingual and transfer learning settings. Our experiments show that learning cross-lingual representations is essential for low-resource languages such as Luxembourgish. Learning cross-lingual representations and rescoring the output transcriptions with a language model, while using only 4 hours of labelled speech, achieves a word error rate of 15.1%, improving on the previous best result for Luxembourgish speech recognition by 33.1% relative and 7.5 percentage points absolute. Increasing the amount of labelled speech to 14 hours yields a significant performance gain, resulting in a word error rate of 9.3%.
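
The word error rate and the relative and absolute improvements quoted in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's evaluation code: the function names are hypothetical, the sentences are toy data, and the 22.6% baseline is an assumption inferred by adding the reported 7.5-point absolute gain to the 15.1% result, so rounding may differ slightly from the reported 33.1%.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def absolute_reduction(baseline: float, improved: float) -> float:
    """Difference in error rate, in percentage points if inputs are percentages."""
    return baseline - improved


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error that was removed."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Toy example: one substitution and one deletion over four reference words.
    print(word_error_rate("a b c d", "a x c"))
    # Assumed baseline of 22.6% (inferred from the abstract, not stated there).
    print(absolute_reduction(22.6, 15.1))
    print(relative_reduction(22.6, 15.1))
```

With the assumed baseline, the relative reduction comes out at roughly one third, which is in line with the figure reported in the abstract.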

Item Type: Thesis (Master)
Supervisors: Nayak, S. and Coler, M.L.
Date Deposited: 09 Sep 2022 08:53
Last Modified: 09 Sep 2022 08:53
URI: https://campus-fryslan.studenttheses.ub.rug.nl/id/eprint/223
