Faste, Sarah (2022) WAV2VEC 2.0 FOR IRISH ASR: A MULTILINGUAL APPROACH TO UNDER-RESOURCED LANGUAGES. Master thesis, Voice Technology (VT).
Abstract
Under-resourced languages have little or no recorded data, which makes building automatic speech recognition (ASR) systems for them exceptionally challenging. Multilingual methods address this by using data from other languages, so that systems for under-resourced languages can be built from very small datasets. Wav2Vec 2.0 XLSR-53 is a large multilingual model pre-trained on 53 languages in a self-supervised manner. In this research, I conclude that it is possible to fine-tune the XLSR-53 model with less than 5 hours of data and achieve a word error rate (WER) of less than 50%. Fine-tuned on the Irish dataset from Mozilla’s Common Voice, with only 4 hours of validated data, the multilingual Wav2Vec 2.0 XLSR-53 achieves a WER of 46.88%.
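For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the evaluation code used in the thesis; the function name `word_error_rate` and the example sentences are illustrative assumptions, and real evaluations usually normalise case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Illustrative helper (hypothetical name); tokenises on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one of five reference words is dropped.
print(word_error_rate("tá an lá go breá", "tá an lá breá"))
```

Under this definition, a WER of 46.88% means that the number of word-level errors is a little under half the number of words in the reference transcripts.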
Item Type: | Thesis (Master) |
---|---|
Name supervisor: | Nayak, S. and Coler, M.L. |
Date Deposited: | 09 Sep 2022 08:24 |
Last Modified: | 09 Sep 2022 08:24 |
URI: | https://campus-fryslan.studenttheses.ub.rug.nl/id/eprint/234 |