Liu, Xueying (2024) Parameter-Efficient Fine-Tuning on Multilingual ASR Whisper Model for Frisian. Master thesis, Voice Technology (VT).
PDF: MA-5521904-X-Liu.pdf (1MB)
Abstract
Despite Whisper's proven multilingual competence, the model struggles to recognize low-resource languages (LRLs). The typical way to improve its performance on LRLs is to fully fine-tune the model on additional target-language data; however, given the model's large parameter count and the limited amount of available data, this approach is resource-intensive and prone to overfitting. To address the high computational cost of full fine-tuning and the overfitting problem, parameter-efficient fine-tuning (PEFT) methods such as Low-Rank Adaptation (LoRA) have been proposed as a feasible alternative. In this work, I examined the effectiveness of LoRA on the Whisper model for the low-resource language Frisian. The results showed that, while training only 1.4% of the model's parameters and using less GPU memory, LoRA achieved word error rate (WER) performance comparable to full fine-tuning on Frisian. I also found that low-resource languages benefited more from LoRA than high-resource languages. This study offers valuable insights for practical ASR system development toward efficiency and inclusion, particularly in multilingual and low-resource contexts.
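For readers who want to reproduce this kind of setup, the sketch below shows how LoRA adapters can be attached to a pretrained Whisper checkpoint with the Hugging Face `peft` library. The checkpoint name, rank `r`, scaling factor `lora_alpha`, and the choice of attention projections as `target_modules` are illustrative assumptions, not the exact configuration used in the thesis.

```python
# A minimal sketch of a LoRA fine-tuning setup for Whisper, assuming the
# Hugging Face transformers + peft stack. Hyperparameters are illustrative.
from transformers import WhisperForConditionalGeneration
from peft import LoraConfig, get_peft_model

# Load a pretrained multilingual Whisper checkpoint (model size assumed).
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# LoRA injects trainable low-rank update matrices into selected weight
# matrices; here the query/value attention projections, a common choice.
lora_config = LoraConfig(
    r=32,                                 # rank of the low-rank update
    lora_alpha=64,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # which Whisper submodules to adapt
    lora_dropout=0.05,
    bias="none",
)

# Wrap the model: base weights are frozen, only the adapters are trained.
model = get_peft_model(model, lora_config)

# Prints the trainable fraction of parameters, typically on the order of
# a few percent, consistent with the 1.4% reported in the thesis.
model.print_trainable_parameters()
```

The wrapped model can then be trained with any standard sequence-to-sequence loop; because only the adapter weights receive gradients, optimizer state and GPU memory use drop accordingly.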
| Item Type | Thesis (Master) |
|---|---|
| Name supervisor | Nayak, S. |
| Date Deposited | 29 Jul 2024 08:30 |
| Last Modified | 29 Jul 2024 08:30 |
| URI | https://campus-fryslan.studenttheses.ub.rug.nl/id/eprint/539 |