Parameter-Efficient Fine-Tuning on Multilingual ASR Whisper Model for Frisian

Liu, Xueying (2024) Parameter-Efficient Fine-Tuning on Multilingual ASR Whisper Model for Frisian. Master thesis, Voice Technology (VT).

Full text: MA-5521904-X-Liu.pdf (PDF, 1MB)

Abstract

Despite Whisper's proven multilingual capabilities, the model struggles to recognize low-resource languages (LRLs). The typical way to improve its performance on LRLs is to fully fine-tune the model on additional target-language data. However, given the model's extensive parameter count and the limited amount of available data, this approach is resource-intensive and prone to overfitting. To address the high computational cost of full fine-tuning and the risk of overfitting, parameter-efficient fine-tuning (PEFT) methods such as Low-Rank Adaptation (LoRA) have been proposed as a feasible alternative. In this work, I examined the effectiveness of LoRA on the Whisper model for the low-resource language Frisian. The results showed that, while training only 1.4% of the model's parameters and using less GPU memory, LoRA achieved a word error rate (WER) on Frisian comparable to full fine-tuning. I also found that low-resource languages benefited more from LoRA than high-resource languages. This study offers valuable insights for developing practical ASR systems that are both efficient and inclusive, particularly in multilingual and low-resource contexts.
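To make the setup concrete, below is a minimal sketch of applying LoRA to a Whisper checkpoint with the Hugging Face transformers and peft libraries. This is not the thesis's actual code: the checkpoint name, rank r, scaling factor lora_alpha, and target modules are illustrative assumptions, chosen so the trainable fraction lands roughly on the order of the 1.4% figure reported above.

```python
# Minimal sketch (not the thesis code) of LoRA fine-tuning setup for Whisper.
# All hyperparameters below are assumptions for illustration.
from transformers import WhisperForConditionalGeneration
from peft import LoraConfig, get_peft_model

# Load a pretrained multilingual Whisper checkpoint (size is an assumption).
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# LoRA injects small trainable low-rank matrices into selected weight
# matrices while the original pretrained weights stay frozen.
lora_config = LoraConfig(
    r=32,                                  # rank of the low-rank update (assumed)
    lora_alpha=64,                         # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],   # attention projections in Whisper
    lora_dropout=0.05,
    bias="none",
)

model = get_peft_model(model, lora_config)
# Reports the trainable-parameter count and fraction; with these settings it
# comes out near the low single-digit percentages discussed in the abstract.
model.print_trainable_parameters()
```

Because the base weights stay frozen, only the small adapter matrices are optimized and stored, which is where the GPU-memory and storage savings over full fine-tuning come from.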

Item Type: Thesis (Master)
Name supervisor: Nayak, S.
Date Deposited: 29 Jul 2024 08:30
Last Modified: 29 Jul 2024 08:30
URI: https://campus-fryslan.studenttheses.ub.rug.nl/id/eprint/539
