Enhancing Automatic Speech Recognition in Vehicular Environments: A Noise-Specific Fine-Tuning Approach

Zhu, Dongwen (2024) Enhancing Automatic Speech Recognition in Vehicular Environments: A Noise-Specific Fine-Tuning Approach. Master thesis, Voice Technology (VT).

Abstract

The expansion of in-vehicle technologies has made it necessary to develop advanced automatic speech recognition (ASR) systems that are capable of operating efficiently in noisy environments. This thesis explores the enhancement of ASR systems through fine-tuning for specific noise conditions, with a particular focus on vehicular noise environments. The research investigates whether ASR models fine-tuned with noise samples specific to a vehicular environment outperform models that are generalized for noise robustness. Using the "wav2vec2-base-960h" model, pre-trained on the LibriSpeech corpus, as the baseline, this study conducts fine-tuning experiments with two distinct noise datasets: Vehicular Noise Speech and Public Other Noise Speech. The performance of the three models (the baseline model, the model fine-tuned on vehicular noise, and the model fine-tuned on public other noise) is evaluated across the same three noise conditions to ascertain their effectiveness in real-world scenarios. The results indicate that models fine-tuned on specific noise environments significantly outperform the general noise-robust model in their targeted settings. This study contributes to the field by demonstrating the potential of environment-specific fine-tuning to enhance ASR performance in noise-affected conditions. The findings could influence future ASR applications in vehicular systems, supporting more reliable speech recognition and improving user interaction with in-vehicle electronics.

Item Type:	Thesis (Master)
Name supervisor:	Coler, M.L.
Date Deposited:	16 Jul 2024 13:58
Last Modified:	16 Jul 2024 13:58
URI:	https://campus-fryslan.studenttheses.ub.rug.nl/id/eprint/516
