Fine-tuning ASR to specific noise environments: noise robustness in a climbing gym

Leijenhorst, Elja (2023) Fine-tuning ASR to specific noise environments: noise robustness in a climbing gym. Master thesis, Voice Technology (VT).


Abstract

This research aims to improve the noise robustness of automatic speech recognition (ASR), specifically in the context of climbing gyms. There is no known research on ASR performance in sports facilities, even though such facilities are often reported to have poor acoustics (Wrobel & Pietrusiak, 2021). Sport climbing requires a safety course with personal guidance, which makes the sport poorly accessible to members of the deaf community. ASR could help in these situations if it performs well enough in this loud environment. The goal of this thesis is to optimize an ASR model for the specific acoustic environment of the climbing gym and to explore whether this has an advantage over a general noise-robust ASR model. This study encourages the use of ASR in sports facilities and contributes to ASR noise robustness in general. Following methods similar to Zhu et al. (2022) and Schlotterbeck et al. (2022), two wav2vec 2.0 models (pre-trained on 960 hours of English LibriSpeech data) were fine-tuned on two LibriSpeech datasets mixed with different types of noise. One model was fine-tuned on speech mixed with newly created background noise recordings from the climbing gym, while the other was fine-tuned on speech mixed with publicly available noise from everyday real-world environments such as restaurants and public transit stations (multi-condition training). The model fine-tuned on gym-noise speech outperformed the general noise-robust model on speech from the climbing gym by a relative 6%. The noise robustness of both models improved by 48% in terms of word error rate (WER) compared to the baseline model, demonstrating the effectiveness of fine-tuning on noisy data. These results suggest that fine-tuning an ASR system to a specific noise environment has an advantage over a general noise-robust ASR system, offering a possible route to a well-performing ASR application in the climbing gym.
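
The abstract describes two operations that are easy to illustrate: mixing clean speech with background noise, and reporting a relative improvement in WER. The following is a minimal sketch, not the thesis's actual pipeline. The function names (mix_at_snr, wer, relative_wer_reduction) are hypothetical, and the abstract does not state the signal-to-noise ratios used, so snr_db is an assumed parameter.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaling the noise to reach the requested SNR in dB.

    Hypothetical helper: both inputs are lists of float samples at the same
    sampling rate. The SNR levels used in the thesis are not given in the
    abstract, so snr_db is an assumption.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Loop the noise recording so it covers the whole utterance, then trim.
    repeats = -(-len(speech) // len(noise))  # ceiling division
    noise = (list(noise) * repeats)[:len(speech)]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        return list(speech)  # silent noise: nothing to mix in
    # Solve speech_power / (gain**2 * noise_power) = 10**(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer, new_wer):
    """Relative improvement of new_wer over baseline_wer, as a fraction."""
    return (baseline_wer - new_wer) / baseline_wer
```

Under this definition, a relative reduction of 0.48 means the fine-tuned model's WER is 52% of the baseline's WER, whatever the absolute values are. The abstract reports only relative figures, so any absolute WER values passed to these functions would be illustrative rather than results from the thesis.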

Item Type: Thesis (Master)
Name supervisor: Verkhodanova, V. and Coler, M.L.
Date Deposited: 12 Sep 2023 11:07
Last Modified: 12 Sep 2023 11:07
URI: https://campus-fryslan.studenttheses.ub.rug.nl/id/eprint/362
