Dimitrova, Iva (2023) Speaker Profiling of Phonated and Whispered Speech. Master thesis, Voice Technology (VT).
Abstract
Whispered speech is a unique form of communication that serves various purposes, including maintaining privacy and accommodating individuals with speech impairments in technological applications. However, the digital recognition of whispered speech has been challenging because it differs from phonated speech in several ways, including the lack of a fundamental frequency and the longer duration of whispered utterances. This thesis investigates how successfully the WavLM model encodes a speaker’s voice characteristics. The results are mixed when cosine similarity is measured for each speaker across whispered and normal utterances. An accurate speaker profile could be used to personalize whispered-to-normal speech conversion. This area of research has significant implications for communication accessibility and for advances in speech synthesis technologies.
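As a rough illustration of the comparison mentioned in the abstract, cosine similarity between two speaker embeddings can be sketched as below. This is only a minimal sketch, not the thesis's implementation: the function name `cosine_similarity` is hypothetical, and the two short vectors are made-up placeholders rather than WavLM outputs or thesis results.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder embeddings (assumed values for illustration only).
phonated_embedding = [0.2, 0.7, -0.1, 0.4]
whispered_embedding = [0.1, 0.6, 0.0, 0.5]

score = cosine_similarity(phonated_embedding, whispered_embedding)
print(f"cosine similarity: {score:.3f}")
```

A score close to 1 would suggest that the two embeddings point in nearly the same direction, while a score near 0 would suggest little shared speaker information between the two speaking modes.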
Item Type: | Thesis (Master) |
---|---|
Name supervisor: | Nayak, S. |
Date Deposited: | 12 Sep 2023 11:09 |
Last Modified: | 12 Sep 2023 11:09 |
URI: | https://campus-fryslan.studenttheses.ub.rug.nl/id/eprint/368 |