Zhao, Guanrong (2022) Road to Deep Learning-driven Chinese Traditional Verbal Art Synthesis. Master thesis, Voice Technology (VT).
PDF: MSC 4888286 G Zhao.pdf (2MB), restricted to repository staff only.
Abstract
Singing voice synthesis and speech synthesis are built for different purposes: the former aims at an expressive singing voice, the latter at clear and natural speech. Previous research has mainly focused on one or the other, yet many forms of verbal art fall between speech and singing. In this paper, we explore a possible way to synthesize traditional verbal art with a deep learning approach, choosing Mandarin, a tonal language, as the target language. Because there is little previous research on Mandarin verbal art, no existing dataset can be used for model training. To solve this zero-data problem, we apply transfer learning and train the model on an existing singing dataset. Through feature engineering, the artistic details of traditional verbal art are quantified using the proposed annotation methods together with findings from manual adjustment of two connected musical notes. A further experiment explores different time intervals of slurs in artificial vibrato synthesis and tests the model’s robustness.
Item Type: | Thesis (Master) |
---|---|
Name supervisor: | Coler, M.L. and Hopwood, F.J. |
Date Deposited: | 17 Feb 2023 14:11 |
Last Modified: | 17 Feb 2023 14:11 |
URI: | https://campus-fryslan.studenttheses.ub.rug.nl/id/eprint/240 |