Luks, BH (2022) End-to-End ASR with Binarized Neural Networks. Master thesis, Voice Technology (VT).
Abstract
Binarized Neural Networks (BNNs) have demonstrated a strong ability to compress and speed up neural networks, in some cases with comparatively little degradation in performance. Despite their successful application to convolutional and feed-forward units, little research has been conducted on the binarization of recurrent units. Furthermore, existing binarized recurrent neural networks have yet to be applied to end-to-end automatic speech recognition (ASR). This work, to my knowledge, marks the first attempt to apply binarized LSTM units, as proposed by Ardakani et al. (2018), to end-to-end ASR. Experiments are conducted on networks trained with Connectionist Temporal Classification (CTC) as well as on attention-based architectures. Although none of the experiments produces a high-performing, deployment-ready BNN, they yield greater insight into the applicability of such networks to ASR. These insights include the improved performance of linear input layers in binarized networks and the importance of bidirectionality in binarized LSTMs.
Item Type: | Thesis (Master) |
---|---|
Name supervisor: | Nayak, S. and Coler, M.L. |
Date Deposited: | 09 Sep 2022 08:25 |
Last Modified: | 09 Sep 2022 08:25 |
URI: | https://campus-fryslan.studenttheses.ub.rug.nl/id/eprint/228 |