FastSpeech: Fast, robust and controllable text to speech Y Ren, Y Ruan, X Tan, T Qin, S Zhao, Z Zhao, TY Liu Advances in neural information processing systems 32, 2019 | 635 | 2019 |
FastSpeech 2: Fast and high-quality end-to-end text to speech Y Ren, C Hu, X Tan, T Qin, S Zhao, Z Zhao, TY Liu arXiv preprint arXiv:2006.04558, 2020 | 562 | 2020 |
Neural speech synthesis with transformer network N Li, S Liu, Y Liu, S Zhao, M Liu Proceedings of the AAAI conference on artificial intelligence 33 (01), 6706-6713, 2019 | 511 | 2019 |
Close to human quality TTS with transformer N Li, S Liu, Y Liu, S Zhao, M Liu, M Zhou arXiv preprint arXiv:1809.08895, 2018 | 100 | 2018 |
Almost unsupervised text to speech and automatic speech recognition Y Ren, X Tan, T Qin, S Zhao, Z Zhao, TY Liu International conference on machine learning, 5410-5419, 2019 | 88 | 2019 |
Hyper-structure recurrent neural networks for text-to-speech P Zhao, M Leung, K Yao, B Yan, S Zhao, FA Alleva US Patent 10,127,901, 2018 | 75 | 2018 |
AdaSpeech: Adaptive text to speech for custom voice M Chen, X Tan, B Li, Y Liu, T Qin, S Zhao, TY Liu arXiv preprint arXiv:2103.00993, 2021 | 71 | 2021 |
Developing RNN-T models surpassing high-performance hybrid models with customization capability J Li, R Zhao, Z Meng, Y Liu, W Wei, S Parthasarathy, V Mazalov, Z Wang, ... arXiv preprint arXiv:2007.15188, 2020 | 64 | 2020 |
MultiSpeech: Multi-speaker text to speech with transformer M Chen, X Tan, Y Ren, J Xu, H Sun, S Zhao, T Qin, TY Liu arXiv preprint arXiv:2006.04664, 2020 | 60 | 2020 |
Dilated residual network with multi-head self-attention for speech emotion recognition R Li, Z Wu, J Jia, S Zhao, H Meng ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 52 | 2019 |
LRSpeech: Extremely low-resource speech synthesis and recognition J Xu, X Tan, Y Ren, T Qin, J Li, S Zhao, TY Liu Proceedings of the 26th ACM SIGKDD International Conference on Knowledge …, 2020 | 49 | 2020 |
Token-level ensemble distillation for grapheme-to-phoneme conversion H Sun, X Tan, JW Gan, H Liu, S Zhao, T Qin, TY Liu arXiv preprint arXiv:1904.03446, 2019 | 48 | 2019 |
A study of non-autoregressive model for sequence generation Y Ren, J Liu, X Tan, Z Zhao, S Zhao, TY Liu arXiv preprint arXiv:2004.10454, 2020 | 47 | 2020 |
MBNet: MOS prediction for synthesized speech with mean-bias network Y Leng, X Tan, S Zhao, F Soong, XY Li, T Qin ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 37 | 2021 |
Chinese prosodic phrasing with extended features Z Sheng, T Jianhua, J DanLing 2003 IEEE International Conference on Acoustics, Speech, and Signal …, 2003 | 34 | 2003 |
LightSpeech: Lightweight and fast text to speech with neural architecture search R Luo, X Tan, R Wang, T Qin, J Li, S Zhao, E Chen, TY Liu ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 31 | 2021 |
AdaSpeech 2: Adaptive text to speech with untranscribed data Y Yan, X Tan, B Li, T Qin, S Zhao, Y Shen, TY Liu ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 28 | 2021 |
Towards discriminative representation learning for speech emotion recognition R Li, Z Wu, J Jia, Y Bu, S Zhao, H Meng IJCAI, 5060-5066, 2019 | 28 | 2019 |
Interactive TTS optimization tool JC Wang, LJ Yuan, S Zhao, FA Alleva, J Xu, C Che US Patent 8,352,270, 2013 | 22 | 2013 |
DenoiSpeech: Denoising text to speech with frame-level noise modeling C Zhang, Y Ren, X Tan, J Liu, K Zhang, T Qin, S Zhao, TY Liu ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 21 | 2021 |