Diffsound: Discrete diffusion model for text-to-sound generation D Yang, J Yu, H Wang, W Wang, C Weng, Y Zou, D Yu IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 1720-1733, 2023 | 229 | 2023 |
GigaSpeech: An evolving, multi-domain ASR corpus with 10,000 hours of transcribed audio G Chen, S Chai, G Wang, J Du, WQ Zhang, C Weng, D Su, D Povey, ... arXiv preprint arXiv:2106.06909, 2021 | 193 | 2021 |
Recurrent deep neural networks for robust speech recognition C Weng, D Yu, S Watanabe, BHF Juang 2014 IEEE International Conference on Acoustics, Speech and Signal …, 2014 | 160 | 2014 |
Replay and synthetic speech detection with Res2Net architecture X Li, N Li, C Weng, X Liu, D Su, D Yu, H Meng ICASSP 2021-2021 IEEE international conference on acoustics, speech and …, 2021 | 159 | 2021 |
Deep neural networks for single-channel multi-talker speech recognition C Weng, D Yu, ML Seltzer, J Droppo IEEE/ACM Transactions on Audio, Speech, and Language Processing 23 (10 …, 2015 | 113 | 2015 |
DurIAN: Duration Informed Attention Network for Speech Synthesis. C Yu, H Lu, N Hu, M Yu, C Weng, K Xu, P Liu, D Tuo, S Kang, G Lei, D Su, ... Interspeech, 2027-2031, 2020 | 107 | 2020 |
DurIAN: Duration informed attention network for multimodal synthesis C Yu, H Lu, N Hu, M Yu, C Weng, K Xu, P Liu, D Tuo, S Kang, G Lei, D Su, ... arXiv preprint arXiv:1909.01700, 2019 | 102 | 2019 |
Component fusion: Learning replaceable language model component for end-to-end speech recognition system C Shan, C Weng, G Wang, D Su, M Luo, D Yu, L Xie ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 101 | 2019 |
VideoCrafter1: Open diffusion models for high-quality video generation H Chen, M Xia, Y He, Y Zhang, X Cun, S Yang, J Xing, Y Liu, Q Chen, ... arXiv preprint arXiv:2310.19512, 2023 | 100 | 2023 |
Past review, current progress, and challenges ahead on the cocktail party problem Y Qian, C Weng, X Chang, S Wang, D Yu Frontiers of Information Technology & Electronic Engineering 19, 40-63, 2018 | 96 | 2018 |
Investigating end-to-end speech recognition for Mandarin-English code-switching C Shan, C Weng, G Wang, D Su, M Luo, D Yu, L Xie ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 87 | 2019 |
Self-supervised text-independent speaker verification using prototypical momentum contrastive learning W Xia, C Zhang, C Weng, M Yu, D Yu ICASSP 2021-2021 IEEE international conference on acoustics, speech and …, 2021 | 81 | 2021 |
Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition AS Subramanian, C Weng, S Watanabe, M Yu, D Yu Computer Speech & Language 75, 101360, 2022 | 71 | 2022 |
Improving Attention Based Sequence-to-Sequence Models for End-to-End English Conversational Speech Recognition. C Weng, J Cui, G Wang, J Wang, C Yu, D Su, D Yu Interspeech, 761-765, 2018 | 63 | 2018 |
InstructTTS: Modelling expressive TTS in discrete latent space with natural language style prompt D Yang, S Liu, R Huang, C Weng, H Meng IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 60 | 2024 |
HiFi-Codec: Group-residual vector quantization for high fidelity audio codec D Yang, S Liu, R Huang, J Tian, C Weng, Y Zou arXiv preprint arXiv:2305.02765, 2023 | 59 | 2023 |
Mixed speech recognition D Yu, C Weng, ML Seltzer, J Droppo US Patent 9,390,712, 2016 | 59 | 2016 |
VideoCrafter2: Overcoming data limitations for high-quality video diffusion models H Chen, Y Zhang, X Cun, M Xia, X Wang, C Weng, Y Shan Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 56 | 2024 |
Simple attention module based speaker verification with iterative noisy label detection X Qin, N Li, C Weng, D Su, M Li ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 53 | 2022 |
Single-channel mixed speech recognition using deep neural networks C Weng, D Yu, ML Seltzer, J Droppo 2014 IEEE International Conference on Acoustics, Speech and Signal …, 2014 | 47 | 2014 |