Purely sequence-trained neural networks for ASR based on lattice-free MMI. D Povey, V Peddinti, D Galvez, P Ghahremani, V Manohar, X Na, Y Wang, ... Interspeech, 2751-2755, 2016 | 1016 | 2016 |
AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline H Bu, J Du, X Na, B Wu, H Zheng Oriental COCOSDA, 2017 | 883 | 2017 |
AISHELL-2: Transforming Mandarin ASR research into industrial scale J Du, X Na, X Liu, H Bu arXiv preprint arXiv:1808.10583, 2018 | 282 | 2018 |
An empirical exploration of CTC acoustic models Y Miao, M Gowayyed, X Na, T Ko, F Metze, A Waibel 2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016 | 106 | 2016 |
Data Augmentation For Children's Speech Recognition -- The "Ethiopian" System For The SLT 2021 Children Speech Recognition Challenge G Chen, X Na, Y Wang, Z Yan, J Zhang, S Ma, Y Wang arXiv preprint arXiv:2011.04547, 2020 | 23 | 2020 |
Incremental syllable-context phonetic vocoding M Cernak, PN Garner, A Lazaridis, P Motlicek, X Na IEEE/ACM Transactions on audio, speech, and language processing 23 (6), 1019 …, 2015 | 11 | 2015 |
Syllable-based Pitch Encoding for Low Bit Rate Speech Coding with Recognition/Synthesis Architecture M Cernak, X Na, PN Garner Interspeech, 2013 | 9 | 2013 |
Computational auditory scene analysis based voice activity detection M Tu, X Xie, X Na 2014 22nd International Conference on Pattern Recognition, 797-802, 2014 | 7 | 2014 |
Low latency parameter generation for real-time speech synthesis system X Na, X Xie, J Kuang 2014 IEEE International Conference on Multimedia and Expo (ICME), 1-6, 2014 | 6 | 2014 |
Improving voice quality of HMM-based speech synthesis using voice conversion method Y Jiao, X Xie, X Na, M Tu 2014 IEEE International Conference on Acoustics, Speech and Signal …, 2014 | 6 | 2014 |
Handling OOV Words in Mandarin Spoken Term Detection with an Hierarchical n‐Gram Language Model X Wang, P Zhang, X Na, J Pan, Y Yan Chinese Journal of Electronics 26 (6), 1239-1244, 2017 | 2 | 2017 |
Convolutional Pitch Target Approximation Model for Speech Synthesis X Na, PN Garner | 2 | 2013 |
Contextualization of ASR with LLM using phonetic retrieval-based augmentation Z Lei, X Na, M Xu, E Pusateri, C Van Gysel, Y Zhang, S Han, Z Huang arXiv preprint arXiv:2409.15353, 2024 | 1 | 2024 |
Enhancing CTC-based speech recognition with diverse modeling units S Han, Z Lei, M Xu, X Na, Z Huang Interspeech 2024, 4583-4587, 2024 | | 2024 |
Focused Discriminative Training For Streaming CTC-Trained Automatic Speech Recognition Models A Haider, X Na, E McDermott, T Ng, Z Huang, X Zhuang arXiv preprint arXiv:2408.13008, 2024 | | 2024 |
A Treatise On FST Lattice Based MMI Training A Haider, T Ng, Z Huang, X Na, AV Rosti arXiv preprint arXiv:2210.08918, 2022 | | 2022 |
Two-stage ASGD framework for parallel training of DNN acoustic models using Ethernet Z Wang, X Na, X Li, J Pan, Y Yan 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU …, 2015 | | 2015 |
Syllabic Pitch Tuning for Neutral-to-Emotional Voice Conversion L Saheer, X Na, M Cernak Idiap, 2015 | | 2015 |
Syllable-based Pitch Encoding for Low Bit Rate Speech Coding with Recognition/Synthesis Architecture M Cernak, X Na, PN Garner Idiap-RR-24-2013 | | 2013 |
A novel set of synthesis units with stable spectral boundaries for HMM-based Mandarin speech synthesis system Y Jiao, X Xie, M Tu, X Na 3rd International Conference on Multimedia Technology, 151-158, 2013 | | 2013 |