EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding Y Miao, M Gowayyed, F Metze 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU …, 2015 | 825 | 2015 |
Extracting deep bottleneck features using stacked auto-encoders J Gehring, Y Miao, F Metze, A Waibel 2013 IEEE international conference on acoustics, speech and signal …, 2013 | 343 | 2013 |
Speaker adaptive training of deep neural network acoustic models using i-vectors Y Miao, H Zhang, F Metze IEEE/ACM Transactions on Audio, Speech, and Language Processing 23 (11 …, 2015 | 133 | 2015 |
Deep maxout networks for low-resource speech recognition Y Miao, F Metze, S Rawat 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 398-403, 2013 | 114 | 2013 |
Kaldi+PDNN: building DNN-based ASR systems with Kaldi and PDNN Y Miao arXiv preprint arXiv:1401.6984, 2014 | 112 | 2014 |
Towards speaker adaptive training of deep neural network acoustic models Y Miao, H Zhang, F Metze Fifteenth annual conference of the international speech communication …, 2014 | 105 | 2014 |
Viral video style: A closer look at viral videos on youtube L Jiang, Y Miao, Y Yang, Z Lan, AG Hauptmann Proceedings of International Conference on Multimedia Retrieval, 193-200, 2014 | 94 | 2014 |
An empirical exploration of CTC acoustic models Y Miao, M Gowayyed, X Na, T Ko, F Metze, A Waibel 2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016 | 91 | 2016 |
Simplifying long short-term memory acoustic models for fast training and decoding Y Miao, J Li, Y Wang, SX Zhang, Y Gong 2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016 | 89 | 2016 |
Improving low-resource CD-DNN-HMM using dropout and multilingual DNN training. Y Miao, F Metze Interspeech 13, 2237-2241, 2013 | 87 | 2013 |
Visual features for context-aware speech recognition A Gupta, Y Miao, L Neves, F Metze 2017 IEEE International Conference on Acoustics, Speech and Signal …, 2017 | 48 | 2017 |
Improvements to speaker adaptive training of deep neural networks Y Miao, L Jiang, H Zhang, F Metze 2014 IEEE Spoken Language Technology Workshop (SLT), 165-170, 2014 | 48 | 2014 |
On speaker adaptation of long short-term memory recurrent neural networks Y Miao, F Metze Sixteenth Annual Conference of the International Speech Communication …, 2015 | 47 | 2015 |
Modular combination of deep neural networks for acoustic modeling. J Gehring, W Lee, K Kilgour, IR Lane, Y Miao, A Waibel INTERSPEECH, 94-98, 2013 | 42 | 2013 |
Informedia@TRECVID 2014 MED and MER SI Yu, L Jiang, Z Mao, X Chang, X Du, C Gan, Z Lan, Z Xu, X Li, Y Cai, ... NIST TRECVID Video Retrieval Evaluation Workshop 24, 2014 | 40 | 2014 |
Distributed learning of multilingual DNN feature extractors using GPUs Y Miao, H Zhang, F Metze Fifteenth Annual Conference of the International Speech Communication …, 2014 | 31 | 2014 |
Open-Domain Audio-Visual Speech Recognition: A Deep Learning Approach. Y Miao, F Metze Interspeech, 3414-3418, 2016 | 30 | 2016 |
Improving language-universal feature extraction with deep maxout and convolutional neural networks Y Miao, F Metze Fifteenth Annual Conference of the International Speech Communication …, 2014 | 27 | 2014 |
Distance-aware DNNs for robust speech recognition Y Miao, F Metze Sixteenth Annual Conference of the International Speech Communication …, 2015 | 23 | 2015 |
Enhancing query-oriented summarization based on sentence wikification Y Miao, C Li Workshop of the 33rd Annual International, 32, 2010 | 15 | 2010 |