Ego4D: Around the world in 3,000 hours of egocentric video K Grauman, A Westbury, E Byrne, Z Chavis, A Furnari, R Girdhar, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 805 | 2022 |
Libri-light: A benchmark for ASR with limited or no supervision J Kahn, M Riviere, W Zheng, E Kharitonov, Q Xu, PE Mazaré, J Karadayi, ... ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 647 | 2020 |
Natural human-robot interaction using speech, head pose and gestures R Stiefelhagen, C Fugen, R Gieselmann, H Holzapfel, K Nickel, A Waibel 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems …, 2004 | 296 | 2004 |
Towards end-to-end spoken language understanding D Serdyuk, Y Wang, C Fuegen, A Kumar, B Liu, Y Bengio 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 269 | 2018 |
Transformer-based acoustic modeling for hybrid speech recognition Y Wang, A Mohamed, D Le, C Liu, A Xiao, J Mahadeokar, H Huang, ... ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 268 | 2020 |
A one-pass decoder based on polymorphic linguistic context assignment H Soltau, F Metze, C Fugen, A Waibel IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU …, 2001 | 251 | 2001 |
Knowledge transfer from weakly labeled audio using convolutional neural network for sound events and scenes A Kumar, M Khadkevich, C Fügen 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 174 | 2018 |
Transformer-transducer: End-to-end speech recognition with self-attention CF Yeh, J Mahadeokar, K Kalgaonkar, Y Wang, D Le, M Jain, K Schubert, ... arXiv preprint arXiv:1910.12977, 2019 | 172 | 2019 |
Enabling multimodal human–robot interaction for the Karlsruhe humanoid robot R Stiefelhagen, HK Ekenel, C Fugen, P Gieselmann, H Holzapfel, F Kraft, ... IEEE Transactions on Robotics 23 (5), 840-851, 2007 | 165 | 2007 |
Simultaneous translation of lectures and speeches C Fügen, A Waibel, M Kolss Machine Translation 21, 209-252, 2007 | 160 | 2007 |
Hybrid, offline/online speech translation system NA Waibel, A Waibel, C Fuegen, K Rottman US Patent 9,430,465, 2016 | 105 | 2016 |
Prompting large language models with speech recognition abilities Y Fathullah, C Wu, E Lakomkin, J Jia, Y Shangguan, K Li, J Guo, W Xiong, ... ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 82 | 2024 |
Contextualized streaming end-to-end speech recognition with trie-based deep biasing and shallow fusion D Le, M Jain, G Keren, S Kim, Y Shi, J Mahadeokar, J Chan, ... arXiv preprint arXiv:2104.02194, 2021 | 82 | 2021 |
Deep shallow fusion for RNN-T personalization D Le, G Keren, J Chan, J Mahadeokar, C Fuegen, ML Seltzer 2021 IEEE Spoken Language Technology Workshop (SLT), 251-257, 2021 | 80 | 2021 |
Alignment restricted streaming recurrent neural network transducer J Mahadeokar, Y Shangguan, D Le, G Keren, H Su, T Le, CF Yeh, ... 2021 IEEE Spoken Language Technology Workshop (SLT), 52-59, 2021 | 69 | 2021 |
From senones to chenones: Tied context-dependent graphemes for hybrid speech recognition D Le, X Zhang, W Zheng, C Fügen, G Zweig, ML Seltzer 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 65 | 2019 |
Joint Grapheme and Phoneme Embeddings for Contextual End-to-End ASR. Z Chen, M Jain, Y Wang, ML Seltzer, C Fuegen Interspeech, 3490-3494, 2019 | 63 | 2019 |
Spoken language translation A Waibel, C Fugen IEEE Signal Processing Magazine 25 (3), 70-79, 2008 | 59 | 2008 |
End-to-end contextual speech recognition using class language models and a token passing decoder Z Chen, M Jain, Y Wang, ML Seltzer, C Fuegen ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 58 | 2019 |
Integrating emotional cues into a framework for dialogue management H Holzapfel, C Fuegen, M Denecke, A Waibel Proceedings. Fourth IEEE International Conference on Multimodal Interfaces …, 2002 | 49 | 2002 |