Sharan Narang
Research Engineer, Meta AI
Verified email at meta.com
Title · Cited by · Year
Exploring the limits of transfer learning with a unified text-to-text transformer
C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ...
The Journal of Machine Learning Research 21 (1), 5485-5551, 2020
Cited by 6860 · 2020
Deep Speech 2: End-to-end speech recognition in English and Mandarin
D Amodei, S Ananthanarayanan, R Anubhai, J Bai, E Battenberg, C Case, ...
International conference on machine learning, 173-182, 2016
Cited by 3118 · 2016
Mixed precision training
P Micikevicius, S Narang, J Alben, G Diamos, E Elsen, D Garcia, ...
arXiv preprint arXiv:1710.03740, 2017
Cited by 1175 · 2017
Deep Voice 3: Scaling text-to-speech with convolutional sequence learning
W Ping, K Peng, A Gibiansky, SO Arik, A Kannan, S Narang, J Raiman, ...
arXiv preprint arXiv:1710.07654, 2017
Cited by 690* · 2017
PaLM: Scaling language modeling with pathways
A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ...
arXiv preprint arXiv:2204.02311, 2022
Cited by 537 · 2022
Deep learning scaling is predictable, empirically
J Hestness, S Narang, N Ardalani, G Diamos, H Jun, H Kianinejad, ...
arXiv preprint arXiv:1712.00409, 2017
Cited by 399 · 2017
Exploring sparsity in recurrent neural networks
S Narang, E Elsen, G Diamos, S Sengupta
arXiv preprint arXiv:1704.05119, 2017
Cited by 288 · 2017
DSD: regularizing deep neural networks with dense-sparse-dense training flow
S Han, J Pool, S Narang, H Mao, S Tang, E Elsen, B Catanzaro, J Tran, ...
Cited by 277* · 2016
ByT5: Towards a token-free future with pre-trained byte-to-byte models
L Xue, A Barua, N Constant, R Al-Rfou, S Narang, M Kale, A Roberts, ...
Transactions of the Association for Computational Linguistics 10, 291-306, 2022
Cited by 130 · 2022
Block-sparse recurrent neural networks
S Narang, E Undersander, G Diamos
arXiv preprint arXiv:1711.02782, 2017
Cited by 115 · 2017
WT5?! Training text-to-text models to explain their predictions
S Narang, C Raffel, K Lee, A Roberts, N Fiedel, K Malkan
arXiv preprint arXiv:2004.14546, 2020
Cited by 83 · 2020
Scaling instruction-finetuned language models
HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, E Li, X Wang, ...
arXiv preprint arXiv:2210.11416, 2022
Cited by 60 · 2022
Do transformer modifications transfer across implementations and applications?
S Narang, HW Chung, Y Tay, W Fedus, T Fevry, M Matena, K Malkan, ...
arXiv preprint arXiv:2102.11972, 2021
Cited by 47 · 2021
Scaling up models and data with t5x and seqio
A Roberts, HW Chung, A Levskaya, G Mishra, J Bradbury, D Andor, ...
arXiv preprint arXiv:2203.17189, 2022
Cited by 40 · 2022
Scale efficiently: Insights from pre-training and fine-tuning transformers
Y Tay, M Dehghani, J Rao, W Fedus, S Abnar, HW Chung, S Narang, ...
arXiv preprint arXiv:2109.10686, 2021
Cited by 33 · 2021
Neural assistant: Joint action prediction, response generation, and latent knowledge reasoning
A Neelakantan, S Yavuz, S Narang, V Prasad, B Goodrich, D Duckworth, ...
arXiv preprint arXiv:1910.14613, 2019
Cited by 15 · 2019
Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?
Y Tay, M Dehghani, S Abnar, HW Chung, W Fedus, J Rao, S Narang, ...
arXiv preprint arXiv:2207.10551, 2022
Cited by 14 · 2022
Exploring the limits of transfer learning with a unified text-to-text transformer
A Roberts, C Raffel, K Lee, M Matena, N Shazeer, PJ Liu, S Narang, W Li, ...
Cited by 6 · 2019
Predicting deep learning scaling
J Hestness, G Diamos, HW Jun, S Narang, N Ardalani, MMA Patwary, ...
US Patent App. 16/206,910, 2020
Cited by 5 · 2020
Systems and methods for neural text-to-speech using convolutional sequence learning
W Ping, K Peng, S Narang, A Kannan, A Gibiansky, J Raiman, ...
US Patent 10,796,686, 2020
Cited by 4 · 2020