Pengcheng He
Verified email at microsoft.com
Title · Cited by · Year
On the variance of the adaptive learning rate and beyond
L Liu, H Jiang, P He, W Chen, X Liu, J Gao, J Han
ICLR 2020, 2019
Cited by: 1902
DeBERTa: Decoding-enhanced BERT with disentangled attention
P He, X Liu, J Gao, W Chen
ICLR 2021, 2020
Cited by: 1713
Multi-task deep neural networks for natural language understanding
X Liu, P He, W Chen, J Gao
ACL 2019, 2019
Cited by: 1295
DeBERTaV3: Improving DeBERTa using ELECTRA-style pre-training with gradient-disentangled embedding sharing
P He, J Gao, W Chen
ICLR 2023, 2021
Cited by: 472
SMART: Robust and efficient fine-tuning for pre-trained natural language models through principled regularized optimization
H Jiang, P He, W Chen, X Liu, J Gao, T Zhao
ACL 2020, 2019
Cited by: 387
Instruction tuning with GPT-4
B Peng, C Li, P He, M Galley, J Gao
arXiv preprint arXiv:2304.03277, 2023
Cited by: 324
Improving multi-task deep neural networks via knowledge distillation for natural language understanding
X Liu, P He, W Chen, J Gao
arXiv preprint arXiv:1904.09482, 2019
Cited by: 187
Check your facts and try again: Improving large language models with external knowledge and automated feedback
B Peng, M Galley, P He, H Cheng, Y Xie, Y Hu, Q Huang, L Liden, Z Yu, ...
arXiv preprint arXiv:2302.12813, 2023
Cited by: 180
Adversarial training for large neural language models
X Liu, H Cheng, P He, W Chen, Y Wang, H Poon, J Gao
arXiv preprint arXiv:2004.08994, 2020
Cited by: 151
Generation-augmented retrieval for open-domain question answering
Y Mao, P He, X Liu, Y Shen, J Gao, J Han, W Chen
arXiv preprint arXiv:2009.08553, 2020
Cited by: 136
Diffusion-GAN: Training GANs with Diffusion
Z Wang, H Zheng, P He, W Chen, M Zhou
ICLR 2023, 2022
Cited by: 102
X-SQL: reinforce schema representation with context
P He, Y Mao, K Chakrabarti, W Chen
arXiv preprint arXiv:1908.08113, 2019
Cited by: 98
Adaptive budget allocation for parameter-efficient fine-tuning
Q Zhang, M Chen, A Bukharin, P He, Y Cheng, W Chen, T Zhao
arXiv preprint arXiv:2303.10512, 2023
Cited by: 92
On the variance of the adaptive learning rate and beyond. arXiv 2019
L Liu, H Jiang, P He, W Chen, X Liu, J Gao, J Han
arXiv preprint arXiv:1908.03265, 2019
Cited by: 77
NeurIPS 2020 EfficientQA competition: Systems, analyses and lessons learned
S Min, J Boyd-Graber, C Alberti, D Chen, E Choi, M Collins, K Guu, ...
NeurIPS 2020, 2021
Cited by: 59
Exploiting structured knowledge in text via graph-guided representation learning
T Shen, Y Mao, P He, G Long, A Trischler, W Chen
arXiv preprint arXiv:2004.14224, 2020
Cited by: 58
The Microsoft toolkit of multi-task deep neural networks for natural language understanding
X Liu, Y Wang, J Ji, H Cheng, X Zhu, E Awa, P He, W Chen, H Poon, ...
arXiv preprint arXiv:2002.07972, 2020
Cited by: 50
GODEL: Large-scale pre-training for goal-directed dialog
B Peng, M Galley, P He, C Brockett, L Liden, E Nouri, Z Yu, B Dolan, ...
arXiv preprint arXiv:2206.11309, 2022
Cited by: 45
Human parity on CommonsenseQA: Augmenting self-attention with external attention
Y Xu, C Zhu, S Wang, S Sun, H Cheng, X Liu, J Gao, P He, M Zeng, ...
arXiv preprint arXiv:2112.03254, 2021
Cited by: 44
Super tickets in pre-trained language models: From model compression to improving generalization
C Liang, S Zuo, M Chen, H Jiang, X Liu, P He, T Zhao, W Chen
arXiv preprint arXiv:2105.12002, 2021
Cited by: 39
Articles 1–20