Zhuohan Li
Verified email at berkeley.edu
Title
Cited by
Year
Vicuna: An open-source chatbot impressing GPT-4 with 90%* ChatGPT quality
WL Chiang, Z Li, Z Lin, Y Sheng, Z Wu, H Zhang, L Zheng, S Zhuang, ...
See https://vicuna.lmsys.org (accessed 14 April 2023), 2023
Cited by: 458; Year: 2023
Train big, then compress: Rethinking model size for efficient training and inference of transformers
Z Li, E Wallace, S Shen, K Lin, K Keutzer, D Klein, J Gonzalez
International Conference on Machine Learning, 5958-5968, 2020
Cited by: 223; Year: 2020
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
L Zheng, WL Chiang, Y Sheng, S Zhuang, Z Wu, Y Zhuang, Z Lin, Z Li, ...
arXiv preprint arXiv:2306.05685, 2023
Cited by: 181; Year: 2023
Understanding and improving transformer from a multi-particle dynamic system point of view
Y Lu, Z Li, D He, Z Sun, B Dong, T Qin, L Wang, TY Liu
arXiv preprint arXiv:1906.02762, 2019
Cited by: 137; Year: 2019
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning
L Zheng, Z Li, H Zhang, Y Zhuang, Z Chen, Y Huang, Y Wang, Y Xu, ...
arXiv preprint arXiv:2201.12023, 2022
Cited by: 112; Year: 2022
Efficient training of BERT by progressively stacking
L Gong, D He, Z Li, T Qin, L Wang, T Liu
International Conference on Machine Learning, 2337-2346, 2019
Cited by: 107; Year: 2019
Fast structured decoding for sequence models
Z Sun, Z Li, H Wang, D He, Z Lin, Z Deng
Advances in Neural Information Processing Systems 32, 2019
Cited by: 106; Year: 2019
Hint-based training for non-autoregressive machine translation
Z Li, Z Lin, D He, F Tian, T Qin, L Wang, TY Liu
Cited by: 70; Year: 2018
TeraPipe: Token-level pipeline parallelism for training large-scale language models
Z Li, S Zhuang, S Guo, D Zhuo, H Zhang, D Song, I Stoica
International Conference on Machine Learning, 6543-6552, 2021
Cited by: 57; Year: 2021
Towards binary-valued gates for robust LSTM training
Z Li, D He, F Tian, W Chen, T Qin, L Wang, T Liu
International Conference on Machine Learning, 2995-3004, 2018
Cited by: 57; Year: 2018
Efficient memory management for large language model serving with PagedAttention
W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng, CH Yu, J Gonzalez, H Zhang, ...
Proceedings of the 29th Symposium on Operating Systems Principles, 611-626, 2023
Cited by: 36; Year: 2023
FlexGen: high-throughput generative inference of large language models with a single GPU
Y Sheng, L Zheng, B Yuan, Z Li, M Ryabinin, B Chen, P Liang, C Ré, ...
International Conference on Machine Learning, 31094-31116, 2023
Cited by: 32*; Year: 2023
Hoplite: efficient and fault-tolerant collective communication for task-based distributed systems
S Zhuang, Z Li, D Zhuo, S Wang, E Liang, R Nishihara, P Moritz, I Stoica
Proceedings of the 2021 ACM SIGCOMM 2021 Conference, 641-656, 2021
Cited by: 19; Year: 2021
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving
Z Li, L Zheng, Y Zhong, V Liu, Y Sheng, X Jin, Y Huang, Z Chen, H Zhang, ...
arXiv preprint arXiv:2302.11665, 2023
Cited by: 9; Year: 2023
LMSYS-Chat-1M: A large-scale real-world LLM conversation dataset
L Zheng, WL Chiang, Y Sheng, T Li, S Zhuang, Z Wu, Y Zhuang, Z Li, ...
arXiv preprint arXiv:2309.11998, 2023
Cited by: 6; Year: 2023
On optimizing the communication of model parallelism
Y Zhuang, H Zhao, L Zheng, Z Li, E Xing, Q Ho, J Gonzalez, I Stoica, ...
Proceedings of Machine Learning and Systems 5, 2023
Cited by: 4; Year: 2023
vLLM: Easy, fast, and cheap LLM serving with PagedAttention
W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng, C Yu, J Gonzalez, H Zhang, ...
Cited by: 3; Year: 2023
Rearchitecting in-memory object stores for low latency
D Zhuo, K Zhang, Z Li, S Zhuang, S Wang, A Chen, I Stoica
Proceedings of the VLDB Endowment 15 (3), 555-568, 2021
Cited by: 1; Year: 2021
Simple and Automatic Distributed Machine Learning on Ray
H Zhang, Z Li, L Zheng, I Stoica
Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data …, 2021
Cited by: 1; Year: 2021
Student Cluster Competition 2017, Team Peking University: Reproducing vectorization of the Tersoff multi-body potential on the Intel Broadwell architecture
Z Fu, L Yang, W Hou, Z Li, Y Wu, Y Cheng, X Wang, Y Liang
Parallel Computing 78, 28-32, 2018
Year: 2018