Qinghua Zhou
Title
Cited by
Year
RWKV: Reinventing RNNs for the Transformer Era
B Peng, E Alcaide, Q Anthony, A Albalak, S Arcadinho, H Cao, X Cheng, ...
arXiv preprint arXiv:2305.13048, 2023
Cited by 140, 2023
Designing high-performance MPI libraries with on-the-fly compression for modern GPU clusters
Q Zhou, C Chu, NS Kumar, P Kousha, SM Ghazimirsaeed, H Subramoni, ...
2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2021
Cited by 24, 2021
Accelerating MPI All-to-All Communication with Online Compression on Modern GPU Clusters
Q Zhou, P Kousha, Q Anthony, KS Khorassani, A Shafi, H Subramoni, ...
High Performance Computing: 37th International Conference, ISC High …, 2022
Cited by 10, 2022
Accelerating MPI all-to-all communication with online compression on modern GPU clusters
Q Zhou, P Kousha, Q Anthony, K Shafie Khorassani, A Shafi, ...
International Conference on High Performance Computing, 3-25, 2022
Cited by 10, 2022
Dynamic kernel fusion for bulk non-contiguous data transfer on GPU clusters
CH Chu, KS Khorassani, Q Zhou, H Subramoni, DK Panda
2020 IEEE International Conference on Cluster Computing (CLUSTER), 130-141, 2020
Cited by 8, 2020
A hierarchical and load-aware design for large message neighborhood collectives
SM Ghazimirsaeed, Q Zhou, A Ruhela, M Bayatpour, H Subramoni, ...
SC20: International Conference for High Performance Computing, Networking …, 2020
Cited by 5, 2020
RWKV: Reinventing RNNs for the Transformer Era
P Bo, A Eric, A Quentin, A Alon, A Samuel, C Huanqi, C Xin, C Michael, ...
Conference on Empirical Methods in Natural Language Processing, 2023
Cited by 3, 2023
Accelerating Distributed Deep Learning Training with Compression Assisted Allgather and Reduce-Scatter Communication
Q Zhou, Q Anthony, L Xu, A Shafi, M Abduljabbar, H Subramoni, ...
2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2023
Cited by 2, 2023
Designing Efficient Pipelined Communication Schemes using Compression in MPI Libraries
B Ramesh, Q Zhou, A Shafi, M Abduljabbar, H Subramoni, DK Panda
2022 IEEE 29th International Conference on High Performance Computing, Data …, 2022
Cited by 1, 2022
Benchmarking Modern Databases for Storing and Profiling Very Large Scale HPC Communication Data
P Kousha, Q Zhou, H Subramoni, DK Panda
International Symposium on Benchmarking, Measuring and Optimization, 104-119, 2023
2023
MPI-xCCL: A Portable MPI Library over Collective Communication Libraries for Various Accelerators
CC Chen, K Shafie Khorassani, P Kousha, Q Zhou, J Yao, H Subramoni, ...
Proceedings of the SC'23 Workshops of The International Conference on High …, 2023
2023
Accelerating Broadcast Communication with GPU Compression for Deep Learning Workloads
Q Zhou, Q Anthony, A Shafi, H Subramoni, DK Panda
2022 IEEE 29th International Conference on High Performance Computing, Data …, 2022
2022