Chulhee Yun
Assistant Professor, KAIST Kim Jaechul Graduate School of AI
Verified email at kaist.ac.kr
Title · Cited by · Year
Are Transformers universal approximators of sequence-to-sequence functions?
C Yun, S Bhojanapalli, AS Rawat, SJ Reddi, S Kumar
ICLR 2020 (arXiv:1912.10077), 2019
Cited by 308 · 2019
Minimum width for universal approximation
S Park, C Yun, J Lee, J Shin
ICLR 2021 (arXiv:2006.08859), 2020
Cited by 133 · 2020
Small nonlinearities in activation functions create bad local minima in neural networks
C Yun, S Sra, A Jadbabaie
ICLR 2019 (arXiv:1802.03487), 2018
Cited by 129* · 2018
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity
C Yun, S Sra, A Jadbabaie
NeurIPS 2019 (arXiv:1810.07770), 2019
Cited by 122 · 2019
Global optimality conditions for deep neural networks
C Yun, S Sra, A Jadbabaie
ICLR 2018 (arXiv:1707.02444), 2017
Cited by 116 · 2017
A Unifying View on Implicit Bias in Training Linear Neural Networks
C Yun, S Krishnan, H Mobahi
ICLR 2021 (arXiv:2010.02501), 2020
Cited by 78 · 2020
Low-Rank Bottleneck in Multi-head Attention Models
S Bhojanapalli, C Yun, AS Rawat, SJ Reddi, S Kumar
ICML 2020 (arXiv:2002.07028), 2020
Cited by 76 · 2020
O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers
C Yun, YW Chang, S Bhojanapalli, AS Rawat, SJ Reddi, S Kumar
NeurIPS 2020 (arXiv:2006.04862), 2020
Cited by 66 · 2020
SGD with shuffling: optimal rates without component convexity and large epoch requirements
K Ahn*, C Yun*, S Sra
NeurIPS 2020 (arXiv:2006.06946), 2020
Cited by 65 · 2020
Minibatch vs local SGD with shuffling: Tight convergence bounds and beyond
C Yun, S Rajput, S Sra
ICLR 2022 (arXiv:2110.10342), 2021
Cited by 36 · 2021
Provable memorization via deep neural networks using sub-linear parameters
S Park, J Lee, C Yun, J Shin
COLT 2021 (arXiv:2010.13363), 2021
Cited by 30 · 2021
Minimax bounds on stochastic batched convex optimization
J Duchi, F Ruan, C Yun
COLT 2018, 3065-3162, 2018
Cited by 29 · 2018
Linear attention is (maybe) all you need (to understand transformer optimization)
K Ahn, X Cheng, M Song, C Yun, A Jadbabaie, S Sra
ICLR 2024 (arXiv:2310.01082), 2023
Cited by 20 · 2023
Open Problem: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD?
C Yun, S Sra, A Jadbabaie
COLT 2021 (arXiv:2103.07079), 2021
Cited by 17* · 2021
Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
J Cha, J Lee, C Yun
ICML 2023 (arXiv:2303.07160), 2023
Cited by 14 · 2023
Are deep ResNets provably better than linear predictors?
C Yun, S Sra, A Jadbabaie
NeurIPS 2019 (arXiv:1907.03922), 2019
Cited by 14 · 2019
PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning
H Lee, H Cho, H Kim, D Gwak, J Kim, J Choo, SY Yun, C Yun
NeurIPS 2023 (arXiv:2306.10711), 2023
Cited by 13* · 2023
Efficiently testing local optimality and escaping saddles for ReLU networks
C Yun, S Sra, A Jadbabaie
ICLR 2019 (arXiv:1809.10858), 2018
Cited by 9 · 2018
Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima
D Si, C Yun
NeurIPS 2023 (arXiv:2306.09850), 2023
Cited by 7 · 2023
Provable Benefit of Mixup for Finding Optimal Decision Boundaries
J Oh, C Yun
ICML 2023 (arXiv:2306.00267), 2023
Cited by 6 · 2023