Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... arXiv preprint arXiv:2403.05530, 2024 | 701 | 2024 |
MusicLM: Generating Music From Text A Agostinelli, TI Denk, Z Borsos, J Engel, M Verzetti, A Caillon, Q Huang, ... arXiv preprint arXiv:2301.11325, 2023 | 572 | 2023 |
AudioLM: a language modeling approach to audio generation Z Borsos, R Marinier, D Vincent, E Kharitonov, O Pietquin, M Sharifi, ... IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 2523-2533, 2023 | 526 | 2023 |
Coresets via Bilevel Optimization for Continual Learning and Streaming Z Borsos, M Mutný, A Krause NeurIPS 2020 - Advances in Neural Information Processing Systems, 2020 | 241 | 2020 |
Speak, read and prompt: High-fidelity text-to-speech with minimal supervision E Kharitonov, D Vincent, Z Borsos, R Marinier, S Girgin, O Pietquin, ... Transactions of the Association for Computational Linguistics 11, 1703-1718, 2023 | 168 | 2023 |
AudioPaLM: A large language model that can speak and listen PK Rubenstein, C Asawaroengchai, DD Nguyen, A Bapna, Z Borsos, ... arXiv preprint arXiv:2306.12925, 2023 | 155 | 2023 |
SoundStorm: Efficient Parallel Audio Generation Z Borsos, M Sharifi, D Vincent, E Kharitonov, N Zeghidour, M Tagliasacchi arXiv preprint arXiv:2305.09636, 2023 | 91 | 2023 |
SpeechPainter: Text-conditioned Speech Inpainting Z Borsos, M Sharifi, M Tagliasacchi arXiv preprint arXiv:2202.07273, 2022 | 33 | 2022 |
Online Variance Reduction for Stochastic Optimization Z Borsos, A Krause, KY Levy Proceedings of the 31st Conference On Learning Theory 75, 324-357, 2018 | 29 | 2018 |
Dealing with overlap and imbalance: a new metric and approach Z Borsos, C Lemnaru, R Potolea Pattern Analysis and Applications, 1-15, 2016 | 29 | 2016 |
Semi-supervised Batch Active Learning via Bilevel Optimization Z Borsos, M Tagliasacchi, A Krause ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 23 | 2021 |
LMCodec: A Low Bitrate Speech Codec with Causal Transformer Models T Jenrungrot, M Chinen, WB Kleijn, J Skoglund, Z Borsos, N Zeghidour, ... ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 18 | 2023 |
Online Variance Reduction with Mixtures Z Borsos, S Curi, KY Levy, A Krause ICML 2019 - Proceedings of the 36th International Conference on Machine …, 2019 | 17 | 2019 |
MusicRL: Aligning Music Generation to Human Preferences G Cideron, S Girgin, M Verzetti, D Vincent, M Kastelic, Z Borsos, ... arXiv preprint arXiv:2402.04229, 2024 | 12 | 2024 |
Implementing Modular FFTs in FPGAs--A Basic Block for Lattice-Based Cryptography T Györfi, O Cret, Z Borsos Digital System Design (DSD), 2013 Euromicro Conference on, 305-308, 2013 | 11 | 2013 |
TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition H Erdogan, S Wisdom, X Chang, Z Borsos, M Tagliasacchi, N Zeghidour, ... arXiv preprint arXiv:2308.10415, 2023 | 9 | 2023 |
Disentangling speech from surroundings with neural embeddings A Omran, N Zeghidour, Z Borsos, F de Chaumont Quitry, M Slaney, ... ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 9 | 2023 |
Data summarization via bilevel optimization Z Borsos, M Mutný, M Tagliasacchi, A Krause Journal of Machine Learning Research 25 (73), 1-53, 2024 | 8 | 2024 |
Inference of the three-dimensional chromatin structure and its temporal behavior BC Cristescu, Z Borsos, J Lygeros, MR Martínez, MA Rapsomaniki arXiv preprint arXiv:1811.09619, 2018 | 8 | 2018 |
Disentangling speech from surroundings in a neural audio codec A Omran, N Zeghidour, Z Borsos, F de Chaumont Quitry, M Slaney, ... arXiv preprint arXiv:2203.15578, 2022 | 6 | 2022 |