Unsupervised cross-domain image generation | Y Taigman, A Polyak, L Wolf | arXiv preprint arXiv:1611.02200, 2016 | 1253 | 2016 |
Make-a-video: Text-to-video generation without text-video data | U Singer, A Polyak, T Hayes, X Yin, J An, S Zhang, Q Hu, H Yang, ... | arXiv preprint arXiv:2209.14792, 2022 | 1085 | 2022 |
Make-a-scene: Scene-based text-to-image generation with human priors | O Gafni, A Polyak, O Ashual, S Sheynin, D Parikh, Y Taigman | European Conference on Computer Vision, 89-106, 2022 | 487 | 2022 |
On generative spoken language modeling from raw audio | K Lakhotia, E Kharitonov, WN Hsu, Y Adi, A Polyak, B Bolte, TA Nguyen, ... | Transactions of the Association for Computational Linguistics 9, 1336-1354, 2021 | 330 | 2021 |
Audiogen: Textually guided audio generation | F Kreuk, G Synnaeve, A Polyak, U Singer, A Défossez, J Copet, D Parikh, ... | arXiv preprint arXiv:2209.15352, 2022 | 304 | 2022 |
Speech resynthesis from discrete disentangled self-supervised representations | A Polyak, Y Adi, J Copet, E Kharitonov, K Lakhotia, WN Hsu, A Mohamed, ... | arXiv preprint arXiv:2104.00355, 2021 | 301 | 2021 |
VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop | Y Taigman, L Wolf, A Polyak, E Nachmani | 6th International Conference on Learning Representations, 2017 | 218* | 2017 |
Pick-a-pic: An open dataset of user preferences for text-to-image generation | Y Kirstain, A Polyak, U Singer, S Matiana, J Penna, O Levy | Advances in Neural Information Processing Systems 36, 36652-36663, 2023 | 217 | 2023 |
Channel-level acceleration of deep face representations | A Polyak, L Wolf | IEEE Access 3, 2163-2175, 2015 | 216 | 2015 |
Direct speech-to-speech translation with discrete units | A Lee, PJ Chen, C Wang, J Gu, S Popuri, X Ma, A Polyak, Y Adi, Q He, ... | arXiv preprint arXiv:2107.05604, 2021 | 162 | 2021 |
A Universal Music Translation Network | N Mor, L Wolf, A Polyak, Y Taigman | 7th International Conference on Learning Representations, 2019 | 157 | 2019 |
Text-free prosody-aware generative spoken language modeling | E Kharitonov, A Lee, A Polyak, Y Adi, J Copet, K Lakhotia, TA Nguyen, ... | arXiv preprint arXiv:2109.03264, 2021 | 118 | 2021 |
Text-to-4d dynamic scene generation | U Singer, S Sheynin, A Polyak, O Ashual, I Makarov, F Kokkinos, N Goyal, ... | arXiv preprint arXiv:2301.11280, 2023 | 117 | 2023 |
Knn-diffusion: Image generation via large-scale retrieval | S Sheynin, O Ashual, A Polyak, U Singer, O Gafni, E Nachmani, ... | arXiv preprint arXiv:2204.02849, 2022 | 115 | 2022 |
Scaling autoregressive multi-modal models: Pretraining and instruction tuning | L Yu, B Shi, R Pasunuru, B Muller, O Golovneva, T Wang, A Babu, B Tang, ... | arXiv preprint arXiv:2309.02591 2 (3), 2023 | 110 | 2023 |
Fitting New Speakers Based on a Short Untranscribed Sample | E Nachmani, A Polyak, Y Taigman, L Wolf | Proceedings of the 35th International Conference on Machine Learning, 2018 | 106 | 2018 |
Emu edit: Precise image editing via recognition and generation tasks | S Sheynin, A Polyak, U Singer, Y Kirstain, A Zohar, O Ashual, D Parikh, ... | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 67 | 2024 |
Textless speech emotion conversion using discrete and decomposed representations | F Kreuk, A Polyak, J Copet, E Kharitonov, TA Nguyen, M Rivière, WN Hsu, ... | arXiv preprint arXiv:2111.07402, 2021 | 66 | 2021 |
Unsupervised creation of parameterized avatars | L Wolf, Y Taigman, A Polyak | Proceedings of the IEEE International Conference on Computer Vision, 1530-1538, 2017 | 56 | 2017 |
Unsupervised cross-domain singing voice conversion | A Polyak, L Wolf, Y Adi, Y Taigman | arXiv preprint arXiv:2008.02830, 2020 | 54 | 2020 |