Blink: Multimodal large language models can see but not perceive X Fu, Y Hu, B Li, Y Feng, H Wang, X Lin, D Roth, NA Smith, WC Ma, ... arXiv preprint arXiv:2404.12390, 2024 | 16 | 2024 |
Imagenhub: Standardizing the evaluation of conditional image generation models M Ku, T Li, K Zhang, Y Lu, X Fu, W Zhuang, W Chen arXiv preprint arXiv:2310.01596, 2023 | 16 | 2023 |
University of Pennsylvania LoReHLT 2019 Submission S Mayhew, T Tsygankova, F Marini, Z Wang, J Lee, X Yu, X Fu, W Shi, K Karthikeyan, J Hay, M Shur, J Sheffield, D Roth Technical report, 2019 | 11 | 2019 |
Design challenges in low-resource cross-lingual entity linking X Fu, W Shi, X Yu, Z Zhao, D Roth arXiv preprint arXiv:2005.00692, 2020 | 10 | 2020 |
Generate then select: Open-ended visual question answering guided by world knowledge X Fu, S Zhang, G Kwon, P Perera, H Zhu, Y Zhang, AH Li, WY Wang, ... arXiv preprint arXiv:2305.18842, 2023 | 9 | 2023 |
There’s a Time and Place for Reasoning Beyond the Image X Fu, B Zhou, I Chandratreya, C Vondrick, D Roth Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022 | 9 | 2022 |
Constrained sequence-to-sequence semitic root extraction for enriching word embeddings A El-Kishky, X Fu, A Addawood, N Sobh, C Voss, J Han Proceedings of the Fourth Arabic Natural Language Processing Workshop, 88-96, 2019 | 6 | 2019 |
Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination? B Li, B Zhou, F Wang, X Fu, D Roth, M Chen Proceedings of the 2024 Conference of the North American Chapter of the …, 2024 | 2 | 2024 |
Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination? B Li, B Zhou, F Wang, X Fu, D Roth, M Chen arXiv preprint arXiv:2311.09702, 2023 | 2 | 2023 |
Interpretable by design visual question answering X Fu, B Zhou, S Chen, M Yatskar, D Roth arXiv preprint arXiv:2305.14882, 2023 | 2 | 2023 |
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models Y Hu, W Shi, X Fu, D Roth, M Ostendorf, L Zettlemoyer, NA Smith, ... arXiv preprint arXiv:2406.09403, 2024 | 1 | 2024 |
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding F Wang, X Fu, JY Huang, Z Li, Q Liu, X Liu, MD Ma, N Xu, W Zhou, ... arXiv preprint arXiv:2406.09411, 2024 | 1 | 2024 |
FamiCom: Further Demystifying Prompts for Language Models with Task-Agnostic Performance Estimation B Li, B Zhou, X Fu, F Wang, D Roth, M Chen arXiv preprint arXiv:2406.11243, 2024 | | 2024 |
Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? X Fu, M He, Y Lu, WY Wang, D Roth arXiv preprint arXiv:2406.07546, 2024 | | 2024 |
Dynamic Clue Bottlenecks: Towards Interpretable-by-Design Visual Question Answering X Fu, B Zhou, S Chen, M Yatskar, D Roth arXiv preprint arXiv:2305.14882, 2023 | | 2023 |