To find where you talk: Temporal sentence localization in video with attention based location regression Y Yuan, T Mei, W Zhu Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 9159-9166, 2019 | 349 | 2019 |
Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos Y Yuan, L Ma, J Wang, W Liu, W Zhu IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (5), 2725 …, 2022 | 256 | 2022 |
Video summarization by learning deep side semantic embedding Y Yuan, T Mei, P Cui, W Zhu IEEE Transactions on Circuits and Systems for Video Technology 29 (1), 226-237, 2017 | 108 | 2017 |
A Closer Look at Temporal Sentence Grounding in Videos: Dataset and Metric Y Yuan, X Lan, X Wang, L Chen, Z Wang, W Zhu The 2nd International Workshop on Human-centric Multimedia Analysis (HUMA '21), 2021 | 64 | 2021 |
A Survey on Temporal Sentence Grounding in Videos X Lan, Y Yuan, X Wang, W Zhu ACM Transactions on Multimedia Computing, Communications and Applications 19 …, 2023 | 57 | 2023 |
Cross-modal dual learning for sentence-to-video generation Y Liu, X Wang, Y Yuan, W Zhu Proceedings of the 27th ACM International Conference on Multimedia, 1239-1247, 2019 | 38 | 2019 |
Sentence specified dynamic video thumbnail generation Y Yuan, L Ma, W Zhu Proceedings of the 27th ACM International Conference on Multimedia, 2332-2340, 2019 | 31 | 2019 |
Controllable video captioning with an exemplar sentence Y Yuan, L Ma, J Wang, W Zhu Proceedings of the 28th ACM International Conference on Multimedia, 1085-1093, 2020 | 20 | 2020 |
Curriculum multi-negative augmentation for debiased video grounding X Lan, Y Yuan, H Chen, X Wang, Z Jie, L Ma, Z Wang, W Zhu Proceedings of the AAAI Conference on Artificial Intelligence 37 (1), 1213-1221, 2023 | 15 | 2023 |
Syntax Customized Video Captioning by Imitating Exemplar Sentences Y Yuan, L Ma, W Zhu IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (12 …, 2021 | 7 | 2021 |
Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment X Xu, Y Yuan, Q Zhang, W Wu, Z Jie, L Ma, X Wang arXiv preprint arXiv:2312.09625, 2023 | 4 | 2023 |
Fewer Tokens and Fewer Videos: Extending Video Understanding Abilities in Large Vision-Language Models S Chen, Y Yuan, S Chen, Z Jie, L Ma arXiv preprint arXiv:2406.08024, 2024 | 1 | 2024 |
VidCompress: Memory-Enhanced Temporal Compression for Video Understanding in Large Language Models X Lan, Y Yuan, Z Jie, L Ma arXiv preprint arXiv:2410.11417, 2024 | | 2024 |
3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance X Xu, Y Yuan, J Li, Q Zhang, Z Jie, L Ma, H Tang, N Sebe, X Wang ECCV 2024, 2024 | | 2024 |
A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach X Lan, Y Yuan, X Wang, L Chen, Z Wang, L Ma, W Zhu ACM Transactions on Multimedia Computing, Communications, and Applications …, 2023 | | 2023 |
Supplementary Materials for 3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance X Xu, Y Yuan, J Li, Q Zhang, Z Jie, L Ma, H Tang, N Sebe, X Wang | | |