Faisal Ahmed, PhD
Verified email at microsoft.com
Title
Cited by
Year
UNITER: Universal image-text representation learning
YC Chen, L Li, L Yu, A El Kholy, F Ahmed, Z Gan, Y Cheng, J Liu
European conference on computer vision, 104-120, 2020
Cited by: 2283; Year: 2020
UNITER: Learning universal image-text representations
YC Chen, L Li, L Yu, A El Kholy, F Ahmed, Z Gan, Y Cheng, J Liu
Cited by: 420; Year: 2019
Towards end-to-end reinforcement learning of dialogue agents for information access
B Dhingra, L Li, X Li, J Gao, YN Chen, F Ahmed, L Deng
arXiv preprint arXiv:1609.00777, 2016
Cited by: 387; Year: 2016
MM-REACT: Prompting ChatGPT for multimodal reasoning and action
Z Yang, L Li, J Wang, K Lin, E Azarnasab, F Ahmed, Z Liu, C Liu, M Zeng, ...
arXiv preprint arXiv:2303.11381, 2023
Cited by: 319; Year: 2023
SwinBERT: End-to-end transformers with sparse attention for video captioning
K Lin, L Li, CC Lin, F Ahmed, Z Gan, Z Liu, Y Lu, L Wang
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 272; Year: 2022
BBQ-networks: Efficient exploration in deep reinforcement learning for task-oriented dialogue systems
Z Lipton, X Li, J Gao, L Li, F Ahmed, L Deng
Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018
Cited by: 195; Year: 2018
End-to-end learning of dialogue agents for information access
L Li, B Dhingra, J Gao, X Li, YN Chen, L Deng, F Ahmed
US Patent 10,546,066, 2020
Cited by: 152; Year: 2020
UniTAB: Unifying text and box outputs for grounded vision-language modeling
Z Yang, Z Gan, J Wang, X Hu, F Ahmed, Z Liu, Y Lu, L Wang
European Conference on Computer Vision, 521-539, 2022
Cited by: 117; Year: 2022
The five Ws for information visualization with application to healthcare informatics
Z Zhang, B Wang, F Ahmed, IV Ramakrishnan, R Zhao, A Viccellio, ...
IEEE transactions on visualization and computer graphics 19 (11), 1895-1910, 2013
Cited by: 113; Year: 2013
Efficient exploration for dialogue policy learning with BBQ networks & replay buffer spiking
ZC Lipton, J Gao, L Li, X Li, F Ahmed, L Deng
arXiv preprint arXiv:1608.05081 3, 2016
Cited by: 67; Year: 2016
Accessible skimming: faster screen reading of web pages
F Ahmed, Y Borodin, A Soviak, M Islam, IV Ramakrishnan, T Hedgpeth
Proceedings of the 25th annual ACM symposium on User interface software and …, 2012
Cited by: 63; Year: 2012
Why read if you can skim: towards enabling faster screen reading
F Ahmed, Y Borodin, Y Puzis, IV Ramakrishnan
Proceedings of the International Cross-Disciplinary Conference on Web …, 2012
Cited by: 48; Year: 2012
MM-VID: Advancing video understanding with GPT-4V(ision)
K Lin, F Ahmed, L Li, CC Lin, E Azarnasab, Z Yang, J Wang, L Liang, ...
arXiv preprint arXiv:2310.19773, 2023
Cited by: 46; Year: 2023
Crossing the format boundary of text and boxes: Towards unified vision-language modeling
Z Yang, Z Gan, J Wang, X Hu, F Ahmed, Z Liu, Y Lu, L Wang
arXiv preprint arXiv:2111.12085 3, 2021
Cited by: 40; Year: 2021
Hearsay: a new generation context-driven multi-modal assistive web browser
Y Borodin, F Ahmed, MA Islam, Y Puzis, V Melnyk, S Feng, ...
Proceedings of the 19th international conference on World wide web, 1233-1236, 2010
Cited by: 29; Year: 2010
Efficient exploration for dialog policy learning with deep BBQ networks & replay buffer spiking
ZC Lipton, J Gao, L Li, X Li, F Ahmed, L Deng
CoRR abs/1608.05081, 2016
Cited by: 27; Year: 2016
Assistive web browsing with touch interfaces
F Ahmed, MA Islam, Y Borodin, IV Ramakrishnan
Proceedings of the 12th international ACM SIGACCESS conference on Computers …, 2010
Cited by: 21; Year: 2010
Non-visual skimming on touch-screen devices
F Ahmed, A Soviak, Y Borodin, IV Ramakrishnan
Proceedings of the 2013 international conference on Intelligent user …, 2013
Cited by: 12; Year: 2013
An intuitive accessible web automation user interface
Y Puzis, Y Borodin, F Ahmed, IV Ramakrishnan
Proceedings of the International Cross-Disciplinary Conference on Web …, 2012
Cited by: 12; Year: 2012
Bridging the web accessibility divide
IV Ramakrishnan, J Mahmud, Y Borodin, MA Islam, F Ahmed
Electronic Notes in Theoretical Computer Science 235, 107-124, 2009
Cited by: 11; Year: 2009