Wenyan Li

Biography

Hi! I am Wenyan. My research focuses on building and interpreting multimodal and NLP models. I completed my PhD with a focus on multimodal learning at the CoAStaL NLP Group, University of Copenhagen, where I was supervised by Anders Søgaard.

I was also a Senior NLP Researcher at Sensetime and Comcast AI Research Lab. Before that, I spent a wonderful time at the University of Maryland, College Park for my MS, where I worked with Prof. Jordan Boyd-Graber on Natural Language Processing.

Feel free to reach out for collaboration on related projects.

In my free time, I enjoy painting, cooking, yoga, and table tennis :)

NEWS

05/2026 – ExpAlign is accepted to ICML 2026!
01/2026 – RAVENEA is accepted to ICLR!
10/2025 – I have successfully defended my PhD! 🎓
08/2025 – Two papers (one main and one findings) are accepted to EMNLP 2025!
07/2025 – CultureCLIP is accepted to COLM 2025!
I’m currently on a research visit at RycoLab in Zürich until the end of June 2025.
11/2024 – Invited talk and short visit at MIT.
11/2024 – Our W1KP paper won the Outstanding Paper Award at EMNLP 2024! 🏆
11/2024 – I will present FoodieQA at EMNLP 2024, see you in Miami!
09/2024 – FoodieQA and W1KP are accepted to EMNLP 2024 main conference!
05/2024 – One paper accepted to ACL 2024 main conference!

Interests

Multimodal Learning
Natural Language Processing
Information Retrieval
Speech Technology

Education

Ph.D. in Computer Science, 2022-2025
University of Copenhagen, Denmark

M.S. in EECS (Thesis Track), 2016-2018
University of Maryland, College Park

Selected Publications

Jiaang Li, Yifei Yuan, Wenyan Li, Mohammad Aliannejadi, Daniel Hershcovich, Anders Søgaard, Ivan Vulić, Wenxuan Zhang, Paul Pu Liang, Yang Deng, Serge Belongie (2026). RAVENEA: A Benchmark for Multimodal Retrieval-Augmented Visual Culture Understanding. To appear in ICLR 2026.

PDF Cite Code Dataset Project

Wenyan Li, Raphael Tang, Chengzu Li, Caiqi Zhang, Ivan Vulić, Anders Søgaard (2025). Lost in Embeddings: Information Loss in Vision-Language Models. In EMNLP 2025 findings.

PDF Cite Code

Raphael Tang, Crystina Zhang, Lixinyu Xu, Yao Lu, Wenyan Li, Pontus Stenetorp, Jimmy Lin, Ferhan Ture (2024). Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation. In EMNLP 2024. Outstanding Paper Award.

PDF Cite Website

Wenyan Li, Xinyu Zhang, Jiaang Li, Qiwei Peng, Raphael Tang, Li Zhou, Weijia Zhang, Guimin Hu, Yifei Yuan, Anders Søgaard, Daniel Hershcovich, Desmond Elliott (2024). FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture. In EMNLP 2024.

PDF Cite Code Dataset

Wenyan Li, Jiaang Li, Rita Ramos, Raphael Tang, Desmond Elliott (2024). Understanding Retrieval Robustness for Retrieval-Augmented Image Captioning. In ACL 2024 (oral, 8%).

PDF Cite Code

Wenyan Li, Jonas F Lotz, Chen Qiu, Desmond Elliott (2024). The Role of Data Curation in Image Captioning. EACL 2024 (oral).

PDF Cite Code Slides

Wenyan Li, Dong Li, Wanjing Li, Yuanjie Wang, Hai Jie, Yiran Zhong (2023). MAP: Low-data Regime Multimodal Learning with Adapter-based Pre-training and Prompting. In Learning with Small Data (LSD), 2023.

PDF Cite

Wenyan Li, Ferhan Ture, Jose Casillas, Tom Des Jardins (2022). Systems and Methods for Training Voice Query Models. U.S. filed patent.

Cite

Wenyan Li, Alvin Grissom II, Jordan Boyd-Graber (2020). An Attentive Recurrent Model for Incremental Prediction of Sentence-final Verbs. In EMNLP findings 2020.

PDF Cite DOI

Wenyan Li, Ferhan Ture (2020). Auto-annotation for Voice-enabled Entertainment Systems. In SIGIR 2020.

PDF Cite Slides Video DOI

Experience

Senior NLP Researcher (Artificial General Intelligence team)

Sensetime

Sep 2021 – Jun 2022 Shanghai, China

Knowledge-enhanced QA and dialogue system
Multimodal and prompt learning

Senior Machine Learning Research Engineer (NLP)

Comcast Applied AI Research Lab

Jan 2019 – Jul 2021 Washington D.C.

Designed an unsupervised auto-annotation system for voice queries with user behavioral modeling to automatically identify errors in speech recognition and NLP systems and suggest corrections
Built an active learning pipeline with auto-labeled user transcriptions to improve the ASR system for Comcast X1, increasing system recognition accuracy by 9% (summarized the work in a conference paper as first author and filed a patent as the main inventor)
Developed a context-based approach that discovered misclassified user queries in question answering systems by performing semantic search with Sentence-BERT
Leveraged subword-level query representation and adversarial training in a customer care dialogue system for misspelled user queries, which improved classification accuracy by 18% and increased user experience stability
Mentored interns and new hires on projects relevant to multi-task learning and query representation

Projects

Predicting Phenotype from Genomic Sequence with Deep Neural Networks

Predicted pairwise gene interactions using attention-based RNN/CNN models and compared them to the baseline random forest approach with gene ontotypes described in Yu et al. (Cell Systems, 2016), on the S. cerevisiae data from Costanzo et al. (Science, 2010).

Certificates

Applied Text Mining in Python

Coursera Apr 2018

See certificate

Using Python for Research

edX Jul 2017

See certificate

Object Oriented Programming in Java

Coursera Jul 2016

See certificate
