Wenyan Li

Biography

Hi! I am Wenyan, a PhD student at the CoAStaL NLP Group, University of Copenhagen. I’m supervised by Anders Søgaard.

Before this, I worked as a Senior NLP Researcher at SenseTime and at Comcast AI Research Lab (mentored by Ferhan Ture).

I completed my MS at the University of Maryland, College Park, where I worked with Prof. Jordan Boyd-Graber on Natural Language Processing.

In my free time, I enjoy painting, cooking, yoga, and table tennis :)

NEWS:

  • 11/2024 – I will be presenting FoodieQA at EMNLP 2024, see you in Miami!
  • 09/2024 – FoodieQA and W1KP are accepted to EMNLP 2024 main conference!
  • 08/2024 – I will be presenting our paper at ACL 2024, see you in Bangkok!
  • 05/2024 – One paper accepted to ACL 2024 main conference!
Interests
  • Natural Language Processing
  • Multimodal Learning
  • Information Retrieval
  • Speech Technology
Education
  • M.S. in EECS (Thesis Track), 2016-2018

    University of Maryland, College Park

  • B.E. in Electrical Engineering, 2012-2016

    Northwestern Polytechnical University

Experience

Senior NLP Researcher (Artificial General Intelligence team)
Sep 2021 – Jun 2022 Shanghai, China
  • Knowledge-enhanced QA and dialogue systems
  • Multimodal and prompt learning
 
Senior Machine Learning Research Engineer (NLP)
Jan 2019 – Jul 2021 Washington D.C.
  • Designed an unsupervised auto-annotation system for voice queries with user behavioral modeling to automatically identify errors in speech recognition and NLP systems and suggest corrections
  • Built an active learning pipeline with auto-labeled user transcriptions to improve the ASR system for Comcast X1, increasing system recognition accuracy by 9% (summarized the work in a conference paper as first author and filed a patent as the main inventor)
  • Developed a context-based approach that discovered misclassified user queries in question answering systems by performing semantic search with Sentence-BERT
  • Leveraged subword-level query representation and adversarial training in a customer care dialogue system to handle misspelled user queries, which improved classification accuracy by 18% and made the user experience more stable
  • Mentored interns and new hires on projects related to multi-task learning and query representation

Certificates

  • Applied Text Mining in Python (Coursera)
  • Using Python for Research (edX)
  • Object Oriented Programming in Java (Coursera)

Contact