Wenyan Li
Multimodal
Data Curation for Image Captioning with Text-to-Image Generative Models
Recent advances in image captioning are mainly driven by large-scale vision-language pretraining, relying heavily on computational …
Wenyan Li, Jonas F Lotz, Chen Qiu, Desmond Elliott
MAP: Low-data Regime Multimodal Learning with Adapter-based Pre-training and Prompting
Pretrained vision-language (VL) models have shown impressive results on various multi-modal downstream tasks recently. Many of the …
Wenyan Li, Dong Li, Wanjing Li, Yuanjie Wang, Hai Jie, Yiran Zhong