
GANG SHEN

Professor

Personal Information
  • English Name: SHEN GANG
  • Gender: Male
  • Employment Status: Employed
  • Affiliation: School of Software
  • Education: Postgraduate (Doctoral) Graduate
  • Degree: Doctor of Philosophy (Ph.D.)

Other Contact Information

None available

Publications


LisaCLIP: Locally Incremental Semantics Adaptation towards Zero-shot Text-driven Image Synthesis

Posted: 2023-11-21
Paper Type: Conference Proceedings
Published In: Proceedings of the International Joint Conference on Neural Networks
Indexed By: EI, CPCI-S
Venue Location: Australia
Discipline Category: Engineering
Primary Discipline: Computer Science and Technology
Document Type: C
Keywords: image synthesis, style transfer, CLIP model, adaptive patch selection
DOI: 10.1109/IJCNN54540.2023.10191516
Publication Date: 2023-06-18
Abstract:
The automatic transfer of a plain photo into a desired synthetic style has attracted numerous users in photo editing, visual art, and entertainment applications. By connecting images and texts, the Contrastive Language-Image Pre-Training (CLIP) model facilitates text-driven style transfer without exploring the image's latent domain. However, the trade-off between content fidelity and stylization remains challenging. In this paper, we present LisaCLIP, a CLIP-based image synthesis framework that exploits only the CLIP model to guide image manipulation with a depth-adaptive encoder-decoder network. Since an image patch's semantics depend on its size, LisaCLIP progressively downsizes the patches while adaptively selecting the most significant ones for further stylization. We introduce a multi-stage training strategy that speeds up LisaCLIP's convergence by decoupling the optimization objectives. Experiments on public datasets demonstrate that LisaCLIP supports a wide range of style transfer tasks and outperforms other state-of-the-art methods in maintaining the balance between content and style.
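
The abstract describes two ingredients that lend themselves to a code illustration: a CLIP-guided loss that steers a stylized output toward a text prompt, and an adaptive selection of image patches that still need further stylization. The sketch below is a minimal, hypothetical PyTorch example of these general ideas, assuming OpenAI's CLIP package; the function names, patch sizes, and selection heuristic are illustrative assumptions and are not the paper's actual implementation.

```python
# Hypothetical sketch of CLIP-guided stylization with patch selection.
# Assumes OpenAI's CLIP package (https://github.com/openai/CLIP).
# Names, sizes, and the selection heuristic are illustrative only.
import torch
import torch.nn.functional as F
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model.eval()

def encode_text(prompt: str) -> torch.Tensor:
    # Encode a text prompt into a normalized CLIP embedding.
    tokens = clip.tokenize([prompt]).to(device)
    with torch.no_grad():
        feat = clip_model.encode_text(tokens)
    return F.normalize(feat, dim=-1)

def encode_image(img: torch.Tensor) -> torch.Tensor:
    # img: (N, 3, 224, 224), already preprocessed for CLIP.
    feat = clip_model.encode_image(img)
    return F.normalize(feat, dim=-1)

def directional_clip_loss(content_img, stylized_img, src_prompt, style_prompt):
    # Align the image-space edit direction with the text-space direction.
    text_dir = F.normalize(encode_text(style_prompt) - encode_text(src_prompt), dim=-1)
    img_dir = F.normalize(encode_image(stylized_img) - encode_image(content_img), dim=-1)
    return (1.0 - (img_dir * text_dir).sum(dim=-1)).mean()

def select_significant_patches(stylized_img, style_feat,
                               patch_size=128, n_patches=64, keep_ratio=0.5):
    # Crop random patches and keep those least similar to the style text,
    # i.e. the patches that still need further stylization (a heuristic).
    n, _, h, w = stylized_img.shape
    crops = []
    for _ in range(n_patches):
        y = torch.randint(0, h - patch_size + 1, (1,)).item()
        x = torch.randint(0, w - patch_size + 1, (1,)).item()
        patch = stylized_img[:, :, y:y + patch_size, x:x + patch_size]
        crops.append(F.interpolate(patch, size=224, mode="bicubic",
                                   align_corners=False))
    crops = torch.cat(crops, dim=0)                # (n_patches * n, 3, 224, 224)
    with torch.no_grad():
        sim = (encode_image(crops) * style_feat).sum(dim=-1)
    k = max(1, int(keep_ratio * crops.shape[0]))
    idx = sim.topk(k, largest=False).indices       # least-stylized patches
    return crops[idx]
```

In a fuller pipeline along the lines of the abstract, the selected patches would be refined by the encoder-decoder network with the directional loss applied per patch, and the multi-stage training would optimize content-preservation and stylization objectives in separate phases rather than jointly.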