
GANGGANG SHEN

Professor      

  • Professional Title: Professor
  • Gender: Male
  • Status: Employed
  • Department: School of Software Engineering
  • Education Level: Postgraduate (Doctoral)

Paper Publications


LisaCLIP: Locally Incremental Semantics Adaptation towards Zero-shot Text-driven Image Synthesis

Release time: 2023-11-21
Indexed by:
Conference proceedings
Journal:
Proceedings of the International Joint Conference on Neural Networks
Included Journals:
EI, CPCI-S
Place of Publication:
Australia
Discipline:
Engineering
First-Level Discipline:
Computer Science and Technology
Document Type:
C
Key Words:
image synthesis, style transfer, CLIP model, adaptive patch selection
DOI number:
10.1109/IJCNN54540.2023.10191516
Date of Publication:
2023-06-18
Abstract:
The automatic transfer of a plain photo into a desired synthetic style has attracted numerous users in photo editing, visual art, and entertainment. By connecting images and texts, the Contrastive Language-Image Pre-Training (CLIP) model enables text-driven style transfer without exploring the image's latent domain. However, the trade-off between content fidelity and stylization remains challenging. In this paper, we present LisaCLIP, a CLIP-based image synthesis framework that exploits only the CLIP model to guide imagery manipulations with a depth-adaptive encoder-decoder network. Since an image patch's semantics depend on its size, LisaCLIP progressively downsizes the patches while adaptively selecting the most significant ones for further stylization. We introduce a multi-stage training strategy that speeds up LisaCLIP's convergence by decoupling the optimization objectives. Experiments on public datasets demonstrate that LisaCLIP supports a wide range of style transfer tasks and outperforms other state-of-the-art methods in maintaining the balance between content and style.
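The adaptive patch selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, the idea of ranking patches by a precomputed per-patch significance score (e.g., a CLIP-loss contribution), and the `keep_ratio` parameter are all illustrative assumptions.

```python
def select_significant_patches(scores, keep_ratio=0.5):
    """Illustrative sketch: pick the most significant patches for further
    stylization, given one significance score per patch (assumed to come
    from some CLIP-based criterion). Returns the kept patch indices in
    their original order."""
    if not scores:
        return []
    # Keep at least one patch, and at most keep_ratio of them.
    k = max(1, int(len(scores) * keep_ratio))
    # Rank patch indices by score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

# Example: with 4 patches and keep_ratio=0.5, the two top-scoring
# patches (indices 1 and 3) are retained.
print(select_significant_patches([0.1, 0.9, 0.4, 0.7], keep_ratio=0.5))
```

In a progressive scheme like the one the abstract describes, such a selection step would run at each patch scale, so later (smaller-patch) stages refine only the regions judged most significant.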