Audio-Guided Image Manipulation for Artistic Paintings


Seung hyun Lee¹, Nahyuk Lee¹, Chanyoung Kim¹, Wonjeong Ryoo¹, Jinkyu Kim¹, Sang Ho Yoon², Sangpil Kim¹

¹Korea University   ²KAIST


Overview

We propose a novel audio-guided image manipulation approach for artistic paintings that generates semantically meaningful latent manipulations given an audio input. To the best of our knowledge, our work is the first to explore generating semantically meaningful image manipulations from various audio sources. Our approach consists of two main steps. First, we train a set of encoders, one per modality (i.e., audio, text, and image), to produce matched latent representations. Second, we use direct code optimization to modify a source latent code in response to a user-provided audio input. This methodology enables a variety of manipulations of art paintings conditioned on driving audio inputs such as wind, fire, explosion, thunderstorm, rain, folk music, and Latin music.
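The second step can be illustrated with a toy sketch. It is not the authors' implementation: the overview does not specify the loss, so the cosine-similarity objective and the proximity penalty `lam` are assumptions, and the fixed linear map `PROJ` is a hypothetical stand-in for the generator followed by the image encoder. The sketch only shows the general idea of moving a source latent code by gradient descent so that its embedding aligns with an audio embedding while staying near the source.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(m, v):
    return [dot(row, v) for row in m]

def cosine(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

def loss(w, w_src, audio_emb, proj, lam):
    # Assumed objective: (1 - cosine similarity) + lam * squared distance to the source code.
    diff = [a - b for a, b in zip(w, w_src)]
    return 1.0 - cosine(matvec(proj, w), audio_emb) + lam * dot(diff, diff)

def manipulate_latent(w_src, audio_emb, proj, lam=0.1, lr=0.05, steps=200):
    """Plain gradient descent on the assumed objective, starting from w_src."""
    w = list(w_src)
    a_norm = math.sqrt(dot(audio_emb, audio_emb))
    for _ in range(steps):
        e = matvec(proj, w)                      # embedding of the current latent code
        e_norm = math.sqrt(dot(e, e))
        ea = dot(e, audio_emb)
        # Gradient of cos(e, a) with respect to e.
        dcos_de = [a / (e_norm * a_norm) - ea * x / (e_norm ** 3 * a_norm)
                   for a, x in zip(audio_emb, e)]
        # Chain rule through the linear map: proj^T @ dcos_de.
        dcos_dw = [sum(proj[i][j] * dcos_de[i] for i in range(len(proj)))
                   for j in range(len(w))]
        # Gradient of the full loss: similarity term plus proximity penalty.
        grad = [-g + 2.0 * lam * (wj - sj)
                for g, wj, sj in zip(dcos_dw, w, w_src)]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Hypothetical toy data: a 3-D latent code and 2-D embeddings.
PROJ = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]  # stand-in for generator + image encoder
W_SRC = [1.0, 0.0, 0.0]                      # source latent code
AUDIO_EMB = [0.0, 1.0]                       # stand-in for the audio encoder output

W_NEW = manipulate_latent(W_SRC, AUDIO_EMB, PROJ)
print("similarity before:", cosine(matvec(PROJ, W_SRC), AUDIO_EMB))
print("similarity after: ", cosine(matvec(PROJ, W_NEW), AUDIO_EMB))
```

In this sketch a larger `lam` keeps the result closer to the source code, and a smaller one lets the audio embedding dominate. A real pipeline would replace `PROJ` with a pretrained generator and image encoder and would rely on automatic differentiation instead of the hand-written gradient.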

Examples

[Figure: three examples, each showing the original art, the driving sound, and the manipulated art. Driving sounds: fire, Latin music, wind.]

BibTeX

If you use our code or data, please cite:

    @article{seunghyun2021audio,
      author       = {Seung hyun Lee and Nahyuk Lee and Chanyoung Kim and Wonjeong Ryoo and Jinkyu Kim and Sang Ho Yoon and Sangpil Kim},
      title        = {Audio-Guided Image Manipulation for Artistic Paintings},
      journal      = {arXiv preprint},
      year         = {2021}
    }