Audio-Guided Image Manipulation for Artistic Paintings


Seung hyun Lee¹, Nahyuk Lee¹, Chanyoung Kim¹, Wonjeong Ryoo¹, Jinkyu Kim¹, Sang Ho Yoon², Sangpil Kim¹

¹Korea University   ²KAIST


Overview

We propose a novel audio-guided image manipulation approach for artistic paintings, generating semantically meaningful latent manipulations given an audio input. To the best of our knowledge, our work is the first to explore generating semantically meaningful image manipulations from diverse audio sources. Our approach consists of two main steps. First, we train a set of encoders, one per modality (i.e., audio, text, and image), to produce matched latent representations. Second, we use direct code optimization to modify a source latent code in response to a user-provided audio input, as sketched below. This enables a variety of manipulations of artistic paintings conditioned on driving audio inputs, such as wind, fire, explosion, thunderstorm, rain, folk music, and Latin music.
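As an illustration, the direct code optimization step can be sketched as follows. This is a minimal sketch only, assuming a pretrained StyleGAN-style generator and pretrained audio/image encoders that map into a shared embedding space; the function and argument names (manipulate_latent, generator, image_encoder, audio_encoder) and the hyperparameters are illustrative assumptions, not our exact released implementation:

    import torch
    import torch.nn.functional as F

    def manipulate_latent(source_latent, audio, generator, image_encoder, audio_encoder,
                          steps=200, lr=0.05, lambda_reg=0.1):
        """Optimize a copy of the source latent code so the generated image's
        embedding moves toward the audio embedding, while an L2 penalty keeps
        the code close to the original (preserving the source painting)."""
        latent = source_latent.clone().detach().requires_grad_(True)
        optimizer = torch.optim.Adam([latent], lr=lr)

        with torch.no_grad():
            audio_emb = F.normalize(audio_encoder(audio), dim=-1)

        for _ in range(steps):
            image = generator(latent)                                # render the current painting
            image_emb = F.normalize(image_encoder(image), dim=-1)

            sim_loss = 1.0 - (image_emb * audio_emb).sum(-1).mean()        # cosine distance to the audio embedding
            reg_loss = lambda_reg * (latent - source_latent).pow(2).mean()  # stay near the source latent code
            loss = sim_loss + reg_loss

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        return latent.detach()

The regularization weight trades off how strongly the painting is stylized by the sound against how much of the original composition is preserved.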

Examples

Each example pairs an original painting with a driving sound to produce the manipulated painting (Original Art + Sound -> Manipulated Art). Driving sounds shown: fire, Latin music, and wind.

BibTeX

If you use our code or data, please cite:

    @inproceedings{seunghyun2021audio,
      title        = {Audio-Guided Image Manipulation for Artistic Paintings},
      author       = {Seung hyun Lee and Nahyuk Lee and Chanyoung Kim and Wonjeong Ryoo and Jinkyu Kim and Sang Ho Yoon and Sangpil Kim},
      booktitle    = {NeurIPS Workshop on Machine Learning for Creativity and Design},
      year         = {2021}
    }