| Field | Value |
|---|---|
| Author | 周猷翔 Chou, Yu-Hsiang |
| Thesis Title | 基於金字塔式對抗生成網路的可控制筆觸風格轉移 (A Controllable-Brushstroke Style-Transfer Method using Pyramid Generative Adversarial Networks) |
| Advisor | 黃婷婷 Hwang, Ting-Ting |
| Committee Members | 吳中浩 Wu, Allen C.-H.; 劉一宇 Liu, Yi-Yu |
| Degree | Master |
| Department | College of Electrical Engineering and Computer Science, Department of Computer Science |
| Year of Publication | 2020 |
| Academic Year of Graduation | 109 (2020–2021) |
| Language | English |
| Number of Pages | 31 |
| Keywords (Chinese) | 風格轉換 |
| Keywords (English) | Style transfer |
In this thesis, we propose a fast, stroke-controllable style-transfer framework that captures the style of a single artist. Using a GAN as the base model, we introduce a pyramid-like architecture that captures different receptive fields, allowing the network to produce images with various brushstroke sizes. As a result, our model can imitate an artist's overall style, not just the style of a single painting. We then use a mask array to fuse the regions of various brushstroke sizes into one image, and we resolve the color-tone consistency problem between regions of different brushstroke sizes. Finally, we perform a series of experiments to demonstrate the effectiveness of the proposed method.
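The fusion step described above can be pictured as a per-pixel blend of the renderings produced at each brushstroke size, with a mask array deciding which size is visible where. The following NumPy sketch only illustrates that idea under assumed conventions; the function name, mask format, and blending rule are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def fuse_brushstroke_regions(stylized, masks):
    """Illustrative fusion of K stylized renderings (one per brushstroke size)
    into a single image using a per-pixel mask array.

    stylized: (K, H, W, 3) array, one rendering per brushstroke size.
    masks:    (K, H, W) array of non-negative weights selecting which
              brushstroke size dominates at each pixel (assumed convention).
    """
    # Normalize the masks so the weights sum to 1 at every pixel.
    weights = masks / np.clip(masks.sum(axis=0, keepdims=True), 1e-8, None)
    # Weighted blend of the K renderings.
    return (weights[..., None] * stylized).sum(axis=0)

# Toy usage: three renderings of a 4x4 image with hard region masks.
K, H, W = 3, 4, 4
stylized = np.random.rand(K, H, W, 3)
masks = np.zeros((K, H, W))
masks[0, :, :2] = 1.0   # fine strokes on the left columns
masks[1, :, 2:3] = 1.0  # medium strokes in the middle column
masks[2, :, 3:] = 1.0   # coarse strokes on the right column
fused = fuse_brushstroke_regions(stylized, masks)
print(fused.shape)  # (4, 4, 3)
```

With hard (0/1) masks this reduces to cutting and pasting regions from the different brushstroke-size outputs; soft masks would instead blend them smoothly at region boundaries.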