Graduate student: | 賴承薰 Lai, Cheng-Hsun |
---|---|
Thesis title: | 基於生成對抗式網路之虛擬服裝穿戴合成技術 Virtual Try-on Image Synthesis Based on Generative Adversarial Networks |
Advisor: | 林嘉文 Lin, Chia-Wen |
Committee members: | 鄭文皇 Cheng, Wen-Huang; 胡敏君 Hu, Min-Chun; 邱維辰 Chiu, Wei-Chen |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2019 |
Academic year of graduation: | 107 |
Language: | English |
Number of pages: | 34 |
Chinese keywords: | 影像合成 (image synthesis), 虛擬試穿 (virtual try-on), 對抗式生成網路 (generative adversarial network) |
English keywords: | Image Synthesis, Virtual try-on, generative adversarial network |
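Image synthesis is a steadily developing technique in computer vision research. Its applications are numerous, and demand for it is strong in both academia and industry. With the development and evolution of techniques and architectures such as convolutional neural networks and generative adversarial networks, we are increasingly able to synthesize images that pass for real ones. Meanwhile, with the boom in e-commerce, we rely more and more on online shopping, and consumers browsing shopping websites want as much product information as possible. Combining these two points, we infer that image-based virtual try-on is the technique that best meets the needs of consumers on online clothing stores.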
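However, this technique remains quite challenging. The first challenge is preserving information about the structure of the human body: maintaining the continuity of the limbs during the transformation and keeping the pose natural rather than awkward. The other main challenge is that the colors, patterns, prints, and other features of the garment to be tried on must all be fully preserved during the transformation. Current network architectures still leave much room for improvement on both counts.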
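In this thesis, we propose a network for virtually trying on clothes. The network takes a photo of the user as the person to be dressed and extracts body-structure information from it. It also takes an image of the garment to be tried on, which serves as the target outfit. The virtual try-on network then replaces the clothing worn by the person in the input image, while preserving the realism of the generated image, the fit of the garment, and the prints, patterns, and other details of the target garment. The first part of our architecture is clothing warping: while preserving garment detail, it fits the garment to the shape and angle of the body as it would actually be worn, and also tries to simulate the wrinkles and shading that appear when the garment is worn. The second part is the try-on model: with the help of a mask, it dresses the model in the image in the garment while keeping contours and boundaries sharp.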
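In terms of synthesis results, we achieve visually better output, reducing the blurred boundaries between body and clothing seen in previously synthesized images. A user study also shows that most participants subjectively prefer the results produced by our network architecture.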
Image synthesis has attracted increasing research attention in the area of computer vision. It has many applications and is in heavy demand in both academia and industry. Computer vision techniques such as convolutional neural networks and generative adversarial networks improve every day, and it is getting harder to tell synthesized images from real ones. On the other hand, e-commerce and online shopping have become more and more popular. We are naturally eager to obtain more information about merchandise before we buy it, especially when buying clothes. Clothing styles are varied and diverse, so it is reasonable that we want to know whether a specific pattern suits us. Combining these two trends, we predict that image-based virtual try-on systems for fitting in-shop clothes exactly meet customers' requirements.
However, virtual try-on systems are not yet fully developed. The first challenge is to preserve information about the human body, which is difficult because of occlusion and point of view: the system must maintain pose invariance while making the clothes fit the human body. Second, the pattern of the clothes, such as color, logo, or texture, is also an issue: completely transferring the original clothes onto the target person image is still challenging. Finally, distortion is highly likely to occur during synthesis, so maintaining photo-realism and structural coherence is one of the most important requirements for making our work persuasive.
In this thesis, we propose a network that can re-dress the person in an input image with target clothes. With our virtual try-on model, the clothes should fit the human body in the output image, and the model should wear the assigned clothes with clothing details well preserved, while the image remains photo-realistic. The first part of our architecture is a warping model. It learns to simulate the shape deformation and slight angle changes that occur while wearing, but preserves the pattern details. It also tries to generate shading and wrinkles on the worn clothes. With the help of a parsing map, the second part of our architecture lets the person try on the target clothes while maintaining clear contours and boundaries.
The experimental results show improvements in both visual and quantitative results. In a user study, the majority of participants preferred the results generated by our proposed method.
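The abstract's second stage, in which a mask or parsing map keeps contours and boundaries clear, can be illustrated with a generic per-pixel blend. This is only a minimal sketch under stated assumptions: the abstract does not give the formula the thesis uses, so a plain linear mask composite is swapped in here, and the names `composite`, `person`, `warped_clothes`, and `clothes_mask` are hypothetical, as is the use of tiny grayscale images stored as nested lists.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of intensities in [0, 1].
Image = List[List[float]]


def composite(person: Image, warped_clothes: Image, clothes_mask: Image) -> Image:
    """Blend a warped garment onto a person image using a per-pixel mask.

    Where the mask is 1 the output takes the garment pixel, where it is 0 it
    keeps the person pixel; fractional values blend the two linearly.
    This is a generic compositing rule, not the thesis's confirmed method.
    """
    if not (len(person) == len(warped_clothes) == len(clothes_mask)):
        raise ValueError("all inputs must have the same height")
    out: Image = []
    for p_row, c_row, m_row in zip(person, warped_clothes, clothes_mask):
        if not (len(p_row) == len(c_row) == len(m_row)):
            raise ValueError("all inputs must have the same width")
        out.append([m * c + (1.0 - m) * p for p, c, m in zip(p_row, c_row, m_row)])
    return out


# Hypothetical 2x3 example: the garment covers the middle column only.
person = [[0.2, 0.2, 0.2], [0.4, 0.4, 0.4]]
warped_clothes = [[0.9, 0.9, 0.9], [0.9, 0.9, 0.9]]
clothes_mask = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
result = composite(person, warped_clothes, clothes_mask)
```

A hard 0/1 mask like the one above gives a sharp garment boundary, while fractional mask values near the edge would soften it. In a learned system the mask would presumably come from the network or from a parsing map rather than being written by hand.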