Graduate student: | 林奕辰 Lin, Eason |
---|---|
Thesis title: | 應用兩階段深度學習方法建構開放類別商標識別系統 Develop an Open Set Logo Recognition System Using Two-stage Deep Learning Method |
Advisor: | 張瑞芬 Trappey, Amy J. C. |
Committee members: | 王建智 Wang, Chien-Chih; 林裕訓 Lin, Yu-Hsun |
Degree: | Master |
Department: | College of Engineering - Department of Industrial Engineering and Engineering Management |
Year of publication: | 2022 |
Academic year of graduation: | 110 |
Language: | Chinese |
Number of pages: | 52 |
Keywords (Chinese): | 深度學習, 商標偵測, 電子商務, 影像相似度測量 |
Keywords (English): | Deep learning, Logo recognition, E-commerce, Image similarity measurement |
An object recognition model must perform two tasks: object localization and object classification. With recent advances in deep learning, object recognition has progressed substantially, and models can now accurately find specified objects in complex images. This study uses a two-stage deep learning method to build an open-set logo recognition system for the intellectual property protection of figurative trademarks. The system automatically collects product images from online stores and finds target trademarks in them. On e-commerce sites, product images are often used for promotion or advertising, and these images may contain registered trademarks. Using a trademark without authorization, or using a confusingly similar trademark, constitutes trademark infringement. Past trademark classification schemes focused on establishing a globally unified trademark management standard and are difficult to apply to finding infringement on the internet. The proposed system automatically flags images that raise infringement concerns, helping trademark owners protect their intellectual property. The first stage is logo detection and localization (LDL), which uses a YOLO v4 model to localize and crop all logos in an image regardless of class. The second stage is trademark similarity measurement (TMSM), which compares each localized logo with template images of registered trademarks; if a logo is highly similar to a particular registered trademark template, it is assigned to that template's class. The YOLO v4 model was trained on 6,928 images, and the TMSM model was trained on 49,704 logo images. The system achieves 62.2% mAP on the FlickrLogos-32 test set (3,960 images) and 98.3% precision on a test set of e-commerce images from Amazon (1,243 images).
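The open-set decision in the second stage can be illustrated with a minimal sketch. It assumes each cropped logo and each registered trademark template has already been turned into an embedding vector, and it uses cosine similarity with a fixed threshold. The abstract does not state the similarity measure, the embedding model, or any threshold, so `cosine_similarity`, `match_logo`, the value `0.8`, and the toy vectors are all hypothetical. Stage one (the YOLO v4 detector) is not reproduced here.

```python
import math
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_logo(
    crop_embedding: Vector,
    templates: Dict[str, Vector],
    threshold: float = 0.8,  # hypothetical value, not taken from the thesis
) -> Tuple[Optional[str], float]:
    """Return (label, score) for the most similar template.

    The label is None when the best score is below the threshold, which
    is how this sketch treats a logo that matches no registered template.
    """
    best_label: Optional[str] = None
    best_score = -1.0
    for label, template_embedding in templates.items():
        score = cosine_similarity(crop_embedding, template_embedding)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Toy embeddings standing in for the output of a trained similarity model.
templates = {
    "brand_a": [1.0, 0.0, 0.0],
    "brand_b": [0.0, 1.0, 0.0],
}
known_crop = [0.9, 0.1, 0.0]    # close to the brand_a template
unknown_crop = [0.5, 0.5, 0.7]  # not close to any template

print(match_logo(known_crop, templates))
print(match_logo(unknown_crop, templates))
```

In this sketch, supporting a new trademark only requires adding its template embedding to `templates`; the detector and the embedding model stay unchanged, which is the practical appeal of separating class-agnostic detection from similarity-based classification.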