One-Shot Object Detection with Co-Attention and Co-Excitation

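Graduate student:	謝廷翊 Hsieh, Ting-I
Thesis title:	透過共注意力與共激發來實現單樣本物件偵測 One-Shot Object Detection with Co-Attention and Co-Excitation
Advisor:	陳煥宗 Chen, Hwann-Tzong
Committee members:	林彥宇 Lin, Yen-Yu; 陳嘉平 Chen, Chia-Ping; 劉庭祿 Liu, Tyng-Luh
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication:	2020
Academic year of graduation:	108
Language:	English
Number of pages:	30
Chinese keywords:	co-attention, co-excitation, one-shot object detection, object detection, one-shot object detection with co-attention and co-excitation
English keywords:	One-Shot Object Detection with Co-Attention and Co-Excitation, One-Shot, Object Detection, Co-Attention, Co-Excitation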

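This thesis proposes a method for one-shot object detection based on co-attention and co-excitation. In everyday life, humans can detect and recognize objects with high accuracy from the visual information in only a few samples, but for deep learning models, reliable object detection from so few samples remains a very difficult challenge. In this thesis we investigate how to strengthen learning from a single sample, using co-attention and co-excitation to improve the model's learning ability. Methodologically, we adopt Faster R-CNN as the base architecture. For each feature region of the target image, we measure its similarity to the features of the query sample and strengthen the feature regions of potential objects. Finally, we use the features of the query sample to select the most useful features, emphasizing the useful ones and discarding the useless ones, which makes the similarity judgment more reliable. Our one-shot object detection results are on par with current state-of-the-art methods, and we have open-sourced the code needed for the experiments for use in subsequent research.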
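The similarity matching between target feature regions and query features described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function name co_attend and the flattened list-of-vectors layout are assumptions made for illustration, and the learned embedding layers of a full non-local block are left out. Each target location receives a softmax-weighted mix of query features, so locations that resemble the query are reinforced.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def co_attend(target_feats, query_feats):
    """For each target location, add a similarity-weighted mix of query features.

    Both arguments are lists of C-dimensional feature vectors, one per spatial
    location (the H x W grid flattened). This layout and the absence of learned
    embeddings are simplifying assumptions of this sketch.
    """
    attended = []
    for t in target_feats:
        # Similarity of this target location to every query location.
        weights = softmax([dot(t, q) for q in query_feats])
        # Weighted average of the query features, channel by channel.
        context = [sum(w * q[c] for w, q in zip(weights, query_feats))
                   for c in range(len(t))]
        # Residual sum keeps the original target feature.
        attended.append([tv + cv for tv, cv in zip(t, context)])
    return attended
```

Calling co_attend(query_feats, target_feats) with the arguments swapped gives the other direction, in which the query features are enriched with information from the target image.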

This thesis tackles the challenging problem of one-shot object detection. Given a query image patch whose class label is not included in the training data, the goal is to detect all instances of the same class in a target image. To this end, we develop a novel co-attention and co-excitation (CoAE) framework that makes contributions in three key technical aspects. First, we propose to use the non-local operation to explore the co-attention embodied in each query-target pair and yield region proposals that account for the one-shot situation. Second, we formulate a squeeze-and-co-excitation scheme that adaptively emphasizes correlated feature channels to help uncover relevant proposals and, eventually, the target objects. Third, we design a margin-based ranking loss for implicitly learning a metric that predicts the similarity of a region proposal to the underlying query, regardless of whether its class label is seen or unseen in training. The resulting model is therefore a two-stage detector that yields a strong baseline on both VOC and MS-COCO under the one-shot setting of detecting objects from both seen and never-seen classes.
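The second and third aspects named in the abstract can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the thesis's code: the function names, the C x N feature-map layout (channels by flattened locations), the two-layer gating network, and the margin values m_pos and m_neg are all hypothetical placeholders. The co-excitation step pools the query features per channel, turns them into gates between 0 and 1, and rescales the channels of both the query and target maps. The loss is a simple hinge form that pushes foreground proposal scores above one margin and background scores below another; the thesis's exact loss may differ.

```python
import math


def squeeze(feature_map):
    """Global average pooling: one mean per channel of a C x N feature map."""
    return [sum(channel) / len(channel) for channel in feature_map]


def dense(vector, weights, biases):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def co_excite(query_map, target_map, w1, b1, w2, b2):
    """Gate the channels of both maps with weights derived from the query.

    w1, b1, w2, b2 stand in for learned parameters (hypothetical here).
    """
    hidden = [max(0.0, h) for h in dense(squeeze(query_map), w1, b1)]  # ReLU
    gates = [1.0 / (1.0 + math.exp(-g)) for g in dense(hidden, w2, b2)]  # sigmoid

    def rescale(fmap):
        return [[g * v for v in channel] for g, channel in zip(gates, fmap)]

    return rescale(query_map), rescale(target_map), gates


def margin_ranking_loss(scores, labels, m_pos=0.7, m_neg=0.3):
    """Hinge-style loss over proposal scores in [0, 1].

    Foreground proposals (label 1) are penalized for scoring below m_pos,
    background proposals (label 0) for scoring above m_neg. The margin
    values are placeholders chosen for illustration.
    """
    total = 0.0
    for s, y in zip(scores, labels):
        total += max(m_pos - s, 0.0) if y == 1 else max(s - m_neg, 0.0)
    return total / len(scores)
```

With identity weights and zero biases, a query channel with a large mean activation receives a gate close to 1 while a silent channel receives a gate of 0.5, so the matching channel dominates in both feature maps after rescaling.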

List of Tables
Abstract (Chinese)
Abstract
Introduction
Related work
Proposed method
Experiments
Ablation studies
Conclusion
Bibliography
                                

