Author: Chang, Kai-Yueh (張楷岳)
Title: Saliency-Guided Object Segmentation, Detection, and Recognition
Advisors: 賴尚宏, 劉庭祿
Committee members: 林文杰, 廖弘源, 莊永裕, 陳朝欽, 王聖智, 陳煥宗, 鮑興國
Degree: Doctor
Department: Department of Computer Science, College of Electrical Engineering and Computer Science
Year of publication: 2012
Academic year of graduation: 100
Language: English
Number of pages: 91
Keywords: saliency, co-segmentation, object detection, object recognition, Markov random field, graph cuts
This thesis investigates visual saliency and its applications to computer vision. Specifically, we exploit saliency information to build mathematical models for several challenging vision problems, such as object segmentation, detection, and recognition. Although the resulting models differ considerably, they are all motivated by a common idea: saliency information provides useful cues for locating potential regions of interest in an image. By application, the thesis is divided into three topics: image co-segmentation, salient object detection, and multi-class object recognition.
We begin with the problem of co-segmentation over multiple images, focusing on two key issues. The first is whether a fully unsupervised algorithm can solve the problem satisfactorily: without user guidance, segmenting the regions occupied by the common object is quite challenging, especially when the object's appearance, shape, and size may vary. The second is efficiency, which practical applications demand. With these considerations in mind, we build a Markov random field (MRF) model whose energy function has nice properties and effectively addresses both issues. Instead of relying on user guidance, our method introduces a co-saliency prior that provides cues about likely foreground regions and serves as the data term of the MRF. To complete the framework, we design a novel global term that more appropriately represents the co-segmentation problem and makes the energy function submodular; the proposed MRF model can therefore be solved optimally by graph cuts. We demonstrate the advantages of our method on several benchmark datasets.
Continuing from our work on image co-segmentation, we propose a new computational model that explores the relationship between objectness and saliency, two properties that each play an important role in the study of visual attention. Our framework integrates the two concepts, expressing their relationship through a graphical model; iteratively optimizing the corresponding energy function improves the accuracy of estimating both. Specifically, the energy function comprises an objectness energy, a saliency energy, and an interaction energy: the first two regularize the two concepts, while the last accounts for their mutual influence. Fixing either objectness or saliency and optimizing the energy reduces to solving for the other, with the information from the fixed concept passed through the interaction energy to aid the estimation. On two benchmark datasets, experimental results show that for salient object detection our model simultaneously yields a better saliency map and a more meaningful objectness estimate.
Finally, we show that the above framework for salient object detection can be extended to multi-class object recognition. We consider a practical scenario in which object colors or textures are hard to distinguish from the background and depth information is available. Depth can be used to learn pixel-wise classifiers whose outputs improve a bottom-up saliency map, and also to more accurately define the surrounding region needed for object-level saliency estimation. To accomplish recognition, we propose an MRF model that jointly solves object segmentation and detection. In this model, each superpixel is associated with several candidate object segmentations, so inference assigns each superpixel two labels: one indicating the object class, and one indicating which object segmentation best explains the superpixel when it is indeed part of an object. Object recognition is then accomplished by gathering the information provided by these two kinds of labels.
The main theme of this thesis concerns the study of visual saliency and its applications to computer vision. Specifically, we strive to establish formulations that effectively utilize saliency information to better address a number of challenging vision tasks, such as object segmentation, detection, and recognition. While the resulting computational models may differ significantly, they are all motivated by a common central idea: saliency information provides useful cues for identifying potential "regions of interest" in an image. Accordingly, we divide the thesis by application into three topics: image co-segmentation, salient object detection, and multi-class object recognition.
We start with the problem of co-segmentation over multiple images, and particularly focus on two crucial issues. The first is whether a purely unsupervised algorithm can satisfactorily solve this problem. Without the user's guidance, segmenting the foregrounds implied by the common object is quite challenging, especially when substantial variations in the object's appearance, shape, and scale are allowed. The second issue concerns efficiency: whether the technique can indeed lead to practical uses. With these in mind, we establish an MRF optimization model whose energy function has nice properties and can be shown to effectively resolve the two aforementioned difficulties. Instead of relying on user inputs, our approach introduces a co-saliency prior as a hint about possible foreground locations, and uses it to construct the MRF data terms. To complete the optimization framework, we design a novel global term that is more appropriate to co-segmentation and results in a submodular energy function. The proposed MRF model can thus be optimally solved by graph cuts. We demonstrate these advantages by testing our method on several benchmark datasets.
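The structure of such an MRF energy can be illustrated with a toy sketch. The example below is purely illustrative and not the thesis's actual model: the `saliency` values and the Potts smoothness weight are made-up assumptions, and exhaustive search stands in for graph cuts at this tiny size. A co-saliency prior in [0, 1] supplies the data term, and the pairwise Potts term is submodular for binary labels, which is the property that lets graph cuts find the exact optimum.

```python
from itertools import product

# Assumed co-saliency prior per pixel (high value = likely foreground).
saliency = [0.9, 0.8, 0.2, 0.1, 0.85]
LAMBDA = 0.5  # smoothness weight (assumed)

def data_term(label, s):
    # Cost of assigning `label` given co-saliency s:
    # high saliency makes foreground (label 1) cheap.
    return (1.0 - s) if label == 1 else s

def pairwise_term(a, b):
    # Potts penalty on neighboring labels; submodular for
    # binary labels, so graph cuts would solve this exactly.
    return LAMBDA if a != b else 0.0

def energy(labels):
    e = sum(data_term(l, s) for l, s in zip(labels, saliency))
    e += sum(pairwise_term(labels[i], labels[i + 1])
             for i in range(len(labels) - 1))
    return e

# Exhaustive minimization over all 2^5 labelings stands in
# for graph cuts on this toy problem size.
best = min(product([0, 1], repeat=len(saliency)), key=energy)
print(best)  # → (1, 1, 0, 0, 1)
```

The optimal labeling follows the prior where it is confident and pays the smoothness penalty only at the two foreground/background transitions.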
Motivated by the promising results in tackling image co-segmentation, we next establish a novel computational model for salient object detection. The framework explores the relatedness of objectness and saliency: it integrates the two concepts via a graphical model that accounts for their underlying relationships, and concurrently improves both estimations by iteratively optimizing a novel energy function. Specifically, the function comprises an objectness energy, a saliency energy, and an interaction energy, with the first two capturing the individual regularities of the two concepts and the last their mutual effects. Minimizing the energy with one concept fixed elegantly transforms the model into solving the estimation problem for the other, while the useful information from the fixed concept is passed through the interaction term. Experimental results on two benchmark datasets demonstrate that our method simultaneously yields a saliency map of better quality and a more meaningful objectness output for salient object detection.
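The alternating optimization can be sketched with a deliberately simplified energy. This is a toy stand-in, not the thesis's actual formulation: the quadratic terms, the initial estimates `o0`, `s0`, and the interaction weight `mu` are all illustrative assumptions. Fixing one variable makes the energy quadratic in the other, so each half-step has a closed form, and iterating the two updates converges to the joint minimum.

```python
# Toy coupled energy for one region:
#   E(o, s) = (o - o0)^2 + (s - s0)^2 + mu * (o - s)^2
# where o is objectness, s is saliency, and the mu-term is the
# interaction energy pulling the two estimates toward agreement.
o0, s0 = 0.9, 0.3   # assumed independent initial estimates
mu = 1.0            # interaction strength (assumed)

o, s = o0, s0
for _ in range(100):
    o = (o0 + mu * s) / (1.0 + mu)   # minimize E over o, s fixed
    s = (s0 + mu * o) / (1.0 + mu)   # minimize E over s, o fixed

print(round(o, 4), round(s, 4))  # → 0.7 0.5
```

The two estimates move toward each other from their initial values (0.9 and 0.3), each half-step using the other's current value through the interaction term, mirroring how fixing one concept aids the estimation of the other.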
Finally, we show that the above framework for salient object detection can be extended to solving the problem of multi-class object recognition. In particular, we consider a practical scenario in which the object colors or textures are difficult to differentiate from the background, and information from a depth camera is available. Knowledge about depth can be used in learning pixel-wise classifiers that improve the quality of a bottom-up saliency map, and also in more accurately specifying the surround area for object-level saliency estimation. To accomplish the task of object recognition, we propose a unified MRF model that simultaneously solves the segmentation and detection problems. In the formulation, a number of candidate object segmentations (segment proposals) are linked to each superpixel. For each superpixel, inference derives two labels: one gives the object class, and the other selects which segment proposal is most suitable if the superpixel is indeed part of an object. Object recognition is then achieved by gathering the information about the two types of labels from all superpixels.
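The two-label-per-superpixel idea can be sketched as follows. This is a hypothetical illustration only: the class names, proposal names, and all costs are made up, and each superpixel is decided independently here, whereas the thesis performs joint MRF inference with smoothness between superpixels. Each superpixel jointly receives a (class, proposal) pair, and recognition results are gathered from the assigned labels.

```python
# Hypothetical candidate labels: object classes and segment proposals.
CLASSES = ["background", "mug", "book"]
PROPOSALS = ["P0", "P1"]

# Assumed per-superpixel costs over joint (class, proposal) labels;
# lower is better. Two superpixels in this toy image.
costs = [
    {("mug", "P0"): 0.2, ("mug", "P1"): 0.6,
     ("background", "P0"): 0.8, ("background", "P1"): 0.8,
     ("book", "P0"): 0.9, ("book", "P1"): 0.9},
    {("mug", "P0"): 0.3, ("mug", "P1"): 0.7,
     ("background", "P0"): 0.5, ("background", "P1"): 0.5,
     ("book", "P0"): 0.9, ("book", "P1"): 0.9},
]

# Independent per-superpixel inference: pick the cheapest joint label.
labels = [min(c, key=c.get) for c in costs]

# Gather recognition results from the class labels of all superpixels.
detected = {cls for cls, _ in labels if cls != "background"}
print(labels, detected)
```

Both superpixels select class "mug" explained by proposal "P0", so gathering the labels yields a single detected object instance of class "mug".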