Graduate student: | 李爵宇 Chueh-Yu Li |
---|---|
Thesis title: | 利用區域資訊進行內涵式影像檢索的檢索歷程學習機制之研究 A Study on Region-based Issues for Intrasession and Intersession Learning in Content-based Image Retrieval |
Advisor: | 許秋婷 Chiou-Ting Hsu |
Oral defense committee: | |
Degree: | 博士 Doctor |
Department: | 電機資訊學院 - 資訊工程學系 Computer Science |
Publication year: | 2006 |
Academic year of graduation: | 95 |
Language: | English |
Number of pages: | 71 |
Chinese keywords: | 區域方法 、相關回饋 、檢索歷程 、檢索目標 、使用者意涵 、隱含語意空間 、內涵式影像檢索 |
Foreign-language keywords: | region-based approaches, relevance feedback, query session, target query, user conception, hidden semantic space, content-based image retrieval |
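Region-based approaches and relevance feedback are important topics in content-based image
retrieval. A region-based representation not only captures local information within an image
but also expresses the spatial adjacency relationships between regions, so compared with
methods based on the whole image it offers better image matching criteria and retrieval
functionality. Because the example images a user supplies often cannot fully express the
search intent, relevance feedback interacts with the user, asking them to rate the importance
of each image in the current results in order to update those results. Current relevance
feedback methods fall into two classes: in the first, the parameters learned during one query
session are not used in other sessions; in the second, parameters learned in past sessions are
reused so that good retrieval results are reached more quickly.
This study investigates how to use region-based representation within relevance feedback.
Within a single query session, the proposed method is rooted in a Bayesian framework and
includes a time-varying user model. The user model comprises two concepts, the "target query"
and the "user conception". The target query describes the search goal in the user's mind by
analyzing the common features of the feedback images; the user conception is a set of
parameters learned during relevance feedback and is used mainly to update the image matching
metric. Image matching is carried out within the region-based framework, taking the spatial
adjacency relationships between image regions into account to obtain better region
correspondences between images. In addition, we use a method that clusters the regions of the
feedback images, which allows the learning of the target query and the user conception to be
formulated mathematically entirely within the region-based framework.
We also discuss how to use the parameters learned in past query sessions to learn better. The
proposed retrieval method infers a "hidden semantic space" from past retrieval results within
the region-based framework. The main idea is that a region represents a hidden semantic
concept, so a single query in fact contains several semantic concepts. We initialize the
hidden semantic space from the results of past short-term retrievals, perform long-term
retrieval in this space, and keep adding newly learned semantic concepts to it. Because over
time the space tends to accumulate inconsistent, duplicated, or noisy information, we use a
dimension reduction technique, within the region-based framework, to reduce the hidden
semantic space to a more compact representation.
Experimental results show that, with the user model and region-based representation,
short-term retrieval achieves quite good retrieval performance, and that long-term retrieval
further improves the results by means of the hidden semantic space. The experiments also show
that the proposed dimension reduction technique does remove redundant hidden semantic concepts.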
Region-based approaches and relevance feedback are central issues in
content-based image retrieval (CBIR). A region-based representation combines local
information with its spatial organization and so provides a better image representation
and matching criterion. Relevance feedback allows users to rate retrieved images and
refine the retrieved results interactively. Current relevance feedback methods are mainly
divided into intrasession and intersession learning, depending on whether the
information learned in past query sessions is carried over to subsequent query
sessions.
This study addresses region-based issues for both intrasession and intersession
learning. The proposed intrasession learning technique is a generalized Bayesian framework
that incorporates a time-varying user model. The user model includes a target
query, which specifies the user’s ideal query, and a user conception, which adjusts the
time-varying matching criterion. We use spatial adjacency relationships to estimate the region
correspondence between images and thereby obtain a better image matching criterion. In
addition, we propose to update both the target query and the user conception at the region
level, based on the estimated region correspondence.
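The idea of refining a belief over candidate images within one query session can be illustrated with a minimal sketch. This is not the thesis's model: it is a generic Bayesian update in which each feedback round reweights the probability that an image is the user's target, and the function name `update_posterior`, the exponential likelihood, and the parameter `sigma` are all assumptions made for illustration.

```python
import math

def update_posterior(prior, distances, sigma=1.0):
    """One round of a generic Bayesian relevance feedback update.

    prior:     dict image_id -> probability the image is the target
    distances: dict image_id -> distance from that image to the images
               the user just marked relevant (smaller = more similar)
    sigma:     hypothetical scale controlling how sharply one round
               of feedback reshapes the belief
    """
    # Assumed likelihood: feedback is more probable for images that lie
    # close to the ones the user marked relevant.
    unnormalized = {
        image_id: prior[image_id] * math.exp(-distances[image_id] / sigma)
        for image_id in prior
    }
    total = sum(unnormalized.values())
    return {image_id: p / total for image_id, p in unnormalized.items()}

# Three images with a uniform prior; the feedback lies closest to "b".
prior = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
posterior = update_posterior(prior, {"a": 2.0, "b": 0.1, "c": 1.0})
ranking = sorted(posterior, key=posterior.get, reverse=True)
```

Feeding the posterior of one round back in as the prior of the next gives a belief that changes over the session, which is the sense in which such a model is time-varying.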
For intersession learning, we aim to infer the hidden semantic space at the region level
by accumulating the knowledge learned from previous query sessions. The main idea
is that a region in the query carries a hidden semantic concept, and hence a single query
session generates several concepts. We initialize the hidden semantic
space from a series of query sessions. With the constructed hidden semantic space,
we then perform retrieval and keep accumulating newly learned concepts into the hidden
semantic space. Since the hidden semantic space may contain inconsistent, overlapping,
or mislabeled concepts, we employ a dimension reduction technique at the region level to
construct a more compact space.
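A toy sketch can make the accumulate-then-compact idea concrete. It assumes that a concept is simply the set of region ids grouped together in one session and that redundancy is detected by Jaccard overlap; both are illustrative stand-ins, not the dimension reduction technique the thesis proposes, and `add_concept` and `threshold` are hypothetical names.

```python
def jaccard(a, b):
    """Overlap between two concepts, each a set of region ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def add_concept(space, concept, threshold=0.5):
    """Add a newly learned concept, merging it into a near-duplicate.

    space:     list of sets of region ids, one set per hidden concept
    concept:   set of region ids grouped together in the latest session
    threshold: hypothetical overlap at or above which concepts are merged
    """
    for i, existing in enumerate(space):
        if jaccard(existing, concept) >= threshold:
            # Merge instead of growing the space by one more concept.
            space[i] = existing | concept
            return space
    space.append(set(concept))
    return space

# Four sessions produce four concepts, two of which overlap earlier ones.
space = []
for learned in [{1, 2, 3}, {7, 8}, {2, 3, 4}, {8, 9, 7}]:
    add_concept(space, learned)
```

In this run the four learned concepts collapse to two, so the space stays compact as sessions accumulate.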
Experiments demonstrate that the proposed intrasession learning method, combined
with a time-varying user model and region-based representation, achieves satisfactory results.
The results also show that our intersession learning method, based on the inferred
hidden semantic space, further improves the retrieval accuracy, and that the proposed dimension
reduction technique removes redundant hidden semantic concepts effectively.