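Graduate student: |
張立典 Li-Tien Chang |
---|---|
Thesis title: |
以知識表徵為基之文件分群法 An Ontology-based Document Clustering Methodology |
Advisor: |
張瑞芬
Dr. Amy J.C. Trappey |
Oral examination committee: | |
Degree: |
Master |
Department: |
College of Engineering - Department of Industrial Engineering and Engineering Management |
Year of publication: | 2005 |
Academic year of graduation: | 93 (ROC academic year) |
Language: | English |
Number of pages: | 79 |
Chinese keywords: | 知識表徵 (ontology), 文件分群 (document clustering), 模糊推論 (fuzzy inference) |
English keywords: | Ontology, Fuzzy inference control, Document clustering |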
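This thesis proposes a methodology for analyzing and clustering knowledge documents. Many current methods for analyzing knowledge documents are built on keywords, but keywords are fragmentary and carry little meaning for either people or computers. We therefore propose an ontology-based method for analyzing knowledge documents, with the aim of letting computers understand the content of knowledge documents to some degree. The method consists of several major steps. First, experts must build the knowledge for a specific domain and input training data to train the system's vocabulary. Once training is complete, knowledge documents can be clustered. In the clustering procedure, a knowledge document first undergoes natural language processing; the trained vocabulary is then used to identify the ontology that represents the document; next, fuzzy inference is applied to these ontologies to infer relationship values between knowledge documents; finally, a hierarchical clustering method groups the documents. At the end of the study, we evaluate the effectiveness of the method and compare and discuss it against a keyword-based approach.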
The purpose of this thesis is to present a novel method for analyzing, synthesizing, and managing knowledge documents. In general, methodologies that synthesize and manage patents mostly use key phrases as indices of knowledge documents, but key phrases extracted from patents are meaningless to computers. A novel ontology-based methodology for analyzing and managing knowledge documents is therefore developed in this thesis. It enables computers to understand knowledge documents to some degree via ontology instead of key phrases. The methodology is divided into several steps. First, experts construct the domain-specific ontology schema and provide training data to train the system. Then a method for learning from natural language texts is adapted to infer the principal ontology of the knowledge documents. Next, fuzzy logic control (FLC) is used to infer the relationship between the knowledge documents and a suitable document cluster via ontology. Finally, we evaluate the effectiveness of this methodology and compare it with knowledge document clustering based on key phrases.
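The final stage of the pipeline described in the abstract, grouping documents by pairwise relationship values with a hierarchical method, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the toy concept sets, the function names, the merge threshold, the choice of average linkage, and above all the use of Jaccard overlap as a stand-in for the fuzzy-inferred relationship value.

```python
from itertools import combinations

def jaccard(a, b):
    """Stand-in relationship value in [0, 1] between two concept sets.
    Assumption: the thesis infers this value with fuzzy logic instead."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def average_link(c1, c2, sim):
    """Mean pairwise relationship value between two disjoint clusters."""
    total = sum(sim[frozenset((i, j))] for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(docs, threshold):
    """Bottom-up clustering: repeatedly merge the two most related
    clusters until the best available link falls below `threshold`."""
    ids = list(docs)
    sim = {frozenset((i, j)): jaccard(docs[i], docs[j])
           for i, j in combinations(ids, 2)}
    clusters = [[i] for i in ids]
    while len(clusters) > 1:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_link(clusters[p[0]], clusters[p[1]], sim),
        )
        if average_link(clusters[a], clusters[b], sim) < threshold:
            break
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]

# Hypothetical documents, each reduced to the ontology concepts found in it.
docs = {
    "d1": {"sensor", "signal", "circuit"},
    "d2": {"sensor", "circuit", "amplifier"},
    "d3": {"polymer", "coating"},
    "d4": {"polymer", "coating", "resin"},
}
clusters = agglomerate(docs, threshold=0.3)
print(sorted(clusters))
```

Swapping `jaccard` for a function that returns a fuzzy-inferred relationship value would leave the clustering loop unchanged, since the loop only reads the pairwise values stored in `sim`.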