
Graduate student: 杜建志 (Tu, Chien-Chih)
Thesis title: Using Biological pathway ontology for information filtering and knowledge extraction – A case study on Apoptosis
Advisor: 蘇豐文 (Soo, Von-Wun)
Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications
Year of publication: 2009
Academic year of graduation: 97 (ROC calendar; 2008-2009)
Language: Chinese
Number of pages: 60
Chinese keywords: 生物資訊, 智慧型代理人, 本體論, 資訊擷取, 程序性細胞死亡的生化調控路徑, 文件探勘
English keywords: Bioinformatics, Intelligent Agent, Ontology, Information Extraction, Apoptosis Pathway, Text Mining
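  • With the Human Genome Project nearing completion, the volume of gene and sequence data and of related biomedical literature has grown rapidly. Biomedical scholars and researchers also publish their results electronically on public platforms via the Internet. Although this gives them a convenient channel for quickly accessing and sharing biomedical data, it also raises problems of information overload, information integration, and knowledge extraction. How to automate biological sequence analysis efficiently and extract knowledge from the vast body of biological data and information has therefore become an emerging research topic. The Kyoto Encyclopedia of Genes and Genomes (KEGG) provides information and diagrams covering nucleic acid molecules, protein sequences, gene expression, genome maps, and biological pathways. The PubMed biomedical literature database of the U.S. National Center for Biotechnology Information (NCBI) offers a vast collection of biomedical literature, which holds undiscovered knowledge such as the interactions between biological molecules. This thesis uses intelligent agents, Web Services, and ontology technologies to help biomedical scholars and researchers filter, integrate, and analyze information from this literature and extract the relevant knowledge. Presenting biological pathway knowledge requires further steps of information filtering, integration, analysis, and extraction. Domain experts integrate public biomedical vocabulary resources such as WordNet, MeSH, and GO into an expert biological vocabulary knowledge base that supports automatic semantic annotation. Combined with pattern matching and sentence parsing techniques, this makes it feasible to use ontology-based inference to extract correct and relevant knowledge from biomedical literature. Taking the apoptosis pathway as a case study, the thesis evaluates and discusses the system's experimental results and the problems encountered. The system can also be extended to automatically extract protein-protein and gene-gene interactions from biomedical literature and, from these, to infer unknown gene networks.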


    Automated biological sequence analysis has produced an enormous amount of biological data and knowledge, which makes it challenging for biologists to process, analyze, and interpret. Many research results on genomic sequences are now available in electronic form via the Internet. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database provides useful information about biological pathways, and PubMed at NCBI holds a very large body of biological literature containing potentially relevant information for interpreting molecular interactions. We developed intelligent agent technology that integrates these databases using web service techniques, helping biologists filter information and extract knowledge directly from large-scale biological literature. Including biological pathway knowledge is a promising extension for information filtering and extraction. The research integrated the sharable thesauri WordNet and MeSH (Medical Subject Headings) to support automatic semantic annotation. Pattern matching and sentence parsing techniques facilitate ontology inference to extract correct knowledge from abstracts. We evaluated the system on problems in the apoptosis pathway domain. It will be extended to the automated extraction of gene-gene and protein-protein interaction information from biological literature and to the development of inference methods for understanding gene networks.
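
The pattern-matching step mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the entity list, the verb lexicon, and the function names (`split_sentences`, `extract_relations`) are hypothetical, and a single "entity verb entity" regular expression stands in for the fuller pattern matching and sentence parsing the thesis describes.

```python
import re

# Hypothetical lexicons for illustration only. The thesis builds its
# vocabulary from resources such as WordNet, MeSH, and GO.
ENTITIES = ["caspase-3", "caspase-8", "Bcl-2", "Bax", "p53"]
VERB_LEMMAS = {
    "activates": "activate",
    "inhibits": "inhibit",
    "cleaves": "cleave",
    "binds": "bind",
}

# Map lower-cased surface forms back to the canonical entity name.
_CANONICAL = {name.lower(): name for name in ENTITIES}


def _alternation(terms):
    """Build a regex alternation, longest terms first, with escaping."""
    return "|".join(re.escape(t) for t in sorted(terms, key=len, reverse=True))


_ENTITY = _alternation(ENTITIES)
_VERB = _alternation(VERB_LEMMAS)

# One simple pattern: <entity> <verb> <entity>, bounded so that a longer
# token containing an entity name is not matched by accident.
RELATION = re.compile(
    rf"(?<![\w-])(?P<subj>{_ENTITY})\s+(?P<verb>{_VERB})\s+(?P<obj>{_ENTITY})(?![\w-])",
    re.IGNORECASE,
)


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_relations(text):
    """Return (subject, verb lemma, object) triples found in the text."""
    triples = []
    for sentence in split_sentences(text):
        for m in RELATION.finditer(sentence):
            triples.append((
                _CANONICAL[m["subj"].lower()],
                VERB_LEMMAS[m["verb"].lower()],
                _CANONICAL[m["obj"].lower()],
            ))
    return triples


abstract = (
    "Caspase-8 activates caspase-3. "
    "Bcl-2 inhibits Bax in many cell types. "
    "Apoptosis is a form of programmed cell death."
)
print(extract_relations(abstract))
# [('caspase-8', 'activate', 'caspase-3'), ('Bcl-2', 'inhibit', 'Bax')]
```

A fixed pattern like this misses passive constructions and relations that span clauses, which is one reason a real system would pair it with sentence parsing and a curated vocabulary.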

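    Chapter 1 Introduction
      1.1 Research Background
      1.2 Research Motivation and Objectives
      1.3 Research Methods and Scope
      1.4 Thesis Organization
    Chapter 2 Literature Review
      2.1 Investigating and Inferring Biological Pathways with Bioinformatics
      2.2 Online Bioinformatics Database Resources (Overview of the Target Databases)
      2.3 Problems in Querying Bioinformatics Resources
      2.4 Introduction to Information Extraction
      2.5 Definition of Web Services
      2.6 Definition of Ontology
    Chapter 3 System Architecture
      3.1 System Workflow
      3.2 KEGG Web Service Module
      3.3 Multiple Database Search Module
      3.4 Information Extraction Module
      3.5 Apoptosis Pathway Extraction Module
    Chapter 4 Research Results
      4.1 Overview of System Construction
      4.2 System Implementation and Interface
        4.2.1 Sources of Biomedical Literature
        4.2.2 Query Syntax for Biomedical Literature
        4.2.3 Format of Biomedical Literature
        4.2.4 Filtering Biomedical Literature
        4.2.5 Sentence Filtering in Biomedical Literature
        4.2.6 Sentence Relation Extraction in Biomedical Literature
      4.3 Experimental Design
        4.3.1 Retrieving Relevant Documents (abstract)
        4.3.2 Retrieving Relevant Sentences (sentence)
        4.3.3 Extracting Relations from Sentences (relation)
    Chapter 5 Conclusions and Future Work
      5.1 Conclusions
      5.2 Future Work
      5.3 References
    Appendix 1 Biomedical Vocabulary Database
    Appendix 2 Verb Lexicon for the Biomedical Domain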


    Full text: not authorized for public release (campus network, off-campus network, or the National Central Library's Taiwan theses and dissertations system).