Graduate student: | 朱瑞琪 Zhu-Chi Chu |
---|---|
Thesis title: | The Summarization of Chinese News Articles by Temporal or Themed Sequences 摘要中文新聞之報導-以時間或主題排序 |
Advisor: | 林福仁 Fu-ran Lin |
Oral examination committee: | |
Degree: | Master |
Department: | College of Technology Management - Institute of Technology Management |
Year of publication: | 2008 |
Academic year of graduation: | 96 (ROC calendar, 2007-2008) |
Language: | English |
Number of pages: | 82 |
Keywords: | text summarization, intra-paragraph, inter-paragraph, temporal, themed, news topic summarization |
Most summarization systems can extract important sentences, but few address readability. This thesis proposes a summarization system that considers sentence coherence and orders sentences according to news features, to help readers comprehend news topics.
The proposed summarization system has three major components. First, the event clustering module identifies events using a Self-Organizing Map (SOM) and then identifies the episodes within each event using Chameleon. Second, the intra-paragraph sequencing module extracts the features of every event in a news topic and selects a composition strategy (temporal, themed, or hybrid) to compose the sentences of an event into a paragraph. Third, the inter-paragraph sequencing module calculates the topic's temporal dependence and, based on that feature, orders the paragraphs either temporally or by theme.
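The intra-paragraph step described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not define the event features or the selection rule, so the `date_spread` feature, the `low`/`high` thresholds, the `theme` labels, and all function names below are hypothetical stand-ins chosen only to show the shape of a "pick a strategy, then order the sentences" step.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sentence:
    text: str
    published: date  # publication date of the source article
    theme: str       # hypothetical theme label, e.g. an episode id


def date_spread(sentences):
    """Hypothetical event feature: share of distinct dates among the sentences."""
    if not sentences:
        return 0.0
    return len({s.published for s in sentences}) / len(sentences)


def choose_strategy(sentences, low=0.34, high=0.67):
    """Pick a composition strategy; the thresholds are illustrative assumptions."""
    spread = date_spread(sentences)
    if spread >= high:
        return "temporal"
    if spread <= low:
        return "themed"
    return "hybrid"


def order_sentences(sentences, strategy):
    """Order one event's sentences into a paragraph."""
    if strategy == "temporal":
        return sorted(sentences, key=lambda s: s.published)
    if strategy == "themed":
        # sorted() is stable, so input order is kept inside each theme.
        return sorted(sentences, key=lambda s: s.theme)
    if strategy == "hybrid":
        # Themes are ordered by their earliest date; dates are ordered within a theme.
        first_seen = {}
        for s in sentences:
            first_seen[s.theme] = min(first_seen.get(s.theme, s.published), s.published)
        return sorted(sentences, key=lambda s: (first_seen[s.theme], s.theme, s.published))
    raise ValueError(f"unknown strategy: {strategy}")
```

A caller would run `order_sentences(event, choose_strategy(event))` once per event. The same pattern could be applied at the paragraph level, with a topic-level temporal dependence score in place of `date_spread`; that extension is likewise an assumption rather than something the abstract specifies.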
Experimental results show that different users may prefer summaries produced by different composition methods. This indicates a need for a mechanism that can order sentences by different methods and choose a suitable one (temporal sequence, themed sequence, or both) according to each event's features.
Aone, C., M. E. Okurowski, et al. (1997). "A Scalable Summarization System Using Robust NLP." In Proceedings of the Workshop on Intelligent Scalable Text Summarization at the 35th Meeting of the Association for Computational Linguistics and the 8th Conference of the European Chapter of the Association for Computational Linguistics (pp. 66-73).
Bollegala, D., N. Okazaki, and M. Ishizuka (2006). "A bottom-up approach to sentence ordering for multi-document summarization." Proceedings of COLING/ACL.
Chandrasekaran, B., J.R.Josephson, and V.R. Benjamins. (1999). "What Are Ontologies, and Why Do We Need Them?" IEEE Intelligent Systems 14(1): 20-26.
Chen, H. H., et al. (2003). "A summarization system for Chinese news from multiple sources." Journal of the American Society for Information Science and Technology 54(13): 1224-1236.
Goldstein, J., V. Mittal, et al. (2000). "Multi-document summarization by sentence extraction." NAACL-ANLP 2000 Workshop on Automatic summarization - Volume 4.
Gong, Y. and X. Liu (2001). "Generic text summarization using relevance measure and latent semantic analysis." Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval.
Gruber, T. (1992). "Ontology Definition." from http://www-ksl.Stanford.edu/kst/what-is-an-ontology.html.
Guha, S., R. Rastogi, et al. (2000). "Rock: A robust clustering algorithm for categorical attributes." Information Systems 25(5): 345-366.
Guha, S., R. Rastogi, et al. (2001). "Cure: an efficient clustering algorithm for large databases." Information Systems 26(1): 35-58.
Han, J. and M. Kamber (2006). Data Mining: Concepts and Techniques, Morgan Kaufmann.
Harabagiu, S. and F. Lacatusu (2005). "Topic themes for multi-document summarization." Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval: 202-209.
Hsueh, J. F. (2003). "Learning ontology from Web documents for supporting Web query."
Karypis, G., E. H. Han, V. Kumar (1999). "Chameleon: A hierarchical clustering algorithm using dynamic modeling." IEEE Computer 32(8): 68-75.
Kohonen, T. (2001). Self-Organizing Maps, Springer.
Kuo, J.-J. and H.-H. Chen (2008). "Multidocument Summary Generation: Using Informative and Event Words." ACM Transactions on Asian Language Information Processing (TALIP) 7(1): 1-23.
Lin, F. and C. H. Liang (2008). "Storyline-based summarization for news topic retrospection." Decision Support Systems 45(3): 473-490.
McKeown, K., R. J. Passonneau, et al. (2005). "Do summaries help?" Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval.
Mihalcea, R. and P. Tarau (2005). "A Language Independent Algorithm for Single and Multiple Document Summarization." In Proceedings of IJCNLP2005.
Okazaki, N., Y. Matsuo, and M. Ishizuka (2004). "Improving chronological sentence ordering by precedence relation." Proceedings of 20th International Conference on Computational Linguistics (COLING 04): 750–756.
Radev, D. R., H. Jing, et al. (2004). "Centroid-based summarization of multiple documents." Information Processing and Management 40(6).
Radev, D. R., Z. Zhang, J. Otterbacher (2004). "Cross-document relationship classification for text summarization." Association for Computational Linguistics.
Sahay, S. Study and Implementation of CHAMELEON algorithm for Gene Clustering, www-static.cc.gatech.edu/~ssahay/7001Report.pdf.
Salton, G. and C. Buckley (1988). "Term-weighting approaches in automatic text retrieval." Information Processing and Management: an International Journal 24(5): 513-523.
Salton, G. and M. J. McGill (1983). Introduction to Modern Information Retrieval, McGraw-Hill.
Tan, P. N., M. Steinbach, V. Kumar (2005). Introduction to Data Mining, Addison-Wesley Longman Publishing Co., Inc. Boston, MA, USA.
Van Rijsbergen, C. J. (1979). Information Retrieval, Butterworth-Heinemann Newton, MA, USA.
Wan, X. and J. Yang (2007). "CollabSum: exploiting multiple document clustering for collaborative single document summarizations." Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval.
Wang, G. B. D. and Z. D. Y. Zhu (2005). "Automatic Chinese Summarization Method Based on the HowNet and Clustering Algorithm." Journal of Chinese Information Processing.
Wang, G. B. D. and Z. D. Y. Zhu (2005). "Automatic Chinese Text Summarization System Based on Conceptual Vector Space Model." Journal of Chinese Information Processing.
Wu (吳家威), G. J. W. and J. L. Liou (劉昭麟) (2002). "An Ontology-Based Article Summarization System." Proceedings of the 2002 Workshop on Consumer Electronics (民生電子研討會論文集): 41-46.