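Graduate student: 盧昆玉
Kun-Yu Lu
Thesis title: 學術論文摘要中文步結構的自動分析
An Automatic Move Tagger for Academic Abstracts
Advisor: 張俊盛
Jason S. Chang
Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: English
Number of pages: 83
Keywords (Chinese): 學術論文寫作 (academic writing), 摘要 (abstract), 文步結構 (move structure), n-gram
Keywords (English): Academic Writing, abstract, move structure, n-gram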
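  • This thesis studies the automatic analysis of move structure in academic abstracts, with the aim of helping non-native speakers of English write such abstracts. Our method automatically labels every sentence of an abstract with a move, drawing on a variety of linguistic features to produce the move sequence.

    The method starts from a small set of abstracts and n-grams labeled with moves by hand. It then learns the relationships between moves and n-grams from a large collection of unlabeled abstracts, expanding the set of salient n-grams that can signal a move, and uses the labeled abstracts to train a Markov model of move sequences. The approach is evaluated on a small set of hand-labeled abstracts. Finally, we provide an automatic move tagger that labels abstracts with moves quickly.

    The results show that our automatic move tagger achieves reasonable precision on academic abstracts. A fast move-tagging tool gives non-native speakers of English an automated, practical aid that is expected to shorten the time needed to understand or write well-structured abstracts.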
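    The n-gram expansion step described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the majority-vote labeling rule, the `min_count` and `min_purity` thresholds, and the move names are all assumptions made for illustration. The idea shown is one bootstrapping round: sentences that contain hand-labeled seed bigrams are assigned a move, and new bigrams that co-occur consistently with one move are adopted as additional cues.

```python
from collections import Counter, defaultdict


def bigrams(sentence):
    """Lowercased, whitespace-tokenized word bigrams of a sentence."""
    tokens = sentence.lower().split()
    return list(zip(tokens, tokens[1:]))


def expand_bigrams(seed, sentences, min_count=2, min_purity=0.8):
    """One bootstrapping round (hypothetical thresholds).

    seed: dict mapping a bigram to its hand-assigned move.
    Returns the seed plus newly adopted bigram -> move labels.
    """
    cooc = defaultdict(Counter)  # bigram -> Counter of moves it co-occurs with
    for sent in sentences:
        grams = bigrams(sent)
        votes = Counter(seed[g] for g in grams if g in seed)
        if not votes:
            continue  # no seed bigram: the sentence stays unlabeled
        ranked = votes.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            continue  # tied vote: skip rather than guess
        move = ranked[0][0]
        for g in set(grams):
            cooc[g][move] += 1

    expanded = dict(seed)
    for g, counts in cooc.items():
        if g in seed:
            continue
        move, n = counts.most_common(1)[0]
        # Adopt a bigram only if it is frequent enough and mostly seen with one move.
        if n >= min_count and n / sum(counts.values()) >= min_purity:
            expanded[g] = move
    return expanded


# Toy data; the move names "Purpose" and "Result" are assumed labels.
seed = {("this", "paper"): "Purpose", ("results", "show"): "Result"}
sentences = [
    "This paper presents a new tagger",
    "This paper presents an approach",
    "The results show that accuracy improves",
    "Our results show that errors drop",
]
expanded = expand_bigrams(seed, sentences)
print(sorted(g for g in expanded if g not in seed))
```

    In practice such a round would be repeated, feeding the expanded set back in as the new seed, which is why the purity threshold matters: a loose threshold lets early mistakes propagate.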


    This paper presents a method for automatically labeling the move structures of academic abstracts to assist non-native speakers of English in writing academic abstracts.

    In our approach, sentences in a given abstract are automatically labeled with move sequences. The method involves annotating a small set of abstracts and n-grams with moves by hand, learning the relationships between moves and n-grams from a large set of unlabeled abstracts, and training a Hidden Markov Model of move sequences. We also implement and evaluate an automatic move tagger.

    The results show that our automatic move tagger performs with reasonably high precision, providing an automatic, practical, and fast tool that helps non-native speakers understand or write well-structured abstracts.
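    To make the Hidden Markov Model step concrete, here is a minimal Python sketch of Viterbi decoding over move states. It is an illustration under stated assumptions, not the thesis's model: the five move names, the add-one smoothing, and the emission score (the smoothed share of a sentence's labeled bigrams that carry a given move, used as a stand-in for a true emission probability) are all hypothetical choices.

```python
import math
from collections import Counter

# Assumed move inventory; the record does not list the thesis's move set.
MOVES = ["Background", "Purpose", "Method", "Result", "Conclusion"]


def train_transitions(sequences, moves=MOVES):
    """Add-one smoothed log P(next move | previous move); '<s>' marks the start."""
    counts = Counter()
    for seq in sequences:
        prev = "<s>"
        for move in seq:
            counts[(prev, move)] += 1
            prev = move
    logp = {}
    for prev in ["<s>"] + moves:
        total = sum(counts[(prev, m)] for m in moves) + len(moves)
        for m in moves:
            logp[(prev, m)] = math.log((counts[(prev, m)] + 1) / total)
    return logp


def emission_logp(sentence, move, bigram_moves, moves=MOVES):
    """Stand-in emission score: smoothed share of labeled bigrams voting for `move`."""
    tokens = sentence.lower().split()
    hits = Counter(bigram_moves[g] for g in zip(tokens, tokens[1:]) if g in bigram_moves)
    return math.log((hits[move] + 1) / (sum(hits.values()) + len(moves)))


def viterbi(sentences, trans, bigram_moves, moves=MOVES):
    """Return the highest-scoring move sequence, one move per sentence."""
    if not sentences:
        return []
    best = {m: trans[("<s>", m)] + emission_logp(sentences[0], m, bigram_moves, moves)
            for m in moves}
    back = []  # back[i][m] = best predecessor of move m at sentence i + 1
    for sent in sentences[1:]:
        new_best, pointers = {}, {}
        for m in moves:
            prev = max(moves, key=lambda p: best[p] + trans[(p, m)])
            new_best[m] = (best[prev] + trans[(prev, m)]
                           + emission_logp(sent, m, bigram_moves, moves))
            pointers[m] = prev
        best = new_best
        back.append(pointers)
    path = [max(moves, key=lambda m: best[m])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy training sequences and bigram cues (all illustrative).
trans = train_transitions([
    ["Background", "Purpose", "Method", "Result", "Conclusion"],
    ["Purpose", "Method", "Result"],
])
cues = {("this", "paper"): "Purpose", ("we", "use"): "Method", ("results", "show"): "Result"}
abstract = [
    "This paper presents a tagger",
    "We use a hidden markov model",
    "The results show high precision",
]
print(viterbi(abstract, trans, cues))
```

    Working in log space avoids numerical underflow on longer abstracts, and the transition model lets the decoder prefer plausible orderings even for sentences that contain no known bigram cue.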

    Abstract (Chinese)
    Abstract
    Acknowledgement
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1.1 Motivation
        1.2 Organization of the Thesis
    Chapter 2 Related Work
        2.1 Macrostructure of RAs
        2.2 Linguistic Features and Rhetorical Structure of Abstracts in RAs
        2.3 Computerized Corpus Analysis of Moves
    Chapter 3 Method
        3.1 Problem Statement
        3.2 Procedure of Learning One Move per Sentence
            3.2.1 Collect Abstracts for Training
            3.2.2 Manually Label of Seed Abstracts with Moves
            3.2.3 Extract Bigrams from U
            3.2.4 Label Bigrams Based on One-Move-per-Bigrams Constraint
            3.2.5 Automatically Extract Bigrams from Labeled Sentences in All Sections
            3.2.6 Automatically Expand Bigrams with Moves
            3.2.7 Build a Hidden Markov Model for Move Tagging
        3.3 Automatic Move-tagging at Run-Time
            3.3.1 Pre-processing
            3.3.2 Extract Bigrams from TA
            3.3.3 Use the Proposed HMM to Automatically Label TA
    Chapter 4 Experiments and Evaluation
        4.1 Experimental Setting
        4.2 Evaluation
            4.2.1 Evaluation Metrics
            4.2.2 Analysis of Performance
    Chapter 5 Discussion
        5.1 Comparison with MOVER
        5.2 Comparison with Su (2005)
        5.3 Analysis of Errors
    Chapter 6 Conclusion and Future Work
        6.1 Overview
        6.2 Future Work
    References
    Appendix A – Labeled Abstracts

    Anthony, L. & Lashkia, G. V. (2003). Mover: A machine learning tool to assist in the reading and writing of technical papers. IEEE Trans. Prof. Commun., 46, pp. 185-193.
    American National Standards Institute (1979). American national standard for writing abstracts. ANSI Z39.14-1979. New York: Author.
    Bhatia, V. K. (1993). Analysing genre: language in professional settings. Applied Linguistics and Language Studies Series. London & NY: Longman.
    Bloor, M. (1984). English Language Needs in the University of Cordoba: The Report of a Survey. Birmingham, UK: Univ. Aston Language Studies Unit (mimeo).
    Connor, U. (1996). Contrastive Rhetoric: Cross-Cultural Aspects of Second-Language Writing. Cambridge, U.K.: Cambridge Univ. Press.
    Cooper, C. (1985). Aspects of Article Introductions in IEEE Publications. Unpublished M.Sc. dissertation. Aston University, U.K.
    Crookes, G. (1986). Towards a validated analysis of scientific text structure. Applied Linguistics, 7, 57-70.
    Day, R. A. (1994). How to Write and Publish a Scientific Paper. Cambridge, U.K.: Cambridge Univ. Press.
    Dudley-Evans, T. (1994). Genre analysis: an approach to text analysis for ESP. Advances in Written Text Analysis (pp. 219-228). NY: Routledge.
    Granger, S. (1998). “Prefabricated Patterns in Advanced EFL Writing: Collocations and Formulae,” in A.P. Cowie (ed.), Phraseology: Theory, Analysis and Applications. Oxford: Clarendon Press.
    Hall, D., Hawkey, R., Kenny, B., & Storer, G. (1986). Patterns of thought in scientific writing: a course in information structuring for engineering students. English for Specific Purposes, 5, 147-160.
    Hill, S. S., Soppelsa, B. F. & West, G. K. (1982). Teaching ESL students to read and write experimental-research papers. TESOL Quarterly, 16(3), 333-347.
    Hinds, J. (1990). “Inductive, deductive, quasiinductive: Expository writing in Japanese, Korean, Chinese, and Thai,” in Coherence in Writing: Research and Pedagogical Perspectives, U. Connor and A. M. Johns, Eds. Alexandria, VA: TESOL.
    Hopkins, A. (1985). An Investigation into the Organizing and Organizational Features of Published Conference Papers. Unpublished Master’s thesis. University of Birmingham, U.K.
    Hopkins, A. & Dudley-Evans, A. (1988). A genre-based investigation of the discussion sections in articles and dissertations. English for Specific Purposes, 7, 113-122.
    Howarth, P. (1996). Phraseology in English Academic Writing. Tübingen: Max Niemeyer Verlag.
    Kuo, C.H. (2002). Phraseology in scientific research articles. The Eleventh International Symposium on English Teaching/ Fourth Pan–Asian Conference. pp. 405-411.
    Lau, H.H. (2004). The structure of academic journal abstracts written by Taiwanese PhD students. Taiwan Journal of TESOL, 1(1), 1-25. Taipei: Crane.
    Manning, C. and Schütze, H. (1999). Foundations of statistical natural language processing. Mathematical Foundations, Chapter 2. Cambridge, MA: MIT Press.
    Nakajima, T. and Tsukamoto, S. (1996). Chitekina Kagaku Gijutsu Bunshou no Kakikata [Writing Intelligent Scientific and Technical Texts]. Tokyo, Japan: Korona.
    Nattinger, J. R., & DeCarrico, J. S. (1992). Lexical Phrases and Language Teaching. Oxford: Oxford University Press.
    Salager-Meyer, F. S. (1990). Discourse flaws in medical English abstracts: A genre analysis per research- and text-type. Text, 10 (4), 365-384.
    Salager-Meyer, F. S. (1992). A text-type and move analysis study of verb tense and modality distribution in medical English abstracts. English for Specific Purposes, 11, 93-113.
    Salton, G. (1989). Automatic Text Processing: The transformation, analysis, and retrieval of information by computer. Addison-Wesley.
    Samraj, B. (2002). Introductions in research articles: variations across disciplines. English for Specific Purposes, 21 (1), 1-17.
    Samraj, B. (2005). An exploration of a genre set: Research article abstracts and introductions in two disciplines. English for Specific Purposes, 24(2), 141-156.
    Santos, M.B. (1996). The textual organization of research paper abstracts in applied linguistics. Text, 16(4), 481-499.
    Su, Y. L. (2005). Computational Analysis of Move Structures in Academic Abstract and Pedagogical Application, ISA NTHU.
    Swales, J.M. (1981). Aspects of article introductions. Birmingham, UK: The University of Aston, Language Studies Unit.
    Swales, J.M. (1990). Genre analysis: English in Academic and Research Settings. Cambridge University Press.
    Thompson, D. K. (1993). Arguing for experimental "facts" in science. Written Communication, 10, 106-128.
    Yang, R., & Allison, D. (2003). Research articles in applied linguistics: moving from results to conclusion. English for Specific Purposes, 22 (4), 365-385.
    Yarowsky, D. (1995). “Unsupervised Word Sense Disambiguation Rivaling Supervised Methods.” In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, pp. 189-196. Cambridge, MA.

    Full text: not authorized for public release (on-campus network)
    Full text: not authorized for public release (off-campus network)