| Graduate Student: | 張兆慶 Chang, Jao-Ching |
|---|---|
| Thesis Title: | Counterpoint Reader: Literary Text Simplification Using Generative AI |
| Advisor: | 張俊盛 Chang, Jyun-Sheng |
| Committee Members: | 杜海倫 Tu, Hai-Lun; 楊謦瑜 Yang, Ching-Yu; 蕭若綺 Hsiao, Jo-Chi |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
| Publication Year: | 2025 |
| Academic Year of Graduation: | 113 (ROC calendar) |
| Language: | English |
| Pages: | 49 |
| Keywords: | Text Simplification, Fine-tuning, Pre-trained Model, Large Language Model, Prompt Engineering |
| Hits: | Views: 150; Downloads: 4 |
Abstract:

We introduce a method for generating simplified texts that preserve the semantic and structural integrity of a given literary text. In our approach, literary texts are simplified sentence by sentence using instruction-tuned models, improving the ability to preserve the original meaning and structure. The method involves augmenting a dataset with various types of simplified examples generated through prompt engineering with a large language model (LLM), automatically training models on the augmented dataset, and automatically producing simplified outputs and feedback targeted at a specific CEFR level, such as B1 readability, along with corresponding explanations. At run time, input texts are transformed into a set of sentence-level queries labeled with the target CEFR level, and simplification is performed by the fine-tuned model. We present a prototype reading system, Counterpoint Reader, which applies the method to pre-trained models. Evaluation on a set of young-adult novels shows that the method significantly outperforms existing simplification tools and produces results comparable to those of large-parameter LLMs in literary simplification. Our method effectively simplifies sentences in literary texts, yielding a system that improves performance while preserving semantic fidelity and sentence-level structure.
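As a rough illustration of the run-time flow the abstract describes, the sketch below splits a passage into sentence-level queries labeled with a target CEFR level and hands each one to a fine-tuned instruction model. The prompt wording, the naive sentence splitter, and the function names (`to_queries`, `simplify`) are illustrative assumptions, not the thesis's actual implementation.

```python
# Minimal sketch of the run-time pipeline: passage -> sentence-level,
# CEFR-labeled queries -> fine-tuned model. All names and prompt text
# are hypothetical stand-ins for the system described in the abstract.
import re

TARGET_LEVEL = "B1"  # target CEFR readability level

def to_queries(text: str, level: str = TARGET_LEVEL) -> list[str]:
    """Turn a passage into sentence-level queries labeled with a CEFR level."""
    # Naive regex split on sentence-final punctuation; the actual system
    # presumably uses a proper sentence tokenizer.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [
        f"Simplify the following sentence to CEFR level {level}, "
        f"preserving its meaning and structure:\n{s}"
        for s in sentences if s
    ]

def simplify(query: str) -> str:
    """Placeholder for a call to the fine-tuned instruction model."""
    # For example, with Hugging Face transformers (model name hypothetical):
    #   from transformers import pipeline
    #   generator = pipeline("text-generation", model="my-finetuned-simplifier")
    #   return generator(query, max_new_tokens=128)[0]["generated_text"]
    raise NotImplementedError("wire up the fine-tuned model here")

if __name__ == "__main__":
    passage = "He trudged home through the rain. Nobody had noticed he was gone."
    for q in to_queries(passage):
        print(q)  # each printed query would be sent to the fine-tuned model
```

Per-sentence queries keep each model call short and let every output be checked against its source sentence, which fits the abstract's emphasis on preserving sentence-level structure.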