
Graduate student: Su, Chao-Yu (蘇昭宇)
Thesis title: A hierarchical prosodic model of English using linguistic information: analysis for L2 accent of Taiwan English and applications for computer-assisted language learning
Original Chinese title: 基於語言學知識的階層式英語韻律模型: 台灣英語腔調分析與對電腦輔助英語教學的應用
Advisors: Chang, Jason-S (張俊盛); Jang, Jyh-Shing (張智星)
Oral defense committee: 陳宜欣; 冀泰石; 王逸如
Degree: Doctor
Department: College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications
Year of publication: 2018
Graduation academic year: 106 (ROC calendar)
Language: English
Number of pages: 94
Keywords (Chinese): 階層式韻律模型, 語言學, 台灣英語, 電腦輔助英語教學
Keywords (English): Hierarchical prosodic model, Linguistic information, Taiwan English, Computer-assisted language learning
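  • This thesis aims to deconstruct the Taiwan English accent and to investigate its sources in depth. Three types of speech data are analyzed: (1) L1 English from the United States, (2) L2 English from Taiwan, and (3) L1 Mandarin from Taiwan, that is, the mother tongue of the Taiwan L2 English speakers. The thesis also uses the results of the analysis to develop a prosody learning system that outputs prosody with an American English accent, which learners can use to correct their own prosody and so improve their accented spoken prosody. For the analysis, the thesis provides three types of prosodic features, each representing a different analytical perspective: (1) surface prosodic features that do not consider a hierarchical prosodic structure, (2) underlying features that consider only a basic hierarchical prosodic structure, and (3) underlying features that consider a more complete hierarchical prosodic structure, namely one that includes discourse and information structure. A comparison of the three perspectives shows that the prosodic features based on the complete hierarchical structure best explain the prosodic differences between L1 English and Taiwan L2 English. For the prosody learning system, the proposed method also predicts L1 English prosody better than previous methods, which indicates that the hierarchical prosodic model helps in developing a prosody correction system whose output is closer to natural native prosody. We believe this prosody learning system will be an effective application for computer-assisted English learning.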


    The present study attempts to pinpoint the distinct prosodic features of Taiwan L2 English and to account for the source of the L2 accent by analyzing and comparing (1) American L1 (native) English, (2) Taiwan L2 (non-native) English, and (3) Taiwan L1 Mandarin, the mother tongue of the Taiwan L2 speakers. The findings are used to develop a prosody training application that improves L2 speaking by simulating native-like expressive prosody as corrective feedback for L2 learners. The prosodic analyses and comparisons are conducted using three types of prosodic features: (1) surface prosodic features that do not consider interaction among linguistic levels, (2) underlying hierarchical features of prosody that consider interaction among basic linguistic levels, and (3) underlying hierarchical features of prosody that consider additional linguistic levels, namely discourse and information structure. The three types of features explain the Taiwan L2 accent of English from different analytical perspectives. The hierarchical prosodic model that considers discourse and information structure best accounts for the Taiwan L2 accent, revealing significant structural differences between L1 and L2 at each level of specification. In a training task that aims to improve L2 prosody by generating corrective feedback for learners, this model also predicts L1 prosody better than existing methods and shows great capacity for generating L1-like expressive continuous speech. We believe the proposed model can be an effective application for computer-assisted language learning (CALL).
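
    The thesis outline lists a hierarchical analysis based on the command response model, with accent and phrase commands (Aa/Ap) extracted from F0. As a rough illustration of that general model in its widely published form (Fujisaki's formulation), and not of the thesis's own implementation, the sketch below generates an F0 contour from a baseline frequency plus phrase and accent commands. The function names, the constants alpha, beta and gamma, and the command values are all illustrative assumptions.

```python
import math

def phrase_response(t, alpha=2.0):
    """Gp(t): impulse response of the phrase control mechanism (0 for t < 0)."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)

def accent_response(t, beta=20.0, gamma=0.9):
    """Ga(t): step response of the accent control mechanism, capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def f0_contour(times, fb, phrase_cmds, accent_cmds,
               alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 (Hz) at each time in `times`.

    ln F0(t) = ln Fb + sum_i Ap_i * Gp(t - T0_i)
                     + sum_j Aa_j * (Ga(t - T1_j) - Ga(t - T2_j))

    phrase_cmds: list of (Ap, T0) pairs   -- magnitude, onset time
    accent_cmds: list of (Aa, T1, T2)     -- amplitude, onset, offset
    """
    contour = []
    for t in times:
        log_f0 = math.log(fb)
        for ap, t0 in phrase_cmds:
            log_f0 += ap * phrase_response(t - t0, alpha)
        for aa, t1, t2 in accent_cmds:
            log_f0 += aa * (accent_response(t - t1, beta, gamma)
                            - accent_response(t - t2, beta, gamma))
        contour.append(math.exp(log_f0))
    return contour

# Hypothetical utterance: one phrase command and one accent command.
times = [i * 0.01 for i in range(300)]          # 3 s sampled every 10 ms
f0 = f0_contour(times, fb=100.0,
                phrase_cmds=[(0.5, 0.0)],       # Ap = 0.5 at t = 0 s
                accent_cmds=[(0.4, 0.3, 0.6)])  # Aa = 0.4 from 0.3 s to 0.6 s
```

    In this formulation the phrase component rises and decays slowly over the whole phrase while the accent component adds local peaks, which is why Ap and Aa can be compared separately across speaker groups.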

    Chapter 1: Introduction
    1.1 Related studies of L2 accent
    Chapter 2: Experimental dataset
    2.1 English
    2.1.1 Corpus design
    2.1.2 Annotation
    2.2 Mandarin
    2.2.1 Annotation
    Chapter 3: Prosodic analysis
    3.1 Analysis using surface prosodic features
    3.1.1 Feature extraction
    3.1.2 Classifiers
    3.1.3 Prosodic comparison between L1 English and L2 Mandarin
    3.1.4 Prosody comparison between L1 and L2 English
    3.2 Hierarchical analysis using command response model
    3.2.1 Extraction of Accent and phrase commands Aa/Ap (F0)
    3.3 Hierarchical analysis using discourse and information structure
    3.3.1 Linguistic variables
    3.3.2 Separation of prosodic features into levels of linguistic variables
    3.3.3 Predict L1 prosody using hierarchical linguistic variables
    3.3.4 Resynthesizing corrective feedback for L2 learners
    Chapter 4: Experimental results
    4.1 Analysis using surface prosodic features
    4.1.1 Prosody classification
    4.1.2 Prosodic comparison between L1 English and L2 Mandarin
    4.1.3 Prosody comparison between L1 and L2 English
    4.1.4 Discussion
    4.2 Hierarchical analysis using command response model
    4.2.1 Magnitude, Variety and Contrast of Ap
    Chapter 5: General Discussion
    Chapter 6: Conclusion
    Appendix
    Reference

