Mapping of Discussion Text to Knowledge Nodes of Knowledge Map | National Tsing Hua University Theses and Dissertations Repository

Graduate student:	Fang, Tung-Te (方同德)
Thesis title:	Mapping of Discussion Text to Knowledge Nodes of Knowledge Map (討論文本與知識地圖之知識節點對映方法)
Advisor:	Huang, Nen-Fu (黃能富)
Oral defense committee:	Chen, Jun-Liang (陳俊良); Sheu, Jang-Ping (許建平)
Degree:	Master
Department:	College of Electrical Engineering and Computer Science, Department of Computer Science
Year of publication:	2019
Academic year of graduation:	107 (ROC calendar)
Language:	English
Number of pages:	49
Keywords:	Deep Learning, Knowledge Map, LSTM, MOOCs, NLP, Text Classification

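Many MOOC learning platforms today provide course discussion forums, but almost
none of them classify and manage forum articles. The reasons are simple: first,
manual classification is time-consuming and labor-intensive; second, there is no
good basis for classification. Without well-managed classification, students
cannot efficiently find the discussion threads relevant to their own questions,
nor can they find systematic learning information among the large number of
articles. These problems greatly reduce students' willingness to use the
discussion forum and keep many good discussion articles from being seen by most
students.
This research therefore proposes a deep learning method for classifying
discussion forum articles. We use Google crawler results, exercises, and old
discussion forum articles as training data, and use the knowledge nodes of a
knowledge map as the classification basis. After NLP processing steps, the
processed data are used to train an LSTM model that predicts which knowledge
node a discussion forum article belongs to.
Finally, we combine the predicted results with the knowledge map so that
students can find articles of interest more efficiently. We believe that, with
this discussion forum article classification system, discussion forum articles
can be better managed and classified in the future.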

Many MOOC learning platforms now offer course discussion forums, but few of
them classify forum articles. The reasons are that manual classification is
time-consuming and labor-intensive, and that there is no good basis for
classification. Without good classification management, students cannot use
the discussion forum efficiently. These problems have greatly reduced students'
willingness to use the discussion forum, and many good discussion articles go
unseen by most students. This thesis therefore proposes a deep learning method
for categorizing discussion forum articles. We use Google crawler results,
exercises, and old discussion forum articles as training data, and use the
knowledge nodes of a knowledge map as the classification basis. After NLP
processing steps, the processed data are used to train an LSTM model that
predicts which knowledge node a discussion forum article belongs to. Finally,
we combine the predicted results with the knowledge map so that students can
find articles of interest more efficiently. We believe that this discussion
forum article classification system will allow discussion forum articles to
be better managed and classified in the future.
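
As a rough illustration of the "NLP steps" the abstract mentions, the sketch below removes stop words, balances the classes by random oversampling, and turns each post into a fixed-length list of integer ids that a sequence model could consume. It is a sketch under stated assumptions, not the thesis's implementation: the stop-word list, the sample posts, the node labels ("TCP", "IP"), and every function name are hypothetical, and plain whitespace splitting stands in for real word segmentation and translation.

```python
import random
from collections import Counter

# Hypothetical stop-word list; the thesis's actual list is not given in this record.
STOP_WORDS = {"the", "a", "is", "of", "to", "in", "and", "how", "does"}

def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real word segmentation)."""
    return [tok.strip(".,?!") for tok in text.lower().split()]

def remove_stop_words(tokens):
    """Drop empty tokens and tokens found in the stop-word list."""
    return [t for t in tokens if t and t not in STOP_WORDS]

def oversample(samples, seed=0):
    """Balance classes by randomly duplicating samples of the smaller classes."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append((text, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced

def build_vocab(token_lists):
    """Map each token to an integer id; 0 is reserved for padding, 1 for unknown."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    return {tok: i + 2 for i, (tok, _) in enumerate(counts.most_common())}

def encode(tokens, vocab, max_len):
    """Convert tokens to ids, truncating or zero-padding to max_len."""
    ids = [vocab.get(t, 1) for t in tokens][:max_len]
    return ids + [0] * (max_len - len(ids))

# Made-up forum posts, each labeled with a hypothetical knowledge node.
samples = [
    ("How does TCP congestion control work?", "TCP"),
    ("TCP three-way handshake question", "TCP"),
    ("What is the IP subnet mask?", "IP"),
]
balanced = oversample(samples)
token_lists = [remove_stop_words(tokenize(text)) for text, _ in balanced]
vocab = build_vocab(token_lists)
encoded = [encode(toks, vocab, max_len=6) for toks in token_lists]
labels = [label for _, label in balanced]
```

Oversampling is only one way to balance classes; undersampling the larger classes or weighting the loss are common alternatives, and this record does not say which the thesis uses.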

Abstract
Chinese Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Background and Related Works
  1 MOOCs platform
    1.1 Coursera
    1.2 edX
    1.3 ShareCourse
  2 Knowledge Map
    2.1 Khan Academy
    2.2 ShareCourse
  3 Deep Learning
    3.1 Deep Neural Network
    3.2 Long Short Term Memory network
  4 Discussion forum classification
    4.1 Natural Language Processing
    4.2 Text Classification
      4.2.1 Text Representation
      4.2.2 Classifier Construction
      4.2.3 Classifier Evaluation
    4.3 Different Discussion Forum Classification
Chapter 3 System Architecture
  1 Data Collection module
  2 Data Pre-processing module
  3 LSTM module
  4 User Interface
Chapter 4 System Implementation
  1 Data Collection module
    1.1 Google Crawler
    1.2 Exercise
    1.3 Discussion Forum Articles
  2 Data Pre-processing module
    2.1 Data Balance
    2.2 Word Segmentation
    2.3 Translation
    2.4 Remove Stop Words
  3 LSTM model
    3.1 Embedding Layer
    3.2 LSTM layer
    3.3 Dense layer
  4 User Interface
Chapter 5 Experiment and Result
  1 Experiment Data
  2 Accuracy
Chapter 6 Conclusion and Future Work
  1 Conclusions
  2 Future Work
Bibliography
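
The implementation chapter above lists an embedding layer, an LSTM layer, and a dense layer. A minimal forward pass through such a stack might look like the sketch below, which uses the standard LSTM cell with a forget gate and randomly initialized, untrained weights. The dimensions, the helper names, and the three output classes are illustrative assumptions; this record does not give the thesis's actual hyperparameters or framework code.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lstm_step(x, h, c, params):
    """One time step of a standard LSTM cell (input, forget, output gates)."""
    gates = {}
    for name in ("i", "f", "o", "g"):
        W, U, b = params[name]
        pre = [a + r + bias for a, r, bias in zip(matvec(W, x), matvec(U, h), b)]
        act = math.tanh if name == "g" else sigmoid  # "g" is the candidate state
        gates[name] = [act(p) for p in pre]
    # New cell state: keep part of the old state, add part of the candidate.
    c_new = [f * c_prev + i * g
             for f, c_prev, i, g in zip(gates["f"], c, gates["i"], gates["g"])]
    h_new = [o * math.tanh(cn) for o, cn in zip(gates["o"], c_new)]
    return h_new, c_new

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def init_model(vocab_size, embed_dim, hidden_dim, num_classes, seed=0):
    """Randomly initialized (untrained) embedding, LSTM, and dense parameters."""
    rng = random.Random(seed)
    return {
        "embedding": rand_matrix(vocab_size, embed_dim, rng),
        "lstm": {g: (rand_matrix(hidden_dim, embed_dim, rng),
                     rand_matrix(hidden_dim, hidden_dim, rng),
                     [0.0] * hidden_dim) for g in ("i", "f", "o", "g")},
        "dense_W": rand_matrix(num_classes, hidden_dim, rng),
        "dense_b": [0.0] * num_classes,
    }

def predict_proba(token_ids, model):
    """Embed each token id, run the LSTM over the sequence, apply dense + softmax.

    Padding ids are processed like any other token here (no masking).
    """
    hidden_dim = len(model["dense_W"][0])
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for tid in token_ids:
        x = model["embedding"][tid]
        h, c = lstm_step(x, h, c, model["lstm"])
    logits = [z + b for z, b in zip(matvec(model["dense_W"], h), model["dense_b"])]
    return softmax(logits)

# Hypothetical sizes: 20-word vocabulary, 3 knowledge-node classes.
model = init_model(vocab_size=20, embed_dim=4, hidden_dim=5, num_classes=3)
probs = predict_proba([2, 7, 3, 0, 0], model)
predicted_node = probs.index(max(probs))
```

Because the weights are untrained, the output probabilities carry no meaning; the point is only how a sequence of token ids flows through the three layers to a distribution over knowledge nodes. In practice one would train such a model with a deep learning framework rather than hand-written loops.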
                                

