Graduate student: | 楊庭瑄 Yang, Ting-Hsuan |
---|---|
Thesis title: | 運用文本探勘技術於交易投資策略:以LDA模型辨別主題 Applying Techniques of Text Mining on Trading Investment Strategy: an LDA Approach to Distinguish the Topics |
Advisor: | 張焯然 Chang, Jow-Ran |
Committee members: | 劉鋼 Liu, Kang; 蔡璧徽 Tsai, Pi-Hui |
Degree: | Master |
Department: | |
Year of publication: | 2017 |
Academic year of graduation: | 105 (ROC calendar) |
Language: | English |
Number of pages: | 61 |
Keywords (Chinese): | 文本探勘、LDA模型、聯準會會議記錄、S&P 500指數、交易策略 |
Keywords (English): | text mining, LDA model, minutes of FOMC, S&P 500, trading strategy |
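Sentiment analysis has been a widely discussed topic in text mining in recent years. Its applications are diverse: it can be used to detect network security threats, to predict presidential elections, and even to drive recommendation systems on shopping websites. This study applies sentiment analysis to trading strategy. It analyzes the sentiment of Federal Reserve meeting minutes to predict stock returns, and first uses the LDA (Latent Dirichlet Allocation) topic model to explore the latent topics in the documents. The aim is to identify the passages in Federal Reserve texts that are less related to economic and financial issues and, after deleting them, to capture investor sentiment toward the stock market more precisely. Based on these findings, the study formulates a profitable trading investment strategy.
This study is inspired by Tetlock (2007) and Tetlock, Saar-Tsechansky, and Macskassy (2008). It first uses the LDA model to identify words in the documents that are unrelated to economic and financial issues and deletes some of the paragraphs containing those words. It then generates suitable trading recommendations from the sentiment index constructed for each document. Finally, after examining the performance of this trading investment strategy, it makes appropriate adjustments to improve it.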
Sentiment analysis has been widely discussed in recent years and is used in many fields, for example network security detection, presidential election prediction, and recommendation systems on shopping websites. This thesis applies sentiment analysis to a trading investment strategy: it analyzes the sentiment of Federal Reserve documents to predict stock returns. The thesis also uses the latent Dirichlet allocation (LDA) topic model to investigate the latent topics in these documents, with the goal of identifying the topics that most influence stock returns. Finally, my research aims to formulate a profitable trading investment strategy based on these results.
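As a rough illustration of how a latent Dirichlet allocation topic model can be fitted, the sketch below runs a small collapsed Gibbs sampler in plain Python. It is not the thesis's implementation: the abstract does not say which inference method or software was used, and the toy corpus, the function name `fit_lda`, and the hyperparameters `alpha`, `beta`, and `n_iter` are all assumptions made for this example.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics=2, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenised documents (a sketch)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_vocab = len(vocab)
    # Count tables: document-topic, topic-word, and per-topic totals.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    # Start from a random topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + n_vocab * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed estimates of topic-word (phi) and document-topic (theta) shares.
    phi = [
        {w: (topic_word[k][w] + beta) / (topic_total[k] + n_vocab * beta)
         for w in vocab}
        for k in range(n_topics)
    ]
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha)
         for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    return phi, theta


# Hypothetical toy corpus: two "economic" and two "procedural" paragraphs.
docs = [
    "inflation rate growth inflation employment".split(),
    "growth employment rate inflation".split(),
    "secretary vote attendance staff vote".split(),
    "staff attendance secretary vote".split(),
]
phi, theta = fit_lda(docs, n_topics=2)
for k, dist in enumerate(phi):
    top = sorted(dist, key=dist.get, reverse=True)[:3]
    print(f"topic {k}: {top}")
```

Each entry of `phi` is a probability distribution over the vocabulary for one topic, and each row of `theta` gives the topic mix of one paragraph. Inspecting the highest-probability words per topic is one way to judge which topics look financial and which do not.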
The thesis is inspired by the work of Tetlock (2007) and Tetlock, Saar-Tsechansky, and Macskassy (2008). First, I use the latent Dirichlet allocation topic model to classify words by topic. Second, I eliminate the paragraphs that are irrelevant to finance in order to measure financial sentiment more accurately and apply it to the investment trading strategy. Finally, I add derivatives to the strategy to hedge losses from incorrect sentiment predictions, and then examine the strategy's performance after this modification.
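The filtering and sentiment steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's procedure: the word lists `OFF_TOPIC`, `POSITIVE`, and `NEGATIVE` are hypothetical placeholders (in the thesis the off-topic words come from the topic model, and a real study would use a published sentiment dictionary), and the thresholds and the long/short/hold rule are invented for the example. The derivatives hedge is not modeled.

```python
import re

# Hypothetical word lists, for illustration only.
OFF_TOPIC = {"staff", "attended", "secretary", "vote"}
POSITIVE = {"growth", "improved", "strong", "gains"}
NEGATIVE = {"decline", "weak", "risk", "losses"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keep_relevant(paragraphs, max_off_topic_share=0.2):
    """Drop paragraphs whose share of off-topic words exceeds the threshold."""
    kept = []
    for paragraph in paragraphs:
        tokens = tokenize(paragraph)
        if not tokens:
            continue
        share = sum(t in OFF_TOPIC for t in tokens) / len(tokens)
        if share <= max_off_topic_share:
            kept.append(paragraph)
    return kept


def sentiment_index(paragraphs):
    """Return (pos - neg) / (pos + neg), or 0.0 if no sentiment words occur."""
    tokens = [t for p in paragraphs for t in tokenize(p)]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / (pos + neg) if pos + neg else 0.0


def signal(index, threshold=0.2):
    """Map a sentiment index to a trading suggestion (assumed rule)."""
    if index > threshold:
        return "long"
    if index < -threshold:
        return "short"
    return "hold"


# Hypothetical paragraphs standing in for one document.
minutes = [
    "The secretary noted which staff attended and recorded the vote.",
    "Economic growth improved and labor markets remained strong.",
    "Some members saw a risk to exports.",
]
kept = keep_relevant(minutes)
index = sentiment_index(kept)
print(len(kept), index, signal(index))
```

In this toy input the first paragraph is procedural and is dropped, so the index is computed only from the two remaining paragraphs.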
References
Black, F., & Scholes, M. (1973). The pricing of options and corporate liabilities. Journal of Political Economy, 81(3), 637-654.
Cutler, D. M., Poterba, J. M., & Summers, L. H. (1988). What moves stock prices? NBER Working Paper (w2538).
Hu, M., & Liu, B. (2004). Mining and summarizing customer reviews. Paper presented at the Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining.
Huang, X., Teoh, S. H., & Zhang, Y. (2013). Tone management. The Accounting Review, 89(3), 1083-1113.
Lavrenko, V., Schmill, M., Lawrie, D., Ogilvie, P., Jensen, D., & Allan, J. (2000). Mining of concurrent text and time series. Paper presented at the KDD-2000 Workshop on Text Mining.
Loughran, T., & McDonald, B. (2009a). Plain English, readability, and 10-K filings. Retrieved from
Loughran, T., & McDonald, B. (2009b). When is a Liability not a Liability? Journal of Finance, forthcoming.
Loughran, T., & McDonald, B. (2011). When is a liability not a liability? Textual analysis, dictionaries, and 10‐Ks. The Journal of Finance, 66(1), 35-65.
Loughran, T., & McDonald, B. (2014a). Measuring readability in financial disclosures. The Journal of Finance, 69(4), 1643-1671.
Loughran, T., & McDonald, B. (2014b). Regulation and financial disclosure: The impact of plain English. Journal of Regulatory Economics, 45(1), 94-113.
Loughran, T., & McDonald, B. (2015). Textual analysis in accounting and finance: A survey. University of Notre Dame Working Paper.
Loughran, T., McDonald, B., & Yun, H. (2009). A wolf in sheep’s clothing: The use of ethics-related terms in 10-K reports. Journal of Business Ethics, 89(1), 39-49.
Tetlock, P. C. (2007). Giving content to investor sentiment: The role of media in the stock market. The Journal of Finance, 62(3), 1139-1168.
Tetlock, P. C., Saar-Tsechansky, M., & Macskassy, S. (2008). More than words: Quantifying language to measure firms' fundamentals. The Journal of Finance, 63(3), 1437-1467.
Lu, Y. Y. (2014). The Information Content of Risk Factor Disclosures in Annual Reports, National Taiwan University, Taipei City.
Lin, I. H. (2013). Creating and Verifying Sentiment Dictionary of Finance and Economics via Financial News, National Taiwan University, Taipei City.
Huang, C. C. (2012). Text mining of corporate annual report and its information content in predicting financial distress, Feng Chia University, Taichung City.
Hsieh, S. W. (2010). Using Text Mining Technique for Financial Statement Disclosures, National Chung Cheng University, Chiayi County.