Author: 陳柏辰 (Chen, Bo Chen)
Thesis title: A Distributed Framework for Performing Tag-Based Collaborative Filtering on Recommender Systems (一種適用於協同過濾式推薦系統的分散式架構)
Advisor: 王家祥 (Wang, Jia Shung)
Committee members: 葉梅珍 (Yeh, Mei Chen); 陳弘軒 (Chen, Hung Hsuan); 王家祥 (Wang, Jia Shung)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2015
Academic year of graduation: 103 (ROC calendar)
Language: English
Number of pages: 45
Keywords (Chinese): 分散式儲存系統, 推薦系統, 資料分散方式, 錯誤容忍
Keywords (English): Distributed storage system, Recommender system, Data distribution, Fault tolerance
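  • User-generated information is becoming increasingly important as the Internet grows, and recommender systems that use user information to make recommendations are attracting more and more attention. Recommender systems can be broadly divided into two categories: content-based recommender systems and collaborative filtering recommender systems. To process large amounts of data, parallel processing is a necessary technique for recommender systems. However, when parallel processing is implemented, packet loss caused by data transmission is also an important issue that must be solved. If the data loss is too severe, the accuracy of the whole system is affected. A distributed storage system may be one solution. The parallel application of a recommender system can be implemented in a storage environment that is capable of computation. Beyond that, the need to recover data can also be met in this environment.
    In this thesis, we propose a distributed framework suitable for collaborative filtering recommender systems. In the proposed distributed system, the whole recommender system can process data in parallel. Our main contribution is a data distribution algorithm that improves the performance of the whole system. We use a simple but effective data distribution algorithm to reduce the bandwidth of data transmission and improve the accuracy of the whole system. In addition, when data is lost, it can be fully recovered. For the distributed storage system, we adopt NCCloud, which uses FMSR codes.
    In our experimental results, we compare the proposed data distribution algorithm with random assignment. We also compare the execution time of the proposed framework with that of single-machine execution in terms of speedup. Finally, we discuss the effect of whether lost data is recovered.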


    User-generated information is increasing rapidly with the growth of the Internet, and recommender systems that make use of users’ information are attracting more and more attention. Recommender systems can be broadly divided into content-based and collaborative filtering approaches. To deal with large-scale data, parallelism is increasingly necessary for recommender systems. However, when parallelism is implemented, data loss caused by data transmission also becomes an important issue. If data loss is too severe, the accuracy of the whole system may be greatly affected. A distributed storage system might be a solution to this problem. With a storage environment that can also perform computation, the parallel recommender system can be implemented on it, and the requirement of repairing lost data can also be satisfied.
    In this thesis, we propose a framework that combines a movie recommender system with a proxy-based distributed storage environment. In our proxy-based framework, the whole movie recommender system can process data in parallel. The main contribution of the framework is a clustering algorithm for distributing data. We use a simple but efficient method to scatter data so as to reduce the bandwidth of data transmission and improve the accuracy of the whole framework. In addition, when data is lost, it can be repaired completely. For distributed storage, we adopt NCCloud, which implements FMSR codes.
    In our experiments, we compare the proposed clustering algorithm with uniform distribution in terms of root mean square error (RMSE), bandwidth, and execution time. We also show the speedup of our proposed framework over a single node. Finally, we discuss how the results differ depending on whether lost data is repaired.
    Keywords: Distributed storage system, Recommender system, Data distribution, Fault tolerance
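
    The two quantitative measures named in the abstract, RMSE and speedup, can be illustrated with a minimal Python sketch. This is not the thesis's code. The function names `rmse` and `speedup`, the sample ratings, and the timings are hypothetical and chosen only for illustration. The sketch assumes the standard definitions: RMSE is the square root of the mean squared difference between predicted and actual ratings, and speedup is the single-node execution time divided by the framework's execution time.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def speedup(single_node_seconds, framework_seconds):
    """Ratio of single-node execution time to parallel execution time."""
    if framework_seconds <= 0:
        raise ValueError("framework_seconds must be positive")
    return single_node_seconds / framework_seconds


# Hypothetical ratings and timings, used only to exercise the functions.
predicted_ratings = [3.5, 4.0, 2.0, 5.0]
actual_ratings = [4.0, 4.0, 1.0, 5.0]

print(rmse(predicted_ratings, actual_ratings))
print(speedup(120.0, 40.0))
```

    A lower RMSE means the predicted ratings are closer to the actual ones, and a speedup greater than 1 means the distributed framework finished faster than the single node.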

    ABSTRACT
    Chinese Abstract (中文摘要)
    CONTENTS
    LIST OF FIGURES
    LIST OF TABLES
    Chapter 1. Introduction
    Chapter 2. Related Works
        2.1 Recommender System
            2.1.1 Recommender System Category
            2.1.2 Studies Using Information of Tags and Ratings
            2.1.3 Weighted Tag-Rating Recommender System
            2.1.4 Predict Function of Ratings
        2.2 Distributed Storage
            2.2.1 Regeneration Codes
            2.2.2 NCCloud
    Chapter 3. Proposed Framework and Methods
        3.1 Data Structure
        3.2 Distributed Recommender Model
        3.3 The Main Flowchart
            3.3.1 Main Recommender System Part
            3.3.2 Movie Ratings Prediction and Repair Part
        3.4 Data Distribution
    Chapter 4. Experimental Setup and Results
        4.1 Experimental Design and Setup
            4.1.1 Dataset
            4.1.2 Evaluation Metrics
            4.1.3 Experimental Environment
            4.1.4 Experimental Design
        4.2 Experimental Result and Discussion
            4.2.1 Data Distribution
            4.2.2 Data Loss Repair
    Chapter 5. Conclusion and Future Work
    REFERENCES

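    Full text release date: the full text is not authorized for public release (on-campus network)
    Full text release date: the full text is not authorized for public release (off-campus network)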