DRIVER: DNN-RNN Integration for Video Recommendation | National Tsing Hua University Theses and Dissertations Repository


Graduate student:	Chan, Chi-Shen (詹其侁)
Thesis title:	DRIVER: DNN-RNN Integration for Video Recommendation (用於影片推薦的DNN-RNN 集成系統)
Advisor:	Hon, Wing-Kai (韓永楷)
Committee members:	Lee, Che-Rung (李哲榮); Chen, Po-An (陳柏安)
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications
Year of publication:	2023
Academic year of graduation:	111
Language:	English
Pages:	35
Keywords (Chinese):	推薦系統、影片表示
Keywords (English):	recommendation system, video representation

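When we watch videos, we depend heavily on recommendation systems to help us find videos we like. Beyond the traditional content-based filtering and collaborative filtering approaches, there are also methods that go a step further and factorize the user rating matrix. Some of the most recent research uses neural networks to predict a user's click-through rate, watch time, and so on, and these methods perform well in recommendation systems. However, we observe that all of these methods rely on users' historical data for training, such as the user rating matrix or records of the other videos a user has watched, and such data is not always easy to obtain.

In this study, we aim to train without relying on the historical data of the user or of other users. To do so, we first assume, based on experience, that real-world users can be grouped into several different categories. This assumption rests on the observation that users have their own preferences: some users pay particular attention to sports events, some are especially fond of anime, and some like several categories at once. According to these preferences, users can be assigned to a particular category. Given this assumption, we propose a model that groups users using only video information, without needing data such as the user rating matrix.

In addition, because analyzing video is laborious, traditional recommendation-system datasets such as Netflix and MovieLens almost always contain only basic information about the videos and include no analysis of the video content itself. Such datasets are insufficient for this study, so we collected videos ourselves and built a new dataset that contains basic video information together with 100 $\times$ 1024-dimensional features extracted from the video content. We ran experiments on this dataset and found that our proposed classifier, which combines two neural networks with different characteristics, a DNN and an RNN, classifies users effectively without using data such as the user rating matrix as training data, and that the subsequent recommendations are similar to those provided by YouTube.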

When we watch videos, we rely heavily on recommendation systems to help us find the ones we like. In addition to traditional content-based filtering and collaborative filtering, one may further decompose the user rating matrix to achieve better results. State-of-the-art methods use neural networks to predict a user's click rate, watch time, and so on, and these predictions can in turn be used to build effective recommendation systems. Yet these methods need users' historical view records, or the rating matrix, for training, and such data may not be readily available.

In this study, we aim to avoid using such data during training. To do so, we assume that real-world users can be classified into several categories. This assumption rests on the observation that users have individual viewing preferences: some pay special attention to sports events, some are anime fanatics, and some like multiple categories of videos at once. Based on these preferences, users can be assigned to specific groups. Under this assumption, our goal is to obtain a model that groups users purely from video information, without data such as a rating matrix.

Since analyzing video content is very laborious, most traditional datasets for recommendation systems, such as the Netflix dataset and the MovieLens dataset, collect only basic information about the videos and contain no extracts or analyses of the video frames. Such datasets are far from sufficient for our study, so we collected videos ourselves and produced a new dataset that contains basic video information together with a 100 $\times$ 1024 feature matrix per video to represent the video contents. We used this dataset to train our proposed DNN-RNN integrated model (called DRIVER) for classifying users. The model successfully classifies users without using extra information from the users, and then offers recommendations similar to those of YouTube.
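
As an illustration of the general idea of pairing a feed-forward (DNN) branch with a recurrent (RNN) branch, the following is a minimal sketch and not the thesis's model: the abstract does not describe DRIVER's architecture. Only the 100 $\times$ 1024 content-feature shape is taken from the text. Treating the 100 rows as a sequence, the layer widths, the number of groups, the fusion by concatenation, and all function names are assumptions, and the weights are random and untrained.

```python
import math
import random

random.seed(0)

# Only the 100 x 1024 content-feature shape comes from the abstract;
# every other size below is an assumption made for illustration.
STEPS, FEAT = 100, 1024
INFO = 8      # assumed length of the basic-information vector
HIDDEN = 16   # assumed hidden width of both branches
GROUPS = 4    # assumed number of user groups


def rand_matrix(rows, cols, scale=0.05):
    """Random (untrained) weight matrix stored as a list of rows."""
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def affine(weights, vec):
    """Return weights @ vec for a row-major list-of-lists matrix."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def dnn_branch(info, w_info):
    """Feed-forward branch: one dense ReLU layer over basic video information."""
    return [max(0.0, z) for z in affine(w_info, info)]


def rnn_branch(frames, w_x, w_h):
    """Recurrent branch: plain tanh RNN over the rows, returning the last state."""
    h = [0.0] * HIDDEN
    for x in frames:
        h = [math.tanh(a + b) for a, b in zip(affine(w_x, x), affine(w_h, h))]
    return h


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def group_probabilities(info, frames, params):
    """Concatenate both branch outputs and map them to group probabilities."""
    joint = dnn_branch(info, params["w_info"]) + rnn_branch(
        frames, params["w_x"], params["w_h"]
    )
    return softmax(affine(params["w_out"], joint))


params = {
    "w_info": rand_matrix(HIDDEN, INFO),
    "w_x": rand_matrix(HIDDEN, FEAT),
    "w_h": rand_matrix(HIDDEN, HIDDEN),
    "w_out": rand_matrix(GROUPS, 2 * HIDDEN),
}
# Synthetic stand-ins for one video's basic information and content features.
info = [random.gauss(0.0, 1.0) for _ in range(INFO)]
frames = [[random.gauss(0.0, 1.0) for _ in range(FEAT)] for _ in range(STEPS)]

probs = group_probabilities(info, frames, params)
print(len(probs), round(sum(probs), 6))
```

The sketch runs on the standard library alone so that the data flow stays visible. A practical version would use a deep-learning framework and would train the weights against group labels.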

Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
INTRODUCTION
RELATED WORK
1 Video Embedding
2 The Wide and Deep Model
3 UGC Video Recommendation
4 Existing Video Dataset
Our Model
1 Model Architecture
2 Training with DRIVER
3 Application
Our Dataset
1 The YouTube Features Dataset
2 Collecting Videos from YouTube
3 Extracting Features from Video
EXPERIMENT
1 Video Embedding
2 Effectiveness of DNN and RNN in DRIVER
3 Accuracy
4 Simulation Experiment
5 Recommendation Result
CONCLUSION
Bibliography
                                

