Graduate student: 陳昱睿
Chen, Yu-Rui
Thesis title: 利用關聯規則探勘分析專利網路之專利引用關係
Analyzing Patent Citation Relations using Association Rule Mining
Advisor: 黃之浩
Huang, Scott Chih-Hao
Oral defense committee: 鍾偉和
Chung, Wei-Ho
黃啟祐
Huang, Chi-Yo
Degree: Master
Department: 電機資訊學院 - 通訊工程研究所
Communications Engineering
Year of publication: 2023
Academic year of graduation: 111
Language: Chinese
Number of pages: 70
Keywords (Chinese): 網路科學、關聯規則探勘、專利引用推薦、專利引用網路
Keywords (English): Network Science, Association Rule Mining, Patent Citation Recommendation, Patent Citation Network
    Association rule mining is a common technique in data science. Large datasets often contain hidden information that manual inspection alone cannot surface, and association rule mining algorithms can reveal unexpected combinations and patterns in such data. The technique is therefore widely applied across fields; the best-known example is "basket analysis" of customer purchases, which explores associations between items bought together. Association rule mining has, however, rarely been applied to network science. For networks composed of nodes and edges, we aim to use association rule mining to address problems that frequently arise in network science, and we hope to uncover unexpected findings along the way.
    Our research focuses on the patent citation network, in which we treat the citation relationships between patents like purchases of items by customers. By analyzing the resulting association rules, we expect to discover many rule combinations and, through them, to further validate the usefulness and effectiveness of association rule mining for patent networks and for network science more broadly. We first use the Apriori algorithm to identify frequent itemsets, then derive association rules from those itemsets, and finally evaluate the resulting rules from several perspectives.
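    The pipeline described above (frequent itemsets via Apriori, then rule derivation) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the patent IDs, the thresholds, and the function names `support`, `apriori`, and `association_rules` are all hypothetical, and each citing patent is assumed to act as a "basket" whose items are the patents it cites. Rules are scored with the standard support, confidence, and lift measures.

```python
from itertools import combinations

# Hypothetical toy data: each citing patent is a "transaction" whose
# "items" are the patents it cites (all IDs are made up for illustration).
citations = {
    "P1": {"A", "B", "C"},
    "P2": {"A", "B"},
    "P3": {"A", "C"},
    "P4": {"B", "C"},
    "P5": {"A", "B", "C"},
}

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def apriori(transactions, min_support):
    """Return {frozenset: support} for all frequent itemsets (level-wise search)."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while level:
        # Keep only the candidates that meet the support threshold.
        kept = {}
        for c in level:
            s = support(c, transactions)
            if s >= min_support:
                kept[c] = s
        frequent.update(kept)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in kept for b in kept if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in kept for s in combinations(c, k - 1))]
    return frequent

def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence, lift) tuples."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for ante in map(frozenset, combinations(itemset, r)):
                cons = itemset - ante
                conf = s / frequent[ante]      # P(consequent | antecedent)
                if conf >= min_confidence:
                    lift = conf / frequent[cons]
                    rules.append((ante, cons, s, conf, lift))
    return rules

transactions = list(citations.values())
frequent = apriori(transactions, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.7)
for ante, cons, s, conf, lift in sorted(rules, key=lambda r: sorted(r[0])):
    print(sorted(ante), "->", sorted(cons),
          f"support={s:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

    Read as a recommendation, a rule such as A -> B says that patents citing A tend to cite B as well, so B is a candidate next citation for a user who has just cited A.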
    The experimental results show that the association rules mined from the patent citation network are highly meaningful. The rules can serve as a recommendation system, which the experiments indicate is reliable and accurate. Unlike systems that recommend or search for patents based on keyword information, our approach recommends the next citable patent from the citations made so far: each time a user cites a patent, we can recommend another relevant or similar patent to cite or consult. We also found that transposing the dataset matrix before computing association rules yields rules that identify combinations of patents that are similar in structure or in content, so users can easily find another patent that is similar in some respect to a specified patent and use it as a reference.
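    The effect of transposing the dataset matrix can be illustrated with a small sketch. The orientation is an assumption: rows are taken to be citing patents and columns cited patents, so that after transposition each transaction is a cited patent and its items are the patents citing it. The matrix values, patent IDs, threshold, and the helper `pair_rules` are hypothetical, and only two-item rules are computed to keep the example short.

```python
from itertools import combinations

# Hypothetical citing -> cited matrix (rows: citing patents, columns: cited patents).
citing = ["P1", "P2", "P3", "P4"]
cited = ["A", "B", "C"]
matrix = [
    [1, 1, 0],  # P1 cites A, B
    [1, 1, 0],  # P2 cites A, B
    [0, 1, 1],  # P3 cites B, C
    [1, 0, 1],  # P4 cites A, C
]

# Transpose: each row now describes one cited patent, and its "items"
# are the patents that cite it.
transposed = [list(col) for col in zip(*matrix)]
transactions = [
    {citing[j] for j, flag in enumerate(row) if flag}
    for row in transposed
]

def pair_rules(transactions, min_support):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    single, pair = {}, {}
    for t in transactions:
        for item in t:
            single[item] = single.get(item, 0) + 1
        for a, b in combinations(sorted(t), 2):
            pair[(a, b)] = pair.get((a, b), 0) + 1
    rules = []
    for (a, b), count in pair.items():
        if count / n >= min_support:
            # Emit the rule in both directions with its own confidence.
            rules.append((a, b, count / n, count / single[a]))
            rules.append((b, a, count / n, count / single[b]))
    return rules

rules = pair_rules(transactions, min_support=0.6)
for a, b, s, conf in rules:
    print(f"{a} -> {b}: support={s:.2f} confidence={conf:.2f}")
```

    In this toy data only P1 and P2 survive the threshold, because they cite exactly the same patents; under the stated assumption, this is the sense in which rules on the transposed matrix pair up structurally similar patents.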

    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    1 Introduction
        1.1 Research Motivation
        1.2 Research Objectives and Methods
        1.3 Research Contributions
        1.4 Thesis Organization
    2 Related Work
    3 Preliminaries
        3.1 Network Science
            3.1.1 Definition
        3.2 Terminology
            3.2.1 Adjacency Matrix
            3.2.2 Directed Networks
            3.2.3 Patent Citation Network
        3.3 Triadic Closure
            3.3.1 Definition
            3.3.2 Illustrated Examples
            3.3.3 Problem Identification
    4 Association Rule Mining
        4.1 Definition
        4.2 Basket Analysis
            4.2.1 Basket Analysis
            4.2.2 Analysis Steps
        4.3 Apriori Algorithm
        4.4 Association Rules
            4.4.1 Support
            4.4.2 Confidence
            4.4.3 Lift
        4.5 Association Rules in Basket Analysis
        4.6 Adjacency Matrix and Dataset
    5 Experimental Hypotheses and Evaluation Methods
        5.1 Uniqueness of Association Rule Mining
            5.1.1 Centrality Evaluation Metrics
        5.2 Patent Citation Recommendation Method
            5.2.1 Accuracy of the Recommendation Method
        5.3 Similar Patent Search Method
            5.3.1 Patent Similarity
    6 Experimental Results and Analysis
        6.1 Preliminary Notes on the Experiments
            6.1.1 Dataset
            6.1.2 Parameter Settings
            6.1.3 Experimental Procedure
        6.2 Experimental Results
            6.2.1 Association Rule Results
            6.2.2 Association Rules and Centrality
            6.2.3 Precision, Recall, and F1-score Results
            6.2.4 Jaccard, Overlap, and Text Similarity Results
            6.2.5 Analysis of Experimental Results
    7 Conclusion and Future Work
        7.1 Conclusion
        7.2 Future Work
    References


