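Graduate student: 鄭玉堂
Yutang Cheng
Thesis title: 一個探勘頻繁閉項目集的混合方法
A Hybrid Method for Frequent Closed Itemsets Mining
Advisor: 許奮輝
Fenn-Huei Simon Sheu
Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Computer Science
Year of publication: 2005
Academic year of graduation: 93 (ROC calendar)
Language: Chinese
Number of pages: 1 volume
Chinese keywords: 頻繁閉項目集
Foreign-language keywords: frequent closed itemset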
  • Frequent itemset mining is a classic problem in data mining and has been studied extensively. Frequent closed itemsets are far fewer in number than frequent itemsets, yet they lose none of the information about the frequent itemsets, so frequent closed itemset mining has recently become an important topic. Previous approaches can be divided, by the data format they operate on, into two classes: horizontal format and vertical format. Data in the two formats can easily be converted into each other, and the methods built on each format have different characteristics. Whichever format is used, however, most current methods must check whether each answer is closed, and this extra check is overhead. We combine the horizontal-format and vertical-format approaches into a new method. One characteristic of this method is that it needs no separate check of whether an answer is closed. Moreover, on certain kinds of datasets it performs very well.
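
    The terms used in the abstract can be illustrated with a small sketch. It is not the hybrid method of the thesis, whose details are not given in this record. It is a brute-force illustration that assumes a hypothetical four-transaction database and hypothetical helper names (`to_vertical`, `to_horizontal`, `tidset`, `closure`, `frequent_itemsets`, `closed_only`, `support_from_closed`). It shows the conversion between horizontal and vertical format, and the explicit closedness check that the abstract describes as overhead in existing methods.

```python
from itertools import combinations

# Hypothetical toy database in horizontal format: transaction id -> items.
horizontal = {
    1: {"a", "b", "c"},
    2: {"a", "b", "c"},
    3: {"a", "b"},
    4: {"c"},
}


def to_vertical(db):
    """Horizontal format (tid -> items) to vertical format (item -> tids)."""
    vertical = {}
    for tid, items in db.items():
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def to_horizontal(vertical):
    """Vertical format (item -> tids) back to horizontal format."""
    db = {}
    for item, tids in vertical.items():
        for tid in tids:
            db.setdefault(tid, set()).add(item)
    return db


def tidset(itemset, vertical):
    """Ids of the transactions that contain every item of a non-empty itemset."""
    return set.intersection(*(vertical[item] for item in itemset))


def closure(itemset, db, vertical):
    """Items shared by all transactions containing itemset (support must be > 0)."""
    tids = tidset(itemset, vertical)
    return frozenset(set.intersection(*(db[tid] for tid in tids)))


def frequent_itemsets(db, min_support):
    """Brute force: enumerate every itemset and keep those meeting min_support."""
    vertical = to_vertical(db)
    items = sorted(vertical)
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            support = len(tidset(combo, vertical))
            if support >= min_support:
                result[frozenset(combo)] = support
    return result


def closed_only(frequent, db):
    """Explicit closedness check: keep itemsets equal to their own closure."""
    vertical = to_vertical(db)
    return {s: sup for s, sup in frequent.items()
            if closure(s, db, vertical) == s}


def support_from_closed(itemset, closed):
    """Recover a support as the largest support among closed supersets."""
    supports = [sup for c, sup in closed.items() if set(itemset) <= c]
    return max(supports) if supports else None  # None: not frequent


frequent = frequent_itemsets(horizontal, min_support=2)
closed = closed_only(frequent, horizontal)
```

    On this toy database with a minimum support of 2, seven itemsets are frequent but only three of them are closed, and `support_from_closed` recovers the support of each of the seven from those three. That is the sense in which closed itemsets are a smaller representation that loses no information.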

    Table of Contents
    List of Tables and Figures
    Chapter 1
    Chapter 2
      2.1 Problem Definition
      2.2 Related Work
        2.2.1 CHARM
        2.2.2 CLOSET+
    Chapter 3
      3.1 The Set S
      3.2 Querying VFD (Vertical Format Data)
      3.3 Maintaining Closedness
      3.4 Example
      3.5 Algorithm
    Chapter 4
    Chapter 5
    Appendix


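    Full-text release date: full text not authorized for public release (on-campus network)
    Full-text release date: full text not authorized for public release (off-campus network)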