Seeing the forest for the trees: tree-based uncertain frequent pattern mining

Date
2014-05, 2014-09, 2014-09, 2014-12, 2014-12
Authors
MacKinnon, Richard Kyle
Publisher
Springer International Publishing
Springer International Publishing
Elsevier
IEEE Computer Society Press
IEEE Computer Society Press
Abstract
Many frequent pattern mining algorithms operate on precise data, where each data point is an exact accounting of a phenomenon (e.g., I have exactly two sisters). Alas, treating data this way is a simplification of many real-world observations. Measurements, predictions, environmental factors, human error, etc. all introduce a degree of uncertainty. Tree-based frequent pattern mining algorithms such as FP-growth are particularly efficient because they keep a compact in-memory representation of the input database, but their extensions to uncertain data can require many more tree nodes. I propose two new algorithms, Tube-S and Tube-P, which mine frequent patterns from uncertain data using tightened upper bounds on expected support. Extensive experiments and analysis on datasets with different probability distributions show how tight my bounds are in different situations.
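To make the notion of expected support concrete, the following is a minimal sketch of the definition commonly used in uncertain frequent pattern mining: the expected support of a pattern is the sum, over all transactions, of the product of the existential probabilities of its items, assuming the items are independent. It is not Tube-S or Tube-P, and it computes the exact value rather than an upper bound. The function name `expected_support`, the dictionary-based transaction layout, and the sample probabilities are all hypothetical, chosen only for illustration.

```python
from math import prod


def expected_support(pattern, database):
    """Exact expected support of `pattern` in an uncertain database.

    Assumes item independence (an assumption of this sketch).
    Each transaction maps an item to its existential probability.
    """
    total = 0.0
    for transaction in database:
        # An item missing from a transaction has probability 0,
        # so that transaction contributes nothing to the sum.
        total += prod(transaction.get(item, 0.0) for item in pattern)
    return total


# Hypothetical uncertain database with three transactions.
database = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.6, "c": 0.8},
    {"a": 0.5, "b": 1.0, "c": 0.4},
]

support_ab = expected_support(("a", "b"), database)
min_expected_support = 0.9  # hypothetical user threshold
is_frequent = support_ab >= min_expected_support
```

A pattern counts as frequent when its expected support meets a user-chosen threshold. Tree-based miners avoid evaluating this sum for every candidate by storing a cheaper upper bound in the tree nodes and pruning with it, which is why the tightness of that bound matters.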
Keywords
Data mining, Databases
Citation
MacKinnon, R.K., Leung, C.K.-S., Tanbeer, S.K. (2014) A scalable data analytics algorithm for mining frequent patterns from uncertain data. In Proc. PAKDDW 2014: 404-416. Springer International Publishing.
Leung, C.K.-S., MacKinnon, R.K. (2014) BLIMP: a compact tree structure for uncertain frequent pattern mining. In Proc. DaWaK 2014: 115-123. Springer International Publishing.
Leung, C.K.-S., MacKinnon, R.K., Tanbeer, S.K. (2014) Tightening upper bounds to the expected support for uncertain frequent pattern mining. In Proc. KES 2014: 328-337. Elsevier.
MacKinnon, R.K., Strauss, T.D., Leung, C.K.-S. (2014) DISC: efficient uncertain frequent pattern mining with tightened upper bounds. In Proc. ICDMW 2014: 1038-1045. IEEE Computer Society Press.
Leung, C.K.-S., MacKinnon, R.K., Tanbeer, S.K. (2014) Fast algorithms for frequent itemset mining from uncertain data. In Proc. ICDM 2014: 893-898. IEEE Computer Society Press.