An Algorithm for Mining Implicit Itemset Pairs Based on Differences of Correlations

Taniguchi, Tsuyoshi; Haraguchi, Makoto

doi:10.1007/11563983_20




Please use this identifier to cite or link to this item: http://hdl.handle.net/2115/5590

Title:	An Algorithm for Mining Implicit Itemset Pairs Based on Differences of Correlations
Authors:	Taniguchi, Tsuyoshi; Haraguchi, Makoto
Authors(alt):	谷口, 剛 (Taniguchi, Tsuyoshi)
Issue Date:	8-Oct-2005
Publisher:	Springer Berlin
Journal Title:	Discovery Science
Volume:	3735
Start Page:	227
End Page:	240
Publisher DOI:	10.1007/11563983_20
Abstract:	Given a transaction database as a global set of transactions and a local database obtained from it by some conditioning, we consider pairs of itemsets whose degree of correlation is higher in the local database than in the global one. The problem of finding paired itemsets with high correlation in a single database is known as Discovery of Correlation, and several algorithms that search for such characteristic paired itemsets have already been proposed. However, even paired itemsets that are not characteristic in the local database are meaningful, provided their degree of correlation is much higher in the local database than in the global one. They can serve as implicit, hidden evidence that something particular to the local database is occurring, even though they have not yet emerged as characteristic pairs there. From this viewpoint, we have previously proposed measuring the significance of paired itemsets by the difference between the two correlations before and after conditioning to the local database, and defined the notion of DC pairs, whose differences of correlations are high. Since a DC pair can be regarded as a compound itemset consisting of two component itemsets, there are two basic strategies for finding them. One strategy examines the compound itemsets first and then the components, while the other examines the component itemsets first and then the compound ones. Under the former strategy, which we have already proposed and tested for effectiveness, we have to enumerate a large number of candidate compound itemsets that cannot be decomposed into components. For this reason, this paper presents a new algorithm that follows the second strategy. First, it enumerates possible component itemsets using a new pruning rule that cuts off useless components. Second, it forms compound itemsets by combining the components thus detected, using a constraint that prevents the algorithm from checking meaningless combinations.
Rights:	The original publication is available at www.springerlink.com
Type:	article (author version)
URI:	http://hdl.handle.net/2115/5590
Appears in Collections:	Graduate School of Information Science and Technology / Faculty of Information Science and Technology > Peer-reviewed Journal Articles, etc
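
The notion of a DC pair described in the abstract can be illustrated with a small brute-force sketch. This is not the authors' algorithm: the abstract does not state which correlation measure is used, so the lift-style ratio P(X ∪ Y) / (P(X) P(Y)) below is an assumption, as are the function names `support`, `correlation`, and `dc_pairs` and the threshold parameter `min_diff`. The paper's pruning rule and combination constraint are not reproduced; the sketch only follows the component-first order by taking a list of candidate components and combining them pairwise.

```python
from itertools import combinations


def support(db, itemset):
    """Fraction of transactions in db that contain every item of itemset."""
    if not db:
        return 0.0
    return sum(1 for t in db if itemset <= t) / len(db)


def correlation(db, x, y):
    """Lift-style correlation P(X u Y) / (P(X) * P(Y)).

    Assumed measure: the abstract does not name the one used in the paper.
    """
    px, py = support(db, x), support(db, y)
    if px == 0 or py == 0:
        return 0.0
    return support(db, x | y) / (px * py)


def dc_pairs(global_db, condition, components, min_diff):
    """Return (x, y, diff) for component pairs whose correlation rises by
    at least min_diff when moving from the global to the local database.

    The local database is taken to be the transactions containing every
    item of `condition` (one possible form of conditioning).
    """
    local_db = [t for t in global_db if condition <= t]
    found = []
    for x, y in combinations(components, 2):
        if x & y:
            continue  # skip overlapping components
        diff = correlation(local_db, x, y) - correlation(global_db, x, y)
        if diff >= min_diff:
            found.append((x, y, diff))
    return found


# Toy data (hypothetical): "a" and "b" are uncorrelated globally,
# but co-occur more often than expected among transactions containing "c".
transactions = [
    {"a", "b", "c"}, {"a", "b", "c"}, {"a"}, {"b"},
    {"a"}, {"b"}, {"c"}, {"c"},
]
candidates = [frozenset({"a"}), frozenset({"b"})]

for x, y, diff in dc_pairs(transactions, frozenset({"c"}), candidates, 0.5):
    print(sorted(x), sorted(y), diff)  # prints: ['a'] ['b'] 1.0
```

In this toy data the pair ({a}, {b}) has correlation 1.0 globally and 2.0 locally, so it is reported even though neither value need be high enough to count as characteristic under an absolute correlation threshold.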

Submitter: Taniguchi, Tsuyoshi (谷口剛)
