Deakin University
Automatic pattern-taxonomy extraction for web mining

conference contribution
posted on 2004-01-01, 00:00 authored by S T Wu, Yuefeng Li, Y Xu, B Pham, Yi-Ping Phoebe Chen
In this paper, we propose a model for discovering frequent sequential patterns (phrases) that can be used as profile descriptors of documents. Data mining algorithms can readily produce a large number of phrases, but it is difficult to use these phrases effectively to answer what users want. We therefore present a pattern taxonomy extraction model that extracts descriptive frequent sequential patterns by pruning the meaningless ones. The model is then extended and tested by applying it to an information filtering system. The experimental results show that pattern-based methods outperform keyword-based methods. The results also indicate that removing meaningless patterns not only reduces the cost of computation but also improves the effectiveness of the system.
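To make the idea of mining frequent sequential patterns and then pruning some of them concrete, here is a minimal Python sketch. It is not the authors' model: the abstract does not say how "meaningless" patterns are identified, so the sketch assumes one common criterion, dropping any pattern that has a longer superpattern with the same support (a non-closed pattern). The function names, the brute-force enumeration, the toy paragraphs, and the `min_sup` threshold are all illustrative assumptions.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if the terms of pattern occur in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(term in it for term in pattern)


def support(pattern, sequences):
    """Absolute support: number of sequences that contain the pattern."""
    return sum(is_subsequence(pattern, s) for s in sequences)


def frequent_sequential_patterns(sequences, min_sup, max_len=3):
    """Brute-force enumeration of ordered term patterns with support >= min_sup."""
    candidates = set()
    for seq in sequences:
        for n in range(1, max_len + 1):
            # combinations() preserves the order of terms within seq
            candidates.update(combinations(seq, n))
    counted = {p: support(p, sequences) for p in candidates}
    return {p: s for p, s in counted.items() if s >= min_sup}


def prune_non_closed(patterns):
    """Assumed pruning rule: drop a pattern if a longer frequent pattern
    contains it and has the same support."""
    return {
        p: s
        for p, s in patterns.items()
        if not any(
            len(q) > len(p) and t == s and is_subsequence(p, q)
            for q, t in patterns.items()
        )
    }


# Toy "document": each paragraph is an ordered list of terms (hypothetical data).
paragraphs = [
    ["web", "mining", "pattern"],
    ["web", "mining", "taxonomy"],
    ["web", "pattern"],
]

frequent = frequent_sequential_patterns(paragraphs, min_sup=2)
kept = prune_non_closed(frequent)
for pattern, count in sorted(kept.items()):
    print(pattern, count)
```

On this toy input, five patterns are frequent, and the assumed rule prunes `("mining",)` and `("pattern",)` because `("web", "mining")` and `("web", "pattern")` occur in exactly the same paragraphs. Fewer patterns remain to be matched against incoming documents, which is the kind of cost reduction the abstract refers to.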

Event

IEEE/WIC/ACM International Conference on Intelligent Agent Technology (2004 : Beijing, China)

Pagination

242 - 248

Publisher

IEEE Xplore

Location

Beijing, China

Place of publication

Piscataway, N.J.

Start date

2004-09-20

End date

2004-09-24

ISBN-13

9780769521008

ISBN-10

0769521002

Language

eng

Publication classification

E1 Full written paper - refereed

Copyright notice

2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Editor/Contributor(s)

N Zhong, H Tirri, Y Yao, L Zhou, J Liu, N Cercone

Title of proceedings

IEEE/WIC International Conference on Web Intelligence (WI 2004) : Beijing, China, September 20-24, 2004 : proceedings
