Effective Pattern Discovery for Text Mining

Authors:
Ning Zhong; Yuefeng Li; Sheng-Tang Wu
Affiliations:
Maebashi Institute of Technology, Maebashi; Queensland University of Technology, Brisbane; Asia University, Taiwan
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2012

Cited by:
A two-stage decision model for information filtering
Decision Support Systems

A pattern discovery model for effective text mining
MLDM'12 Proceedings of the 8th international conference on Machine Learning and Data Mining in Pattern Recognition

Discovering relevant features for effective query formulation
IRFC'12 Proceedings of the 5th conference on Multidisciplinary Information Retrieval

Adopting relevance feature to learn personalized ontologies
AI'12 Proceedings of the 25th Australasian joint conference on Advances in Artificial Intelligence

Mining pure high-order word associations via information geometry for information retrieval
ACM Transactions on Information Systems (TOIS)

Mining high coherent association rules with consideration of support measure
Expert Systems with Applications: An International Journal

Mapping semantic knowledge for unsupervised text categorisation
ADC '13 Proceedings of the Twenty-Fourth Australasian Database Conference - Volume 137

Abstract

Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to use and update discovered patterns effectively remains an open research issue, especially in text mining. Because most existing text mining methods adopt term-based approaches, they suffer from the problems of polysemy and synonymy. It has long been hypothesized that pattern-based (or phrase-based) approaches should outperform term-based ones, but many experiments do not support this hypothesis. This paper presents an innovative and effective pattern discovery technique that includes the processes of pattern deploying and pattern evolving, which improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Substantial experiments on the RCV1 data collection and TREC topics demonstrate that the proposed solution achieves encouraging performance.