Adapting association patterns for text categorization: weaknesses and enhancements

Authors:
Tieyun Qian; Hui Xiong; Yuanzhen Wang; Enhong Chen
Affiliations:
Wuhan University; Rutgers University; Huazhong University of Science and Technology; University of Science and Technology of China
Venue:
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Year:
2006

Abstract

The use of association patterns for text categorization has attracted great interest, and a variety of useful methods have been developed. However, the key characteristics of pattern-based text categorization remain unclear. In particular, two questions still lack concrete answers. First, what kind of association patterns are the best candidates for pattern-based text categorization? Second, what is the most desirable way to use patterns for text categorization? This paper addresses both questions. Specifically, we show that hyperclique patterns are more desirable than frequent patterns for text categorization. Along this line, we develop an algorithm for text categorization using hyperclique patterns. Experimental results show that our method outperforms state-of-the-art methods in both computational efficiency and classification accuracy.
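
The abstract does not define a hyperclique pattern, so a small illustration may help. In the hyperclique-pattern literature, the h-confidence of an itemset is its support divided by the largest support of any single item in it, and an itemset is a hyperclique pattern when its h-confidence meets a threshold. The Python sketch below applies that definition to toy documents treated as sets of words. It is not the categorization algorithm from the paper: the function names, the brute-force enumeration, the toy data, and the 0.65 threshold are all assumptions made for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def h_confidence(itemset, transactions):
    """Support of the itemset divided by the largest single-item support."""
    joint = support(itemset, transactions)
    max_single = max(support({item}, transactions) for item in itemset)
    return joint / max_single if max_single > 0 else 0.0


def hyperclique_patterns(transactions, min_hconf, max_size=3):
    """Brute-force enumeration (illustrative only, exponential in max_size)."""
    vocabulary = sorted(set().union(*transactions))
    found = []
    for size in range(2, max_size + 1):
        for candidate in combinations(vocabulary, size):
            if h_confidence(candidate, transactions) >= min_hconf:
                found.append(candidate)
    return found


# Hypothetical toy corpus: each document is a set of words.
docs = [
    {"stock", "market", "the"},
    {"stock", "market", "the"},
    {"goal", "match", "the"},
    {"goal", "match", "the"},
    {"stock", "the"},
]

patterns = hyperclique_patterns(docs, min_hconf=0.65)
print(patterns)
```

In this toy corpus the pair ("stock", "the") has support 0.6, so a frequent-pattern miner with a support threshold of 0.5 would keep it. Its h-confidence is only 0.6, because "the" occurs in every document, so the 0.65 threshold filters it out. The pairs ("goal", "match") and ("market", "stock") pass because their words mostly occur together.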