Mining Association Rules in Text Databases Using Multipass with Inverted Hashing and Pruning

Authors:
John D. Holt; Soon M. Chung
Venue:
ICTAI '02 Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence
Year:
2002

Cited by:
Parallel mining of association rules from text databases (The Journal of Supercomputing)
Algorithms for mining frequent itemsets in static and dynamic datasets (Intelligent Data Analysis)

Abstract

In this paper, we propose a new algorithm named Multipass with Inverted Hashing and Pruning (MIHP) for mining association rules between words in text databases. The characteristics of text databases differ considerably from those of retail transaction databases, and existing mining algorithms cannot handle text databases efficiently because of the large number of itemsets (i.e., sets of words) that need to be counted. Two well-known mining algorithms, the Apriori algorithm [1] and the Direct Hashing and Pruning (DHP) algorithm [8], are evaluated in the context of mining text databases and compared with the proposed MIHP algorithm. The evaluation shows that the MIHP algorithm performs better on large text databases.
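
To make the counting problem concrete, the following is a minimal sketch of the baseline Apriori approach applied to text, assuming each document is treated as one transaction whose items are its distinct lowercase words. It is not the MIHP algorithm, whose details the abstract does not give; the function name `apriori`, the whitespace tokenization, and the sample documents are illustrative assumptions.

```python
from itertools import combinations


def apriori(documents, min_support):
    """Return {frozenset of words: support count} for every frequent itemset.

    Assumption: a document is one transaction, tokenized by whitespace.
    """
    transactions = [frozenset(doc.lower().split()) for doc in documents]

    # Pass 1: count how many documents contain each single word.
    counts = {}
    for t in transactions:
        for word in t:
            key = frozenset([word])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in prev for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)

        # Pass k: one scan over all documents to count every candidate.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    docs = [
        "data mining text",
        "text mining rules",
        "data mining rules",
        "text data mining",
    ]
    for itemset, support in apriori(docs, min_support=3).items():
        print(sorted(itemset), support)
```

With a realistic vocabulary, the number of candidate word pairs produced in the second pass grows roughly quadratically in the number of frequent words, which is the cost the abstract attributes to text databases.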