In this paper, we propose two new algorithms for mining association rules between words in text databases. The characteristics of text databases are quite different from those of retail transaction databases, and existing mining algorithms cannot handle text databases efficiently because of the large number of itemsets (i.e., words) that need to be counted. Two well-known mining algorithms, the Apriori algorithm and the Direct Hashing and Pruning (DHP) algorithm, are evaluated in the context of mining text databases and compared with the newly proposed algorithms, Multipass-Apriori (M-Apriori) and Multipass-DHP (M-DHP). The results show that the proposed algorithms perform better than Apriori and DHP on large text databases.
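To make the setting concrete, the following is a minimal sketch of the baseline Apriori idea applied to text, assuming each document is treated as one transaction made up of its distinct words. It is not the M-Apriori or M-DHP algorithm, whose details are not given in this abstract; the function name `apriori_frequent_itemsets`, the whitespace tokenization, and the absolute `min_support` threshold are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def apriori_frequent_itemsets(documents, min_support):
    """Return {frozenset of words: support count} for every word set that
    occurs in at least `min_support` documents (baseline Apriori sketch)."""
    # Assumption: one document = one transaction = the set of its distinct words.
    transactions = [frozenset(doc.lower().split()) for doc in documents]

    # Pass 1: count single words and keep the frequent ones.
    word_counts = Counter(word for t in transactions for word in t)
    current = {frozenset([w]): c for w, c in word_counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Candidate generation: join frequent (k-1)-itemsets and keep a union
        # only if it has size k and all of its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(subset) in current for subset in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Counting pass: one scan over all transactions per itemset size.
        counts = Counter()
        for t in transactions:
            for candidate in candidates:
                if candidate <= t:
                    counts[candidate] += 1

        current = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(current)
        k += 1

    return frequent


if __name__ == "__main__":
    docs = ["data mining text", "data mining rules", "text mining", "data rules"]
    for itemset, support in apriori_frequent_itemsets(docs, min_support=2).items():
        print(sorted(itemset), support)
```

Even in this toy version, the number of candidate word sets grows quickly with vocabulary size, which is the counting cost the abstract identifies as the obstacle for text databases.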