Text Document Categorization by Term Association

Authors:
Maria-Luiza Antonie;Osmar R. Zaïane
Venue:
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Year:
2002

Citing 0
Cited 55

Abstract

A good text classifier is a classifier that efficiently categorizes large sets of text documents in a reasonable time frame and with an acceptable accuracy, and that provides classification rules that are human readable for possible fine-tuning. If the training of the classifier is also quick, this could become in some application domains a good asset for the classifier. Many techniques and algorithms for automatic text categorization have been devised. According to published literature, some are more accurate than others, and some provide more interpretable classification models than others. However, none can combine all the beneficial properties enumerated above. In this paper, we present a novel approach for automatic text categorization that borrows from market basket analysis techniques using association rule mining in the data-mining field. We focus on two major problems: (1) finding the best term association rules in a textual database by generating and pruning; and (2) using the rules to build a text classifier. Our text categorization method proves to be efficient and effective, and experiments on well-known collections show that the classifier performs well. In addition, training as well as classification are both fast and the generated rules are human readable.
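
The two steps named in the abstract (generating and pruning term association rules, then classifying with them) can be illustrated with a small Python sketch. This is not the authors' algorithm: the function names, the thresholds, the toy training data, and the scoring by summed rule confidence are all assumptions made for illustration, and small term sets are enumerated directly rather than mined with a level-wise frequent-itemset search.

```python
from collections import defaultdict
from itertools import combinations


def mine_rules(docs, min_support=0.5, min_confidence=0.6, max_len=2):
    """Mine rules of the form (termset -> category).

    docs is a list of (set_of_terms, category) pairs. The thresholds and
    max_len are illustrative values, not taken from the paper.
    """
    by_cat = defaultdict(list)
    for terms, cat in docs:
        by_cat[cat].append(frozenset(terms))

    rules = []
    for cat, cat_docs in by_cat.items():
        # Generate: count every term set of size <= max_len in this category.
        counts = defaultdict(int)
        for terms in cat_docs:
            for k in range(1, max_len + 1):
                for combo in combinations(sorted(terms), k):
                    counts[frozenset(combo)] += 1
        for termset, n in counts.items():
            # Prune: drop term sets that are infrequent within the category.
            if n / len(cat_docs) < min_support:
                continue
            # Confidence: share of all documents containing the term set
            # that belong to this category.
            covered = sum(1 for terms, _ in docs if termset <= terms)
            confidence = n / covered
            if confidence >= min_confidence:
                rules.append((termset, cat, confidence))
    return rules


def classify(terms, rules, default=None):
    """Score each category by the summed confidence of its matching rules."""
    scores = defaultdict(float)
    for termset, cat, confidence in rules:
        if termset <= terms:
            scores[cat] += confidence
    if not scores:
        return default
    return max(scores, key=scores.get)


# Hypothetical toy corpus: each document is a bag of terms with one label.
TRAIN = [
    ({"wheat", "export", "tonnes"}, "grain"),
    ({"wheat", "harvest", "crop"}, "grain"),
    ({"wheat", "corn", "export"}, "grain"),
    ({"oil", "barrel", "opec"}, "crude"),
    ({"oil", "export", "price"}, "crude"),
    ({"oil", "barrel", "price"}, "crude"),
]

rules = mine_rules(TRAIN)
for termset, cat, confidence in sorted(rules, key=lambda r: -r[2]):
    print(sorted(termset), "->", cat, round(confidence, 2))
print(classify({"wheat", "export", "shipment"}, rules))
```

Each mined rule stays readable as "these terms imply this category", which is the interpretability property the abstract emphasizes; a rule can be inspected or removed by hand without retraining.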