Text categorization is a well-known task, generally addressed with statistical approaches such as neural networks, Support Vector Machines and other machine learning algorithms. Texts are usually treated as bags of words, with no regard for word order. Although these approaches have proven effective, they do not provide users with comprehensible and reusable rules about their data. Such rules are, however, very important for users who need to describe trends in the data they analyze. In this context, Bing Liu proposed an approach based on association rules (CBA). In this paper, we extend that approach by using sequential patterns, in the SPaC method (Sequential Patterns for Classification) for text categorization. Taking order into account lets us represent the succession of words in a document without the complex and time-consuming representations and processing required by natural language and grammatical methods. The original method we propose mines sequential patterns in order to build a classifier. We show experimentally that our proposal is relevant and that it compares favorably with other methods. In particular, our method outperforms CBA and gives better results than SVM on some corpora.
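To make the idea of order-aware classification rules concrete, here is a minimal Python sketch. It is not the SPaC algorithm itself, whose details the abstract does not give. It assumes patterns of length 2 only (word a occurs somewhere before word b), enumerates them by brute force instead of running a real sequential pattern miner, and ranks rules by a simple confidence score. The names `ordered_pairs`, `mine_rules` and `classify`, the thresholds, and the toy corpus are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def ordered_pairs(words):
    """Return every length-2 subsequence (a, b) such that a occurs before b."""
    return set(combinations(words, 2))


def mine_rules(docs, min_support=0.5, min_confidence=0.6):
    """Build rules (pattern, label, confidence) from (words, label) pairs.

    Support is the fraction of a class's documents that contain the pattern.
    Confidence is the fraction of all documents containing the pattern
    that belong to the class.
    """
    by_class = defaultdict(list)
    for words, label in docs:
        by_class[label].append(ordered_pairs(words))

    # Number of documents, over all classes, that contain each pattern.
    total = Counter(p for sets in by_class.values() for s in sets for p in s)

    rules = []
    for label, sets in by_class.items():
        in_class = Counter(p for s in sets for p in s)
        for pattern, count in in_class.items():
            support = count / len(sets)
            confidence = count / total[pattern]
            if support >= min_support and confidence >= min_confidence:
                rules.append((pattern, label, confidence))

    # Most confident rules are tried first.
    rules.sort(key=lambda rule: -rule[2])
    return rules


def classify(words, rules, default=None):
    """Return the label of the first rule whose pattern occurs in the text."""
    present = ordered_pairs(words)
    for pattern, label, _ in rules:
        if pattern in present:
            return label
    return default


# Toy corpus (illustrative data only).
docs = [
    ("bank raises interest rate".split(), "finance"),
    ("bank cuts interest rate".split(), "finance"),
    ("team wins the match".split(), "sport"),
    ("team loses the match".split(), "sport"),
]
rules = mine_rules(docs, min_support=1.0)
print(classify("the bank sets an interest rate".split(), rules))
print(classify("rate interest bank".split(), rules))
```

In this sketch the second query contains the same words as the finance documents but in reverse order, so no rule fires and the default is returned; a bag-of-words representation could not make that distinction. Each rule is also readable on its own, for example "interest before rate implies finance".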