Term-weighting approaches in automatic text retrieval
Information Processing and Management: an International Journal
Automatic indexing based on Bayesian inference networks
SIGIR '93 Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval
Optimization of relevance feedback weights
SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
Learning routing queries in a query zone
Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
Wrappers for feature subset selection
Artificial Intelligence - Special issue on relevance
Using a generalized instance set for automatic text categorization
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Boosting and Rocchio applied to text filtering
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Context-sensitive learning methods for text categorization
ACM Transactions on Information Systems (TOIS)
A probabilistic description-oriented approach for categorizing web documents
Proceedings of the eighth international conference on Information and knowledge management
An Evaluation of Statistical Approaches to Text Categorization
Information Retrieval
Machine learning in automated text categorization
ACM Computing Surveys (CSUR)
Text Categorization with Suport Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
A Comparative Study on Feature Selection in Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Pivoted Document Length Normalization
Pivoted Document Length Normalization
Intrusion detection in web applications using text mining
Engineering Applications of Artificial Intelligence
CyberIR --- A Technological Approach to Fight Cybercrime
PAISI, PACCF and SOCO '08 Proceedings of the IEEE ISI 2008 PAISI, PACCF, and SOCO international workshops on Intelligence and Security Informatics
A framework of feature selection methods for text categorization
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2
On the importance of parameter tuning in text categorization
PSI'06 Proceedings of the 6th international Andrei Ershov memorial conference on Perspectives of systems informatics
Cross language text classification by model translation and semi-supervised learning
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Pairwise optimized Rocchio algorithm for text categorization
Pattern Recognition Letters
Feature selection strategy in text classification
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I
Intrusion detection using text mining in a web-based telemedicine system
AI'05 Proceedings of the 18th Australian Joint conference on Advances in Artificial Intelligence
Text categorization methods for automatic estimation of verbal intelligence
Expert Systems with Applications: An International Journal
Automatic text classification to support systematic reviews in medicine
Expert Systems with Applications: An International Journal
Hi-index | 0.00 |
Current trend in operational text categorization is the designing of fast classification tools. Several studies on improving accuracy of fast but less accurate classifiers have been recently carried out. In particular, enhanced versions of the Rocchio text classifier, characterized by high performance, have been proposed. However, even in these extended formulations the problem of tuning its parameters is still neglected. In this paper, a study on parameters of the Rocchio text classifier has been carried out to achieve its maximal accuracy. The result is a model for the automatic selection of parameters. Its main feature is to bind the searching space so that optimal parameters can be selected quickly. The space has been bound by giving a feature selection interpretation of the Rocchio parameters. The benefit of the approach has been assessed via extensive cross evaluation over three corpora in two languages. Comparative analysis shows that the performances achieved are relatively close to the best TC models (e.g. Support Vector Machines).