Scoring and Selecting Terms for Text Categorization

Authors:
Elena Montanes;Irene Diaz;Jose Ranilla;Elias F. Combarro;Javier Fernandez
Affiliations:
University of Oviedo;University of Oviedo;University of Oviedo;University of Oviedo;University of Oviedo
Venue:
IEEE Intelligent Systems
Year:
2005


Abstract

Machine learning has become one of the main approaches to text categorization. Because text collections contain a great deal of irrelevant information, effective feature reduction is essential to improving both the effectiveness and the efficiency of classifiers. A set of new scoring measures for feature selection, taken from the machine learning domain, was evaluated on two well-known document collections. In certain situations, some of these measures outperformed traditional measures from information retrieval and information theory.
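
As an illustration of what scoring and selecting terms means in practice, the following is a minimal Python sketch of filter-style feature selection. The abstract does not name the measures that were evaluated, so the sketch assumes information gain, a commonly used information-theoretic measure, purely as a stand-in. The function names (entropy, information_gain, select_terms) and the toy documents are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(docs, labels, term):
    """Reduction in class entropy from knowing whether `term` occurs.

    `docs` is a list of token sets and `labels` the matching categories.
    Assumption: information gain stands in for the paper's measures.
    """
    n = len(docs)
    if n == 0:
        return 0.0
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    without_term = [lab for doc, lab in zip(docs, labels) if term not in doc]

    def class_counts(labs):
        return list(Counter(labs).values())

    h_all = entropy(class_counts(labels))
    h_cond = (len(with_term) / n) * entropy(class_counts(with_term)) \
           + (len(without_term) / n) * entropy(class_counts(without_term))
    return h_all - h_cond

def select_terms(docs, labels, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]

# Toy collection (hypothetical data, for illustration only).
docs = [
    {"goal", "match"},
    {"goal", "team"},
    {"vote", "party"},
    {"vote", "team"},
]
labels = ["sport", "sport", "politics", "politics"]

top_terms = select_terms(docs, labels, 2)
print(top_terms)
```

In this toy collection, "goal" and "vote" separate the two categories perfectly and rank first, while "team" occurs equally in both and scores zero. Any other scoring measure could be substituted for information_gain without changing the selection step.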