Using micro-documents for feature selection: The case of ordinal text classification

Authors:
Stefano Baccianella;Andrea Esuli;Fabrizio Sebastiani
Affiliations:
Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, 56124 Pisa, Italy;Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, 56124 Pisa, Italy;Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, 56124 Pisa, Italy
Venue:
Expert Systems with Applications: An International Journal
Year:
2013

Abstract

Most popular feature selection methods for text classification, such as information gain (also known as "mutual information"), chi-square, and odds ratio, are based on binary information indicating the presence or absence of the feature (or "term") in each training document. As such, these methods do not exploit a rich source of information, namely how frequently the feature occurs in the training document (term frequency). To overcome this drawback, when doing feature selection we logically break down each training document of length k into k training "micro-documents", each consisting of a single word occurrence and endowed with the same class information as the original training document. This move has the double effect of (a) keeping all the original feature selection methods based on binary information straightforwardly applicable, and (b) making them sensitive to term frequency information. We study the impact of this strategy in the case of ordinal text classification, a type of text classification that deals with classes lying on an ordinal scale and that has recently been made popular by applications in customer relationship management, market research, and Web 2.0 mining. We run experiments using four recently introduced feature selection functions, two learning methods of the support vector machines family, and two large datasets of product reviews. The experiments show that the use of this strategy substantially improves the accuracy of ordinal text classification.
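
The micro-document decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `to_micro_documents` and `presence_counts`, the toy reviews, and the star ratings used as class labels are all assumptions made for illustration. The sketch only shows that a presence/absence count, computed unchanged over micro-documents, becomes a count of word occurrences per class.

```python
from collections import Counter

def to_micro_documents(docs):
    """Split each (tokens, label) document of length k into k one-token
    micro-documents that all carry the label of the original document."""
    return [([tok], label) for tokens, label in docs for tok in tokens]

def presence_counts(docs):
    """For each (term, label) pair, count the documents with that label
    in which the term occurs at least once (binary presence/absence)."""
    counts = Counter()
    for tokens, label in docs:
        for term in set(tokens):  # set(): each document counts a term once
            counts[(term, label)] += 1
    return counts

# Hypothetical toy data: tokenized reviews labelled with a star rating.
docs = [
    (["great", "great", "great", "plot"], 5),
    (["great", "boring", "boring"], 1),
]

micro_docs = to_micro_documents(docs)
doc_counts = presence_counts(docs)          # binary, per original document
micro_counts = presence_counts(micro_docs)  # same function, now frequency-aware
```

On the original documents, "great" is counted once for each rating, so a binary method cannot tell that it is far more typical of the 5-star review. On the micro-documents, the same counting function gives it a count of 3 for the 5-star class and 1 for the 1-star class. Any selection function built on such counts can then be applied without modification.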