Ordinal classification, also known as ordinal regression, is a supervised learning task that consists of estimating the rating of a data item on a fixed, discrete rating scale. This problem is receiving increased attention from the sentiment analysis and opinion mining community, owing to the importance of automatically rating large amounts of product review data in digital form. As in other supervised learning tasks, such as binary or multiclass classification, feature selection is often needed to improve efficiency and avoid overfitting. However, although feature selection has been extensively studied for other classification tasks, it has not been studied for ordinal classification. In this letter, we present six novel feature selection methods specifically devised for ordinal classification and test them on two data sets of product review data against three methods previously known from the literature, using two learning algorithms from the support vector regression tradition. The experimental results show that all six proposed metrics largely outperform the three baseline techniques and are more stable than them by an order of magnitude, on both data sets and for both learning algorithms.
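For illustration, the sketch below shows the general pipeline the abstract refers to: a filter-style feature selection step applied to vectorized review text, followed by an epsilon-SVR learner whose continuous output is rounded back onto the discrete rating scale. This is a minimal sketch assuming scikit-learn; the chi-square filter and the toy data are generic stand-ins, not the six metrics, the three baselines, or the data sets evaluated in the letter.

```python
# Minimal sketch: filter feature selection + SVR-based ordinal classification.
# Assumes scikit-learn; chi2 is a generic stand-in filter, NOT one of the
# letter's six proposed metrics or its three baselines.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.svm import SVR

# Hypothetical toy reviews with gold ratings on a 1-5 scale.
docs = [
    "terrible, would not buy again",
    "mediocre at best",
    "does the job",
    "really solid product",
    "absolutely perfect, five stars",
]
ratings = np.array([1, 2, 3, 4, 5])

# Vectorize the text; tf-idf weights are non-negative, as chi2 requires.
X = TfidfVectorizer().fit_transform(docs)

# Filter feature selection: keep only the k highest-scoring features.
selector = SelectKBest(chi2, k=5).fit(X, ratings)
X_sel = selector.transform(X)

# Epsilon-SVR as the base learner; its real-valued predictions are
# rounded and clipped back onto the discrete rating scale.
reg = SVR(kernel="linear").fit(X_sel, ratings)
pred = np.clip(np.rint(reg.predict(X_sel)), 1, 5).astype(int)
print(pred)  # predicted ratings, one per review
```

Note that a chi-square filter treats the ratings as unordered classes; feature selection metrics devised specifically for ordinal classification instead exploit the ordering of the rating scale, which is precisely the gap the letter addresses.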