Feature ranking fusion for text classifier

Authors:
Masoud Makrehchi; Mohamed S. Kamel
Affiliations:
Department of Electrical, Computer, and Software Engineering, University of Ontario Institute of Technology (UOIT), Oshawa, ON, Canada; Pattern Analysis and Machine Intelligence Lab, Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada
Venue:
Intelligent Data Analysis
Year:
2012


Abstract

Feature ranking is widely used in text classification. One problem with feature ranking methods is their lack of robustness: a given method behaves differently from one data set to another. The problem becomes more complex because the performance of a feature ranking method also depends heavily on the type of text classifier. In this paper, a new method based on combining feature rankings is proposed to find the best features among a set of feature rankings. The proposed method is applied to the text classification problem and evaluated on three well-known data sets using Support Vector Machine and Rocchio classifiers. Several combining methods are employed to aggregate the ranked lists of features. We show that combining methods can offer reliable results very close to the best solution without the need to use a classifier.
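
As an illustration of what combining feature rankings can look like, the sketch below fuses several ranked feature lists with a Borda count. The abstract does not say which combining methods the paper uses, so the Borda count is an assumption chosen for illustration only. The function name `borda_fuse` and the toy feature names are hypothetical.

```python
from collections import defaultdict


def borda_fuse(rankings):
    """Fuse ranked feature lists with a Borda count (illustrative assumption).

    Each ranking is a list of feature names, best first. A feature at
    zero-based position p in a list of length n earns n - p points from
    that list; a feature absent from a list earns nothing from it.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] += n - position
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical rankings, as if produced by three different ranking methods.
rankings = [
    ["price", "market", "stock", "game"],
    ["market", "price", "game", "stock"],
    ["price", "stock", "market", "game"],
]

fused = borda_fuse(rankings)
top_k = fused[:2]  # keep the two highest-ranked features
print(fused)
print(top_k)
```

Note that the fusion step uses only the ranked lists themselves and never trains or evaluates a classifier, which matches the setting described in the abstract. Other aggregation rules, such as taking the mean or the best rank of each feature, could be swapped in by changing how `scores` is accumulated.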