Many learning problems require handling high-dimensional datasets with a relatively small number of instances. Learning algorithms are thus confronted with the curse of dimensionality and must address it in order to be effective. Examples of such data include the bag-of-words representation in text classification and gene expression data for tumor detection and classification. Among the many features characterizing the instances, a large fraction may be irrelevant (or even detrimental) to the learning task. There is thus a clear need for adequate techniques for feature representation, reduction, and selection, to improve both classification accuracy and memory requirements. In this paper, we propose combined unsupervised feature discretization and feature selection techniques, suitable for medium- and high-dimensional datasets. Experimental results on several standard datasets, with both sparse and dense features, show the effectiveness of the proposed techniques as well as improvements over previous related techniques.
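To make the overall pipeline concrete, the following is a minimal sketch of unsupervised feature discretization followed by unsupervised feature selection. It uses equal-width binning and a simple variance-based relevance score; both are illustrative assumptions for exposition, not the exact discretizers or selection criteria proposed in the paper.

```python
import numpy as np

def equal_width_discretize(X, n_bins=5):
    """Unsupervised equal-width binning: map each feature of X
    (instances x features) to integer bin codes 0..n_bins-1,
    using only that feature's own min/max (no class labels)."""
    mins, maxs = X.min(axis=0), X.max(axis=0)
    # Guard against constant features (zero-width range).
    widths = np.where(maxs > mins, (maxs - mins) / n_bins, 1.0)
    codes = np.floor((X - mins) / widths).astype(int)
    return np.clip(codes, 0, n_bins - 1)

def select_by_dispersion(Xd, n_features):
    """Rank discretized features by variance of their bin codes
    (a simple unsupervised relevance proxy, an assumption here)
    and keep the top n_features, preserving column order."""
    scores = Xd.var(axis=0)
    keep = np.sort(np.argsort(scores)[::-1][:n_features])
    return Xd[:, keep], keep

# Toy usage: 100 instances, 4 dense features.
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(0, 1, 100),
                     rng.normal(0, 5, 100),
                     rng.uniform(-2, 2, 100),
                     rng.normal(3, 2, 100)])
Xd = equal_width_discretize(X, n_bins=5)
Xs, kept = select_by_dispersion(Xd, n_features=2)
print(Xs.shape, kept)
```

Because both stages ignore class labels, the same reduced representation can feed any downstream classifier; in practice the number of bins and the number of retained features would be tuned per dataset.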