High dimensionality of the feature space is a major concern in text classification because it affects both processing time and accuracy. Selecting distinctive features is therefore essential. This study proposes a novel filter-based probabilistic feature selection method for text classification, the distinguishing feature selector (DFS). DFS is compared with well-known filter approaches: chi-square, information gain, Gini index, and deviation from Poisson distribution. The comparison covers different datasets, classification algorithms, and success measures. Experimental results indicate that DFS performs competitively with these approaches in terms of classification accuracy, dimension reduction rate, and processing time.
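
To illustrate what a filter approach does in general, the sketch below scores each term independently of any classifier and keeps the top k. It uses the chi-square statistic, one of the baseline filters named above, and is not the DFS method itself, whose scoring formula is not given in this text. The function names, the document representation (one set of terms per document), and the choice to take the maximum score over classes are assumptions made for illustration only.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square for a 2x2 term/class table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: term or class carries no information
    return n * (a * d - c * b) ** 2 / denom


def select_top_k(docs, labels, k):
    """Rank terms by their best per-class chi-square score and keep k.

    docs is a list of sets of terms, labels the matching class labels.
    Taking the maximum over classes is an assumption of this sketch.
    """
    n_docs = len(docs)
    class_sizes = Counter(labels)
    df = Counter()        # document frequency of each term
    df_class = Counter()  # document frequency of each (term, class) pair
    for terms, label in zip(docs, labels):
        for term in terms:
            df[term] += 1
            df_class[(term, label)] += 1

    scores = {}
    for term in df:
        best = 0.0
        for label, size in class_sizes.items():
            a = df_class[(term, label)]
            b = df[term] - a
            c = size - a
            d = n_docs - a - b - c
            best = max(best, chi_square(a, b, c, d))
        scores[term] = best

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]
```

Because the scores depend only on document counts, the whole vocabulary can be ranked in a single pass over the corpus before any classifier is trained, which is why filter methods are attractive when the feature space is large.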