In the vector space model (VSM), text representation is the task of transforming the content of a textual document into a vector in the term space so that the document can be recognized and classified by a computer or a classifier. Different terms (i.e., words, phrases, or any other indexing units used to identify the contents of a text) have different importance in a text. Term weighting methods assign appropriate weights to the terms to improve the performance of text categorization. In this study, we investigate several widely used unsupervised (traditional) and supervised term weighting methods on benchmark data collections in combination with SVM and kNN algorithms. Taking into account the distribution of relevant documents in the collection, we propose a new, simple supervised term weighting method, tf.rf, to improve the terms' discriminating power for the text categorization task. In the controlled experiments, the supervised term weighting methods show mixed performance. Specifically, our proposed method, tf.rf, performs consistently better than the other term weighting methods, while the other supervised methods, which are based on information theory or statistical metrics, perform the worst in all experiments. On the other hand, the popular tf.idf method does not perform uniformly well across different data sets.
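The abstract names tf.rf without stating its formula. The following is a minimal sketch, assuming the relevance frequency factor is defined as rf = log2(2 + a / max(1, c)), where a is the number of positive-category documents containing the term and c is the number of negative-category documents containing it. That definition is an assumption not given in the text above, and the function names and toy corpus are hypothetical.

```python
import math
from collections import Counter


def rf_weights(docs, labels, positive):
    """Compute an rf factor per term for one target (positive) category.

    Assumed definition (not stated in the abstract):
        rf = log2(2 + a / max(1, c))
    a = positive documents containing the term
    c = negative documents containing the term
    """
    a = Counter()
    c = Counter()
    for tokens, label in zip(docs, labels):
        target = a if label == positive else c
        # Count each term once per document (document frequency).
        for term in set(tokens):
            target[term] += 1
    vocab = set(a) | set(c)
    return {t: math.log2(2 + a[t] / max(1, c[t])) for t in vocab}


def tf_rf(tokens, rf):
    """Weight one document: raw term frequency times the rf factor.

    Terms unseen in training get rf = 1.0, which is what the assumed
    formula yields when a = c = 0.
    """
    tf = Counter(tokens)
    return {t: n * rf.get(t, 1.0) for t, n in tf.items()}


# Hypothetical toy corpus with two categories.
docs = [
    ["wheat", "price"],
    ["wheat", "farm"],
    ["price", "stock"],
    ["stock", "market"],
]
labels = ["grain", "grain", "finance", "finance"]

rf = rf_weights(docs, labels, positive="grain")
weights = tf_rf(["wheat", "wheat", "price"], rf)
```

Under this assumed definition, a term concentrated in the positive category ("wheat") receives a larger factor than a term spread evenly across categories ("price"), and a term found only in negative documents ("stock") keeps the minimum factor of 1. This is the sense in which the weight reflects the distribution of relevant documents rather than collection-wide rarity, as tf.idf does.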