New feature selection and weighting methods based on category information

Authors:
Gongshen Liu;Jianhua Li;Xiang Li;Qiang Li
Affiliations:
School of Information Security Engineering, Shanghai Jiaotong University, Shanghai, China (all authors)
Venue:
ICADL'04 Proceedings of the 7th International Conference on Digital Libraries: International Collaboration and Cross-Fertilization
Year:
2004

Abstract

Traditional feature selection and weighting methods make full use of document information but neglect category information. The new feature selection and weighting methods use category information as a factor, which compensates for this shortcoming of the traditional methods. Under the new methods, features that are distributed evenly within a single category are given more importance than under the old methods. Experiments show that four well-known classifiers built on the new feature selection and weighting methods are more effective than the same classifiers built on traditional methods.