Nearly all text classification methods assign texts to predefined categories according to the terms that appear in them. State-of-the-art text classification methods usually take a single word as a term, since this performs well on several well-known datasets; some experts have even pointed out that phrases improve classification accuracy only marginally, if at all. However, we found that this is not always true when categorizing texts about similar topics within the same domain. With words alone, such texts cannot be categorized effectively because they share nearly the same word set. We therefore hypothesize that the results can be improved by also using phrases as terms. To test this hypothesis, we propose our own phrase extraction method and select a suitable feature selection method and classifier through an experimental study on a data set of paper abstracts in the field of databases. We also develop a system called AutoPCS, which helps experts choose relevant topics for incoming papers from a predefined topic list using only their abstracts.
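The following minimal sketch illustrates why texts that share nearly the same word set may become distinguishable once phrases are added as terms. It is not the phrase extraction method proposed in the paper, which is not described in this abstract; adjacent-word bigrams are used here as a simple stand-in, and the function name `extract_terms` and both example strings are hypothetical.

```python
from collections import Counter


def extract_terms(text, use_phrases=True):
    """Return a bag of terms for one text.

    Terms are single words and, optionally, adjacent-word bigrams.
    Bigrams are an assumed stand-in for phrases, not the paper's method.
    """
    words = text.lower().split()
    terms = Counter(words)
    if use_phrases:
        # Add each pair of neighbouring words as one phrase term.
        terms.update(" ".join(pair) for pair in zip(words, words[1:]))
    return terms


# Two hypothetical texts built from exactly the same words.
a = "query optimization in distributed database systems"
b = "distributed query systems in database optimization"

# With words only, the two bags of terms are identical.
same_with_words = extract_terms(a, use_phrases=False) == extract_terms(b, use_phrases=False)

# With phrases added, the bags differ because word order now matters.
same_with_phrases = extract_terms(a) == extract_terms(b)

print(same_with_words, same_with_phrases)
```

In a full pipeline, term bags like these would still pass through feature selection before reaching a classifier; which methods suit this setting is what the experimental study examines.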