In this paper, we study the problem of integrating documents from different sources into a comprehensive topic hierarchy. Our objective is to develop efficient techniques that improve the accuracy of traditional categorization methods by incorporating categorization information provided by the data sources into the categorization process. On the World Wide Web, such categorization information is often available from the information sources. We present several techniques that use this information to enhance traditional methods such as naive Bayes and support vector machines (SVM). Experiments on collections from Openfind and Yam, and from Google and Yahoo!, popular web sites in Taiwan and the USA, respectively, show that our techniques significantly improve classification accuracy: for example, from 55% to 66% for naive Bayes and from 57% to 67% for SVM on the data set collected from Yam and Openfind.