On the limited memory BFGS method for large scale optimization
Mathematical Programming: Series A and B
A maximum entropy approach to natural language processing
Computational Linguistics
Enhanced hypertext categorization using hyperlinks
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Text Classification from Labeled and Unlabeled Documents using EM
Machine Learning - Special issue on information retrieval
Introduction to Modern Information Retrieval
Introduction to Modern Information Retrieval
Web classification using support vector machine
Proceedings of the 4th international workshop on Web information and data management
Reducing multiclass to binary: a unifying approach for margin classifiers
The Journal of Machine Learning Research
In Defense of One-Vs-All Classification
The Journal of Machine Learning Research
Semi-supervised learning for multi-component data classification
IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Hi-index | 0.00 |
Designing high performance classifiers for structured data consisting of multiple components is an important and challenging research issue in the field of machine learning. Although the main component of structured data plays an important role when designing classifiers, additional components may contain beneficial information for classification. This paper focuses on a probabilistic classifier design for multiclass classification based on the combination of main and additional components. Our formulation separately considers component generative models and constructs the classifier by combining these trained models based on the maximum entropy principle. We use naive Bayes models as the component generative models for text and link components so that we can apply our classifier design to document and web page classification problems. Our experimental results for three test collections confirmed that the proposed method effectively combined the main and additional components to improve classification performance.