Service discovery plays an important role in service composition. To improve discovery performance, available services are often first grouped into classes by a service classifier. Although a powerful classifier is essential for this task, existing methods usually assume that the class labels of services are available in advance, which does not hold in practice. Traditional clustering methods, in turn, consume a great deal of time and resources when processing Web service data and perform poorly because WSDL documents are high-dimensional and sparse. In this paper, Latent Semantic Analysis (LSA) is combined with the Expectation-Maximization (EM) algorithm to compensate for the poor performance of a single learning model. The class labels obtained in this way are then used by a Support Vector Machine (SVM) classifier for further classification. We evaluate our approach on real-world WSDL files. The experimental results demonstrate the effectiveness of the proposed method in terms of the accuracy and quality of service clustering and classification.
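
The pipeline described above (LSA for dimensionality reduction, EM-based clustering to obtain class labels, then an SVM trained on those labels) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn, uses a Gaussian mixture as the EM-fitted model (the abstract does not say which mixture model is used), and replaces real WSDL files with a handful of hypothetical keyword strings. The names `DOCS`, `cluster_then_classify`, and all parameter values are illustrative assumptions.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.mixture import GaussianMixture
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer
from sklearn.svm import LinearSVC

# Hypothetical stand-ins for terms extracted from WSDL documents.
DOCS = [
    "service get weather forecast temperature city",
    "service weather report humidity temperature forecast",
    "service city weather forecast wind temperature",
    "service convert currency exchange rate amount",
    "service currency exchange rate quote amount",
    "service exchange rate convert currency money",
]


def cluster_then_classify(docs, n_classes=2, n_topics=2, seed=0):
    """Derive class labels without supervision, then train an SVM on them."""
    # LSA: TF-IDF term weights followed by a truncated SVD, which maps the
    # sparse, high-dimensional documents into a small dense topic space.
    lsa = make_pipeline(
        TfidfVectorizer(),
        TruncatedSVD(n_components=n_topics, random_state=seed),
        Normalizer(copy=False),
    )
    topics = lsa.fit_transform(docs)

    # EM step: GaussianMixture is fitted with the EM algorithm (an assumed
    # choice of mixture model); each document gets its most likely component.
    mixture = GaussianMixture(
        n_components=n_classes, covariance_type="diag", random_state=seed
    )
    labels = mixture.fit_predict(topics)

    # The cluster assignments serve as class labels for the SVM.
    svm = LinearSVC().fit(topics, labels)
    return lsa, svm, labels


lsa, svm, labels = cluster_then_classify(DOCS)

# Classify an unseen (hypothetical) service description.
new_doc = ["service temperature forecast for a city"]
predicted = svm.predict(lsa.transform(new_doc))
print(labels, predicted)
```

The cluster indices themselves are arbitrary, so only agreement between documents is meaningful: services in the same group should share a label, and the unseen description should receive the label of the group it resembles.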