IR-style Web services discovery is an important approach that applies proven techniques from the field of Information Retrieval (IR). Many studies have exploited the syntax of the Web Services Description Language (WSDL) to extract useful service metadata for building indexes. A fundamental issue with this approach, however, is WSDL term tokenization. This paper proposes applying three statistical methods to WSDL term tokenization: MDL, TP, and PPM. Given the increasing need for effective IR-style Web services discovery facilities, term tokenization is of fundamental importance for properly indexing WSDL documents. We compare the three methods with two baseline methods. The experiment suggests that the MDL and PPM methods are superior when judged by IR evaluation metrics. To the best of our knowledge, our work is the first to systematically investigate WSDL term tokenization for Web services discovery. Our solution can also benefit source code mining, in which a key step is to tokenize the names (i.e., terms) of variables, functions, classes, modules, etc. for semantic analysis. Our methods could further be used to solve Web-related string tokenization problems such as URL analysis and Web script comprehension.
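To make the tokenization problem concrete, the following is a minimal sketch of statistical term tokenization in Python. It is not the MDL, TP, or PPM method evaluated in the paper; it only illustrates the general idea of choosing the split with the shortest total code length under a unigram model. The word counts in `COUNTS`, the `UNKNOWN_COST` penalty, and the function names are all hypothetical and chosen for illustration.

```python
import math

# Hypothetical word counts (an assumption for illustration only);
# a real system would estimate these from a corpus.
COUNTS = {"get": 50, "weather": 20, "by": 40, "zip": 10,
          "code": 30, "we": 25, "at": 35, "her": 15}
TOTAL = sum(COUNTS.values())
MAX_LEN = max(len(w) for w in COUNTS)
UNKNOWN_COST = 20.0  # assumed penalty in bits per character of an unknown piece


def cost(piece):
    """Code length in bits of one piece under the unigram model."""
    if piece in COUNTS:
        return -math.log2(COUNTS[piece] / TOTAL)
    return UNKNOWN_COST * len(piece)


def tokenize(term):
    """Split a term into the pieces with the smallest total code length."""
    s = term.lower()
    n = len(s)
    best = [0.0] + [math.inf] * n  # best[i]: minimum cost of segmenting s[:i]
    back = [0] * (n + 1)           # back[i]: start index of the last piece in s[:i]
    for i in range(1, n + 1):
        for j in range(max(0, i - MAX_LEN), i):
            c = best[j] + cost(s[j:i])
            if c < best[i]:
                best[i] = c
                back[i] = j
    # Walk the back-pointers from the end to recover the pieces.
    pieces = []
    i = n
    while i > 0:
        pieces.append(s[back[i]:i])
        i = back[i]
    return pieces[::-1]


print(tokenize("getweatherbyzipcode"))
```

With these assumed counts, the all-lowercase term `getweatherbyzipcode` is split into `get`, `weather`, `by`, `zip`, `code`: the single piece `weather` costs fewer bits than `we` + `at` + `her`, so the dynamic program prefers it. Because the split does not rely on capitalization or delimiters, the same approach applies to terms that follow no naming convention.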