In this paper, we study the use of XML-tagged keywords (or simply key-tags) to search for XML fragments in a collection of XML documents. We present techniques that take users' evaluations as feedback and generate an adaptive ranked list of XML fragments as the search result. First, we extend the vector space model as a basis for searching XML fragments. The model measures the relevance between the given key-tags and the fragments identified in XML documents, and outputs a ranked result. Second, to deal with the diverse nature of XML documents, we present four XML Rankers (XRs), each tailored to such documents and each with different strengths in terms of similarity, granularity, and ranking features. We then evaluate the search effectiveness and quality of each tailored XR and propose a meta-XML ranker (MXR) that combines the four XRs. The MXR is trained with a machine learning scheme that runs the ranking support vector machine (RSVM) in a co-training framework; we term this scheme the RSCF. The RSCF takes as input two sets of labelled fragments and feature vectors, and generates adaptive rankers as output through a learning process. We show empirically that, with only a small set of training XML fragments, the RSCF is able to improve after a few iterations of the learning process. Finally, we demonstrate that the RSCF-based MXR is able to bring out the strengths of the underlying XRs in order to adapt to the users' perspectives on the returned search results. Using a set of key-tag queries on a variety of XML documents, we show that the RSCF-based MXR is effective in terms of the precision of its results.
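As a rough illustration of the vector-space idea mentioned above, the following minimal Python sketch scores XML fragments against a key-tag query by cosine similarity over (tag, keyword) pairs. It is not the paper's extended model: the plain term-frequency weighting, the function names (`key_tag_vector`, `cosine`, `rank_fragments`), and the sample fragments are all assumptions made for illustration only.

```python
import math
from collections import Counter


def key_tag_vector(pairs):
    """Build a sparse term-frequency vector over (tag, keyword) pairs."""
    return Counter((tag.lower(), word.lower()) for tag, word in pairs)


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_fragments(query_pairs, fragments):
    """Return (fragment name, score) pairs sorted by descending similarity."""
    q = key_tag_vector(query_pairs)
    scored = [(name, cosine(q, key_tag_vector(pairs)))
              for name, pairs in fragments.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical key-tag query: keyword "xml" under <title>, "smith" under <author>.
query = [("title", "xml"), ("author", "smith")]

# Hypothetical fragments, each reduced to its (tag, keyword) pairs.
fragments = {
    "frag_a": [("title", "xml"), ("title", "search"), ("author", "smith")],
    "frag_b": [("title", "xml"), ("author", "jones")],
    "frag_c": [("abstract", "xml"), ("abstract", "smith")],
}

ranking = rank_fragments(query, fragments)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch a keyword counts as a match only when it appears under the same tag as in the query, so `frag_c` scores zero even though it contains both keywords. A learned ranker such as the RSCF-based MXR described above would combine several such scoring functions rather than rely on a single one.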