The goal of an XML retrieval system is to select, from a set of XML documents, all elements (nodes) that satisfy the user's information need, which is usually expressed as a set of keywords together with some structural conditions. The structural conditions are given as an ordered list of tag names that identifies the target element in which to search for relevant content. Consequently, a potentially relevant node should not only contain text similar to the query; its location path should also fit the structural conditions. In this paper we describe a new approach for ranking the results of XML content-and-structure queries, based on a probabilistic combination of two independent scores assigned to each XML element: a content score and a structural score. The content score measures the content similarity between an element and a query, while the structural score measures the similarity between the element's path and the structural conditions of the query. We show experimentally that both scores follow well-known distributions. We then propose a probabilistic combination of these distributions to assign a final score to each node. We ran experiments on a dataset provided by INEX to demonstrate the effectiveness of our approach. Our experiments focus on the VVCAS task, which is well suited to our model.
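To make the two-score idea concrete, the following Python sketch ranks a few toy elements. It is an illustration under stated assumptions, not the method evaluated in the paper: the abstract names neither the similarity measures nor the distributions, so the keyword-overlap content score, the longest-common-subsequence path score, and the empirical-CDF combination used here are all hypothetical stand-ins, as are the function names and the toy data.

```python
from bisect import bisect_right


def lcs_length(a, b):
    """Length of the longest common subsequence of two tag lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def structural_score(element_path, query_path):
    """Assumed measure: share of the query's tags matched, in order, by the path."""
    if not query_path:
        return 1.0
    return lcs_length(element_path, query_path) / len(query_path)


def content_score(element_text, keywords):
    """Assumed measure: share of query keywords that occur in the element text."""
    if not keywords:
        return 0.0
    tokens = set(element_text.lower().split())
    return sum(k.lower() in tokens for k in keywords) / len(keywords)


def empirical_cdf(samples):
    """Return a function x -> fraction of samples that are <= x."""
    ordered = sorted(samples)
    return lambda x: bisect_right(ordered, x) / len(ordered)


def rank(elements, keywords, query_path):
    """Score each element by the product of its two CDF values.

    Multiplying the two probabilities relies on the independence of the
    scores; the empirical CDF stands in for the fitted distributions,
    which the abstract does not name.
    """
    content = [content_score(e["text"], keywords) for e in elements]
    structure = [structural_score(e["path"], query_path) for e in elements]
    c_cdf, s_cdf = empirical_cdf(content), empirical_cdf(structure)
    scored = [
        (c_cdf(c) * s_cdf(s), e["path"])
        for e, c, s in zip(elements, content, structure)
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Toy collection (hypothetical data).
elements = [
    {"path": ["article", "sec", "p"], "text": "probabilistic xml retrieval"},
    {"path": ["article", "abs"], "text": "xml retrieval"},
    {"path": ["article", "sec", "title"], "text": "introduction"},
]
ranked = rank(elements, keywords=["xml", "retrieval"], query_path=["article", "sec"])
for score, path in ranked:
    print(f"{score:.3f}  /{'/'.join(path)}")
```

In this toy run, only the first element matches both the keywords and the full structural condition, so it is ranked first. An element that does well on one score but poorly on the other is pushed down by the product, which is the behaviour the combination of content and structure is meant to capture.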