Exploiting semantic tags in XML retrieval

Authors:
Qiuyue Wang; Qiushi Li; Shan Wang; Xiaoyong Du
Affiliations:
School of Information, Renmin University of China and Key Laboratory of Data Engineering and Knowledge Engineering, MOE, Beijing, P.R. China (all four authors)
Venue:
INEX'09 Proceedings of the Focused retrieval and evaluation, and 8th international conference on Initiative for the evaluation of XML retrieval
Year:
2009

Abstract

Using the new semantically annotated Wikipedia XML corpus, we investigate two research questions. First, do the structural constraints in content-and-structure (CAS) queries help when retrieving from an XML document collection that contains semantically rich tags? Second, how can semantic tag information be exploited to improve content-only (CO) queries, given that most users prefer to express queries in the simplest form? In this paper, we describe and analyze our comparison of CO and CAS queries over the document collection of the INEX 2009 ad hoc track, and we propose a method that improves the effectiveness of CO queries by enriching element content representations with semantic tags. Our results show that enriching XML element representations with semantic tags is effective in improving early precision, while in terms of average precision, strict interpretation of CAS queries is generally superior.
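
The abstract does not spell out how element representations are enriched, so the following is only a minimal sketch of one possible reading: tag names are added as extra terms to an element's bag of words, and elements are then scored for a CO query with a Dirichlet-smoothed query likelihood. The smoothing choice, the toy documents, and all names (`DOCS`, `element_terms`, `score`, `rank`, `mu`) are assumptions made for illustration and are not taken from the paper.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical toy documents whose tags carry semantic labels.
DOCS = [
    "<article><physicist>Albert Einstein developed relativity</physicist></article>",
    "<article><painter>Pablo Picasso developed cubism</painter></article>",
]


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def element_terms(elem, enrich):
    """Bag of terms for an element: its text content, plus (when enrich
    is True) the tag names of the element and all of its descendants."""
    terms = tokenize(" ".join(elem.itertext()))
    if enrich:
        for node in elem.iter():
            terms.extend(tokenize(node.tag))
    return terms


def score(query, terms, collection, mu=10.0):
    """Dirichlet-smoothed query log-likelihood (an assumed scoring model).
    The background model uses add-one counts so unseen terms stay finite."""
    tf = Counter(terms)
    total = sum(collection.values())
    vocab = len(collection)
    result = 0.0
    for q in tokenize(query):
        p_bg = (collection[q] + 1) / (total + vocab)
        result += math.log((tf[q] + mu * p_bg) / (len(terms) + mu))
    return result


def rank(query, enrich):
    """Score every toy document for a content-only query."""
    bags = [element_terms(ET.fromstring(d), enrich) for d in DOCS]
    collection = Counter(t for bag in bags for t in bag)
    return [score(query, bag, collection) for bag in bags]


if __name__ == "__main__":
    print("plain:   ", rank("physicist", enrich=False))
    print("enriched:", rank("physicist", enrich=True))
```

On these toy documents, the CO query `physicist` matches no text in either document, so the two scores tie without enrichment; with enrichment, the first document scores higher because its `physicist` tag becomes a matching term.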