A Bayesian Framework for XML Information Retrieval: Searching and Learning with the INEX Collection

Authors:
Benjamin Piwowarski; Patrick Gallinari
Affiliations:
Center for Web Research, DCC, Universidad de Chile, Santiago, Chile; LIP6, Paris, France 75015
Venue:
Information Retrieval
Year:
2005

Abstract

Most recent document standards, such as XML, rely on structured representations. Current information retrieval systems, however, were developed for flat document representations and cannot easily be extended to cope with more complex document types. The design of such systems is still an open problem. We present a new model for structured document retrieval that computes scores for document parts. The model is based on Bayesian networks whose conditional probabilities are learnt from a labelled collection of structured documents, composed of documents, queries and their associated assessments. Training these models is a complex, non-standard machine learning task, and it is the focus of the paper: we propose to train the structured Bayesian network model using a cross-entropy training criterion. Results are presented on the INEX corpus of XML documents.
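
To make the idea of a cross-entropy training criterion concrete, here is a minimal sketch of a generic binary cross-entropy between assessed relevance labels and predicted relevance probabilities for document parts. It is not the criterion defined in the paper: the binary relevance scale, the function name `cross_entropy`, and all example values are assumptions made purely for illustration.

```python
import math

def cross_entropy(targets, predictions, eps=1e-12):
    """Mean binary cross-entropy between assessed labels (0 or 1)
    and predicted relevance probabilities.

    Assumption: relevance is binary here; this is a generic loss,
    not the paper's exact training criterion.
    """
    total = 0.0
    for t, p in zip(targets, predictions):
        # Clamp probabilities away from 0 and 1 to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(targets)

# Hypothetical assessments for four document parts (1 = relevant).
assessed = [1, 0, 1, 0]
# Hypothetical model outputs: one confident and mostly right, one uninformative.
good_scores = [0.9, 0.1, 0.8, 0.2]
poor_scores = [0.4, 0.6, 0.5, 0.5]

print(cross_entropy(assessed, good_scores))
print(cross_entropy(assessed, poor_scores))
```

Under these assumptions, a model whose predicted probabilities agree with the assessments gets a lower loss, so minimising this quantity over the network's conditional probabilities pushes the scores of document parts towards the labelled judgements.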