Expressive retrieval from XML documents

Authors:
Taurai Tapiwa Chinenyanga;Nicholas Kushmerick
Affiliations:
Univ. College Dublin, Dublin, Ireland;Univ. College Dublin, Dublin, Ireland
Venue:
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2001

Citing 14
Cited 19

Automatic text processing

Automatic text processing
Mediators in the Architecture of Future Information Systems

Computer
The merge/purge problem for large databases

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Dempster-Shafer's theory of evidence applied to structured documents: modelling uncertainty

Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
A first course in database systems

A first course in database systems
Integration of heterogeneous databases without common domains using queries based on textual similarity

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Storing semistructured data with STORED

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
A query language for XML

WWW '99 Proceedings of the eighth international conference on World Wide Web
Integrating keyword search into XML query processing

Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications networking
SilkRoute: trading between relations and XML

Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications networking
WHIRL: a word-based information representation language

Artificial Intelligence - Special issue on Intelligent internet systems
Proximity Search in Databases

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Relational Databases for Querying XML Documents: Limitations and Opportunities

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Adding Relevance to XML

Selected papers from the Third International Workshop WebDB 2000 on The World Wide Web and Databases

Schema-Driven Evaluation of Approximate Tree-Pattern Queries

EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology
Interactive Query Formulation in Semistructured Databases

FQAS '02 Proceedings of the 5th International Conference on Flexible Query Answering Systems
Information Alert in Distributed Digital Libraries: The Models, Languages, and Architecture of DIAS

ECDL '02 Proceedings of the 6th European Conference on Research and Advanced Technology for Digital Libraries
Data Models and Languages for Agent-Based Textual Information Dissemination

CIA '02 Proceedings of the 6th International Workshop on Cooperative Information Agents VI
Web retrieval of XML documents: practice and challenges

Web-enabled systems integration
Four-valued knowledge augmentation for structured document retrieval

International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
Selective information dissemination in P2P networks: problems and solutions

ACM SIGMOD Record
XIRQL: An XML query language based on information retrieval concepts

ACM Transactions on Information Systems (TOIS)
The effectiveness of automatically structured queries in digital libraries

Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries
Filtering algorithms for information retrieval models with named attributes and proximity operators

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Web Searching and Information Retrieval

Computing in Science and Engineering
Searching structured documents

Information Processing and Management: an International Journal
Choosing document structure weights

Information Processing and Management: an International Journal
An efficient and versatile query engine for TopX search

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Logic and Computational Complexity for Boolean Information Retrieval

IEEE Transactions on Knowledge and Data Engineering
Information filtering and query indexing for an information retrieval model

ACM Transactions on Information Systems (TOIS)
Retrieving meaningful relaxed tightest fragments for XML keyword search

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
VIREX: visual relational to XML conversion tool

Journal of Visual Languages and Computing
Construction of a test collection for the focussed retrieval of structured documents

ECIR'03 Proceedings of the 25th European conference on IR research

Abstract

The emergence of XML as a standard interchange format for structured documents and data has given rise to many XML query language proposals. However, some of these languages do not support information retrieval-style ranked queries based on textual similarity. Several extensions to these query languages support keyword search, but the resulting languages cannot express queries such as "find books and CDs with similar titles". These extensions either use keywords as mere boolean filters, or can compute similarities only between a data value and a constant rather than between two data values. We propose ELIXIR, an Expressive and Efficient Language for XML Information Retrieval that extends the query language XML-QL [deutsch-www8, deutsch-deb99] with a textual similarity operator. ELIXIR is a general-purpose XML information retrieval language, sufficiently expressive to handle the above query. Our algorithm for answering ELIXIR queries rewrites the original ELIXIR query into a series of XML-QL queries that generate intermediate relational data, and uses relational database techniques to efficiently evaluate the similarity operators on this intermediate data, yielding an XML document with nodes ranked by similarity. Our experiments demonstrate that our prototype scales well with the size of the XML data and the complexity of the query.
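
The kind of query the abstract singles out, a similarity join between two data values, can be illustrated with a minimal sketch. The abstract does not say which similarity measure ELIXIR uses or how its intermediate relations are laid out, so everything here is an assumption made for illustration: the titles are taken to be already extracted into two plain lists (standing in for the intermediate relational data), similarity is TF-IDF cosine, and the names `tokenize`, `tfidf_vectors`, `cosine`, and `similarity_join` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a title and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(texts):
    """Return one unit-length TF-IDF vector (dict: term -> weight) per text."""
    token_lists = [tokenize(t) for t in texts]
    n_docs = len(token_lists)
    # Document frequency: in how many texts each term appears.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Assumed weighting: tf * log(1 + N / df); other schemes would also work.
        vec = {t: tf * math.log(1 + n_docs / doc_freq[t]) for t, tf in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def similarity_join(books, cds, threshold=0.0):
    """Pair book titles with CD titles, ranked by descending similarity.

    Only pairs scoring strictly above `threshold` are returned.
    """
    vectors = tfidf_vectors(books + cds)
    book_vecs, cd_vecs = vectors[:len(books)], vectors[len(books):]
    pairs = [
        (cosine(bv, cv), b, c)
        for b, bv in zip(books, book_vecs)
        for c, cv in zip(cds, cd_vecs)
    ]
    return sorted((p for p in pairs if p[0] > threshold), key=lambda p: -p[0])


if __name__ == "__main__":
    books = ["The Wizard of Oz", "War and Peace"]
    cds = ["Wizard of Oz Soundtrack", "Abbey Road"]
    for score, book, cd in similarity_join(books, cds):
        print(f"{score:.3f}  {book!r} ~ {cd!r}")
```

The point of the sketch is the contrast drawn in the abstract: both sides of the comparison are data values, not a data value and a constant keyword, and the result is a ranked list rather than a boolean filter.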