An effective and efficient approach for keyword-based XML retrieval

Authors:
Xiaoguang Li;Jian Gong;Daling Wang;Ge Yu
Affiliations:
School of Information Science and Engineering, Northeastern University, Shenyang, P.R.China;School of Information Science and Engineering, Northeastern University, Shenyang, P.R.China;School of Information Science and Engineering, Northeastern University, Shenyang, P.R.China;School of Information Science and Engineering, Northeastern University, Shenyang, P.R.China
Venue:
WAIM'05 Proceedings of the 6th international conference on Advances in Web-Age Information Management
Year:
2005

Abstract

IR-style keyword-based search over XML documents has become the most common tool for XML querying, since users need not know the structure of the target XML document before constructing a query. For a keyword-based XML search engine, the key issue is how to efficiently return sets of meaningfully related nodes in response to a user's query. A common solution in current approaches is to store the relationship between each pair of nodes in an XML document in an index, which leads to serious storage overhead. In this paper, we propose an enhanced inverted index structure (PN-Inverted Index) that stores path information in addition to node IDs, and we adopt the concept of the lowest common ancestor (LCA) and extend it to PLCA. Based on these concepts, we design efficient algorithms that check the relationship among an arbitrary number of nodes. Compared with existing approaches, our approach does not need to create an additional relationship index; it uses only the existing inverted index, which is already common in IR-style keyword search engines. Experimental results show that, while still returning meaningful answers, our search engine offers substantial performance benefits. Although the size of the inverted index increases, the total size of the search engine's indices is smaller than that of existing approaches.
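
The general idea of keeping path information in the postings of an inverted index can be illustrated with a small sketch. The abstract does not specify the layout of the PN-Inverted Index or the definition of PLCA, so the code below is not the authors' method: it assumes, purely for illustration, that each posting carries the node's root-to-node path encoded as child positions, and it computes only the ordinary LCA as the longest common prefix of those paths. The names `build_index` and `lowest_common_ancestor` and the sample document are hypothetical.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def build_index(xml_text):
    """Map each keyword to postings of (node_path, tag_path).

    Assumption for this sketch: node_path is the tuple of child positions
    from the root to the element, so it serves as a node ID that also
    encodes every ancestor of the node. tag_path is the matching tuple of
    element names.
    """
    index = defaultdict(list)

    def visit(elem, node_path, tag_path):
        tag_path = tag_path + (elem.tag,)
        # Index only the text directly inside the element (tail text is ignored).
        for word in (elem.text or "").lower().split():
            index[word].append((node_path, tag_path))
        for position, child in enumerate(elem):
            visit(child, node_path + (position,), tag_path)

    visit(ET.fromstring(xml_text), (), ())
    return index


def lowest_common_ancestor(node_paths):
    """Return the node_path of the LCA: the longest common prefix of the paths."""
    prefix = []
    for steps in zip(*node_paths):
        if len(set(steps)) != 1:
            break
        prefix.append(steps[0])
    return tuple(prefix)


# Hypothetical sample document with two <book> elements under one root.
DOC = """
<bib>
  <book><title>XML retrieval</title><author>Li</author></book>
  <book><title>Databases</title><author>Gong</author></book>
</bib>
"""

index = build_index(DOC)
li_path = index["li"][0][0]      # author of the first book
xml_path = index["xml"][0][0]    # title of the first book
gong_path = index["gong"][0][0]  # author of the second book

same_book = lowest_common_ancestor([li_path, xml_path])
different_books = lowest_common_ancestor([xml_path, gong_path])
```

In this sketch, `same_book` is the path of the first `<book>` element, while `different_books` is the empty path, i.e. the document root. Because the ancestor information travels with each posting, the relationship among any number of matched nodes can be checked from the inverted index alone, without a separate pairwise relationship index, which is the storage trade-off the abstract describes.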