This paper proposes a new approach to querying collections of structured textual information such as SGML/XML documents. Knowledge about the structure of documents is an additional resource that should be exploited during retrieval, since the semantics of the different textual objects can be used to specify an information need much more precisely. However, the traditional probabilistic retrieval model cannot handle structural information. We define a new retrieval function, based on the probabilistic model, that overcomes this drawback. The presented query language allows structural roles to be assigned to individual terms. Efficient evaluation of queries in this framework requires appropriate index structures. We design text and structure indexes and show how their information is combined during evaluation. The implementation supports additional functionality, such as a table of contents for browsing. First evaluation results show the feasibility of the approach on collections of unstructured documents.
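
To make the idea of combining a text index with a structure index concrete, the following is a minimal sketch and not the paper's retrieval function. It assumes a toy data model in which each document is a list of elements with a path of element names, and it scores a document with a plain match count rather than a probabilistic weight. The names `build_indexes` and `search`, the sample documents, and the query format of (term, role) pairs are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_indexes(docs):
    """Build two toy indexes (an illustrative assumption, not the paper's design).

    docs maps a document id to a list of (path, text) elements, where path is
    a tuple of element names from the root down to the element.
    """
    text_index = defaultdict(list)  # term -> [(doc_id, elem_id), ...]
    structure_index = {}            # (doc_id, elem_id) -> path tuple
    for doc_id, elements in docs.items():
        for elem_id, (path, text) in enumerate(elements):
            structure_index[(doc_id, elem_id)] = path
            for term in text.lower().split():
                text_index[term].append((doc_id, elem_id))
    return text_index, structure_index


def search(query, text_index, structure_index):
    """Rank documents for a query given as a list of (term, role) pairs.

    A role is an element name that must occur on the path of the element
    containing the term; None means the term may occur anywhere.
    The score is a simple match count standing in for a real weighting.
    """
    scores = defaultdict(float)
    for term, role in query:
        for doc_id, elem_id in text_index.get(term.lower(), []):
            path = structure_index[(doc_id, elem_id)]
            if role is None or role in path:
                scores[doc_id] += 1.0
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample collection.
docs = {
    "d1": [(("article", "title"), "probabilistic retrieval"),
           (("article", "body"), "index structures for text")],
    "d2": [(("article", "title"), "index design"),
           (("article", "body"), "probabilistic models and retrieval")],
}
text_index, structure_index = build_indexes(docs)

# "retrieval" must occur inside a title element; "index" may occur anywhere.
results = search([("retrieval", "title"), ("index", None)],
                 text_index, structure_index)
print(results)
```

In this sketch the text index answers where a term occurs, and the structure index is consulted only afterwards to check the structural role. A real system would replace the match count with term weights derived from the probabilistic model.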