XML element search engines return XML elements, which are parts of XML documents, as search results. Existing work on XML element search borrows information retrieval techniques developed for document search. The global weight of each term can be calculated from the statistics of XML elements that share 1) the same path expression or 2) the same tag. In the first approach, the more complex a path expression is, the fewer XML elements match it, so global term weights may be calculated from the statistics of only a few XML elements. Such weights are hardly global. The second approach has the problem that it ignores the document structure around XML elements. To resolve these problems, we propose a method for calculating accurate global weights. In our method, we regard a path expression as an array of tags. We relax the restrictions on the order and frequency with which tags appear in a path expression, so that similar path expressions are gathered into the same class. In this way, we aim to reduce the number of classes that contain few elements. Our experimental results on one test collection show that our method can integrate path expressions without decreasing search accuracy.
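
The grouping idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say exactly how the restrictions are relaxed, so the code assumes that ignoring tag order means sorting the tags and that ignoring tag frequency means dropping repeated tags. The function names `path_class` and `class_idf`, the sample paths, and the plain `log(N/df)` weight are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def path_class(path, ignore_order=True, ignore_frequency=True):
    """Map a path expression such as '/article/sec/sec/p' to a class key.

    Assumption: relaxing frequency drops repeated tags, and relaxing
    order sorts the tags. The source does not specify either rule.
    """
    tags = [t for t in path.split("/") if t]
    if ignore_frequency:
        tags = list(dict.fromkeys(tags))  # keep first occurrence of each tag
    if ignore_order:
        tags = sorted(tags)
    return tuple(tags)


def class_idf(elements):
    """Compute a simple idf, log(N / df), per class of path expressions.

    `elements` is an iterable of (path, terms) pairs, one per XML element.
    The idf formula is a stand-in for whatever global weight is used.
    """
    n = defaultdict(int)                        # elements per class
    df = defaultdict(lambda: defaultdict(int))  # element frequency per term
    for path, terms in elements:
        key = path_class(path)
        n[key] += 1
        for term in set(terms):
            df[key][term] += 1
    return {
        key: {t: math.log(n[key] / d) for t, d in df[key].items()}
        for key in n
    }


# Hypothetical sample: the first two paths fall into the same class.
elements = [
    ("/article/sec/p", {"xml", "search"}),
    ("/article/sec/sec/p", {"xml"}),
    ("/article/bdy/sec/p", {"index"}),
]
weights = class_idf(elements)
```

Under these assumptions, `/article/sec/p` and `/article/sec/sec/p` share one class, so their term statistics are pooled instead of being computed from a single element each.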