Relaxed global term weights for XML element search

  • Authors:
  • Atsushi Keyaki;Kenji Hatano;Jun Miyazaki

  • Affiliations:
  • Graduate School of Information Science, Nara Institute of Science and Technology and Graduate School of Culture and Information Science, Doshisha University, Kyoto, Japan;Faculty of Culture and Information Science, Doshisha University, Kyoto, Japan;Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma, Nara, Japan

  • Venue:
  • INEX'10 Proceedings of the 9th international conference on Initiative for the evaluation of XML retrieval: comparative evaluation of focused retrieval
  • Year:
  • 2010

Abstract

XML element search engines return XML elements, which are parts of XML documents, as search results. Existing studies on XML element search adapt information retrieval techniques originally developed for document search. Global term weights can be calculated from the statistics of XML elements that share 1) the same path expression or 2) the same tag. In the first approach, the more complex a path expression is, the fewer XML elements match it. As a result, global term weights may be calculated from the statistics of only a few XML elements, and such weights are not truly global. The second approach has a different problem: it does not consider the document structure of XML elements. To resolve these problems, we propose a method for calculating accurate global weights. In our method, we regard a path expression as an array of tags. We relax the restrictions on the order and frequency with which tags appear in a path expression, so that similar path expressions are gathered into the same class. In this way, we try to reduce the number of classes that contain hardly any elements. Our experimental results on a certain test collection show that our method can integrate path expressions without decreasing search accuracy.
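
The relaxation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a path expression is a simple slash-separated list of tags, and that ignoring both the order and the frequency of tags amounts to using the set of distinct tags as the class key. The function names and the sample paths are hypothetical.

```python
from collections import defaultdict


def path_to_tags(path):
    """Split a simple path such as '/article/sec/sec/p' into its tags."""
    return [step for step in path.split("/") if step]


def relaxed_class(path):
    """Class key that ignores tag order and tag frequency.

    Assumption: the relaxation is modelled as the set of distinct tags.
    """
    return frozenset(path_to_tags(path))


def group_paths(paths):
    """Gather path expressions that share the same relaxed class key."""
    classes = defaultdict(list)
    for path in paths:
        classes[relaxed_class(path)].append(path)
    return dict(classes)


# Hypothetical path expressions, used only for illustration.
paths = [
    "/article/sec/p",
    "/article/sec/sec/p",  # same tags, 'sec' repeated
    "/article/p/sec",      # same tags, different order
    "/article/bdy/p",      # different tag set
]
groups = group_paths(paths)
for key, members in groups.items():
    print(sorted(key), members)
```

Under this assumption, the first three paths fall into one class and the fourth into another, so term statistics for the first class would be pooled over more elements than any single exact path expression provides.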