Structural similarity search for mathematics retrieval

  • Authors: Shahab Kamali; Frank Wm. Tompa

  • Affiliations: David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada (both authors)

  • Venue: CICM'13 Proceedings of the 2013 international conference on Intelligent Computer Mathematics
  • Year: 2013

Abstract

Retrieving documents by querying their mathematical content directly can be useful in various domains, including education, engineering, patent research, physics, and medical sciences. Unlike words in text retrieval, however, mathematical symbols in isolation carry little semantic information, so the structure of an expression must be considered as well. Unfortunately, taking structure into account when calculating the relevance scores of documents results in ranking algorithms that are computationally more expensive than the typical ranking algorithms employed for text documents. As a result, current math retrieval systems either limit themselves to exact matches or ignore structure completely; they sacrifice either recall or precision for efficiency. We propose instead an efficient end-to-end math retrieval system based on a structural similarity ranking algorithm. We describe novel optimization techniques to reduce the index size and the query processing time, and we experimentally validate our system in terms of correctness and efficiency. Thus, with the proposed optimizations, mathematical content can be fully exploited to rank documents in response to mathematical queries.
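
The point that symbols alone say little can be illustrated with a small sketch. It is not the ranking algorithm of the paper, whose details the abstract does not give. It only assumes, for illustration, that expressions are nested tuples, and it uses two hypothetical helpers: `symbols`, which ignores structure, and `positioned`, which records where each symbol sits in the expression tree. The Jaccard overlap used as a score is likewise an assumption chosen for brevity.

```python
from collections import Counter

# Assumed toy encoding: (operator, child, child, ...); leaves are strings.
x_squared = ("pow", "x", "2")  # x^2
two_to_x = ("pow", "2", "x")   # 2^x


def symbols(tree):
    """Structure-blind view: the multiset of symbols in the expression."""
    if isinstance(tree, str):
        return Counter([tree])
    bag = Counter([tree[0]])
    for child in tree[1:]:
        bag += symbols(child)
    return bag


def positioned(tree, path=()):
    """Structure-aware view: each symbol paired with its path from the root."""
    if isinstance(tree, str):
        return Counter([(path, tree)])
    bag = Counter([(path, tree[0])])
    for i, child in enumerate(tree[1:]):
        bag += positioned(child, path + (i,))
    return bag


def overlap(a, b):
    """Jaccard overlap of two multisets: |intersection| / |union|."""
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 1.0


bag_score = overlap(symbols(x_squared), symbols(two_to_x))
tree_score = overlap(positioned(x_squared), positioned(two_to_x))
print(bag_score, tree_score)
```

Under these assumptions the structure-blind score cannot tell x^2 from 2^x, since both contain exactly the same symbols, while the position-aware score separates them because only the root operator matches. Comparing positions for every candidate expression is also more work than comparing bags of symbols, which hints at the efficiency problem the abstract describes.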