Structural similarity search for mathematics retrieval

Authors:
Shahab Kamali;Frank Wm. Tompa
Affiliations:
David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada;David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada
Venue:
CICM'13 Proceedings of the 2013 international conference on Intelligent Computer Mathematics
Year:
2013

Abstract

Retrieving documents by querying their mathematical content directly can be useful in various domains, including education, engineering, patent research, physics, and medical sciences. Unlike words in text retrieval, however, mathematical symbols in isolation carry little semantic information, so the structure of an expression must be considered as well. Unfortunately, using structure to calculate the relevance scores of documents leads to ranking algorithms that are computationally more expensive than the typical ranking algorithms employed for text documents. As a result, current math retrieval systems either limit themselves to exact matches or ignore structure completely; that is, they sacrifice either recall or precision for efficiency. We propose instead an efficient end-to-end math retrieval system based on a structural similarity ranking algorithm. We describe novel optimization techniques that reduce the index size and the query processing time, and we experimentally validate our system in terms of correctness and efficiency. With the proposed optimizations, mathematical content can thus be fully exploited to rank documents in response to mathematical queries.
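
To illustrate why symbols alone are not enough, the following minimal sketch compares two expressions, x^2 + y and x + y^2, first as bags of symbols and then as ordered expression trees. The abstract does not specify the similarity measure, so the unit-cost tree edit distance used here is an assumed stand-in for "structural similarity", not the system's actual ranking function. The helper names (`node`, `symbols`, `forest_distance`, `tree_distance`) are hypothetical, and the memoized recursion is a plain textbook formulation with none of the index or query optimizations the abstract mentions.

```python
from collections import Counter
from functools import lru_cache

# Assumed representation: a tree is (label, children), children a tuple of trees.
def node(label, *children):
    return (label, tuple(children))

def symbols(tree):
    """Bag of symbols in the tree; structure is ignored entirely."""
    label, children = tree
    bag = Counter([label])
    for child in children:
        bag += symbols(child)
    return bag

def size(forest):
    """Number of nodes in a forest (a tuple of trees)."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f:
        return size(g)  # insert every remaining node of g
    if not g:
        return size(f)  # delete every remaining node of f
    (v_label, v_kids), (w_label, w_kids) = f[-1], g[-1]
    # Delete the root of the rightmost tree in f; its children move up.
    delete = forest_distance(f[:-1] + v_kids, g) + 1
    # Insert the root of the rightmost tree in g.
    insert = forest_distance(f, g[:-1] + w_kids) + 1
    # Map the two rightmost roots onto each other, relabeling if they differ.
    match = (forest_distance(v_kids, w_kids)
             + forest_distance(f[:-1], g[:-1])
             + (0 if v_label == w_label else 1))
    return min(delete, insert, match)

def tree_distance(a, b):
    return forest_distance((a,), (b,))

a = node("+", node("^", node("x"), node("2")), node("y"))  # x^2 + y
b = node("+", node("x"), node("^", node("y"), node("2")))  # x + y^2

print(symbols(a) == symbols(b))   # True: the bags of symbols are identical
print(tree_distance(a, a))        # 0: identical structure
print(tree_distance(a, b))        # positive: the structures differ
```

A symbol-only ranker cannot tell these two expressions apart, while the tree-based distance can. The price is the extra computation per comparison that the abstract identifies as the main obstacle to structural ranking.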