Methods of Relevance Ranking and Hit-content Generation in Math Search

Authors:
Abdou S. Youssef
Affiliations:
Department of Computer Science, The George Washington University, Washington DC, 20052, USA
Venue:
Calculemus '07 / MKM '07 Proceedings of the 14th symposium on Towards Mechanized Mathematical Assistants: 6th International Conference
Year:
2007

Abstract

To be effective and useful, math search systems must not only maximize precision and recall, but also present the query hits in a form that makes it easy for the user to identify quickly the truly relevant hits. To meet that requirement, the search system must sort the hits according to domain-appropriate relevance criteria, and provide with each hit a query-relevant summary of the hit target.

The standard relevance measures in text search, which rely mostly on keyword frequencies and document sizes, turned out to be inadequate in math search. Therefore, alternative relevance measures must be defined, which give more weight to certain types of information than to others and take into account cross-reference statistics. In this paper, new, multidimensional relevance metrics are defined for math search, methods for computing and implementing them are discussed, and comparative performance evaluation results are presented.

Query-relevant hit-summary generation is another factor that enables users to quickly determine the relevance of the presented hits. Although the hit title accompanied by a few leading sentences from the target document is simple to produce, this often fails to convey to the user the document's relevant excerpts. This shifts the burden onto the user to pursue many of the hits, and read significant portions of their target documents, to finally locate the wanted documents. Clearly, this task is too time-consuming and should be largely automated. This paper presents query-relevant hit-summary generation methods, outlines implementation strategies, and presents performance evaluation results.
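
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's method: the abstract does not specify the metrics, so the field names (`title`, `equation`, `body`), the weights, the `xrefs` count, and the function names `relevance` and `summarize` are all assumptions made for illustration. The sketch scores a hit by weighting query-term matches differently per field and adding a cross-reference bonus, and it builds a summary from the sentences that match the query rather than from the leading sentences.

```python
import re

# Hypothetical field weights: the abstract only says that some types of
# information should weigh more than others; these values are invented.
FIELD_WEIGHTS = {"title": 3.0, "equation": 2.0, "body": 1.0}
XREF_WEIGHT = 0.5  # assumed weight per incoming cross-reference


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevance(query, doc):
    """Weighted count of query-term matches per field, plus a cross-reference bonus."""
    terms = set(tokenize(query))
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        tokens = tokenize(doc.get(field, ""))
        score += weight * sum(1 for t in tokens if t in terms)
    return score + XREF_WEIGHT * doc.get("xrefs", 0)


def summarize(query, doc, max_sentences=2):
    """Return the body sentences with the most distinct query terms, in document order."""
    terms = set(tokenize(query))
    sentences = re.split(r"(?<=[.!?])\s+", doc.get("body", "").strip())
    scored = [(len(terms & set(tokenize(s))), i, s) for i, s in enumerate(sentences)]
    matching = [x for x in scored if x[0] > 0]
    # Pick the best-matching sentences, breaking ties by position.
    best = sorted(matching, key=lambda x: (-x[0], x[1]))[:max_sentences]
    # Restore document order so the summary reads naturally.
    return " ".join(s for _, _, s in sorted(best, key=lambda x: x[1]))
```

Sorting a list of hits by `relevance(query, doc)` in descending order then gives the ranked result list, and `summarize(query, doc)` supplies the text shown with each hit. A real math search system would also need to match mathematical notation, which this plain-text tokenizer does not attempt.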