A dynamic window based passage extraction algorithm for genomics information retrieval

Authors:
Qinmin Hu;Xiangji Huang
Affiliations:
Department of Computer Science & Engineering, York University, Toronto, Ontario, Canada;School of Information Technology, York University, Toronto, Ontario, Canada
Venue:
ISMIS'08 Proceedings of the 17th international conference on Foundations of intelligent systems
Year:
2008

Abstract

Passage retrieval is important for users of the biomedical literature, but extracting a passage from a natural paragraph is a challenging problem. In this paper, we focus on analyzing the gold standard of the TREC 2006 Genomics Track and simulating the distributions of its standard passages. Based on this analysis, we present an efficient dynamic window based algorithm with a WordSentenceParsed method to extract passages. The algorithm has two important characteristics. First, we obtain the criteria for passage extraction by learning from the gold standard, and then conduct a comprehensive study on the 2006 and 2007 Genomics datasets. Second, the proposed algorithm is dynamic with respect to these criteria, so it can adjust to the length of the passage. We find that the proposed dynamic algorithm with the WordSentenceParsed method significantly boosts passage-level retrieval performance on the 2006 and 2007 Genomics datasets.
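
The abstract does not specify the WordSentenceParsed method or the learned criteria, so the following is only a minimal illustrative sketch of the general idea of a window that grows sentence by sentence within length bounds. It is not the authors' algorithm. The function names, the query-term density score, and the `min_words` and `max_words` bounds (standing in for criteria that would be learned from a gold standard) are all assumptions made for illustration.

```python
import re


def split_sentences(paragraph):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def tokenize(text):
    """Lowercase alphanumeric tokens; a placeholder for real biomedical tokenization."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_passage(paragraph, query_terms, min_words=5, max_words=40):
    """Return the contiguous run of sentences with the highest query-term density.

    min_words and max_words are hypothetical length bounds; in a real system
    they would come from the passage-length distribution of a gold standard.
    """
    sentences = split_sentences(paragraph)
    query = {t.lower() for t in query_terms}
    best_score, best_span = 0.0, None

    for start in range(len(sentences)):
        words = []
        # Grow the window one sentence at a time from this starting point.
        for end in range(start, len(sentences)):
            words.extend(tokenize(sentences[end]))
            if len(words) > max_words:
                break  # window exceeds the upper length bound
            if len(words) < min_words:
                continue  # window is still shorter than the lower bound
            hits = sum(1 for w in words if w in query)
            score = hits / len(words)  # illustrative density score
            if score > best_score:
                best_score, best_span = score, (start, end)

    if best_span is None:
        return ""  # no window within the bounds contains a query term
    start, end = best_span
    return " ".join(sentences[start:end + 1])


paragraph = (
    "The weather was mild. BRCA1 mutations increase breast cancer risk. "
    "BRCA1 also interacts with RAD51 in DNA repair. Lunch was served at noon."
)
print(extract_passage(paragraph, ["BRCA1", "cancer", "repair"]))
```

A plain density score favors short windows, so a realistic scorer would need to weigh coverage against length. That trade-off is exactly what gold-standard-derived criteria would be expected to tune.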