Learning to combine representations for medical records search

Authors:
Nut Limsopatham; Craig Macdonald; Iadh Ounis
Affiliations:
University of Glasgow, Glasgow, United Kingdom; University of Glasgow, Glasgow, United Kingdom; University of Glasgow, Glasgow, United Kingdom
Venue:
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Year:
2013

Abstract

The complexity of medical terminology raises challenges when searching medical records. For example, because 'cancer', 'tumour', and 'neoplasms' are synonyms, a traditional search system may fail to retrieve relevant records that contain only synonyms of the query terms. Prior work deals with this using bag-of-concepts approaches, which represent medical terms that share the same meaning using concepts from medical resources (e.g. MeSH). The resulting relevance scores are then combined with those of a traditional bag-of-words representation when inferring the relevance of medical records. Even though the existing approaches are effective, they do not take into account the predicted retrieval effectiveness of the bag-of-words and bag-of-concepts representations, which could be used to model the score combination and hence improve retrieval performance. In this paper, we propose a novel learning framework that models the importance of the bag-of-words and bag-of-concepts representations, combining their scores on a per-query basis. Our proposed framework leverages retrieval performance predictors, such as the clarity score and AvIDF, calculated on both representations as learning features. We evaluate our proposed framework using the TREC Medical Records track's test collections. As our proposed framework can significantly outperform an existing approach that linearly merges the relevance scores, we conclude that retrieval performance predictors can be effectively leveraged when combining the relevance scores.
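
The per-query combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the learner, so the logistic mapping in `query_weight`, its coefficients, and all function names here are hypothetical stand-ins. AvIDF is computed with the common `log(N / df)` form of IDF, which is an assumption about the variant used, and the clarity score is omitted for brevity.

```python
import math


def avidf(query_terms, doc_freq, num_docs):
    """Average IDF of the query terms (AvIDF), a pre-retrieval predictor.

    Assumes idf(t) = log(N / df(t)); terms missing from the index are skipped.
    """
    idfs = [
        math.log(num_docs / doc_freq[t])
        for t in query_terms
        if doc_freq.get(t, 0) > 0
    ]
    return sum(idfs) / len(idfs) if idfs else 0.0


def query_weight(features, coefficients, bias=0.0):
    """Map predictor features to a weight in (0, 1).

    Hypothetical stand-in for a learned model: a logistic function over a
    weighted sum of the features.
    """
    z = bias + sum(c * f for c, f in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


def combine_scores(bow_scores, boc_scores, weight):
    """Interpolate per-record scores: weight * BoW + (1 - weight) * BoC.

    Records missing from one representation contribute a score of 0.0 there.
    """
    records = set(bow_scores) | set(boc_scores)
    return {
        r: weight * bow_scores.get(r, 0.0)
        + (1.0 - weight) * boc_scores.get(r, 0.0)
        for r in records
    }


# Toy usage: all statistics, scores, and coefficients below are invented.
bow_feature = avidf(["lung", "cancer"], {"lung": 50, "cancer": 200}, 10_000)
boc_feature = avidf(["C0001", "C0002"], {"C0001": 20, "C0002": 400}, 10_000)
w = query_weight([bow_feature, boc_feature], coefficients=[0.5, -0.5])
ranking = combine_scores({"rec1": 2.0, "rec2": 1.0}, {"rec1": 0.5, "rec3": 1.5}, w)
```

A fixed `weight` shared by every query corresponds to the linear merging baseline mentioned in the abstract; the sketch differs only in that the weight is recomputed for each query from predictors measured on both representations.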