Gene names and symbols are important biomedical entities, but they are highly ambiguous. This ambiguity affects the performance of both information extraction and information retrieval systems in the biomedical domain. Existing knowledge sources contain different types of information about genes that could be used to disambiguate gene symbols. In this paper, we applied an information retrieval (IR) based method to human gene symbol disambiguation and studied different methods for combining the various types of information available from these knowledge sources. Results showed that combining evidence usually improved performance. The combination method that used coefficients obtained from a logistic regression model achieved the highest precision, 92.2%, on a test set of ambiguous human gene symbols.
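The abstract does not spell out how the evidence is scored or combined, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each candidate gene has one bag-of-terms profile per evidence type, that the context of an ambiguous symbol is compared to each profile by cosine similarity, and that the per-type similarities are combined as a weighted sum. The function names, the evidence types ("words", "concepts"), the candidate identifiers, and the weights are all hypothetical; in the approach described above the weights would come from a fitted logistic regression model rather than being set by hand.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def disambiguate(context, profiles, weights):
    """Pick the candidate gene whose profiles best match the context.

    context:  evidence type -> Counter of terms found near the symbol
    profiles: candidate id -> {evidence type -> Counter of profile terms}
    weights:  evidence type -> combination coefficient (assumed given)
    """
    scores = {}
    for gene, by_type in profiles.items():
        # Weighted linear combination of the per-type similarities.
        scores[gene] = sum(
            w * cosine(context.get(t, Counter()), by_type.get(t, Counter()))
            for t, w in weights.items()
        )
    best = max(scores, key=scores.get)
    return best, scores


# Toy, made-up data: two candidate senses of one ambiguous symbol.
context = {
    "words": Counter("tumor suppressor cell cycle arrest".split()),
    "concepts": Counter(["neoplasm", "apoptosis"]),
}
profiles = {
    "candidate_1": {
        "words": Counter("cell cycle tumor growth arrest".split()),
        "concepts": Counter(["neoplasm", "cell_cycle"]),
    },
    "candidate_2": {
        "words": Counter("membrane transport ion channel".split()),
        "concepts": Counter(["ion_transport"]),
    },
}
weights = {"words": 0.6, "concepts": 0.4}  # hypothetical coefficients

best, scores = disambiguate(context, profiles, weights)
print(best)  # candidate_1
```

A linear combination keeps each evidence type's contribution easy to inspect, which is one reason learned coefficients are a natural fit for this kind of score fusion.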